That was my experience at first, but then it gets corrupted somehow and you have to delete it and start over. Happened to me multiple times with RAID 1, so pretty sure it's a software error -- I eventually just gave up.
Agreed, exactly matches my experience over SMB. It works at first, then eventually refuses to work until you delete it and start again from scratch. Eventually I just gave up.
I'm genuinely surprised that there isn't column-level shared-dictionary string compression built into SQLite, MySQL/MariaDB or Postgres, like this post is describing.
SQLite has no compression support; MySQL/MariaDB have page-level compression, which doesn't work great and which I've never seen anyone enable in production; and Postgres has per-value compression (TOAST), which is good for extremely long strings but useless for short ones.
There are just so many string columns where values and substrings get repeated so much, whether you're storing names, URLs, or just regular text. And I have databases that I know would shrink by at least half.
Is it just really, really hard to maintain a shared dictionary while constantly adding and deleting values? Is there just no established reference algorithm for it?
It still seems like it would be worth it even if it were something you had to set manually. E.g., wait until your table has 100,000 values, build a dictionary from those, and that dictionary is set in stone and used for the next 10,000,000 rows too, unless you rebuild it in the future (which would be an expensive operation).
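To make that concrete, here's a minimal Python sketch of the freeze-after-N-rows idea. Everything here is my own assumption for illustration (the class name, the fall-through to raw storage for unseen values), not anything from the post:

```python
# Hypothetical sketch of a frozen column dictionary: build it once from the
# first N rows, then reuse it unchanged for all later inserts until a manual
# (expensive) rebuild.
from collections import Counter

class FrozenColumnDict:
    def __init__(self, sample_values, max_entries=65536):
        # Keep the most frequent values from the sample; everything else
        # falls through to raw storage.
        common = Counter(sample_values).most_common(max_entries)
        self.id_of = {value: i for i, (value, _) in enumerate(common)}
        self.value_of = [value for value, _ in common]

    def encode(self, value):
        # Returns a small integer ID, or the raw string if the (immutable)
        # dictionary has never seen this value.
        i = self.id_of.get(value)
        return ("id", i) if i is not None else ("raw", value)

    def decode(self, tag, payload):
        return self.value_of[payload] if tag == "id" else payload

# Build once from the first 100,000 rows, then use it for millions more.
sample = ["alice", "bob", "alice", "https://example.com", "bob"]
d = FrozenColumnDict(sample)
assert d.decode(*d.encode("alice")) == "alice"
assert d.decode(*d.encode("unseen value")) == "unseen value"
```

The immutability is what keeps this simple: encoding never mutates shared state, so readers and writers don't contend on dictionary pages.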
Strings in a textual index are already compressed, with common-prefix compression or other schemes, and they remain perfectly queryable. I'm not sure whether their compression scheme is for index or data columns.
A global column dictionary has more complexity than normal. Now you are touching more pages than just the index pages and the data page. The dictionary entries are sorted, so you need to worry about page expansion and contraction. They sidestep those problems by making the dictionary immutable, presumably building it up front by scanning all the data.
I'm not sure why using FSST is better than compressing the dictionary entries with a standard compression algorithm.
Storing the strings themselves as dictionary IDs is a good idea, as they can be processed quickly with SIMD.
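A toy numpy sketch of why that's SIMD-friendly (the data and names are made up; a real engine would run hand-written SIMD kernels rather than numpy, but the shape of the work is the same):

```python
# Once a string column is dictionary-encoded, a predicate like
# country = 'DE' becomes one dictionary lookup plus a single integer
# comparison swept over a flat array of IDs.
import numpy as np

dictionary = ["DE", "FR", "US"]                         # immutable dictionary
codes = np.array([0, 2, 1, 0, 0, 2], dtype=np.uint32)   # column stored as IDs

target = dictionary.index("DE")   # one lookup...
mask = codes == target            # ...then one vectorized scan over the column
print(mask.nonzero()[0])          # row positions matching 'DE' -> [0 3 4]
```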
There are some databases that can move an entire column into the index. But that's mostly going to work for schemas where the number of distinct values is <<< rowcount, so that you're effectively interning the rows.
1. It complicates and slows down updates, which are typically more important in OLTP than in OLAP.
2. It's generally bad for high-cardinality columns, so you'd need to track cardinality to make good decisions, which further complicates things.
3. Lastly, the additional operational complexity (like the table-maintenance system you described in your last paragraph) could reduce system reliability, and they might decide it's not worth the price or against their philosophy.
What evidence is there that it's hurting their brand?
Outside of HN I see zero complaints. And the situation has been going on for a while. I might not like it, but it seems perfectly fine for their brand as far as I can tell.
I see a lot of complaints outside of HN. For starters, not every nerd, and not even every Apple nerd, is on HN. Outside of that, even my non-tech-savvy acquaintances have been complaining as of late.
Is it enough of a damaged brand to hurt their profits as of now? Clearly not. But cracks are forming. Apple’s brand isn’t damaged when they’re seen as bad, but when they’re seen as the same as everyone else.
> Outside of that, even my non-tech-savvy acquaintances have been complaining as of late.
What do they say? I'm genuinely curious.
Because e.g. I don't have the slightest idea what percentage Microsoft takes on Xbox games, nor would it ever occur to me to complain about it. I know there's a business model there, but it's not something I think about. And I feel like that's the way people outside of tech feel about whatever percentage Apple takes out of its App Store. But what am I missing?
The important bit is “trading net profits for user happiness”. Non-tech-savvy users aren’t complaining about Apple’s margins on apps, but about the things Apple is doing to degrade their experience in the name of profit: namely, excessive ads in the App Store, in System Settings, and in Apple’s own apps such as Music and Wallet (the F1 movie promotion).
To the contrary, Time Machine is for consumers. Most people use it either with an external hard drive (good for iMacs that stay in one place) or a NAS (good for MacBooks). Apple even sold the AirPort Time Capsule at one point. Since that was discontinued, a Synology NAS has been the main consumer-friendly alternative. It comes with dedicated Time Machine support, and it's supposed to be easy to set up and forget. That's the whole point of using Synology over alternatives that require more technical expertise and aren't designed for Time Machine support straight out of the box.
> [Synology] comes with dedicated Time Machine support
Your umbrage is with Synology, not Apple.
Apple raised default security configurations in Tahoe. That broke the configuration of NAS devices which rely on relaxed security settings.
I agree Apple should publish a technical note / changelog for config changes such as this one, but Apple has never implied to users that it would carry a support burden for any and all third-party hardware vendors. To the contrary, they've told users to consult their NAS vendor for configuration steps:
> Check the documentation of your NAS device for help setting it up for use with Time Machine
I wasn't even assigning blame, did you mean to reply to someone else?
I was just replying to your point that a Synology NAS "is not what most users would consider 'consumer technology.'" It's firmly in the consumer technology category.
> TYCO Print is a printing service where professors can upload course files for TYCO to print out for students as they order. Shorter packets can cost around $20, while longer packets can cost upwards of $150 when ordered with the cheapest binding option.
This made sense a couple of decades ago. Today, it's just bizarre to be spending $150 on a phonebook-sized packet of reading materials. So much paper and toner.
No, the cost of the paper, toner, and binding is the cost of providing a provably distraction-free environment.
To make it more palatable for an IT worker: "It's just bizarre to give a developer a room with a door, so much sheetrock and wood! Working with computers is what open-plan offices are for."
What kind of distraction are you getting on your Kindle...?
Also, the university isn't covering the cost here. The students are. And buying a Kindle would be cheaper than the printing cost of the packet itself.
So I stand by my point. If you don't want distraction, get Kindles.
And even iPads are pretty good. They tend to sit flat so you're not "hiding" your screen the way you can with a laptop or phone, and people often aren't using messaging or social apps on them so there are no incoming distractions.
> That means the article contained a plausible-sounding sentence, cited to a real, relevant-sounding source. But when you read the source it’s cited to, the information on Wikipedia does not exist in that specific source. When a claim fails verification, it’s impossible to tell whether the information is true or not.
This has always been a rampant problem on Wikipedia. I can't find any indication that it has increased recently, because they're only investigating articles already flagged as potentially AI. So what's the control baseline rate here?
Applying correct citations is actually really hard work, even when you know the material thoroughly. I just assume people write stuff they know from their field, then mostly look to add the minimum number of plausible citations after the fact, and then most people never check them, and everyone seems to just accept it's better than nothing. But I also suppose it depends on how niche the page is, and which field it's in.
There was a fun example of this that happened live during a recent episode of the Changelog[1]. The hosts noted that they were incorrectly described as being "from GitHub" with a link to an episode of their podcast which didn't substantiate that claim. Their guest fixed the citation as they recorded[2].
The problems I've run into are both people giving fake citations (the citations don't actually justify the claim being made in the article), and people giving real citations where, if you dig into the source, you realize it's coming from a crank.
It's a big blind spot among the editors as well. When this problem was brought up here in the past, with people saying that claims on Wikipedia shouldn't be believed unless people verify the sources themselves, several Wikipedia editors came in and said this wasn't a problem and Wikipedia was trustworthy.
It's hard to see it getting fixed when so many don't see it as an issue. And framing it as a non-issue misleads users about the accuracy of the site.
A common source of error is the plot summaries in articles about movies. The plot summaries are very often written by people who didn't watch the movie but are trying to reassemble the plot like a jigsaw puzzle from little bits they glean from written reviews, or worse, just writing down whatever they assume the plot to be. Very often it seems like the fuck-ups came from people who either weren't watching the movie carefully, were just listening to the dialogue while not watching the screen, or simply lacked media literacy.
Example [SPOILERS]: the page for the movie Sorcerer claims that rough terrain caused a tire to pop. The movie never says that; it shows the tire popping (which results in the truck's cargo detonating). The next scene reveals the cause, but only to those paying attention: the bloody corpse of a bandito lying next to a submachine gun is shown in the rubble beside the road, and more banditos are there, very upset and quite nervous, to hijack the second truck. The obvious inference is that the first truck's tire was shot by the banditos to hijack/rob the truck. The tire didn't pop from rough terrain, and the movie never says it did; that's just a conclusion you could reach by not paying attention to the movie.
To me that sounds a bit like summaries made on the basis of written movie scripts. A long time ago, I read a few scripts of movies I had never watched, and that's exactly the outcome: you get a rough idea of what it's about and even get to recognise some memorable quotes, but there's little cohesion to it, for lack of all the important visual aspects and clues that tie it all together.
> The problems I've run into are both people giving fake citations (the citations don't actually justify the claim being made in the article), and people giving real citations where, if you dig into the source, you realize it's coming from a crank.
Citations have become heavily weaponized across a lot of spaces on the internet. There was a period of time when we all learned that citations were correlated with higher-quality arguments, and Wikipedia’s [Citation Needed] even became a meme.
But the quacks and the agenda pushers realized that during casual internet browsing, readers won’t actually read, let alone scrutinize, the citation links, so it didn’t matter what you linked to. As long as the domain and title looked relevant, it would be assumed correct. Anyone who did read the links might take so much time that the comment section would be saturated with competing comments by the time they could respond with a real critique.
This has become a real problem on HN, too. Often when I see a comment with a dozen footnoted citations from PubMed, they either misunderstand what the study says or sometimes even say the opposite of what the commenter claims.
The strategy is to just quickly search PubMed or other sources for keywords and then copy those into the post with the HN footnote citation format, knowing that most people won’t read or question it.
> if you dig into the source, you realize it's coming from a crank.
It is a dark Sunday afternoon. Bob Park is sitting on his sofa as usual, drunk as usual, when suddenly the TV reveals to him that there is something called the Paranormal (Twilight Zone music)... instantly Bob knows there are no such things and adds a note to the incomprehensible mess of notes that will one day become his book. He downs one more Budweiser. In the distance lightning strikes a tree; Bob shouts "You don't scare me!" and shakes his fist. After a few more beers a miracle of inspiration descends, and as if channeling, in the span of 10 minutes he writes notes about Cold Fusion, Alternative Medicine, Faith Healing, Telepathy, Homeopathy, Parapsychology, Zener cards, the tooth fairy, and Father Xmas. With much confidence he writes that none of them are real. It's been a really productive afternoon. It reminds him of times long gone, back when he actually published many serious papers. He counts the remaining beers in his cooler and says to himself: in the next book I will need to take on god himself. The world needs to know, god is not real. I too will be the authority on that subject.
I'm curious what point you're making here. I don't know anything at all about Bob Park or whether he is a crank. But if you make your career doing the admirable work of debunking pseudo-science and nonsense theories, you would necessarily be linked to in discussions of those theories very, very frequently.
So maybe that's not a good description of him. But the link you posted is hardly dispositive.
True, but humans got a 20-year head start, and I am willing to wager the overwhelming majority of extant flagrant errors are due to humans making shit up and no other human noticing and correcting it.
My go-to example was the SDI page saying that Brilliant Pebbles interceptors were to be made out of tungsten (completely illogical hogwash that doesn't even pass a basic sniff test). This claim was added to the page in February of 2012 by a new Wikipedia user, with no edit note accompanying the change nor any change to the sources and references. It stayed in the article until October 29th, 2025. And of course this misinformation was copied by other people, and you can still find it being quoted, uncited, in other online publications. With an established track record of fact-checking this poor, I honestly think LLMs are just pissing into the ocean.
> I am willing to wager the overwhelming majority of extant flagrant errors are due to humans making shit up
In general, I agree, but I wouldn't want to ascribe malfeasance ("making shit up") as the dominant problem.
I've seen two types of problems with references.
1. The reference is dead, which means I can't verify or refute the statement in the Wikipedia article. If I see that, I simply remove both the assertion and the reference from the wiki article.
2. The reference is live and almost confirms the statement in the Wikipedia article, but whoever put it there over-interpreted the information in the reference. In that case, I correct the statement in the article but keep the ref.
Those are the two types of reference errors that I've come across.
And, yes, I've come across these types of errors long before LLMs.
Perhaps so. On the other hand, there's probably a lot of low hanging fruit they can pick just by reading the article, reading the cited sources, and making corrections. Humans can do this, but rarely do because it's so tedious.
I don't know how it will turn out. I don't have very high hopes, but I'm not certain it will all get worse either.
The entire point of the article is that LLMs cannot produce accurate text, so ironically, you claiming that LLMs can produce accurate text illustrates your point about human reliability perfectly.
I guess the conclusion is that there simply are no avenues to gain knowledge.
At some point you're forced to either believe that people have never heard of the concept of a force multiplier, or to return to Upton Sinclair's observation about getting people to believe in things that hurt their bottom line.
Because a difference in scale can become a difference in category. A handful of buggy crashes can be written off as operator error, but as the car becomes widely adopted and analysis matures, it becomes clear that the basic design of the machine and its available use cases have fundamental flaws that cause a higher rate of operator error than desired. Therefore, cars are redesigned to be safer, laws and regulations are put in place, licensing systems are introduced, and traffic calming and road design are considered.
Linkrot is a problem, and edited articles are another. You can cite all you want, but if the underlying resource changes, your foundation just melted away.
> Applying correct citations is actually really hard work
Not disagreeing - many existing articles on Wikipedia have barely any references or citations at all, and in some cases have the wrong citations or wrong conclusions. Like when a source says water molecules behave oddly and the Wikipedia article concludes that water molecules behave properly.
> This has always been a rampant problem on Wikipedia. I can't find any indication that it has increased recently, because they're only investigating articles already flagged as potentially AI. So what's the control baseline rate here?
...y'know, I don't want to be that guy, but this actually seems like something AI could check for, and then flag for human review.
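Something like the following, maybe. This is a deliberately naive sketch of a "failed verification" flagger; the substring matching is a stand-in for real entailment checking, and every name in it is hypothetical:

```python
# Naive sketch: fetch each cited source and check whether the claim's key
# words appear anywhere in it. Too few hits -> flag for human review.
# A real system would need semantic/entailment checks, not word matching.
import urllib.request

def fetch_text(url):
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace").lower()

def flag_for_review(claim, source_url, min_hits=2):
    # Count how many "significant" words from the claim occur in the source.
    words = [w for w in claim.lower().split() if len(w) > 4]
    source = fetch_text(source_url)
    hits = sum(w in source for w in words)
    return hits < min_hits

# Hypothetical usage over (claim, cited_url) pairs pulled from an article:
# for claim, url in read_claims("claims.tsv"):
#     if flag_for_review(claim, url):
#         print("FAILED VERIFICATION?", claim, url)
```

Even a crude pass like this would surface the worst offenders (dead links, citations with zero overlap with the claim) for a human to look at.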
When I've checked Wikipedia citations I've found so much brazen deception - citations that obviously don't support the claim - that I don't have confidence in Wikipedia.
> Applying correct citations is actually really hard work, even when you know the material thoroughly.
Why do you find it hard? Scholarly references can be sources for fundamental claims, and review articles are a big help too.
Also, I tend to add things to Wikipedia or other wikis when I come across something valuable rather than writing something and then trying to find a source (which also is problematic for other reasons). A good thing about crowd-sourcing is that you don't have to write the article all yourself or all at once; it can be very iterative and therefore efficient.
It's more like, a lot of stuff in Wikipedia articles is somewhat "general" knowledge in a given field, where it's not always exactly obvious how to cite it, because it's not something any specific person gets credit for "inventing". Like, if there's a particular theorem then sure you cite who came up with it, or the main graduate-level textbook it's taught in. But often it's just a particular technique or fact that just kind of "exists" in tons of places but there's no obvious single place to cite it from.
So it actually takes some work to find a good reference. Like you say, review articles can be a good source, as can survey articles or books. But it can take a surprising amount of effort to track down a place that actually says the exact thing. Just last week I was helping a professor (a leader in their field!) find a citation during peer review for an "obvious fact" in the introduction of their paper. It was actually really challenging, like trying to produce a citation for "the sky is blue".
I remember, years ago, creating a Wikipedia article for a particular type of food in a particular country. You can buy it at literally every supermarket there. How the heck do you cite the food and facts about it? It just... is. Like... websites for manufacturers of the food aren't really citations. But nobody's describing the food in academic survey articles either. You're not going to link to Allrecipes. What do you do? It's not always obvious.
If you can buy the food at a supermarket, can't you cite a product page? Presumably that would include a description of the product. Or is that not good enough of a citation?
Retail product listing URLs change constantly. They're not great.
And then you usually want to describe how the food is used. E.g. suppose it's a dessert that's mainly popular at children's birthday parties. Everybody in the country knows that. But where are you going to find something written that says that? Something that's not just a random personal blog, but an actual published valid source?
Ideally you can find some kind of travel guide or book for expats or something with a food section that happens to list it, but if it's not a "top" food highly visible to tourists, then good luck.
I found several that contradicted the claim they were supposed to support (in popular articles). I will never regain faith in Wikipedia. Being an editor, or just verifying information from Wikipedia, makes you hate it.
What Google account? Is it personal Gmail? Or your academic account? Are you using this for personal reasons or professional or commercial reasons? What kind of payment method is attached? What was your level of usage? Any idea why you were suspended initially?
Because it could be that Google is reviewing your appeal and simply shadow-denying it, and you haven't provided the right information to make it look legit. E.g. if they think you're a spammer or mining crypto or they think you're creating additional free accounts to use free credits, they're obviously not going to tell you what makes them think that.
But if this is for university-related work, and your university purchases IT+cloud services from Google (as they probably do), talk to your IT department so they can get you in touch with their institution-level support. Obviously, for the attached Google sales rep, the last thing they want is a CS researcher losing access to GCP.