I remember back when I was in secondary school, something commonly heard was
"Don't just trust Wikipedia, check its sources, because it's crowdsourced and can be wrong".
Now, almost 2 decades later, I rarely hear this stance and I see people relying on Wikipedia as an authoritative source of truth, i.e., linking to Wikipedia instead of the underlying sources.
In the same sense, I can see that "Don't trust LLMs" will slowly fade away and people will blindly trust them.
> "Don't just trust Wikipedia, check its sources, because it's crowdsourced and can be wrong"
This comes from decades of teachers misremembering what the rule was, and eventually it morphed into the Wikipedia-specific form we see today - the actual rule is that you cannot cite an encyclopaedia in an academic paper. Full stop.
Wikipedia is an encyclopaedia and therefore should not be cited.
Wikipedia is the only encyclopaedia most people have used in the last 20 years, therefore Wikipedia = encyclopaedia in most people's minds.
There's nothing wrong with using an encyclopaedia for learning or introducing yourself to a topic (in fact this is what teachers told students to do). And there's nothing specifically wrong about Wikipedia either.
I remember all of our encyclopedias being decades out of date growing up. My parents bought a set of Encyclopaedia Britannica in 1976 or something like that, so by the time I was reading the encyclopedia for research on papers in the late 90s and early 00s, it was without a doubt less factual than even the earliest incarnation of Wikipedia was.
Either way, you are correct, we weren't allowed to cite any encyclopedia, but they were meant to be jumping off points for papers. After Wikipedia launched when I was in 9th grade, we weren't allowed to even look at it (blocked from school computers).
I agree about blocking ChatGPT though. Kids (and most adults, honestly) aren't "smart" enough to understand the limitations, and they trust it, and Wikipedia, without question.
The original rule when I was a lad (when Wikipedia was a baby) was, "don't trust stuff on the internet, especially Wikipedia where people can change it at will."
Today they might have better trust for Wikipedia-- and I know I use it as a source of truth for a lot of things-- but back in my day teachers were of the opinion that it couldn't be trusted. This was for like middle and high school, not college or university, so we would cite encyclopedias and that sort of thing, since we weren't reading cutting edge papers back then (maybe today kids read them, who knows).
Edit: Also, I think the GP comment was proven correct by all of the replies claiming that Wikipedia was never controversial because it was very clear to everyone my age when Wikipedia was created/founded that teachers didn't trust the internet nor Wikipedia at the time.
There was a period of time where Wikipedia was more scrutinized than print encyclopedias because people did not understand the power of having 1000s of experts and the occasional non-expert editing an entry for free instead of underpaying one pseudo-expert. They couldn't comprehend how an open source encyclopedia would even work or trust that humans could effectively collaborate on the task. They imagined that 1000s of self-interested chaos monkeys would spend all of their energy destroying what 2-3 hard working people had spent hours creating instead of the inverse. Humans are very pessimistic about other humans. In my experience when humans are given the choice to cooperate or fight, most choose to cooperate.
All of that said, I trust Wikipedia more than I trust any LLMs but don't rely on either as a final source for understanding complex topics.
> the power of having 1000s of experts and the occasional non-experts editing an entry
When Wikipedia was founded, it was much easier to change articles without notice. There may not have been 1000s of experts at the time, like there are today. There are also other things that Wikipedia does to ensure articles are accurate today that they may not have done or been able to do decades ago.
I am not making a judgment of Wikipedia, I use it quite a bit, I am just stating that it wasn't trusted when it first came out specifically because it could be changed by anyone. No one understood it then, but today I think people understand that it's probably as trustworthy as, or more so than, a traditional encyclopedia is/was.
> In my experience when humans are given the choice to cooperate or fight, most choose to cooperate.
Personally, my opinion of human nature falls somewhere in the middle of those two extremes.
I think when humans are given the choice to cooperate or fight, most choose to order a pizza.
A content creator I used to follow was fond of saying "Chill out, America isn't headed towards another civil war. We're way too fat and lazy for that."
Sure, but I hope you get my point. Fighting takes effort, cooperation takes effort. Most people have other things to worry about and don't care about whatever it is you're fighting or cooperating over. People aren't motivated enough to try and sabotage the Wikipedia articles of others, even if they could automate it. There's just nothing in it for them.
> "They imagined that 1000s of self-interested chaos monkeys would spend all of their energy destroying what 2-3 hard working people had spent hours creating instead of the inverse."
Isn't that exactly what happens on any controversial Wikipedia page?
There aren't that many controversial topics at any given time. One of Wikipedia's solutions was to lock pages until a controversy subsided. Perma-controversy has been managed in other ways, like avoiding the statement of opinion as fact, the use of clear and uncontroversial language, using discussion pages to hash out acceptable and unacceptable content, competent moderators... Rage burns itself out and people get bored with vandalism.
It doesn't always work. There are many topics that are perpetual edit wars because both (multiple) sides see the proliferation of their perspective as a matter of life and death. In many cases, one side is correct in this assessment and the others are delusional, but it's not always easy to align the side that's correct with the people who effectively control the page, because editors indeed do have their own biases (whether because of ideology, a philosophy, a political party, a nation, or whatever else). For those topics, Wikipedia can never be a source of "truth".
More colloquially, people would say that Wikipedia could not be trusted because "anyone can edit the pages or write whatever they want."
Of course that's demonstrative of the genetic fallacy. Anyone can write or publish a book, too. So it always comes down to "how can you trust information?" That's where individual responsibility to think critically comes in. There's not really anything you can do about the fact that a lot of people will choose to not think.
Yeah you weren't allowed to cite encyclopedias when I was a kid because:
1) encyclopedias are a tertiary source. They cite information collected by others. (Primary source: the actual account/document etc, Secondary source: books or articles about the subject, Tertiary source: Summaries of secondary sources.)
2) The purpose of writing a research paper was... doing research, and looking up an entry in an encyclopedia is a very superficial form of research.
Also the overall quality of Wikipedia articles has improved over the years. I remember when it was much more like HHG with random goofy stuff in articles, poor citations, etc. Comparing it to, for instance, Encarta was often fun.
Encyclopedias are tertiary sources, compilations of information generated by others. They are neither sources of first hand information (primary sources) nor original analysis (secondary sources). You can't cite encyclopedias because there's nothing to cite. The encyclopedia was not the first place the claim was made, even if it was the first place you happened to read it. You don't attribute a Wayne Gretzky quote to Michael Scott no matter how clearly he told you Wayne Gretzky said it.
What about scholarly encyclopedias? For example, the Stanford Encyclopedia of Philosophy. The articles are written in the style of a survey article, and if they're merely tertiary, I can't tell. If the intention behind a citation is a reference for a concept (an "existence proof" of it) rather than identifying its source or providing evidence, then a tertiary source such as a textbook seems adequate.
There is some nuance. Wikipedia is a tertiary source for the subjects of its articles. However, it is a primary source for what is on Wikipedia. You can cite an encyclopedia the same way you would cite the dictionary (which is also a tertiary source) as a way of establishing that information is in circulation.
Likewise, primary sources for some claims may be tertiary sources for others. If you read the memoirs of a soldier in WW1 who is comparing his exploits to those of a Roman general from antiquity, he is a primary source for the WW1 history and a tertiary source for the Roman history.
Survey articles and textbooks are generally tertiary. They may include analysis which is secondary and citable, but even then only the parts which are original are citable.
As a more general rule, you can't cite a piece of information from a work which is itself citing that piece of information (or ought to be).
You gave some good context I missed - The (even) more technical (read: pretentious) explanation is that it's a tertiary source. As a general rule of thumb secondary sources are preferred over primary sources, but both are acceptable in the right academic context.
I do understand the "latest version" argument, and it is a weakness, but it's also a double edged sword - it means Wikipedia can also be more up-to-date than (almost) any other source for the information. That's why I say there's "nothing specifically wrong about Wikipedia either": it can be held in similar regard to other tertiary sources and encyclopaedias - with all the problems that come with those.
> (Wikipedia has the additional problem that, by default, the version cited is the ever-changing "latest" version, not a fixed and identified version.)
Only if citing means copying the URL directly. If you use Wikipedia's "Cite this page" or an external reference management tool (e.g. Zotero), the current revision ID will be appended to the URL, so the citation points at a fixed version of the page.
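For anyone who wants to build such a link by hand, here is a rough sketch in Python. It assumes the usual MediaWiki permanent-link form (`index.php?title=...&oldid=...`); the function name `permanent_link` and the revision ID in the example are made up for illustration, so look up the real ID from the page's "Permanent link" or history view.

```python
from urllib.parse import urlencode


def permanent_link(title: str, revision_id: int, lang: str = "en") -> str:
    """Build a URL that pins a Wikipedia article to one specific revision.

    `permanent_link` is a hypothetical helper, not part of any library.
    `revision_id` is the value MediaWiki calls `oldid`.
    """
    # MediaWiki titles conventionally use underscores instead of spaces.
    query = urlencode({"title": title.replace(" ", "_"), "oldid": revision_id})
    return f"https://{lang}.wikipedia.org/w/index.php?{query}"


# 123456789 is a placeholder revision ID, not a real revision of this article.
print(permanent_link("Encyclopedia", 123456789))
```

A link built this way keeps showing the same text even after the article is edited, which is the property a citation needs.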
> Now, almost 2 decades later, I rarely hear this stance and I see people relying on wikipedia as an authoritative source of truth. i.e, linking to wikipedia instead of the underlying sources.
That's a different scenario. You shouldn't _cite wikipedia in a paper_ (instead you should generally use its sources), but it's perfectly fine in most circumstances to link it in the course of an internet argument or whatever.
Well, also years of Wikipedia proving to be more accurate than anything in print, and only rarely (and never for very long) misrepresenting source materials. For LLMs to get that same respect they would have to pull off all of the same reassuring qualities.
There’s also the fact that both Wikipedia and LLMs are non-stationary. The quality of Wikipedia has grown immensely since its inception and LLMs will get more accurate (if not explicitly “smarter”).
I think you would need a complicated set of metrics to claim something like "improved" that wasn't caveated to death. An immediate conflict being total number of articles vs impressions of articles labeled with POV biases. If both go up has the site improved?
I find I trust Wikipedia less these days, though still more than LLM output.
I can't think of a better accidental metric than that!
I'll go ahead and speculate that the number of incoherent sentences per article has gone down substantially over the last decade, probably due to the relevant tooling getting better over the same period.
> I can see that "Don't trust LLMs" will slowly fade away and people will blindly trust them.
That's already happening. I don't even think we had a very long "Don't trust LLMs" phase, if we did it was very short.
The "normies" already trust whatever they spit out. At leadership meetings at my work, if I say anything that goes against the marketing hype for LLMs, such as talking about "Don't trust LLMs", it's met with eye rolls and I'm told I'm not forward-thinking enough, blah blah.
Management-types have 100% bought into the hype and are increasingly difficult to convince otherwise.
I can’t speak to your specific experience, but I do some of this kind of eye-rolling when people bring short term limitations on LLMs into long term strategy.
I’m reminded of when people at work assured me the internet was never going to impact media consumption because 28.8kbps is not nearly enough for video.
Problem is they also included newspapers in authoritative sources - except foreign ones that is - and Wikipedia at least has some kind of peer review process.
It's genuinely as authoritative as most other things called authoritative.
Except when they glaringly get things wrong like "character X on show Y said catchphrase Z", and two queries produce two different values of X, one right, one wrong. The more I use Gemini summaries for things I know a bit about, the worse my opinion of them.
I know you are not serious, but what would constitute an acceptable source?
I could paste its content into an LLM for rephrasing or summarizing or whatever, or just simply ask an LLM about it and put it on my personal website. Would that be an acceptable source?
What even is an acceptable source for such things?
I don't think the cases are really the same. With Wikipedia people have learned to trust that the probability of the information being at least reasonably good is pretty high because there's an editing crucible around it and the ability to correct misinformation surgically. No one can hotpatch an LLM in 5 minutes.
The best LLM powered solutions are as little LLM and as much conventional search engine / semantic database lookups and handcrafted coaxing as possible. But even then, the conversational interface is nice and lets you do less handcrafting in the NLP department.
Using Perplexity or Claude in "please source your answer" mode is much more like a conventional search engine than looking up data embedded in 5 trillion (or whatever) parameters.
A big reason for this is that Wikipedia's source is often a book or a journal article that is either offline or behind an academic paywall. Checking the source is effectively impossible without visiting a college campus's library. The likelihood that the cited information is wrongly summarizing the contents is low enough and the cost is high enough that doing so regularly would be irrational.
A bigger problem in this respect with Wikipedia is it often cites secondary sources hidden behind an academic fire/paywall. It very often cites review articles and some of these aren't necessarily entirely accurate.
It wasn't just Wikipedia, which was a relatively recent addition to the web; everything online was a 'load of rubbish'.
In turn-of-the-century boomer world, reality was what you saw on TV. If you saw something with your own eyes that contradicted the world view presented by the media, then one's eyes were to be disbelieved. The only reputable sources of news were the mainstream media outlets. The only credible history books would be those with reviews from the mainstream media, with anything else just being the 'ramblings of a nutter'.
In short, we built a beautiful post-truth world and now we are set on outsourcing our critical thinking to LLMs.