Hacker News
Perl articles are being memory wiped from Wikipedia (reddit.com)
43 points by leejo 22 days ago | 51 comments


The new rule of notability: if it’s no longer in Google’s index, it basically doesn't meet Wikipedia's notability criteria

https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Deletion...

"From a Google search, I wasn’t able to find" appears multiple times on that page alone.


The relevant part is before that:

> This article is exclusively sourced on primary sources.

The Google search is the nominator looking for an alternative source that could make it notable, something earlier editors failed to establish.


This « rule » is infuriating. Google searches are tailored to serve us content that might interest us. In this case, the first page of Google search returns plenty of notable results for me. That might not be the case for a person interested in geology and dogs, though.

How could such a biased thing be a valid Wikipedia criterion?


In short, Google decides what stays in Wikipedia.

Neat. Not.


Not really, if that thing is cited in notable papers or books, it stays too.


Except if Google decides otherwise.

And then Wikipedia follows suit.


I wasn't defending Wikipedia or engaging in a penis fight on the internet for no reason. I added context because it seemed you misunderstood this specific Wikipedia rule. Considering how cryptic Wikipedia is, and how often I myself have misunderstood rules on Wikipedia (or Stack Overflow), or even in general, I thought it was the same for you and that adding more information would clear things up.

If your original post comes not from a misunderstanding but from some culture war bullshit or whatever, my bad, probably, but I'd rather you go on Reddit or somewhere else. I'll probably still read you, but I'll assume it's culture war or ragebait and leave you alone.


No.

The Google search wouldn't even have happened had the article had sources listed for the claim.


Thus Google gets the final word on whether an article is deleted.


No. The author gets the final word, by including citations as they should.


"Self-Organizing Social Learning Through the Monastery Gates" ( Rose M. Baker & David L. Passmore : The Pennsylvania State University ; 2005 )

"Abstract

An example of an emergent, self-organizing on-line social learning system is available at the PerlMonks site at http://perlmonks.org/. Perl is a scripting language commonly used to as an interface between databases and web pages. Provided in this paper is a review of principles of emergent, self-organizing systems from a perspective of learning systems as well as case study of PerlMonks as self-organizing eLearning."

PDF: https://www.researchgate.net/profile/Rose-Baker/publication/...

via google scholar:

https://scholar.google.com/scholar?q=%22perlmonks.org%22


The reasons for deletion don't seem that outlandish to me. I'd rather not see them deleted, but I also don't think this outcome is that surprising, nor would I describe it as a "memory wipe."


The CPAN page on Wikipedia has existed for 24 years, has dozens of references, yet an editor nominated it for deletion - I can't help but feel that as hostile. Fortunately this one has been voted "keep", but still...


The person who nominated it for deletion changed their opinion after suitable sources were found, and the article was thus kept within a day. That hardly seems hostile to me, but rather just someone trying to uphold Wikipedia's sourcing and notability requirements.


I'm sorry, but I just don't believe that. As stated below in several other comments, none of this makes sense and the Wikipedia editors hiding behind "this is the policy, you do not get to question it" stinks.

The original user withdrew their deletion suggestion and added the "This article relies excessively on references to primary sources. Please improve this article by adding secondary or tertiary sources." banner, sure. Why didn't they just do that in the first place?

Instead they looked at an article that had existed for twenty years, with a comprehensive history of changes, had lots of information, links, and [albeit primary] sources; they did some cursory Googling, then suggested it for deletion - with a deadline of 7 days: https://en.wikipedia.org/w/index.php?title=CPAN&diff=1327587...

Wikipedia literally has its own page to suggest that you don't do that: https://en.wikipedia.org/wiki/Wikipedia:Chesterton%27s_fence...

Wikipedia's own policies around deletion mean it's easy to delete articles you don't particularly like - if they are old enough they probably lack secondary sources. You can't inform users who would be able to contribute off-Wikipedia: https://en.wikipedia.org/wiki/Wikipedia:Canvassing#Stealth_c... which means it's unlikely they will be updated before the deadline passes. Many of these articles were contributed by people who have long moved on, and few of us are paying attention to every possible thing on Wikipedia. Twenty years of history deleted in a week. That's wrong.

This feels like the actions of a newly promoted editor, inexperienced, and eager to start "cleaning up" Wikipedia, which they are instead damaging. It also feels like the actions of an editor who, when editing another article, saw that the thing they were adding didn't point to what they expected on Wikipedia: https://en.wikipedia.org/w/index.php?title=White_Camel_award... # instead of adding a page to disambiguate, they decided to go on a crusade to purge articles that had existed for twenty years. And because these were mostly articles that predate Wikipedia's sourcing policies, they knew it was likely they would succeed.

As I've stated in one of the talk threads: https://en.wikipedia.org/wiki/User_talk:Left_guide#c-Leejeba... # I'm not particularly concerned about the restoration of some of the articles, instead I'm more concerned about the blunt application of policies that means important reference, history, and culture are being deleted.


> Why didn't they just do that in the first place?

Because they didn't find any. If secondary sources don't seem to exist, then there's no hope to begin with. Someone else found them, and that instantly made it clear that it's fixable.

> they did some cursory Googling, then suggested it for deletion - with a deadline of 7 days:

I fail to see the problem here, unless you have a problem with only getting 7 days. That's policy as far as I know,[1] and it would be nice to at least get two weeks, but I can't blame the individual proposer.

[1] https://en.wikipedia.org/wiki/Wikipedia:Deletion_policy#Prop...

Chesterton's fence goes both ways. Wikipedia policies are there for non-obvious reasons in some cases.

> You can't inform users who would be able to contribute off-Wikipedia:

Canvassing is one thing. Productively finding sources for an article is another. Wikipedia has had many a talk page drown in off-wiki people who come in and make an account to prevent something from happening, none of them understanding why the thing is happening, none of them understanding Wikipedia policy, and none of them caring. Inviting these people to discussions is not productive and just a bad time for everyone.

If you invite one or two people to actually improve the article, so it can survive on its own merit, I can't imagine anyone having a problem with that. The equivalent on-wiki thing of just pinging relevant editors is common and encouraged.

> This feels like the actions of a newly promoted editor, inexperienced, and eager to start "cleaning up" Wikipedia,

The user in question has no special user rights (they were automatically updated to an extended confirmed user over a year ago), and has a few decently long articles under their belt.

> It also feels like the actions of an editor who, when editing another article, saw that the thing they were adding didn't point to what they expected on Wikipedia:

> instead of adding a page to disambiguate, they decided to go on a crusade to purge articles that had existed for twenty years. And because these were mostly articles that predate Wikipedia's sourcing policies, they knew it was likely they would succeed.

This is a rather unfavourable view of the situation, and not really one made in good faith. I can agree that that article shouldn't have just been turned into a redirect, but that redirect was made by a different user to the one who's been nominating articles for deletion, and I can't see any obvious connection.

Articles being older than the sourcing requirements also do not exclude them from those requirements. They usually get a break because of that, but it's been well over 10 years by this point.

> As I've stated in one of the talk threads: https://en.wikipedia.org/wiki/User_talk:Left_guide#c-Leejeba... # I'm not particularly concerned about the restoration of some of the articles, instead I'm more concerned about the blunt application of policies that means important reference, history, and culture are being deleted.

Owen, who responded to your post, makes a good point. I'd argue that if Wikipedia deleting an article about something amounts to the deletion or destruction of history, reference or culture, then the thing in question probably has some notability problems. Wikipedia makes it easier to find information about a particular topic, but can't be the only reasonable source of things. There have to at least have been reliable sources for the article to have been based on, even if they aren't easily available at this point.


You're basically reinforcing my arguments - these are the policies, deal with it.

I believe the 7 day deadline to avoid the deletion of 20+ years of history is destructive because most of the people that would be notified of this have long since moved on, no longer care, or are off Wikipedia.

The cursory Googling by those who have the power to delete is also concerning. As stated elsewhere in this discussion, Google hasn't been great for search for a long time.


The policies are there for good reason most of the time, and rarely without there having been a lot of talk about what said policy should be. I found them very helpful during my time editing, since they accurately reflect what happens and why, with the whole process being transparent. Maybe I'm just biased.

Google isn't the end-all-be-all of sourcing, as has been shown by the articles that have been kept. If you can find reliable sources, it will be kept. Google is just the final nail in the coffin.


There it is, right? Seven days and twenty years, gone. To quote, it is "the slow decline, the emptying out, and the long, long process of forgetting".

Wikipedia's deletion proposals are the online equivalent of putting a small poster on a village noticeboard and being surprised that the entire world doesn't see it.

It's disgraceful.


It isn't Wikipedia's job nor mission to remember. The Internet Archive took on that mission. Hence why you can still find the article there. The article isn't gone. It's a bit less accessible. I love them both, but they work in very different ways.


Deleted pages are no longer accessible, meaning the history of changes is gone. “Memory wipe” seems reasonable.


The changelog is still there on the servers and can be accessed by the Wikipedia administrators. The page can also be restored with its full changelog, although I don't think that's done very often.


It makes it very hard to restart a new article. Why start from scratch when we could reuse and improve the old article? This is discouraging.

I am moderately tech-savvy and have had a Wikipedia account for years. But going through the Wikipedia deletion-review bureaucracy is a lot of work. Quite honestly, I looked at the process and it looks so complicated that I think I would rather write a brand new article.


Articles are usually deleted for good reasons, so it's usually discouraged to do this for those same reasons. If it's just due to notability, you could probably ask an admin to give you a hand and give you the text of the old revision in the draft space, although I've never seen it done. It's usually a better bet to start from a blank slate, since that doesn't carry with it the smell of a previously deleted article, even if that deletion might not have been made with good reason.

> But going through the Wikipedia deletion-review bureaucracy is a lot of work. Quite honestly, I looked at the process and it looks so complicated that I think I would rather write a brand new article.

The new article part of that is probably somewhat intended behaviour. The deletion-review process isn't as bad as it seems from all the pages. They're just very verbose just to have everything documented. People are usually nice enough to point in the right direction if something is amiss. They just want things done correctly and will guide the process thusly.


So this is about PerlMonks, which I knew nothing about until today.

I searched for it: the site is down and the Wikipedia article is deleted.

This is pure loss of information somehow.

I, and a lot of other people in the future, will never know what "perlmonks" is or was, how important it was, etc.

The logic seems to be: if tomorrow Stack Exchange disappears, the Wikipedia article will be deleted? If yes, then that makes zero sense.


That article wouldn't be deleted, because I can find 20 or so references in paper publications saying things like "The Stack Exchange family of websites is a Q&A service for developers and other technical roles. If you’re stuck on a problem, the chances are someone else was too and turned to Stack Exchange for help", or in some cases doing a quick bio of Jeff Atwood or mentioning codinghorror.

Hmm, OTOH I can also find multiple paper references to perlmonks, such as "Perlmonks is a web bulletin board dedicated to Perl. It’s not specifically a help desk, but if you’ve done your homework and ask a good question, you’re likely to get top-notch help very quickly" - that's from the O'Reilly Perl book. Sometimes I'm overoptimistic about these things, because I want to keep every obscure article.

Well, Perlmonks is still mentioned on the articles for Perl, Outline of the Perl programming language, Perl language structure, and Perl Foundation. (This is because deletionists are lazy and don't actually like doing a thorough job.) So I could see Perlmonks becoming a redirect to one of those pages, which could describe it in a section. Similarly, if Stack Exchange faded into obscurity, it might be rolled into a section of Jeff Atwood's page (or vice versa).


I'll never understand the amount of vitriol Wikipedia volunteers must receive. Why is the deletion (or even deletion proposal) regarded as such a heinous act that people feel the need to attack and bully others?

I find this kind of behaviour and rhetoric wholly unacceptable.


> Why is the deletion (or even deletion proposal) regarded as such a heinous act that people feel the need to attack and bully others?

FWIW I don't see this as an attack (with, perhaps, the exception of a couple of comments in the linked thread) and posted the link to the reddit thread as I see it more as an interesting observation around the myriad issues facing "legacy" languages and communities. To wit:

* Google appears to be canon for finding secondary sources, according to the various arguments in the deletion proposals, yet we're all aware of how abysmal Google's search has been for a while now.

* What's the future of this policy given the fractured nature of the web these days, walled gardens, and now LLMs?

* An article's history appears to be irrelevant in the deletion discussion: the CPAN page (now kept) had 24 years of history on Wikipedia, with dozens of sources, yet was nominated for deletion.

* Link rot is pervasive, we all knew this, but just how much of Wikipedia is being held up by the waybackmachine?

* Doesn't this become a negative feedback cycle? Few sources exist, therefore we remove sources, therefore fewer sources exist.


> Google appears to be canon for finding secondary sources, according to the various arguments in the deletion proposals, yet we're all aware of how abysmal Google's search has been for a while now.

Nobody is forcing you to use Google. If you can provide an acceptable source without the help of Google, go ahead. But the burden of proof is on the one who claims sources exist.

> An article's history appears to be irrelevant in the deletion discussion: the CPAN page (now kept) had 24 years of history on Wikipedia, with dozens of sources, yet was nominated for deletion.

Such is life when anyone can nominate anything at any moment... and when many articles that should have never been submitted in the first place slip through the cracks of haphazard volunteer quality control. (Stack Overflow also suffers from the latter.)

The sources are the only part that matters. And they sufficed to keep the CPAN article on site, so the system works.

> Doesn't this become a negative feedback cycle? Few sources exist, therefore we remove sources, therefore fewer sources exist.

It was wrong to submit the article without sourcing in the first place. Circular sourcing is not allowed.


> The sources are the only part that matters. And they sufficed to keep the CPAN article on site, so the system works.

The system works if the sources remain available, and in an environment predisposed to link rot that can be a problem. Imagine the hypothetical situation of archive.org disappearing overnight? Should we then delete all pages with it as their sole source if they're not updated within a week?

And the system works if intentions are pure - it seems here the user that suggested the deletion of several Perl related pages is a fan of film festivals[1] and clearly wasn't happy that the "White Camel Award" is a Perl award, since the late 90s, and not a film festival award (since the early 00s). At least according to Google. So they went on a bit of a rampage against Perl articles on Wikipedia.

You could argue "editor doing their job", but I would argue "conflict of interest".

[1]: https://en.wikipedia.org/w/index.php?title=Sahara_Internatio... # amongst many in their edit history


These are all bad-faith takes. What are you doing?

24 years ago, some people wrote on Wikipedia instead of elsewhere. So the wiki page itself became a primary source.

"The page shouldn't have been submitted..." This was a Wiki! If you're unfamiliar with the origin of the term, it was a site mechanism designed to lean in to quick capture and interweaving of documents. Volunteers wrote; the organization of the text arose through thousands of hands shaping it. Most of them were software developers at the time. At a minimum, the software-oriented pages should get special treatment for that alone.

You're acting as though this is producing the next edition of Encyclopedia Britannica, held to a pale imitation of its standards circa the 1980s. The thing is, Britannica employed people to go do research for its articles.

Wikipedia is not Britannica, and this retroactive "shame on them" is unbelievable nonsense.


Verifiability is a core policy on Wikipedia, and with time, citing your sources has become more and more important. Wikipedia isn't what it once was in 2001. Articles can't survive on being verified by their own primary sources, for the same reason we don't want Wikipedia to become a dumping ground for advertisers who then cite their own site in an attempt to gain legitimacy. Secondary sources provide a solid ground truth that the subject in question has gained recognition and thus notability. If those secondary sources don't exist, we can't assume notability based on nothing.

Wikipedia isn't Britannica, because by this point it's probably a lot better than Britannica. They were comparable already in 2005,[1] and I have little reason to believe that Wikipedia is doing much worse on that front nowadays, even though they have vastly more content than Britannica.

[1] https://www.cnet.com/tech/tech-industry/study-wikipedia-as-a...


Some of the deleted pages never had the « sources missing » tag set for any significant time. They went straight to deletion.

Some pages that survived deletion (e.g. TPRF) had had the « missing sources » tag set for 15 years… which, I have to admit, can justify some action. But that was not the case for the PerlMonks and Perl Mongers pages: those just got deleted on extremely short notice, making it impossible for the community to attempt any improvement.


7 days is policy for a deletion proposal,[1] which I can agree is not really enough time, although it's usually extended if talks are still ongoing.

There aren't really any rules about putting up notices and such before proposing deletion, and if you can't find anything other than primary sources, it doesn't seem unreasonable to propose deletion rather than propose a fix which can't be implemented. Thankfully, someone did find reliable sources for some of the articles.

[1] https://en.wikipedia.org/wiki/Wikipedia:Deletion_policy#Prop...


> If you can provide an acceptable source....

https://arstechnica.com/gadgets/2021/08/the-perl-foundation-...

https://www.theregister.com/2021/04/13/perl_dev_quits/

20 seconds.

If I ran Wikipedia I would ban everyone involved in this spectacle.


> And they sufficed to keep the CPAN article on site, so the system works.

This is such an absurd take. “In this one example the system worked so clearly it’s fine.”


People get extremely frustrated and upset about arbitrary rules, especially when they are imposed inconsistently.

From the talk page it seems like exactly three people were involved in deciding if this was worth deleting and they indicated they could not find evidence of notability. Meanwhile I found a Register article about PerlMonks in minutes and there are pointers here to Google Scholar references as well.

When the bar for deletion is “a couple of people who didn’t try very hard didn’t find notability” is it any wonder that there’s pushback? This feels entirely arbitrary.


Consider the other perspective: how should Perl programmers feel when Google's index becomes the main criterion for what is considered important or not? This creates a circular dependency that can erase genuine technical contributions from the historical record.


Google's index is tailored to each individual. People with an interest in breeding cats won’t be served Perl results.

If Google's index becomes a criterion of notability, we are in deep, deep shit.


Because it puts the history of the article behind a lock

I wonder if there are any privileged Wikipedia accounts who have defected and are doing a sci-hub thing.


> Why is the deletion (or even deletion proposal) regarded as such a heinous act

"Those who control the past, control the future"



Wikipedia has a page for an Egyptian King that ruled for perhaps only 10 years 5000 years ago. https://en.wikipedia.org/wiki/Anedjib

Why is that still relevant?

Or to put it another way when does the contemporary move into interesting history?


When did the Perl Monks run a kingdom?

Apples and oranges.


It’s not a kingdom, but a monastery, and that’s exactly what the Wikipedia article explained.

https://www.perlmonks.com/?node_id=3559


Or more importantly, when did the Perl Monks build a pyramid? Plenty of rulers are long forgotten.


A building that wasn't a pyramid, but had simple step-sided walls, is only significant in retrospect!

I.e., it only became of historic interest after the fact, as people retrospectively thought it might have influenced later, more significant buildings.

While I agree a page about Perl Monks isn't likely to be that significant, I was making a general philosophical point.

E.g., how do you know that PerlMonks doesn't end up being one of the earliest examples of self-organising e-learning, a movement that ends up replacing universities in the future?

In terms of the details of this page, leejo's posts are more substantive.


The deletion proposals do not mention "interesting" anywhere.


Correct, the cited factor is lack of significant coverage.


The PerlMonks page was in death as it was in life: completely unreadable.


Still active and relevant for some people from those communities, though. Not to mention the historic value.



