
> I wonder how big a content snapshot is, ie no article histories and no meta-material like talk pages or WP:xxx pages, just the user-facing content

I don't know how big it is uncompressed, but they do have a dump of just that part:

  2010-03-16 08:44:40 done Articles, templates, image descriptions, and primary meta-pages.
  2010-03-16 08:44:40: enwiki 9654328 pages (255.402/sec), 9654328 revs (255.402/sec), 82.9% prefetched, ETA 2010-03-17 03:08:26 [max 26568677]
  This contains current versions of article content, and is the archive most mirror sites will probably want.
  pages-articles.xml.bz2 5.7 GB
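Since the dump is a bz2-compressed XML file, you can stream through it without ever decompressing the full 5.7 GB to disk. A minimal sketch (the real dump is far too large to demo, so this compresses a tiny hand-written sample with hypothetical titles in memory; for the real file, pass the filename to `bz2.open` instead — and note the real dump's root element carries an XML namespace, so tag matching needs to account for that):

```python
import bz2
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for pages-articles.xml, compressed the same way as the dump.
sample_xml = b"""<mediawiki>
  <page><title>Alpha</title><revision><text>First article.</text></revision></page>
  <page><title>Beta</title><revision><text>Second article.</text></revision></page>
</mediawiki>"""
data = io.BytesIO(bz2.compress(sample_xml))

titles = []
# bz2.open accepts a filename or an existing file object; for the real
# dump you would use bz2.open("pages-articles.xml.bz2").
with bz2.open(data) as stream:
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "page":
            titles.append(elem.findtext("title"))
            elem.clear()  # free parsed elements as we go; essential at 5.7 GB

print(titles)  # ['Alpha', 'Beta']
```

The same pattern (iterparse plus `elem.clear()`) keeps memory flat no matter how many pages the dump contains, which is what makes offline mirrors from this archive practical on modest hardware.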


Well spotted. This has great possibilities for education in the 3rd world.


Perhaps this is a good time to point to this?

http://thewikireader.com/index.html



