
I thought the declassification period for US government documents was 25 years. Or I guess there's an exemption when it comes to military intelligence? I wonder where and when the data was collected.



What do you mean by "where and when the data was collected"? As the article says, the photos are from the 60s and 70s in Iraq and Syria.

The source of the photos was the CORONA and HEXAGON satellites: https://www.cambridge.org/core/journals/antiquity/article/wa...

The journal article cites its sources which you can use to understand more: https://www.tandfonline.com/doi/full/10.1080/00934690.2020.1...

The data has been available for quite a while, but available is different from usable:

> While challenges involved in spatially correcting these unusual panoramic film images has long served as a stumbling block to researchers, an online tool called “Sunspot” now offers a straightforward process for efficient and accurate orthorectification of CORONA, helping to unlock the potential of this historical imagery for global-scale archaeological prospection. With these new opportunities come significant new challenges in how best to search through large imagery datasets like that offered by CORONA.


You're right, I should've looked more closely, but I was just wondering about what it said towards the end of the article about U-2 spy photographs and what's being declassified.


> After 25 years, declassification review is automatic with nine narrow exceptions that allow information to remain as classified. At 50 years, there are two exceptions, and classifications beyond 75 years require special permission.[0]

The nine exemptions can be found at the Justice Department's website[1].

[0] https://en.wikipedia.org/wiki/Declassification [1] https://www.justice.gov/archives/open/declassification/decla...


The journal article is much more interesting than the Guardian article, IMO. Thank you for sharing the link.


Why HN users insist on submitting Guardian articles is beyond me.


The number of available documents has skyrocketed in the recent past, especially for present-day history, and they're not always easily usable. For instance, if you're interested in the Stalin administration, there are millions of orders, notes, studies and transmissions stored in boxes somewhere. If you were a historian of that period studying new documents, a lifetime would only let you see a tiny fraction of the existing sources.

Remember those movies where a small-firm lawyer is hammered with tons of document boxes in a discovery process against a big corporation? Well, historians are like that, but they have less money and they don't know how many boxes there are. Also, they have to look for the boxes themselves rather than having them delivered to their office.

In older, well-studied fields there are few boxes, they are already referenced, and historians have a chance to see everything over their career. In more recent, less studied fields, there are countless unopened boxes.


I'm not a big proponent of LLM proliferation, but I was thinking that mass review of tons of scanned documents might be exactly the sort of thing they're really useful for. Given an AI that hasn't been ruthlessly tuned to be as politically neutral as possible, you could have a huge database and query it in plain English like "were there any documents that made overt reference to extremely corrupt behavior?"


People with the know-how to do this kind of stuff are mostly busy trading eyeballs or stock, and college history departments are not exactly rolling in it.

Still, there is an effort to make these collections more easily available. For instance, in the case of Soviet archives, [1] describes the work done and the conditions of access. That work is far from exhaustive though, and a large part of the material still has to be worked through the slow way, or requires special requests in order to be accessed.

[1]: https://www.ucl.ac.uk/ceelbas/state-archive-russian-federati...


To answer a query, your LLM needs to "read" the documents first. The context window will not be big enough for this, so you have to fine-tune the model.

Problem is, you need to cross-check with the reference material in case it's subject to hallucinations.


Oh, I was thinking that the cross-checking is the point. You'd use the LLM as a "hazily thinking search function" to narrow your examination of old documents, not as a replacement for reading the documents.

I don't know what to do about the context window, though.


I don't understand, can't you feed it one page at a time and ask it "is there relevant information here?"
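A minimal sketch of that page-at-a-time idea, assuming you already have the OCR'd text of each page. `ask_model` is a hypothetical stand-in for whatever LLM call you use (it is not a real library function), and `keyword_stub` is a fake model that only exists so the example runs offline. The output is a list of page numbers for a human to go read, so the cross-checking still happens against the original documents.

```python
from typing import Callable, List, Sequence


def filter_pages(
    pages: Sequence[str],
    question: str,
    ask_model: Callable[[str], str],
) -> List[int]:
    """Return the 1-based numbers of pages the model flags as relevant."""
    hits = []
    for number, text in enumerate(pages, start=1):
        # One page per prompt, so no single request outgrows the context window.
        prompt = (
            f"Question: {question}\n\n"
            f"Page text:\n{text}\n\n"
            "Is there relevant information here? Answer YES or NO."
        )
        answer = ask_model(prompt)
        if answer.strip().upper().startswith("YES"):
            hits.append(number)
    return hits


def keyword_stub(prompt: str) -> str:
    """Hypothetical fake model: says YES if the page mentions a bribe."""
    page_text = prompt.split("Page text:\n", 1)[1]
    return "YES" if "bribe" in page_text.lower() else "NO"


sample_pages = [
    "Grain quota report for the district.",
    "The official accepted a bribe from the supplier.",
    "Minutes of a routine meeting.",
]
flagged = filter_pages(sample_pages, "Any sign of corruption?", keyword_stub)
```

The obvious costs are one model call per page and no awareness of context that spans pages, so treat it as a coarse first pass.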


Or load it all into a RAG system. Give it a few months and it'll be something you can buy off the shelf.
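Roughly, the retrieval half of a RAG setup looks like the toy below: score every document against the query, keep the top few, and paste only those into the prompt. This is a sketch under loud assumptions: real systems use learned embeddings and a vector index, while this one uses plain word counts and cosine similarity so it runs with the standard library alone. All function names here are made up for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Sequence


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: Sequence[str], k: int = 2) -> List[int]:
    """Return indices of the k most similar documents, best first."""
    q = embed(query)
    scored = sorted(
        ((cosine(q, embed(doc)), i) for i, doc in enumerate(documents)),
        key=lambda pair: (-pair[0], pair[1]),
    )
    # Drop documents that share no words with the query at all.
    return [i for score, i in scored[:k] if score > 0]


def build_prompt(query: str, documents: Sequence[str], k: int = 2) -> str:
    """Assemble a prompt holding only the retrieved excerpts plus the question."""
    excerpts = "\n---\n".join(documents[i] for i in retrieve(query, documents, k))
    return f"Answer using only these excerpts:\n{excerpts}\n\nQuestion: {query}"


docs = [
    "satellite photos of forts in Syria",
    "minutes of a budget meeting",
    "declassified satellite imagery from the CORONA program",
]
top = retrieve("satellite imagery", docs, k=2)
prompt = build_prompt("satellite imagery", docs, k=2)
```

Only the retrieved excerpts go to the model, which is how this sidesteps the context window, and the returned indices tell you which originals to check the answer against.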


Maybe it's just analog-to-digital conversion. Some material only gets used for research after a digitization project, since it's not really searchable on a more global scale otherwise. Could be completely wrong here, of course.


> 25 years

If that was true we'd have all the Kennedy assassination docs.


Not that you're incorrect but in addition to withholding documentation, sometimes they just destroy it, too.

https://nsarchive2.gwu.edu/nsa/DOCUMENT/940228.htm

"In August 1974, the Joint Chiefs of Staff destroyed all the minutes and transcripts of their meetings going back to 1947, and in 1978 essentially stopped keeping any such records. Only 30 pages of notes have survived, much to the dismay of military historians and scholars of the Cold War."

So, like, we will never get the discussions around, say, them using smallpox against North Korea.


The outgoing Nixon administration had to cover their tracks for some reason.


They're probably exempted under:

> 25X7 – reveal information that would impair the current ability of U.S. government officials to protect the President, Vice President, and other protectees for whom protection services, in the interest of national security, are authorized;


Like that scene from the JFK movie where Donald Sutherland's character is going through a long check list of all the things the secret service would have done.

Like, snipers on roof tops, planning the route so there's no slow downs, etc.


Which you have to trust them about, because nobody can verify that.

Basically back to step one: they tell you what they want.


Maybe it's this one:

> (6) reveal information, including foreign government information, that would cause serious harm to relations between the United States and a foreign government, or to ongoing diplomatic activities of the United States;



