It's 2:30 AM, so I may forget some details in these steps. I've spent all night researching and testing free tools that could accomplish this objective. Feel free to substitute whatever tools you prefer.
The instructions below are for Windows and Edge. You can use Chrome too.
These are the steps for downloading all the JFK documents. This is the easy part.
-
Create two Windows folders to store the PDFs. Name the first one "JFK Files". Within that folder, create another named "OCR".
-
Install the Chrono Download Manager extension for either Edge or Chrome:
Chrome: https://chromewebstore.google.com/detail/chrono-download-manager/mciiogijehkdemklbdcbfkefimifhecn
-
After the extension is installed, change your browser's download location to the "JFK Files" folder you just created. Chrono Download Manager conveniently pops up a window with a link to change this, along with one other setting.
-
Open the browser and pin the Chrono Sniffer to the toolbar. To do this in Edge, first click the Extensions button to the right of the URL bar. You'll see the Chrono Download Manager extension with three little dots next to it. Click the dots, then click "Pin to toolbar." The Chrono icon will then appear next to the URL bar. This is the Chrono Sniffer.
-
Go to the National Archives page for the JFK files. Where the dropdown says "Show 10 entries," change the dropdown to "Show All Entries." You want all 1,123 PDF links to be visible. Don't worry, the Archives website is FAST.
https://www.archives.gov/research/jfk/release-2025
-
Click the Chrono Sniffer icon, then click the Document tab. You will see a lot of links listed in the sniffer, so below that list, click the PDF filter. Only the PDF links will be highlighted and have a check mark, meaning only those files will be downloaded. Click the Download All button. This will start the download of all 1,123 files. It took me less than 10 minutes to complete all the downloads over a fiber connection.
You can view the download progress in the Chrono dashboard if you want, but you can also view it in your JFK Files folder.
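If you'd rather not install an extension, the same "grab only the PDF links" idea can be sketched in Python. This is a rough sketch, not what Chrono does internally. It assumes you saved the Archives page as HTML after choosing "Show All Entries" and that the saved file contains the PDF links as ordinary `<a href>` tags (I haven't verified the page's markup). The names `PdfLinkParser` and `extract_pdf_links` are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Page URL from the steps above; used to resolve relative links.
PAGE_URL = "https://www.archives.gov/research/jfk/release-2025"


class PdfLinkParser(HTMLParser):
    """Collects the href of every <a> tag that points at a .pdf file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Assumption: PDF links end in ".pdf" (no query strings).
        if href and href.lower().endswith(".pdf"):
            self.links.append(urljoin(self.base_url, href))


def extract_pdf_links(html, base_url=PAGE_URL):
    """Return absolute PDF URLs found in the HTML, in order, without duplicates."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return list(dict.fromkeys(parser.links))


if __name__ == "__main__":
    sample = (
        '<a href="/files/research/jfk/releases/2025/0318/157-10014-10242.pdf">doc</a>'
        '<a href="/research/jfk">not a PDF</a>'
    )
    for url in extract_pdf_links(sample):
        print(url)
```

You could feed the resulting list of URLs to any downloader you like.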
After all the PDFs are downloaded, you'll want to run OCR (Optical Character Recognition) on them so their text is searchable. If you don't have Adobe Acrobat, there's a free tool that can do this in bulk.
-
Go to the PDF24 Creator site and download the latest Windows version (11.23.0 when I installed it). Note that this free application offers capabilities that other developers charge a lot of money for, like batch processing of PDFs.
https://tools.pdf24.org/en/creator
-
When you first launch the tool, you will need to create an account with your email and a password (Booooo!), then register the application using the code sent to your email.
-
After registration, when you see the large menu of PDF options, click PDF OCR.
-
You'll see a page with many lines like a ledger. On the right, change the Output directory to the "OCR" folder you created within the "JFK Files" folder.
-
On the left, you can now click the Add Files button. Add as many PDFs as you want. For now, I'm doing 10 at a time, and tomorrow I'll test 50 at a time. I'm not sure if the software has a limit. Converting all 1,123 files should take a few hours.
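If you work through the files in batches, it helps to know which PDFs still need OCR. Here is a small Python sketch for that. It assumes PDF24 writes each output file into the "OCR" folder under the same file name as the input, which I have not confirmed, so adjust the matching if your output names differ. The function name `pending_batches` is made up for this example.

```python
from pathlib import Path


def pending_batches(source_dir, ocr_dir, batch_size=10):
    """Group PDFs that have no same-named file in ocr_dir into batches.

    Assumption: the OCR tool keeps the original file name for its output.
    """
    # Names already present in the OCR folder (compared case-insensitively).
    done = {p.name.lower() for p in Path(ocr_dir).glob("*.pdf")}
    # Non-recursive glob, so the OCR subfolder itself is not scanned here.
    pending = sorted(
        p for p in Path(source_dir).glob("*.pdf") if p.name.lower() not in done
    )
    return [pending[i:i + batch_size] for i in range(0, len(pending), batch_size)]


if __name__ == "__main__":
    # Folder names from the steps above; change the paths to match your setup.
    batches = pending_batches("JFK Files", "JFK Files/OCR", batch_size=10)
    print(f"{len(batches)} batch(es) still to process")
    if batches:
        for pdf in batches[0]:
            print(pdf.name)
```

Run it between batches and add the listed files in PDF24's Add Files dialog.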
Here's the beautiful part: once you've enhanced all the PDFs with OCR, you can search all of them through Windows Explorer. To do this, click into the OCR folder and enter your search term in the Windows Explorer search field. For example, if I enter "Oswald" in the search field, Windows will list every PDF that contains that word along with some preview text. So your OCR folder is now a database of declassified JFK files.
Alternative for Mac and Linux users:
Here is a script to download all the files using curl (it should work on Linux and Mac as long as curl is installed):
https://pastebin.com/raw/jtNkkNWz
Thank you!
Instructions for those who don't know how to use this:
chmod +x download.sh
./download.sh
And it will download everything.
Thanks fren!
For what it's worth, curl usually comes installed with Windows 10/11 as well.
Yes, it should work if you save it as a batch file, but I wasn't sure whether curl on Windows works exactly the same as on Linux.
I used the PowerShell script Cats5 posted earlier and changed the directory to a Linux path. Then I did "sudo snap install powershell --classic" and ran the PS script inside of Linux. PowerShell is crap and convoluted, but in Linux I just did "powershell scriptname.ps1" and it worked. I was surprised.
I actually like PS on Windows. Even though it's slow, it's very powerful. You can do anything that you can do via the GUI, which makes it very good for scripting. Never knew there was a PowerShell on Linux!
Bless you for this script.
TBH, its creation must have been a lot of tedium...
Not really. I used a macro in Emacs. I have a motto for coding: "If it can't be done with Emacs macros, you'd better be getting paid for doing it!"
As long as you're parsing it out of the HTML, I agree that it wouldn't be too bad. It's just the act of stripping away all the crud to get just the file names, then "for line in file, append the curl phrase". Just tedium...
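For anyone curious, that "for line in file, append the curl phrase" step can be sketched in Python. To be clear, this is not how the pastebin script was made (that was an Emacs macro), just an illustration. The base URL is taken from the PDF link posted elsewhere in this thread, and it is an assumption that every file lives under that same path. `build_download_script` is a made-up name.

```python
import shlex

# Assumption: all PDFs sit under this directory (taken from one link in the thread).
BASE_URL = "https://www.archives.gov/files/research/jfk/releases/2025/0318/"


def build_download_script(filenames, base_url=BASE_URL):
    """Turn a list of PDF file names into the text of a curl download script."""
    lines = ["#!/bin/sh"]
    for name in filenames:
        name = name.strip()
        if not name:
            continue  # skip blank lines left over from the cleanup
        # curl -O saves each file under its remote name.
        lines.append(f"curl -O {shlex.quote(base_url + name)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # One file name per entry, e.g. read from a cleaned-up text file.
    print(build_download_script(["157-10014-10242.pdf"]), end="")
```

Save the output as download.sh and run it the same way as the script above.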
Thanks for this. I spent like 4 hours trying to get wget to work with different options and searching online for which options to use because I kept getting a 404 on accessing the ../2025/0318/ subdirectory for some reason.
I didn't think to use curl cuz I thought if wget couldn't work, curl probably won't work.
Thanks loads. Gonna have hubby do it. I have to vacuum my downstairs tomorrow!
You have the best name on GAW since "Libtards R Stoopid (really stoopid)".
Thank you. I truly believe my name. Was very easy to pick! My hubby got his feelings hurt for about 20 seconds!
Your hubby is the real hero in all this!
Get a robot vacuum and save yourself some free time.
OCR will not work on a significant portion of the files. I took a random sample of a few dozen files, and the text is often blurry, extremely faint, covered in handwritten scribbles, or simply handwritten.
I noticed that too. I intend to use Photoshop filters on some of the blurry ones.
There's a PDF with almost 700 pages that I believe is of great significance and is meant for us to be the detectives. Most of it is newspaper clippings and handwritten notes, so while your method is useful it will miss so much of significance. Much of this will have to be done by hand.
I wish there were some way to crowdsource this, so we could optimize the time spent, reduce redundancy, and pool the work together into an online database. I'm not technologically savvy enough to organize it.
This is the document I mentioned at the beginning: https://www.archives.gov/files/research/jfk/releases/2025/0318/157-10014-10242.pdf
Thank you. It is easier this way.
Yep, this is why old fashioned human intelligence will never be obsolete.
Or maybe just download them all and then torrent them and put the magnet link to the zip file here?
This ^
This isn't anywhere near all of them. They are digitizing and uploading daily for the foreseeable future.
If one person can download and prepare the data properly, then put it on a torrent and people can click and get everything.
That is likely already happening somewhere. We just like to remain ahead of the normie curve.
Thanks for this!!!
u/#YouAreAmazing
This is awesome! Thank you!
Appreciate your hard work, I'll bookmark this post
Me too
I've been trying to download everything with the wget command, but I kept getting a 404 error. I tried many different options and nothing worked, except that at one point wget started downloading the entire website. I had to terminate that before my HDD got full, cuz I'm sure it would have ended up pulling about 50 TB of files from other subdirectories.
Since I'm on Linux, I guess I'll try curl.
But thank you for this. It will be great for others who aren't familiar with wget/curl. I'm a bit miffed that I couldn't get wget to work, though. Or maybe I did it correctly and archives.gov is blocking wget requests?
Can someone who has done this please search for that death certificate/coroner's report that was posted the other day, the one that supposedly stated there was an entry wound from the front?
It doesn't say that, it says "Multiple gunshot wounds of the head and neck" and "Shot by a high powered rifle"
Multiple wounds from a high powered rifle? Hmmmm.....
I thought the common story was that Oswald was able to get two shots that hit JFK, but that people disputed he could have been able to do that with a bolt action rifle in the time required, so multiple gunshots would not be a new revelation?
I'll do that tonight or tomorrow. I dumped all the PDFs into PDF24 at the same time, and it's about halfway finished OCR'ing them.