The President John F. Kennedy Assassination Records Collection
This webpage was created in response to Executive Order 14176, titled “Declassificatio...
The PDF files aren't in text format, so the text isn't easily digestible. There are newspaper clippings, handwritten notes, and some crooked scanned pages. They need to go through OCR and possibly manual entry.
Thank you for this clarification.
I'm gonna feed them into my Paperless-ngx setup (https://docs.paperless-ngx.com/) once I have all the docs downloaded. It obviously won't be perfect, because a lot of these docs are copies of copies of copies and are barely readable by the naked eye, but it will help. If you haven't heard of Paperless-ngx and know how to do things like run Linux or use Docker, I HIGHLY recommend trying it out. I scan pretty much everything now. It was a HUGE help when I was fighting to get my mom on Medicaid last year before she died. I was able to scan in a ton of stuff and search for terms to find things like her pension info. If I need a copy of my DD-214, I just search my DB for DD-214 and it spits it right out. I fed it all of the txt, pdf, doc, and xls files (and their variants) from my NAS and it pulled all of them in.
You can also create tags that can "learn", so if I tag a doc as govt, medical, Mom, and signed it will analyze it and mark other similar docs with the same tags. Then I can search for tags instead of just terms. Worked great for my taxes last year. Pro tip - I have tags for the year (like 2025) as well.
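For anyone who wants to script the kind of lookups described above, here's a rough sketch of how it could look against the Paperless-ngx REST API. It assumes the documented full-text search endpoint (`/api/documents/?query=...`) and token authentication. The base URL, the token, and the function names `build_search_url` and `search_documents` are placeholders made up for illustration, and the `tag:govt` query syntax is my understanding of the advanced search, so check it against your own version.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url: str, query: str) -> str:
    """Build the full-text search URL for a Paperless-ngx instance.

    base_url is a placeholder such as "http://localhost:8000".
    """
    params = urllib.parse.urlencode({"query": query})
    return f"{base_url.rstrip('/')}/api/documents/?{params}"


def search_documents(base_url: str, token: str, query: str) -> list[str]:
    """Run the search and return document titles from the first results page.

    Assumes token auth is enabled and that the response is the usual
    paginated JSON object with a "results" list of documents.
    """
    request = urllib.request.Request(
        build_search_url(base_url, query),
        headers={
            "Authorization": f"Token {token}",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [doc["title"] for doc in payload["results"]]


if __name__ == "__main__":
    # Only builds URLs; nothing is sent over the network here.
    print(build_search_url("http://localhost:8000", "DD-214"))
    print(build_search_url("http://localhost:8000", "tag:govt"))
```

Calling `search_documents("http://localhost:8000", "<your token>", "DD-214")` would then hit your own instance. It only reads the first page of results, so a big archive would need to follow the `next` link in the response.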
Really cool. It's out of my league, but maybe one day I'll learn how.
GPT can make sense of text and images regardless of the format