Source: House Oversight Committee
The above release consisted of JPG images embedded in PDF files. Why? I have no idea, but it
makes it very difficult to just browse through the photos. I've extracted the images with
a small Python script and placed them in a gallery, which
is far easier to skim through.
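For the curious, here is a minimal sketch of how that kind of extraction can work. It is an illustration of the idea, not necessarily the script used for the gallery. It assumes the JPGs are stored as raw, unencrypted JPEG streams inside the PDF (the usual case for scanned photos), and it relies only on the standard library. The names `extract_jpegs` and `extract_folder`, and the `*.pdf` glob and output naming, are made up for this example.

```python
import re
from pathlib import Path

# Raw bytes between each "stream" / "endstream" keyword pair.
# The lookbehind keeps the "stream" inside "endstream" from starting a match.
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)
JPEG_MAGIC = b"\xff\xd8\xff"  # every JPEG file starts with these bytes


def extract_jpegs(pdf_bytes: bytes) -> list[bytes]:
    """Return every stream body that starts with the JPEG signature."""
    images = []
    for match in STREAM_RE.finditer(pdf_bytes):
        body = match.group(1)
        if body.startswith(JPEG_MAGIC):
            images.append(body)
    return images


def extract_folder(pdf_dir: Path, out_dir: Path) -> int:
    """Write each embedded JPEG to out_dir; return how many were written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        jpegs = extract_jpegs(pdf_path.read_bytes())
        for i, jpeg in enumerate(jpegs, start=1):
            # Hypothetical naming scheme: <pdf name>_<image index>.jpg
            (out_dir / f"{pdf_path.stem}_{i:03d}.jpg").write_bytes(jpeg)
            count += 1
    return count
```

This byte-scanning shortcut will miss images stored with other compression filters or inside encrypted PDFs. A real PDF library is the safer choice if the files turn out to be more varied.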
The original source of these came in four data sets from the Justice Department. They are:
Data Set 1,
Data Set 2,
Data Set 3,
Data Set 4
Same drill as the last release: I've extracted the images from the PDF files so they're
easier to browse.
The original source of these came in four data sets from the Justice Department. They are:
Data Set 5,
Data Set 6,
Data Set 7,
Data Set 8
⚠️
From this point forward, mirroring the files got difficult. The site added bot detection and some cookie + JavaScript magic that requires you to verify that you are human every hour. When that check lapsed, downloads LOOKED successful, but the saved files were actually error pages. Instead of using my normal mirroring scripts, I had to do a combination of manual and automated work. This led to some duplicate files (which I've written scripts to weed out), and it may also have led to gaps. Since the numeric filenames already have gaps in their numbering, it's impossible to tell where any gaps from missed downloads may be. As such, this is a best-effort mirror. I'd guess 99%+ of it is there, but a few pages may have been missed.
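The two cleanup checks described above (spotting error pages saved under a PDF filename, and weeding out duplicates) can be sketched roughly like this. It is an assumption-laden illustration, not the actual cleanup scripts: it treats any file that does not start with the `%PDF-` signature as a suspect download, and treats files with identical SHA-256 hashes as duplicates. The function names and the `*.pdf` glob are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def looks_like_pdf(data: bytes) -> bool:
    """True if the payload starts with the PDF signature rather than, say, HTML."""
    return data.startswith(b"%PDF-")


def audit(items: Iterable[tuple[str, bytes]]) -> tuple[list[str], list[str]]:
    """Return (suspects, duplicates) as lists of names.

    suspects   -> names whose content is not a PDF (likely a saved error page)
    duplicates -> names whose content matches an earlier, valid file
    """
    suspects: list[str] = []
    duplicates: list[str] = []
    seen: dict[str, str] = {}  # sha256 digest -> first name seen with it
    for name, data in items:
        if not looks_like_pdf(data):
            suspects.append(name)
            continue
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            duplicates.append(name)
        else:
            seen[digest] = name
    return suspects, duplicates


def audit_folder(folder: Path) -> tuple[list[str], list[str]]:
    """Run audit() over every *.pdf file in a folder, in sorted name order."""
    return audit((p.name, p.read_bytes()) for p in sorted(folder.glob("*.pdf")))
```

Anything reported as a suspect would need to be downloaded again. Neither check can find files that were never downloaded at all, which is why the gaps caveat still stands.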