Age | Commit message | Author | Files | Lines | |
---|---|---|---|---|---|
2018-12-08 | tools: Add version info to merged WARCs | Lars-Dominik Braun | 1 | -4/+14 | |
In preparation for #9. I was hoping to reuse one of schema.org’s microdata schemas, but neither Action (archival action) nor SoftwareApplication (version information) seems to be suitable. | |||||
2018-11-17 | tools: Add original HTTP header to revisit record | Lars-Dominik Braun | 1 | -10/+9 | |
The payloads may be the same, but the headers are usually not. | |||||
2018-11-10 | tools: Fix WARC merging | Lars-Dominik Braun | 1 | -0/+188 | |
WARC-Target-URI was taken from the previous record, even if the URI was different, which effectively dropped the revisited URL from the archive. Also add a few tests. And boy, warcio is a mess.