author | Lars-Dominik Braun <lars@6xq.net> | 2018-11-10 11:21:11 +0100 |
---|---|---|
committer | Lars-Dominik Braun <lars@6xq.net> | 2018-11-10 11:21:11 +0100 |
commit | 1d9c607207b49d62f5f853312bb808da47699398 (patch) | |
tree | e2fcf6c0934856b17f3389ad1892268747dd7b13 /crocoite/test_html.py | |
parent | f30ab5515a2775d35e66da9d5dfc52a29a68bf9a (diff) | |
tools: Fix WARC merging
WARC-Target-URI was taken from the previous record, even if the URI was
different. This essentially removes the revisited URL from the archive.
Also add a few tests. And boy, warcio is a mess.
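
The bug described above can be illustrated with a minimal sketch. This is not crocoite's actual code and it deliberately avoids warcio; records are modelled as plain dicts, and the function name `merge_records` and the input keys `id`, `uri`, and `digest` are hypothetical. It assumes merging means replacing a repeated payload with a revisit entry. The point is that the revisit entry keeps its own WARC-Target-URI and points at the earlier record only through the WARC-Refers-To headers.

```python
def merge_records(records):
    """Replace repeated payloads with revisit entries.

    `records` is an iterable of dicts with the hypothetical keys
    'id', 'uri' and 'digest'. Returns a list of header dicts.
    """
    seen = {}    # payload digest -> first record that carried this payload
    merged = []
    for rec in records:
        original = seen.get(rec["digest"])
        if original is None:
            # First time this payload appears: keep it as a full response.
            seen[rec["digest"]] = rec
            merged.append({
                "WARC-Type": "response",
                "WARC-Record-ID": rec["id"],
                "WARC-Target-URI": rec["uri"],
                "WARC-Payload-Digest": rec["digest"],
            })
        else:
            # Duplicate payload: emit a revisit entry. The target URI must
            # come from the current record, not from the original one,
            # otherwise the revisited URL vanishes from the archive.
            merged.append({
                "WARC-Type": "revisit",
                "WARC-Record-ID": rec["id"],
                "WARC-Target-URI": rec["uri"],
                "WARC-Refers-To": original["id"],
                "WARC-Refers-To-Target-URI": original["uri"],
                "WARC-Payload-Digest": rec["digest"],
            })
    return merged
```

With two records that share a digest but have different URIs, the second entry in the result keeps its own URI as WARC-Target-URI, and the first record's URI appears only in WARC-Refers-To-Target-URI.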
Diffstat (limited to 'crocoite/test_html.py')
0 files changed, 0 insertions, 0 deletions