path: root/crocoite/warc.py
Age         Commit message                                       Author              Files  Lines
2018-12-21  Parse URLs by default                                Lars-Dominik Braun  1      -11/+7
2018-11-19  Coding style                                         Lars-Dominik Braun  1      -2/+2
2018-11-06  Switch single mode to asyncio                        Lars-Dominik Braun  1      -23/+9
2018-08-04  Properly handle failure to retrieve request body     Lars-Dominik Braun  1      -1/+15
2018-08-04  Reference warcinfo record in every other record      Lars-Dominik Braun  1      -18/+30
2018-08-04  Add package information to warcinfo                  Lars-Dominik Braun  1      -1/+5
2018-08-04  Reintroduce WARC logging                             Lars-Dominik Braun  1      -4/+34
2018-06-25  warc: Add metadata to truncated records              Lars-Dominik Braun  1      -22/+28
2018-06-25  warc: Save DOM-/image screenshot as WARC conversion  Lars-Dominik Braun  1      -9/+30
2018-06-21  Fix a few issues pointed out by pylint               Lars-Dominik Braun  1      -4/+0
2018-06-20  Add __slots__ to classes                             Lars-Dominik Braun  1      -0/+2
2018-06-20  Synchronous SiteLoader event handling                Lars-Dominik Braun  1      -101/+64
2018-05-04  Move page archiving logic to SinglePageController    Lars-Dominik Braun  1      -3/+3
2018-05-04  Move header unfolding into Item                      Lars-Dominik Braun  1      -21/+2
2018-05-04  Fetch request POST body                              Lars-Dominik Braun  1      -7/+5
2018-04-14  Fix base64 body detection                            Lars-Dominik Braun  1      -1/+1
2018-03-25  Move getResponseBody call to Item wrapper            Lars-Dominik Braun  1      -11/+2
2017-12-25  Increase default body size                           Lars-Dominik Braun  1      -2/+4
2017-12-24  Refactor behavior scripts                            Lars-Dominik Braun  1      -2/+3
2017-12-22  Add simple stats-keeping SiteLoader                  Lars-Dominik Braun  1      -4/+6
2017-12-22  Don’t write WARC record if body cannot be retrieved  Lars-Dominik Braun  1      -19/+48
2017-12-20  Fix HTTP headers using the same key more than once   Lars-Dominik Braun  1      -2/+15
2017-12-19  Serialize WARC writing                               Lars-Dominik Braun  1      -0/+35
2017-12-17  Don’t fetch redirected request body                  Lars-Dominik Braun  1      -8/+12
2017-11-29  Use Chrome’s timestamps as WARC-Date                 Lars-Dominik Braun  1      -0/+6
2017-11-29  Refactoring                                          Lars-Dominik Braun  1      -0/+174