Age | Commit message | Author | Files | Lines
2017-11-29 | Use Chrome’s timestamps as WARC-Date | Lars-Dominik Braun | 2 | -8/+14
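As a rough illustration of the commit above: WARC-Date is a UTC timestamp in ISO 8601 form. The sketch below assumes the browser reports a wall-clock time as seconds since the Unix epoch; the function name `warc_date` is hypothetical and not taken from the repository.

```python
from datetime import datetime, timezone


def warc_date(wall_time: float) -> str:
    """Format seconds since the Unix epoch as a WARC-Date string.

    Assumption: the browser timestamp is wall-clock time in seconds.
    Sub-second precision is dropped.
    """
    moment = datetime.fromtimestamp(wall_time, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


print(warc_date(1511913600.0))  # prints 2017-11-29T00:00:00Z
```

Taking the timestamp from the browser rather than from the writer's clock keeps the record date tied to when the request actually happened.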
2017-11-29 | Refactoring | Lars-Dominik Braun | 5 | -403/+571
    Reusable browser communication and WARC writing.
2017-11-26 | DOM snapshot: Generate valid HTML5 | Lars-Dominik Braun | 2 | -9/+31
    Some tags are “void”, i.e. they cannot have content and don’t have a closing tag.
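A minimal sketch of what void-element handling can look like when serializing a DOM snapshot. The set of names follows the current HTML standard; the `serialize` helper and its signature are assumptions made for illustration, not the code from this commit.

```python
from html import escape

# Void elements per the HTML standard: no content, no closing tag.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
})


def serialize(tag, attrs, children=()):
    """Serialize one element; `children` are already-serialized strings."""
    attr_text = "".join(
        f' {name}="{escape(value, quote=True)}"' for name, value in attrs.items()
    )
    if tag.lower() in VOID_ELEMENTS:
        # Emitting </br> or </img> would be invalid HTML5.
        return f"<{tag}{attr_text}>"
    inner = "".join(children)
    return f"<{tag}{attr_text}>{inner}</{tag}>"


print(serialize("p", {}, ["line one", serialize("br", {}), "line two"]))
```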
2017-11-25 | Ignore duplicate URLs when saving DOM snapshot | Lars-Dominik Braun | 1 | -1/+10
2017-11-25 | Work around broken device metrics reset | Lars-Dominik Braun | 1 | -1/+3
    Apparently neither width=0, height=0 nor clearDeviceMetricsOverride() does what it should, so manually reset to a 1080p screen size.
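The workaround described above could be expressed as a raw DevTools protocol message along these lines. This is a sketch, not the commit's code: `Emulation.setDeviceMetricsOverride` and its parameters come from the Chrome DevTools Protocol, while the helper name, the message id, and the reading of “1080p” as 1920×1080 are assumptions.

```python
import json


def device_metrics_message(msg_id, width=1920, height=1080):
    """Build a JSON message that forces an explicit screen size.

    Assumption: "1080p" means 1920x1080. A deviceScaleFactor of 0
    leaves the scale factor un-overridden.
    """
    return json.dumps({
        "id": msg_id,
        "method": "Emulation.setDeviceMetricsOverride",
        "params": {
            "width": width,
            "height": height,
            "deviceScaleFactor": 0,
            "mobile": False,
        },
    })


print(device_metrics_message(1))
```

The resulting string would be sent over the browser's debugging WebSocket; the transport is left out here.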
2017-11-25 | Strip on* HTML attributes | Lars-Dominik Braun | 2 | -1/+111
    They can carry JavaScript as well and should not be allowed for DOM snapshots.
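One simple way to drop event-handler attributes is a prefix check, sketched below. The helper name is hypothetical, and the commit itself may well match against an explicit list of handler names instead; a prefix check is the stricter variant, since it also removes any non-standard attribute that merely starts with “on”.

```python
def strip_event_handlers(attrs):
    """Return a copy of `attrs` without on* attributes (onclick, onload, ...).

    Attribute names are compared case-insensitively, as in HTML.
    """
    return {
        name: value
        for name, value in attrs.items()
        if not name.lower().startswith("on")
    }


cleaned = strip_event_handlers({"href": "/", "onclick": "track()", "ONLOAD": "init()"})
print(cleaned)  # prints {'href': '/'}
```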
2017-11-25 | Rename --run-before-snapshot and document --on* options | Lars-Dominik Braun | 2 | -4/+20
2017-11-24 | DOM snapshot: Save frames/subdocuments as well | Lars-Dominik Braun | 1 | -13/+36
    Request all subdocuments with pierce=True, split the result and save each document. Playback with pywb works because the timestamps of the snapshots are close to each other.
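Splitting a pierced DOM tree into its documents might look like the sketch below. It assumes the node layout the DevTools protocol returns from `DOM.getDocument` (the `nodeName`, `children`, `contentDocument`, and `documentURL` fields); the function name and the sample tree are made up for illustration.

```python
def iter_documents(node):
    """Yield every document node in a pierced DOM tree, outermost first."""
    if node.get("nodeName") == "#document":
        yield node
    for child in node.get("children", []):
        yield from iter_documents(child)
    # Frame owner elements carry their subdocument in contentDocument.
    content = node.get("contentDocument")
    if content is not None:
        yield from iter_documents(content)


# Hypothetical, heavily trimmed tree with one iframe.
tree = {
    "nodeName": "#document",
    "documentURL": "http://example.com/",
    "children": [{
        "nodeName": "HTML",
        "children": [{
            "nodeName": "IFRAME",
            "contentDocument": {
                "nodeName": "#document",
                "documentURL": "http://example.com/frame",
                "children": [],
            },
        }],
    }],
}

for document in iter_documents(tree):
    print(document["documentURL"])
```

Each yielded document can then be serialized and written as its own record.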
2017-11-24 | Reset device metrics | Lars-Dominik Braun | 1 | -2/+5
2017-11-24 | Save onsnapshot script to WARC | Lars-Dominik Braun | 1 | -4/+8
2017-11-22 | Make <canvas> static before DOM snapshot | Lars-Dominik Braun | 3 | -9/+31
    Use --run-before-snapshot=canvas-snapshot.js. Replaces each <canvas> with an image snapshot. We could use .captureStream() as well.
2017-11-22 | Emulate different screen sizes | Lars-Dominik Braun | 2 | -3/+25
    Causes the browser to load CSS assets and <img> srcset, for example.
2017-11-22 | Add example fixups for Instagram | Lars-Dominik Braun | 3 | -3/+32
2017-11-21 | Move base64 metadata into WARC header | Lars-Dominik Braun | 1 | -1/+1
2017-11-21 | Graceful page load timeout | Lars-Dominik Braun | 2 | -11/+32
    Stop the scrolling script, then wait for the remaining resources to load.
2017-11-20 | Add page created from DOM snapshot | Lars-Dominik Braun | 3 | -9/+119
2017-11-17 | Initial import | Lars-Dominik Braun | 7 | -0/+419