Age | Commit message | Author | Files | Lines | |
---|---|---|---|---|---|
2019-01-07 | Log Chrome’s responses to WARC by default | Lars-Dominik Braun | 5 | -19/+32 | |
We may not be able to reproduce every failure, so logging as much as possible is important to figure out what went wrong. Also, in case a bug is uncovered in the future, we can check the logs and possibly fix it with -errata. | |||||
2019-01-05 | browser: Do not overwrite request data when prefetching | Lars-Dominik Braun | 1 | -2/+0 | |
Needs a testcase. | |||||
2019-01-05 | html: Handle CDATA | Lars-Dominik Braun | 1 | -1/+5 | |
When loading XML documents Chrome presents a pretty-printed version to the user, which still contains the original XML when exporting via DOM.getDocument. Not sure how to test this. | |||||
2019-01-05 | controller: Fix PrefixLimit | Lars-Dominik Braun | 1 | -1/+1 | |
Probably broken by the transition to URL() in commit 5e444dd6511d97308a84ae9c86ebf14547d01f01. And yes, we desperately need some tests for this. | |||||
2019-01-04 | behavior: Ignore onstop() failure | Lars-Dominik Braun | 1 | -4/+14 | |
Fails if the page is reloaded/redirected. See issue #13. | |||||
2019-01-04 | logger: Do not log debug by default | Lars-Dominik Braun | 1 | -1/+1 | |
Must’ve slipped through. | |||||
2019-01-04 | coverage: Ignore a few unreachable statements | Lars-Dominik Braun | 2 | -7/+7 | |
2019-01-04 | behavior: Support clicking area and add testcase | Lars-Dominik Braun | 2 | -7/+76 | |
2019-01-03 | browser: Turn Item into RequestResponsePair | Lars-Dominik Braun | 6 | -485/+627 | |
Previously Item was just a simple wrapper around Chrome’s Network.* events. This turned out to be quite nasty when testing, so its replacement, RequestResponsePair, does some level of abstraction. This makes testing a lot easier, since we can now simply instantiate it without building a proper DevTools event. Should come without any functional changes. | |||||
2018-12-31 | extract-screenshot: Remove URL from filename | Lars-Dominik Braun | 1 | -8/+19 | |
URLs can get quite long, overflowing the file name length limit. Instead use sequential filenames and output metadata to stdout. | |||||
2018-12-25 | warc: Add tests | Lars-Dominik Braun | 4 | -17/+280 | |
Using hypothesis-based testcase generation. This is quite nice compared to manual test data generation, since it catches a lot more corner cases (if done right). This commit also fixes a few issues, including: - log records will only be written if the log is nonempty - properly quote packageUrl paths - drop old thread checking code - use placeholder URL for scripts without name | |||||
2018-12-25 | logger: Fix constructor default arguments | Lars-Dominik Braun | 2 | -3/+12 | |
Default arguments must not be mutable objects, since a mutable default is shared between calls. | |||||
2018-12-24 | Drop deprecated debug parameter | Lars-Dominik Braun | 1 | -1/+1 | |
2018-12-24 | Use f-strings where possible | Lars-Dominik Braun | 11 | -60/+63 | |
Replaces str.format, which is less readable due to its separation of format and arguments. | |||||
2018-12-23 | Skip test if invalid domain exists | Lars-Dominik Braun | 1 | -7/+17 | |
Must not exist for this test. | |||||
2018-12-22 | Fix recursive mode’s URL parsing | Lars-Dominik Braun | 1 | -1/+2 | |
Broken by commit 5e444dd6511d97308a84ae9c86ebf14547d01f01. URLs read from stdin must be converted from str. | |||||
2018-12-22 | Switch -recursive to asyncio’s .cancel() | Lars-Dominik Braun | 2 | -55/+58 | |
RecursiveController used a custom .cancel() method before. Instead we can simply cancel .run() and handle the CancelledError inside run() and fetch(). | |||||
2018-12-21 | Remove unused EventHandler property | Lars-Dominik Braun | 1 | -6/+0 | |
Crash detection was moved into -recursive’s return code checking a while ago. | |||||
2018-12-21 | util: Skip missing source files | Lars-Dominik Braun | 1 | -1/+1 | |
Requirement extraction fails if the package is an .egg file (i.e. not extracted). Do not try to compute checksum/file length for them. | |||||
2018-12-21 | Parse URLs by default | Lars-Dominik Braun | 10 | -89/+68 | |
Use library yarl (already pulled in by aiohttp). No URL processed should be a string. | |||||
2018-12-17 | Add simple errata tool | Lars-Dominik Braun | 2 | -1/+98 | |
Fixes #9. | |||||
2018-12-13 | behavior: Whitelist gab.com as well | Lars-Dominik Braun | 1 | -4/+6 | |
2018-12-11 | behavior: Add click test URLs for Twitter | Lars-Dominik Braun | 1 | -1/+3 | |
2018-12-08 | behavior: Dump script options to file as well | Lars-Dominik Braun | 1 | -3/+5 | |
click.js’s data was part of the script before 22adde79940d32c5f094f26f3e18b7160e7ccafc. Now it is injected dynamically, but it still would be nice to have the data available. | |||||
2018-12-08 | controller: Reraise queue processing errors early | Lars-Dominik Braun | 1 | -1/+7 | |
2018-12-08 | tools: Add version info to merged WARCs | Lars-Dominik Braun | 4 | -17/+54 | |
In preparation for #9. I was hoping to reuse one of schema.org’s microdata schemas, but neither Action (archival action) nor SoftwareApplication (version information) seems to be suitable. | |||||
2018-12-06 | behavior: Fix patreon selector | Lars-Dominik Braun | 1 | -3/+2 | |
And that proves their CSS class names are not stable and cannot be used. | |||||
2018-12-05 | behavior: Add gamasutra.com click selector | Lars-Dominik Braun | 1 | -0/+7 | |
2018-12-02 | behavior: Add more documentation | Lars-Dominik Braun | 1 | -2/+14 | |
2018-12-02 | behavior: Remove outdated comment | Lars-Dominik Braun | 1 | -3/+0 | |
2018-12-02 | behavior: Re-enable clearDeviceMetricsOverride | Lars-Dominik Braun | 1 | -4/+1 | |
Seems to be working again. Chrome bug? | |||||
2018-12-02 | behavior: Improve click testing | Lars-Dominik Braun | 2 | -22/+56 | |
Some pages require scrolling, so we need a SinglePageController. Also mark network-dependent tests with xfail, so they won’t affect the overall test result unless you know what you’re doing (--runxfail). | |||||
2018-12-02 | controller: Add only enabled behavior scripts to warcinfo | Lars-Dominik Braun | 1 | -5/+5 | |
2018-12-02 | behavior: Remove unused slots | Lars-Dominik Braun | 1 | -2/+0 | |
2018-12-02 | controller: Remove unused argument | Lars-Dominik Braun | 2 | -5/+4 | |
Was replaced by handler a while ago. | |||||
2018-12-01 | util: Remove unused function | Lars-Dominik Braun | 2 | -6/+1 | |
2018-12-01 | behavior: Add selector test cases | Lars-Dominik Braun | 1 | -0/+78 | |
Fixes #3. | |||||
2018-12-01 | behavior: Move click script data to external file | Lars-Dominik Braun | 4 | -149/+169 | |
First step of issue #3 | |||||
2018-12-01 | cli: Fix --behavior | Lars-Dominik Braun | 1 | -2/+3 | |
2018-11-28 | behavior: Expand issue comments on GitHub | Lars-Dominik Braun | 1 | -0/+6 | |
2018-11-26 | behavior: Close Facebook’s nag screen | Lars-Dominik Braun | 1 | -1/+1 | |
Worked previously, broken by a site update. | |||||
2018-11-25 | behavior: Turn scroll JS code into class | Lars-Dominik Braun | 2 | -27/+33 | |
2018-11-25 | single: Graceful ^C | Lars-Dominik Braun | 2 | -2/+13 | |
Allow cancellation of timeout wait. | |||||
2018-11-24 | behavior: Never scroll html/body elements | Lars-Dominik Braun | 1 | -1/+1 | |
Fixes weird positioning of elements tethered to viewport top. | |||||
2018-11-24 | behavior: Fix scrolling | Lars-Dominik Braun | 4 | -42/+49 | |
- Introduce stop() method callable from Python. Looks like the old method (global variable) was not working (any more?). This is much better anyway. - Restore state of scrolled elements (not window). Fixes weird screenshots of twitter.com. | |||||
2018-11-24 | browser: Ignore load failures for nonexisting requests | Lars-Dominik Braun | 1 | -2/+3 | |
Fixes None dereference. | |||||
2018-11-22 | controller: Improve idle waiting | Lars-Dominik Braun | 3 | -19/+89 | |
2018-11-19 | controller: Add parameters to warcinfo | Lars-Dominik Braun | 1 | -0/+7 | |
Add parameters the grab was run with, so we can actually reproduce a run. | |||||
2018-11-19 | Coding style | Lars-Dominik Braun | 12 | -58/+44 | |
Fix a few random issues pointed out by pylint, mainly unused imports. | |||||
2018-11-17 | html: Add tests for tree walker | Lars-Dominik Braun | 1 | -1/+23 | |
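
The 2018-12-25 logger commit above fixes mutable default arguments in a constructor. The sketch below illustrates the general Python pitfall and the usual `None` sentinel fix. It is not the project's actual code: the class names `BrokenLogger` and `Logger`, the `consumers` parameter, and the `connect` method are hypothetical and chosen only for illustration.

```python
class BrokenLogger:
    """Hypothetical example: the default list is created once, at definition time."""

    def __init__(self, consumers=[]):
        # Every instance constructed without arguments shares this same list.
        self.consumers = consumers

    def connect(self, consumer):
        self.consumers.append(consumer)


class Logger:
    """Hypothetical example: a None sentinel gives each instance its own list."""

    def __init__(self, consumers=None):
        # Build a fresh list per instance; copy any list passed by the caller.
        self.consumers = [] if consumers is None else list(consumers)

    def connect(self, consumer):
        self.consumers.append(consumer)


# The shared default leaks state between unrelated instances.
first, second = BrokenLogger(), BrokenLogger()
first.connect(print)
leaked = second.consumers  # contains print, although second never connected it

# With the sentinel, instances stay independent.
a, b = Logger(), Logger()
a.connect(print)
independent = b.consumers  # still empty
```

The sentinel variant also copies a caller-supplied list, so later changes made by the caller do not alter the instance. Whether that copy is desirable depends on the design and is an assumption of this sketch.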