Age | Commit message | Author | Files | Lines | |
---|---|---|---|---|---|
2018-12-31 | extract-screenshot: Remove URL from filename | Lars-Dominik Braun | 1 | -8/+19 | |
URLs can get quite long, overflowing the file name length limit. Instead, use sequential filenames and output metadata to stdout. | |||||
2018-12-25 | warc: Add tests | Lars-Dominik Braun | 5 | -18/+281 | |
Using hypothesis-based testcase generation. This is quite nice compared to manual test data generation, since it catches a lot more corner cases (if done right). This commit also fixes a few issues, including: - log records will only be written if the log is nonempty - properly quote packageUrl paths - drop old thread checking code - use placeholder URL for scripts without name | |||||
2018-12-25 | logger: Fix constructor default arguments | Lars-Dominik Braun | 2 | -3/+12 | |
Default arguments must not be mutable objects, since a mutable default is shared between calls. | |||||
2018-12-24 | Drop deprecated debug parameter | Lars-Dominik Braun | 1 | -1/+1 | |
2018-12-24 | Use f-strings where possible | Lars-Dominik Braun | 11 | -60/+63 | |
Replaces str.format, which is less readable due to its separation of format and arguments. | |||||
2018-12-23 | Skip test if invalid domain exists | Lars-Dominik Braun | 1 | -7/+17 | |
The domain must not exist for this test. | |||||
2018-12-22 | Fix recursive mode’s URL parsing | Lars-Dominik Braun | 1 | -1/+2 | |
Broken by commit 5e444dd6511d97308a84ae9c86ebf14547d01f01. URLs read from stdin must be converted from str. | |||||
2018-12-22 | Switch -recursive to asyncio’s .cancel() | Lars-Dominik Braun | 2 | -55/+58 | |
RecursiveController used a custom .cancel() method before. Instead we can simply cancel .run() and handle the CancelledError inside run() and fetch(). | |||||
2018-12-21 | Remove unused EventHandler property | Lars-Dominik Braun | 1 | -6/+0 | |
Crash detection was moved into -recursive’s return code checking a while ago. | |||||
2018-12-21 | util: Skip missing source files | Lars-Dominik Braun | 1 | -1/+1 | |
Requirement extraction fails if the package is an .egg file (i.e. not extracted). Do not try to compute checksum/file length for them. | |||||
2018-12-21 | Parse URLs by default | Lars-Dominik Braun | 12 | -89/+71 | |
Use the yarl library (already pulled in by aiohttp). No processed URL should be a plain string. | |||||
2018-12-18 | travis: -dev builds are allowed to fail | Lars-Dominik Braun | 1 | -6/+11 | |
2018-12-17 | Add simple errata tool | Lars-Dominik Braun | 3 | -1/+99 | |
Fixes #9. | |||||
2018-12-13 | behavior: Whitelist gab.com as well | Lars-Dominik Braun | 1 | -4/+6 | |
2018-12-11 | behavior: Add click test URLs for Twitter | Lars-Dominik Braun | 1 | -1/+3 | |
2018-12-08 | behavior: Dump script options to file as well | Lars-Dominik Braun | 1 | -3/+5 | |
click.js’s data was part of the script before 22adde79940d32c5f094f26f3e18b7160e7ccafc. Now it is injected dynamically, but it would still be nice to have the data available. | |||||
2018-12-08 | controller: Reraise queue processing errors early | Lars-Dominik Braun | 1 | -1/+7 | |
2018-12-08 | tools: Add version info to merged WARCs | Lars-Dominik Braun | 4 | -17/+54 | |
In preparation for #9. I was hoping to reuse one of schema.org’s microdata schemas, but neither Action (archival action) nor SoftwareApplication (version information) seems to be suitable. | |||||
2018-12-07 | README: Add note about browser config/fonts | Lars-Dominik Braun | 1 | -0/+27 | |
2018-12-06 | behavior: Fix patreon selector | Lars-Dominik Braun | 1 | -3/+2 | |
And that proves their CSS class names are not stable and cannot be used. | |||||
2018-12-05 | irc: Add example config file | Lars-Dominik Braun | 1 | -0/+10 | |
2018-12-05 | behavior: Add gamasutra.com click selector | Lars-Dominik Braun | 1 | -0/+7 | |
2018-12-02 | behavior: Add more documentation | Lars-Dominik Braun | 1 | -2/+14 | |
2018-12-02 | behavior: Remove outdated comment | Lars-Dominik Braun | 1 | -3/+0 | |
2018-12-02 | behavior: Re-enable clearDeviceMetricsOverride | Lars-Dominik Braun | 1 | -4/+1 | |
Seems to be working again. Chrome bug? | |||||
2018-12-02 | behavior: Improve click testing | Lars-Dominik Braun | 2 | -22/+56 | |
Some pages require scrolling, so we need a SinglePageController. Also mark network-dependent tests with xfail, so they won’t affect the overall test result unless you know what you’re doing (--runxfail). | |||||
2018-12-02 | controller: Add only enabled behavior scripts to warcinfo | Lars-Dominik Braun | 1 | -5/+5 | |
2018-12-02 | behavior: Remove unused slots | Lars-Dominik Braun | 1 | -2/+0 | |
2018-12-02 | controller: Remove unused argument | Lars-Dominik Braun | 2 | -5/+4 | |
Was replaced by handler a while ago. | |||||
2018-12-01 | util: Remove unused function | Lars-Dominik Braun | 2 | -6/+1 | |
2018-12-01 | behavior: Add selector test cases | Lars-Dominik Braun | 1 | -0/+78 | |
Fixes #3. | |||||
2018-12-01 | behavior: Move click script data to external file | Lars-Dominik Braun | 6 | -149/+172 | |
First step of issue #3 | |||||
2018-12-01 | cli: Fix --behavior | Lars-Dominik Braun | 1 | -2/+3 | |
2018-12-01 | README: Minor improvements | Lars-Dominik Braun | 1 | -6/+11 | |
Command line was outdated. | |||||
2018-11-28 | behavior: Expand issue comments on GitHub | Lars-Dominik Braun | 1 | -0/+6 | |
2018-11-26 | behavior: Close Facebook’s nag screen | Lars-Dominik Braun | 1 | -1/+1 | |
Worked previously, broken by a site update. | |||||
2018-11-25 | README: Google Chrome is a dependency | Lars-Dominik Braun | 1 | -0/+2 | |
Obviously. | |||||
2018-11-25 | behavior: Turn scroll JS code into class | Lars-Dominik Braun | 2 | -27/+33 | |
2018-11-25 | single: Graceful ^C | Lars-Dominik Braun | 2 | -2/+13 | |
Allow cancellation of timeout wait. | |||||
2018-11-24 | behavior: Never scroll html/body elements | Lars-Dominik Braun | 1 | -1/+1 | |
Fixes weird positioning of elements tethered to viewport top. | |||||
2018-11-24 | behavior: Fix scrolling | Lars-Dominik Braun | 4 | -42/+49 | |
- Introduce stop() method callable from Python. Looks like the old method (global variable) was not working (any more?). This is much better anyway. - Restore state of scrolled elements (not window). Fixes weird screenshots of twitter.com. | |||||
2018-11-24 | browser: Ignore load failures for nonexisting requests | Lars-Dominik Braun | 1 | -2/+3 | |
Fixes None dereference. | |||||
2018-11-22 | travis: Switch to xenial | Lars-Dominik Braun | 1 | -1/+3 | |
The image offers Python 3.7 and 3.8-dev | |||||
2018-11-22 | controller: Improve idle waiting | Lars-Dominik Braun | 3 | -19/+89 | |
2018-11-19 | controller: Add parameters to warcinfo | Lars-Dominik Braun | 1 | -0/+7 | |
Add parameters the grab was run with, so we can actually reproduce a run. | |||||
2018-11-19 | Coding style | Lars-Dominik Braun | 12 | -58/+44 | |
Fix a few random issues pointed out by pylint, mainly unused imports. | |||||
2018-11-17 | html: Add tests for tree walker | Lars-Dominik Braun | 1 | -1/+23 | |
2018-11-17 | logger: Add more tests | Lars-Dominik Braun | 2 | -3/+25 | |
2018-11-17 | browser: Add tests for header deserialization | Lars-Dominik Braun | 1 | -0/+39 | |
2018-11-17 | devtools: Update browser flags | Lars-Dominik Braun | 1 | -0/+12 | |
Add a few more that seem reasonable. |
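
The logger fix from 2018-12-25 concerns mutable default arguments. The following is a minimal sketch of that general Python pitfall, not the code from the commit; the function names `append_bad` and `append_good` are hypothetical and do not appear in the repository.

```python
def append_bad(item, bucket=[]):
    # Hypothetical example: the default list is created once, when the
    # function is defined, and is shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Hypothetical example: use None as a sentinel and build a fresh
    # list on each call that omits `bucket`.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


# Two calls relying on the default end up sharing state in the bad variant.
first_bad = append_bad(1)
second_bad = append_bad(2)

# The sentinel variant keeps the calls independent.
first_good = append_good(1)
second_good = append_good(2)
```

With the shared default, both `first_bad` and `second_bad` refer to the same list object, while the `None` sentinel gives each call its own list.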