| Age | Commit message | Author | Files | Lines | |
|---|---|---|---|---|---|
| 2018-12-22 | Fix recursive mode’s URL parsing | Lars-Dominik Braun | 1 | -1/+2 | |
| Broken by commit 5e444dd6511d97308a84ae9c86ebf14547d01f01. URLs read from stdin must be converted from str. | |||||
| 2018-12-22 | Switch -recursive to asyncio’s .cancel() | Lars-Dominik Braun | 2 | -55/+58 | |
| RecursiveController used a custom .cancel() method before. Instead we can simply cancel .run() and handle the CancelledError inside run() and fetch(). | |||||
| 2018-12-21 | Remove unused EventHandler property | Lars-Dominik Braun | 1 | -6/+0 | |
| Crash detection was moved into -recursive’s return code checking a while ago. | |||||
| 2018-12-21 | util: Skip missing source files | Lars-Dominik Braun | 1 | -1/+1 | |
| Requirement extraction fails if the package is an .egg file (i.e. not extracted). Do not try to compute checksum/file length for them. | |||||
| 2018-12-21 | Parse URLs by default | Lars-Dominik Braun | 10 | -89/+68 | |
| Use the yarl library (already pulled in by aiohttp). No processed URL should be a plain string. | |||||
| 2018-12-17 | Add simple errata tool | Lars-Dominik Braun | 2 | -1/+98 | |
| Fixes #9. | |||||
| 2018-12-13 | behavior: Whitelist gab.com as well | Lars-Dominik Braun | 1 | -4/+6 | |
| 2018-12-11 | behavior: Add click test URLs for Twitter | Lars-Dominik Braun | 1 | -1/+3 | |
| 2018-12-08 | behavior: Dump script options to file as well | Lars-Dominik Braun | 1 | -3/+5 | |
| click.js’s data was part of the script before 22adde79940d32c5f094f26f3e18b7160e7ccafc. Now it is injected dynamically, but it still would be nice to have the data available. | |||||
| 2018-12-08 | controller: Reraise queue processing errors early | Lars-Dominik Braun | 1 | -1/+7 | |
| 2018-12-08 | tools: Add version info to merged WARCs | Lars-Dominik Braun | 4 | -17/+54 | |
| In preparation for #9. I was hoping to reuse one of schema.org’s microdata schemas, but neither Action (archival action) nor SoftwareApplication (version information) seems to be suitable. | |||||
| 2018-12-06 | behavior: Fix patreon selector | Lars-Dominik Braun | 1 | -3/+2 | |
| And that proves their CSS class names are not stable and cannot be used. | |||||
| 2018-12-05 | behavior: Add gamasutra.com click selector | Lars-Dominik Braun | 1 | -0/+7 | |
| 2018-12-02 | behavior: Add more documentation | Lars-Dominik Braun | 1 | -2/+14 | |
| 2018-12-02 | behavior: Remove outdated comment | Lars-Dominik Braun | 1 | -3/+0 | |
| 2018-12-02 | behavior: Re-enable clearDeviceMetricsOverride | Lars-Dominik Braun | 1 | -4/+1 | |
| Seems to be working again. Chrome bug? | |||||
| 2018-12-02 | behavior: Improve click testing | Lars-Dominik Braun | 2 | -22/+56 | |
| Some pages require scrolling, so we need a SinglePageController. Also mark network-dependent tests with xfail, so they won’t affect the overall test result unless you know what you’re doing (--runxfail). | |||||
| 2018-12-02 | controller: Add only enabled behavior scripts to warcinfo | Lars-Dominik Braun | 1 | -5/+5 | |
| 2018-12-02 | behavior: Remove unused slots | Lars-Dominik Braun | 1 | -2/+0 | |
| 2018-12-02 | controller: Remove unused argument | Lars-Dominik Braun | 2 | -5/+4 | |
| It was replaced by handler a while ago. | |||||
| 2018-12-01 | util: Remove unused function | Lars-Dominik Braun | 2 | -6/+1 | |
| 2018-12-01 | behavior: Add selector test cases | Lars-Dominik Braun | 1 | -0/+78 | |
| Fixes #3. | |||||
| 2018-12-01 | behavior: Move click script data to external file | Lars-Dominik Braun | 4 | -149/+169 | |
| First step of issue #3 | |||||
| 2018-12-01 | cli: Fix --behavior | Lars-Dominik Braun | 1 | -2/+3 | |
| 2018-11-28 | behavior: Expand issue comments on GitHub | Lars-Dominik Braun | 1 | -0/+6 | |
| 2018-11-26 | behavior: Close Facebook’s nag screen | Lars-Dominik Braun | 1 | -1/+1 | |
| Worked previously, broken by a site update. | |||||
| 2018-11-25 | behavior: Turn scroll JS code into class | Lars-Dominik Braun | 2 | -27/+33 | |
| 2018-11-25 | single: Graceful ^C | Lars-Dominik Braun | 2 | -2/+13 | |
| Allow cancellation of timeout wait. | |||||
| 2018-11-24 | behavior: Never scroll html/body elements | Lars-Dominik Braun | 1 | -1/+1 | |
| Fixes weird positioning of elements tethered to viewport top. | |||||
| 2018-11-24 | behavior: Fix scrolling | Lars-Dominik Braun | 4 | -42/+49 | |
| Introduce a stop() method callable from Python; the old method (a global variable) apparently was not working (any more?), and this is much better anyway. Restore the state of scrolled elements (not window), which fixes weird screenshots of twitter.com. | |||||
| 2018-11-24 | browser: Ignore load failures for nonexisting requests | Lars-Dominik Braun | 1 | -2/+3 | |
| Fixes None dereference. | |||||
| 2018-11-22 | controller: Improve idle waiting | Lars-Dominik Braun | 3 | -19/+89 | |
| 2018-11-19 | controller: Add parameters to warcinfo | Lars-Dominik Braun | 1 | -0/+7 | |
| Add parameters the grab was run with, so we can actually reproduce a run. | |||||
| 2018-11-19 | Coding style | Lars-Dominik Braun | 12 | -58/+44 | |
| Fix a few random issues pointed out by pylint, mainly unused imports. | |||||
| 2018-11-17 | html: Add tests for tree walker | Lars-Dominik Braun | 1 | -1/+23 | |
| 2018-11-17 | logger: Add more tests | Lars-Dominik Braun | 2 | -3/+25 | |
| 2018-11-17 | browser: Add tests for header deserialization | Lars-Dominik Braun | 1 | -0/+39 | |
| 2018-11-17 | devtools: Update browser flags | Lars-Dominik Braun | 1 | -0/+12 | |
| Add a few more that seem reasonable. | |||||
| 2018-11-17 | browser: clearBrowserCookies is supported unconditionally | Lars-Dominik Braun | 1 | -4/+1 | |
| canClearBrowserCookies apparently has been removed from protocol 1.3. | |||||
| 2018-11-17 | tools: Add original HTTP header to revisit record | Lars-Dominik Braun | 2 | -11/+13 | |
| The payloads may be the same, but the headers are usually not. | |||||
| 2018-11-17 | click: Add gab.ai | Lars-Dominik Braun | 1 | -0/+10 | |
| Load more posts on profile page and more comments and replies on individual post pages. | |||||
| 2018-11-14 | Async chrome process startup | Lars-Dominik Braun | 6 | -157/+161 | |
| Move it to .devtools. Seems more fitting. | |||||
| 2018-11-10 | tools: Fix WARC merging | Lars-Dominik Braun | 2 | -18/+205 | |
| WARC-Target-URI was taken from the previous record, even if the URI was different. This essentially removes the revisited URL from the archive. Also add a few tests. And boy, warcio is a mess. | |||||
| 2018-11-08 | devtools: Disable websocket pings to Chrome | Lars-Dominik Braun | 2 | -1/+12 | |
| Chrome does not like that. | |||||
| 2018-11-06 | Switch single mode to asyncio | Lars-Dominik Braun | 5 | -175/+141 | |
| This is a direct port to asyncio without any design changes. These need to happen in further refinements. Fixes issue #1. | |||||
| 2018-11-06 | Switch site loader to async DevTools communication | Lars-Dominik Braun | 2 | -229/+236 | |
| 2018-11-06 | Add simple asyncio-based DevTool communication | Lars-Dominik Braun | 2 | -0/+406 | |
| Inspired by pychrome/aiochrome, but includes crash handling and async get() instead of callbacks. | |||||
| 2018-11-03 | html: Add tests for tag/attribute stripping | Lars-Dominik Braun | 1 | -0/+38 | |
| 2018-10-30 | recursive: Actually stop the grab when canceled | Lars-Dominik Braun | 1 | -1/+3 | |
| This change was lost during the merge of 958563a3602780b48599c27acf212139c2e6904d. | |||||
| 2018-10-30 | Reduce idle wait time after stopping page | Lars-Dominik Braun | 1 | -4/+4 | |
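
The 2018-12-22 entry "Switch -recursive to asyncio’s .cancel()" describes cancelling `.run()` and handling `CancelledError` inside `run()` and `fetch()`. The following is a minimal sketch of that general pattern, not the project's actual code: the class `RecursiveSketch`, its attributes, and the placeholder sleep are all hypothetical names chosen for illustration.

```python
import asyncio


class RecursiveSketch:
    """Hypothetical stand-in for a recursive controller (illustration only)."""

    def __init__(self, urls, delay=10):
        self.pending = list(urls)
        self.done = []
        self.cancelled = False
        self.delay = delay  # assumed placeholder for the duration of a grab

    async def fetch(self, url):
        try:
            await asyncio.sleep(self.delay)  # stands in for the real work
            self.done.append(url)
        except asyncio.CancelledError:
            # Per-URL cleanup would go here; then let cancellation propagate.
            raise

    async def run(self):
        try:
            while self.pending:
                await self.fetch(self.pending.pop(0))
        except asyncio.CancelledError:
            # Record that the run was interrupted, then re-raise so the
            # task is still reported as cancelled to the caller.
            self.cancelled = True
            raise


async def main():
    ctrl = RecursiveSketch(["a", "b", "c"])
    task = asyncio.ensure_future(ctrl.run())
    await asyncio.sleep(0.01)  # give run() a chance to start the first fetch
    task.cancel()              # no custom cancel method needed
    try:
        await task
    except asyncio.CancelledError:
        pass
    return ctrl


ctrl = asyncio.run(main())
```

Re-raising `CancelledError` after cleanup keeps the task's cancelled state visible to whoever awaits it, which is why no separate cancellation flag has to be passed around.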
