Age | Commit message | Author | Files | Lines | |
---|---|---|---|---|---|
2019-10-19 | devtools: Fix load testcase | Lars-Dominik Braun | 1 | -18/+35 | |
Handle new *ExtraInfo events, but do not use them in browser yet, since they’re still marked experimental. | |||||
2019-10-18 | click: Fix click selectors | Lars-Dominik Braun | 1 | -2/+2 | |
YouTube and Vimeo. | |||||
2019-10-13 | browser: Work around missing responseReceived events | Lars-Dominik Braun | 1 | -0/+7 | |
Looks like Chrome extensively reuses request ids now. Sucks, since we relied on their uniqueness. For now ignore requests without a dedicated responseReceived event. See issue #24. | |||||
2019-10-13 | extract-links: Do not depend on document.body | Lars-Dominik Braun | 1 | -1/+1 | |
Fixes #25. Root frame does not actually display a page. Can’t reproduce this issue with a simple test case unfortunately. | |||||
2019-10-13 | devtools: Remove explicit loop parameter | Lars-Dominik Braun | 1 | -5/+4 | |
aiohttp removed it with release 4.0.0a1: https://github.com/aio-libs/aiohttp/commit/c8dbe758e2cfa4304cab9a1b056031aba92e4f02 and we weren’t using it anyway. | |||||
2019-07-29 | doc: Auto-generate list of supported click selectors | Lars-Dominik Braun | 5 | -21/+89 | |
Using a Sphinx plugin. Also improve click selector descriptions for this purpose. | |||||
2019-07-28 | Back to -dev | Lars-Dominik Braun | 1 | -1/+1 | |
2019-07-28 | Release version 1.1.0 (tag: v1.1.0) | Lars-Dominik Braun | 1 | -1/+1 | |
2019-07-28 | behavior: Update click selectors | Lars-Dominik Braun | 1 | -9/+3 | |
2019-07-28 | behavior: Increase idle timeout for click testing | Lars-Dominik Braun | 1 | -1/+3 | |
2019-07-28 | Update docs | Lars-Dominik Braun | 2 | -153/+6 | |
Add missing option -b for IRC bot. Simplify sphinx config file. | |||||
2019-07-28 | Fix wrong Content-Type header parameter | Lars-Dominik Braun | 5 | -24/+120 | |
In line with HTTP, the “encoding” parameter should be called “charset”. Fixable errata item created. Fixes issue #19. | |||||
2019-07-25 | behavior: Ignore failed onload script injection | Lars-Dominik Braun | 1 | -10/+21 | |
Will be re-injected by controller anyway. | |||||
2019-07-13 | Cookie injection support | Lars-Dominik Braun | 8 | -22/+214 | |
Add command-line options injecting individual cookies or cookie file into Chrome. Provide default cookie file. This changes the IRC bot’s command splitting to shlex.split, which allows shell-like argument quoting. Fixes #7. | |||||
2019-07-11 | devtools: Add more crash error handling | Lars-Dominik Braun | 1 | -6/+23 | |
In case the whole browser crashes (rare) we will neither be able to close the tab on __aexit__, nor send SIGTERM to it. Make sure we still terminate gracefully. | |||||
2019-07-06 | Improve documentation | Lars-Dominik Braun | 2 | -12/+68 | |
2019-07-06 | controller: Add missing import | Lars-Dominik Braun | 1 | -1/+1 | |
2019-07-04 | Release version 1.0.0 (tag: v1.0.0) | Lars-Dominik Braun | 1 | -1/+11 | |
2019-07-04 | dashboard: Ignore invalid json input | Lars-Dominik Braun | 1 | -1/+5 | |
We should be able to recover from this. | |||||
2019-07-04 | behavior: Update click selector list | Lars-Dominik Braun | 1 | -18/+3 | |
Remove instagram, no stable CSS names. Update gab. | |||||
2019-07-04 | Update documentation | Lars-Dominik Braun | 6 | -60/+62 | |
Re-arrange stuff, add release guide. Needs a lot more work though. | |||||
2019-07-04 | devtools: Prefix temp directories | Lars-Dominik Braun | 1 | -1/+1 | |
2019-07-04 | Rename cli utils | Lars-Dominik Braun | 6 | -98/+127 | |
crocoite-recursive is now just crocoite; crocoite-grab is not user-facing any more and is now called crocoite-single. In preparation for 1.0 release. | |||||
2019-07-03 | irc: Do not respond when not addressed directly | Lars-Dominik Braun | 1 | -1/+1 | |
This fixes annoying messages when using the bot’s nick as the first word of a message, e.g. “chromebot can do that”. | |||||
2019-07-02 | behavior: Add missing uuid’s to logging call | Lars-Dominik Braun | 1 | -2/+5 | |
2019-07-02 | Fix exit status logging | Lars-Dominik Braun | 1 | -1/+1 | |
Fixes commit 158f55eb7fb24fa26727a008ad44964390171060. Logger works only if WARC is still open. | |||||
2019-07-02 | Stabilize WARC headers | Lars-Dominik Braun | 6 | -46/+73 | |
In preparation for 1.0 release: correct mime types; add X-Crocoite-Type, so logs, scripts, dom-snapshots and screenshots can be identified easily; remove random WARC headers like X-Chrome-Initiator (we don’t want to maintain those); remove non-standard urn-based package URLs (can’t use them without a urn-registration). | |||||
2019-06-28 | tools: Add missing \n to JSON output | Lars-Dominik Braun | 1 | -0/+1 | |
Fixes 76811bd3f0b3fc8688939e31fdab2c71c89cc75b | |||||
2019-06-27 | extract-screenshot: Allow extracting only the first screenshot | Lars-Dominik Braun | 1 | -1/+6 | |
2019-06-27 | merge: Dump machine-readable info | Lars-Dominik Braun | 1 | -2/+18 | |
2019-06-26 | Allow turning off cert validation | Lars-Dominik Braun | 3 | -11/+37 | |
Add --insecure switch (shamelessly stolen from curl) to both -grab and -irc. | |||||
2019-06-26 | behavior: screenshot: Extend viewport for fixed elements | Lars-Dominik Braun | 2 | -11/+57 | |
Fixes #14, but needs a test case. | |||||
2019-06-18 | behavior: Fix screenshots | Lars-Dominik Braun | 1 | -4/+16 | |
Chrome’s behavior with respect to screenshots changed in some version, so now artificially extending the viewport via device metrics is required. | |||||
2019-06-18 | Re-inject behavior scripts on site reload | Lars-Dominik Braun | 7 | -52/+114 | |
Fixes #13. Event handler’s push() is async now. | |||||
2019-06-18 | Fix idle state tracking race condition | Lars-Dominik Braun | 4 | -93/+121 | |
Closes #16. Expose SiteLoader’s page idle changes through events and move state tracking into controller event handler. Relies on tracking time instead of asyncio event, which is more reliable. | |||||
2019-06-17 | devtools: Fix testcase | Lars-Dominik Braun | 1 | -3/+18 | |
The body is only available after receiving the loadingFinished event. | |||||
2019-06-17 | html: Fix CDATA walking | Lars-Dominik Braun | 2 | -5/+42 | |
Missing “from” keyword, returned generator instead of dicts. Properly recreate CDATA elements now. | |||||
2019-06-17 | cli: Log exit status | Lars-Dominik Braun | 1 | -0/+1 | |
2019-05-30 | controller: Fix -recursive stats | Lars-Dominik Braun | 1 | -2/+5 | |
The stats previously included running jobs. Remove them. | |||||
2019-05-30 | controller: Correctly re-raise exceptions | Lars-Dominik Braun | 1 | -1/+2 | |
asyncio.gather returns the task’s results or exception, not task objects. Probably a copy&paste error. | |||||
2019-05-30 | controller: Fix DepthLimit | Lars-Dominik Braun | 2 | -12/+45 | |
The policy itself must be stateless, since there can be multiple ExtractLinks events (which would cause DepthLimit to reduce its depth every time). | |||||
2019-05-26 | behavior: Add clicking for vimeo.com | Lars-Dominik Braun | 1 | -0/+11 | |
2019-05-24 | dashboard: Remove delete button | Lars-Dominik Braun | 2 | -16/+3 | |
There’s really no point in having it | |||||
2019-05-24 | dashboard: Add global bot stats | Lars-Dominik Braun | 2 | -2/+18 | |
2019-05-22 | behavior: Extract links from plain-text documents | Lars-Dominik Braun | 1 | -0/+13 | |
2019-05-13 | devtools: Try to delete temp Chrome data dir – hard | Lars-Dominik Braun | 1 | -1/+11 | |
Fixes #17. | |||||
2019-05-12 | behavior: Ignore invalid URLs when extracting links | Lars-Dominik Braun | 2 | -2/+18 | |
Fixes #18. | |||||
2019-05-05 | irc: Switch job id’s to proquints | Lars-Dominik Braun | 1 | -4/+41 | |
They’re easier to read and remember for humans. Plus we don’t really need 128 bits of randomness. Time-based id’s are fine here. | |||||
2019-05-05 | irc: Add job info to warcinfo record | Lars-Dominik Braun | 2 | -6/+22 | |
2019-05-05 | cli: Allow adding extra data to warcinfo record | Lars-Dominik Braun | 2 | -4/+12 | |
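
The 2019-07-13 entry mentions that the IRC bot's command splitting moved to shlex.split to allow shell-like argument quoting. As a minimal sketch of the difference, using only the standard library: the command name `archive` and the flag `--note` below are hypothetical placeholders, not options taken from this repository.

```python
import shlex


def split_command(line):
    """Split a chat command into arguments, honouring shell-style quotes."""
    return shlex.split(line)


# Hypothetical command line; only the quoting behaviour matters here.
line = 'archive --note "two words" http://example.com'

# Naive whitespace splitting tears the quoted argument apart.
print(line.split(" "))

# shlex.split keeps the quoted argument together and drops the quotes.
print(split_command(line))
```

With this approach a quoted value containing spaces arrives as a single argument, which is what a cookie value or similar free-form parameter would need.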
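
The 2019-05-30 entry on re-raising exceptions notes that asyncio.gather returns results or exceptions, not task objects. The sketch below illustrates that point in isolation. It assumes `return_exceptions=True` is in use (the log does not say so explicitly), and the coroutine names are made up for the example.

```python
import asyncio


async def ok():
    return "done"


async def broken():
    raise ValueError("boom")


async def run_all():
    # With return_exceptions=True, gather returns a list that mixes plain
    # return values and exception instances. There are no Task objects in
    # it, so calling .exception() or .result() on an item would fail.
    results = await asyncio.gather(ok(), broken(), return_exceptions=True)
    for item in results:
        if isinstance(item, Exception):
            raise item  # re-raise the first failure
    return results
```

Calling `asyncio.run(run_all())` here propagates the `ValueError` from `broken()` to the caller.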
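
The 2019-06-17 CDATA fix describes a missing “from” keyword that caused a generator to be returned instead of dicts. A generic illustration of that class of bug follows; the tree layout and function names are assumptions for the example and do not reproduce the project's HTML walker.

```python
import types


def walk_buggy(node):
    """Yield one dict per node, but recurse with a plain `yield`."""
    yield {"name": node["name"]}
    for child in node.get("children", []):
        # Bug: this yields the child generator object itself.
        yield walk_buggy(child)


def walk_fixed(node):
    """Yield one dict per node, delegating to children via `yield from`."""
    yield {"name": node["name"]}
    for child in node.get("children", []):
        yield from walk_fixed(child)


tree = {"name": "root", "children": [{"name": "cdata"}]}

buggy = list(walk_buggy(tree))
fixed = list(walk_fixed(tree))

# The buggy walker leaks a generator where a dict was expected.
print(isinstance(buggy[1], types.GeneratorType))
print(fixed)
```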
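
The 2019-05-05 entry switches IRC job ids to proquints because they are easier to read and remember. The log does not show how the ids are derived, so the following is only a sketch of the general proquint encoding (each 16-bit word becomes a consonant-vowel-consonant-vowel-consonant group). The function names and the 32-bit default are assumptions.

```python
CONSONANTS = "bdfghjklmnprstvz"  # 16 symbols, 4 bits each
VOWELS = "aiou"                  # 4 symbols, 2 bits each


def word_to_quint(word):
    """Encode one 16-bit integer as a five-letter group."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    return (
        CONSONANTS[(word >> 12) & 0xF]
        + VOWELS[(word >> 10) & 0x3]
        + CONSONANTS[(word >> 6) & 0xF]
        + VOWELS[(word >> 4) & 0x3]
        + CONSONANTS[word & 0xF]
    )


def to_proquint(value, bits=32):
    """Encode an unsigned integer of `bits` width (a multiple of 16)."""
    if bits % 16 != 0 or not 0 <= value < (1 << bits):
        raise ValueError("value does not fit the given width")
    words = [(value >> shift) & 0xFFFF for shift in range(bits - 16, -1, -16)]
    return "-".join(word_to_quint(w) for w in words)


# 0x7F000001 is the address 127.0.0.1 read as a 32-bit integer.
print(to_proquint(0x7F000001))  # lusab-babad
```

A time-based id, as the entry suggests, could be produced by passing a truncated timestamp to `to_proquint`; how the project actually does this is not stated in the log.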