|
In preparation for 1.0 release:
- Correct mime types
- Add X-Crocoite-Type, so logs, scripts, dom-snapshots and screenshots
can be identified easily
- Remove random WARC headers like X-Chrome-Initiator. We don’t want to
maintain those.
- Remove non-standard urn-based package URLs. Can’t use them without a
urn-registration
|
|
Using hypothesis-based testcase generation. This is quite nice compared
to manual test data generation, since it catches a lot more corner cases
(if done right).
This commit also fixes a few issues, including:
- log records will only be written if the log is nonempty
- properly quote packageUrl paths
- drop old thread checking code
- use placeholder url for scripts without name
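
A rough sketch of the kind of property such generated tests check, here for
the path quoting fix. It uses the stdlib `random` module as a stand-in for
hypothesis so it runs without extra dependencies. `package_url`, the `pkg://`
scheme and `random_text` are hypothetical names for illustration, not the
project's actual code.

```python
import random
import string
from urllib.parse import quote, unquote


def package_url(name: str, path: str) -> str:
    """Hypothetical helper: build a URL-like identifier with a quoted path."""
    # quote() keeps "/" by default, so the path hierarchy survives.
    return f"pkg://{quote(name, safe='')}/{quote(path)}"


def random_text(rng: random.Random, max_len: int = 20) -> str:
    """Generate text that includes characters which are awkward in URLs."""
    alphabet = string.printable + "äöü"
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))


def check_roundtrip(runs: int = 500, seed: int = 0) -> int:
    """Property: quoting is reversible and leaves no raw delimiters behind."""
    rng = random.Random(seed)
    prefix = "pkg://demo/"
    for _ in range(runs):
        path = random_text(rng)
        url = package_url("demo", path)
        assert unquote(url[len(prefix):]) == path
        assert not any(c in url for c in " #?")
    return runs
```

With hypothesis itself the generator and the loop would be replaced by a
strategy and the `@given` decorator, which also shrinks failing inputs.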
|
|
Replaces str.format, which is less readable due to its separation of
format and arguments.
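
The message does not name the replacement. Assuming it is f-strings, the
difference looks roughly like this (the variable names and message text are
made up for illustration):

```python
# Hypothetical values, only used to compare the two formatting styles.
name = "example.warc.gz"
count = 3

# str.format: placeholders and arguments are separated.
old = "wrote {} records to {}".format(count, name)

# f-string: each value sits where it is rendered.
new = f"wrote {count} records to {name}"
```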
|
|
Requirement extraction fails if the package is an .egg file (i.e. not
extracted). Do not try to compute checksum/file length for them.
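
A minimal sketch of the guard described above, assuming the check is simply
whether the module path is a regular file on disk. A module inside an
unextracted .egg has a path that points into the archive, so `os.path.isfile`
is false for it. `describe_module_file` and its field names are hypothetical.

```python
import hashlib
import os


def describe_module_file(path):
    """Return checksum and length for path, or None values if it is not a plain file."""
    if not path or not os.path.isfile(path):
        # E.g. /site-packages/foo.egg/foo/__init__.py, which only exists
        # inside the archive: skip checksum and length.
        return {"path": path, "sha512": None, "length": None}
    with open(path, "rb") as fd:
        data = fd.read()
    return {
        "path": path,
        "sha512": hashlib.sha512(data).hexdigest(),
        "length": len(data),
    }
```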
|
|
Use library yarl (already pulled in by aiohttp). No URL processed should
be a string.
|
|
In preparation for #9.
I was hoping to reuse one of schema.org’s microdata schemas, but
neither Action (archival action) nor SoftwareApplication (version
information) seems to be suitable.
|
Fix a few random issues pointed out by pylint, mainly unused imports.
|
|
This is a direct port to asyncio without any design changes. These need
to happen in further refinements.
Fixes issue #1.
|
|
Change warcinfo record format to JSON (this is permitted by the specs)
and add Python version, dependencies and their versions as well as file
hashes.
This should give us enough information to figure out the exact
environment used to create the WARC.
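
A minimal sketch of what such a JSON warcinfo payload could look like. The
field names and the `build_warcinfo` helper are assumptions for illustration,
not the format actually written. File hashes are left out to keep it short.

```python
import json
import platform


def build_warcinfo(software: str, dependencies: dict) -> bytes:
    """Serialize environment information as a JSON record body."""
    payload = {
        "software": software,
        "python": {
            "implementation": platform.python_implementation(),
            "version": platform.python_version(),
        },
        # Sorted so that two runs in the same environment produce the same bytes.
        "dependencies": [
            {"name": name, "version": version}
            for name, version in sorted(dependencies.items())
        ],
    }
    return json.dumps(payload, indent=2).encode("utf-8")
```

The resulting bytes would become the body of the warcinfo record, with the
record's Content-Type set to application/json.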
|
|
Judging from the docs this is the proper way to store these resources.
Enable both for the IRC bot by default, since they won’t interfere with
IA’s wayback machine.
|
|
No functional changes, just cleanup. Replaces onload and onsnapshot
events. Move screen metric emulation, DOM snapshots and screenshots here
as well.
|