author Lars-Dominik Braun <lars@6xq.net> 2017-11-20 19:19:05 +0100
committer Lars-Dominik Braun <lars@6xq.net> 2017-11-20 19:25:33 +0100
commit ca01f82227a8b79f1cbc4f5e0be5434804dc3c0e
tree dd8aafb9b1672f70985eb5dd14635eb8635dd5e3
parent 0b8a8e88a3c33c14e52241190ee6478cb2acd49d
Add page created from DOM snapshot
Diffstat (limited to 'README.rst')
-rw-r--r-- README.rst 20
1 file changed, 17 insertions, 3 deletions
diff --git a/README.rst b/README.rst
index 7eea272..f66da27 100644
--- a/README.rst
+++ b/README.rst
@@ -10,6 +10,7 @@ Dependencies
 - Python 3
 - pychrome_
 - warcio_
+- html5lib
 
 .. _pychrome: https://github.com/fate0/pychrome
 .. _warcio: https://github.com/webrecorder/warcio
@@ -34,7 +35,20 @@ Caveats
 -------
 
 - Original HTTP requests/responses are not available. They are rebuilt from
-  data available. Character encoding for text documents is changed to UTF-8.
-- Some sites request different assets based on screen resolution, some fetch
-  different scripts based on user agent.
+  parsed data. Character encoding for text documents is changed to UTF-8.
+- Some sites request assets based on screen resolution, pixel ratio and
+  supported image formats (webp). Replaying those with different parameters
+  won’t work, since assets for those are missing. Example: missguided.com.
+- Some fetch different scripts based on user agent. Example: youtube.com.
+- Requests containing randomly generated JavaScript callback function names
+  won’t work. Example: weather.com.
+
+Most of these issues can be worked around by using the DOM snapshot, which is
+also saved. This causes its own set of issues though:
+
+- JavaScript-based navigation does not work.
+- Scripts modifying styles based on scrolling position are stuck at the end of
+  page state at the moment. Example: twitter.com
+- CSS-based asset loading (screen size, pixel ratio, …) still does not work.
+- Canvas contents are probably not preserved.
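
The callback caveat added in this hunk can be illustrated with a minimal sketch. It assumes that replay looks up recorded responses by exact request URL, which is a common design for archive replay but is not stated in this commit. The ``archive`` dict, the ``replay`` and ``strip_param`` helpers, and the example URLs are all hypothetical and not part of crocoite.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical archive: recorded response bodies keyed by exact request URL.
archive = {
    "https://api.example.com/forecast?callback=cb_83412&city=berlin":
        'cb_83412({"temp": 4});',
}


def replay(url):
    """Return the recorded body for url, or None if it was never captured."""
    return archive.get(url)


def strip_param(url, name):
    """Return url without the query parameter `name` (illustrative helper)."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != name]
    return urlunsplit(parts._replace(query=urlencode(query)))


# URL seen while recording, and the URL the page generates on replay:
# the script picks a fresh random callback name each time it runs.
recorded = "https://api.example.com/forecast?callback=cb_83412&city=berlin"
replayed = "https://api.example.com/forecast?callback=cb_17790&city=berlin"

# Exact-URL lookup misses, although both URLs name the same resource.
exact_hit = replay(replayed)
same_resource = (
    strip_param(recorded, "callback") == strip_param(replayed, "callback")
)
```

Ignoring the ``callback`` parameter during lookup would not be enough on its own: the recorded body still invokes ``cb_83412``, a function the replayed page never defined.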