summaryrefslogtreecommitdiff
path: root/README.rst
diff options
context:
space:
mode:
Diffstat (limited to 'README.rst')
-rw-r--r--README.rst20
1 files changed, 17 insertions, 3 deletions
diff --git a/README.rst b/README.rst
index 7eea272..f66da27 100644
--- a/README.rst
+++ b/README.rst
@@ -10,6 +10,7 @@ Dependencies
- Python 3
- pychrome_
- warcio_
+- html5lib
.. _pychrome: https://github.com/fate0/pychrome
.. _warcio: https://github.com/webrecorder/warcio
@@ -34,7 +35,20 @@ Caveats
-------
- Original HTTP requests/responses are not available. They are rebuilt from
- data available. Character encoding for text documents is changed to UTF-8.
-- Some sites request different assets based on screen resolution, some fetch
- different scripts based on user agent.
+ parsed data. Character encoding for text documents is changed to UTF-8.
+- Some sites request assets based on screen resolution, pixel ratio and
+ supported image formats (webp). Replaying those with different parameters
+ won’t work, since assets for those are missing. Example: missguided.com.
+- Some fetch different scripts based on user agent. Example: youtube.com.
+- Requests containing randomly generated JavaScript callback function names
+ won’t work. Example: weather.com.
+
+Most of these issues can be worked around by using the DOM snapshot, which is
+also saved. This causes its own set of issues though:
+
+- JavaScript-based navigation does not work.
+- Scripts modifying styles based on scrolling position are stuck at the end of
+ page state at the moment. Example: twitter.com
+- CSS-based asset loading (screen size, pixel ratio, …) still does not work.
+- Canvas contents are probably not preserved.