Diffstat (limited to 'doc')
-rw-r--r-- | doc/index.html | 6 |
1 files changed, 4 insertions, 2 deletions
diff --git a/doc/index.html b/doc/index.html
index f9daf88..6749647 100644
--- a/doc/index.html
+++ b/doc/index.html
@@ -129,13 +129,15 @@
     dump</a> of the <a href="https://ar.wikipedia.org/">Arabic Wikipedia</a>
     as of July 2019, extracted using <a
     href="https://github.com/attardi/wikiextractor/tree/3162bb6c3c9ebd2d15be507aa11d6fa818a454ac">wikiextractor</a>
-    containing 857386 articles</li>
+    containing 857,386 articles</li>
+    <li>1,709 ebooks from <a
+    href="https://www.hindawi.org/books">hindawi.org</a></li>
     <li>and a plain-text copy of the Quran from <a
     href="http://tanzil.net/docs/download">tanzil.net</a> using the options
     Simple Enhanced and Text (for inclusion of diacritics)</li>
     </ul>
     <p>
-    summing up to roughly 1.5 billion characters.
+    summing up to roughly two billion characters.
     <!-- -->
     The plot below shows <bdo dir="ltr" lang="ar">ا ل ي م و ن</bdo> can be
     considered the most frequently used letters in the Arabic language.