1 file changed, 11 insertions, 7 deletions
diff --git a/doc/index.html b/doc/index.html
index 6749647..19151b0 100644
--- a/doc/index.html
+++ b/doc/index.html
@@ -120,16 +120,18 @@
 		The corpus used for the following analysis consists of
 		</p>
 		<ul>
-			<li>547,110 articles from
-			<a href="https://www.aljazeera.net/">aljazeera.net</a>, an
-			Arabic-language news site</li>
-			<li>149,901 articles from <a href="http://www.bbc.com/arabic">BBC
-			Arabic</a>, another Arabic-language news site</li>
 			<li><a href="https://dumps.wikimedia.org/arwiki/20190701/">a
 			dump</a> of the <a href="https://ar.wikipedia.org/">Arabic
 			Wikipedia</a> as of July 2019, extracted using
 			<a href="https://github.com/attardi/wikiextractor/tree/3162bb6c3c9ebd2d15be507aa11d6fa818a454ac">wikiextractor</a>
 			containing 857,386 articles</li>
+			<li>547,110 articles from
+			<a href="https://www.aljazeera.net/">aljazeera.net</a>, an
+			Arabic-language news site</li>
+			<li>149,901 articles from <a href="http://www.bbc.com/arabic">BBC
+			Arabic</a>, another Arabic-language news site</li>
+			<li>116,754 documents from the
+			<a href="https://conferences.unite.un.org/UNCorpus/en/DownloadOverview">United Nations Parallel Corpus v1.0</a></li>
 			<li>1,709 ebooks from <a
 			href="https://www.hindawi.org/books">hindawi.org</a></li>
 			<li>and a plain-text copy of the Quran from <a
@@ -137,12 +139,14 @@
 			options Simple Enhanced and Text (for inclusion of diacritics)</li>
 		</ul>
 		<p>
-		summing up to roughly two billion characters.
+		summing up to roughly
+		825 million words or
+		5.5 billion characters. <!-- == combined button presses -->
 		<!-- -->
 		The plot below shows <bdo dir="ltr" lang="ar">ا ل ي م و ن</bdo> can be
 		considered the most frequently used letters in the Arabic language.
 		<!-- -->
-		Together they account for more than 50% of all letters in the corpus.
+		Together they account for more than 55% of all letters in the corpus.
 		</p>
 	</div>
 	</div>