From 4679f89e8fe2541e10eb1c834eb9f56a68b0e3ee Mon Sep 17 00:00:00 2001
From: Lars-Dominik Braun
Date: Sat, 25 Apr 2020 21:07:11 +0200
Subject: ar-lulua: Optimize layer two and three

Take another stab at the symbol layers and call it v0.3.
---
 lulua/data/report/index.html | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

(limited to 'lulua/data/report/index.html')

diff --git a/lulua/data/report/index.html b/lulua/data/report/index.html
index 96725b7..0e4c779 100644
--- a/lulua/data/report/index.html
+++ b/lulua/data/report/index.html
@@ -230,7 +230,13 @@
 From several runs with 100.000 iterations each the layout which had good scores and looked reasonable to the human eye was picked.
- Optimal arrengement of layers two and up are still under investigation.
+ Afterwards the second layer was optimized using the same process, but
+ only using data from the Hindawi corpus, because it is the only one
+ with at least some fully diacriticised texts.
+
+ Finally the different brackets were arranged by hand and the remaining
+ symbols algorithmically distributed on the third layer using the raw
+ Wikitext from the Arabic Wikipedia dataset.

-- cgit v1.2.3