Diffstat (limited to 'lulua/data/report/index.html')
-rw-r--r-- lulua/data/report/index.html 286
1 files changed, 209 insertions, 77 deletions
diff --git a/lulua/data/report/index.html b/lulua/data/report/index.html
index e35e08f..e2108cd 100644
--- a/lulua/data/report/index.html
+++ b/lulua/data/report/index.html
@@ -5,15 +5,23 @@
<title>لؤلؤة: لوحة مفاتيح عربية</title>
<!--<meta name="description" content="">-->
<meta name="viewport" content="width=device-width, initial-scale=1">
-
+ {# Fonts #}
<link href="https://fonts.googleapis.com/css?family=IBM+Plex+Mono|IBM+Plex+Sans:100,400&display=swap" rel="stylesheet">
+
+ {# Pure.css #}
<link rel="stylesheet" href="https://unpkg.com/purecss@2.0.3/build/pure-min.css" integrity="sha384-cg6SkqEOCV1NbJoCu11+bm0NvBRc8IYLRGXkmNrqUBfTjmMYwNKPWBTIKyw9mHNJ" crossorigin="anonymous">
<link rel="stylesheet" href="https://unpkg.com/purecss@2.0.3/build/grids-responsive-min.css" integrity="sha384-+lKLtMyKzY/ZdqRXlhw2whazop7duDxgP+oWPB+EGry5wK+vJnZwSsgMnAmDS/39" crossorigin="anonymous">
- <script src="https://polyfill.io/v3/polyfill.min.js?features=es6" crossorigin="anonymous"></script>
- <script id="MathJax-script" async src="https://www.unpkg.com/mathjax@3.1.0/es5/tex-mml-chtml.js" integrity="sha384-ynpCZozLxgqK3wrgBu8qH8qPG3eD8mME8z0zugAX26UMb5HfLp2PtvtDH4vdmgkm" crossorigin="anonymous"></script>
+
+ {# TeX support #}
+ <link rel="stylesheet" href="https://unpkg.com/katex@0.13.18/dist/katex.min.css" integrity="sha384-zTROYFVGOfTw7JV7KUu8udsvW2fx4lWOsCEDqhBreBwlHI4ioVRtmIvEThzJHGET" crossorigin="anonymous">
+ <script defer src="https://unpkg.com/katex@0.13.18/dist/katex.min.js" integrity="sha384-GxNFqL3r9uRJQhR+47eDxuPoNE7yLftQM8LcxzgS4HT73tp970WS/wV5p8UzCOmb" crossorigin="anonymous"></script>
+
+ <script defer src="https://unpkg.com/katex@0.13.18/dist/contrib/auto-render.min.js" integrity="sha384-vZTG03m+2yp6N6BNi5iM4rW4oIwk5DfcNdFfxkk9ZWpDriOkXX8voJBFrAO7MpVl" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
+
{# bokeh #}
+ {# Defer loading bokeh, so it does not block rendering #}
{% for f in bokehres.js_files -%}
- <script src="{{ f }}"></script>
+ <script defer crossorigin="anonymous" src="{{ f }}"></script>
{%- endfor %}
{% for f in bokehres.css_files -%}
<link rel="stylesheet" href="{{ f }}">
@@ -129,11 +137,10 @@
<div class="lbox">
<h2>The Arabic alphabet</h2>
<p>
- 28 letters make up the Arabic alphabet and quite a few extra
- symbols are required for proper text input, like the hamza in its different
- shapes <bdo dir="ltr" lang="ar">أ إ آ ء ئ ؤ</bdo>, ta marbutah <bdo
- dir="ltr" lang="ar">ة</bdo>, alif maqsurah <bdo dir="ltr"
- lang="ar">ى</bdo> and various diacritics for vowelized texts.
+ 28 letters make up the Arabic alphabet and quite a few extra symbols are
+ required for proper text input, like the {{ hamzah }} in its different
+ shapes <bdo dir="ltr" lang="ar">أ إ آ ء ئ ؤ</bdo>, {{ tamarbutah
+ }}, {{ alifmaqsurah }} and various diacritics for vowelized texts.
<!-- -->
Since the performance of a keyboard layout depends on the text entered
it is necessary to study its mono-, di- and trigraph frequencies first.
@@ -222,8 +229,9 @@
</details>
<p>
- The plot below shows <bdo dir="ltr" lang="ar">ا ل ي م و ن</bdo> can be
- considered the most frequently used letters in the Arabic language.
+ The plot below shows {{ alif }}, {{ lam }}, {{ ya }}, {{ mim }}, {{
+ waw }} and {{ nun }} can be considered the most frequently used letters
+ in the Arabic language.
<!-- -->
Together they account for more than 55% of all letters in the corpus.
</p>
@@ -328,17 +336,17 @@
The most frequent letters have all been assigned to the home row, which
makes them easily accessible.
<!-- -->
- <bdo lang="ar" dir="ltr">ا</bdo> and <bdo lang="ar" dir="ltr">ل</bdo>
+ {{ Alif }} and {{ lam }}
are typed with different hands, balancing the load on hands almost
evenly.
<!-- -->
The index and middle finger of both hands share the majority of the
typing load, but naturally the left middle finger is used more
- frequently due to its assignment to the letter alif.
+ frequently due to its assignment to the letter {{ alif }}.
</p>
<p>
- The layout targets Quaranic and Modern Standard Arabic (MSA), also called Fusha
+ The layout targets Quranic and Modern Standard Arabic (MSA), also called Fuṣḥa
(<bdo lang="ar">الفصحى</bdo>), only.
<!-- -->
Dialectical Arabic (<bdo lang="ar">العامية</bdo>) is mainly a spoken
@@ -353,35 +361,35 @@
Designing the layout to be compose-based has both benefits and
disadvantages.
<!-- -->
- Compose-based mainly means the hamza <bdo lang="ar" dir="ltr">ء</bdo>
- is treated like an optional diacritic for Alef, Waw and Yah instead of
- viewing Alef-Hamza, Waw-Hamza and Yah-Hamza as precombined, atomic
- units.
+ Compose-based mainly means the {{ hamzah }} is treated like an optional
+ diacritic for {{ alif }}, {{ waw }} and {{ ya }} instead of viewing
+ {{ alifhamzah }}, {{ wawhamzah }} and {{ yahamzah }} as precombined,
+ atomic units.
<!-- -->
- Although <bdo lang="ar" dir="ltr">أ</bdo> and <bdo lang="ar"
- dir="ltr">ا</bdo> are not the same, the hamza can be dropped if the
- writer’s intention is unambigiously inferable from context.
+ Although {{ alifhamzah_ }} and {{ alif_ }} are not the same, the {{
+ hamzah_ }} can be dropped if the writer’s intention is unambiguously
+ inferable from context.
<!-- -->
- Thus it makes sense to provide hamza as a combining character on the
- keyboard.
+ Thus it makes sense to provide {{ hamzah_ }} as a combining character
+ on the keyboard.
<!-- -->
Additionally it uses two keys less than precombining it with its stems,
- allowing the entire alphabet plus hamza diacritic to fit on a single
+ allowing the entire alphabet plus hamzah diacritic to fit on a single
keyboard layer.
<!-- -->
However, there is a cost to this approach:
- All hamza variants account for {{
+ All {{ hamzah_ }} variants account for {{
'%.1f'|format(layoutstats['ar-osx'].hamzaImpact*100) }}% of button
combinations.
<!-- -->
- Splitting hamza and from its stem means doubling the total number of
- button combinations and thus button presses, decreasing scores like
+ Splitting {{ hamzah_ }} from its stem means doubling the total number
+ of button combinations and thus button presses, decreasing scores like
words per minute (WPM) slightly.
<!-- -->
- Splitting Alef and Alef-Hamza could also reduce pressure on left middle
- finger and allow for more even distribution, since {{
- layoutstats['ar-osx'].hamzaOnAlef|fraction }}<sup>th</sup> of all Alef
- uses are with Hamza.
+ Splitting {{ alif }} and {{ alifhamzah }} could also reduce pressure
+ on the left middle finger and allow for a more even distribution, since {{
+ layoutstats['ar-osx'].hamzaOnAlef|fraction }}<sup>th</sup> of all {{
+ alif }} uses are with {{ hamzah }}.
</p>
<details class="remarks">
<summary></summary>
@@ -415,23 +423,6 @@
{% endif %}
{% endfor %}
</div>
-<div class="sentencestats">
-<p lang="en">Examples:</p>
-<ul lang="ar">
-{% for sentence in stats.sentences %}
- <li>
- {% for match, weight in sentence[0] -%}
- {%- if weight is none -%}
- <span>{{ match }}</span>
- {%- else -%}
- {%- set c = weight|blendn((38, 139, 210), (108, 113, 196), (211, 54, 130), (220, 50, 47)) -%}
- <span style="color: rgb({{ c[0] }}, {{ c[1] }}, {{ c[2] }});" title="{{ '%5.5f'|format(weight) }}">{{ match }}</span>
- {%- endif -%}
- {%- endfor %}
- </li>
-{% endfor %}
-</ul>
-</div>
{% endmacro %}
<figure id="ar-lulua-heat">
@@ -450,7 +441,62 @@
<div class="lbox">
<h2><a href="#related">Related work</a></h2>
<p>This section explores existing keyboard layouts made for the
- Arabic language and analyzes their usability.</p>
+ Arabic language and analyzes their usability.
+ <!-- -->
+ Comparing them with the proposed layout above is difficult at best,
+ because the layouts presented below cover different character sets.
+ <!-- -->
+ Some lack numbers, some do not include short vowels and others provide
+ no way to type symbols.
+ <!-- -->
+ Therefore no individual score is assigned to each layout, but an analysis
+ of each layout’s features is given.
+ </p>
+ </div>
+ </div>
+ </div>
+
+ <figure id="triadeffort">
+ <div class="lbox" lang="en">
+ <div id="triadeffort-div"></div>
+ </div>
+ <figcaption class="pure-g flexreverse">
+ </figcaption>
+ </figure>
+
+ <div class="pure-g flexreverse">
+ <div class="pure-u-1 pure-u-xl-1-2">
+ </div>
+ <div class="pure-u-1 pure-u-xl-1-2" lang="en">
+ <div class="lbox">
+ <p>
+ Carpalx tries to minimize the effort of typing in blocks of three
+ consecutive keystrokes, triads, and thus a good layout in that sense
+ should make typing frequent triads easy.
+ <!-- -->
+ The figure above plots cumulative triad frequency on the x-axis and
+ weighted cumulative effort on the y-axis.
+ <!-- -->
+ Thus, at an x value of 0.5 the y value is the sum of triad frequencies
+ multiplied by their effort for all triads responsible for 50% of the
+ typing process.
+ <!-- -->
+ Standard layouts are the layouts from <a href="#ar-asmo663">ASMO</a>,
+ <a href="#ar-linux">Linux</a> and <a href="#ar-osx">OSX</a> whereas
+ <em>usable</em> lists only those which are actually relevant for typing.
+ <!-- -->
+ As we can see, the layout presented above meets the optimization goal.
+ <!-- -->
+ Only the top 5% of all triads are “easier” to type with <a
+ href="#ar-malas">Malas’ layout</a>, because lulua splits {{ hamzah }}
+ from its {{ alif }} stem.
+ <!-- -->
+ As expected the <a href="#ar-phonetic">phonetic layout</a> is one of the
+ worst ones, because QWERTY is not optimized for Arabic letter frequencies.
+ </p>
+ <p>
+ The following sections provide details about these layouts.
+ </p>
</div>
</div>
</div>
@@ -474,8 +520,8 @@
dir="ltr" lang="ar">ض ص، س ش، ح ج خ</bdo>) and not frequency.
<!-- -->
Also it overuses the right index finger by assigning the four
- high-frequency letters <bdo lang="ar" dir="ltr">ا ت و ة</bdo> to
- it.
+ high-frequency letters {{ alif }}, {{ ta }}, {{ waw }} and {{ tamarbutah
+ }} to it.
</p>
</div>
</div>
@@ -497,14 +543,14 @@
<h3><a href="#ar-osx">Mac OS X</a></h3>
<p>
Mac OS X’s Arabic keyboard layout makes a few small changes to ASMO
- 663 by moving the <bdo lang="ar" dir="ltr">ة</bdo> to a hard to
+ 663 by moving the {{ tamarbutah }} to a hard to
reach spot on the right of the top row.
<!-- -->
It also moves the short vowels from the first to the top row of the
second layer and replaces them with symbols.
<!-- -->
The bottom row keys are additionally shifted to the right, beginning
- with <bdo lang="ar" dir="ltr">ر</bdo>.
+ with {{ ra }}.
</p>
</div>
</div>
@@ -528,15 +574,14 @@
A more common layout is the one used on Linux, which also exists on
Windows with minor changes to the first layer.
<!-- -->
- While its top and center row barely differ from ASMO 663 the
- bottom row now contains a separate key for the ligature <bdo
- lang="ar" dir="ltr">ﻻ</bdo>, likely inherited from <a
+ While its top and center row barely differ from
+ ASMO 663 the bottom row now contains a separate key
+ for the ligature {{ lamalif }}, likely inherited from <a
href="https://oztypewriter.blogspot.com/2014/10/the-arabic-typewriter-keyboard-and.html">early
typewriter layouts</a>.
<!-- -->
But at the cost of pushing punctuation characters to the second
- layer, <bdo dir="ltr" lang="ar">د</bdo> into the top and <bdo
- dir="ltr" lang="ar">ذ</bdo> even further into the number row.
+ layer, {{ dal }} into the top and {{ dhal }} even further into the number row.
</p>
</div>
</div>
@@ -591,10 +636,10 @@
</p>
<p>
While the layout distributes load between fingers quite well it
- favors the left hand by assigning <bdo dir="ltr" lang="ar">ا</bdo>
- and <bdo dir="ltr" lang="ar">ل</bdo> to it.
+ favors the left hand by assigning {{ alif }}
+ and {{ lam }} to it.
<!-- -->
- The decision to place <bdo dir="ltr" lang="ar">ث</bdo> in a very
+ The decision to place {{ tha }} in a very
prominent spot seems weird, given it only accounts for 0.5% of all
symbols, even in their own analysis.
</p>
@@ -636,15 +681,13 @@
<!-- -->
Probably due to their unusual assumption that middle- and
ring-finger rest in the top row their results are suboptimal,
- placing both <bdo dir="ltr" lang="ar">ا</bdo> and <bdo dir="ltr"
- lang="ar">ي</bdo> in the top row.
+ placing both {{ alif }} and {{ ya }} in the top row.
<!-- -->
Their analysis notices this and suggests improved positions for
both characters, but these are not actually implemented.
<!-- -->
- The big asymmetry is caused by placing <bdo dir="ltr" lang="ar">ا
- ل ي</bdo> and <bdo dir="ltr" lang="ar">و</bdo>, four of the five
- most frequent letters, on the right hand side.
+ The big asymmetry is caused by placing {{ alif }}, {{ lam }}, {{ ya }} and
+ {{ waw }}, four of the five most frequent letters, on the right hand side.
</p>
</div></div>
</div>
@@ -672,11 +715,11 @@
optimized for typing speed only, claiming 35% faster typing compared
to the <a href="#ar-linux">currently used layouts</a>.
<!-- -->
- However the decision to put <bdo dir="ltr" lang="ar">ي</bdo> in the top
+ However the decision to put {{ ya }} in the top
row seems odd.
<!-- -->
- Assigning the same left index finger to <bdo dir="ltr" lang="ar">ا
- ي و</bdo>, which are three of the most frequent letters, heavily
+ Assigning the same left index finger to {{ alif }},
+ {{ ya }} and {{ waw }}, which are three of the most frequent letters, heavily
strains this particular finger.
</p>
</div>
@@ -711,8 +754,8 @@
well.
<!-- -->
However their algorithm seems to favor the bottom row instead of the
- easier to use top row since it places the letters <bdo dir="ltr"
- lang="ar">ب ت ر</bdo> there.
+ easier to use top row since it places the letters {{ ba }}, {{ ta }}
+ and {{ ra }} there.
</p>
</div>
</div>
@@ -746,20 +789,102 @@
provide <em>three</em> single-quote marks ’ and <em>two</em> Arabic
semicolon <bdo dir="ltr" lang="ar">؛</bdo>.
<!-- -->
- Additionally it places <bdo dir="ltr" lang="ar">ي</bdo> in an even
+ Additionally it places {{ ya }} in an even
worse position than Malas’ layout.
</p>
</div>
</div>
</div>
- <figure id="ar-osman">
+ <figure>
<div class="lbox">
<img src="ar-osman-heat.svg">
{{ fingerhandstats(layoutstats['ar-osman']) }}
</div>
</figure>
+ <div id="ar-qtaish" class="pure-g flexreverse">
+ <div class="pure-u-1 pure-u-xl-1-2">
+ </div>
+ <div class="pure-u-1 pure-u-xl-1-2" lang="en">
+ <div class="lbox">
+ <h3><a href="#ar-qtaish">Qtaish et al.</a></h3>
+ <p>
+ Qtaish et al. presented this layout in
+ <a href="http://www.sci-int.com/pdf/637456047563529791.pdf">An Improved Arabic Keyboard Layout</a> in 2021.
+ <!-- -->
+ They use a novel corpus of 5 million words or roughly 66 million letters
+ <!-- 65713689 adding up the numbers in their paper --> consisting of
+ newspapers, (now defunct) social networks and blogs, as well as six
+ dictionaries (see remarks).
+ <!-- -->
+ Then letters were classified into three categories based on their
+ frequency, which apparently were used to populate home, top and bottom
+ row (in this order).
+ <!-- -->
+ Additionally bigrams were somehow used to arrange letters and avoid
+ placing them on the same or adjacent fingers, making rolling finger
+ movements incentivized by carpalx impossible.
+ <!-- -->
+ Ultimately it looks like the layout was designed by hand and not through
+ an automated process. <!-- see section 5 of the paper -->
+ </p>
+
+ <details class="remarks">
+ <summary></summary>
+ <em>Alfarahindi</em> and <em>Alein Dictionary</em> are most likely the same 8th century book <a href="https://en.wikipedia.org/wiki/Kitab_al-%27Ayn">Kitab al-'Ayn</a>.
+ <!-- -->
+ There are also <a href="https://ar.wikipedia.org/wiki/%D8%AA%D8%A7%D8%AC_%D8%A7%D9%84%D8%B9%D8%B1%D9%88%D8%B3_%D9%85%D9%86_%D8%AC%D9%88%D8%A7%D9%87%D8%B1_%D8%A7%D9%84%D9%82%D8%A7%D9%85%D9%88%D8%B3">Taj-Alaroos</a>, <a href="https://www.lesanarab.com/letter/">Lesan Alarab</a>, <a href="https://ar.wikipedia.org/wiki/%D8%A7%D9%84%D9%85%D8%B9%D8%AC%D9%85_%D8%A7%D9%84%D9%88%D8%B3%D9%8A%D8%B7">Almujam Alwaseet</a> and <a href="https://ar.wikipedia.org/wiki/%D8%A7%D9%84%D9%85%D9%86%D8%AC%D8%AF_(%D9%84%D9%88%D9%8A%D8%B3_%D9%85%D8%B9%D9%84%D9%88%D9%81)">Almunjed</a>.
+ <!-- -->
+ All of them are lexica, not dictionaries.
+ </details>
+
+ <p>
+ Although the authors do not provide a number row, it has been added for
+ fair comparison.
+ <!-- -->
+ However, to be actually usable, the layout would need punctuation symbols and diacritics.
+ </p>
+ </div>
+ </div>
+ </div>
+
+ <figure id="ar-qtaish">
+ <div class="lbox">
+ <img src="ar-qtaish-heat.svg">
+ {{ fingerhandstats(layoutstats['ar-qtaish']) }}
+ </div>
+ </figure>
+
+ <div id="ar-ergoarabic" class="pure-g flexreverse">
+ <div class="pure-u-1 pure-u-xl-1-2">
+ </div>
+ <div class="pure-u-1 pure-u-xl-1-2" lang="en">
+ <div class="lbox">
+ <h3><a href="#ar-ergoarabic">Ergoarabic</a></h3>
+ <p>
+ First released in 2021, <a
+ href="https://github.com/darkstego/ergoarabic">Ergoarabic</a> is the
+ most recent optimized layout for Arabic.
+ <!-- -->
+ On the first layer it combines the positions of 17 keys from the <a
+ href="#ar-linux">standard PC layout</a> with brackets and punctuation
+ symbols from QWERTY, aiming to retain compatibility with both.
+ <!-- -->
+ The remaining keys have been re-arranged or moved to the shift layer by
+ hand for improved ergonomics.
+ </p>
+ </div>
+ </div>
+ </div>
+
+ <figure id="ar-ergoarabic">
+ <div class="lbox">
+ <img src="ar-ergoarabic-heat.svg">
+ {{ fingerhandstats(layoutstats['ar-ergoarabic']) }}
+ </div>
+ </figure>
+
<div id="ar-phonetic" class="pure-g flexreverse">
<div class="pure-u-1 pure-u-xl-1-2">
</div>
@@ -769,7 +894,7 @@
<p>
The <a href="http://arabic.omaralzabir.com/home">Arabic Phonetic Keyboard</a>
simply maps the QWERTY layout to Arabic letters, based on their sound.
- Thus Q becomes <bdo dir="ltr" lang="ar">ق</bdo>, Y becomes <bdo dir="ltr" lang="ar">ي</bdo> and so on.
+ Thus Q becomes {{ qaf }}, Y becomes {{ ya }} and so on.
It claims to be optimized for writing vowelized texts, especially
Quranic Arabic, and thus includes quite a few combining characters and
special symbols.
@@ -836,9 +961,16 @@
</section>
<script>
-fetch('letterfreq.json')
- .then(function(response) { return response.json(); })
- .then(function(item) { Bokeh.embed.embed_item(item, 'letterfreq-div'); })
+function loadPlots () {
+ const plots = ['letterfreq', 'triadeffort'];
+ for (const p of plots) {
+ fetch(p + '.json')
+ .then(function(response) { return response.json(); })
+ .then(function(item) { Bokeh.embed.embed_item(item, p + '-div'); });
+ }
+}
+if (document.readyState != "loading") loadPlots ();
+else document.addEventListener("DOMContentLoaded", loadPlots);
</script>
</body>
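
The paragraph added in the `triadeffort` hunk describes a curve with cumulative triad frequency on the x-axis and weighted cumulative effort on the y-axis. The following is a minimal sketch of how such a curve could be computed, not lulua's actual implementation. The function name `cumulativeTriadEffort`, the input shape (`freq`, `effort`) and the ordering from most to least frequent triad are all assumptions made for illustration.

```javascript
// Hypothetical helper, not part of lulua: builds the points of a
// cumulative effort curve from a list of triads.
// Each triad is { freq: <count or relative frequency>, effort: <effort score> }.
function cumulativeTriadEffort(triads) {
  // Normalize so that the x-axis ends at 1.0 (100% of all typing).
  const total = triads.reduce((sum, t) => sum + t.freq, 0);
  // Assumption: triads are walked from most to least frequent.
  const sorted = [...triads].sort((a, b) => b.freq - a.freq);
  let x = 0; // cumulative triad frequency
  let y = 0; // cumulative frequency-weighted effort
  return sorted.map((t) => {
    const share = t.freq / total;
    x += share;
    y += share * t.effort;
    return { x, y };
  });
}

// Toy data: three triads with made-up frequencies and effort values.
const points = cumulativeTriadEffort([
  { freq: 2, effort: 1.0 },
  { freq: 1, effort: 3.0 },
  { freq: 1, effort: 2.0 },
]);
console.log(points);
```

Under these assumptions, the y value at x = 0.5 is the frequency-weighted effort of the triads that together account for half of all typing, which matches the reading given in the paragraph.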