This is a work in progress and contributions are welcome. Head over to GitHub to see where you can help.
Other languages using the Arabic alphabet (regional dialects, Urdu, Persian) are explicitly not supported.
There are 28 letters in the Arabic alphabet, plus quite a few extra symbols required for proper text input, such as the hamza in its different shapes أ إ آ ء ئ ؤ, ta marbutah ة, alif maqsurah ى and various diacritics for vowelized texts. Since the usability of a keyboard layout depends on the text entered, it is necessary to study the frequencies of letters and letter combinations first. The corpus used for the following analysis consists of
summing up to roughly two billion characters. The plot below shows that ا ل ي م و ن can be considered the most frequently used letters in the Arabic language. Together they account for more than 50% of all letters in the corpus.
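The kind of frequency analysis described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the script used for the analysis on this page: the names `letter_shares`, `bigram_counts` and `top_share` are hypothetical, diacritics are simply ignored, and the sample string stands in for a real corpus.

```python
from collections import Counter

# The 28 basic letters plus the extra shapes mentioned in the text
# (hamza variants, ta marbutah, alif maqsurah). Diacritics are ignored
# here; that is a simplifying assumption of this sketch.
BASIC_LETTERS = "ابتثجحخدذرزسشصضطظعغفقكلمنهوي"
EXTRA_LETTERS = "أإآءئؤةى"
LETTERS = set(BASIC_LETTERS + EXTRA_LETTERS)


def letter_shares(text):
    """Return (letter, share) pairs, most frequent first."""
    counts = Counter(ch for ch in text if ch in LETTERS)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(ch, n / total) for ch, n in counts.most_common()]


def bigram_counts(text):
    """Count adjacent letter pairs; pairs spanning a space or other symbol are skipped."""
    pairs = Counter()
    for a, b in zip(text, text[1:]):
        if a in LETTERS and b in LETTERS:
            pairs[a + b] += 1
    return pairs


def top_share(text, n=6):
    """Combined share of the n most frequent letters."""
    return sum(share for _, share in letter_shares(text)[:n])


if __name__ == "__main__":
    sample = "السلام عليكم"  # stand-in for a real corpus
    for letter, share in letter_shares(sample):
        print(letter, f"{share:.1%}")
    print("top 6 letters:", f"{top_share(sample):.1%}")
```

On a real corpus, `top_share(text, 6)` is the figure to compare against the "more than 50%" statement above, and `bigram_counts` gives the letter-combination counts needed to evaluate a layout.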
Below are statistics for the proposed layout.
Although technically not a layout but an alternative input method, Intellark by Intellaren is worth mentioning. It is based on repeatedly pressing the same key to modify the current character. For example, pressing A on the QWERTY keyboard cycles through the alternatives ا أ إ آ and ء. Obviously, this is slow, error-prone and violates Dvorak’s guidelines for keyboard layout design.
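To make the cost of this approach concrete, here is a minimal sketch of a press-to-cycle input method in Python. It is not Intellark's actual implementation: the `CYCLES` table contains only the A key example given above, the function name `type_keys` is hypothetical, and wrapping around after the last alternative is an assumption. The sketch also has no way to enter the same letter twice in a row, which a real input method has to solve somehow.

```python
# Alternatives reached by pressing the same key repeatedly.
# Only the A key from the example above is modelled here.
CYCLES = {"a": ["ا", "أ", "إ", "آ", "ء"]}


def type_keys(keys):
    """Translate a sequence of key presses into text.

    Repeating a key replaces the last character with the next
    alternative (assumed to wrap around); keys without a cycle
    are passed through unchanged.
    """
    out = []
    last_key = None
    index = 0
    for key in keys:
        options = CYCLES.get(key)
        if options is None:
            out.append(key)
            last_key = None
            continue
        if key == last_key:
            index = (index + 1) % len(options)
            out[-1] = options[index]
        else:
            index = 0
            out.append(options[0])
        last_key = key
    return "".join(out)


if __name__ == "__main__":
    for presses in range(1, 6):
        print(presses, type_keys("a" * presses))
```

In this model a plain hamza ء takes five presses of the same key, which illustrates why the method is slow compared to a layout with a dedicated key.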