Keyboard Layout Customization
The letters on my old keyboard are wearing off again. I have used it for only a few months, and I can't even remember how many keyboards I have thrown away for this reason. The Germans say, "whoever buys cheap buys twice". That's true, and it is such a waste. This is not the lifestyle I want. I prefer to keep one thing for a long time instead of replacing it every so often, even though nowadays throwing away cheap products and buying new ones may well be the more economical choice.
With this thought in mind, I decided to build a custom keyboard with a set of high-quality keycaps. I went for the Akko 5075 DIY kit with Akko V3 Creamy Purple switches. For the keycaps, I ordered a set in XDA profile.
But then I noticed that, because XDA keycaps have the same height in every row, I can put any key anywhere I want. That means I can change the keyboard layout to whatever I like. Since I'm "building" a keyboard anyway, why not build the layout too?
I'm already used to the QWERTY layout, but that doesn't mean I'm satisfied with it. My biggest problem is the position of the keys "d", "e", "c" and "x". When I type something like "exceeded" or "succeeded", either one finger keeps moving back and forth, or two fingers squeeze against each other trying to get out of each other's way. Besides that, the "y" key, although used very often, sits in a position that is hard to reach. These two issues have noticeably hurt my typing speed and typing experience.
So my goal is to design a keyboard layout that fits my own typing style and my own language usage, to improve typing speed and typing experience as much as possible.
I have a rough plan. First, I gather the key frequencies and the bigram frequencies for the languages I use. Then I record my typing latency for every possible bigram of Latin letters. With these two sets of data, I can evaluate any layout, and the goal is to find the layout that minimizes the overall latency.
The first set of data is relatively simple to get. Word frequency lists already exist for many languages. For example:
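As a rough illustration of how the two data sets could be combined into one score (this is not the original program), the sketch below weights each bigram's latency by its frequency. The function name `layout_cost`, the idea of naming physical keys by their QWERTY legend, and all numbers are assumptions made up for this example.

```python
from typing import Dict


def layout_cost(layout: Dict[str, str],
                bigram_freq: Dict[str, float],
                position_latency: Dict[str, float]) -> float:
    """Return the frequency-weighted latency of a layout.

    layout maps each letter to the physical key it is placed on. Physical
    keys are named by their QWERTY legend (an assumption of this sketch),
    so latencies recorded on QWERTY can be reused for any layout.
    """
    total = 0.0
    for bigram, freq in bigram_freq.items():
        first, second = bigram
        total += freq * position_latency[layout[first] + layout[second]]
    return total


# Toy data, invented for illustration only.
toy_freq = {"th": 0.6, "he": 0.4}
toy_latency = {"th": 0.15, "he": 0.20, "fj": 0.10, "jk": 0.12}

qwerty = {"t": "t", "h": "h", "e": "e"}      # every letter stays in place
candidate = {"t": "f", "h": "j", "e": "k"}   # letters moved to other keys

print(layout_cost(qwerty, toy_freq, toy_latency))
print(layout_cost(candidate, toy_freq, toy_latency))
```

A lower value means the layout is expected to be faster for the given text statistics.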
- English: https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists/PG/2006/04/1-10000
- Chinese: https://lingua.mtsu.edu/chinese-computing/phonology/syllable.php
- German: https://en.wiktionary.org/wiki/User:Matthias_Buchmeier#German_frequency_list
For each language, I calculated the key and bigram frequencies separately. After that, I normalised those frequencies, so that the values are not influenced by the different sample sizes of the different languages.
In the next step, I put a weight on each language: Chinese 2, English 3 and German 1. This roughly reflects how much I type in each language. With those weights, I calculated the weighted average frequencies of all keys and bigrams.
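The counting, normalising and weighting steps could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names are hypothetical, the word counts are tiny invented samples, and only the weights (Chinese 2, English 3, German 1) come from the text.

```python
from collections import Counter
from typing import Dict, Mapping


def bigram_frequencies(word_counts: Mapping[str, int]) -> Dict[str, float]:
    """Count letter bigrams inside words and normalise them to sum to 1."""
    counts: Counter = Counter()
    for word, n in word_counts.items():
        for a, b in zip(word, word[1:]):
            counts[a + b] += n
    total = sum(counts.values())
    return {bigram: c / total for bigram, c in counts.items()}


def weighted_average(per_language: Mapping[str, Mapping[str, float]],
                     weights: Mapping[str, float]) -> Dict[str, float]:
    """Combine normalised per-language frequencies using language weights."""
    total_weight = sum(weights[lang] for lang in per_language)
    combined: Dict[str, float] = {}
    for lang, freqs in per_language.items():
        for bigram, f in freqs.items():
            combined[bigram] = combined.get(bigram, 0.0) + weights[lang] * f / total_weight
    return combined


# Tiny invented samples; real input would be the full frequency lists.
samples = {
    "chinese": {"hua": 2},           # pinyin syllable counts
    "english": {"the": 3, "he": 1},
    "german": {"der": 1},
}
weights = {"chinese": 2, "english": 3, "german": 1}

per_language = {lang: bigram_frequencies(wc) for lang, wc in samples.items()}
combined = weighted_average(per_language, weights)
print(combined)
```

Because every language is normalised before weighting, a longer word list does not dominate the combined result.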
To record the bigram typing latency, I wrote a key speed recorder program. It generates all possible bigrams and picks one at random. I have to type each bigram 20 times as fast as I can, so that the average latency can be calculated. There are 26 letters plus the semicolon, which makes 27 keys and 27 * 27 = 729 possible bigrams. It took me several hours to finish the recording.
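The bookkeeping part of such a recorder might be sketched as follows. The actual key capture and timing are left out, and the names `trial_order` and `average_latencies` are hypothetical, so treat this as an outline rather than the program described here.

```python
import itertools
import random
import string
from statistics import mean
from typing import Dict, List

KEYS = string.ascii_lowercase + ";"   # 26 letters plus the semicolon
BIGRAMS = ["".join(pair) for pair in itertools.product(KEYS, repeat=2)]
REPETITIONS = 20                      # each bigram is typed 20 times


def trial_order(seed: int = 0) -> List[str]:
    """Return every bigram exactly once, in random order."""
    order = list(BIGRAMS)
    random.Random(seed).shuffle(order)
    return order


def average_latencies(samples: Dict[str, List[float]]) -> Dict[str, float]:
    """Average the recorded latencies (in seconds) for each bigram."""
    return {bigram: mean(times) for bigram, times in samples.items()}


print(len(BIGRAMS))  # 729
```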
With the information gathered, I thought it should be easy to design the keyboard now. I just go through all the possible layouts, find the one with the lowest latency and call it a day, right? No! There are 27! ≈ 1.089e+28 possibilities. Computing the latency for each and every possible layout would take far too long. I had to find a better algorithm.
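The size of the search space is easy to check. The evaluation rate of one million layouts per second used below is purely hypothetical, chosen only to show why brute force is hopeless.

```python
import math

layouts = math.factorial(27)
print(f"{layouts:.3e}")  # 1.089e+28

RATE = 1_000_000  # layouts evaluated per second (hypothetical)
years = layouts / RATE / (365 * 24 * 3600)
print(f"{years:.1e} years")
```

Even at that rate, the exhaustive search would run for far longer than the age of the universe.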
At first, I thought of something along the lines of a greedy strategy. Starting from a randomly generated keyboard layout, the program swaps one random pair of keys. If the result is better, it keeps the swapped version. If not, it tries another pair, until no swap improves the layout any further.
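A pair-swap search of this kind could be sketched as below. This is not the original code: `hill_climb` is a hypothetical name, and the toy cost (number of keys not in alphabetical position) merely stands in for the real latency-based evaluation.

```python
import itertools
import random
from typing import Callable, List, Sequence


def hill_climb(layout: Sequence[str],
               cost: Callable[[Sequence[str]], float],
               rng: random.Random) -> List[str]:
    """Swap random key pairs, keeping a swap only if it lowers the cost.

    Stops when no single swap improves the layout (a local minimum).
    """
    current = list(layout)
    best = cost(current)
    improved = True
    while improved:
        improved = False
        pairs = list(itertools.combinations(range(len(current)), 2))
        rng.shuffle(pairs)
        for i, j in pairs:
            current[i], current[j] = current[j], current[i]
            new_cost = cost(current)
            if new_cost < best:
                best = new_cost
                improved = True
                break  # keep the swap and start over
            current[i], current[j] = current[j], current[i]  # undo the swap
    return current


# Toy cost: how many keys are not at their alphabetical position.
TARGET = list("abcdef")


def toy_cost(layout: Sequence[str]) -> int:
    return sum(1 for key, want in zip(layout, TARGET) if key != want)


result = hill_climb(list("fedcba"), toy_cost, random.Random(0))
print("".join(result), toy_cost(result))
```

With a real latency cost, different random starting layouts generally end in different local minima.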
The algorithm was efficient, but it only finds a local minimum. Soon I had a huge number of different results, each claimed to be "the best", but they were really just a bunch of "local valleys". There had to be a mechanism to pull the algorithm out of a local valley, and the best candidate I could think of was the so-called "genetic algorithm".
Genetic algorithms are inspired by how genes work in the real world. I used a slightly modified version. At the beginning, a set of randomly generated layouts is created. Then the program enters a loop. In each iteration, it randomly divides the layouts into groups of two, so that they can "pair" with each other. For each pair of parents, it mixes random features, in my case the positions of the keys, from both parents, and creates their "child" from these features plus some randomness. The child then goes through the local-best algorithm again, so that it "grows up". Finally, the program selects the best results as survivors, which form the base for the next iteration.
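One iteration of such a loop might look like the sketch below. The crossover rule (keep a random half of the positions from one parent and fill the rest in the other parent's order), the mutation rate and all names are assumptions made for this example, not the exact variant used here. The `improve` hook is where the local pair-swap search would be plugged in so that a child can "grow up".

```python
import random
from typing import Callable, List, Optional, Sequence

Layout = List[str]


def crossover(mother: Layout, father: Layout, rng: random.Random) -> Layout:
    """Keep about half of the mother's key positions, fill gaps from the father."""
    kept: List[Optional[str]] = [k if rng.random() < 0.5 else None for k in mother]
    used = {k for k in kept if k is not None}
    leftovers = iter([k for k in father if k not in used])
    return [k if k is not None else next(leftovers) for k in kept]


def mutate(layout: Layout, rng: random.Random, rate: float = 0.2) -> Layout:
    """Occasionally swap two keys to add some randomness."""
    child = list(layout)
    if rng.random() < rate:
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child


def next_generation(population: List[Layout],
                    cost: Callable[[Sequence[str]], float],
                    rng: random.Random,
                    improve: Callable[[Layout], Layout] = lambda layout: layout
                    ) -> List[Layout]:
    """Pair layouts at random, breed one child per pair, keep the best."""
    shuffled = list(population)
    rng.shuffle(shuffled)
    children = [improve(mutate(crossover(a, b, rng), rng))
                for a, b in zip(shuffled[0::2], shuffled[1::2])]
    return sorted(population + children, key=cost)[:len(population)]


# Toy run: the cost counts keys that are not in alphabetical position.
rng = random.Random(42)
target = list("abcdefgh")


def toy_cost(layout: Sequence[str]) -> int:
    return sum(a != b for a, b in zip(layout, target))


population = [rng.sample(target, len(target)) for _ in range(10)]
start_best = min(toy_cost(p) for p in population)
for _ in range(30):
    population = next_generation(population, toy_cost, rng)
end_best = toy_cost(population[0])
print(start_best, end_best)
```

Because the parents compete with their children for survival, the best cost can never get worse from one iteration to the next.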
This algorithm works like a charm. After 70 to 80 iterations, the result becomes relatively stable, and if I run it several times, the results stay almost the same. It is very likely that I have found the global best! The best layouts look like this:
w v r h m k g o u ;
l s n d f p t i a e
q b j y x z c

x v w h l k g o u ;
r s n d f p t i a e
q b m y j z c

w k r h m q g o u ;
l s n d f p t i a e
v b j y x z c

x k w h m q g u o ;
r s n d f p t i a e
v b l y j z c

x w r h m k g o u ;
l s n d f p t i a e
q v b y j z c
Although they are probably the global best according to the statistical data, I liked none of them. Most of them have big issues. For example, sequences like "ck", "tion", "rn", "bs" and the Chinese pinyin "ua" still cause trouble in all of these layouts.
After playing with the statistical data and the algorithm for quite a while, my takeaways were:
- slightly different data from the experiment causes huge differences in the resulting keyboard layout.
- putting all the vowels in the upper right corner seems to be very good practice, because vowels appear in every word and are mostly combined with consonants, which then sit on the other hand. In my experience, alternating between the left and the right hand gives the lowest typing latency.
- the algorithm finds the lowest overall typing latency based on the statistical data. But my idea of the perfect layout had shifted towards avoiding single-finger bigrams as much as possible, which does not match the goal of the algorithm.
The approach had to change completely.
I abandoned the recorded bigram typing latency data and instead looked for a clustering strategy that divides the keys into groups of sizes 3-3-3-6-6-2-2-2, where each group corresponds to the keys pressed by a single finger. The best clustering is the one that keeps the overall chance of single-finger bigrams as low as possible. This was achieved simply by modifying the evaluation part of the code.
The result was:
i; oe vu ypgwcf dkmbtj lhn srz qxa
q s l d k y p o v i
x r h m b g w e u ;
a z n t j c f
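The modified evaluation for such a grouping can be sketched as a sum over bigrams whose two letters fall on the same finger. The function name, the choice to ignore doubled letters (they use one finger in every layout) and the toy frequencies are assumptions made for this example. The first grouping is the result shown above, the second is the standard QWERTY finger assignment.

```python
from typing import Dict, Sequence


def same_finger_rate(groups: Sequence[str], bigram_freq: Dict[str, float]) -> float:
    """Total frequency of bigrams whose two different letters share a finger."""
    finger_of = {key: finger for finger, group in enumerate(groups) for key in group}
    return sum(freq for (a, b), freq in bigram_freq.items()
               if a != b and finger_of[a] == finger_of[b])


CLUSTERED_GROUPS = "i; oe vu ypgwcf dkmbtj lhn srz qxa".split()
QWERTY_GROUPS = "qaz wsx edc rfvtgb yhnujm ik ol p;".split()

# Toy frequencies, invented for illustration only.
toy_freq = {"ed": 0.4, "ce": 0.2, "ua": 0.3, "oe": 0.1}

print(same_finger_rate(QWERTY_GROUPS, toy_freq))
print(same_finger_rate(CLUSTERED_GROUPS, toy_freq))
```

Swapping this function in as the cost leaves the rest of the search unchanged, which is why only the evaluation part had to be modified.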
Based on this result, I can work out the exact layout by moving keys around. If I move a key, I should either move it within its group, so that the key is still typed with the same finger, or move a whole group to another place where it fits, so that the group is typed by a different finger altogether. After several hands-on tests, I found my personal favorite, which is also the layout I'm currently using.
j r h w p q k o v ;
a s n g y b t e u i
x z l c f m d , .
Since the keys on the left-hand home row are "A", "S", "N", "G" and "Y", I call this layout ASNGY (pronounced "assengy").
I have tested the layout, and I'm very happy with it. There is a keyboard layout evaluation website. I tried it with ASNGY, and the result is 1.879, which is slightly better than Halmak. Although it is not the best among all layouts, I'm very satisfied with this result, since these scores only apply to English. In Chinese pinyin, vowel combinations like "ue" and "ua" are very frequent, but most other layouts tend to pile the vowels up together, which still causes a lot of single-finger bigram problems. In my current design, the only piled-up vowels are "e" and "o", which seldom appear next to each other in Chinese.
Since the keycaps will arrive later, I can't test the layout on my newly built keyboard yet, so I rearranged the keys on my laptop to start practising.
The only problem now is that I'm too used to QWERTY. My typing speed with ASNGY is about 13 WPM, compared to 90 WPM with QWERTY, so there is still a long way to go. But since QWERTY only got a score of 2.4 on the layout evaluation website, I believe it is just a matter of time before my ASNGY speed overtakes QWERTY.