Friday, December 30, 2022

Compound Word Splitting in WikDict Search

Many languages allow building compound words by joining several words into one, without spaces or hyphens in between. Apart from very common compounds, these are unlikely to appear in a dictionary, even though they are perfectly valid words. To look one up, you have to split it into its parts and look up each part separately. This is cumbersome, and it is especially difficult for novice speakers, who may not know where one part ends and the next begins. WikDict now alleviates this problem by attempting to split a compound word when it is not found directly in the dictionary.
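To make the idea concrete, here is a minimal Python sketch of the "look up directly, split only on a miss" approach. It is not the algorithm WikDict actually uses. The vocabulary, the `MIN_PART_LENGTH` threshold and the function names `split_compound` and `lookup` are assumptions made up for illustration. The sketch also ignores linking elements (such as the extra "s" in German "Arbeitszimmer"), which a real splitter would have to handle.

```python
from functools import lru_cache

# Toy vocabulary for illustration only; a real dictionary is far larger.
VOCABULARY = {"apfel", "baum", "garten", "haus"}

# Hypothetical threshold: very short parts tend to produce spurious splits.
MIN_PART_LENGTH = 3


def split_compound(word, vocabulary=VOCABULARY):
    """Return a list of known parts that concatenate to `word`, or None."""
    word = word.lower()

    @lru_cache(maxsize=None)
    def solve(start):
        # Reached the end of the word: everything before it was matched.
        if start == len(word):
            return ()
        # Try the longest candidate part first, then shorter ones.
        for end in range(len(word), start + MIN_PART_LENGTH - 1, -1):
            part = word[start:end]
            if part in vocabulary:
                rest = solve(end)
                if rest is not None:
                    return (part,) + rest
        return None

    result = solve(0)
    return list(result) if result is not None else None


def lookup(word, vocabulary=VOCABULARY):
    """Try a direct lookup first; fall back to splitting only on a miss."""
    if word.lower() in vocabulary:
        return [word.lower()]
    return split_compound(word, vocabulary)


print(lookup("Apfelbaum"))   # ['apfel', 'baum']
print(lookup("Gartenhaus"))  # ['garten', 'haus']
print(lookup("xyz"))         # None
```

Trying the longest part first is just one possible heuristic. A real splitter might instead rank candidate splits, for example by how common each part is.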

The feature is in its early stages: the list of supported languages (currently German, English, Finnish, Dutch and Swedish) is expected to grow, and the accuracy to improve, over time. If you want to help out, have a look at the wikdict-compound repository. As always, feedback is very welcome!