Thursday, September 13, 2018

WikDict dictionaries now in FreeDict and updated data

WikDict now provides many dictionaries for the FreeDict project. FreeDict converts these dictionaries into different formats, e.g. the slob format used by Aard dictionary, which allows offline usage on Android devices. Thanks to everyone involved in getting this ready!

In other news, all dictionaries have been updated and a few bugs (e.g. duplicated translations) resolved. If you notice any problems, please report them right away!

Saturday, May 12, 2018

Nine more languages: Testers wanted!

WikDict now provides translations for nine more languages:

  • Bulgarian
  • Dutch
  • Indonesian
  • Italian
  • Japanese
  • Latin
  • Lithuanian
  • Malagasy
  • Norwegian
This nearly doubles the number of translations across all languages to more than 5.8 million!

Since I don't speak all of these languages myself, I could only do basic testing. If you speak one of the new languages, please look up some translations and let me know the result!

Saturday, November 11, 2017

Now with 22% more translations!

WikDict builds on data extracted by the dbnary project. This project changed the way it stores data, which required adaptations on the part of WikDict. This is why WikDict data has not been updated over the last few months.
Now this work is finally done and new data is available in the web interface. This includes all changes made to the underlying Wiktionaries as well as fixes for bugs that prevented some translations from showing up properly. Overall this yields 22% more translations than the previous data from March 2017. As always, please let me know about any problems you encounter or suggestions for improvement.

Sunday, February 19, 2017

Get translations while typing

Typeahead autocompletion is very helpful when the word you are looking for is long or hard to type. But it can get even better by providing the translation along with the autocompletion. This is now available on WikDict.
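To illustrate the idea, here is a minimal sketch of prefix suggestions that carry a translation with them. It is not WikDict's actual implementation; the entries, the `suggest` function and the `limit` parameter are made up for this example.

```python
from bisect import bisect_left

# Hypothetical toy entries: (headword, translation), kept sorted by headword.
ENTRIES = sorted([
    ("dog", "Hund"),
    ("stoat", "Hermelin"),
    ("stone", "Stein"),
    ("storm", "Sturm"),
])
HEADWORDS = [headword for headword, _ in ENTRIES]


def suggest(prefix, limit=5):
    """Return up to `limit` (headword, translation) pairs starting with `prefix`."""
    # Binary search for the first headword that is >= the prefix.
    start = bisect_left(HEADWORDS, prefix)
    suggestions = []
    for headword, translation in ENTRIES[start:start + limit]:
        if not headword.startswith(prefix):
            break  # sorted order: no later headword can match either
        suggestions.append((headword, translation))
    return suggestions


sto_suggestions = suggest("sto")
```

Because the headwords are sorted, all matches for a prefix sit next to each other, so one binary search plus a short scan is enough.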

As always, feedback is very welcome!

Sunday, January 22, 2017

More Translations, Less Noise

There have been a large number of changes to the generation of WikDict dictionaries. While many of them are related, some are just included in this post to give you a good summary of what happened in the last months.

More Translations

Deriving Translations from Intermediate Languages

When a translation is not found in the dictionary, you could give up and tell the user that there is no such translation. Or you could try to use translations between other languages to give a (hopefully accurate) answer. Here's an example. Let's say the word "dog" can't be found in the English-German dictionary. You could then try to use French as an intermediate language:

dog (en) -> chien (fr)
chien (fr) -> Hund (de)
=> dog (en) -> Hund (de)

While this is useful, it can generate wrong translations due to ambiguities. WikDict tries to get the best of both worlds by applying a score that depends on multiple factors to rank and filter the results of this approach.
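The approach can be sketched in a few lines. This is only an illustration: the toy dictionaries and the function name are made up, and the score used here (the number of intermediate languages that agree on a candidate) is an assumption, since the actual scoring factors are not described in this post.

```python
from collections import defaultdict

# Hypothetical toy data: (source language, target language) -> {word: [translations]}
DICTIONARIES = {
    ("en", "fr"): {"dog": ["chien"]},
    ("fr", "de"): {"chien": ["Hund"]},
    ("en", "es"): {"dog": ["perro"]},
    ("es", "de"): {"perro": ["Hund", "Rüde"]},
}


def translate_via_pivots(word, source, target, pivots):
    """Return (candidate, score) pairs, best first.

    The score is the number of pivot languages that lead to the candidate.
    This is an assumed stand-in for a real scoring function.
    """
    support = defaultdict(set)
    for pivot in pivots:
        for middle in DICTIONARIES.get((source, pivot), {}).get(word, []):
            for candidate in DICTIONARIES.get((pivot, target), {}).get(middle, []):
                support[candidate].add(pivot)
    # More agreeing pivots first; ties are broken alphabetically for stable output.
    return sorted(
        ((candidate, len(langs)) for candidate, langs in support.items()),
        key=lambda item: (-item[1], item[0]),
    )


results = translate_via_pivots("dog", "en", "de", ["fr", "es"])
```

In this toy data, "Hund" is reached through both French and Spanish, while "Rüde" is reached only through Spanish, so "Hund" ranks first. A filter could then drop candidates below some score threshold.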

Bug Fixes and Workarounds

Some bugs, especially a bug in the Virtuoso database, made it necessary to skip some translations. The known bugs are now either fixed or covered by a workaround.

More Recent Data

As always, the people working on Wiktionary and DBnary aren't lazy, either. Their changes trickle down to WikDict with some delay and lead to visible improvements over time.

Less Noise

More filtering and better sorting

The scoring mentioned above has also been used to improve the sorting of words, senses and translations, as well as to filter some less reliable results introduced when reading dictionaries in reverse.

Less Unparsed Markup in Senses

When senses/definitions for words are extracted from Wiktionary, quite a lot of different markup might be left inside those strings. WikDict got better at parsing those texts, so you will see fewer [[brackets]], <tags> and [1] left 1. over | numbers : or symbols than before. If you still see those, please let me know.
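As a rough illustration of this kind of cleanup, the sketch below strips a few common leftovers with regular expressions. It is not WikDict's parser; the function name and the handful of patterns are assumptions chosen for the example, and real Wiktionary markup has many more cases.

```python
import re


def strip_markup(text):
    """Remove a few common wiki markup leftovers from a sense string."""
    # [[target|label]] -> label, [[target]] -> target
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", text)
    # Drop HTML-like tags such as <i> or </ref>, keeping the text between them
    text = re.sub(r"</?[A-Za-z][^>]*>", "", text)
    # Drop numeric reference markers such as [1]
    text = re.sub(r"\[\d+\]", "", text)
    # Collapse the whitespace left behind by the removals
    return re.sub(r"\s+", " ", text).strip()


cleaned = strip_markup(
    "A [[carnivore|carnivorous]] <i>mammal</i> [1] of the [[weasel]] family"
)
```

The order matters here: the double-bracket links are resolved before the single-bracket reference markers are removed.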

Sunday, July 24, 2016

Links now link to searches in WikDict

Previously, clicking on a term in the dictionary results led to the corresponding Wiktionary page. Feedback from users has shown that this is not what a typical user expects. Now all linked terms lead to a search in WikDict using the clicked term as search text.

The Wiktionary links can now be found in the side bar at the right instead. As always, feedback on this change is very welcome!

Sunday, April 24, 2016

Stemming support for English

All English entries are now searched using the Porter stemming algorithm, which means that more translations will be found if you use something other than the base form of a word. The most common case is searching for a plural (e.g. "stoats") and getting a translation for the singular ("stoat"), even though the plural form does not appear anywhere in the data set.
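The sketch below shows the principle of a stemmed lookup: both the dictionary headwords and the search text are reduced to a stem, and matching happens on the stems. To stay short it implements only step 1a of the Porter algorithm (the plural suffix rules), not the full stemmer; the toy data and function names are made up for this example.

```python
def stem_step_1a(word):
    """Apply only step 1a of the Porter algorithm (plural suffixes)."""
    if word.endswith("sses"):
        return word[:-2]  # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]  # ponies -> poni
    if word.endswith("ss"):
        return word       # caress -> caress
    if word.endswith("s"):
        return word[:-1]  # cats -> cat
    return word


def build_index(entries):
    """Map each stemmed headword to the translations of all headwords sharing it."""
    index = {}
    for headword, translations in entries.items():
        index.setdefault(stem_step_1a(headword.lower()), []).extend(translations)
    return index


def lookup(index, query):
    """Stem the query the same way as the headwords, then look it up."""
    return index.get(stem_step_1a(query.lower()), [])


# Hypothetical toy data: English headword -> German translations
INDEX = build_index({"stoat": ["Hermelin"], "dog": ["Hund"]})
plural_result = lookup(INDEX, "stoats")
```

Here "stoats" is reduced to "stoat" and finds the singular entry. The full Porter algorithm applies several further steps, for example to handle endings such as "-ing" and "-ed".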