A new generation of automatic speech recognition (ASR) software is on the market. It was developed by Newton Technologies in cooperation with SpeechLab (Technical University of Liberec). The speech-to-text software is built into their online application Beey, which can also be used to edit the resulting transcript.
This new type of voice recognition software does not rely on a fixed dictionary, so it can recognize and transcribe even words that are not included in its database. Any inaccuracies or spelling differences show up when the automatic spell checker is applied, which immediately marks potentially problematic passages for the editor. The software is also better at dealing with lower-quality recordings and new words.
New model improves accuracy by about 9 percentage points
Comparing the transcription quality of the new ASR model with the previous version is relatively straightforward: it only requires testing a varied set of audio files, such as radio programs, phone call recordings, YouTube videos, recordings of parliamentary and government proceedings, TV news and journalism. The automatic transcription of these recordings is then compared against manually prepared verbatim transcripts. However, as recognition accuracy increases, significant further gains become harder to achieve, so even a small improvement is an important advance. The results of this testing exceeded all expectations: the average transcription accuracy of the old ASR model was 83.60%, while the new ASR model reached 92.65%, an improvement of more than 9 percentage points. That’s approximately 1 fewer error in every 10 words!
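The comparison against manual transcripts can be illustrated with a minimal sketch. The article does not state the exact metric used, so this assumes the common convention of word accuracy, that is, 100% minus the word error rate, where errors are the substitutions, deletions and insertions found by a word-level edit distance. The function names and the lowercase, whitespace-based tokenization are illustrative assumptions, not Beey's actual evaluation code.

```python
def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (edit distance in words, number of reference words).

    Assumption: words are compared case-insensitively after splitting on
    whitespace; a real evaluation would also normalize punctuation and numbers.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()

    # Classic dynamic-programming Levenshtein distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from the output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1], len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy in percent, assumed here to be 100 * (1 - WER)."""
    errors, n_words = word_errors(reference, hypothesis)
    if n_words == 0:
        raise ValueError("reference transcript is empty")
    return 100.0 * (n_words - errors) / n_words


if __name__ == "__main__":
    manual = "the new model transcribes czech radio programs very well today"
    automatic = "the new model transcribes check radio programs very well today"
    print(f"{word_accuracy(manual, automatic):.2f}%")
```

With one wrong word in a ten-word reference, the sketch reports 90% accuracy. Under this definition, a transcript with many inserted words can score below zero, which is one reason evaluations often report the error rate instead.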
Beey vs. Microsoft vs. Google
It’s also interesting to see how Beey fares against the competition. Most foreign ASR technologies do not support the Czech language, so only the most frequently used transcription services were included in this comparison: the recognition programs from Google and Microsoft. The results were again exceptional: on the same test data, Beey achieved 92.65% accuracy, Microsoft reached 90.07% and Google only 78.74%, which is even less than the previous Beey model. Moreover, these were not ideal studio-quality recordings with zero background noise, but public recordings from Czech media. Considering the team sizes and development budgets of these multinational companies, the result of a small Czech company can be considered an extraordinary success. It is due not only to the team's enthusiasm for the cause, but also to 15 or more years of research and the large amount of data collected by Newton Technologies during that time.
Since Beey is also available outside the Czech market, the next obvious step was to see how it performs when transcribing other languages. It was therefore tested on similarly diverse recordings in English and German. Here the differences between the services proved less pronounced. For English, the overall accuracy of Microsoft’s ASR program was slightly better than Beey’s: 92.93% versus 92.24%, a difference of less than 1 percentage point! In German, where the number of Beey users has been growing recently and which has therefore been the focus of more intensive improvements, the Czech recognizer performed better. Beey reached 92.51% accuracy where Microsoft reached 86.88%, a difference of more than 5 percentage points. Despite a recent update, Google’s service scored lower in both languages, with 80.18% for German and only 77.51% for English.
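To make the comparison easier to follow, the short sketch below collects the accuracy figures reported in this article and computes Beey's lead over each competitor in percentage points. The figures come from the text; the dictionary layout and the function name are illustrative assumptions.

```python
# Accuracy in percent, as reported in this article for the same test data.
RESULTS = {
    "Czech": {"Beey": 92.65, "Microsoft": 90.07, "Google": 78.74},
    "English": {"Beey": 92.24, "Microsoft": 92.93, "Google": 77.51},
    "German": {"Beey": 92.51, "Microsoft": 86.88, "Google": 80.18},
}


def beey_lead(language: str) -> dict[str, float]:
    """Beey's accuracy minus each competitor's, in percentage points.

    A negative value means the competitor scored higher.
    """
    scores = RESULTS[language]
    return {
        service: round(scores["Beey"] - accuracy, 2)
        for service, accuracy in scores.items()
        if service != "Beey"
    }


if __name__ == "__main__":
    for language in RESULTS:
        print(language, beey_lead(language))
```

The differences are expressed in percentage points (a plain subtraction of two percentages), not as relative percentages, which is why the English gap is described as less than 1 percentage point.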
Beey definitely doesn’t lag behind the global competition! The new neural recognition model is being actively developed for more languages in cooperation with SpeechLab (Technical University of Liberec): at the moment, it is also available for Norwegian, Russian and Slovak, with Polish to follow soon. Other languages still use the previous version of the ASR software for now, but it continues to receive updates and improvements.
Uses of Beey
The Beey app already has over eight thousand users. It is popular with journalists for transcribing interviews and archiving recordings, and it is used for TV and radio monitoring, for example by the Austrian media company APA. Beey provides subtitles for the online TV channels DVTV and Seznam TV. Its professional editors also prepare high-quality subtitles for TV NOVA and TV Prima. Some municipalities, local government bodies and state administration offices are among Beey’s users as well.
Do you want to try out Beey’s automatic transcription services for yourself and compare the results with other services? Contact firstname.lastname@example.org to gain access to the app and test recordings.