Which machine translation engines perform better in different domains?

The Intento report on the State of Machine Translation 2022 brought with it a wealth of insight on how current machine translation engines perform. And while MT customization is the gold standard when it comes to domain-specific translations, it’s interesting to note how even the stock models are performing in different domains.

In this article, we will take a look at how well MT models perform across different domains, and note trends and significant patterns in the data from the Intento report. Let’s dive in.

Report overview

The report tested 31 machine translation engines in 9 different domains:

Colloquial
Education
Entertainment
Financial
General
Healthcare
Hospitality
IT
Legal

Below, we look at the top-performing MT engines across different languages in each domain. The report identifies 16 leaders in the same tier across all domains, as measured by normalized COMET scores.

What about custom models?

Most of the MT engines under evaluation offer custom models that are adapted to different domains. Note, however, that the report evaluates only the stock models of these MT engines, that is, models without any customization. As such, they are trained on generic data that isn’t concentrated in any particular domain.

Custom models are generally the preferred option when it comes to domain-specific translations, but there are cases where general machine translation is adequate. Choosing an MT engine whose stock model already performs well in your domain will, in all likelihood, also make it easier to train further.

To learn more about the difference between stock and custom MT models, read our article: Custom vs stock models in machine translation: Which is better?

Results: Minimal coverage for best quality

When it comes to providing minimal coverage across all domains, Google and DeepL are the runaway leaders of the pack, with both appearing among the most viable options in every domain. In most cases, these two alone are enough to provide coverage from English into another language.

For the entertainment domain, Naver and Tencent fill gaps that Google and DeepL leave open, covering Korean and Chinese respectively.

In the hospitality domain, Amazon provides coverage for Arabic, and Tencent once again provides coverage for Chinese.

For healthcare and IT, Microsoft covers German and Amazon covers Portuguese.

All in all, it’s possible to obtain full minimal coverage for all domains and language pairs using only six MT engines, with Google and DeepL doing most of the heavy lifting.
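To make the idea of minimal coverage more concrete, here is a small Python sketch. It is an assumption about how such a selection could be computed, not the method used in the Intento report: it greedily picks the engine that covers the most remaining domain and language combinations until none are left. The function name, the engine names, and the toy data are all hypothetical placeholders, not figures from the report.

```python
def minimal_coverage(top_tier):
    """Greedily pick engines until every (domain, language) pair is covered.

    top_tier maps an engine name to the set of (domain, language) pairs
    in which that engine is among the best options.
    """
    uncovered = set().union(*top_tier.values())
    chosen = []
    while uncovered:
        # Pick the engine covering the most still-uncovered pairs;
        # ties go to the alphabetically first name for determinism.
        best = max(sorted(top_tier), key=lambda e: len(top_tier[e] & uncovered))
        chosen.append(best)
        uncovered -= top_tier[best]
    return chosen


# Hypothetical toy data, NOT taken from the report.
toy = {
    "EngineA": {("general", "de"), ("legal", "de"), ("general", "ko")},
    "EngineB": {("general", "de"), ("legal", "fr")},
    "EngineC": {("entertainment", "ko")},
    "EngineD": {("general", "de")},
}

print(minimal_coverage(toy))  # ['EngineA', 'EngineB', 'EngineC']
```

In this toy example, EngineD is never selected because everything it covers is already handled by EngineA. Keep in mind that a greedy pass like this gives a small set of engines, but not always the smallest possible one.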

Parting thoughts

The results are pretty clear about the runaway advantage that Google and DeepL hold, so most businesses looking to invest in machine translation would do best to start with those two. Of course, it also depends on what languages you want to prioritize.

These results are interesting, but when it comes to true quality and performance, nothing beats a custom model tailored specifically to your chosen domain. Most MT providers already offer this option or provide models pre-trained in specific domains, so it’s a matter of choosing the right one.

If you’re not sure which option is right for you, we at machinetranslation.com are more than happy to help.