Machine translation provider using TMX files

In the public version of OmegaT, you can place a TMX file in the project's tm/mt subfolder. When a match from such a file is inserted automatically in the Editor, the segment is displayed with a purple background to indicate that it is Machine Translation output.

In DGT-OmegaT we took a different approach. You can place TMX files containing MT output in the project's mt subfolder (warning: not tm/mt). This folder is created automatically when you create a new project with the DGT-OT Wizard; otherwise you can create it manually in your project.

When there is a match, the result is displayed in the Machine Translation pane, not in the Fuzzy Matches pane (as in the public OmegaT). To do that, we simply added one more Machine Translation implementation (i.e. a new implementation of the IMachineTranslation interface) which, instead of calling a server, reads data from one or more TMX files.
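To give an idea of what "reading data from TMX files" involves, here is a minimal sketch in Python (the actual provider is written in Java and is not reproduced here). The function name load_tmx_pairs, the sample content and the exact language-code comparison are assumptions made for this illustration only.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute belongs to the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def load_tmx_pairs(tmx_text, source_lang, target_lang):
    """Return a {source segment: target segment} dict from TMX content.

    Hypothetical helper: language codes are compared exactly
    (case-insensitive), which is a simplification.
    """
    root = ET.fromstring(tmx_text)
    pairs = {}
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() flattens inline tags so only the text remains.
                segs[lang.lower()] = "".join(seg.itertext())
        src = segs.get(source_lang.lower())
        tgt = segs.get(target_lang.lower())
        if src is not None and tgt is not None:
            pairs[src] = tgt
    return pairs


# Made-up sample content (XML declaration omitted for brevity).
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Hello world</seg></tuv>
      <tuv xml:lang="fr"><seg>Bonjour le monde</seg></tuv>
    </tu>
  </body>
</tmx>"""

pairs = load_tmx_pairs(SAMPLE_TMX, "en", "fr")
```

With several TMX files in the mt folder, the resulting dictionaries could simply be merged before the lookup.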

Unlike the Fuzzy Matches pane, the Machine Translation pane can display only one match per provider, without any other information (such as a score). Our algorithm takes the best match and accepts any result scoring more than 80% (the 20% margin is only there to tolerate small differences such as tags, which have probably been deleted before the text was submitted to the MT engine). If a 100% match is found, the search stops.
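The selection rule can be sketched as follows. This is only an illustration in Python: the similarity function below relies on difflib from the standard library as a stand-in and is not the scoring that OmegaT applies, and the names best_mt_match and similarity are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in score in percent; NOT OmegaT's fuzzy-match algorithm."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def best_mt_match(source, pairs, threshold=80.0):
    """Return the target of the best-scoring entry, or None.

    pairs maps source segments to MT output. Only scores strictly
    above the threshold are accepted, and the loop stops as soon as
    a 100% match is found.
    """
    best_score, best_target = threshold, None
    for candidate, target in pairs.items():
        score = similarity(source, candidate)
        if score > best_score:
            best_score, best_target = score, target
            if score >= 100.0:
                break  # nothing can beat an exact match
    return best_target


mt_pairs = {
    "Open the file": "Ouvrir le fichier",
    "Close the window": "Fermer la fenêtre",
}
exact = best_mt_match("Open the file", mt_pairs)
near = best_mt_match("Open the file.", mt_pairs)
none = best_mt_match("Completely different text", mt_pairs)
```

Only the target text is returned, in line with the fact that the Machine Translation pane shows no score.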

All of this was already proposed to the OmegaT team, and even documented here (not by us, and with some differences): as described in the ticket, files from the mt/ directory appear in the translation memory pane rather than the matches pane, and this does not prevent automatic insertion in the Editor, given that fuzzy matches always have priority over machine translation.

Since this is included in neither OmegaT 4 nor OmegaT 5, you can download the plugin if you want to do the same. To use this plugin:

  1. Download local-mt.jar (the source file is only for developers).
  2. Install it in the plugins/ directory of OmegaT.
  3. The next time you start OmegaT, the menu "Options => Machine Translate" will have a new entry named "Local"; activate it.
  4. Create a folder mt/ inside your project and copy your TMX files there (don't forget that this plugin is not an MT engine but a way to display, in the MT pane, the result of another engine).

Last but not least, as you could see in the previous chapter, a segment coming from this provider can (in DGT-OmegaT only, not in the plugin!) be inserted automatically in the Editor, but this must be explicitly configured like this:

The rule is the following: Machine Translation output is inserted if, and only if, there is no match to insert, from the project memory or from the external memories, according to the «minimal match similarity» rule. As you could see here, the segment will be displayed with a gray background and without a score.
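As an illustration only, the priority rule can be written as a small Python function. The name choose_auto_insert and the tuple layout are hypothetical, and treating a score equal to the minimal similarity as sufficient is an assumption of this sketch.

```python
def choose_auto_insert(tm_matches, mt_output, min_similarity):
    """Decide what to insert automatically into the Editor.

    tm_matches: list of (score_percent, text) coming from the project
    memory and the external memories.
    mt_output: text found by the local MT provider, or None.
    Returns ("tm", text), ("mt", text) or None.
    """
    # Assumption: a score equal to the minimal similarity qualifies.
    eligible = [m for m in tm_matches if m[0] >= min_similarity]
    if eligible:
        # A memory match always takes priority over MT output.
        _, text = max(eligible, key=lambda m: m[0])
        return ("tm", text)
    if mt_output is not None:
        return ("mt", mt_output)
    return None


# A 60% match is below a 75% minimum, so the MT output is used.
decision = choose_auto_insert([(60, "fuzzy text")], "mt text", 75)
```

The returned label could then drive the display, for example the gray background mentioned above for MT output.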

This corresponds to RFE #678 in OmegaT's SourceForge site.

Note about other Machine Translation systems

In DGT, the use of Google Translate or similar tools is forbidden for confidentiality reasons. Initially we decided to remove those options from the code. But now that we are publishing our work on the Internet, and because other users may or may not have the same constraint, we decided to move the relevant code to a separate plugin named mt_plugins.jar so that you or your manager can decide whether to keep it. Removing it is simple: delete the jar and place the TMX file(s) with MT output in the project's mt subfolder. This way you can use MT output without calling any MT server.

Since we use only one MT engine internally, it made no sense to always display the engine name in the window, so we initially removed it. But in the public version, if you use mt_plugins.jar, you can get more than one answer. So the rule is that the engine name appears in the window if and only if more than one engine is available (even if only one is active).
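The display rule depends on the engines that are available, not on those that are active. A tiny Python sketch, in which the function name and the output layout are purely hypothetical:

```python
def format_mt_result(engine_name, text, available_engines):
    """Append the engine name only when more than one engine is available.

    available_engines lists every installed engine, active or not.
    The "<name>" suffix is an invented layout for this example.
    """
    if len(available_engines) > 1:
        return f"{text}\n<{engine_name}>"
    return text


# Only the local provider is installed: no engine name is shown.
single = format_mt_result("Local", "Bonjour", ["Local"])
# Two engines are installed, so the name is shown even if one is inactive.
several = format_mt_result("Local", "Bonjour", ["Local", "Other engine"])
```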

Some features specific to external engines were added in June 2017 (DGT-2.5 update 4 and DGT-3.0 update 4).

Comments

Dear Thomas

I believe you could create a class and JAR for OmegaT for the eTranslation service

https://www.proz.com/forum/machine_translation_mt/342632-etranslation_th...

Cheers,
Milan

Hi Milan

That is not so easy: from what I read, the service is asynchronous, probably because it is too slow to be synchronous. OmegaT's machine translation API is designed for synchronous services: it sends a query and expects to receive a result before the user goes to the next segment. By contrast, this service seems to be designed to translate full documents, and you have to wait for a callback.

I can study the idea, but I am not sure it is really possible.

Regards

Thomas
