DGT-OmegaT 3.4-TEST-3.4 and 3.5-DEV-4.1 published

Download: 3.4-TEST-3.4 and 3.5-DEV-4.1

Changes in search screens (all releases)

 

1. Replacements are now possible in orphan segments

In standard OmegaT, until release 4.0, there was a small inconsistency: search and replace could be run on all segments, including orphans, even though orphan segments do not appear in the editor. The core team eventually decided to exclude orphans from replacements, and we initially did the same.

For our users, however, this can be a problem: orphan segments are saved in project_save.tmx and may be re-used in another project or in a translation memory, so users were looking for a way to apply replacements to orphan segments as well.

We therefore propose the following compromise: a checkbox now enables replacements in orphan segments, but when it is ticked, the interactive mode is grayed out and you can only do a full replacement.

2. In the pre-translate screen, option to translate as alternative

Until now, when you used the pre-translate screen, all inserted segments were default translations, unless the project does not allow them at all. Now a checkbox lets you choose between inserting translations as default (in which case duplicate segments will not appear) and as alternative translations.

Note that this setting is global: you cannot choose segment by segment. If you need both kinds, you must do multiple searches.

Statistics again (3.5-DEV only)

These changes continue the statistics improvements made in the previous release.

Correction: calculate scores in correct order

In release 3.5-DEV-4.0 we made statistics use the score selected by the user instead of one specific score. Later, a user pointed out on the list that, in the current algorithm, the non-stemmed and adjusted scores are not calculated at all if the stemmed score is below the threshold. As a consequence, they appear as 0 when you try to retrieve them, for example in order to display them in the matches pane.

The idea is good, as it saves time, but the stemmed score should no longer be the first one used, unless of course the user explicitly selected it as such. We noticed the problem when retrieving statistics: many segments were counted as "no match" because the adjusted score was not calculated and was therefore treated as 0. Now the scores are calculated in the order selected by the user, and the second or third score is skipped only when we are sure that the entry will be excluded from the search results.
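As an illustration only, the new ordering could be sketched as follows. This is not the actual DGT-OmegaT code: the class, enum and method names are hypothetical. We also assume that an entry is excluded as soon as the first score in the user's order falls below the threshold, and that each score type appears only once in that order.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

public class ScoreOrderSketch {
    /** The three kinds of score mentioned in the text (names are hypothetical). */
    public enum ScoreType { STEMMED, NON_STEMMED, ADJUSTED }

    /**
     * Calculates the scores of one candidate entry in the order chosen by the user.
     * Returns null when the entry is excluded, i.e. when the first score in the
     * user's order is below the threshold (assumption of this sketch); in that
     * case the remaining scores are never calculated.
     */
    public static Map<ScoreType, Integer> scoreInUserOrder(
            List<ScoreType> userOrder, int threshold, ToIntFunction<ScoreType> calculator) {
        Map<ScoreType, Integer> scores = new EnumMap<>(ScoreType.class);
        for (ScoreType type : userOrder) {
            int value = calculator.applyAsInt(type);
            scores.put(type, value);
            // Only the first score in the user's order decides exclusion.
            if (scores.size() == 1 && value < threshold) {
                return null;
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // Hypothetical candidate: low stemmed score, high non-stemmed and adjusted scores.
        ToIntFunction<ScoreType> calc = t -> t == ScoreType.STEMMED ? 20 : 80;
        List<ScoreType> adjustedFirst =
                Arrays.asList(ScoreType.ADJUSTED, ScoreType.STEMMED, ScoreType.NON_STEMMED);
        List<ScoreType> stemmedFirst =
                Arrays.asList(ScoreType.STEMMED, ScoreType.NON_STEMMED, ScoreType.ADJUSTED);
        System.out.println(scoreInUserOrder(adjustedFirst, 30, calc));
        System.out.println(scoreInUserOrder(stemmedFirst, 30, calc));
    }
}
```

With the adjusted score first, the candidate is kept and all three scores are available. With the stemmed score first, the same candidate is dropped before the other two scores are calculated, which corresponds to the old behaviour.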

This change is intentionally made in the DEV release only, so that you can compare with the TEST release to see the difference.

Optimisation: do not build objects while writing statistics

Since OmegaT 3.0 (2013), match statistics use the same algorithm as the matches pane. The idea in itself is good, but it consumes a lot of time and memory, because calculating such statistics becomes equivalent to calling the matches pane for each segment and discarding the result after use.

This would be different if the results of the matches pane were kept, for example in a cache, so that the next call for the same segment would only need to be updated with new segments rather than recalculated. This is something we may study later.

Now, in release 3.5-DEV-4.1, we still use the same loop as in the matches pane, but instead of building the matches list we keep only the best score in memory. This is enough to fill in the statistics table, and we save both the time needed to build the list and the memory needed for objects which would not be re-used later.
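The difference between the two approaches can be sketched as below. This is a simplified illustration, not the real implementation: the class and method names are hypothetical, candidate scores are passed in as a plain array instead of being calculated inside the loop, and 0 stands for "no match".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BestScoreSketch {
    /**
     * Matches-pane style: one object is stored for every candidate above the
     * threshold, and the list is sorted so that the best match comes first.
     */
    public static List<Integer> buildMatchList(int[] candidateScores, int threshold) {
        List<Integer> matches = new ArrayList<>();
        for (int score : candidateScores) {
            if (score >= threshold) {
                matches.add(score);
            }
        }
        matches.sort(Collections.reverseOrder());
        return matches;
    }

    /**
     * Statistics style: same loop over the candidates, but only the best score
     * is kept, so no list is built or sorted. Returns 0 when nothing reaches
     * the threshold.
     */
    public static int bestScore(int[] candidateScores, int threshold) {
        int best = 0;
        for (int score : candidateScores) {
            if (score >= threshold && score > best) {
                best = score;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] scores = {45, 92, 10, 78}; // hypothetical scores for one segment
        System.out.println(buildMatchList(scores, 30));
        System.out.println(bestScore(scores, 30));
    }
}
```

Both methods agree on the best score, but the second one allocates nothing per candidate, which matters when the loop is repeated for every segment of the project.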

 

Once again, since these changes may be controversial, we make them only in the DEV release, and we invite you to test and compare with 3.4 (where only the penalties are applied) and 3.3 (where we still use the same algorithm as standard OmegaT) and tell us which one is better. Most probably, a lot of segments (in particular short ones) will appear as "no match" in 3.3 and 3.4 and with matches in 3.5. This is due to the correction described in the previous section (calculate all scores): these segments previously had a score of 0 because it was not calculated. Even though this is a bug, we correct it only in DEV so that you can make comparisons. Tell us what you think.

 

