Download: 3.4-TEST-3.3 and 3.4-DEV-4.0
This release ensures that match statistics are based on the same algorithm as the adjusted score in the matches pane. (The previous release used the same tokenization method, but a user complained that penalties were not applied.)
Note: although this could be considered a bug fix, the correction is intentionally not yet in the STABLE release. Please test whether the results are better; then we can study the possibility of backporting it to the stable branch.
While investigating the statistics, I discovered another potential problem with the match calculation.
In 2014, OmegaT introduced the possibility to select which score is used to sort the entries in the matches pane. But there was a mistake: it still builds the list using the stemmed score as the primary key, and sorts it by the selected score only once the list is complete. The problem with this approach is that a result with a poor stemmed score will be rejected even if its score on the selected primary key is high!
So, instead, this new release uses the selected key directly when building the list and also when building statistics (so, unlike in 3.4-TEST, if you select the non-stemmed score as primary, it will also be used for statistics).
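To illustrate the difference, here is a minimal Python sketch. It is not OmegaT's actual Java code: the `Match` fields, the two function names and the list-size `limit` are hypothetical, chosen only to show how filtering on the stemmed score can drop an entry that scores well on the selected key.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Match:
    # Hypothetical structure: one candidate with its three scores (0-100).
    text: str
    stemmed: int   # score computed on stemmed tokens
    no_stem: int   # score computed without stemming
    adjusted: int  # score with penalties applied


def old_top_matches(candidates, sort_key, limit):
    """Old behaviour (as described above): keep the best `limit` entries
    by stemmed score, then re-sort the survivors by the selected key."""
    kept = heapq.nlargest(limit, candidates, key=lambda m: m.stemmed)
    return sorted(kept, key=lambda m: getattr(m, sort_key), reverse=True)


def new_top_matches(candidates, sort_key, limit):
    """New behaviour: the selected key decides directly which entries
    are kept while the list is being built."""
    return heapq.nlargest(limit, candidates, key=lambda m: getattr(m, sort_key))


# Invented sample data: entry C has a poor stemmed score
# but the best non-stemmed score.
candidates = [
    Match("A", stemmed=90, no_stem=60, adjusted=60),
    Match("B", stemmed=85, no_stem=55, adjusted=55),
    Match("C", stemmed=40, no_stem=95, adjusted=95),
]

old_result = [m.text for m in old_top_matches(candidates, "no_stem", 2)]
new_result = [m.text for m in new_top_matches(candidates, "no_stem", 2)]
print(old_result)  # C was rejected before the final sort
print(new_result)  # C is kept and ranked first
```

With the non-stemmed score selected and room for two entries, the old approach never shows C, while the new one puts it at the top of the list.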
Apparently this behavior of OmegaT was intentional, but I don't know why. I fully understand that the change could be controversial, which is why I make it only in the DEV release. You now have three algorithms to compare.
The port to the STABLE version will only happen once a consensus is reached about these problems.