Submitted by admin on Wed, 25/09/2024 - 17:10
Download: DGT-3.7-DEV-4.2
DGT-OmegaT lets you configure the sort criterion used for glossaries. The new DEV release 3.7-DEV-4.2 adds several new criteria:
- STRING: this criterion uses the Java String comparator, which compares the raw Unicode values of the characters (UTF-16 code units) and ignores language-specific rules such as the handling of accented letters. Note that until recently, this is what the original OmegaT did;
- PRIMARY: uses a language-specific collator configured with PRIMARY strength; in most languages this means that accents and case are ignored;
- SECONDARY: uses a language-specific collator configured with SECONDARY strength; in most languages this means that accents are taken into account, but case is not;
- TERTIARY: uses a language-specific collator configured with TERTIARY strength; in most languages this means that the comparison is also case-sensitive;
- IDENTICAL: uses a language-specific collator configured with IDENTICAL strength; see the Java documentation of java.text.Collator for the exact meaning;
- LENGTH FIRST: identical to the normal collator, except that if string A begins with string B (for example, "words" begins with "word"), the longer one comes first. This is the default in OmegaT 6.1, and it is the opposite of what normal lexical order would do (for example, word < words < work in normal order, while words < word < work in this order).
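To make the differences concrete, here is a minimal Java sketch of the criteria listed above. It is not the DGT-OmegaT implementation: the class name GlossarySortDemo and the helpers collator, lengthFirst and sorted are hypothetical, the English locale is an arbitrary choice, and the prefix test in lengthFirst uses a plain String.startsWith, which is only an approximation of what a collator would consider a prefix.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

class GlossarySortDemo {

    // Language-specific collator at the requested strength
    // (Collator.PRIMARY, SECONDARY, TERTIARY or IDENTICAL).
    // Assumption: English rules; a real glossary would use the project language.
    static Collator collator(int strength) {
        Collator c = Collator.getInstance(Locale.ENGLISH);
        c.setStrength(strength);
        return c;
    }

    // Hypothetical LENGTH FIRST order: same as the base collator, except that
    // when one string begins with the other, the longer one comes first.
    static Comparator<String> lengthFirst(Collator base) {
        return (a, b) -> {
            if (a.equals(b)) return 0;
            if (a.startsWith(b)) return -1; // a is the longer string: a first
            if (b.startsWith(a)) return 1;  // b is the longer string: b first
            return base.compare(a, b);
        };
    }

    // Returns a sorted copy, leaving the input list untouched.
    static List<String> sorted(List<String> terms, Comparator<? super String> order) {
        List<String> copy = new ArrayList<>(terms);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        String e = "e";
        String eAcute = "\u00e9"; // e with acute accent

        // STRING: raw comparison, every accented letter sorts after "z"
        System.out.println(eAcute.compareTo("z") > 0);                              // true
        // Any collator strength: the accented letter sorts with the other e's
        System.out.println(collator(Collator.TERTIARY).compare(eAcute, "z") < 0);   // true

        // PRIMARY ignores accents and case
        System.out.println(collator(Collator.PRIMARY).compare(e, eAcute) == 0);     // true
        System.out.println(collator(Collator.PRIMARY).compare(e, "E") == 0);        // true
        // SECONDARY sees accents but not case
        System.out.println(collator(Collator.SECONDARY).compare(e, eAcute) != 0);   // true
        System.out.println(collator(Collator.SECONDARY).compare(e, "E") == 0);      // true
        // TERTIARY also sees case
        System.out.println(collator(Collator.TERTIARY).compare(e, "E") != 0);       // true

        List<String> terms = Arrays.asList("work", "word", "words");
        Collator tertiary = collator(Collator.TERTIARY);
        System.out.println(sorted(terms, tertiary));              // [word, words, work]
        System.out.println(sorted(terms, lengthFirst(tertiary))); // [words, word, work]
    }
}
```

The last two lines reproduce the word / words / work example from the list: only the relative position of a term and its own prefix changes, everything else follows the underlying collator.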
The idea comes from a discussion with the core team. Note that LENGTH FIRST is not the sort criterion that was used in OmegaT 6.0.1, which applied the "longer first" rule to any string containing another, not only to a string beginning with another. That rule does not give a consistent sort order because it is not transitive: for example, "claim" < "aim" (because "claim" contains "aim") and "aim" < "air", but "air" < "claim"!
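The transitivity problem can be checked mechanically. The sketch below is a reconstruction under assumptions, not OmegaT code: CONTAINS_FIRST assumes the 6.0.1 rule placed a string before any string it contains, PREFIX_FIRST is the "begins with" variant, and both fall back to plain String comparison to keep the example short. All names are hypothetical.

```java
import java.util.Comparator;

class ContainsOrderDemo {

    // Assumed 6.0.1-style rule: a string comes before any string it contains.
    static final Comparator<String> CONTAINS_FIRST = (a, b) -> {
        if (a.equals(b)) return 0;
        if (a.contains(b)) return -1;
        if (b.contains(a)) return 1;
        return a.compareTo(b);
    };

    // LENGTH FIRST-style rule: only a string that begins with the other moves first.
    static final Comparator<String> PREFIX_FIRST = (a, b) -> {
        if (a.equals(b)) return 0;
        if (a.startsWith(b)) return -1;
        if (b.startsWith(a)) return 1;
        return a.compareTo(b);
    };

    // True when x < y, y < z and z < x all hold, which no valid sort order allows.
    static boolean hasCycle(Comparator<String> order, String x, String y, String z) {
        return order.compare(x, y) < 0
            && order.compare(y, z) < 0
            && order.compare(z, x) < 0;
    }

    public static void main(String[] args) {
        System.out.println(hasCycle(CONTAINS_FIRST, "claim", "aim", "air")); // true
        System.out.println(hasCycle(PREFIX_FIRST, "claim", "aim", "air"));   // false
    }
}
```

A comparator with such a cycle violates the contract of java.util.Comparator, so the result of sorting with it depends on the input order and the sort may even throw an IllegalArgumentException.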
We are well aware that this is very technical, which is why it is only implemented in a DEV release: the goal is not really to use these features in real life, but to compare them and decide later how we can offer them to users in a more user-friendly way.