Way of knowing which segments are repetitions
Thread poster: Gary Raymond Bokobza
Gary Raymond Bokobza
Spain
Local time: 12:32
French to English
Nov 11, 2016

I need to split a 50k document to be translated by several translators. I am working from a new TM in Trados Studio 2015. The analysis shows 12200 repetitions. Is there a way of knowing where these repetitions are in the document so as not to assign repeated segments? I mean, is there a way of clicking on a segment and having Trados "tell" you that, for example, segment 245 is a repetition of segment 552?

 
Walter Blaser  Identity Verified
Switzerland
Local time: 12:32
French to German
Use the Display Filter Nov 12, 2016

Hi

Yes, you can use the Display Filter for this. Use the predefined filter "Repetitions - All" and you will then see only the repetitions. You may then for example lock all these segments to prevent all translators from translating them differently (preferably translate them before you lock them).

Walter


Samuel Murray  Identity Verified
Netherlands
Local time: 12:32
Member (2006)
English to Afrikaans
@Walter Nov 12, 2016

Walter Blaser wrote:
Yes, you can use the Display Filter for this. Use the predefined filter "Repetitions - All" and you will then see only the repetitions.


No, the "All" setting displays all segments, not all repetitions. There is no built-in filter to show all repetitions (and you can't run multiple filters simultaneously AFAIK).

The Display Filter is on the Review tab, by the way. Don't ask why. It just is.


[Edited at 2016-11-12 12:53 GMT]


 
Thomas Pfann  Identity Verified
United Kingdom
Local time: 11:32
Member (2006)
English to German
Export repetitions and translate them first Nov 12, 2016

During the file analysis you can export 'frequently occurring units' (FOUs). Those are then placed into a separate xliff file and can be translated first - of course, that means that they are going to be translated out of context so the translator might spend more time cross-checking the source file for context, but it's still a very handy feature when files with lots of internal repetitions need to get split between several translators.

See SDL Support website:

Export Frequent Segments

When this option is selected, the analysis process checks all files in the project to identify frequently occurring segments. The frequently occurring units are placed in an XLIFF file for translation. A separate XLIFF file is produced for every language pair. The XLIFF files are placed in the project's Exports folder.
The XLIFF files can be translated and the translation units added to the translation memories before work begins on other project tasks.

A unit has to occur in the project source files more times than or equal to the number set in the Number of occurrences box.
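
To make the threshold rule concrete, here is a rough Python sketch of the selection logic only. It assumes the source segments are already available as a plain list of strings and that two segments count as the same only when their text is identical; `frequent_segments` and `min_occurrences` are names made up for this illustration, not anything taken from Studio, whose own matching is more refined.

```python
from collections import Counter


def frequent_segments(segments, min_occurrences=2):
    """Return each segment that occurs at least min_occurrences times,
    listed once, in order of first appearance (exact text match only)."""
    counts = Counter(segments)
    seen = set()
    result = []
    for text in segments:
        if counts[text] >= min_occurrences and text not in seen:
            seen.add(text)
            result.append(text)
    return result


# Hypothetical sample data, not taken from the original document.
source = [
    "Press the power button.",
    "Wait for the green light.",
    "Press the power button.",
    "Close the cover.",
    "Press the power button.",
    "Wait for the green light.",
]

print(frequent_segments(source, min_occurrences=2))
print(frequent_segments(source, min_occurrences=3))
```

With the threshold at 2, both repeated sentences are picked up; at 3, only "Press the power button." qualifies, because it is the only one that occurs three times.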


See also here: https://community.sdl.com/solutions/language/translationproductivity/f/90/t/5350

[Edited at 2016-11-12 12:52 GMT]


 
Samuel Murray  Identity Verified
Netherlands
Local time: 12:32
Member (2006)
English to Afrikaans
You're in for a rough ride Nov 12, 2016

Garboktrans wrote:
I need to split a 50k document to be translated by several translators. I am working from a new TM in Trados Studio 2015. The analysis shows 12200 repetitions. Is there a way of knowing where these repetitions are in the document so as not to assign repeated segments?


Trados has very limited filters with regard to repetitions. Basically, there are only two: "first occurrences" (which show only the first occurrences of repetitions, but no non-repeating segments) and "excluding first occurrences" (which excludes all first occurrences of repetitions, but *does not* exclude any non-repeating segments). So there is no filter that shows only non-first occurrences of repetitions.
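
To pin down the three groups this paragraph distinguishes (first occurrences, later occurrences, and non-repeating segments), here is a rough Python sketch. It assumes the source segments are available as a plain list of strings and compares exact text only; `classify` is a hypothetical helper, and the sketch says nothing about how Studio itself decides what a repetition is.

```python
from collections import Counter


def classify(segments):
    """Label each segment 'first', 'later' or 'unique' by its exact text."""
    counts = Counter(segments)
    seen = set()
    labels = []
    for text in segments:
        if counts[text] == 1:
            labels.append("unique")   # occurs once: not a repetition
        elif text in seen:
            labels.append("later")    # a repeat of an earlier segment
        else:
            labels.append("first")    # first occurrence of a repeated text
        seen.add(text)
    return labels


# Hypothetical sample data.
source = [
    "Remove the filter.",
    "Rinse with warm water.",
    "Remove the filter.",
    "Dry for two hours.",
    "Rinse with warm water.",
]

labels = classify(source)
for number, (label, text) in enumerate(zip(labels, source), start=1):
    print(number, label, text)
```

In this sample, segments 1 and 2 come out as "first", segments 3 and 5 as "later", and segment 4 as "unique".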

If you have lots and lots of time:

A. If you don't care about translating in context:
Use the method mentioned by Thomas. Right-click the project, select "Analyse", and click Next until you get to the screen where you would normally click Finish. On that screen, select Analyze Files on the left, and then select "Export frequent segments" and set it to 2. This will result in a file that contains only the first occurrences of repetitions and no other segments whatsoever (the file won't open automatically, but will be located in the Exports subfolder of the project folder).

B. If you do want to translate in context:
Use the "excluding first occurrences" filter and lock all those segments. This means that the entire file is visible to your translators, but only the first occurrences of repetitions can be translated by them.

Then, but only after they have finished translating the repetitions, you can import their file to a new TM and then divide the original file among the other translators (and provide the TM, or pre-translate against the TM).

It is not possible (natively) to create a version of the file in which only non-repeating segments are unlocked.

If you want all translators to work at the same time:

Here's an idea about how to "remove" repeating segments from the file, but I haven't tested it very extensively.

1. Start with a project that contains a copy of the SDLXLIFF file. Then right-click the project and select Batch Tasks > Pseudo-translation. Select "Deterministic" and "Dollar sign" when prompted. In the "Append characters to the start" field, add something unique, e.g. "_#!#!#". This will "translate" all segments with dollar signs. The _#!#!# will be useful later to filter segments that still contain those characters.
2. Go to the file itself, and use the "excluding first occurrences" filter. Then select all segments in that view, and clear the target fields (if you don't know how to select multiple segments, just ask -- there's a trick to it that's not logical).
3. Now use the "first occurrences" filter, and then mark all those segments as "translated".
4. Create a new TM and import the SDLXLIFF file into that TM (or use your favourite method of adding all translated segments to the TM)
5. Now, create a new project with the original file in it, and pre-translate (Batch Tasks) it against that TM (and specify a match threshold of 100%).

The resulting file will have dollar signs in all repeating segments, and all non-repeating segments will be empty. If you want to make first occurrences also empty (a good idea), simply use the "first occurrences" filter on that file, and then clear the target text from those segments.

Don't forget at the end to delete all translations that start with _#!#!#, and then rerun the SDLXLIFF file against the TM so that those empty segments get translated.

I mean, is there a way of clicking on a segment and having Trados "tell" you that, for example, segment 245 is a repetition of segment 552?


Not as far as I know, no.
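
Outside Studio, though, the mapping itself is simple to work out if you can get the source segments into a plain list, one string per segment, in document order. A minimal Python sketch under that assumption (`repetition_origins` is a hypothetical helper, and it compares exact text only, which is cruder than Studio's own matching):

```python
def repetition_origins(segments):
    """Map the 1-based number of every later occurrence to the 1-based
    number of the first segment that has the same source text."""
    first_seen = {}
    origins = {}
    for number, text in enumerate(segments, start=1):
        if text in first_seen:
            origins[number] = first_seen[text]
        else:
            first_seen[text] = number
    return origins


# Hypothetical sample data.
source = [
    "Open the valve.",
    "Check the gauge.",
    "Open the valve.",
    "Close the valve.",
    "Check the gauge.",
]

for repeat, origin in repetition_origins(source).items():
    print(f"segment {repeat} is a repetition of segment {origin}")
```

Here segment 3 is reported as a repetition of segment 1, and segment 5 as a repetition of segment 2. Segment numbers only line up with the editor if the list follows Studio's own segmentation.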

[Edited at 2016-11-12 13:37 GMT]

[Edited at 2016-11-12 13:38 GMT]


Nora Diaz  Identity Verified
Mexico
Local time: 04:32
Member (2002)
English to Spanish
Follow Walter's advice, it's spot-on Nov 12, 2016

Walter Blaser wrote:

Hi

Yes, you can use the Display Filter for this. Use the predefined filter "Repetitions - All" and you will then see only the repetitions. You may then for example lock all these segments to prevent all translators from translating them differently (preferably translate them before you lock them).

Walter


What Walter explains above is exactly right. Here's a video demonstrating how it works: https://youtu.be/d2FJpra_uGM


Miguel Carmona  Identity Verified
United States
Local time: 03:32
English to Spanish
... Nov 12, 2016

Samuel Murray wrote:

Walter Blaser wrote:
Yes, you can use the Display Filter for this. Use the predefined filter "Repetitions - All" and you will then see only the repetitions.


No, the "All" setting displays all segments, not all repetitions. There is no built-in filter to show all repetitions (and you can't run multiple filters simultaneously AFAIK).


@Samuel,

Walter is right.

"General - All Segments" displays all segments.

"Repetitions - All" displays all repetitions.

"Repetitions - First Occurrences" displays only first occurrences of all repetitions.


Nora Diaz  Identity Verified
Mexico
Local time: 04:32
Member (2002)
English to Spanish
Propagation origin info Nov 12, 2016

Garboktrans wrote:

I mean, is there a way of clicking on a segment and having Trados "tell" you that, for example, segment 245 is a repetition of segment 552?


Only after the segments have been translated, confirmed and autopropagated. See how here: https://youtu.be/C7zJGSqoN1Y


 
Nora Diaz  Identity Verified
Mexico
Local time: 04:32
Member (2002)
English to Spanish
Advanced Display Filter in Studio 2017 Nov 12, 2016

Not that it matters for this thread, as the task at hand can be achieved with the filter currently available in Studio 2015, but Studio 2017 will have an Advanced Display Filter where one can combine several criteria to refine the filter: Content, Segment Status, Segment Origin, Repetitions, Segment Review, Segment Locking, Comments and Document Structure. For example, it would be possible to filter for segments that include a key word, are repetitions and are unlocked.
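
The idea of combining criteria can be sketched in a few lines of Python. This is only an illustration of "every criterion must hold"; the `Segment` class, its fields and the helper functions are all invented for the example and have nothing to do with how Studio 2017 implements its filter.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    number: int
    text: str
    is_repetition: bool
    locked: bool


def contains(keyword):
    """Criterion: the segment text contains the keyword, ignoring case."""
    return lambda seg: keyword.lower() in seg.text.lower()


def is_repetition(seg):
    return seg.is_repetition


def is_unlocked(seg):
    return not seg.locked


def combine(*criteria):
    """Keep a segment only if every criterion accepts it."""
    return lambda seg: all(criterion(seg) for criterion in criteria)


# Hypothetical sample data.
segments = [
    Segment(1, "Open the valve.", True, False),
    Segment(2, "Check the gauge.", True, False),
    Segment(3, "Open the valve.", True, True),
    Segment(4, "Close the valve.", False, False),
]

wanted = combine(contains("valve"), is_repetition, is_unlocked)
matches = [seg.number for seg in segments if wanted(seg)]
print(matches)
```

Only segment 1 passes all three tests here: segment 2 lacks the key word, segment 3 is locked, and segment 4 is not a repetition.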

Samuel Murray  Identity Verified
Netherlands
Local time: 12:32
Member (2006)
English to Afrikaans
@Miguel Nov 12, 2016

Miguel Carmona wrote:
Samuel Murray wrote:
Walter Blaser wrote:
Yes, you can use the Display Filter for this. Use the predefined filter "Repetitions - All" and you will then see only the repetitions.

No, the "All" setting displays all segments, not all repetitions. There is no built-in filter to show all repetitions.

Walter is right.
"General - All Segments" displays all segments.
"Repetitions - All" displays all repetitions.
"Repetitions - First Occurrences" displays only first occurrences of all repetitions.


My test file is this: (I put the repetitions in italics here, to make them stand out)

This is the house that Jack built.
This is the ho2use that Jack built.
This is the ho3use that Jack built.
This is the house that Jack built.
This is the ho2use that Jack built.
This is the ho4use that Jack built.
This is the house that Jack built.
This is the ho5use that Jack built.

If I select the "Repetitions - All" filter, then all of these segments remain in the view, even though segments 3, 6 and 8 are not repetitions. If I select "Repetitions - First occurrences", only segments 1 and 2 are displayed. If I select "Repetitions - Excluding first occurrences", then all segments excluding segments 1 and 2 are displayed, even though segments 3, 6 and 8 are not repetitions. I'm using Trados 2015.


Nora Diaz  Identity Verified
Mexico
Local time: 04:32
Member (2002)
English to Spanish
Expected behavior Nov 12, 2016

Samuel Murray wrote:

My test file is this: (I put the repetitions in italics here, to make them stand out)

This is the house that Jack built.
This is the ho2use that Jack built.
This is the ho3use that Jack built.
This is the house that Jack built.
This is the ho2use that Jack built.
This is the ho4use that Jack built.
This is the house that Jack built.
This is the ho5use that Jack built.

If I select the "Repetitions - All" filter, then all of these segments remain in the view, even though segments 3, 6 and 8 are not repetitions. If I select "Repetitions - First occurrences", only segments 1 and 2 are displayed. If I select "Repetitions - Excluding first occurrences", then all segments excluding segments 1 and 2 are displayed, even though segments 3, 6 and 8 are not repetitions. I'm using Trados 2015.



This is actually normal behavior for Studio. Your sample sentences are in fact all repetitions, because by adding a number to the word "house" ("ho2use", "ho3use", etc.), you've turned it into a placeable/token. In fact, if you translate and confirm segment 2 (keeping the token), you will see that segment 3 is autopropagated from segment 2.


Try something like this instead:

This is the house that Jack built.
This is the car that Jack bought.
This is the house that Jack grew up in.
This is the car Jack drove.
This is the house that Jack built.
This is the car that Jack bought.
This is the house that Jack built.
This is the car that Jack bought.
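
The difference between the two test files can be sketched in Python. This assumes, as a crude approximation, that any run of digits is swapped for a placeholder before segments are compared; Studio's real tokenisation is not shown anywhere in this thread and surely does more than that. `match_key` and `non_repeating` are hypothetical names.

```python
import re
from collections import Counter


def match_key(text):
    """Replace every run of digits with a placeholder, as a rough stand-in
    for treating numbers as placeables."""
    return re.sub(r"\d+", "<num>", text)


def non_repeating(segments, key=match_key):
    """Return the 1-based numbers of segments whose key occurs only once."""
    counts = Counter(key(text) for text in segments)
    return [n for n, text in enumerate(segments, start=1)
            if counts[key(text)] == 1]


numbered_test = [
    "This is the house that Jack built.",
    "This is the ho2use that Jack built.",
    "This is the ho3use that Jack built.",
    "This is the house that Jack built.",
    "This is the ho2use that Jack built.",
    "This is the ho4use that Jack built.",
    "This is the house that Jack built.",
    "This is the ho5use that Jack built.",
]

reworded_test = [
    "This is the house that Jack built.",
    "This is the car that Jack bought.",
    "This is the house that Jack grew up in.",
    "This is the car Jack drove.",
    "This is the house that Jack built.",
    "This is the car that Jack bought.",
    "This is the house that Jack built.",
    "This is the car that Jack bought.",
]

print(non_repeating(numbered_test, key=str))  # exact text comparison
print(non_repeating(numbered_test))           # digits treated as placeholders
print(non_repeating(reworded_test))
```

Compared as exact text, segments 3, 6 and 8 of the numbered file are non-repeating, which is what one would expect at first glance. Once the digits are treated as placeholders, nothing in that file is left as non-repeating, while in the reworded file segments 3 and 4 still stand out as genuinely unique.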

