Content missing from PDF source
Thread poster: Ofra Hod (X)
Ofra Hod (X)
Ofra Hod (X)  Identity Verified
Israel
Local time: 07:33
English to Hebrew
Aug 28, 2022

I am translating a 116-page PDF document and I noticed that some content is missing in the source column. I checked the intermediate docx file and saw that the content is missing there, too.

I can fix this in the final deliverable, but what if there are more such omissions that I didn't notice?


 
Soonthon LUPKITARO(Ph.D.)
Soonthon LUPKITARO(Ph.D.)  Identity Verified
Thailand
Local time: 12:33
English to Thai
Pdf texts as graphic Aug 28, 2022

Trados can convert PDF text into DOCX or a bilingual Trados file if the source text is searchable. Text that exists only as graphics in the PDF is therefore treated as an image and ignored by Trados.

Regards
Soonthon L.


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 08:33
English to Russian
Not exactly Aug 28, 2022

Soonthon LUPKITARO(Ph.D.) wrote:
Trados can convert PDF text into DOCX or a bilingual Trados file if the source text is searchable. Text that exists only as graphics in the PDF is therefore treated as an image and ignored by Trados.
Trados has an OCR feature powered by Solid Documents, and it can process both vector-based and bitmap PDF files. However, Trados in general is not designed for OCR. If you want a reliable translatable source file, you have to convert it into docx first, tidy it up, and only then import it into Trados.


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 08:33
English to Russian
Video Aug 28, 2022

You can try to fiddle with the options based on this video (pay attention to what she says at 1:08 and 1:30): https://youtu.be/K8LU8kFV55c?t=68
There is a preview option for you to see how different settings may change the output.

*If you can see changes in the preview mode but not in your file, you probably need to rebuild the pair of sdlxliff files. If this is the case, you have two options. Either remove the previous files from source and target (by switching between the language banner symbols), drag and drop your file into the source section of the same project in Trados, and click 'Prepare without project TM' (in this case, apply your setting changes using the 'Project Settings-File Types-PDF-Converter' path). Or simply create a project with the new settings from scratch (in this case, apply your setting changes using the 'File-Options-File Types-PDF-Converter' path).

[Edited at 2022-08-28 15:10 GMT]


 
Ofra Hod (X)
Ofra Hod (X)  Identity Verified
Israel
Local time: 07:33
English to Hebrew
TOPIC STARTER
My PDF is searchable Aug 28, 2022

I am sorry, I wasn't clear about it. My PDF is not graphic. The text in it (including the missing text) is searchable.

 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 08:33
English to Russian
Aug 28, 2022

Ofra Hod wrote:
My PDF is not graphic.
It does not matter, except that you can try different settings and the IRIS plugin, as I mentioned above. But the most robust way is to convert your pdf file into docx before importing it into Trados.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 06:33
Member (2006)
English to Afrikaans
My suggestion Aug 29, 2022

Stepan Konev wrote:
But the most robust way is to convert your PDF file into DOCX before importing it into Trados.

I have tried a few PDF-to-DOCX converters, and my experience is that Trados' converter is among the best out there (especially at taking into account that sentences should not be broken up at line endings).

So my suggestion would be that you start by creating a temporary Trados project for the PDF file, which will then generate a DOCX source file. Then, compare the DOCX file against the PDF file (line by line, page by page) to ensure that everything is there, and then create a new Trados project using that updated/fixed DOCX file (not using the original PDF file).

No PDF converter is perfect and there is always a risk that the converter will fail to convert some text, for whatever reason, so it is the translator's responsibility to ensure that the DOCX file that he translates contains all of the text.
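
For a long document, part of the DOCX-versus-PDF comparison could be automated as a first pass before the manual check. The following is only a minimal sketch, assuming you have already exported the text of both files to plain text by some other means (for example, copy and paste into a text editor). The function names are hypothetical, the sentence splitter is deliberately naive, and nothing here is a Trados feature.

```python
import re
from typing import List


def normalize(text: str) -> str:
    """Collapse every run of whitespace (including line breaks) to one space."""
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when whitespace follows."""
    return [s for s in re.split(r"(?<=[.!?])\s+", normalize(text)) if s]


def missing_sentences(pdf_text: str, docx_text: str) -> List[str]:
    """Return the sentences of the PDF text that do not occur in the DOCX text."""
    docx_norm = normalize(docx_text)
    return [s for s in split_sentences(pdf_text) if s not in docx_norm]


# Hypothetical sample data: the PDF text has a hard line break mid-sentence,
# and the DOCX text lacks the second sentence.
pdf_text = "First sentence of the\nreport. Second sentence. Third one here."
docx_text = "First sentence of the report. Third one here."

for sentence in missing_sentences(pdf_text, docx_text):
    print("Missing from DOCX:", sentence)
# prints: Missing from DOCX: Second sentence.
```

Whitespace is normalized first because PDF text usually carries line breaks in the middle of sentences. A check like this only flags candidates; abbreviations, hyphenation and reordered text will produce false alarms, so it does not replace the page-by-page comparison.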


 
Ofra Hod (X)
Ofra Hod (X)  Identity Verified
Israel
Local time: 07:33
English to Hebrew
TOPIC STARTER
Trados is usually doing a good job Aug 30, 2022

Thank you! What you say supports my gut feeling that Trados was doing a good job in general. I will compare the docx and the pdf, as you suggest.

 

