My first introduction to the concept of translation memory came about a decade ago, in the context of an enterprise content management system sending source content out to be translated.
This post is a primer of sorts for those of you who come from a content technology background and are learning about something called translation memory.
Rough definition of translation
Before we get to the "memory" part, let's start with translation itself: the process of changing some source content into a desired target language. There's always a language pair, and you usually want to know both the language and the location where it is spoken (the locale), for the source as well as the target.
In terms of costs, you could technically pay people (sometimes called human translation, as distinct from machine translation) to translate entire documents from scratch for every single request.
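As a rough illustration of a language pair, here is a minimal sketch that models source and target as language-plus-region values. The class names (`Locale`, `LanguagePair`) and the `en-US`/`fr-CA` example are assumptions for illustration, not taken from any particular translation system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locale:
    """A language plus the region where it is spoken (hypothetical model)."""
    language: str  # e.g. "en"
    region: str    # e.g. "US"

    def tag(self) -> str:
        return f"{self.language}-{self.region}"


@dataclass(frozen=True)
class LanguagePair:
    """Source and target locales for one translation request."""
    source: Locale
    target: Locale

    def label(self) -> str:
        return f"{self.source.tag()} > {self.target.tag()}"


# US English content to be translated into Canadian French.
pair = LanguagePair(Locale("en", "US"), Locale("fr", "CA"))
print(pair.label())
```

Keeping the region alongside the language matters because, for example, French for Canada and French for France can call for different translations of the same source.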
However, since you're paying for the translation, wouldn't it be great to avoid paying full price if you're translating the same content... again?
That's where translation memory comes into play.
Translation memory is a database of sorts that captures and reuses translated pieces of content. These pieces come from the translated (electronic) documents or files, broken down into segments of source and target pairs.
What is a segment?
A segment is some piece of text to be translated. Segments might be words, sentences, or perhaps paragraphs of text. They should have enough context that they can be translated yet not so large that you'll rarely get the same arrangement of text translated again in the future.
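To make segmentation a little more concrete, here is a minimal sketch that splits text into sentence-sized segments. It assumes a naive rule (break after `.`, `!` or `?` followed by whitespace); real translation tools use configurable segmentation rules that handle abbreviations, markup, and language-specific punctuation, so treat `split_into_segments` as a hypothetical helper.

```python
import re
from typing import List


def split_into_segments(text: str) -> List[str]:
    """Split text into sentence-like segments using a naive punctuation rule."""
    # Break on whitespace that follows sentence-ending punctuation.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


segments = split_into_segments(
    "Open the file. Save your changes! Close the editor."
)
for segment in segments:
    print(segment)
```

Sentence-sized segments are a common middle ground: each one carries enough context to translate, and short sentences such as "Save your changes!" are likely to show up again in later requests.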
Since translation memory, well, remembers past translations, translation requesters can get their requests completed sooner and translators can work more quickly. In general, a project's cost will be lower when using translation memory.
A translation unit is the pairing of a source segment with its translated target segment.
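Putting segments and units together, a translation memory can be pictured as a lookup from source segments to stored units. The sketch below is a toy in-memory version under that assumption; the `TranslationUnit` and `TranslationMemory` names are hypothetical, and a real system would also track languages, metadata, and near matches.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TranslationUnit:
    """One source segment paired with its translated target segment."""
    source: str
    target: str


class TranslationMemory:
    """Toy translation memory that only supports exact source matches."""

    def __init__(self) -> None:
        self._units: Dict[str, TranslationUnit] = {}

    def add(self, source: str, target: str) -> None:
        # Units are keyed by source text, because matching happens on the source.
        self._units[source] = TranslationUnit(source, target)

    def lookup(self, source: str) -> Optional[TranslationUnit]:
        return self._units.get(source)


tm = TranslationMemory()
tm.add("Save your changes.", "Enregistrez vos modifications.")

hit = tm.lookup("Save your changes.")   # seen before: reuse the stored target
miss = tm.lookup("Save all changes.")   # never seen: needs a new translation
print(hit.target if hit else "no match")
print(miss.target if miss else "no match")
```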
Translation costs depend on your language service provider (LSP), translation job needs (e.g. is it a rush project?), the source and target languages, and any special handling of the job.
The language-specific costs for a given source-to-target pair might be charged by some fixed metric such as word count or character count. Segments that have been translated before tend to cost less than brand-new segments. In between the "cheaper" previously seen segments and the more costly never-before-seen segments are what you may call "fuzzy" matches.
A fuzzy match is a translation segment where the source is close to, but not exactly the same as, a previously seen translation segment.
Specifically the match is on the source, which is now obvious to me, but for a brief moment I almost got translation memory backwards. I'm only admitting that in case it helps to know there's a bit of nuance to this seemingly simple concept.
Note that, depending on how a given translation management system implements translation memory, there is also the concept of reverse translation memory, where the system might check for segment matches against past target-language content.
A fuzzy match is where many of the words in a source segment are found in a translation memory translation unit (pair of source and target segment), but it's not quite 100%. It's a little... fuzzy, similar to the concept of "fuzzy search" for those familiar with website or enterprise search.
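Here is a minimal sketch of fuzzy matching on the source side, using Python's standard `difflib.SequenceMatcher` over words as a stand-in similarity measure. Translation tools use their own scoring methods, so the `similarity` and `best_match` helpers, the 0.75 threshold, and the sample memory are all assumptions for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple


def similarity(a: str, b: str) -> float:
    """Word-level similarity between two source segments, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def best_match(
    source: str, memory: Dict[str, str], threshold: float = 0.75
) -> Optional[Tuple[str, str, float]]:
    """Return (stored source, stored target, score) for the closest stored
    source at or above the threshold, or None if nothing is close enough."""
    best: Optional[Tuple[str, str, float]] = None
    for stored_source, stored_target in memory.items():
        score = similarity(source, stored_source)
        if score >= threshold and (best is None or score > best[2]):
            best = (stored_source, stored_target, score)
    return best


# Hypothetical memory: source segment -> previously approved target segment.
memory = {
    "Save your changes.": "Enregistrez vos modifications.",
    "Close the editor.": "Fermez l'éditeur.",
}

match = best_match("Save all your changes.", memory)
if match is not None:
    stored_source, stored_target, score = match
    print(f"{score:.0%} match on {stored_source!r}: {stored_target!r}")

no_match = best_match("Open the file.", memory)
print(no_match)
```

Note that the comparison is between the new source and the stored source. The stored target is only returned as a suggestion for the translator to adjust.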
So now we know that a translation request, especially one in a digital format, can be broken down into segments that are matched against translation memory. What does this mean for organizations needing language services?
Essentially, organizations and their language service providers can leverage past translations to increase the speed while reducing the costs of translation.
In general, the more exact the translation memory match, the lower the translation cost.
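To show how match quality can feed into cost, here is a small sketch that bills each segment per word at a percentage that depends on its match score. Every number in it is invented for illustration (20 cents per word, exact matches at 25%, fuzzy matches of 75% or better at 60%, everything else at full price); actual rates and tiers come from your language service provider.

```python
from typing import List, Tuple


def billing_percent(score: float) -> int:
    """Map a match score (0.0 to 1.0) to a percentage of the full rate.

    The tiers below are made-up example values, not real pricing.
    """
    if score >= 1.0:
        return 25    # exact match: previously translated
    if score >= 0.75:
        return 60    # fuzzy match: translator edits a suggestion
    return 100       # new segment: translated from scratch


def segment_cost_cents(source: str, score: float, rate_cents_per_word: int = 20) -> float:
    """Cost of one segment, charged by source word count."""
    words = len(source.split())
    return words * rate_cents_per_word * billing_percent(score) / 100


# (source segment, best match score found in translation memory)
job: List[Tuple[str, float]] = [
    ("Save your changes.", 1.0),        # exact match
    ("Save all your changes.", 0.86),   # fuzzy match
    ("Close the editor.", 0.2),         # effectively new
]

total = sum(segment_cost_cents(source, score) for source, score in job)
full_price = sum(segment_cost_cents(source, 0.0) for source, _ in job)
print(f"With translation memory: {total / 100:.2f}")
print(f"Without translation memory: {full_price / 100:.2f}")
```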
Final tip for demonstrating translation memory
As a side comment, I thought demonstrating content management-related concepts like content versioning is hard, especially if you want to prepare some checked out content for others to work on. It's too easy to "fix" and clean up your demo example! However, demonstrating translation memory adds even more nuanced challenges.
Not only do you want a plausible translation to be captured in translation memory already, you then want a plausible translation that differs slightly.
That's it in terms of translation memory coming from a web content management perspective. For your favorite content management system, you'll want to keep language pairs in mind, see how you can configure and adjust the segments where needed, and confirm whether your CMS can do things like review in-progress translation units. Ultimately, leveraging past translations with translation memory will save you time and costs while improving consistency.
Learn a bit more by reading about Translation Memory in the context of Trados Studio. You're welcome to join the RWS community as well, perhaps by checking out our various product groups.