Getting More from Translation Memory
This article
is reprinted with the kind permission of the Localisation
Resource Centre. It first appeared in the March 2004
issue of Localisation Focus.
Translation Memory (TM) technology
and its benefits are widely known in the localization
industry. TM technology is mainly used to reuse
previously translated units. Through this reuse,
we can achieve three objectives: improved
consistency, shorter turnaround time and reduced
translation costs.
(Editor’s note: To learn
more about TMX, and to help contribute to standards
in the GILT industry, plan on attending the LISA
Global Strategies Summit 2004 in San Francisco,
where members of OSCAR will discuss the future of
standards and how you can help ensure that they will
meet your needs.)
If we take a look at the latest Translation Memory eXchange
(TMX) specifications and the tags used in the TMX
file format, we can get more than just time-and-cost
savings and can maximize benefits in previously unexplored
ways. Since the TMX file format is based on XML, it
offers a lot of flexibility. Having your TM in XML-compliant
TMX file format offers you more opportunities to make
the most of the information stored in it and ensures
faster Return on Investment (ROI). This article focuses
on nonstandard TMX file format usage.
TM in TMX
The TMX standard and the specifications
for the file format were developed by OSCAR, a special
interest group of the Localization Industry Standards
Association (LISA). The purpose of the TMX standard
is to create a standard file format, which can be
imported or exported easily using any translation
memory tool. Thus, a TM stored in TMX file format
makes it easy to transfer the TM and use it in any
tool that supports the TMX standard. In a TM, each
unit and its related information is stored using various
tags. The OSCAR team works continuously on improving
the TMX standard, and periodically revises and publishes
the TMX specifications. TMX specification documents are available
at http://www.lisa.org/tmx/tmx.htm
More than just translations
Translation Memory technology, as
its name suggests, is mainly used for translation
purposes. It is used in leveraging a current project
to save time and money. However, it can be used for
other purposes. If you open a TM saved in TMX file
format, you will find other tags as well as source
and translated units. These tags contain information
such as the client ID, the project ID and the project type (whether it
is a documentation, software or help project). This
extra information is saved using the <prop>
element within the <tu> element. The
<prop> element is used to define the
various properties of the parent element (or of the
document when <prop> is used in the
<header> element). The property types themselves
are not defined by the TMX standard; tools and users define their own.
<tu>
  <prop type="x-Domain">Food Processing</prop>
  <prop type="x-Product">Label Maker Pro</prop>
  <prop type="x-Project">labelmaker_v12</prop>
  <prop type="x-Format">Resource Files</prop>
  <prop type="x-Terminology">Glossary</prop>
  <tuv xml:lang="EN-US">
    <seg>Cancel</seg>
  </tuv>
  <tuv xml:lang="FI-FI">
    <seg>Sulje</seg>
  </tuv>
</tu>
Figure 1. <prop>
element giving more information about the translated
text.
As we can see from Figure 1, various
user-defined property types have been used (they start
with the prefix “x-”). From these property
types, you can ascertain that this translation is
related to the Food Processing industry (see the <prop
type="x-Domain"> tag), and refers to a product titled
Label Maker Pro (see the <prop type="x-Product">
tag) that has the project ID labelmaker_v12 (see
the <prop type="x-Project"> tag). Furthermore,
you can also ascertain the file type of the localizable
item (see the <prop type="x-Format"> tag). The
“x-Terminology” property type is used
to specify whether the item is included in the glossary.
This additional information can be
useful when you want to know how a string was translated
in a particular version of the product so that you can
correct it based on the validator’s comments,
or when you want to compare the
translation of a string in one version of the product
with that in another version. Also, tagging a unit with
the glossary property can help you to create a glossary
by selecting only glossary-tagged units. This can
be accomplished using XSL stylesheets or by writing
a script.
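The script route can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name glossary_units and the sample TM are hypothetical, the TMX header is omitted for brevity, and the "x-Terminology" property with the value "Glossary" is taken from Figure 1 rather than from the TMX standard.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample TM; the first unit mirrors Figure 1 (header omitted).
SAMPLE_TMX = """<tmx version="1.4">
  <body>
    <tu>
      <prop type="x-Terminology">Glossary</prop>
      <tuv xml:lang="EN-US"><seg>Cancel</seg></tuv>
      <tuv xml:lang="FI-FI"><seg>Sulje</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="EN-US"><seg>Activate autolock</seg></tuv>
      <tuv xml:lang="DE-DE"><seg>Autom. Sperre aktivieren</seg></tuv>
    </tu>
  </body>
</tmx>"""


def glossary_units(tmx_text, prop_type="x-Terminology", prop_value="Glossary"):
    """Return one {language: segment text} dict per glossary-tagged unit."""
    root = ET.fromstring(tmx_text)
    entries = []
    for tu in root.iter("tu"):
        # A unit qualifies if any of its <prop> children carries the tag.
        tagged = any(
            prop.get("type") == prop_type
            and (prop.text or "").strip() == prop_value
            for prop in tu.findall("prop")
        )
        if not tagged:
            continue
        entry = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text inside inline markup elements.
                entry[tuv.get(XML_LANG)] = "".join(seg.itertext())
        entries.append(entry)
    return entries


if __name__ == "__main__":
    for entry in glossary_units(SAMPLE_TMX):
        print(entry)
```

Only the unit carrying the glossary property is returned; untagged units are skipped.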
Customized string changes
During User Interface (UI) testing,
we are faced with UI-specific problems. Strings that
do not fit in a given space are a common example.
In such cases, you will need to either abbreviate or rephrase
the translation, so that it will fit in the screen space
provided.
When reflecting these changes in the
TM, you can use the <note> element inside <tu>
(see Figure 2). This element allows you to record
details about why you abbreviated or rephrased the
target string.
<tu>
  <note>This string was abbreviated to
  fit in the dialog-box.</note>
  <tuv xml:lang="EN-US">
    <seg>Activate autolock</seg>
  </tuv>
  <tuv xml:lang="DE-DE">
    <seg>Autom. Sperre aktivieren</seg>
  </tuv>
</tu>
Figure 2. <note>
element describing customized string changes.
Furthermore, if the TM software has
a feature capable of showing notes in red after leveraging,
then these notes will be useful to translators. For
example, during the translation of future versions,
a translator who sees such a note in red will
keep the abbreviated translation as it is and will
not retranslate it. This avoids wasting effort
on resizing dialog boxes and strings (see Figure
3).
<!-- abbreviated string, do not edit -->
{0>Activate autolock<}100
{>Autom. Sperre aktivieren<0}
Figure 3. Leveraged
text, showing a highlighted note to the translator.
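How a tool might carry a note along when it leverages a unit can be sketched as follows. This is a minimal Python illustration, not the behavior of any particular TM product: the function name leverage is hypothetical, matching is exact only, the TMX header is omitted, and how the note is displayed (in red or otherwise) is left to the tool.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Sample TM holding the unit from Figure 2 (header omitted for brevity).
SAMPLE_TMX = """<tmx version="1.4">
  <body>
    <tu>
      <note>This string was abbreviated to
      fit in the dialog-box.</note>
      <tuv xml:lang="EN-US"><seg>Activate autolock</seg></tuv>
      <tuv xml:lang="DE-DE"><seg>Autom. Sperre aktivieren</seg></tuv>
    </tu>
  </body>
</tmx>"""


def leverage(tmx_text, source, src_lang, tgt_lang):
    """Return (translation, notes) for an exact source match, else None."""
    root = ET.fromstring(tmx_text)
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                segs[tuv.get(XML_LANG)] = "".join(seg.itertext())
        if segs.get(src_lang) == source and tgt_lang in segs:
            # Collapse line breaks and indentation inside each note.
            notes = [" ".join((n.text or "").split()) for n in tu.findall("note")]
            return segs[tgt_lang], notes
    return None


if __name__ == "__main__":
    match = leverage(SAMPLE_TMX, "Activate autolock", "EN-US", "DE-DE")
    if match is not None:
        translation, notes = match
        print(translation)
        for note in notes:
            print("NOTE:", note)
```

Returning the notes together with the translation lets the calling tool decide how to warn the translator before the string is edited.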
Using XSL with TMX
With XSL, you can show selective parts
of your TM. You can show source and target units in
a tabular format in the browser. You can also choose
to show only glossary-related items that are tagged
with glossary property tags. You may find this useful,
for example, if you want to put localized material
on your company’s intranet. The Opentag
website has some useful XSL template collections.
Using these XSL templates, you can generate a tab-delimited
file from a TMX file, view translated units of the
TMX file, and also change between older and more recent
versions of TMX.
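The tab-delimited conversion can also be done with a short script instead of XSL. The following Python sketch is not one of the Opentag templates; the function name tmx_to_tsv and the sample data are hypothetical, the TMX header is omitted, and only units that contain both requested languages are written.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample TM (header omitted for brevity).
SAMPLE_TMX = """<tmx version="1.4">
  <body>
    <tu>
      <tuv xml:lang="EN-US"><seg>Activate autolock</seg></tuv>
      <tuv xml:lang="DE-DE"><seg>Autom. Sperre aktivieren</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="EN-US"><seg>Cancel</seg></tuv>
      <tuv xml:lang="FI-FI"><seg>Sulje</seg></tuv>
    </tu>
  </body>
</tmx>"""


def tmx_to_tsv(tmx_text, src_lang, tgt_lang):
    """Return tab-delimited text with one source/target pair per line."""
    root = ET.fromstring(tmx_text)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([src_lang, tgt_lang])  # header row
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                segs[tuv.get(XML_LANG)] = "".join(seg.itertext())
        # Skip units that lack either of the two requested languages.
        if src_lang in segs and tgt_lang in segs:
            writer.writerow([segs[src_lang], segs[tgt_lang]])
    return out.getvalue()


if __name__ == "__main__":
    print(tmx_to_tsv(SAMPLE_TMX, "EN-US", "DE-DE"), end="")
```

Using the csv module rather than joining strings by hand means a segment that itself contains a tab or a line break is quoted instead of corrupting the table.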
Using TM for FAQs and Knowledgebases
User documentation contains
a lot of useful information such as Frequently Asked
Questions (FAQs), quick installation steps, troubleshooting
etc. After you have localized the user manual, your
TM will have this information stored in different
languages. You can use this TM (in TMX file format)
as a searchable database, so that users (customers,
business partners, translators, validators etc.) can
search for necessary information via your website
and can get the search results in their native language.
Through specially designed web interfaces,
the user can perform various types of searches based
on keywords, phrases, product-names etc. The search
engine will then search through the company’s
master TM (stored in TMX file format) and will display
the search results in the user’s preferred language.
Since a file using the TMX file format is a structured
document with XML tags, a search can be restricted to
specific elements, languages or properties, which makes it
more efficient and yields more precise results. This can help reduce costs
in many areas. For example, as translators will have
ready access to previous translations, they will have
fewer queries, which will minimize time wasted
on unnecessary communication. This will also help
increase the speed of translation. The technical support
team may get fewer support calls, as users can get the necessary
information, in their native language, from the searchable
knowledgebase on the Internet. Overall, it can reduce
the time and money spent on different modes of communication.
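The core of such a search can be sketched in Python. This is a simplified illustration, not a production search engine: the function name search_tm and the sample TM are hypothetical, the TMX header is omitted, matching is a plain case-insensitive substring test, and a real web interface would add indexing, ranking and filtering by product.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical master TM (header omitted for brevity).
SAMPLE_TMX = """<tmx version="1.4">
  <body>
    <tu>
      <tuv xml:lang="EN-US"><seg>Activate autolock</seg></tuv>
      <tuv xml:lang="DE-DE"><seg>Autom. Sperre aktivieren</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="EN-US"><seg>Cancel</seg></tuv>
      <tuv xml:lang="FI-FI"><seg>Sulje</seg></tuv>
    </tu>
  </body>
</tmx>"""


def search_tm(tmx_text, keyword, query_lang, result_lang):
    """Find units whose query_lang segment contains keyword.

    Returns a list of (query-language text, result-language text) pairs.
    """
    needle = keyword.casefold()
    hits = []
    for tu in ET.fromstring(tmx_text).iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                segs[tuv.get(XML_LANG)] = "".join(seg.itertext())
        # Only units available in both languages can answer the query.
        if query_lang not in segs or result_lang not in segs:
            continue
        if needle in segs[query_lang].casefold():
            hits.append((segs[query_lang], segs[result_lang]))
    return hits


if __name__ == "__main__":
    for source, target in search_tm(SAMPLE_TMX, "autolock", "EN-US", "DE-DE"):
        print(source, "->", target)
```

Because each language variant sits in its own <tuv> element, the search is limited to the language the user typed in, and the result is read from the language the user prefers.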
Supporting TMX Features
To let users make the most of their
TMs and of the uses described in this article,
companies producing TM software should implement
more TM-related features in their products. One
such feature would be an export function that filters
by tag, so that users can export only the units that carry
selected tags (such as the property types shown in Figure
1).
An example of where this could be
useful would be when localising a user manual for
localized software. Often a translator may not know
how to handle software UI references appearing in
the localized manual. This may be because the translator
has no access to the localized software or relevant
reference material. Most of the time, leaving the
UI reference in English and putting a translation
in the brackets that follow would handle the situation.
If we could export the TM units that have UI terms
into a separate file, then it would help the translator
and reduce a large amount of work in subsequent phases.
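Until TM tools offer such an export, a filter of this kind can be approximated with a script. The Python sketch below rests on assumptions: the function name export_by_prop is hypothetical, the TMX header is omitted, and UI strings are assumed to be marked with the "x-Format" property value "Resource Files" as in Figure 1 (the value "Documentation" is made up for contrast).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample TM (header omitted for brevity).
SAMPLE_TMX = """<tmx version="1.4">
  <body>
    <tu>
      <prop type="x-Format">Resource Files</prop>
      <tuv xml:lang="EN-US"><seg>Cancel</seg></tuv>
      <tuv xml:lang="FI-FI"><seg>Sulje</seg></tuv>
    </tu>
    <tu>
      <prop type="x-Format">Documentation</prop>
      <tuv xml:lang="EN-US"><seg>Activate autolock</seg></tuv>
      <tuv xml:lang="DE-DE"><seg>Autom. Sperre aktivieren</seg></tuv>
    </tu>
  </body>
</tmx>"""


def export_by_prop(tmx_text, prop_type, prop_value):
    """Return TMX text that keeps only the units with the given property."""
    root = ET.fromstring(tmx_text)
    body = root.find("body")
    if body is None:
        raise ValueError("TMX document has no <body> element")
    # Iterate over a copy of the list because units are removed in the loop.
    for tu in list(body.findall("tu")):
        matches = any(
            prop.get("type") == prop_type
            and (prop.text or "").strip() == prop_value
            for prop in tu.findall("prop")
        )
        if not matches:
            body.remove(tu)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(export_by_prop(SAMPLE_TMX, "x-Format", "Resource Files"))
```

The result is still a TMX document, so the exported UI terms can be loaded into any tool that supports the format and handed to the translator of the manual.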
Furthermore, there are some software
packages on the market that allow you to edit the
translated units in the TM. Heartsome’s
TMX Editor is one such package. It has an attractive
graphical user interface which makes the editing job
more user friendly. Such editors help to improve the
quality of a TM and make TM maintenance easy.
One way to improve the TMX standard
would be to add new tags, elements and attributes.
One of the main objectives behind TM technology is
the reuse of text. Let’s hope that the suggestions
made in this article will help to achieve this objective.
Shailendra Musale
has worked in software localization for over ten years
in Singapore and Finland. He periodically writes on
various topics related to the localization industry.
He currently works as a Globalization Engineering
Project Manager (GEPM) at Veritas India. He can be
reached at smusale@vxindia.veritas.com
Reprinted
by permission from the Globalization Insider,
11 May 2004, Volume XIII, Issue 2.2.
Copyright
the Localization Industry Standards Association
(Globalization Insider: www.localization.org,
LISA: www.lisa.org)
and S.M.P. Marketing Sarl (SMP) 2004