By
Frederico Carvalho
Project Manager
All Tasks Technical Translations
So many words and no way to count
them… Frederico Carvalho at All Tasks Technical
Translations in Brazil takes a quick historical
and humorous look at what this means in the “real
world” on a daily basis for project managers
and business people and makes a plea for word count
standardization. To review a proposal currently
being submitted to LISA, please read Andrzej Zydron’s
article in this same issue, GILT
Metrics – Slaying the Word Count Dragon.
A while ago, I was
busy at work when my boss called me up on the phone
to assign me to a very special task: I was to investigate
the standard (if any existed) for counting words in
Chinese documents.
I work in Brazil at
a translation agency, so we are very accustomed
to handling multiple language pairs, albeit mainly
European ones, e.g., English-Portuguese, Portuguese-Spanish,
English-German, etc. But the world has changed.
Just a short time ago, the Brazilian president went
on a six-day trip to China to sign trade agreements
between the two countries – the developing
world's two biggest economies. Even before this
official visit, the translation demand for Chinese
was increasing in Brazil. Thus, our business department
has been wondering what the best practice would
be for evaluating Chinese text volumes.
We localized a mobile
device into 28 languages last year, including Traditional
and Simplified Chinese, but back then we had just
one client and only one translator (fortunately
in Brazil), so we could keep things under control.
But now we have new clients every day, and at least
four Chinese translators (half of them abroad) available
full-time, and we have many doubts about the word
counts. Chinese is written with characters
that each represent a word or part of a word, and which
have ideographic roots. Therefore, we cannot assume
that one Chinese character is equal to one word,
so the traditional method for word count, as we
apply it to Western languages, is likely to fail.
“Standard”
pages can contain anywhere from 1200 to 2046 characters!
So, I set to work on
my special project. I contacted agencies from Asia
and translators, and I asked LISA staff to help
clarify the issue, only to discover that there is
no international standard for word counts in any
language, let alone Asian languages. However, I
did find out that LISA has a Special Interest Group
called OSCAR
(Open Standards for Container/Content Allowing Re-use)
that is actively working to establish a standard
in this area (see the section at the end of the
article for more information).
Brazilian Standardization
I must admit it wasn’t
a big surprise that no real standard exists globally,
since it’s no different in Brazil. The usual
procedure adopted by most agencies is an old-fashioned
method inherited from the ancient time of typewriters
– the Standard Page (O.K., I’m not blaming
anyone since I also had my own Olivetti). This page
is supposed to contain around 200 words and, depending
on agency policy, is defined as 1200, 1250, 1400 or
1500 characters (better
not to mention the journalistic standard page, which
is composed of 1700 characters). Recently, a translators’
labor union came forward to bravely suggest that
agencies should stop shooting at random and adopt
the standard page of 1250 characters.
When we need to outsource
translation services abroad, we face the same problem
caused by the lack of word count standardization.
Some international agencies also use the standard
page system, but the page itself varies from country
to country, and what is needed is a true global
standard. For instance, German and Russian pages
(and likewise Polish and Czech ones) have 1800 characters, while
in Italy I have found that the standard varies between
1250 and 1500 characters per page. In Thailand,
agencies evaluate text by using an A4 page with
the text size set at 16 points to arrive at a standardized
page of 2046 characters.
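To make the spread concrete, here is a minimal sketch in Python that converts one character count into "standard pages" under several of the page sizes quoted above. The labels in the table and the function names are my own illustrative choices, and the sketch assumes the character count is already known; whether spaces and punctuation are included in that count is itself something agencies would have to agree on.

```python
# Characters per "standard page" as quoted in this article.
# The labels are illustrative only, not official names.
PAGE_SIZES = {
    "Brazil (union proposal)": 1250,
    "Brazil (journalistic)": 1700,
    "Germany / Russia": 1800,
    "Thailand": 2046,
}


def pages(char_count, chars_per_page):
    """Return the number of standard pages as a fractional value."""
    return char_count / chars_per_page


def page_table(char_count):
    """Map each page standard to its page count, rounded to two decimals."""
    return {
        name: round(pages(char_count, size), 2)
        for name, size in PAGE_SIZES.items()
    }


if __name__ == "__main__":
    # The same 18,000-character document under each standard.
    for name, count in page_table(18000).items():
        print(f"{name}: {count} pages")
```

The same 18,000-character document comes out as 14.4 pages under the 1250-character page and 10 pages under the 1800-character page, which is why a per-page rate means little unless the page size is stated with it.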
Paddle Your Own Canoe
Right now, customers
and translators are forced to have special rate
tables for each standard. All of this could be avoided
through the implementation of a real standardized
word count procedure, providing consistent counts
so that all of us can stop wasting so much money
and effort juggling the different methods and trying
to bring them into synch.
Some people blame the
agencies for not speaking the same language, some
blame the international translator organizations
(that are not able to establish a uniform standard),
and others even blame Gutenberg as the great villain.
He standardized letter forms for movable type in
order to generate the first printed books, and now
we are not able to create a standardized procedure
to evaluate the text that goes on them.
The Chinese Solution
The discussion goes
on. Even in Asia, there is no agreement on a common
basis for measuring texts. However, most Chinese
agencies have adopted a simple model in which text is
evaluated at 1000 characters per page. Personally,
I think this is the best solution for word
counting, since it applies equally to the
source and the target language.
By the way, historical
research indicates that the earliest dated printed
book (A.D. 868) also comes from China, a sign
that the Chinese were already working on standardization
long before our technological innovations.
LISA Is Working on a Solution
Because of this lack
of standardization, OSCAR, LISA’s Special
Interest Group for GILT standards, is examining
the issue of word count and other metrics for translation
work. Unfortunately, as can be seen in my examples
above, finding a single metric that will satisfy
the different expectations in various countries
may prove difficult. It might seem intuitive, and
even obvious, that word counts should be used as
a basis for determining translation costs, but this
is only obvious for Indo-European languages that
have a well-defined concept of a word, and, in particular,
a concept of a word that can be somewhat easily
counted. Word count won’t work for all languages,
so any standard solution will need to address a
basis for dealing with languages that cannot be
counted in the same way as English, Portuguese or
Russian. Thai, Chinese, and Japanese, to name some
of the more common ones, cannot be easily counted
without tremendously sophisticated linguistic knowledge
that is not generally available in computer programs.
In today’s global marketplace, it is no longer
possible to have a myopic view of the world that
only considers the needs of European languages.
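A short sketch in Python shows why the Western method breaks down. It compares a whitespace-based word count with a simple character count on an English sentence and on an illustrative, unsegmented Chinese sentence (roughly "translation demand is growing"). The function names are hypothetical, and the character test covers only the basic CJK Unified Ideographs range (U+4E00 to U+9FFF), which is a simplifying assumption rather than a complete definition of a Chinese character.

```python
def whitespace_word_count(text):
    """Count words the 'Western' way: split the text on whitespace."""
    return len(text.split())


def han_character_count(text):
    """Count characters in the basic CJK Unified Ideographs block.

    Assumption: only U+4E00..U+9FFF is checked; extension blocks,
    punctuation and Latin letters are ignored.
    """
    return sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")


english = "The translation demand is increasing"
chinese = "翻译需求正在增长"  # illustrative sentence, written without spaces

if __name__ == "__main__":
    print(whitespace_word_count(english))  # 5
    print(whitespace_word_count(chinese))  # 1
    print(han_character_count(chinese))    # 8
```

Splitting on spaces reports the whole Chinese sentence as a single "word", while counting characters needs no linguistic knowledge at all. That is one reason a character-based measure is attractive for such languages, even though it says nothing about how many words the sentence really contains.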
Although it is impossible
to say what solution OSCAR will arrive at, a single
standard will finally lead companies and translators
out of the jungle and into the light. It will simplify
bidding, subcontracting and billing, and help prevent
disputes and surprises. When everyone can agree
what a word or a character is and how to count it,
we will all be better able to focus on our jobs,
not on the aggravating task of counting words.
Frederico Carvalho
is a graduate in Communications Science and a Project
Manager at All Tasks Technical Translations in Brazil.
He is currently working on the SEED (Schlumberger
Excellence in Educational Development) website localization
project. Carvalho can be reached at fred_DELETE_THIS@alltasks.com.br.
Reprinted by permission from the Globalization Insider,
17 June 2004, Volume XIII, Issue 2.3.
Copyright the Localization Industry Standards Association
(Globalization Insider: www.localization.org,
LISA: www.lisa.org)
and S.M.P. Marketing Sarl (SMP) 2004