
Studio usually does a good job of localising source-language dates into
the correct target format. But sometimes the source text uses an
incorrect date format, or language variants cause problems. Here’s a
quick look at how date auto-substitution should work out of the box, and
how to tweak it if it doesn’t.
Auto-substitution
To get Studio to automatically localise dates in your target
language, go to File>Options>Language Pairs>[your specific
language pair]>Translation Memory and Automated
Translation>Auto-substitution>Dates and Times.
The screenshot shows the options available for a long date in your
target language. Select your preferred formats for long and short dates
and times.

You also need to enable auto-substitution specifically for dates:

Note that these options apply to all newly-created Studio projects.
To change date substitution options in a current project, go to Project
settings.
Now, when you see a date in a source segment, it’ll be underlined in
blue. Click in the target segment, call up the recognised token using
the shortcut Ctrl+comma and press Enter.

Easy, isn’t it? Unfortunately, it doesn’t always work.
Source date format correct, auto-substitution wrong
Even if the source date format is right, sometimes a language variant
or other problem will stop auto-substitution from working. Regex Match AutoSuggest Provider,
an OpenExchange app, is a great solution. As its name suggests, this
app combines AutoSuggest and regular expressions to match user-defined
source strings.
Regex Match AutoSuggest Provider can catch date formats in any
language and replace them with a target language pattern. Basically,
months have to be predefined and translated as variables, and days and
years are simply reproduced using back references. The app integrates
with the AutoSuggest engine and matches are shown in the AutoSuggest
drop-down list:
Paul Filkin recently posted a short video on YouTube to show how this works for Spanish to English dates. He explained how to automatically change 01 de enero de 1999 to 01 January 1999 using:
- this Regex pattern: (\d{1,2})\sde\s(#Month#)\sde\s(\d{4})
- this Replace pattern: $1 $2 $3
Watch the video to learn how to do it.
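
If you want to check a pattern like this outside Studio before adding it to the app, here is a minimal Python sketch. It only imitates the idea and is not how Regex Match AutoSuggest Provider works internally: the MONTHS dictionary is my own stand-in for the #Month# variable list, and the localise function is a hypothetical helper that plays the role of the Replace pattern.

```python
import re

# Hypothetical stand-in for the app's #Month# variable: source -> target.
MONTHS = {
    "enero": "January", "febrero": "February", "marzo": "March",
    "abril": "April", "mayo": "May", "junio": "June",
    "julio": "July", "agosto": "August", "septiembre": "September",
    "octubre": "October", "noviembre": "November", "diciembre": "December",
}

# Expand #Month# into an alternation of the source-language month names.
PATTERN = r"(\d{1,2})\sde\s(#Month#)\sde\s(\d{4})"
regex = re.compile(PATTERN.replace("#Month#", "|".join(MONTHS)))


def localise(text):
    """Imitate the Replace pattern "$1 $2 $3", with group 2 translated."""
    return regex.sub(
        lambda m: f"{m.group(1)} {MONTHS[m.group(2)]} {m.group(3)}", text
    )


print(localise("Firmado el 01 de enero de 1999."))
# -> Firmado el 01 January 1999.
```

Note that this pattern only accepts lower-case month names and requires both instances of de, so a date such as 31 diciembre 1992 is left untouched.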
Fairly easy, isn’t it? Unfortunately, this trick doesn’t always work.
Incorrect source date format
In practice, Spanish dates are often written incorrectly. According to the Diccionario panhispánico de dudas, the correct long date format is 31 de diciembre de 1992.
Other formats are acceptable, depending on the genre:
- For international scientific or technical texts, the preferred ISO format is year, month and day, with no preposition: 1992 diciembre 31.
- For letters and documents, the Diccionario panhispánico de dudas states that it is ‘not incorrect’ to use the definite article before the year: 31 de diciembre del 1992.
All other formats are incorrect, including:
- 31 diciembre 1992
- 31 de Diciembre de 1992
- 31 de diciembre de 1.992
- 31 de diciembre 1992
And yet, if you translate out of Spanish, you’ll often see these variants in your source texts.
Regex Match AutoSuggest Provider – Studio solution
Every time I come across a new date variant, I expand my customised
Regex pattern so that it will catch the unrecognised date format.

This string picks up all the day-first dates mentioned above, correct or incorrect:
(\d{1,2})(\s|)([Dd][Ee]|)(\s|)(#Month#)(\s|)(de|DE|del|DEL|)(\s|)(\d)(\.|)(\d{3})
The Replace Pattern is:
$1 $5 $9$11
Key:
1 or 2 digits; an optional space; de in upper or lower case, or nothing; an optional space; a month; an optional space; de or del in upper or lower case, or nothing; an optional space; a single digit; an optional dot; 3 digits.
Each field or group is placed in brackets, which means the Replace
Pattern can be created by entering a back reference ($) and the
corresponding group number.
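
To see how the eleven groups and the Replace pattern fit together, here is a rough Python sketch that runs the same string against the day-first variants listed earlier. It is only an approximation for testing purposes: the ES and EN lists, the MONTHS dictionary and the localise function are my own assumptions standing in for the app's variable list and Replace pattern.

```python
import re

ES = ["enero", "febrero", "marzo", "abril", "mayo", "junio", "julio",
      "agosto", "septiembre", "octubre", "noviembre", "diciembre"]
EN = ["January", "February", "March", "April", "May", "June", "July",
      "August", "September", "October", "November", "December"]
MONTHS = dict(zip(ES, EN))  # keyed by the lower-case Spanish name

# Accept lower case, upper case and initial capital for every month.
variants = [v for m in ES for v in (m, m.upper(), m.capitalize())]

PATTERN = (r"(\d{1,2})(\s|)([Dd][Ee]|)(\s|)(#Month#)(\s|)"
           r"(de|DE|del|DEL|)(\s|)(\d)(\.|)(\d{3})")
regex = re.compile(PATTERN.replace("#Month#", "|".join(variants)))


def localise(text):
    """Imitate "$1 $5 $9$11": day, translated month, year without the dot."""
    return regex.sub(
        lambda m: (f"{m.group(1)} {MONTHS[m.group(5).lower()]} "
                   f"{m.group(9)}{m.group(11)}"),
        text,
    )


for source in ["31 de diciembre de 1992", "31 de diciembre del 1992",
               "31 diciembre 1992", "31 de Diciembre de 1992",
               "31 de diciembre de 1.992", "31 de diciembre 1992"]:
    print(source, "->", localise(source))
```

Every variant in the loop should come out as 31 December 1992. Groups 9 and 11 hold the year on either side of the optional dot, which is why the Replace pattern joins them with no space.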
The trick is to define months as variables and pretranslate them under the Variables tab in Regex Match AutoSuggest Provider:

Don’t forget to add all possible variants of the months: upper case, lower case and with an initial capital.
Short date formats
Short date formats can be localised the same way. The Regex string below should catch all these dates:
- 31 dic 1992
- 31.dic.1992
- 31-dic-1992
- 31 DIC 1992
- 31/dic/1992
Regex pattern: (\d+)(\s|-|/|\.|)(#AbvMonth#)(\s|-|/|\.|)(\d+)
Replace pattern: $1$2$3$4$5
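
Here is a quick Python sketch for trying out the short date pattern. As before, it only imitates the behaviour: the abbreviation lists and the localise function are my own assumptions, not the app's actual variable list. The sketch escapes the dot (\.) so that it matches only a literal full stop, because an unescaped . in a regular expression matches any character.

```python
import re

ES = ["ene", "feb", "mar", "abr", "may", "jun",
      "jul", "ago", "sep", "oct", "nov", "dic"]
EN = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
      "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
ABV_MONTHS = dict(zip(ES, EN))  # keyed by the lower-case Spanish abbreviation

# Accept lower case, upper case and initial capital for every abbreviation.
variants = [v for m in ES for v in (m, m.upper(), m.capitalize())]

PATTERN = r"(\d+)(\s|-|/|\.|)(#AbvMonth#)(\s|-|/|\.|)(\d+)"
regex = re.compile(PATTERN.replace("#AbvMonth#", "|".join(variants)))


def localise(text):
    """Imitate "$1$2$3$4$5": keep the separators, translate the month."""
    return regex.sub(
        lambda m: (m.group(1) + m.group(2)
                   + ABV_MONTHS[m.group(3).lower()]
                   + m.group(4) + m.group(5)),
        text,
    )


for source in ["31 dic 1992", "31.dic.1992", "31-dic-1992",
               "31 DIC 1992", "31/dic/1992"]:
    print(source, "->", localise(source))
```

Because groups 2 and 4 are copied into the output, each date keeps its original separator: 31-dic-1992 becomes 31-Dec-1992, and 31/dic/1992 becomes 31/Dec/1992.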
I’m sure there are neater ways of writing Regex patterns to catch all
these date variants, so please add your suggestions in the comments
below. I look forward to learning more ways of dealing with right dates
and wrong dates.