MultiCorpora was founded in 1993 by Gerry Gervais, who formerly worked in a large Canadian government department responsible for publishing large volumes of bilingual documentation. He tried to increase the productivity of translation activities by introducing Computer Aided Translation (CAT) technologies into multilingual publishing and document management operations. Since the available technology was unsatisfactory, Gervais started to develop MultiTrans as a tool that supports various types of content, provides context, and does not require up-front setup and investment. MultiCorpora was formally incorporated 2.5 years ago in Quebec, Canada, and the first version of MultiTrans began shipping to customers in early 2000. Until recently, sales and marketing have focused mainly on the Canadian and US markets, with particular success in governmental institutions, the pharmaceutical industry, and financial services, as well as with agencies and freelancers. This year (2002), MultiCorpora has also begun to enter the European market.

Fig. 1: The TransCorpora Search Module offers three different search options (left panel): free input, TransCorpora, and Terminology search. Search results are displayed to the right using blue highlighting of the search term in the source file and yellow alignment of the corresponding sentences in the source and target.
MultiTrans focuses on document translation. Binaries and resource code are not supported. Like Star Transit, MultiTrans differs from traditional Translation Memory technology. Instead of an often bloated and cumbersome central database (Translation Memory or TM), MultiTrans uses one or several collections of reference documents arranged in language pairs. This way each translation can be based on a set of earlier translations selected for the job at hand. MultiTrans calls these collections Corpora (plural of corpus). Corpora are created by the Corpus Builder and may include all previously translated MS Word (*.doc and *.rtf), WordPerfect, HTML, XML, PDF and text files as well as TMX-compliant translation memories. Language pairs are automatically aligned and indexed, followed by an equally automated term extraction (words and expressions). This is done by a high-performance engine capable of processing more than 50,000 words per minute. This first module also includes a powerful and fast search engine called TransCorpora Search, which can search for multiple words, phrases, sentences and paragraphs. Search results are presented in a split-screen view that shows source and target expressions in their full document context.
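
To make the corpus idea concrete, here is a minimal Python sketch of a search over aligned sentence pairs. It is not MultiTrans code and says nothing about how the product is built internally; the names AlignedPair, build_index and search, as well as the sample sentences, are assumptions made up for illustration. It only shows why keeping the reference documents intact lets a hit be traced back to its place in the original file.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One aligned sentence pair plus its position in the reference document."""
    doc: str
    position: int
    source: str
    target: str


def tokenize(text):
    """Lower-case the text and strip simple punctuation from each word."""
    words = (w.strip(".,;:!?()\"'") for w in text.lower().split())
    return [w for w in words if w]


def build_index(pairs):
    """Map every source-language word to the numbers of the pairs containing it."""
    index = defaultdict(set)
    for i, pair in enumerate(pairs):
        for word in tokenize(pair.source):
            index[word].add(i)
    return index


def search(phrase, pairs, index):
    """Return pairs whose source contains the phrase as a contiguous word sequence."""
    words = tokenize(phrase)
    if not words:
        return []
    # Only pairs containing every word of the phrase can be hits.
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    n = len(words)
    hits = []
    for i in sorted(candidates):
        tokens = tokenize(pairs[i].source)
        if any(tokens[k:k + n] == words for k in range(len(tokens) - n + 1)):
            hits.append(pairs[i])
    return hits


# Hypothetical sample corpus (invented for this sketch).
CORPUS = [
    AlignedPair("manual_en_de", 0, "Press the power button.",
                "Drücken Sie den Netzschalter."),
    AlignedPair("manual_en_de", 1, "The power button is on the front panel.",
                "Der Netzschalter befindet sich an der Vorderseite."),
    AlignedPair("manual_en_de", 2, "Disconnect the power cable.",
                "Ziehen Sie das Netzkabel ab."),
]
INDEX = build_index(CORPUS)

for hit in search("power button", CORPUS, INDEX):
    print(f"{hit.doc}#{hit.position}: {hit.source} | {hit.target}")
```

Because every hit carries its document name and position, a viewer could open the surrounding sentences on both sides, which is the kind of context a flat segment database does not keep.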

Fig. 2: The MultiTrans TermBase Terminology Editor.
MultiTrans' second main module is the multilingual terminology database TermBase, which supports not only the common "Nominal Terminology" but also recurring expressions of typically 5 words or fewer, which can be replaced by a pre-translation process. In addition, even sentence-level Translation Memory (TM) databases can be imported (TMX or delimited text) and used as terminology. The TermBase can easily be extended with material from the language pair corpora through powerful expression extraction and alignment. MultiTrans optionally offers TermBase C/S (Client/Server) for building, sharing and distributing terminology across a network, intranet or the Internet.
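
As a rough illustration of what a TMX import into a term list involves, the following Python sketch reads translation units from a small TMX sample and separates short expressions from full sentences. It is an assumption-laden sketch, not the TermBase importer: the function names, the sample data and the cut-off of five words (borrowed from the "5 words or fewer" figure above) are all chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(tmx_text, source_lang, target_lang):
    """Return (source, target) text pairs for one language pair from TMX content."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            # Newer TMX uses xml:lang; older files use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                segments[lang] = "".join(seg.itertext()).strip()
        if source_lang in segments and target_lang in segments:
            pairs.append((segments[source_lang], segments[target_lang]))
    return pairs


def split_by_length(pairs, max_words=5):
    """Separate short expressions from longer, sentence-level entries."""
    expressions = [p for p in pairs if len(p[0].split()) <= max_words]
    sentences = [p for p in pairs if len(p[0].split()) > max_words]
    return expressions, sentences


# Hypothetical sample file content (invented for this sketch).
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>translation memory</seg></tuv>
      <tuv xml:lang="fr"><seg>mémoire de traduction</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>The file was saved to the shared network folder.</seg></tuv>
      <tuv xml:lang="fr"><seg>Le fichier a été enregistré dans le dossier réseau partagé.</seg></tuv>
    </tu>
  </body>
</tmx>"""

PAIRS = read_tmx_pairs(SAMPLE_TMX, "en", "fr")
EXPRESSIONS, SENTENCES = split_by_length(PAIRS)
print(EXPRESSIONS)
print(SENTENCES)
```

Both kinds of entry end up in the same list of pairs, which mirrors the point made above: a sentence-level TM and a terminology list differ mainly in the length of their entries.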

Finally, for the translation of new documents, MultiTrans Translation Support Workbench connects familiar editing environments such as MS Word with a project's multiple Corpora and TermBases. In MS Word this is achieved by a collection of macros included in the Add-On TransTerm. The formerly independent product DoubleVue™ transforms MS Word into a split-screen editing environment for simultaneous tracking and display of source and target translation texts.


Fig. 3: Interactive translation of an RTF document using MultiTrans' TransTerm Add-On for MS Word.
Editions & Prices:

  • MultiTrans Light (Number of expressions in a TermBase and Number of words in reference documents is limited - no network access): 689 $/€ plus shipping & handling.
  • MultiTrans Pro (Number of expressions in a TermBase and Number of words in reference documents is NOT limited - Network access): 2219 $/€ plus shipping & handling.

Volume discounts are available for both versions. Client-Server and Web Terminology start at $/€ 995, depending on the number of users.


Usage & Evaluation:

The makers of MultiTrans stress that their corpus-based approach eliminates manual alignment and thereby gives instant access to corpora containing thousands of files and millions of words. This saves time and money and delivers high performance from the start. This is doubtless the case if you already have a large stock of multilingual documents. However, even just a couple of documents on the same topic or for the same client, such as multilingual public web pages, can yield time savings. Although I found some problems in the automated alignment, and manual corrections are not always easy to make, you are ready to go almost immediately. If no multilingual reference material is available at all, MultiTrans can also work in the traditional, laborious way of feeding a TM while you translate. Translated segment pairs can be fed into a database (the TermBase, which can export as TMX). In contrast to traditional TM solutions, you end up not only with a complete TM but also with the full-text source and target documents, which can be put into a corpus. Such full-text multi corpora can also widen the scope of use to data mining and knowledge management.

Another real advantage of the corpus-based approach is its independence from full-sentence repetition. MultiTrans can identify full sentences as well as expressions and phrases. In most document types, repetition of whole sentences is actually quite rare. Typically more than 90% of total repetition is at the expression and word level. Unfortunately, translation using sub-sentence units produced quite unsatisfactory results in the test documents. Translated German nouns were always inserted in lower case into the surrounding untranslated text, and sub-sentence fragments were inserted without regard to the differing rules of grammar and syntax in the source and target languages. The default settings for projects involving German did not take care of this problem automatically. Fine-tuning the settings improves the results, but there is still no circumstance in which it makes sense to lower-case German nouns. Moving correctly translated blocks around and cleaning up the syntax is in most cases more laborious than translating manually. By contrast, the concordance search of traditional TM tools lets you choose between several grammatically correct sentences. This classic approach can, however, also be used in MultiTrans if you feed the TermBase not only with terms and phrases but also with your aligned bilingual corpora, thus using it like a traditional TM.
Another obstacle was that the automated alignment could not easily be taught to accept two sentences in one language and only one in the other. MultiTrans lacks the familiar tools for splitting and combining source or target sentences, but after getting deeper into the tool's architecture I was able to achieve a correct alignment.
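
The sub-sentence replacement problem described above can be reproduced with a few lines of Python. The sketch below is a generic term-substitution routine, not the algorithm MultiTrans uses; the function name pretranslate and the sample term list are assumptions for illustration. It inserts each target term exactly as stored, so German nouns keep their capital letter, but it also shows why the result still needs manual cleanup: the surrounding grammar is left untouched.

```python
import re


def pretranslate(sentence, termbase):
    """Replace known source expressions with their stored target terms.

    Longer expressions are tried first, and the target term is inserted
    exactly as stored, so its capitalization is preserved.
    """
    if not termbase:
        return sentence
    # Longest first, so "power button" wins over "power".
    terms = sorted(termbase, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    lookup = {t.lower(): termbase[t] for t in terms}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], sentence)


# Hypothetical English-German term list (invented for this sketch).
SAMPLE_TERMS = {
    "power button": "Netzschalter",
    "power": "Strom",
    "front panel": "Vorderseite",
}

print(pretranslate("Press the power button on the front panel.", SAMPLE_TERMS))
```

The output mixes English function words with German nouns and carries no article or case agreement, which is the kind of fragment that has to be rearranged by hand afterwards.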

At present only translations of Word, WordPerfect and text documents are supported. PowerPoint support is currently in testing; commercial release is planned for the first quarter of 2003. In addition, as MultiTrans uses word processor plug-ins as the editing interface, you can translate most file types that MS Word and WordPerfect are able to open. However, this is absolutely not the case for HTML files opened in MS Word. Future versions of MultiTrans will offer an add-on for HTML files called WebTrans. This development is currently in beta and is being tested by some customers. Although it is only in beta, MultiCorpora provided me with this new module so that I could look at its functionality. Besides the interactive translation, all functions available in the TransTerm interface of MS Word are already implemented, enabling the user to open HTML pages in a WYSIWYG environment (Fig. 4). It is uncertain whether MultiCorpora will provide this module free of charge or as an optional tool at extra cost. According to company statements, another planned extension will support XML documents in a WYSIWYG environment if an XML display definition is provided. This Add-On is due in spring 2003 and will enable MultiTrans to also process other DTP formats that are XML-enabled, such as the latest versions of QuarkXPress and FrameMaker. Again, I could not get a statement on whether this Add-On will be included in the Pro version or offered at extra cost.


Fig. 4: TransCorpora translation of an MS Word HTML document using MultiTrans' new TransWeb module, currently in beta.
My problems with this innovative application started with the installation: undocumented macro and Visual Basic issues could only be solved by uninstalling and reinstalling MS Word. However, according to support, fewer than 1% of their clients have faced similar problems. It is therefore possible that my test system, which is quite stressed from frequent installations and uninstallations, caused these unusual problems. When using the otherwise convenient interactive translation mode, it was quite annoying that single translation steps could not be undone. You have to go through all of them and finally hit save or cancel, thus saving or discarding all steps. I was told that an improvement that solves this problem is on its way.

MultiTrans comes with an easy-to-follow tutorial in MS Word format that includes internal links pointing the user to the learning units. Unfortunately, all of these links were broken, but navigating through the chapters manually was not a big problem. More serious was that most links from within the application or the help file to MultiCorpora's web site were broken. This included links to the support page and the registration and upgrade download page, as well as to the non-existent FAQ page. I was told that this was due to a major upgrade of the corporate web site and should be fixed by now. Minor shortcomings are the lack of (alt) descriptions for buttons in the TransCorpora GUI, the missing history in the search field, and the time-consuming and tedious manual reselection of TransCorpora files every time you start the program. MultiTrans comes with printed documentation that falls far short of describing the vast options of the application: for example, there is no hint about the meaning of the text highlighting in different colors. The lack of an index makes it extremely difficult to find information in the printed manual.

In conclusion, MultiTrans is a very promising, innovative translation tool that combines the strengths of traditional TMs with the advantage of a full-text reference database indexed down to sub-sentence phrases and words (a full-text multilingual corpus).


System Requirements:
Windows 95, 98, NT 4.0, 2000, XP; Microsoft Word 97, 2000, XP (or other editing environment); Pentium 200 MHz; 32 MB RAM; 800x600 display; 20 MB free disk space; Internet access for online updates.

Company Information:
MultiCorpora R&D Inc.; CDTI de Hull, 490 St. Joseph Blvd., Suite 102, Hull, Quebec, Canada J8Y 3Y7; Phone: +1-819-778-7070 (North America Toll-free: 1-877-725-7070), Fax: +1-819-778-0801; E-mail: info@multicorpora.com; URL: http://www.multicorpora.com.



