English-Russian Dictionary
More and more users nowadays prefer electronic dictionaries to paper editions because of their
convenience and easier access to different kinds of linguistic data, especially in the case of large professional
dictionaries. This is why most of the popular paper dictionaries have their electronic versions.
Nevertheless, printed editions are still popular with certain categories of users and even perceived by
them as more trustworthy than purely electronic dictionaries. This paper describes the procedure of
making a print version of the electronic English-Russian dictionary Lingvo Universal and shows different
kinds of problems lexicographers dealt with at each stage of the project.
1. Introduction
In the year 1990 the first electronic English-Russian dictionary appeared in Russia. It was
called Lingvo and had the following functionality:
a convenient interface which enabled the user to see simultaneously the entry in the
chosen dictionary, its word list and the text being translated or read;
cross-reference links between the entries in the same dictionary;
advanced search for a word, a word form or a multiword expression (MWE) in most
zones of dictionary entries, the results of the search being shown for each zone of the
dictionary entry;
separate entries for semi-fixed phrases and collocations with unconventional
translations;
presentation of word forms for each of the one-word vocables, separate entries for
phrasal verbs and verbal syntagms (phrase syntax), etc.
At this point the company producing Lingvo launched its first lexicographic project and
became the first commercial business to produce its own English-Russian dictionary in
Russia. Now the company, ABBYY, which comprises a software company, a publishing house and a
translation agency, produces different kinds of dictionaries and carries out lexicographic
research. The latest version of Lingvo software incorporates dozens of dictionaries of different
languages, both bilingual and monolingual. Most of them are licensed electronic versions of
high-quality paper editions, but some of them are the fruit of the company’s own
lexicographic research, such as Lingvo Universal English-Russian Dictionary.
Despite being a closed country for seventy years, Russia has a long tradition of bilingual
lexicography and terminography based on the principles of linguistic science. There was a
whole range of high-quality bilingual dictionaries published in the USSR and in post-Soviet
Russia including both general and terminological editions, such as The Russian-English
Dictionary edited by A. Smirnitsky (first published in 1948), The Russian-English Dictionary
by I. Yermolovich (published in 2004), The Comprehensive English-Russian Dictionary
(Bolshoi anglo-russky slovar’, BARS) edited by I. Galperin (first published in 1972), The
New Comprehensive English-Russian Dictionary (Novy Bolshoi anglo-russky slovar’,
NBARS) edited by J. Apresjan (published in 1993), the latter being the most comprehensive
lexicographic description of the English and Russian vocabulary of the time. However, since
the advent of market economy, this long-established tradition has been under permanent
threat. The situation can be explained by the specific character of the dictionary publishing
industry in the USSR and in post-Soviet Russia. Unlike British publishing houses producing
Julia Anokhina
dictionaries, Russian publishers rarely if ever had any lexicographers on their staff.
Comprehensive dictionaries (bilingual and monolingual) were compiled by research teams
from universities and scientific institutions, bilingual dictionaries of specialist terminology
being compiled by individual authors. Individual authors and research teams signed contracts
with publishing houses that edited those dictionaries and published them. But since the fall of
the USSR, philology and linguistics have been, in general, neglected both by the previous
main sponsor, the government, and by private sponsors seeking immediate return for their
investment. It is quite logical that in such circumstances it has been almost impossible for a
research institution to create a large research team capable of producing a really new
comprehensive bilingual dictionary and thus start a costly and long-term dictionary project.
As for the publishing houses, they cannot afford such a project either; consequently the
situation is that comprehensive bilingual dictionaries appearing now in Russia are mostly
compilations from earlier dictionaries presented as ‘new revised editions’.
Authors writing about electronic lexicography often stress the fact that present-day electronic
dictionaries, including electronic editions of printed dictionaries, have richer content and
present it in a more convenient way than printed ones. See, e.g., Atkins and Rundell (2008:
239), Atkins and Varantola (2008: 371). However, this is not true for all electronic
dictionaries, many of which (especially electronic dictionaries published online without being
professionally edited) are of low quality, and their content cannot compete with the
content of quality printed dictionaries. This even led B.T.S. Atkins (2008: 31) to say that only
‘books are the focus of professional lexicography, and the dictionaries discussed, reviewed,
praised, or criticized are books’, not electronic editions. This is the reason why a lot of
experienced users of dictionaries (researchers, translators and teachers) prefer authoritative
‘paper’ dictionaries or electronic versions of such dictionaries to purely electronic ones. And
that was one of the reasons why, despite being a software company, ABBYY in 2005 decided
to publish its own printed dictionary based on the electronic Lingvo Universal English-
Russian Dictionary (Lingvo UERD). Producing a comprehensive printed dictionary was a
matter of prestige for ABBYY, but of course it was not the only reason for launching such a
costly project. The printed format has its advantages; first of all, it can reach customers who
rarely use computers (and, consequently, electronic dictionaries) or do not even own one.
Moreover, it is widely believed that information printed on paper is better apprehended, so a
printed dictionary is a good option for people learning a foreign language.
While making a printed dictionary from an electronic one is quite an unusual task, making a
paper dictionary from an electronic database has become the norm nowadays. Nevertheless,
many more articles have been published on making electronic versions of
printed dictionaries than on the problem of making a ‘paper’ dictionary from an electronic
one. See, e.g., Schmidt et al. (2008), Alegria et al. (2006), Haja et al. (2006), Atkins
and Varantola (2008), etc. This seems quite logical, taking into account the above-cited
opinion that electronic dictionaries compiled by professional lexicographers ‘all started life as
books’ (Atkins 2008: 31). However, it seems certain that future dictionaries will be mainly
Section 3. Reports on Lexicographical and Lexicological Projects
electronic, because electronic dictionaries save space and provide easier access to different kinds
of linguistic data, while electronic databases make teamwork among lexicographers more
efficient. Traditional printed dictionaries will sooner or later be transformed into the
kind of databases used to produce different kinds of dictionaries, including printed ones. The
current article describes a reverse transformation: that of an electronic dictionary
database into a multifunctional database and later on into a printed dictionary. As a
lexicographer involved in the project, I would like to report on it and to show the different kinds of
problems we had to deal with.
As already mentioned, Lingvo UERD was initially conceived as a purely electronic edition,
part of the Lingvo software. It was based on The English-Russian Dictionary by V. Müller,
one of the most popular English-Russian dictionaries of the past, which was completely revised
by ABBYY lexicographers and enriched with new examples from several sources: Internet resources
such as the corporate forum ‘Dobavim v Lingvo!’ (‘Let’s add it to Lingvo!’), where users of
Lingvo software post their suggestions; monolingual English dictionaries; classical and
modern literature; texts related to different professional fields; and, later on, the in-house
linguistic corpora of ABBYY comprising 200 million words. It was designed as a
comprehensive English-Russian dictionary for professional users and advanced learners. The
first printed edition of the dictionary in two volumes was initiated in 2005, and the dictionary
was published in 2007 by the publishing house Russky yazyk under the title ABBYY Lingvo
Comprehensive English-Russian Dictionary (Bolshoi anglo-russky slovar’ ABBYY Lingvo,
hereafter – BARS ABBYY Lingvo). The second edition in one volume was initiated in 2008 by
the publishing house ABBYY Press and is currently in press. The dictionary entries were
edited in the in-house dictionary writing system (DWS) ABBYY Lingvo Content.
As the dictionary database grew further, it was transformed into a multifunctional database,
used to produce different kinds of dictionaries, Lingvo UERD being just one of its possible
products. That database is used, for example, to produce such dictionaries as Lingvo
Universal First Step (an abridged version of the dictionary for mobile platforms), Lingvo
UERD for schoolchildren (a more compact edition of the dictionary with fewer and shorter
examples and without certain groups of lexemes, such as taboo and obscene vocabulary), as
well as different printed dictionaries, BARS ABBYY Lingvo being the first printed edition
based on its content.
While preparing the ground for the project we faced a whole range of problems which can be
grouped into two categories and could be of interest to other lexicographers involved in
similar projects.
4.1. Problems related to different access to electronic vs. printed dictionary data and to
different user tasks while accessing them
4.1.1. Data redundancy and data repetition in Lingvo UERD vs. space saving in printed
dictionaries.
Unlike such electronic dictionaries as Oxford Advanced Learners’ English Dictionary,
Macmillan English Dictionary and some others, Lingvo does not show a page of a printed
dictionary reproduced electronically; it shows an entry. It should be noted, however, that
Lingvo UERD is not the only dictionary available in the Lingvo software. The latest version of the software,
ABBYY Lingvo x3, comprises 157 dictionaries for 11 languages. So, when entering a word, the
user can see several entries from different dictionaries of a given language or a language pair
at a time, but only one entry per dictionary:
Figure 1. A view of the ABBYY Lingvo x3 application window with marked-up entries from four
dictionaries: Lingvo UERD, RadioElectronics English-Russian Dictionary, LingvoEconomics English-Russian
Dictionary and Collins Cobuild Advanced Learner’s English Dictionary.
As a result, we have to repeat many kinds of information for the user’s convenience, e.g.
variant spellings, irregular forms. Such forms have their own entries referring to the main
entry (e.g. color referring to colour) and they are also repeated in the main entry. Thus the
user gets information about both forms regardless of the dictionary entry he/she accessed first
(e.g. color or colour). Although convenient for the user of the electronic dictionary, such a
presentation of linguistic data is a problem in the printed version, because it takes up
additional space and consequently increases the size of the printed dictionary. The
problem was solved by means of special tags that excluded parts of information presented in
the electronic dictionary from its printed edition. The same tags are also used in the dictionary
database to abridge longer translation comments and explicative translations (glosses) for the
needs of the printed dictionary and to exclude long example phrases (usually literary
quotations) from the entries of BARS ABBYY Lingvo.
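The tag-based exclusion described above can be illustrated with a minimal sketch in Python. It is not the actual ABBYY Lingvo Content implementation: the tag pair `[noprint]...[/noprint]`, the function names and the sample entry are all assumptions made for illustration, since the text does not give the real tag names.

```python
import re

# Hypothetical tag pair marking material that stays in the electronic
# dictionary but is dropped from the printed edition (an assumption;
# the real DWS tag names are not given in the text).
NOPRINT_SPAN = re.compile(r"\[noprint\].*?\[/noprint\]", re.DOTALL)
NOPRINT_TAG = re.compile(r"\[/?noprint\]")


def for_print(entry: str) -> str:
    """Remove every [noprint]...[/noprint] span and tidy the spacing."""
    stripped = NOPRINT_SPAN.sub("", entry)
    return re.sub(r"[ \t]{2,}", " ", stripped).strip()


def for_screen(entry: str) -> str:
    """Keep all content; only the tags themselves are removed."""
    return NOPRINT_TAG.sub("", entry)


# Invented sample entry: the variant spelling is repeated on screen only.
entry = "colour [noprint](also: color) [/noprint]n. цвет"
print(for_print(entry))
print(for_screen(entry))
```

With one tagged source, the same entry yields a compact print form and a fuller on-screen form, which is the space-saving effect the paragraph describes.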
4.1.2. The different structure of word lists in Lingvo UERD and in printed dictionaries
(a) In printed bilingual dictionaries, as in some monolinguals, proper names (geographical and
personal) are usually placed in the appendix. That is not the case with Lingvo UERD and with
its electronic database where they are all included in the general word list. It should be
mentioned, however, that the dictionary includes only those English personal names that can present
some difficulty in translation or be of interest to a Russian-speaking user. For example,
Mary is used in English (1) as a popular personal name, (2) as the name of various biblical
characters, (3) as the name of the Virgin Mary. As a popular personal name it is often
transliterated into Russian as Мэри, but as the name of biblical characters and the Virgin
Mary it is replaced by the Russian form Мария (Maria). That is why this personal name
was included in the dictionary. When dealing with the problem we decided to keep this
feature of the electronic dictionary in its printed version, i.e. to keep such entries in the main
word list.
(b) The Lingvo UERD word list also includes some fixed and semi-fixed phrases, phrasal
idioms and even collocations with unconventional translations. Such MWEs have their own
entries in the DWS database, but unlike compounds that are generally presented as entries
both in bilingual and in monolingual dictionaries, they do not have any part of speech (POS)
labels, and the link to such an entry is placed in the appropriate zone of the core word. The
core word entry of the electronic dictionary is thereby a hypertext, so transforming it into a
printed entry entails inevitable losses.
Figure 2. The entry city in BARS ABBYY Lingvo 2007 with capital city as a word combination.
Using this export algorithm we managed to exclude such MWEs from the word list of the
printed dictionary and to transform them into usage examples. Although such a presentation
entails some loss, we had to accept it because of the strict space limits of the printed
edition.
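The transformation of MWE sub-entries into usage examples can be sketched as follows. This is only an illustration under stated assumptions, not the export algorithm used in the project: the `Entry` class, its `core_word` field and the function `fold_mwes` are hypothetical names, and the sample data follow the city / capital city example of Figure 2.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Entry:
    headword: str
    translation: str
    # Hypothetical field: set only for MWE sub-entries, naming their core word.
    core_word: Optional[str] = None
    examples: List[Tuple[str, str]] = field(default_factory=list)


def fold_mwes(entries: List[Entry]) -> List[Entry]:
    """Build the print word list: each MWE sub-entry becomes a usage
    example of its core word and disappears as a separate headword.
    Note that the core word entries are modified in place."""
    core = {e.headword: e for e in entries if e.core_word is None}
    for e in entries:
        if e.core_word is not None and e.core_word in core:
            core[e.core_word].examples.append((e.headword, e.translation))
    # An MWE whose core word is missing stays in the word list untouched.
    return [e for e in entries
            if e.core_word is None or e.core_word not in core]


entries = [
    Entry("city", "город"),
    Entry("capital city", "столица", core_word="city"),
]
printed = fold_mwes(entries)
print([e.headword for e in printed])
print(printed[0].examples)
```

The hyperlink between the two electronic entries is lost in the result; only the example pair survives inside the core word entry, which is the loss the text refers to.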
4.1.3. Different means of visualization of labels, abbreviations, entry zones etc. in Lingvo
UERD and in printed dictionaries
Space limits are not as tough in the case of an electronic dictionary as they are for a printed
edition, so instead of labels full word forms can be used, and there are no tildes. Such
presentation of linguistic data is believed to be more user-friendly and has been chosen, e.g. in
the electronic versions of The Longman Dictionary of Contemporary English and Oxford
Advanced Learners’ Dictionary.
(a) Labels in Lingvo UERD are not replaced by full word forms, but they are different from
those used in bilingual printed dictionaries published in Russia:
POS and grammar labels are abbreviated forms of the Russian words, not international
Latin symbols, as in printed and many electronic dictionaries;
register, sphere and style labels are shorter than those used in printed dictionaries
because in Lingvo they are provided with pop-ups showing full forms of abbreviations.
(b) The tilde is not used in Lingvo UERD and markers of entry zones are different from those
used in printed dictionaries: e.g., the zone of idioms is usually marked by a rhombus in
printed dictionaries, and in Lingvo UERD idioms are placed after bold double dots.
(c) Some types of information (e.g. different capitalization of the headword in one of its
meanings, defectiveness of its paradigm in one of the meanings etc.) are also presented in
Lingvo UERD in a way different from printed dictionaries, at least those printed in Russia.
The problems listed above were solved by means of export adjustments. New export
algorithms applicable to different kinds of dictionaries were added to the latest version of the
DWS ABBYY Lingvo Content.
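Two of the export adjustments discussed in this section, replacing on-screen labels with print labels and introducing the tilde, can be sketched in Python as follows. The label mapping and the function names are hypothetical; the text does not list the actual label sets or describe how the DWS performs the substitution.

```python
import re

# Hypothetical mapping from Russian on-screen labels to Latin print labels;
# the actual label inventories are not given in the text.
PRINT_LABELS = {"сущ.": "n", "гл.": "v", "прил.": "a"}


def relabel(label: str) -> str:
    """Swap an electronic label for its print counterpart, if one is known."""
    return PRINT_LABELS.get(label, label)


def add_tildes(headword: str, example: str) -> str:
    """Replace whole-word occurrences of the headword in an example
    with a tilde, as is customary in printed dictionaries."""
    pattern = re.compile(rf"\b{re.escape(headword)}\b")
    return pattern.sub("~", example)


print(relabel("сущ."))
print(add_tildes("city", "capital city"))
print(add_tildes("cap", "cap in hand; a capital idea"))
```

The word-boundary match keeps the headword intact inside longer words, so "capital" is not shortened when the headword is "cap". Inflected forms of the headword would need separate handling, which this sketch does not attempt.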
4.2. Problems related to the specific character of the Lingvo software format, i.e. specific
problems of the dictionary
The problems discussed above are more or less common to different kinds of electronic vs.
printed dictionaries. Those listed below are specific to the Lingvo software format.
4.2.1. Different types of the entry structure in Lingvo UERD and in printed dictionaries
(a) The entry structure in Lingvo UERD has specific features, such as a specific way of
presenting lexical and grammatical homonymy, lexical senses and sub-senses. Thus, unlike
printed comprehensive academic dictionaries published in Russia, Lingvo UERD does not use
superscripts to mark lexical homonyms; it marks them with Roman numerals, which in
traditional printed dictionaries often mark grammatical
homonyms. In order to fit the entry structure of Lingvo UERD to the usual format of a printed
dictionary and not to confuse the Russian user accustomed to another structure of dictionary
entries, we had to record all the differences between the electronic dictionary and the ‘paper’
format and to improve the interface of the DWS. Its latest version enables the user to
choose markers for each level of a dictionary entry, be it a Roman or Arabic numeral, a
superscript etc., before exporting dictionary data into RTF or any other chosen format.
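The idea of choosing a marker for each level of an entry before export can be sketched as a small configuration table. The profile names, level names and marker styles below are assumptions made for illustration. Only two points come from the text: Lingvo UERD marks lexical homonyms with Roman numerals, and traditional printed dictionaries use superscripts for lexical homonyms and often Roman numerals for grammatical ones.

```python
def to_roman(n: int) -> str:
    """Convert a positive integer (1..3999) to a Roman numeral."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


SUPERSCRIPTS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")

# Marker styles a user could pick for any level of the entry.
MARKERS = {
    "roman": lambda n: f"{to_roman(n)}.",
    "arabic": lambda n: f"{n}.",
    "paren": lambda n: f"{n})",
    "superscript": lambda n: str(n).translate(SUPERSCRIPTS),
}

# Two hypothetical export profiles mapping entry level to marker style.
SCREEN = {"lexical_homonym": "roman",
          "grammatical_homonym": "arabic",
          "sense": "paren"}
PRINT = {"lexical_homonym": "superscript",
         "grammatical_homonym": "roman",
         "sense": "arabic"}


def marker(profile: dict, level: str, n: int) -> str:
    """Render the n-th marker of a given level under a given profile."""
    return MARKERS[profile[level]](n)


print(marker(SCREEN, "lexical_homonym", 2))
print(marker(PRINT, "lexical_homonym", 2))
print(marker(PRINT, "grammatical_homonym", 4))
```

Keeping the numbering in the data and the marker style in the profile means that the same entry can be exported with either convention without being edited.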
(b) Besides the presentation of lexical and grammatical homonymy, lexical senses and
sub-senses, Lingvo UERD has some other specific traits, such as a different set of entry zones in
comparison with printed comprehensive dictionaries, at least those printed in Russia, and a different
mark-up of the zones. There are also some kinds of information included in this electronic
dictionary but absent in printed editions, e.g. sound files demonstrating the pronunciation of a
headword by a native speaker, links to other dictionaries in the same dictionary software, and
web-links.
1 The names of entry zones listed in this section are terms used in the DWS ABBYY Lingvo Content for the
convenience of the users. Thus, classifier zone contains information that distinguishes a vocable, a sense or a
sub-sense in question from others, and unrelated word combinations zone contains idiomatic expressions that are
not connected with any of the core word senses and, unlike ‘ordinary’ idioms, have no fixed canonical form.
zone, the latter being placed before the zone of idioms, at the end of a dictionary
entry;
zone of idioms. This zone comprises active links leading to phrasal idioms that are
sub-entries in the dictionary:
[expr]to set one\'s cap at / for smb.--[trn]задумать женить кого-л. на себе, иметь виды на кого-л.[/trn][/expr]
[id]<<cap in hand>>[/id]
Figure 3. A view of two respective entry zones (that of unrelated word combinations in tags [expr] and idioms in
tags [id]) in the DWS ABBYY Lingvo Content and in Lingvo UERD in the entry cap (noun).
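The mark-up shown in Figure 3 is regular enough to be read mechanically. The following Python sketch extracts the two zones from such a fragment; it assumes that the tags always appear exactly as in the figure (`[expr]`, `--`, `[trn]`, `[id]`, `<<...>>`), which the text does not confirm for the whole database. The function name is hypothetical, and the escaped apostrophe of the original fragment is written as a plain apostrophe in the sample.

```python
import re

# Patterns modelled on the fragment in Figure 3 only.
EXPR = re.compile(r"\[expr\](.*?)--\[trn\](.*?)\[/trn\]\[/expr\]", re.DOTALL)
IDIOM_LINK = re.compile(r"\[id\]<<(.*?)>>\[/id\]")


def parse_zones(source: str):
    """Return (unrelated word combinations, idiom link targets)."""
    expressions = [(src.strip(), trn.strip())
                   for src, trn in EXPR.findall(source)]
    idiom_links = IDIOM_LINK.findall(source)
    return expressions, idiom_links


sample = (
    "[expr]to set one's cap at / for smb.--[trn]задумать женить кого-л. "
    "на себе, иметь виды на кого-л.[/trn][/expr]\n"
    "[id]<<cap in hand>>[/id]"
)
expressions, idiom_links = parse_zones(sample)
print(expressions)
print(idiom_links)
```

The first zone yields a phrase with its translation, whereas the second yields only a link target, because the idiom itself lives in a separate sub-entry.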
As distinct from Lingvo UERD, printed bilingual comprehensive dictionaries have fewer
zones. Thus, for example, they do not have any specific zone for proverbs and idioms with no
fixed canonical form; most of them (or at least those published in Russia) do not have zones
of synonyms and antonyms either. As for collocations, fixed and semi-fixed expressions, they
are quite often assigned the status of usage examples. At the same time, more attention is paid
to the sequence of usage examples in the appropriate zone, MWEs being presented in
alphabetical order, word combinations preceding the usage examples in the form of full
sentences and literature quotations.
In order to convert the microstructure of Lingvo UERD into the format acceptable for printed
dictionaries we had to inventory all kinds of differences between the formats and to adjust the
export.
This feature is very convenient for the user of the electronic dictionary, but not of the printed
one. In the printed version sub-entries are normally transformed into simple usage examples
or idioms within the core word entry. Meanwhile there are a lot of links in the electronic
database (preceded by the label см. тж. – see also) referring to such sub-entries. One can
imagine a user of the printed dictionary searching through the core word entry to find the
MWE he/she was referred to, especially if the entry is long. In order not to inconvenience the
users of the printed edition, we had to exclude all such reference links from it.
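The removal of such reference links can be sketched as below. The link syntax (the label см. тж. followed by a single `<<target>>`) is an assumption based on the link notation seen in Figure 3, and the function name and sample text are invented for illustration.

```python
import re

# Assumed link syntax: the label "см. тж." followed by one <<target>>.
SEE_ALSO = re.compile(r"\s*см\. тж\. <<(.*?)>>")


def drop_subentry_links(text: str, folded: set) -> str:
    """Remove see-also links whose target is no longer a separate
    headword in the printed edition; keep all other links."""
    def replace(match):
        return "" if match.group(1) in folded else match.group(0)
    return SEE_ALSO.sub(replace, text)


# Invented sample: one link points to a folded sub-entry, one does not.
text = "шапка, кепка см. тж. <<cap in hand>> см. тж. <<hat>>"
folded = {"cap in hand"}
print(drop_subentry_links(text, folded))
```

Only links to sub-entries that were folded into a core word entry are dropped; a reference to an ordinary headword remains useful in print and is left in place.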
As was shown above, most of the problems we had to deal with in the framework of the
project were solved by means of export algorithms or improvements made to the DWS
interface. However, the success of exporting into another format is only ensured by maximum
formalization of the dictionary data. But as there were some inconsistencies in Lingvo UERD
1) editing and revising the dictionary content in the DWS, according to the style guides.
This part of the job was done by ABBYY lexicographers. The aim of this stage was to
formalize the dictionary content;
2) transfer or export of the dictionary content into RTF. This part of the job was done
by means of the DWS ABBYY Lingvo Content with the assistance of ABBYY programmers;
3) checking of the resulting files and correction of the export mistakes by the editors of the
publishing house.
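One kind of formalization check that could support the first stage is a test for well-nested mark-up, since an unclosed tag is likely to break any automatic export. The sketch below is an assumption about what such a check might look like, not a description of the checks actually run in the project; it handles only simple lowercase tags of the form seen in Figure 3.

```python
import re

# Matches simple tags such as [expr], [/expr], [trn], [/trn].
TAG = re.compile(r"\[(/?)([a-z]+)\]")


def unbalanced_tags(entry: str) -> list:
    """Return a list of problems; an empty list means well-nested tags."""
    problems, stack = [], []
    for closing, name in TAG.findall(entry):
        if not closing:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            problems.append(f"unexpected [/{name}]")
    problems.extend(f"unclosed [{name}]" for name in reversed(stack))
    return problems


print(unbalanced_tags("[expr]a--[trn]b[/trn][/expr]"))
print(unbalanced_tags("[expr]a--[trn]b[/expr]"))
```

Entries that produce a non-empty list can be routed back to the lexicographers before export, so that fewer errors reach the editors at the third stage.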
5. Conclusion
Present-day dictionary databases tend to include as much linguistic data as possible in order to
be used as a basis for different kinds of dictionaries, including printed editions. As an
electronic database is in fact a big hypertext comprising multiple links and different kinds of
specific data which cannot be exported to the ‘paper’ format, making a paper dictionary from
such a database may be quite a challenging task. Working hand in hand with the publishing
house editors enabled us to minimize the inevitable losses resulting from such a procedure.
The other result of this work was the creation of a printed dictionary more in line with the
needs of modern users, enriched with colloquial vocabulary, computer and networking
vocabulary, and cultural information, presented in a more convenient and user-friendly way.
Bibliography
Dictionaries
ABBYY Lingvo Comprehensive English-Russian Dictionary. Moscow: Russky yazik media. 2007.
[Bolshoy anglo-russky slovar’ ABBYY Lingvo]
Apresjan J.D. (ed.). New English-Russian Comprehensive Dictionary. Moscow: Russky yazik. 2003.
[Novy bolshoy anglo-russky slovar’]
Electronic dictionaries
Collins Cobuild Advanced Learner’s English Dictionary. (2008). [cd-rom, in ABBYY Lingvo x3].
Glasgow: HarperCollins Publishers.
Longman Dictionary of Contemporary English. (2005). [cd-rom]. Harlow: Pearson Longman.
Macmillan English Dictionary for Advanced Learners. (2006). [cd-rom]. Oxford: Macmillan
Publishers.
Oxford Advanced Learners’ English Dictionary. (2005). [cd-rom]. Oxford: Oxford University Press.
Oxford Dictionary of English, Revised Edition. (2005). [cd-rom, in ABBYY Lingvo x3]. Oxford:
Oxford University Press.
Other literature
Alegria, I. et al. (2006). ‘Building an Electronic Version of the Cuban Basic School Dictionary’. In
Proceedings of 12th EURALEX International Congress. Turin. Vol. 1. 243-250.
Apresjan, J.D. (2003). ‘Lexicographic Conception of the New English Russian Comprehensive Dictionary’. In
Apresjan, J.D. (ed.). New English Russian Comprehensive Dictionary. Moscow: Russky yazik.
V. 1, 6-17. [Leksikograficheskaya kontseptsiya Novogo bol’chogo anglo-russkogo slovar’a]
Apresjan, J.D. (2008). ‘Principles of Systematic Lexicography’. In Fontenelle, T. (ed.). Practical
Lexicography. A Reader. Oxford: Oxford University Press. 51-60.
Atkins, B.T.S., Rundell, M. (2008). The Oxford Guide to Practical Lexicography. Oxford: Oxford
University Press.
Atkins, B.T.S. (2008). ‘Theoretical Lexicography and its Relation to Dictionary-making’. In
Fontenelle, T. (ed.). Practical Lexicography. A Reader. Oxford: Oxford University Press. 31-50.
Atkins, B.T.S., Varantola, K. (2008). ‘Monitoring Dictionary Use’. In Fontenelle, T. (ed.). Practical
Lexicography. A Reader. Oxford: Oxford University Press. 337-371.
Haja, G. et al. (2006). ‘The Dictionary of Romanian Language: Steps Toward the Electronic Version’.
In Proceedings of 12th EURALEX International Congress. Turin. Vol. 1. 417-424.
Heid, U., Gouws, R.H. (2006). ‘A Model for a Multifunctional Dictionary of Collocations’. In
Proceedings of 12th EURALEX International Congress. Turin. Vol. 2. 979-988.
Schmidt, T., Geyken, A., Storrer, A. (2008). ‘Refining and Exploiting the Structural Markup of the
eWDG’. In Proceedings of 13th EURALEX International Congress. Barcelona. 469-481.
Spohr, D. (2008). ‘Requirements for the Design of Electronic Dictionaries and a Proposal for their
Formalisation’. In Proceedings of 13th EURALEX International Congress. Barcelona. 617-629.