This directory contains files for Technical Report CSRI-399.

The files in this directory are the following:

1) README       (3    KB - the ASCII file you are now reading)
2) tr-399.ps    (1656 KB - PostScript)
3) tr-399.ps.gz (479  KB - PostScript compressed with the program "gzip")

If you have the UNIX "gunzip" program, get the file tr-399.ps.gz.
Remember to transfer the file in binary mode. After the transfer, 
"gunzip" the file.

If you do not have the UNIX "gunzip" program, get the file tr-399.ps
in ASCII mode.

After transferring the file, print it on a PostScript printer.

If you have any questions or comments about this technical report, please
contact pedmonds@cs.toronto.edu or gh@cs.toronto.edu.

---------------------------------------------------------------------------

SEMANTIC REPRESENTATIONS OF NEAR-SYNONYMS FOR AUTOMATIC LEXICAL CHOICE

                            by Philip Edmonds

                                ABSTRACT


We develop a new computational model for representing the fine-grained
meanings of near-synonyms and the differences between them.  We also
develop a sophisticated lexical-choice process that can decide which
of several near-synonyms is most appropriate in any particular
context.  This research has direct applications in machine translation
and text generation, and also in intelligent electronic dictionaries
and automated style-checking and document editing.

We first identify the problems of representing near-synonyms in a
computational lexicon and show that no previous model adequately
accounts for near-synonymy.  We then propose a preliminary theory to
account for near-synonymy in which the meaning of a word arises out of
a context-dependent combination of a context-independent core meaning
and a set of explicit differences to its near-synonyms.  That is,
near-synonyms cluster together.

After considering a statistical model and its weaknesses, we develop a
clustered model of lexical knowledge, based on the conventional
ontological model.  The model cuts off the ontology at a coarse grain,
thus avoiding an awkward proliferation of language-dependent concepts
in the ontology, and groups near-synonyms into subconceptual clusters
that are linked to the ontology.  A cluster acts as a formal usage
note that differentiates near-synonyms in terms of
fine-grained aspects of denotation, implication, expressed attitude,
and style.  The model is general enough to account for other types of
variation, for instance, in collocational behaviour.

We formalize various criteria for lexical choice as preferences
to express certain concepts with varying indirectness, to express
attitudes, and to establish certain styles.  The lexical-choice
process chooses the near-synonym that best satisfies the most
preferences.  The process uses an approximate-matching algorithm that
determines how well the set of lexical distinctions of each
near-synonym in a cluster matches a set of input preferences.
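
As a rough illustration of the approximate-matching idea described
above (not the report's actual algorithm), the following Python sketch
scores each near-synonym in a cluster against a set of input
preferences and picks the highest scorer.  The cluster contents, the
dimension names, the numeric values, and the scoring rule are all
assumptions made up for this example.

```python
# Illustrative sketch only: the data and the scoring rule are assumptions,
# not taken from the technical report.

def match_score(distinctions, preferences):
    """Sum, over the preferences, how closely a word's value on each
    dimension matches the preferred value (1.0 = exact match; a
    dimension the word does not specify contributes 0.0)."""
    score = 0.0
    for dimension, wanted in preferences.items():
        if dimension in distinctions:
            # Values lie in [0, 1]; closeness is 1 minus the absolute gap.
            score += 1.0 - abs(distinctions[dimension] - wanted)
    return score


def choose_word(cluster, preferences):
    """Return the near-synonym whose distinctions best match the preferences."""
    return max(cluster, key=lambda word: match_score(cluster[word], preferences))


# Hypothetical cluster: each word maps dimensions to values in [0, 1].
cluster = {
    "error":   {"formality": 0.5, "blameworthiness": 0.5},
    "blunder": {"formality": 0.3, "blameworthiness": 0.9},
    "slip":    {"formality": 0.3, "blameworthiness": 0.1},
}

# Hypothetical input: an informal word that strongly implies blame.
preferences = {"formality": 0.3, "blameworthiness": 1.0}
print(choose_word(cluster, preferences))
```

With these made-up values, "blunder" scores highest because it matches
the preferred formality exactly and comes closest on blameworthiness.
A fuller treatment would presumably weight preferences differently and
handle structured distinctions rather than single numbers.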

We implemented the lexical-choice process in a prototype
sentence-planning system.  We evaluate the system to show that it can
make the appropriate word choices when given a set of preferences.