The file uttr366.ps contains the CSRI technical report:

    Automatically generating hypertext by computing semantic similarity
    by Stephen J. Green

If you have the UNIX gunzip program, get the uttr366.ps.gz file. Remember to transfer the file in binary mode. Uncompress it, and print it on a PostScript printer.

If you do not have gunzip, get the uttr366.ps file in ASCII mode, and print it on a PostScript printer.

This report was made available for anonymous ftp by: sjgreen@cs.utoronto.ca

Abstract
--------

We describe a novel method for automatically generating hypertext links within and between newspaper articles. The method is based on lexical chaining, a technique for extracting the sets of related words that occur in texts. Links between the paragraphs of a single article are built by considering the distribution of the lexical chains in that article. Links between articles are built by considering how the chains in the two articles are related. By using lexical chaining we mitigate the problems of synonymy and polysemy that plague traditional information retrieval approaches to automatic hypertext generation.

In order to motivate our research, we discuss the results of a study that shows that humans are inconsistent when assigning hypertext links within newspaper articles. Even if humans were consistent, the time needed to build a large hypertext and the costs associated with the production of such a hypertext make relying on human linkers an untenable decision. Thus we are left with automatic hypertext generation.

Because we wish to determine how our hypertext generation methodology performs when compared to other proposed methodologies, we present a study comparing the hypertext linking methodology that we propose with a methodology based on a traditional information retrieval approach. In this study, subjects were asked to perform a question-answering task using a combination of links generated by our methodology and the competing methodology.
We show combined results for all subjects tested, along with results based on subjects' experience in using the World Wide Web.

We detail the construction of a system for performing automatic hypertext generation in the context of an online newspaper. The proposed system can handle large databases of news articles efficiently.
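
Illustration
------------

The abstract says that links between the paragraphs of an article are built from the distribution of lexical chains in that article. The Python fragment below is only a rough sketch of that idea and is not the implementation described in the report. Everything in it is an assumption made for illustration: the toy word-to-chain table TOY_CHAINS, the density measure, the use of cosine similarity, and the 0.5 threshold.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy table mapping words to a chain label. A real lexical
# chainer would build chains from a lexical resource; this is an assumption.
TOY_CHAINS = {
    "court": "law", "judge": "law", "trial": "law", "verdict": "law",
    "bank": "finance", "loan": "finance", "money": "finance", "debt": "finance",
}


def chain_density(paragraph):
    """Map each chain label to the fraction of the paragraph's words in it."""
    words = [w.strip(".,;:!?").lower() for w in paragraph.split()]
    words = [w for w in words if w]
    if not words:
        return {}
    counts = Counter(TOY_CHAINS[w] for w in words if w in TOY_CHAINS)
    return {chain: n / len(words) for chain, n in counts.items()}


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_paragraphs(paragraphs, threshold=0.5):
    """Return index pairs (i, j), i < j, whose chain densities are similar."""
    vectors = [chain_density(p) for p in paragraphs]
    links = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                links.append((i, j))
    return links


if __name__ == "__main__":
    article = [
        "The judge opened the trial.",
        "The bank approved the loan.",
        "The court read the verdict.",
    ]
    print(link_paragraphs(article))
```

In this sketch the first and third paragraphs are linked even though they share no content word, because "judge", "trial", "court" and "verdict" fall into the same chain. That is the sense in which chaining is meant to help with synonymy.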