Research on N-Grams in Information Retrieval Systems
Publications
-
E. Adams.
A Study of Trigrams and Their Feasibility as Index Terms in a
Full Text Information Retrieval System.
Ph.D. Thesis, George Washington University, Dept. of Computer Science, 1991.
-
R. C. Angell, G. E. Freund and P. Willett.
``Automatic spelling correction using a trigram similarity measure.''
Information Processing and Management 19(4):255-261. 1983.
-
J. E. Burnett, D. Cooper, M. F. Lynch, P. Willett and M. Wycherley.
``Document retrieval experiments using indexing vocabularies of varying size.
I. Variety generation symbols assigned to the fronts of index terms.''
Journal of Documentation 35(3):197-206. September 1979.
-
W. B. Cavnar.
``N-gram-based text filtering for TREC-2.''
Proceedings of TREC-2: Text Retrieval Conference 2,
Donna Harman, ed.
National Bureau of Standards, August 1993.
-
W. Cavnar.
``Using an n-gram-based document representation with a vector processing retrieval model.''
In Proceedings of TREC 3.
-
Kenneth Ward Church.
``Ngrams.''
On-line Proceedings, 33rd Annual Meeting of the Association for Computational Linguistics.
-
Jonathan D. Cohen.
``Highlights: Language- and domain-independent automatic indexing terms for abstracting.''
Journal of the American Society for Information Science 46(3):162-174, April 1995.
-
Robin Collier.
``N-gram cluster identification during empirical knowledge representation generation.''
?????, Department of Computer Science, University of Sheffield.
-
Fatih Mehmet Comlekoglu.
Optimizing a Text Retrieval System Utilizing N-gram Indexing.
Ph.D. Thesis, The George Washington University, May 13, 1990.
-
Grace Crowder and Charles Nicholas.
``An approach to large scale distributed information systems using statistical properties
of text to guide agent search.''
Proceedings of the CIKM Workshop on Intelligent Information Agents, Tim Finin and James Mayfield, eds.
Held in conjunction with the Fourth International Conference on Information and Knowledge Management (CIKM '95),
Baltimore MD, December 1995.
-
T. de Heer.
``The application of the concept of homeosemy to natural language information retrieval.''
Information Processing & Management 18(5):229-236. 1982.
-
S. Huffman and M. Damashek.
``Acquaintance: A novel vector-space n-gram technique for document categorization.''
Proceedings of TREC 3.
-
J. J. Hull and S. N. Srihari.
``Experiments in text recognition with binary n-gram and Viterbi algorithms.''
IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-4(5):520-530. September 1982.
-
Jong Yong Kim and John Shawe-Taylor.
``Fast string matching using an n-gram algorithm.''
Software--Practice and Experience 24(1):79-88. January 1994.
-
C. P. Mah and R. J. D'Amore.
DISCIPLE Final Report.
PAR Report #83-121, PAR Technology Corporation, New Hartford, NY.
28 October 1983.
-
C. K. McElwain and M. B. Evens.
``The Degarbler--A program for correcting machine-read Morse code.''
Information and Control 5:368-384. 1962.
-
Olumide Owolabi.
``Efficient pattern searching over large dictionaries.''
Information Processing Letters 47:17-21. 1993.
-
Claudia Pearce.
A Dynamic Hypertext Environment Through n-gram Analysis.
Ph.D. thesis, University of Maryland Baltimore County, 1994.
-
Claudia Pearce and Charles Nicholas.
``Using n-gram analysis in dynamic
hypertext environments.''
In Proceedings of the Second
International Conference on Information and Knowledge Management (CIKM
'93), November 1-5 1993. (This paper was also released as UMBC
technical report CS-93-10.)
-
J. C. Scholtes.
``Unsupervised learning and the information retrieval problem.''
In Proceedings of the IEEE International Joint Conference on Neural Networks, volume 1, pp. 95-100.
-
E. J. Schuegraf and H. S. Heaps.
``Selection of equifrequent word fragments for information retrieval.''
Information Storage and Retrieval 9:697-711. 1973.
-
C. Y. Suen.
``N-gram statistics for natural language understanding and text processing.''
IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-1(2):164-172. April 1979.
-
J. R. Ullmann.
``Binary n-gram technique for automatic correction of substitution, deletion, insertion, and reversal errors in words.''
Computer Journal 20(2):141-147. May 1977.
-
P. Willett.
``Document retrieval experiments using indexing vocabularies of varying size.
II. Hashing, truncation, digram and trigram encoding of index terms.''
Journal of Documentation 35(4):296-305. December 1979.
-
Janusz L. Wisniewski.
``Effective Text Compression with simultaneous digram and trigram encoding.''
Journal of Information Science: Principles & Practice, 13(3):159-164, 1987.
-
E. M. Zamora, J. J. Pollock and Antonio Zamora.
``The use of trigram analysis for spelling error detection.''
Information Processing and Management 17(6):305-316. 1981.
Patents
From the IBM Patent Server, 9 January 1997.
- 5467425
Building scalable N-gram language models using maximum likelihood maximum
entropy N-gram models
- 5452442
Methods and apparatus for evaluating and extracting signatures of computer
viruses and other undesirable software entities
- 5440723
Automatic immune system for computers and computer networks
- 5444617
Method and apparatus for adaptively generating field of application
dependent language models for use in intelligent systems
- 5502791
Speech recognition by concatenating fenonic allophone hidden Markov models
in parallel among subwords
- 5418951
Method of retrieving documents that concern the same topic
- 5510981
Language translation apparatus and method using context-based translation
models
- 5448474
Method for isolation of Chinese words from connected Chinese text
Other Relevant Links
Miscellaneous
To add a link to this page, please send email to Jim Mayfield (james.mayfield@jhuapl.edu).
JCM 1/9/97