Text mining at the term level

Ronen Feldman, Moshe Gresko, Yakkov Kinar, Yehuda Lindell, Oren Liphstat, Martin Rajman, Yonatan Schler, Oren Zamir

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

87 Scopus citations

Abstract

Knowledge Discovery in Databases (KDD) focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns within them. While most work on KDD has been concerned with structured databases, there has been little work on handling the huge amount of information that is available only in unstructured textual form. Previous work in text mining has focused on the word or tag level. This paper presents an approach to performing text mining at the term level. The mining process starts by preprocessing the document collection and extracting terms from the documents. Each document is then represented by a set of terms and annotations characterizing the document. Terms and additional higher-level entities are then organized in a hierarchical taxonomy. In this paper we describe the Term Extraction module of the Document Explorer system and provide an experimental evaluation performed on a set of 52,000 documents published by Reuters in the years 1995-1996.
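The pipeline outlined in the abstract (extract terms, represent each document as a set of terms, attach higher-level entities from a taxonomy) can be illustrated with a minimal sketch. This is not the Term Extraction module of Document Explorer: the abstract does not specify how that module works, so the sketch substitutes a simple stopword-delimited n-gram heuristic with a document-frequency filter. The stopword list, the function names, the `TAXONOMY` mapping, and the three sample documents are all hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "by", "for", "is", "was", "its"}

def candidate_terms(text, max_len=2):
    """Return candidate terms: n-grams (1..max_len words) inside runs of non-stopword tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    terms = set()
    run = []
    for tok in tokens + [None]:  # the trailing None flushes the final run
        if tok is not None and tok not in STOPWORDS:
            run.append(tok)
            continue
        # A stopword (or the end of the text) closes the run: emit its n-grams.
        for n in range(1, max_len + 1):
            for i in range(len(run) - n + 1):
                terms.add(" ".join(run[i:i + n]))
        run = []
    return terms

def build_term_sets(docs, min_doc_freq=2):
    """Map each document id to its terms that occur in at least min_doc_freq documents."""
    per_doc = {doc_id: candidate_terms(text) for doc_id, text in docs.items()}
    doc_freq = Counter(term for terms in per_doc.values() for term in terms)
    return {
        doc_id: {t for t in terms if doc_freq[t] >= min_doc_freq}
        for doc_id, terms in per_doc.items()
    }

# Hypothetical two-level taxonomy: higher-level entity -> member terms.
TAXONOMY = {"finance": {"interest rates", "central bank"}}

def annotate(term_set, taxonomy=TAXONOMY):
    """Return the higher-level entities that have at least one member term in the document."""
    return {node for node, members in taxonomy.items() if members & term_set}

# Hypothetical sample documents, for illustration only.
docs = {
    "d1": "The central bank raised interest rates.",
    "d2": "Interest rates were cut by the central bank.",
    "d3": "The football season opened in Nantes.",
}
term_sets = build_term_sets(docs)
```

Representing documents as sets of multi-word terms rather than single words keeps units such as "interest rates" intact, which is the distinction the abstract draws between word-level and term-level mining. The document-frequency threshold is only a stand-in for whatever term filtering a real system would apply.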

Original language: English
Title of host publication: Principles of Data Mining and Knowledge Discovery - 2nd European Symposium, PKDD 1998, Proceedings
Editors: Jan M. Zytkow, Mohamed Quafafou
Publisher: Springer Verlag
Pages: 65-73
Number of pages: 9
ISBN (Print): 3540650687, 9783540650683
State: Published - 1998
Externally published: Yes
Event: 2nd European Symposium on Principles of Data Mining and Knowledge Discovery in Databases, PKDD 1998 - Nantes, France
Duration: 23 Sep 1998 to 26 Sep 1998

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1510
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2nd European Symposium on Principles of Data Mining and Knowledge Discovery in Databases, PKDD 1998
Country/Territory: France
City: Nantes
Period: 23/09/98 to 26/09/98

Bibliographical note

Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 1998.
