TY - GEN
T1 - Hierarchical indexing and document matching in BoW
AU - Geffet, Maayan
AU - Feitelson, Dror G.
PY - 2001
Y1 - 2001
N2 - BoW is an on-line bibliographical repository based on a hierarchical concept index to which entries are linked. Searching in the repository should therefore return matching topics from the hierarchy, rather than just a list of entries. Likewise, when new entries are inserted, a search for relevant topics to which they should be linked is required. We develop a vector-based algorithm that creates keyword vectors for the set of competing topics at each node in the hierarchy, and show how its performance improves when domain-specific features are added (such as special handling of topic titles and author names). The results of a 7-fold cross validation on a corpus of some 3,500 entries with a 5-level index are hit ratios in the range of 89-95%, and most of the misclassifications are indeed ambiguous to begin with.
AB - BoW is an on-line bibliographical repository based on a hierarchical concept index to which entries are linked. Searching in the repository should therefore return matching topics from the hierarchy, rather than just a list of entries. Likewise, when new entries are inserted, a search for relevant topics to which they should be linked is required. We develop a vector-based algorithm that creates keyword vectors for the set of competing topics at each node in the hierarchy, and show how its performance improves when domain-specific features are added (such as special handling of topic titles and author names). The results of a 7-fold cross validation on a corpus of some 3,500 entries with a 5-level index are hit ratios in the range of 89-95%, and most of the misclassifications are indeed ambiguous to begin with.
UR - http://www.scopus.com/inward/record.url?scp=84901288805&partnerID=8YFLogxK
U2 - 10.1145/379437.379677
DO - 10.1145/379437.379677
M3 - Conference contribution
AN - SCOPUS:84901288805
SN - 1581133456
SN - 9781581133455
T3 - Proceedings of the ACM International Conference on Digital Libraries
SP - 259
EP - 267
BT - Proceedings of the 1st ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2001
PB - Association for Computing Machinery
T2 - 1st ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2001
Y2 - 24 June 2001 through 28 June 2001
ER -