NLP Support for Faceted Navigation in Scholarly Collections

Hearst, M., & Stoica, E. (2009). NLP Support for Faceted Navigation in Scholarly Collections. In Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries, ACL-IJCNLP 2009, 62-70.

Digital libraries have begun using faceted navigation for collections of scholarly holdings. Automated creation of facets is desirable in such an environment. This paper investigates the use of the Castanet algorithm for the automated creation of facets.

RQ1: How can facets for scholarly texts be created (semi-)automatically?

RQ2: How can items be assigned to facets automatically?

Methods: The Castanet algorithm was applied to a collection of 3275 biological journal titles. Fifteen participants (biologists, doctors, medical students, and medical librarians, with experience using PubMed) evaluated the results of three algorithms.

Results: Castanet was rated most favorably of the three algorithms tested (Castanet, LDA, and Subsumption), which makes it a good candidate for future work.

Algorithm      Would use output?
Castanet       11/15 (73%)
LDA            1/7 (14%)
Subsumption    1/8 (13%)
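
The three rows use different denominators (15, 7, and 8), so the raw counts are easier to compare as percentages. A minimal sketch of that arithmetic, using only the counts listed above; the names `would_use` and `share_percent` are illustrative and do not come from the paper:

```python
# Counts copied from the table above:
# (participants who would use the output, participants asked).
would_use = {
    "Castanet": (11, 15),
    "LDA": (1, 7),
    "Subsumption": (1, 8),
}


def share_percent(yes: int, asked: int) -> float:
    """Return the share of positive answers as a percentage."""
    return 100 * yes / asked


# Percentage of participants per algorithm who would use its output.
shares = {name: share_percent(*counts) for name, counts in would_use.items()}

for name, pct in shares.items():
    yes, asked = would_use[name]
    print(f"{name:<12} {yes}/{asked}  {pct:.1f}%")
```

With samples this small, the percentages are best read as a rough ordering of the algorithms, not as precise estimates.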

Future Work: Improve the Castanet algorithm (synonyms, spelling variations, etc.). Evaluate usage logs to see which facets are used in real-world settings, to guide future adjustments.

