Compound descriptors in context: a matching function for classifications and thesauri

Tudhope, D., Binding, C., Blocks, D., & Cunliffe, D. (2002). Compound descriptors in context. Proceedings of the second ACM/IEEE-CS joint conference on Digital libraries – JCDL ’02 (p. 84). New York, New York, USA: ACM Press. doi: 10.1145/544220.544235.

Knowledge organization systems provide a “semantic road map” that helps searchers improve retrieval through term suggestion, query expansion, and flexible (ranked) matching. The authors of this article explore the design issues of a matching function that, “given a thesaurus semantic distance measure, yields ranked results for collections where items are indexed by multiple thesaurus terms” (84). An evaluation of an earlier FACET prototype revealed that users faced conceptual problems in faceted query formulation, attributed to window management difficulties in the search system. Issues in result ranking by the prototype’s algorithm “motivated a revised version of the system with tighter integration of the thesaurus and more support for faceted query formulation and a new matching function” (86). Thesaurus search systems that aim to exploit compound descriptors face many problems.

FACET attempts to address these issues by incorporating semantic closeness into the matching function and by providing a thesaurus browser that lets users explore a term in its hierarchical context and refine their query interactively. To produce ranked results, the matching function also needs a similarity coefficient that can handle matches with extra terms, missing terms, non-matching terms, partially matching terms, or any combination of these. Results from the paper suggest that an in-memory semantic network makes real-time implementation possible.
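
To make the idea of a similarity coefficient concrete, here is a minimal sketch in Python. It is not the matching function from the paper: the averaging rule, the function names, and the toy closeness values are all assumptions chosen for illustration. Each query term is credited with its closest item term, so a missing term contributes 0 and a partial match contributes a fraction. Extra item terms are simply ignored here, whereas a fuller coefficient might penalize them.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical closeness values between pairs of index terms (not from the paper).
TOY_CLOSENESS: Dict[Tuple[str, str], float] = {
    ("chairs", "armchairs"): 0.5,
}


def toy_closeness(a: str, b: str) -> float:
    """Return 1.0 for identical terms, a table value for related terms, else 0.0."""
    if a == b:
        return 1.0
    return TOY_CLOSENESS.get((a, b), TOY_CLOSENESS.get((b, a), 0.0))


def match_score(
    query_terms: Iterable[str],
    item_terms: Iterable[str],
    closeness: Callable[[str, str], float],
) -> float:
    """Average, over the query terms, of the best closeness to any item term."""
    query = list(query_terms)
    item = list(item_terms)
    if not query:
        return 0.0
    total = 0.0
    for q in query:
        # A query term with no related item term (a missing term) scores 0.
        total += max((closeness(q, t) for t in item), default=0.0)
    return total / len(query)


def rank(
    query_terms: Iterable[str],
    collection: Dict[str, List[str]],
    closeness: Callable[[str, str], float],
) -> List[Tuple[float, str]]:
    """Score every item and sort by descending score, then by item id."""
    query = list(query_terms)
    scored = [
        (match_score(query, terms, closeness), item_id)
        for item_id, terms in collection.items()
    ]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))
```

With a toy collection in which item "a" is indexed by chairs and oak, item "b" by armchairs, oak, and carved, and item "c" by oak alone, the query chairs + oak ranks "a" first (exact match), "b" second (partial match on chairs), and "c" last (missing term). A Boolean AND search would return only "a".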

The work is part of the FACET project, and the results come from scenarios using the Getty Art & Architecture Thesaurus (AAT). An evaluation of the previous FACET system, conducted in 2001 with eight library, museum, and IT professionals, forms the basis of this study.

Key Points

  • “There are many advantages for Digital Libraries and the web in indexing with Knowledge Organization Systems, whether intellectual or automatic methods are used, but some current disincentive in the lack of flexible retrieval tools that deal with compound descriptors”
  • “Semantic term expansion can be applied to a range of hybrid query-navigation techniques, ranging from suggesting possible terms to automatic term expansion and bestmatch query rankings”
  • “Automatic traversal of relationships can augment the user’s browsing possibilities”
  • “Precoordinated indexing [21, 28] is seen to offer advantages for specificity because terms are placed within a particular context or syntax and this can lead to potential gains in precision on retrieval”
  • “An extended faceted matching function would help with recall problems caused by missing terms and partial matches (via semantic term expansion)…It would also offer bestmatch advantages over Boolean search in the generation of ranked results”
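
The semantic term expansion mentioned in the points above can be sketched as a bounded traversal of thesaurus relationships. This is only an illustration under stated assumptions: the relationship costs, the tiny thesaurus fragment, the `expand` function, and the formula that turns traversal cost into closeness are all hypothetical and are not taken from the paper or from the AAT.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical traversal cost per relationship type:
# BT = broader term, NT = narrower term, RT = related term.
COST: Dict[str, float] = {"BT": 1.0, "NT": 1.0, "RT": 2.0}

# Toy thesaurus fragment (illustrative only, not real AAT data).
THESAURUS: Dict[str, List[Tuple[str, str]]] = {
    "chairs": [("NT", "armchairs"), ("BT", "seating furniture"), ("RT", "stools")],
    "armchairs": [("BT", "chairs")],
    "seating furniture": [("NT", "chairs"), ("NT", "stools")],
    "stools": [("BT", "seating furniture"), ("RT", "chairs")],
}


def expand(
    term: str,
    thesaurus: Dict[str, List[Tuple[str, str]]],
    max_cost: float,
) -> Dict[str, float]:
    """Return terms reachable within max_cost, mapped to a closeness in (0, 1]."""
    best: Dict[str, float] = {term: 0.0}
    heap: List[Tuple[float, str]] = [(0.0, term)]
    while heap:
        dist, current = heapq.heappop(heap)
        if dist > best.get(current, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for relation, neighbour in thesaurus.get(current, []):
            new_dist = dist + COST[relation]
            if new_dist <= max_cost and new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(heap, (new_dist, neighbour))
    # Assumed conversion: closeness falls linearly with accumulated cost.
    return {t: 1.0 - d / (max_cost + 1.0) for t, d in best.items()}
```

Calling `expand("chairs", THESAURUS, 1.0)` reaches only the immediate broader and narrower terms, while a budget of 2.0 also brings in "stools" at a lower closeness. Changing the entries in `COST` is one simple way to experiment with the weighting of relationships.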

Future Work

  • The authors intend to explore further the semantic closeness measure, the weighting of relationships, and whether additional cost factors should be incorporated into the expansion
  • The authors plan to develop an XML representation of thesaurus/collection mappings and query structure so that these can be held externally