Doerr M., Fundulaki I. (1998). SIS-TMS: A thesaurus management system for distributed digital collections. Proc. 2nd European Conference on Digital Libraries (ECDL’98), (C. Nikolaou and C. Stephanidis eds.) Lecture Notes in Computer Science 1513, Springer-Verlag: Berlin, 215-234.
The focus of this paper is to present methods and an actual system suited to store, maintain and provide access to the knowledge structures that are in use or needed for the following three tasks and their respective auxiliary system interfaces:
– Guide the user from his/her naïve request to the use of a set of terms optimal for his/her purpose and for the characteristics of the target information source.
– Expand naïve user terms or the terms optimal for the purpose of the user into sets of terms optimal for each different information source.
– Classify all information assets of a certain collection with controlled vocabulary from a specific thesaurus.
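The second task, term expansion, can be illustrated with a small sketch. Everything named here (`NARROWER`, `SOURCE_VOCABULARY`, `expand_for_sources`, the sample terms and sources) is hypothetical and not taken from the SIS-TMS; the sketch only assumes that each information source indexes with a known subset of the thesaurus terms.

```python
from collections import deque

# Hypothetical toy thesaurus: each term maps to its direct narrower terms.
NARROWER = {
    "vessel": ["amphora", "jug"],
    "amphora": ["neck amphora"],
    "jug": [],
    "neck amphora": [],
}

# Hypothetical: the terms each information source actually uses for indexing.
SOURCE_VOCABULARY = {
    "museum_a": {"amphora", "jug"},
    "library_b": {"vessel", "neck amphora"},
}


def narrower_closure(term):
    """Return the term together with all of its transitive narrower terms."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in NARROWER.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def expand_for_sources(term):
    """Expand one user term into the terms usable at each source."""
    expanded = narrower_closure(term)
    return {
        source: sorted(expanded & vocabulary)
        for source, vocabulary in SOURCE_VOCABULARY.items()
    }


if __name__ == "__main__":
    print(expand_for_sources("vessel"))
```

A real term server would draw on the full set of thesaurus relations, including equivalences between thesauri, rather than on narrower terms alone.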
This paper proposes the following requirements for Thesaurus Management:
(1) interaction with the thesaurus contents, excluding manipulation, (2) maintenance, i.e. the manipulation of the contents and the necessary and desirable support of the associated work processes, and finally (3) analysis, i.e. the logical structure needed to support (1), (2), and the thesaurus semantics in the narrower sense.
The SIS-TMS is a multilingual thesaurus management system and a terminology server for classification and distributed access to electronic collections, designed according to the above analysis. Its distinctive features are its capability to store, develop, display and access multiple thesauri and their interrelations under one database schema, to create arbitrary graphical views of them, and to specialize any kind of relation dynamically into new ones. It further implements the version control necessary for cooperative development and for data exchange with other applications in the environment.
It originates in the terminology management system (VCS Prototype) developed by ICS-FORTH in cooperation with the Getty Information Institute in the framework of a feasibility study. It was enhanced within the AQUARELLE project, in particular by support for multilinguality. An earlier version is part of the AQUARELLE product. A full product version was available in summer ’98.
The SIS-TMS is an application of the Semantic Index System (SIS), a product of the Institute of Computer Science-FORTH. The SIS is an object-oriented semantic network database used for the storage and maintenance of formal reference information as well as for other knowledge representation applications. It implements an interpretation of the data model of the knowledge representation language TELOS, omitting the evaluation of logical rules.
This paper discusses the thesaurus structure from several perspectives: assumptions on concepts, the modeling of thesaurus notions, intrathesaurus relations, the representation of multiple interlinked thesauri, and interthesaurus relations.
Collection Management Systems (CMS) such as digital libraries, library systems, and museum documentation systems will continuously change. The CMS can also propose new terms to the TMS. The TMS will be updated with new terms from many sides, and old concepts and terms may be renamed, revised and reorganized. The essential problem is to ensure and maintain consistency between the contents of the vocabularies in the underlying CMS and the contents of the Local Thesaurus Management Systems.
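The consistency problem can be made concrete with a minimal sketch. It assumes, purely for illustration, that a CMS vocabulary and a thesaurus are plain sets of term strings and that renamings in the TMS are recorded as an old-to-new mapping; the function name `check_consistency` and the sample terms are hypothetical and do not describe the SIS-TMS update protocol.

```python
def check_consistency(cms_terms, tms_terms, renamed=None):
    """Compare a CMS vocabulary with the thesaurus contents.

    cms_terms: terms currently used in the CMS.
    tms_terms: terms currently valid in the TMS.
    renamed:   hypothetical record of old term -> new term in the TMS.
    """
    renamed = renamed or {}
    cms_terms = set(cms_terms)
    tms_terms = set(tms_terms)

    # Terms the CMS still uses under a name the TMS has since replaced.
    to_rename = {term: renamed[term] for term in cms_terms if term in renamed}

    # Terms the TMS has never known: candidates to propose as new terms.
    to_propose = sorted(cms_terms - tms_terms - set(renamed))

    return {"to_rename": to_rename, "to_propose": to_propose}


if __name__ == "__main__":
    report = check_consistency(
        cms_terms={"amphora", "jar", "urn"},
        tms_terms={"amphora", "storage jar"},
        renamed={"jar": "storage jar"},
    )
    print(report)
```

The two result lists correspond to the two directions of change described above: updates flowing from the TMS to the CMS, and new terms proposed by the CMS to the TMS.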
The user interacts with the SIS-TMS via its graphical user interface, which provides unconstrained navigation within and between multiple interlinked thesauri. The user can retrieve information from the SIS-TMS knowledge base using a number of predefined, configurable queries and view the results in either textual or graphical form. Beyond providing graphical representations, an essential feature of the SIS-TMS is its ability to represent any combination of relationships, to arbitrary depth, in a single graph (see Fig. 1, central window). Updates in the SIS-TMS are performed through the Entry Forms in a task-oriented way (see Fig. 1, right window).
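Representing any combination of relationships to arbitrary depth amounts to a traversal of a labeled graph that follows only the selected relation types. The sketch below is one way to express that idea; the edge list, the relation labels `NT` and `RT`, and the function `subgraph` are illustrative assumptions, not the SIS-TMS query interface.

```python
from collections import deque

# Hypothetical labeled edges: (source term, relation type, target term).
EDGES = [
    ("vessel", "NT", "amphora"),
    ("amphora", "NT", "neck amphora"),
    ("amphora", "RT", "wine trade"),
    ("wine trade", "RT", "vineyard"),
]


def subgraph(start, relations, max_depth=None):
    """Collect the edges reachable from `start` via the given relation types.

    max_depth limits how many edges away from `start` the traversal goes;
    None means unlimited depth.
    """
    collected = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand nodes at the depth limit
        for src, rel, dst in EDGES:
            if src == node and rel in relations:
                collected.append((src, rel, dst))
                if dst not in seen:
                    seen.add(dst)
                    queue.append((dst, depth + 1))
    return collected


if __name__ == "__main__":
    # Only the hierarchy, unlimited depth.
    print(subgraph("vessel", {"NT"}))
    # Hierarchy and associative links combined, two levels deep.
    print(subgraph("vessel", {"NT", "RT"}, max_depth=2))
```

The returned edge list is what a graphical view would draw; changing the `relations` set changes the view without changing the stored data.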
Fig.1. SIS-TMS User Interface, Browser and Data Entry facility.
The paper concludes that integrated terminology services in distributed digital collections are going to become an important subject, and that the SIS-TMS makes a valuable contribution to them. It solves a major problem: the consistent maintenance of the necessarily central terminological resources between semiautonomous systems. The terminological bases themselves need not be internally distributed, because access needs low bandwidth, read-only copies can easily be sent around at the given low update rates, and term servers can be cascaded. In the near future, the functionality of the system will be further enhanced to make it as widely usable as possible. Whereas there are several standards for thesaurus contents, no one has so far tried to standardize the three component interfaces: (1) Term Server to retrieval tools, (2) TMS to CMS, (3) TMS to Term Server. Because a distributed information system comprises many components from many providers, these three interfaces must become open and standardized for wide use to become a reality.