- User-centered indexing: index documents in a way that reflects how users would go about finding them.
- Document-centered indexing: indexing, like abstracting, creates surrogates for documents.
Purpose: to represent the content or features of a document:
“Aboutness”: what is the document about?
Generally poor inter-indexer agreement.
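
A minimal sketch of how agreement between two indexers might be quantified, assuming a simple overlap ratio (terms assigned by both, divided by all distinct terms assigned by either). The function name and the sample terms are illustrative only and do not come from these notes.

```python
def indexer_agreement(terms_a, terms_b):
    """Shared terms divided by all distinct terms used by either indexer."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 1.0  # two empty term sets agree trivially
    return len(a & b) / len(a | b)


# Hypothetical term assignments for the same document.
indexer_1 = ["information retrieval", "indexing", "thesauri"]
indexer_2 = ["indexing", "subject analysis", "thesauri", "cataloging"]

score = indexer_agreement(indexer_1, indexer_2)
print(f"agreement = {score:.2f}")  # 2 shared terms out of 5 distinct terms
```
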
- Process: content analysis of the document to select the concepts that represent it.
- Translation: expressing those concepts in the indexing language.
Translation requires rules (policy):
a. Sources of terms (controlled vocabulary, other?)
b. Specificity (how narrow or broad)
c. Weights (reflect the importance of the concept)
d. Accuracy (how to translate when the indexing language has no equivalent term?)
e. Degree of precombination (pre- or post-coordination of terms)
f. User language (assign terms close to the users' own vocabulary)
— and some content analysis policies —
g. Exhaustivity (how comprehensive)
h. Indexable matter (what parts of the document should be represented)
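
As a rough illustration of how a few of these policies interact, the sketch below encodes source of terms, accuracy, weights, and exhaustivity as settings applied during translation. `IndexingPolicy`, `translate`, the vocabulary mapping, and the importance scores are all hypothetical names and values chosen for the example, not part of any real system.

```python
from dataclasses import dataclass


@dataclass
class IndexingPolicy:
    vocabulary: dict                # source of terms: concept -> controlled descriptor
    exhaustivity: int               # maximum number of descriptors per document
    allow_free_terms: bool = False  # accuracy: keep concepts with no equivalent?


def translate(concepts, policy):
    """Map weighted concepts (concept -> importance) to (descriptor, weight) pairs."""
    assigned = {}
    for concept, importance in concepts.items():
        descriptor = policy.vocabulary.get(concept)
        if descriptor is None:
            if not policy.allow_free_terms:
                continue  # no equivalent in the indexing language: drop it
            descriptor = concept
        # Several concepts may map to one descriptor; keep the highest weight.
        assigned[descriptor] = max(importance, assigned.get(descriptor, 0.0))
    # Most important first; ties broken alphabetically for a stable result.
    ranked = sorted(assigned.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:policy.exhaustivity]


policy = IndexingPolicy(
    vocabulary={"cars": "Automobiles", "autos": "Automobiles",
                "pollution": "Air pollution"},
    exhaustivity=2,
)
concepts = {"cars": 0.9, "autos": 0.4, "pollution": 0.7, "traffic jams": 0.5}
result = translate(concepts, policy)
print(result)
```

Raising `exhaustivity` or setting `allow_free_terms=True` changes which concepts survive translation, which is the kind of decision the policy list above is meant to fix in advance.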
Request-Oriented – Soergel (1985):
Checklist indexing: check each document against every descriptor in the vocabulary. This is costly and time-consuming; a classified structure for the vocabulary can improve the efficiency of this approach.
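
One way to picture the efficiency gain from a classified structure is the sketch below. It assumes that a narrower descriptor is only worth checking when its broader descriptor applies, and it uses a keyword match as a stand-in for the indexer's judgment. The hierarchy, the `applies` function, and the sample document are all made up for illustration.

```python
def checklist_index(document_text, hierarchy, applies):
    """Walk a classified vocabulary top-down, skipping subtrees whose parent fails."""
    assigned, checks = [], 0
    stack = list(hierarchy.items())
    while stack:
        descriptor, children = stack.pop()
        checks += 1  # one judgment made by the indexer
        if applies(descriptor, document_text):
            assigned.append(descriptor)
            stack.extend(children.items())  # only now consider narrower terms
    return sorted(assigned), checks


# Hypothetical classified vocabulary: broader descriptor -> narrower descriptors.
hierarchy = {
    "health": {"nutrition": {}, "exercise": {}},
    "transport": {"rail": {}, "aviation": {}},
}


def applies(descriptor, text):
    # Stand-in for human judgment: a plain substring match.
    return descriptor in text.lower()


descriptors, checks = checklist_index(
    "Health benefits of regular exercise", hierarchy, applies
)
print(descriptors, checks)  # 4 checks instead of 6 for the flat vocabulary
```
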
Automated (computerized) approach:
– objective and consistent
– supports natural language requests, relevance feedback, ranked output, query expansion
– “indexing and searching are two sides of the same coin”
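
The "two sides of the same coin" point can be sketched with a toy ranked-retrieval example in which the same tokenizer processes both documents and requests. This is a simplified TF-IDF-style weighting chosen for illustration, not a method prescribed by these notes; the documents and the request are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Shared by indexing and searching, so both sides use the same terms."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """docs: id -> text. Returns per-document term counts and per-term idf."""
    tf = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    df = Counter(term for counts in tf.values() for term in counts)
    n = len(docs)
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    return tf, idf


def search(request, tf, idf):
    """Score each document against a natural language request; best first."""
    query_terms = tokenize(request)
    scores = {}
    for doc_id, counts in tf.items():
        score = sum(counts[t] * idf.get(t, 0.0) for t in query_terms)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=lambda d: (-scores[d], d))


docs = {
    "d1": "Indexing languages and controlled vocabulary",
    "d2": "Automatic indexing with ranked output",
    "d3": "Cataloging rules for libraries",
}
tf, idf = build_index(docs)
ranking = search("How does automatic indexing rank documents?", tf, idf)
print(ranking)
```

Because there is no stemming, "rank" in the request does not match "ranked" in d2; query expansion or relevance feedback would be natural ways to address that.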