Controlled and uncontrolled subject descriptions in the CF database: A comparison of optimal cluster-based retrieval results.

Author: Shaw, W M, Jr (AUTHOR)
Subject:
Source: Information Processing & Management. Nov-Dec 1993, Vol. 29 Issue 6, p751-763. 13p.
Abstract: Word stems derived from titles and abstracts are used to represent the 1,239 documents in the cystic fibrosis document collection. Evidence for clustering structure and the effectiveness of cluster-based retrieval are investigated as a function of the exhaustivity of the uncontrolled subject descriptions. Results are compared to equivalent calculations for controlled descriptions based on Medical Subject Headings (MeSH) and subheadings. For both representations, the evidence for clustering structure is inversely related to the effectiveness of cluster-based retrieval. Exhaustive subject descriptions produce the strongest evidence for clustering structure and the lowest levels of retrieval performance. Levels of retrieval performance associated with exhaustive subject descriptions can be explained by assuming that the structure imposed on documents by subject relationships is the result of a random process. Optimal levels of cluster-based retrieval performance can be detected for both representations. The optimal levels of performance provide a clear indication of the relative utility of document representations, and show that controlled and uncontrolled subject descriptions produce equivalent levels of performance and complementary outcomes. High levels of retrieval performance are achieved by optimizing the exhaustivity of document representations for each query. Retrieval performance based on combinations of retrieval outcomes from the subject descriptions is materially superior to the highest performance of each representation. Average levels of recall, precision, and effectiveness are shown to convey little information about typical outcomes. Performance standards for individual queries are suggested. [ABSTRACT FROM AUTHOR]
Database: Library, Information Science & Technology Abstracts