Vocabulary Control
Overview
This is an assignment.
1. What is vocabulary control?
Ans: Vocabulary control is employed when indexing documents, arranging catalogs, or maintaining bibliographic databases, so that specific documents are easier to search for. This systematic approach helps establish relationships between documents, making it easier for users to find documents within a specific subject area. A controlled vocabulary is usually organized as a hierarchy, which also indicates the size and scope of each topic. It identifies all the available synonyms for a concept and selects one preferred term, so it overcomes the problem of natural language variation in describing the subject of a document. Examples of vocabulary control tools are the Sears List of Subject Headings, the Library of Congress Subject Headings, and thesauri.
Example-
In the Colon Classification (CC), ‘Indian History’ is represented as V44. In the Sears List of Subject Headings, it is represented as: India – History. Another controlled vocabulary is the Thesaurofacet, which has the characteristics of both verbal and coded controlled vocabularies.
A controlled vocabulary can be characterized in the following ways -
a. It represents the general conceptual structure of a subject area.
b. It presents a guide to the user.
c. It supplies a standard vocabulary by controlling synonyms.
d. It defines ambiguous terms.
e. It shows horizontal and vertical relationships among terms.
2. What is the meaning of controlling a vocabulary in Information Retrieval?
Ans: Vocabulary control is a standardized process for organizing data. A controlled vocabulary also keeps all subject headings uniform. If indexers do not use a controlled vocabulary, they will use different terms for the same subject. The terms assigned to documents will then fail to match the terms used in searches, which is a serious problem in Information Retrieval.
Control of synonyms-
A controlled vocabulary uses one standard term in place of all its synonyms.
Ambiguity-
A controlled vocabulary clarifies terms that are ambiguous.
Hierarchical classification-
A controlled vocabulary is designed to show the relationship between broader terms and narrower terms.
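The three functions above can be sketched in a few lines of Python. This is only a minimal illustration, not a real indexing tool: the class name, method names, and sample terms (Automobiles, Mercury, and so on) are all assumptions made for the example, and parenthetical qualifiers are just one common way of separating ambiguous terms.

```python
from dataclasses import dataclass, field


@dataclass
class ControlledVocabulary:
    """Hypothetical toy vocabulary; all names here are illustrative."""
    # entry term (lowercased) -> preferred term
    use: dict = field(default_factory=dict)
    # preferred term -> its broader preferred term
    broader: dict = field(default_factory=dict)

    def add_term(self, preferred, synonyms=(), broader=None):
        # The preferred term and every synonym all lead to the preferred term.
        self.use[preferred.lower()] = preferred
        for synonym in synonyms:
            self.use[synonym.lower()] = preferred
        if broader is not None:
            self.broader[preferred] = broader

    def preferred(self, term):
        """Control of synonyms: return the standard term, or None if unknown."""
        return self.use.get(term.lower())

    def qualified(self, term):
        """Ambiguity: list the qualified forms of an ambiguous term."""
        prefix = term.lower() + " ("
        return sorted({p for p in self.use.values() if p.lower().startswith(prefix)})

    def ancestors(self, term):
        """Hierarchy: walk from a term up through its broader terms."""
        chain = []
        while term in self.broader:
            term = self.broader[term]
            chain.append(term)
        return chain


vocab = ControlledVocabulary()
vocab.add_term("Vehicles")
vocab.add_term("Automobiles", synonyms=["Cars", "Motor cars"], broader="Vehicles")
vocab.add_term("Sports cars", broader="Automobiles")
vocab.add_term("Mercury (Planet)")
vocab.add_term("Mercury (Metal)")

print(vocab.preferred("cars"))          # the standard term for a synonym
print(vocab.qualified("Mercury"))       # the qualified forms of an ambiguous term
print(vocab.ancestors("Sports cars"))   # narrower term up to broader terms
```

In this sketch the bare word "Mercury" is deliberately not an admissible term, so an indexer has to pick one of the qualified forms.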
3. What are the tools for vocabulary control?
Ans:
1. Simple Term Lists (Pick Lists)- A limited set of terms arranged as a simple alphabetical list or in some other logically evident order.
2. Thesauri
3. Subject Heading Lists (e.g. LCSH, SLSH)
4. Authority Files (e.g. LCNAF)
5. Taxonomies
6. Alphanumeric Classification Schemes (e.g. DDC, UDC)
7. Ontologies
8. Folksonomies
4. What is a classaurus?
Ans: A classaurus is a vocabulary control tool used mainly in POPSI and other pre-coordinate indexing systems. It was developed by Ganesh Bhattacharya. It is a faceted, systematic scheme of hierarchical classification that incorporates features of a retrieval thesaurus, such as control of synonyms, quasi-synonyms, and antonyms.
A classaurus can be designed before the indexing work begins or along with it. The two parts of a classaurus are -
a. The alphabetical index part- contains every term, including synonyms, along with its address (e.g. an alphanumeric code).
b. The systematic part- contains the common modifiers and the other categories of terms, each assigned a unique alphanumeric code.
The layout of a classaurus has been illustrated by Ganesh Bhattacharya as follows-
• A Systematic Part-
A1 Common Modifiers
A11 Form Modifiers
A12 Time Modifiers
A13 Environment Modifiers
A2 Inter-Subject Relation Modifiers
A3 Disciplines and Sub-disciplines
A4 Entities
A41 Part Entities
A42 Type Entities
A5 Properties
A6 Actions
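As a rough illustration of how the two parts relate, the outline above can be held as a mapping from codes to terms, with the alphabetical index derived from it. The sketch below uses two assumptions that are not from the source: the two synonyms ("Processes", "Attributes") are invented purely for illustration, and the parent of a code is taken to be the code with its last character removed.

```python
# Systematic part: code -> term, copied from the outline above.
SYSTEMATIC = {
    "A1": "Common Modifiers",
    "A11": "Form Modifiers",
    "A12": "Time Modifiers",
    "A13": "Environment Modifiers",
    "A2": "Inter-Subject Relation Modifiers",
    "A3": "Disciplines and Sub-disciplines",
    "A4": "Entities",
    "A41": "Part Entities",
    "A42": "Type Entities",
    "A5": "Properties",
    "A6": "Actions",
}

# Hypothetical synonyms, invented for this example only: synonym -> address.
SYNONYMS = {"Processes": "A6", "Attributes": "A5"}


def parent_code(code):
    """Assumed convention: a code's parent is the code minus its last character."""
    parent = code[:-1]
    return parent if parent in SYSTEMATIC else None


def build_alphabetical_index(systematic, synonyms):
    """Alphabetical index part: every term and synonym with its address."""
    entries = [(term, code) for code, term in systematic.items()]
    entries += list(synonyms.items())
    return sorted(entries, key=lambda entry: entry[0].lower())


def address_of(term, index):
    """Look up the address (code) of a term or synonym, ignoring case."""
    for entry_term, code in index:
        if entry_term.lower() == term.lower():
            return code
    return None


INDEX = build_alphabetical_index(SYSTEMATIC, SYNONYMS)

for term, code in INDEX:
    print(f"{term}  {code}")
```

A synonym and its preferred term share one address, so a searcher who enters either word through the alphabetical part arrives at the same place in the systematic part.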
5. What are the differences between natural language and artificial language?
Ans: a. A natural language is a set of codes and their admissible expressions used for communicating ideas in speech and writing in day-to-day life.
An artificial language is a set of codes and their admissible expressions used for representing the content of documents and the queries of users.
b. A natural language is “natural” in the sense that it grows freely on the lips of human beings, totally free from any control.
An indexing language is “artificial” because, although it may draw on the vocabulary of a natural language, its syntax, semantics, and orthography differ from those of the natural language.
c. A natural language is developed for communicating ideas among human beings in their day-to-day life.
An artificial language is developed and used for a special purpose, i.e. for representing the main subject area of a document.
d. A natural language is a free language, with no control of synonyms and homographs. One concept may be denoted by more than one term. There is no standardization of words. Anybody can use any words or terms to express his or her ideas.
An artificial language is a controlled language. There are restrictions on which words may be used in an indexing language. Synonyms and homographs are controlled. Terms and words are standardized. One term represents one idea.
e. A natural language provides auxiliaries such as prepositions and conjunctions, which help convey the exact meaning of a sentence. Such auxiliaries are not available in an artificial language.
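Point e can be illustrated with a small sketch. It assumes a deliberately tiny, hypothetical list of auxiliaries and a hypothetical function name, and shows that two phrases with different meanings keep the same content words once the auxiliaries are removed. This is why an indexing language needs its own rules of syntax, such as a fixed order of terms.

```python
import re

# Assumed, deliberately small list of auxiliaries (not a standard stop list).
AUXILIARIES = {"a", "an", "the", "of", "in", "on", "for", "and", "or", "to", "by", "with"}


def content_words(phrase):
    """Return the words of a phrase with auxiliaries removed, in original order."""
    words = re.findall(r"[a-z]+", phrase.lower())
    return [word for word in words if word not in AUXILIARIES]


first = content_words("History of philosophy")
second = content_words("Philosophy of history")

print(first)
print(second)
# Without "of", only the order of the terms tells the two subjects apart.
print(sorted(first) == sorted(second))
```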