SEARCH TECHNIQUES
Overview
Search techniques are strategies, methodologies, or tools used in locating and retrieving information, especially in large volumes of data. These techniques are used in many contexts, from search engines to academic research, data retrieval, and even physical archives. They aim at maximizing efficiency, relevance, and accuracy in finding the information you are looking for.
The effectiveness of a search technique depends largely on how well users understand and apply it, and on how well the search engine or tool interprets queries and returns relevant results.
INTRODUCTION
Searching is the activity of looking carefully in order to find something. In library and information science, searching refers to looking through records carefully in order to find desired information. You have already studied retrieval tools like catalogues, indexes, etc., for retrieving information. In this lesson, you will learn the need for, and ways of, searching organized information for retrieval purposes. You will also be introduced to the basic concepts of search techniques for information retrieval from electronic sources.
OBJECTIVES
Upon successful completion of this lesson, you should be able to:
- Define search techniques;
- Explain organization of words in a dictionary;
- Use dictionary, numeric and numeric/classified techniques for arranging and retrieving library material;
- Define search engines;
- Identify search process and design a search query;
- Know the role of search operators;
- Define Boolean logic;
- Understand types of searches;
- Define, explain and differentiate the field based and full text search with examples.
Search Techniques
A search technique is a mechanism through which relevant information can be retrieved from an information system. An information system may be either in-house or online. An in-house information system stores information within the scope of a single organization for retrieval purposes. An online information system stores electronic information sources centrally and makes them accessible through a communication mechanism. Most online information systems are compatible with the World Wide Web (WWW) and are accessible through the internet. An in-house information system may hold information sources in both printed and electronic form. Storage mechanisms and search techniques are thus two different aspects, and we will discuss both: the storage and the retrieval of information.
Storage Mechanism
In-house information systems and online information systems are developed to store specific information or information on a specific theme or subject. Such systems have their search mechanisms and a set of instructions to locate particular information. In library and information centres, information is available both in print and electronic format. Some of the storage mechanisms and their role in information retrieval are as follows:
- Dictionary Arrangement
- Numeric Arrangement
- Classified Arrangement
Dictionary Arrangement
Dictionary arrangement means an arrangement in which words are organized in the alphabetical order of the language. Alphabetical order is the sequence based on the position of each letter in the script of the language. For example, the English language uses the Roman alphabet, and the order is A, B, C, D, ... Z. Here, the letter "A" is in the first position, "B" in the second and, similarly, "Z" in the twenty-sixth position in the sequence. Hence, words in dictionary arrangement are ordered according to the sequential position of their letters. For instance: Action, Ante, Apple, Art, Catalogue, Classification, Search. Here, the first four words start with "A", but their positions are fixed according to the position of the second, third or fourth letter.
These are followed by a set of two words starting with "C". Thus, the words starting with "C" are placed after the words starting with "A". Following this process, all the words are organized in this arrangement. This mechanism is used for arranging entries in catalogues that have words as access points, for example, author, title, subject, etc.
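The dictionary arrangement described above can be tried out in a few lines of Python. This is only a minimal sketch: the helper name find_entry is made up for illustration, and ignoring upper and lower case is an assumption, since real catalogues follow more detailed filing rules.

```python
import bisect

# Words from the example above, deliberately shuffled.
words = ["Search", "Art", "Classification", "Action", "Catalogue", "Apple", "Ante"]

# Dictionary arrangement: compare letter by letter, ignoring case (an assumption).
arranged = sorted(words, key=str.casefold)

def find_entry(entries, term):
    """Locate a term in an alphabetically arranged list by binary search.

    Returns the position of the term, or -1 if it is not present.
    """
    keys = [entry.casefold() for entry in entries]
    target = term.casefold()
    pos = bisect.bisect_left(keys, target)
    if pos < len(keys) and keys[pos] == target:
        return pos
    return -1

print(arranged)
print(find_entry(arranged, "Catalogue"))
```

Because the list is already in alphabetical order, the lookup can jump to the middle and halve the remaining range each time, much as a reader opens a dictionary near the right letter instead of reading from the first page.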
Numeric Arrangement
The numeric arrangement is the order where numbers are arranged in ascending or descending order. For instance,
123.45
234.15
234.51
435.21
541.23
……………………………
Here, you find that all the numbers are made up of the same set of five digits, i.e., 1 to 5, but they are organized in ascending order according to their numeric value. In libraries that follow the Dewey Decimal Classification (DDC) system, you will find that the books are kept on the shelves in numerical order.
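As a small illustration, the same five numbers can be put into ascending and descending order with Python. The sketch assumes the numbers arrive as text and converts them with the standard decimal module so that they are compared by value.

```python
from decimal import Decimal

# The numbers from the example above, deliberately shuffled and held as text.
numbers = ["541.23", "123.45", "435.21", "234.51", "234.15"]

# Compare by numeric value rather than character by character.
ascending = sorted(numbers, key=Decimal)
descending = sorted(numbers, key=Decimal, reverse=True)

print(ascending)
print(descending)
```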
Classified Arrangement
Most libraries organize their books on the shelves according to the call numbers of the books. A call number is a combination of class number, book number and collection number. These three numbers may be numeric or alphanumeric, depending on the scheme followed by the library. Hence, retrieving books from the shelves becomes easy when we understand the numeric, alphanumeric or classified arrangement. For example, a few call numbers based on the DDC scheme are listed below as they would be arranged on the shelves.
321.4 RAM
370.1150954 DEM
370.1523 DES
371 ILLS
371.3078 KEM
371.32 NIS
371.397 GRE
371.926 BRA
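The shelf order shown above can be reproduced with a short Python sketch. It assumes that each call number consists of a decimal class number followed by a book number, separated by a space, and it leaves out the collection number. The function name shelf_key is made up for illustration.

```python
from decimal import Decimal

# The call numbers from the example above, deliberately shuffled.
call_numbers = [
    "371.32 NIS", "321.4 RAM", "371 ILLS", "370.1523 DES",
    "371.926 BRA", "370.1150954 DEM", "371.3078 KEM", "371.397 GRE",
]

def shelf_key(call_number):
    """Sort first by class number (as a decimal value), then by book number."""
    class_number, book_number = call_number.split(maxsplit=1)
    return (Decimal(class_number), book_number)

shelf_order = sorted(call_numbers, key=shelf_key)
for call_number in shelf_order:
    print(call_number)
```

Note that 371.3078 is shelved before 371.32 because the class number is compared as a decimal value, not by its number of digits.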
Another example of information retrieval following these arrangements is taken from a book index. You might have noticed that almost all books have an index at the end. The book index is a list of words/terms along with page numbers on which those appear in the text. Depending upon the size and nature of the book, the terms in the book index are organized either in dictionary or classified order. After understanding these arrangements, you can find information on a topic from the book easily.
Search Engine
Searching for information in electronic or digital media is different from searching in print media. When information is stored in electronic or digital form, a user interface is provided to find relevant information in the system. This user interface is a piece of software that accepts keywords or terms representing the required information, conducts the search, and presents the results in the format defined in the software. Software meant for searching information in an information system is referred to as a search engine. Therefore, we can define a search engine as 'software meant for searching information in an electronic or digital information domain, on the basis of input given by a searcher, that displays the result in a user-friendly format'.
The input to the search engine is known as the search string or query. The query may be a single term or a set of terms representing the information one is looking for. The search engine searches information based on the query and provides a list of sources which match the query. The list is displayed in a format designed by the search engine. Depending upon the nature of the search engine, the list may contain a brief description of each information source, on the basis of which the searcher may decide whether to acquire or refer to the full record. You might have searched the Online Public Access Catalogue (OPAC) of your library, the Library of Congress Online Catalog (LCOC) or PubMed, as well as Google or Yahoo on the internet. All of these are search engines.
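To make the idea of query, matching and result list concrete, here is a toy search engine in Python. It is a sketch under several assumptions that do not come from the lesson: the three sample records, the function names build_index and search, the use of a word index built in advance, and the rule that every query term must be present in a record.

```python
import re
from collections import defaultdict

# A tiny illustrative collection of records.
records = {
    1: {"title": "Development as Freedom", "author": "Sen, Amartya"},
    2: {"title": "Gitanjali", "author": "Tagore, Rabindra Nath"},
    3: {"title": "The Argumentative Indian", "author": "Sen, Amartya"},
}

def build_index(records):
    """Map every word to the set of record numbers in which it occurs."""
    index = defaultdict(set)
    for record_id, record in records.items():
        for value in record.values():
            for word in re.findall(r"\w+", value.casefold()):
                index[word].add(record_id)
    return index

def search(index, records, query):
    """Return a brief description of every record containing all query terms."""
    terms = re.findall(r"\w+", query.casefold())
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return [f"{records[i]['author']}: {records[i]['title']}" for i in sorted(hits)]

index = build_index(records)
print(search(index, records, "sen"))
print(search(index, records, "amartya freedom"))
```

The brief "author: title" lines play the role of the result list, from which the searcher decides which full record to consult.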
Search Process
The search process is a set of functions which are performed for searching the relevant information effectively. The process follows some basic steps to conduct search and get desired results. These steps are as follows:
(i) Recognise and State the Need
(ii) Development of Search Strategy
(iii) Execution of the Search Strategy
(iv) Review Search Results
(v) Edit Search Results
(vi) Evaluation and Feedback
Recognise and State the Need
It is important for an information professional or searcher to know why the search is being done and what purpose the search will serve.
A topic may be searched for general knowledge, research and development, and so forth. Once the need and purpose of the search are known, a query statement should be formulated. The seeker of information and the searcher of information need to agree on the search requirements. This agreement leads to the formulation of an effective search strategy that gives relevant and effective results.
Development of Search Strategy
The development of the search strategy includes the conceptual formulation of the query, the translation of this formulation into the language of keywords, descriptors or facets, the identification of synonyms and associated terms, etc.
The concept of facet analysis (PMEST) given by Ranganathan, as well as the concept of specific subject, can be used as an effective tool for designing a query.
After this, it is important to select the information domain to be searched, such as the OPAC of a library or a database, depending upon requirements. The search string, or query, is a combination of terms, keywords or descriptors which represent information. Since search strings contain language, their linguistic features and the consequences of these features for searching and retrieval have to be examined. Three areas need to be defined here: Syntactic Value, Semantic Value and Boolean Operators.
A. Syntactic Value
The syntactic value of a search string concerns the connecting words or symbols through which keywords or terms are linked to represent the concept to be searched by the search engine. We can understand the syntactic value of a query through an example. Take two terms, 'poetry' and 'Indians', connected by two different connectors, 'among' and 'by'. Each gives a different meaning, as follows:
- 'Poetry among Indians' means 'What is the status of poetry among Indians?' or 'What is the approach of Indians towards poetry?'
- 'Poetry by Indians' means poetry composed by Indians.
B. Semantic Value
The semantic value of a search string is the meaning of the string in the context of the required information and the interpretation by the search engine. To establish the meaning of the concept to be searched and understood by the search engines, we use operators as connectors of keywords as permitted by the search engines. We can understand the semantic value of a query through two examples given below:
- The question 'contribution of Indian society in mathematics' means the contribution of Indian society in the field of Mathematics.
- The question 'contribution of mathematics in Indian society' means the contribution of Mathematics in shaping Indian society.
These two examples give us a clear idea of the semantic value of a question. The same set of keywords and connectors gives a different meaning when written in a different order.
C. Boolean Operators
Boolean operators are simple words, such as AND, OR and NOT, used like conjunctions to combine or exclude keywords in a search. They connect the search terms and define the relationship between them, resulting in more focused and fruitful results. These three terms have been accepted by the designers of search engines, and their meaning is well defined when they are used as operators in information search. The three operators of Boolean logic are logical sum (+) OR, logical product (x) AND, and logical difference (-) NOT. Most information retrieval systems allow users to express their queries with these operators. Let us now examine the meaning of these three operators.
OR Operator:
The OR operator lets the user specify alternatives among the terms in the search. When a string is constructed with the OR operator, the search engine retrieves all those resources in which at least one of the terms or keywords linked with 'OR' exists.
If we construct a search string like 'student OR education' and search for it, then the output will be a list of references to all those resources available in the system in which either student or education exists.
AND Operator:
The AND operator is used to combine two or more terms. When we create a string using the AND operator, the search engine retrieves all those resources in which all the terms or keywords connected with 'AND' exist. For example, if we design a search string like 'student AND education' and search, then the output will be a list of references to all those resources in which both terms, student and education, exist.
NOT Operator:
The NOT operator is used to exclude a term from a collection of resources. For instance, if we formulate a search string like 'student NOT education' and search, then the result will be a list of references to all those resources available in the system in which the term student exists but the term education does not.
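The three operators correspond directly to set operations, which makes them easy to try out. In the sketch below the record numbers are invented, and we simply assume that the system already knows which records contain each term.

```python
# Hypothetical record numbers in which each term occurs.
postings = {
    "student": {1, 2, 3, 5},
    "education": {2, 3, 4},
}

student = postings["student"]
education = postings["education"]

either = student | education        # student OR education: union
both = student & education          # student AND education: intersection
only_student = student - education  # student NOT education: difference

print(sorted(either))
print(sorted(both))
print(sorted(only_student))
```

OR widens the result (five records), AND narrows it (two records), and NOT removes from the first set every record that also contains the second term.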
Execution of the Search Strategy
Before executing a search, the searcher should know the data structure adopted by the information system that stores the data. System-based search engines are designed to search information in a database according to its architecture. For example, in an OPAC, if we enter the query 'Tagore, Rabindra Nath' and search in the author field, then only those records which have been authored by him will be retrieved from the database and displayed.
But if we put the same query into the title field, then all those records will appear which have 'Tagore, Rabindra Nath' in the title or in a portion of the title. As a result, references to materials written about 'Tagore, Rabindra Nath' will be included. Depending upon the need and purpose of the search and the expertise of the searcher, the search may be conducted using the various features of the search engine. Hence, a searcher should know the types of search and their implications in order to get effective output. The types of searches are:
a) Field Based Search
b) Full Text Search
c) Truncation Search
d) Proximity Search
e) Limiting Search
f) Range Search
g) Simple Search
h) Advanced Search
(a) Field Based Search
A search conducted on a particular field of the database to get the required information is termed a field-based search. As you are aware, the complete information about a catalogue entry is stored in different fields in a bibliographic database. If you wish to search for an author, direct the search engine to the author field; if you wish to search by title or subject, direct the search engine to the title or subject field.
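A field-based search can be sketched as follows. The two records and the function name field_search are assumptions made for illustration (the second record is invented), and a record is taken to match when every word of the query occurs in the chosen field.

```python
import re

# Illustrative records; the second one is invented for this example.
records = [
    {"author": "Tagore, Rabindra Nath", "title": "Gitanjali"},
    {"author": "Example, Author", "title": "Rabindra Nath Tagore: A Biography"},
]

def words(text):
    """Split text into a set of lower-cased words."""
    return set(re.findall(r"\w+", text.casefold()))

def field_search(records, field, query):
    """Return the records whose given field contains every word of the query."""
    wanted = words(query)
    return [record for record in records if wanted <= words(record[field])]

by_author = field_search(records, "author", "Tagore, Rabindra Nath")
by_title = field_search(records, "title", "Tagore, Rabindra Nath")

print([record["title"] for record in by_author])
print([record["title"] for record in by_title])
```

The same query gives a work by the author when it is directed to the author field, and a work about him when it is directed to the title field.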
(b) Full Text Search
Full text search is a searching mechanism which performs the search on each and every field of the database and retrieves all those records which match the query. For instance, a search for 'Amartya Sen' performed on LCOC with the keyword option, which acts as a full text search, retrieved a list of 193 records.
This indicates that a full text search yields a larger number of hits than a search in a single field, because it extracts all those records which have 'Sen, Amartya' in any field.
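The effect can be seen on a very small scale in the sketch below. The records and function names are assumptions made for illustration (the second record is invented), and the counts have nothing to do with the LCOC figures quoted above.

```python
import re

# Illustrative records; the second one is invented for this example.
records = [
    {"author": "Sen, Amartya", "title": "Development as Freedom"},
    {"author": "Example, Author", "title": "Essays in Honour of Amartya Sen"},
    {"author": "Tagore, Rabindra Nath", "title": "Gitanjali"},
]

def words(text):
    """Split text into a set of lower-cased words."""
    return set(re.findall(r"\w+", text.casefold()))

def field_search(records, field, query):
    """Match the query words against one field only."""
    wanted = words(query)
    return [record for record in records if wanted <= words(record[field])]

def full_text_search(records, query):
    """Match the query words against all fields of a record taken together."""
    wanted = words(query)
    return [
        record for record in records
        if wanted <= words(" ".join(record.values()))
    ]

author_hits = field_search(records, "author", "Amartya Sen")
full_hits = full_text_search(records, "Amartya Sen")
print(len(author_hits), len(full_hits))
```

The full text search also picks up the record in which the name appears only in the title, so it returns more hits than the search limited to the author field.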
(c) Truncation Search
Truncation search is a search technique in which the search is conducted for different forms of a word sharing a common root. It is one of the most widely adopted methods in information retrieval systems. In this method, the root word is entered along with a truncation mark and then the search is run. Suppose we search for 'India*'; then all the records will be fetched in which 'India' appears either as a full word or as the beginning of a word.
All the records in the domain containing India, Indian, Indiana, Indianization or similar words will be listed.
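A minimal sketch of truncation on the right-hand side of a word is given below. The titles are invented, the function name truncation_search is made up, and the asterisk is assumed to be the truncation mark; actual search engines may use other symbols.

```python
import re

# Invented titles for illustration.
titles = [
    "History of India",
    "Indian Philosophy",
    "Indiana Folklore",
    "Indianization of Services",
    "Indus Valley Civilization",
]

def truncation_search(texts, pattern):
    """Return the texts containing a word that begins with the root before '*'."""
    root = pattern.rstrip("*").casefold()
    hits = []
    for text in texts:
        text_words = re.findall(r"\w+", text.casefold())
        if any(word.startswith(root) for word in text_words):
            hits.append(text)
    return hits

print(truncation_search(titles, "India*"))
```

The last title is not retrieved, because 'Indus' shares only the first three letters with the root 'India'.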
(d) Proximity Search
Proximity search is a search technique which allows the searcher to define the distance between two terms. The searcher can specify whether the two search terms should occur adjacent to each other, whether one or more words may occur between them, or whether the terms should occur in the same paragraph irrespective of the intervening words. Different search engines use different sets of operators for this purpose.
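Since the operators differ from one search engine to another, the sketch below does not imitate any particular syntax. It only shows the underlying idea with a made-up function, within, which checks whether two terms occur with at most a given number of words between them, in either order.

```python
import re

def within(text, term_a, term_b, max_gap):
    """True if term_a and term_b occur with at most max_gap words between them."""
    text_words = re.findall(r"\w+", text.casefold())
    positions_a = [i for i, w in enumerate(text_words) if w == term_a.casefold()]
    positions_b = [i for i, w in enumerate(text_words) if w == term_b.casefold()]
    return any(
        i != j and abs(i - j) - 1 <= max_gap
        for i in positions_a
        for j in positions_b
    )

sentence = "Information retrieval systems store and organize information for later retrieval"

print(within(sentence, "information", "retrieval", 0))  # adjacent terms
print(within(sentence, "systems", "organize", 0))       # two words in between
print(within(sentence, "systems", "organize", 2))
```

A gap of 0 corresponds to terms that must be adjacent; a larger gap allows intervening words.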
(e) Limiting Search
In the limiting search technique, a searcher splits the string according to the architecture of the database and searches the different terms of the same string in different fields. For example, if a searcher is looking for 'Development as freedom by Amartya Sen', then the string will be broken into two sub-strings, viz. 'Development as freedom' and 'Amartya Sen'.
The sub-string 'Development as freedom' will be put in the title field and the sub-string 'Amartya Sen' will be put in the author field, and then the search will be conducted.
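The example can be sketched in Python as follows. The function name limited_search and the sample records are assumptions made for illustration (the third record is invented). A record is taken to match only when each sub-string is found, word by word, in its own field.

```python
import re

# Illustrative records; the third one is invented for this example.
records = [
    {"title": "Development as Freedom", "author": "Sen, Amartya"},
    {"title": "The Argumentative Indian", "author": "Sen, Amartya"},
    {"title": "Freedom and Development", "author": "Example, Author"},
]

def words(text):
    """Split text into a set of lower-cased words."""
    return set(re.findall(r"\w+", text.casefold()))

def limited_search(records, **field_terms):
    """Return records in which every given field contains all words of its term."""
    hits = []
    for record in records:
        if all(words(term) <= words(record[field])
               for field, term in field_terms.items()):
            hits.append(record)
    return hits

result = limited_search(records, title="Development as freedom", author="Amartya Sen")
print(result)
```

Only the first record satisfies both conditions: the second fails on the title and the third fails on the author.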
(f) Range Search
Range search technique is a technique, which allows searchers to select records within certain data ranges. This technique is more suitable for numeric data search. The operators and their meaning differ from search engine to search engine. A few commonly used operators are:
Greater than (>)
Less than (<)
Greater than or equal to (>=)
Less than or equal to (<=)
For example, if we specify publication year >= 2000, then the result will list all those resources which were published in 2000 AD or later.
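The four operators listed above can be tried out with a short sketch. The records, titles and the function name range_search are invented for illustration; the comparison functions come from Python's standard operator module.

```python
import operator

# The four range operators mapped to comparison functions.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}

# Invented records for illustration.
records = [
    {"title": "Book A", "year": 1995},
    {"title": "Book B", "year": 2000},
    {"title": "Book C", "year": 2010},
]

def range_search(records, field, op, value):
    """Return the records whose field satisfies the comparison with value."""
    compare = OPERATORS[op]
    return [record for record in records if compare(record[field], value)]

recent = range_search(records, "year", ">=", 2000)
print([record["title"] for record in recent])
```

Replacing '>=' with '>' would drop the book published in 2000 itself, which is why the choice of operator matters at the boundary of the range.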
(g) Simple Search
Simple search is a technique in which a searcher enters keywords in a simple format, without needing to understand the behaviour of the search engine, the architecture of the database or the impact of operators and connectors. Almost all search engines provide a simple search facility. Simple search works on the model of the full text search discussed above.
(h) Advanced Search
Advanced search is a method through which a searcher uses various tools and mechanisms to obtain exact and relevant results. In this technique, the searcher constructs the search string using the operators and parameters supplied by the search engine. Searching by combining the methods mentioned above also falls under this category. Here, the scope of each term of the string may be defined according to the facilities available in the search engine.
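As a rough illustration of combining methods, the sketch below puts a field-based condition, a truncated title term and a year range into one search. Everything in it is hypothetical: the records, the function name advanced_search and its parameters are invented, and real search engines offer their own forms and operators for this purpose.

```python
# Hypothetical records for illustration.
records = [
    {"author": "Rao, A.", "title": "Indian Economy", "year": 1998},
    {"author": "Rao, A.", "title": "India after Reform", "year": 2004},
    {"author": "Das, B.", "title": "Indian Agriculture", "year": 2006},
    {"author": "Rao, A.", "title": "Village Studies", "year": 2010},
]

def advanced_search(records, author=None, title_root=None, year_from=None):
    """Combine a field-based, a truncation and a range condition.

    A condition left as None is not applied.
    """
    hits = []
    for record in records:
        # Field-based condition on the author field.
        if author is not None and author.casefold() not in record["author"].casefold():
            continue
        # Truncation condition: some title word must begin with the root.
        if title_root is not None:
            title_words = record["title"].casefold().split()
            if not any(word.startswith(title_root.casefold()) for word in title_words):
                continue
        # Range condition: publication year greater than or equal to year_from.
        if year_from is not None and record["year"] < year_from:
            continue
        hits.append(record)
    return hits

result = advanced_search(records, author="Rao", title_root="India", year_from=2000)
print([record["title"] for record in result])
```

Each added condition narrows the result further, which is how an advanced search arrives at a small and exact set of records.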
Review Search Results
The best reviewer of the search results is the user. But the information professionals should also review the search results on the basis of criteria given for evaluating information retrieval systems.
Edit Search Results
The editing of search results is the transformation of the search results into a user-friendly format. This can involve arranging the results into a well-organised package, highlighting important entities, adding more information to the entities and reformatting the information to suit the user's requirements.
Evaluation and Feedback
The evaluation of search results involves the participation of both the users and the searchers. The quality and quantity of the results are assessed and, if the final result does not satisfy the users' needs, the process may be redefined and restarted.
WHAT YOU HAVE LEARNT ABOUT SEARCH TECHNIQUES
- A standard mechanism, called information search techniques, is used for retrieving information from any information system.
- A search technique is a tool by which one can retrieve relevant data from information systems. The information system may be in-house or online.
- Storage mechanisms include dictionary, numeric and classified arrangement of data.
- The search operation is performed through a set of functions:
  - Determining the user's information needs;
  - Designing the search strategy;
  - Selecting the information system to be searched and, accordingly, the search engine;
  - Creating a search query or string using keywords and operators that expresses the semantic value of the user's requirements in a syntactic format that the engine can interpret;
  - Performing the search;
  - Evaluating the result and, if necessary, filtering, redefining or restarting the search process; and
  - Presenting the search results in a user-friendly format.
- To get relevant and effective search results, a searcher should know the types of searches and have the skills to conduct them.
Conclusion
Search techniques are foundational to various fields, such as computer science, information retrieval, and problem-solving. They play a critical role in efficiently finding solutions, data, or information from large datasets, whether it's for navigating through the internet, exploring databases, or optimizing algorithms in problem-solving.