Index, query, relevancy
Modern search technology rests on two fundamental processes. The first is
indexing the available information; the second is processing a query and then
returning the results. As to the first, any search program (whether a desktop
search engine, a corporate information system or an Internet search engine)
creates its own search area. That is, it processes documents and builds an index
of them (an organized structure that holds information about the processed
data). Once created, the index is used to work quickly and to produce the list
of documents that match a query. The second process is understandable not only
in technical terms but also to ordinary users. The program processes the request
(a keyword or key phrase) and displays a list of documents that contain it.
Because the information is held in a structured index, query processing is far
faster (by tens or hundreds of times!) than a direct search: documents are
selected not by going through the files one by one but by analyzing the text
information in the index.
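The difference between looking a word up in an index and going through every
file can be shown with a minimal sketch. It assumes the simplest possible index,
a mapping from each word to the set of documents that contain it; the names
tokenize, build_index, search_index and search_scan are hypothetical and do not
come from any particular search product. For simplicity a document matches when
it contains all words of the request, in any order.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Build a word -> set of document names mapping (done once, up front)."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search_index(index, request):
    """Answer a request from the index alone, without reading any document."""
    words = tokenize(request)
    if not words:
        return set()
    found = set(index.get(words[0], set()))
    for word in words[1:]:
        found &= index.get(word, set())
    return found


def search_scan(documents, request):
    """Direct search: re-read and re-split every document for every request."""
    words = set(tokenize(request))
    if not words:
        return set()
    return {name for name, text in documents.items()
            if words <= set(tokenize(text))}


documents = {
    "a.txt": "The quick brown fox",
    "b.txt": "A quick search index",
    "c.txt": "Brown bread",
}
index = build_index(documents)
print(sorted(search_index(index, "quick")))        # ['a.txt', 'b.txt']
print(sorted(search_index(index, "quick brown")))  # ['a.txt']
```

Both functions return the same documents. The index version only touches the
entries for the words in the request, while the scan has to read every file
each time, which is where the speed difference comes from as the collection
grows.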
The program displays the documents it finds as a list ordered by relevancy, that
is, by how well the text of each document matches the request. Different
technologies, of course, use different methods to determine a document's
relevance (the number of occurrences of the query words and how often they
appear in the document, the ratio of these figures to the total number of words
in the document, the distance between the query words in the files found, and
so forth). From these parameters the program derives the "weight" of the
document, and that weight decides the position the file takes in the list of
results. With Internet search the situation is even more complicated, because
a host of other factors has to be taken into account (Google PageRank, for
example). But that is a topic for a separate article, so we will not touch on
Internet search here.
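How such parameters might be combined into a "weight" can be illustrated with a
small sketch. The formula below is an assumption made purely for illustration:
it simply adds the number of occurrences, their share of the document's words,
and a bonus for query words that stand close together. Real products use their
own, usually more elaborate, formulas. The names tokenize, min_distance, weight
and rank are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def min_distance(tokens, first, second):
    """Smallest gap in word positions between two words, or None if absent."""
    pos_first = [i for i, t in enumerate(tokens) if t == first]
    pos_second = [i for i, t in enumerate(tokens) if t == second]
    if not pos_first or not pos_second:
        return None
    return min(abs(i - j) for i in pos_first for j in pos_second)


def weight(text, request):
    """Illustrative weight: occurrences + frequency + proximity bonus."""
    tokens = tokenize(text)
    words = tokenize(request)
    if not tokens or not words:
        return 0.0
    occurrences = sum(tokens.count(word) for word in words)
    if occurrences == 0:
        return 0.0
    frequency = occurrences / len(tokens)  # ratio to total number of words
    proximity = 0.0
    for first, second in zip(words, words[1:]):
        distance = min_distance(tokens, first, second)
        if distance:  # skips None (word absent) and 0 (repeated query word)
            proximity += 1.0 / distance
    return occurrences + frequency + proximity


def rank(documents, request):
    """Names of matching documents, heaviest first; non-matches are dropped."""
    weights = {name: weight(text, request) for name, text in documents.items()}
    matches = [name for name, value in weights.items() if value > 0]
    return sorted(matches, key=lambda name: -weights[name])


documents = {
    "a": "search index search",
    "b": "index of documents",
    "c": "nothing here",
}
print(rank(documents, "search index"))  # ['a', 'b']
```

Document "a" comes first because it contains both query words, several times and
side by side; "b" contains only one of them, and "c" is left out of the results
altogether.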