1C:Enterprise 8.3. Developer Guide. Chapter 19. Full-text Data Search Mechanism


FULL-TEXT DATA SEARCH MECHANISM

The 1C:Enterprise full-text data search mechanism lets users search the database with search operators (AND, OR, NOT, NEAR, etc.).

The full-text search mechanism is based on two components:

• a full-text index that is created in the database and updated periodically as necessary;

• full-text search tools.

19.1. GENERAL INFORMATION ON FULL-TEXT INDEXING

Data of the following configuration objects can serve as full-text search objects:

• exchange plans

• catalogs

• documents

• charts of characteristic types

• charts of accounts

• charts of calculation types

• information registers

• accumulation registers

• accounting registers

• calculation registers

• business processes

• tasks

Each of the listed configuration objects has the Full Text Search property that enables or disables full-text indexing for the object data.

1C:Enterprise registers changes to full-text search objects in a change log. The log is written when objects are written to the database, and only objects set up for full-text indexing are included in it. If a transaction rollback cancels the write to the database, the entry in the change log nevertheless remains.

Full-text indexing is performed in privileged mode (in the server context) and does not require an exclusive lock on the database. During indexing, the change registration files are read and the changed objects are retrieved from the database. Words are transliterated: Latin letters in words are replaced by Cyrillic characters, and the index stores both forms of the word. Only attributes of the following types can be indexed:

• string

• date

• number

• reference types

• value storage

The following items are added to the full-text index for each object and attribute:

• name of the metadata object or its attribute

• synonym of the metadata object or its attribute (in all configuration languages)

• presentation of the metadata object (in all configuration languages)

Standard and user attributes are indexed in all languages, which makes it possible to search in any configuration language (e.g., in Russian and in English).

NOTE

The Cyrillic letter "ё" in all words is replaced by the Cyrillic letter "е" for full-text indexing and search purposes.

For information about file location for the full-text search index see "1C:Enterprise 8.3. Administrator Guide".

Full-text indexing generates the main index; subsequent database changes create an additional index that contains information about data modified after the last update of the main index.

Search in the main index is very efficient, while search in the additional index takes more time. This is why the indexing process includes an index merge operation that adds the latest data changes to the main index. Note, however, that this operation might take a long time (if the main index is big); therefore, it is recommended to run it when the load on the system is minimal (at night or on weekends).

Full-text search is initiated by the user in the client application context and is executed in the server context. This means that the search runs on the client machine in the file mode and on the 1C:Enterprise server cluster in the client/server mode.

Full-text data search is based on user rights (including access right restrictions at the level of database records and fields). Search results can include misspelt words: for example, if a word contains Cyrillic "ü" instead of Latin "m" or is spelt as "systeü" instead of "system" due to indeliberate language change, these words are also added to the search results.

Search results are returned in chunks of the size defined when executing the full-text search command.

Results are ranked on the basis of the following priorities:

• metadata object "weight": the more object attributes reference this object, the greater its "weight";

• date of the object (newer objects are displayed first).

19.2. USE OF FULL-TEXT SEARCH MECHANISMS

Full-text data search is performed using the 1C:Enterprise script tools.

The FullTextSearch global context property returns the full-text search manager, a FullTextSearchManager object.

Methods of the full-text search manager can be used to:

• obtain information about the full-text index state

• run full-text indexing

• initiate the full-text data search process

The following methods are used to obtain information about full-text index state:

• GetFullTextMode() – returns True if full-text search is allowed or False otherwise;

• UpdateDate() – returns the date of the last time when all data were indexed and there was no information about new objects for indexing;

• IndexTrue() – returns True if the full-text search index entirely corresponds to the current state of the infobase;

• IndexUpdateComplete() – returns True if full-text index merge is not required.

The following methods are used for full-text indexing:

• SetFullTextMode() – sets the full-text search mode (Allow or Deny). If search is disabled, calling this method with the True parameter automatically clears the existing full-text index.

• UpdateIndex() – updates the full-text search index. If there is no index, this method re-indexes the entire database. Indexing conditions are passed as method parameters:

    AllowMerge – if True is passed, the main and additional indices are merged;

    InPortions – if True is passed, indexing is performed in portions of 10 thousand objects, and the process completes after a single portion has been indexed. Indexing time per portion depends strongly on the data. For example, it takes 3 to 5 minutes in the standard configuration "Manufacturing Enterprise Management".

• ClearIndex() – removes all full-text index files. This method is recommended when all or nearly all data have been updated (e.g., when the infobase has been loaded). After the index is cleared, indexing should be performed (if required).
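
The indexing methods above are typically called from server-side scheduled jobs. The following is a minimal sketch of that approach, not a definitive implementation. The procedure names UpdateFullTextIndexInPortions and MergeFullTextIndex are hypothetical and serve only as an illustration: the first is meant to run frequently and indexes a single portion, the second is meant to run when the load on the system is minimal.

```bsl
// Hypothetical handler for a frequently run scheduled job.
Procedure UpdateFullTextIndexInPortions() Export

    // Skip the run if the index already matches the infobase state.
    If FullTextSearch.IndexTrue() Then
        Return;
    EndIf;

    // AllowMerge = False, InPortions = True:
    // index one portion of changed objects without merging.
    FullTextSearch.UpdateIndex(False, True);

EndProcedure

// Hypothetical handler for a job scheduled at night or on weekends.
Procedure MergeFullTextIndex() Export

    // IndexUpdateComplete() returns True when no merge is required.
    If FullTextSearch.IndexUpdateComplete() Then
        Return;
    EndIf;

    // AllowMerge = True, InPortions = False:
    // index pending changes and merge the additional index into the main one.
    FullTextSearch.UpdateIndex(True, False);

EndProcedure
```

Keeping the merge in a separate job lets frequent indexing stay short, while the potentially long merge is postponed to a quiet period.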

To start full-text data search process, the CreateList() method is used. Two parameters are passed to this method:

• SearchString – a string with the search expression;

• PortionSize – a number specifying the count of objects to be returned in a single portion of full-text search.

For a description of search expression syntax see page 2-1207.

The CreateList() method returns a FullTextSearchList object that can be used to perform full-text search and obtain its results. This object can be reused to perform searches with various criteria. The SearchString and PortionSize properties of the object make it possible to change the search expression and the size of the received data portion.

To perform a full-text search and obtain the first results, use the FirstPart() method, which fills the list with the first found items according to the portion size. To obtain subsequent results, use the NextPart() and PreviousPart() methods, which can receive the current start position as a parameter and fill the full-text search list with search results. If the CurrentStartPosition parameter is not specified, the start position of the FullTextSearchList object is used instead; this value can be retrieved using the StartPosition() method. Passing the parameter explicitly is preferable because it speeds up the search.

The Count() method returns the number of items in the current part (for the last part it can be less than the portion size), and the TotalCount() method returns the total number of found items.

The NextPart() method fills the list with subsequent items according to the portion size. The current position increases by the number of data items in the received part. If there are no data for the next portion (the end of data is reached), it raises an exception that can be handled by the Try … Except … EndTry clause.

The PreviousPart() method fills the list with previously found items according to the portion size. If there are no data for the previous portion (the beginning of data is reached), it raises an exception that can be handled by the Try … Except … EndTry clause.

The TooManyResults() method returns True if the quantity of search results was truncated in order to improve performance. This can impact search exactness (not all objects are found). It is recommended to analyze the value returned by this method when receiving the last portion of found data in order to inform the user that not all results have been received from the database.
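
The portion-by-portion retrieval described above can be sketched as follows. This is a minimal example intended for server-side code; the procedure name ShowAllSearchResults and the portion size of 20 are illustrative assumptions. The loop relies on the exception that NextPart() raises at the end of the data and also stops if an empty part is received.

```bsl
// Hypothetical procedure: outputs presentations of all found objects.
Procedure ShowAllSearchResults(SearchString)

    SearchList = FullTextSearch.CreateList(SearchString, 20);
    SearchList.FirstPart();

    HasMore = SearchList.Count() > 0;
    While HasMore Do

        For Each Item In SearchList Do
            Message(Item.Presentation);
        EndDo;

        Try
            // Raises an exception once the end of data is reached.
            SearchList.NextPart();
            HasMore = SearchList.Count() > 0;
        Except
            HasMore = False;
        EndTry;

    EndDo;

    // After the last portion, warn the user if results were truncated.
    If SearchList.TooManyResults() Then
        Message("Not all results were received. Refine the search expression.");
    EndIf;

EndProcedure
```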

The GetDescription property is a flag that controls whether descriptions of search results are retrieved. If it is set to True, the Description value is filled for each search result, providing context for the found words. If it is set to False, the search is performed faster.

A full-text search list is a collection of full-text search list items that can be iterated over using the For Each … In … Do operator.

Each full-text search list item is represented by FullTextSearchListItem object and has the following properties:

• Value – identifies data (object or set of records) containing the search expression;

• Metadata – metadata object describing data containing the search expression;

• Presentation – text presentation of the found object;

• Description – contains <attribute>:<value> pairs (each beginning on a new line), where:

    <attribute> is the object attribute whose value contains the search expression;

    <value> is the value of this attribute.
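
As an illustration of the item properties listed above, the following sketch outputs the first portion of results together with their descriptions. The search expression "error" and the portion size of 10 are arbitrary example values, and the use of FullName() on the Metadata value is an assumption about how the metadata object is best presented.

```bsl
SearchList = FullTextSearch.CreateList("error", 10);

// Request descriptions; the search is faster when this flag is False.
SearchList.GetDescription = True;
SearchList.FirstPart();

For Each Item In SearchList Do
    // Full name of the metadata object and presentation of the found data.
    Message(Item.Metadata.FullName() + ": " + Item.Presentation);

    // <attribute>:<value> pairs, each on a new line.
    Message(Item.Description);
EndDo;
```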

Using the GetRepresentation() method you can obtain search results as XMLReader object or a string with HTML-text where found words are highlighted by HTML means (bold font and background color).

19.3. USE OF ADDITIONAL DICTIONARIES

Additional morphological and synonym dictionaries for full-text search extend system dictionaries and can contain special terms and words used when working with the configuration.

Binary data templates, text templates and string-type or ValueStorage type constants can be used as additional dictionaries. You can specify additional dictionaries in the Additional full-text search dictionaries property of the root metadata object.

The content of a dictionary should look like the following:

<?xml version="1.0"?>
<Dictionary>
  <Words>
    <lemma>mouse</lemma><forms>mice mouse's</forms>
    <lemma>ox</lemma><forms>oxen</forms>
  </Words>
  <Synonyms>
    <item>error bug failure</item>
    <item>stream thread</item>
  </Synonyms>
</Dictionary>

Dictionary Item

It stores the dictionary. A dictionary can have two sections:

• additional words (lemmas) and their forms

• sets of synonyms

Words Item

It contains words in the following format:

• the lemma item stores the main form of the word (nominative case);

• the forms item contains other case forms of the word.

Fuzzy search is not used by default. To run a fuzzy search, use the * operator. Example: searching for mous* finds mouse, mouse's and mice.

Synonyms Item

It stores sets of synonyms. Each set is enclosed in item tags.

To include synonyms in full-text search, use the ! operator. Searching for !error finds error, bug and failure. For a description of search expression syntax see page 2-1207.
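
A short sketch of how both operators might be used from the script, assuming the example dictionary shown earlier is connected to the configuration. The portion size of 20 is an arbitrary value. The same list object is reused by changing its SearchString property.

```bsl
// "*" operator: search for words beginning with "mous".
SearchList = FullTextSearch.CreateList("mous*", 20);
SearchList.FirstPart();
Message("mous*: " + String(SearchList.TotalCount()));

// "!" operator: search for "error" and its synonyms from the dictionary.
SearchList.SearchString = "!error";
SearchList.FirstPart();
Message("!error: " + String(SearchList.TotalCount()));
```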

The Synonyms and Words items can appear in any order.

The system loads dictionaries when search is first called or indexing is performed. If errors are found in the template, filling dictionaries is stopped at the error position.

If a dictionary has been modified, restart the client so that the system uses the updated dictionary. Searches then use the new dictionaries, but the index is not updated automatically and has to be rebuilt manually.
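
One possible way to rebuild the index from the script is sketched below, using the ClearIndex() and UpdateIndex() methods described in section 19.2. It assumes that a complete rebuild is acceptable; on a large infobase this can take a long time, so it is best run when the load on the system is minimal.

```bsl
// Remove all full-text index files.
FullTextSearch.ClearIndex();

// With no index present, this re-indexes the entire infobase
// (AllowMerge = False, InPortions = False).
FullTextSearch.UpdateIndex(False, False);
```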
