Typical reasons of non-optimal query operation and optimization methods

Publications: Typical reasons of non-optimal query operation and optimization methods
The article presents the typical reasons of non-optimal operation of queries diagnosed at the configuration code level and considers the techniques if query optimization.

A significant part of problems that lead to the non-optimal query operation is detected by analysing the configuration code and metadata structure. There is a list of typical errors of the code and data structure whose consequences are fairly well studied and easily predictable. The code analysis using this list allows solving most of the problems with query performance without going into the detailed technical information (query text in the SQL language, query plan, etc.).

The main reasons of non-optimal operation of the queries diagnosed at the configuration code and metadata structure level:

  • joins with subqueries;
  • joins with virtual tables;
  • disparity of query indices and conditions;
  • use of subqueries in the join condition;
  • receiving data through a point from the fields of composite type;
  • filtering virtual tables without the use of parameters.

This article discusses the listed reasons of non-optimal query operation and gives recommendations of their optimization.

Joins with subqueries

Recommendations

When writing queries? do not use the joins with subqueries. You should join only metadata objects or temporary tables. if the query uses the joins with subqueries, it should be rewritten using использованием temporary tables.

if the query contains the joins with subqueries, this can lead to the following negative consequences:

  • Very slow query execution at low load of serve equipment. Query slowdown can be very significant (up to several times).
  • unstable query operation. Under certain conditions, the query may operate fast enough, under others - very slow.
  • Significant difference in time of query execution for different DBMSs.
  • Hypersensitivity of query in the relevance and completeness of statistics. Immediately after the complete update of statistics, the query may operate fast, but after a while it can slow down again.

An example of potentially dangerous query that uses a join with subquery:

SELECT ...
FROM Document.SalesInvoices
LEFT JOIN (
   SELECT FROM InformationRegister.Limits
   WHERE ...
   GROUP BY ...
) ON ...

In this example, in the right part of join the subquery is used instead of metadata object. Note that it is not important in which part of join (right of left) the subquery is used. Similarly, it does not matter what type of join is specified (LEFT, RIGHT, etc.). In all cases, this construction is potentially dangerous and should be corrected using the temporary tables.

To optimize the query, you should split it into several separate queries (by the number of subqueries used in the joins). These queries are recommended to place in a single batch query.

Attention! Do not forget to index the created temporary table. As the index fields you should specify all the fields that are used in the join condition.

For the above example we will have the following batch query:

ManagerTT = New TempTablesManager;
Query = New Query;
Query.TempTablesManager = ManagerTT;
Query.Text = "
   | SELECT ...
   | INTO Limits
   | FROM InformationRegister.Limits
   | WHERE ...
   | GROUP BY ...
   | INDEX BY ... ;
  ...
   | SELECT ...
   | FROM Document.SalesInvoices
   | LEFT JOIN Limits
   | ON ...;"
Explanations

Optimizer of DBMS server (regardless to what DBMS you are using) can not always properly optimize such a query. In this case, the problem for optimizer is choosing the correct join method. There are several algorithms to join two samples. The choice of algorithm depends on the fact how many records will be contained in one and another sample. In case if you join two physical tables, DBMS can easily determine the volume of both samples based on the existing statistics. If one of the joined samples is a subquery, then it is very difficult to understand what number of records will it return. In this case, DBMS can be mistaken with a choice of plan that will lead to a catastrophic slowdown of query performance.

Rewriting the query by above technique is intended to simplify the work of DBMS optimizer. In the rewritten query all the samples that participate in the joins will be the physical tables and DBMS will be able to determine the size of each sample. This will allow DBMS to choose the guaranteed fastest of all possible plans. Moreover, DBMS will make the right choice regardless on any conditions. The query rewritten in this way will work equally well on any DBMSs that is especially important when developing the mass-production solutions. In addition, the query rewritten in such way is easier to read, easier to understand and debug.

You should understand that rewriting the query in such way, we probably made its slowdown due to the additional overhead charges - creation of temporary tables. If DBMS will not be mistaken with the choice of plan, then it will probably execute the old query faster than the new one. However, this slowdown will be always negligible. The size of slowdown depends on the used DBMS and equipment productivity. In the typical case, the creation of one temporary table can take up to several milliseconds. That is, these slowdowns cannot have a noticeable effect on the system productivity and usually can be neglected.

Joins with virtual tables

Recommendations

If the query uses the join with virtual table of 1С:Enterprise query language (for example, “AccumulationRegister.Goods.Balance()”) and the query works with poor performance, it is recommended to move the access to the virtual table in a separate query that stores the results in temporary table.

That is, use the same recommendation that in the case of join with subquery.

Explanations

Virtual tables used in the 1С:Enterprise query language can be deployed in the subqueries in the translation to the SQL language. This is due to the fact that the virtual table often (but not always) receives data from several physical tables of DBMS. If you use the join with virtual table, then at the SQL level it can be implemented in some cases like the join with subquery. In this case, DBMs optimizer can just choose non-optimal plan like in case of subquery used in the 1С:Enterprise language in explicit form.

Disparity of query indices and conditions

Recommendations

Make sure that for all conditions used in the query there are the appropriate indices.

The conditions are used in the following sections of query:

  • SELECT … FROM … WHERE ..Condition..
  • JOIN … ON ..Condition..
  • SELECT … FROM ..VirtualTable..(, ..Condition..)
  • HAVING ..Condition..

For each condition a suitable index must be. The index is suitable if it meets the following requirements:

  • Index contains all the fields listed in the condition;
  • These fields are at the top of index;
  • These fields are in a row, that is, the fields that do not participate in the query condition do not «wedge»;

When creating the metadata object, 1С:Enterprise automatically creates the indices which must be suitable for majority of queries.
Basic indices created by 1С:Enterprise:

  • index by unique identifier (reference) all object entities (catalogs, documents, etc.);
  • index by recorder (reference to the document) for the tables of register records subordinate to the recorder;
  • index by period and values of measures for summary tables of accounting registers;
  • index by period, calculation and values of all measures for summary tables of accounting registers.

In cases when the automatically created indices are not sufficient, it is possible to index additionally the metadata object attributes. It should be kept in mind that creation of index speed up the process of information search, but may slow down slightly the process of its modification (add, edit and delete). Therefore, the indices should be created consciously and only in case if the query is accurately known for which such index is required. Do not create the indices “just in case” or knowingly redundant indices. For example, you should never index additionally the first register measurement, the main index of summary table which will be created automatically by the platform is suitable to search by value.

Explanations

If there is not index in the database structure which satisfies all these conditions, then to get the result, DBMS will have to scan the table or one of its indices. This will increase the execution time of query, since the number of specified locks.

The requirements for index listed in recommendations are associated with the physical structure of index in DBMS. This structure is a value tree of indexed fields. At the first tree level there are the values of first index field, at the second - the second and so on. Such structure allows achieving high efficiency when searching by index. In addition, it ensures the absence of index productivity degradation with increase of the amount of data.

However, the index of such structure can be obviously used only in specific way. Firstly, it is required to search by the value of first index field, then - second and so on. If, for example, the condition by the first index field is not specified, the index will not be able to ensure the quick search. If the condition is specified by several first index fields and then one or more index fields is not specified, the index can be used only partially.

Examples
In configuration the accumulation register OrdersRegistered is described:

Publications: Typical reasons of non-optimal query operation and optimization methods

1С:Enterprise platform will automatically create for the table of remains of this register the index by period and all measures in the order in which they are listed in the designer

Consider several examples of queries and analyse whether the will be able to be executed in optimal way in such data structure.

Query 1

SELECT
	OrdersRegisteredBalance.Company,
	OrdersRegisteredBalance.Order,
	OrdersRegisteredBalance.Product,
	OrdersRegisteredBalance.QuantityBalance,
	OrdersRegisteredBalance.ShippedBalance,
	OrdersRegisteredBalance.InvoicedBalance
FROM
	AccumulationRegister.OrdersRegistered.Balance(, Product = &Product) 
AS OrdersRegisteredBalance

In this case, the requirement 2 is broken. There is no selection in the condition by the first index fields (Company and Order). Such query cannot be executed in the optimal way. For its execution DBMS server will have to search (scan) all the table records. An execution time of this operation depends directly on the number of records in the table of register remains and can be very large (and will increase with increasing amount of data).

Optimization options:

  • Index measure «Product»
  • Set measure «Product» first in the list of measures. Be careful when using this method. Other queries can be in the configuration that may slow down as a result of this exchange.

Query 2

SELECT
	OrdersRegisteredBalance.Company,
	OrdersRegisteredBalance.Order,
	OrdersRegisteredBalance.Product,
	OrdersRegisteredBalance.QuantityBalance,
	OrdersRegisteredBalance.ShippedBalance,
	OrdersRegisteredBalance.InvoicedBalance
FROM
	AccumulationRegister.OrdersRegistered.Balance(
			,
			Company = &Company
				AND Product = &Product) AS OrdersRegisteredBalance

In this case, the requirement 3 is broken. Between the measures «Company» and «Product» in the register structure there is a measure «Order» which is not specified in the query condition. This query will not be also executed in the optimal way. During its execution, DBMS will search by the first index field, but later will scan by force some its part. Scanning will increase the time of query execution and lock the redundant records in the table, that is, will reduce the overall system capacity.

Optimization options:

  • Add to the query the condition by measure «Order»
  • Remove from the query the condition by measure «Product»
  • Move «Order» from the measures to the attributes
  • Swap the measures «Order» and «Product

Query 3

SELECT
	OrdersRegisteredBalance.Company,
	OrdersRegisteredBalance.Order,
	OrdersRegisteredBalance.Product,
	OrdersRegisteredBalance.QuantityBalance,
	OrdersRegisteredBalance.ShippedBalance,
	OrdersRegisteredBalance.InvoicedBalance
FROM
	AccumulationRegister.OrdersRegistered.Balance(
			,
			Company = &Company
				AND Order = &Order) AS OrdersRegisteredBalance

In this case, the requirements of index and query correspondence are not broken. This query will be executed by DBMs with optimal way. Note that the order of conditions in the query does not have to match the order of fields in the index. This is not a problem and will be normally processed by DBMS.

Use of subqueries in the join condition

Recommendations

Do not use the subqueries in the join conditions. This can lead to a significant slowdown of the query and (in some cases) to its complete failure in some DBMSs. Query example with the use of subquery in the join condition:

SELECT
	OrdersRegisteredTurnovers.Company,
	OrdersRegisteredTurnovers.Order,
	OrdersRegisteredTurnovers.Product,
	OrdersRegisteredTurnovers.QuantityTurnover,
	OrdersRegisteredTurnovers.QuantityReceipt,
	OrdersRegisteredTurnovers.QuantityExpense,
	PriceList.Price
FROM
	AccumulationRegister.OrdersRegistered.Turnovers(, , , Company = &Company) AS OrdersRegisteredTurnovers
		INNER JOIN InformationRegister.PriceList AS PriceList
		ON OrdersRegisteredTurnovers.Product = PriceList.Product
			AND (PriceList.Period IN
				(SELECT
					MAX(PriceList.Period) AS Period
				FROM
					InformationRegister.PriceList AS PriceList
				WHERE
					PriceList.Product = &Product
				GROUP BY
					PriceList.Product))

In this case, the subquery in the join condition is used to get a kind of “last cut” at the end of previous period. Moreover, for each nomenclature the period can be different. Such query is recommended to rewrite with the use of temporary tables. For example, this can be done as follows:

// Get all data about orders
SELECT
	OrdersRegisteredTurnovers.Company,
	OrdersRegisteredTurnovers.Order,
	OrdersRegisteredTurnovers.Product,
	OrdersRegisteredTurnovers.QuantityTurnover,
	OrdersRegisteredTurnovers.QuantityReceipt,
	OrdersRegisteredTurnovers.QuantityExpense
INTO Orders
FROM
	AccumulationRegister.OrdersRegistered.Turnovers(, , , Company = &Company) AS OrdersRegisteredTurnovers
;
////////////////////////////////////////////////////////////////////////////////
// Get all prices by product from previous table
SELECT
	PriceListSliceLast.Product,
	PriceListSliceLast.Price
INTO Prise
FROM
	InformationRegister.PriceList.SliceLast(
			,
			Product IN
				(SELECT
					T.Product
				FROM
					orders AS T)) AS PriceListSliceLast
;
////////////////////////////////////////////////////////////////////////////////
// Join tables and get final result
SELECT
	Orders.Company,
	Orders.Order,
	Orders.Product,
	Orders.QuantityTurnover,
	Orders.QuantityReceipt,
	Orders.QuantityExpense,
	Prise.Price
FROM
	Orders AS Orders
		LEFT JOIN Prise AS Prise
		ON Orders.Product = Prise.Product

Receiving data through the point from the fields of composite type

Recommendations

If the query is used to obtain the value through the point from the field of composite type, then during execution of this query, the join with all tables of objects will be performed included in this composite type. As a result, SQL text becomes extremely complex, and during its execution DBMS optimizer may select non-optimal plan. This can lead to the serious performance problems and even to the query failure in certain cases.

In particular, it is not recommended to access the register recorder attributes (for example, “OrdersRegisteredTurnovers.Recorder.Date”), etc. In this case, it does not matter in what part of query you use an attribute received through the point from the field of composite type - in the list of returned values, in the condition, etc. In all cases, this access can lead to the performance problems.

The general recommendation is to limit as far as possible the number of joins in such queries. To do this, it is possible to use the following methods:

  • Avoid redundancy when creating the fields of composite reference types. Specify exactly as possible types for this fields as necessary. Do not use without the need the types “any reference” or “reference to document”, etc. Instead, you should consider more carefully the applied logic and assign for the field exactly those possible types of references that are needed to solve the task.
  • If necessary, sacrifice compactness of data storage for the performance. If you needed the value in the query obtained using the reference, then perhaps this value can be stored directly in this object. For example, if during the work with register you need information about the recorder date, you can add to the register the corresponding attribute and assign to it the value when conducting the documents. This will lead to duplication of information and some (slight) increase of its volume, but can significantly improve the performance and stability of query operation.
  • If necessary, sacrifice the code compactness and versality for the performance. Usually, to execute a particular query, all the possible types of this reference are not required in theses conditions. In this case, you should limit the number of possible types using the function CAST. If this query is universal and used in several different situations (where the reference types can be different), then you can generate the query dynamically substituting in the function CAST the type required in these conditions. This will increase the size of initial code and will probably make it less universal, but can significantly improve the performance and stability of query operation.
Example

in this query the access to recorder attributes is used. The recorder is a field of composite type which can take the values of reference to one of 56 types of documents.

SELECT
	OrdersRegistered.Recorder.Date,
	OrdersRegistered.Recorder.Number,
	OrdersRegistered.Company,
	OrdersRegistered.Order,
	OrdersRegistered.Product,
	OrdersRegistered.Quantity
FROM
	AccumulationRegister.OrdersRegistered AS OrdersRegistered
WHERE
	...

SQL-text of this query will include 56 left joins with document tables. This can lead to the serious performance problems when executing the query. However, To solve this particular task, there is no need to join with all 56 types of documents. For example, the query conditions are such that during its execution only the records of documents “SalesInvoice” and “SalesOrder” will be selected. In this case, we can greatly speed up the query operation by limiting the number of joins using the function CAST().

SELECT
	CASE
		WHEN OrdersRegistered.Recorder REFS Document.SalesInvoice
			THEN CAST(OrdersRegistered.Recorder AS Document.SalesInvoice).Number
		WHEN OrdersRegistered.Recorder REFS Document.SalesOrder
			THEN CAST(OrdersRegistered.Recorder AS Document.SalesOrder).Number
	END AS Number,
	CASE
		WHEN OrdersRegistered.Recorder REFS Document.SalesInvoice
			THEN CAST(OrdersRegistered.Recorder AS Document.SalesInvoice).Date
		WHEN OrdersRegistered.Recorder REFS Document.SalesOrder
			THEN CAST(OrdersRegistered.Recorder AS Document.SalesOrder).Date
	END AS Date,
	OrdersRegistered.Recorder.Date,
	OrdersRegistered.Recorder.Number,
	OrdersRegistered.Company,
	OrdersRegistered.Order,
	OrdersRegistered.Product,
	OrdersRegistered.Quantity
FROM
	AccumulationRegister.OrdersRegistered AS OrdersRegistered
WHERE
	OrdersRegistered.Company = &Company

This query is more bulky and perhaps less universal (it will not work correctly for other situations - when other values of recorder types are possible). However, during its execution a SQL query will be generated which will contain only two joins with document tables. such query will work mush faster and more stable than the query in its original form.

Filtering virtual tables without the use of parameters

When using virtual tables in the queries, you should pass in the parameters of tables all conditions associated with this virtual table. it is not recommended to filter virtual tables using the conditions in section WHERE, etc. Such query will return correct (in terms of functionality) result, but it will be much harder for DBMS to select the optimal plan for its execution. In some cases this can lead to the errors of DBMS optimizer and significant slowdown of query operation.

For example, the following query uses section WHERE of query to select from virtual table:

SELECT
	OrdersRegisteredTurnovers.Company,
	OrdersRegisteredTurnovers.Order,
	OrdersRegisteredTurnovers.Product,
	OrdersRegisteredTurnovers.QuantityTurnover,
	OrdersRegisteredTurnovers.QuantityReceipt,
	OrdersRegisteredTurnovers.QuantityExpense
FROM
	AccumulationRegister.OrdersRegistered.Turnovers AS OrdersRegisteredTurnovers
WHERE
	OrdersRegisteredTurnovers.Company = &Company

It is possible that as a result of this query execution firstly all the records of virtual table will be selected and then a part will be selected from them that corresponds to a given condition. It would be the best option to limit the number of selected records at the most early stage of query processing. For this, it is required to pass the conditions in the parameters of virtual table.

SELECT
	OrdersRegisteredTurnovers.Company,
	OrdersRegisteredTurnovers.Order,
	OrdersRegisteredTurnovers.Product,
	OrdersRegisteredTurnovers.QuantityTurnover,
	OrdersRegisteredTurnovers.QuantityReceipt,
	OrdersRegisteredTurnovers.QuantityExpense
FROM
	AccumulationRegister.OrdersRegistered.Turnovers(, , , Company = &Company) 
AS OrdersRegisteredTurnovers
Click to rate this post!
[Total: 0 Average: 0]

Leave a Reply

Your email address will not be published. Required fields are marked *