Monday, February 13, 2012

About Indexes & Where caluse

Hi,

I am struggling with the sequence of parameters in my where clause.

In my databse table i have index on Broadcast_Date(some table field).The Index also include some other parameters which might become part of where clause.

The first field of the index is broadcast date.

So I want to know whether it is always compoulsory to have broadcast date as my first field of where clause.

Will it not scan the index if my where clause is starting with any other field which is also part of index.?

Will it hamper the query performance any way ?

Any help will really be appreciated.

Thanks

Vyanki

Hello,

the order of conditions in your where clause has no impact on index selection. Write in any way, it doesn't matter.

SQL chooses indexes(execution plan) based on the "logic" of your query and the information(statistics) that it available about the indexes that might be used to answer your query.
First, the SQL query is compiled into a "Query tree". This is a tree of algebraic operations, each of them has its corresponding SQL-counterpart.
Then, the query optimizer makes transformations of the query tree, and obtains new trees. These new trees are semantically equivalent to the initial tree, but they have different costs in terms execution.The cost of execution of each tree obtained is evaluated, based on index statistics. The one with the least cost is chosen as query plan.
Example. The query
select * from Table1 t1 join Table2 t2 on t1.fISN=t2.fISN join Table3 t3 on t2.fCODE=t3.fCODE
can be executed at least in 2 ways: first join Table2 and Table3, then join the resulting temporary table with Table1, or the other way around-join Table1 and Table2, then join the result with Table3. The optimiser considers both opportunities and chooses the "cheaper" one.

P.S. If it is permitted in this forum, I can post links to some books/external sources that cover this topic thoroughly.

|||

No, the order is pretty much meaningless (it is completely meaningless until you have tons of joins and tons of search arguments (tons being a scientific term meaning a lot for the hardware you are using.)

Where broadcast_date = '20060101'
and other_column = 'othervalue'

is identical to:

Where other_column = 'othervalue'
and broadcast_date = '20060101'

because they are mathematically the same, the optimizer can choose to evaluate them in any order.

|||Yup. When you have really big number of tables and conditions involved in your query, there are many indexes that might be used to resolve the query, then the optimizer might consider not looking all possible execution plans. This is due to the number of such plans can be enormously big, and iterating through these plans means that a significant amount of time might be spent just to choose the appropriate plan.
In such cases, the optimizer uses heuristics to prune the number of possible execution plans. These heuristics might be sensitive to the order of conditions etc. in the query code.|||

And I missed this part before:

P.S. If it is permitted in this forum, I can post links to some books/external sources that cover this topic thoroughly.

It is strongly encouraged to do this (as long as it isn't TOO self serving. Like just posting an answer that states "buy my book, it covers this" would seem wrong, but mentioning books, resources, etc that cover the topic along with some explanation is the best of all possiblities.) More things to read the better!

Of course linking to free resources as the answer to a question is often the best way to go to, there are a lot of good resources our there that answer the questions that have been asked by everyone at one time (me included :)

Louis

|||

I think your question is the following:

index on columns (a, b)

what happens if the WHERE clause is:

WHERE b = 2

In this case, the index cannot be used to efficiently seek to the matching rows. The distribution statistics are maintained only for the primary column that is specified in a multi-column index. So you need to specify at least that to be able to use the index. Otherwise, SQL Server will pick a plan that will scan the entire index/table to find the matching rows. Additionally, you can specify some bogus condition like:

where a > 0 and b = 2

which will force the index to be used indirectly. But this might not be a good option since it will anyway result in scanning the entire index and the plan that scans the table directly might be more beneficial.

|||

Hi ,

Its really good to see lots of responses. Thanks all.

So rading all the replies, I drawn a conclusion that, putting the sequence of where clause same as index will help if you have lots of join and lagrge table size.

Thanks & Rgds

vyanki

|||

Change will to may...

...same as index may help if you have...

any you have it. And it is not the table size, it is the complexity of the query. table sizing is all taken care of by statistics which are very fast to work with. It is just that as you add more and more joins, the possible permutations of orders to evaluate criterium grows very large and unwieldly to check every possible combination. And the likely first candidate for an order of execution will be the order you code things in. If this expected cost is less than the expected cost of doing more optimizations, it might just be that the plan will coincide with the order of the tables.

All this to say, don't worry about ordering of where or join clauses for the most part. This is almost never an issue (but it is one of those fun facts that it is good to have in your head when you need it.)

|||If I read your question it is possible you are asking the question that Umachandar answered rahter than the one Louis answered. This is critical. If you have only a concatinated index and you are not querying agaist the first element in the concatination then the index will not be used at all.

Also, the order of elements in the clauses of your SQL statement can alter the execution plan. In many cases the default plan is "good enough" but a hand tuned plan can have a huge performance impact. I have personally seen hand tuned SQL result in orders of magnitude response improvemnts.

|||

Hi,

Thanks for your suggestions.

But what I observed ,even if you have concateed index(having index on more than one field) the index get used when the first field in where clause is different than first field in index defination.

Thanks

Vyanki

|||Even if the index is used, it will be a scan not a seek. So you are essentially scanning the entire index (or table if it is clustered) to get the matching rows. You need to specify the primary column of the index to make use of efficient seeks.

No comments:

Post a Comment