Sql Edition Selection: effect of more column in where clause

what are the effects of having number of columns in WHERE clause, ie: if we use more columns in where clause, what will be its impact on performance.

this is important for me to design queries

having a lot of unneccessary where clauses will surely slowdown the query if there are.

the best thing that you can do is to streamline the logic into its simplest form

|||Perhaps I misunderstand the question; my normal experience is that if the additional columns contribute to filtering out unwanted records that the additional columns in the where clause usually enhance performance.|||

yeah thats right but a poorly written where clause will definetely slowdown performance.

The trick is to use the best suited functions and keywords in the where clause

here some useful link:

http://www.sql-server-performance.com/transact_sql.asp

regards,

joey

|||

Hi joeydj,

what do you mean by "the trick is to use the best suited where clause by making use of the powerful sql server functions"?

AMB

|||

oh sorry. got a grammar problem

here i have it corrected

"The trick is to use the best suited functions and keywords in the where clause"

hmmm... thats better..

example:

select ... from

where x=1 and x=2 and x=7 and x=9
maybe written as
where x in (1,2,7,9)

|||

Joey is normally better than that; please cut him some slack. I think what he means is

select ... from

where x=1 OR x=2 OR x=7 OR x=9
maybe written as
where x in (1,2,7,9)

|||

hahaha. not been here for sometime.

hmmm thanks kent.

|||

Not to mention that there is no performance benefit in choosing one of these as opposed to the other, anyway. (There is a readability benefit, and the two forms will behave differently if you carelessly tack on AND <another condition> to the query.

Steve Kass

Drew University

http://www.stevekass.com

|||

Can you give an example of the choices you have to make? Usually if you add "more columns in where clause", you change the meaning of the query (but not always), and the first goal of designing a query is for it to ask the right question...

Steve Kass

Drew University

http://www.stevekass.com

|||

One thing that people sometimes forget is that a full table scan can sometimes be better than an indexed scan. If you are ultimately going to read every block of data from disk via an index scan then you might want to forgo the index. Modern databases with good statistics can usually establish an execution plan that is "good enough" but it can still pay to understand your data.

Here is a quick example.

Let's say I have a database of people and for some reason their geneder in overwhelmingly biased in one direction. An optimizer might look at the number of unique values for gender and assume a 50/50 split in the data. It might make good sence to always use an index on gender to access the rows of the table from the optimizer's best guess. However, that might not be the case in practice.

Let's say the table is 100k records with :

90% F

10% M

To find the men an indexed lookup on gender is probably a good thing. On the other hand an indexed lookup to find women will result in more disk IO and slower performance than a full table scan (not counting your network).

Knowing this distribution ahead of time might lead someone looking for all women to do something like

Select * from employees where gender + '' = 'F'

|||

thanks for all your views.

let me give u a specific example.

my database has "branch name" field in all the tables. and we have seperate copy of database for each branch. so a particular branch user will connect to his branch, which has records only for that branch.

in this situation, there is no need to filter the records again with "branch name", but if there is no performance issue , i wish to include it in the WHERE clause to be 100% sure that all the records that the query output does have the same branch name.

|||

In your situation. I think you could add "branch name" column without perfomance issue. But you need to create index or statictics for this column. As a result query optimizer understands that "branch name" same for all records and doesn't use them for plan.

But it any case your could check query plans and only after this decide

|||

This is a second good example of what I am talking about.

If your database is physically segragated by branch already then you want to make sure you know what the optimizer is doing when you add a "failsafe"

Where BranchId = 10

or alternatively

Where BranchName = 'Downtown'

to your queries.

If adding that clause causes the optimizer to include an index on BranchId as part of the execution plan then you will hurt your performance as a result of extra disk i/o and memory use. This might or might not be a concern for your infrastructure. Though it sounds like if you are physically partitioning your database by branch then you might be concerned about database performance.

Although it might seem counter intuative, it might be a good idea to experiment with:

Where BranchId + 0 = 10

or alternatively

Where BranchName + '' = 'Downtown'

This would ensure than no index on BranchId could be used in the execution plan.

|||

Dear AMERMSAMER,

The bottom line is that having more columns in the WHERE clause does not necessarily slow down performance and may, in fact, actually improve performance. It all depends on the indexes on the table and the "selectivity" of the columns in the WHERE. If there is a unique key and all of the columns in the key are "covered" by the WHERE clause, no more may be needed. SQL will try to optimize the query by using these columns where available.

Friday, February 17, 2012

effect of more column in where clause

No comments:

Post a Comment

Sql Edition Selection

Blog Archive

About Me