Thursday, 12 April 2012

Write better sql queries

Have you asked yourself, if the sql queries you write are efficient? Do they return the results in a fastest way possible? After writing every query I always ask such questions to myself and in the quest of making my queries more efficient I learned some techniques which I'm gonna share with you.

There are many ways of optimizing the sql queries you write, I'm gonna explain some of them which I find are actually critical to your query performance.

1. Order of the search conditions in your where clause.
2. Database indexing.

1. Order of the search conditions in your WHERE Clause

To understand this concept you have to first understand how does the sql query engine works? How does it executes the sql statement and more importantly how does it evaluates the where conditions?

So here is the order in which the various parts of the sql statements are executed.

Your FROM Clause gets executed first.
Then comes your WHERE clause.
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause

So how will you use this information to optimize your queries? Well read on...

As the FROM Clause executes first we can narrow down our result set by specifying some additional on rules.
For e.g.
See the below query:
Select * from
Employees
left join
 EmployeeAddress
on
Employees.EmployeeID=EmployeeAddress.EmployeeID
AND
Employees.IsActive=1
Where
………….

Have you observed the text in bold? I actually narrowed down my result-set (the number of rows returned by executing the from clause) by removing the Employees which are not active.
The sql engine now will send this result set to be evaluated against the conditions in the where clause.

Where Clause (I. M. P.)
The sql query engine executes the conditions specified in the where clause from left to right. The leftmost condition will be evaluated first.
For e.g.

Select * from Employees where FirstName=’Ajit’ and Age=27

The sql engine searches for all the rows where FirstName column contains ‘Ajit’. And then searches in those rows (returned from the first condition) employees whose age is 27.

So from the above understanding you can structure your where clause more cleverly. The funda is, place those conditions first which will return the fewest results possible.

Let me explain how:
Lets say you have a table of Employees which contains 5000 rows.
Now you are asked to show only those employees who are currently working with the company (Active) and above the age of 50 and having a first name say again ‘Ajit’.
Here is how I’ll write the query.

Select FirstName, LastName from Employees where
FirstName=’Ajit’ and Age > 50 and IsActive=1

Why?

With little judgment you will come to know that among the 5000 employees there will be very few people who will have there first name ‘Ajit’. Say 230 people.

From the returned results (230) and as per the average 60 years of retirement age there will be few people above 50 years of age. So that will narrow down the result to 85 people.
In the end with those few records (85) it is easy to check which ones are currently working (active).
Finally you are left with 53 people.

So what if you form your query like this

….. where IsAcive=1 and Age > 50 and FirstName=’Ajit’

Firstly after the execution of the first condition (IsActive=1) you will get a fairly large result set of all the current employees. Say 3800 people.
Then among those results (3800) the query engine will have to find all the people whose age is above 50. It comes down to say 1260 people.

From those it will find the results having first name ‘Ajit’.
You are left with 53 people.

Which one do you think is faster, huh?

2. Database indexing.

Coming Soon......................

1 comment: