SQL Tips n Tricks: TSQL

Showing posts with label TSQL. Show all posts

Saturday, January 28, 2017

Reading an Execution Plan - Part 1

Execution plan is something similar to the Bible, everyone know it is there but very few reads and understand what is in there. We here lot of people saying "Why this query is slow" but if you ask them "Have you look at the execution plan?" you rarely get the answer "Yes".

So first of all what is Execution plan, in simple terms, it is the result of the query optimizer’s attempt to calculate the most efficient way to implement the request represented by the T-SQL query you submitted. In other words it says how the query was or will be executed in side SQL Server.So when you know how to read it you can find the places which are making the query slow.

Execution Plan Types

When you look at a execution plan of a particular query SQL Server presents two types of plans;

Estimated Execution Plan

When estimated execution plans are generated, the Transact-SQL queries or batches do not execute. Instead, the execution plan that is generated displays the query execution plan that SQL Server Database Engine would most probably use if the queries were actually executed.

Actual Execution Plan

When actual execution plans are generated, the Transact-SQL queries or batches execute. The execution plan that is generated displays the actual query execution plan that the SQL Server Database Engine uses to execute the queries.

Execution Plan Formats

In SQL Server execution plan can be viewed in three different formats;

Graphical Plan
Text Plan
XML Plan

Graphical Plan

These are quick and easy to read but the detailed data for the plan is masked. Both Estimated and Actual execution plans can be viewed in graphical format.

Text Plans

These are a bit harder to read, but more information is immediately available. There are three text plan formats:

SHOWPLAN_ALL : a reasonably complete set of data showing the Estimated execution plan for the query
SHOWPLAN_TEXT : provides a very limited set of data for use with tools like osql.exe. It too only shows the Estimated execution plan
STATISTICS PROFILE: similar to SHOWPLAN_ALLexcept it represents the data for the Actual execution plan

XML Plans

XML plans present the most complete set of data available on a plan, all on display in the structured XML format. There are two varieties of XML plan:

SHOWPLAN_XML : The plan generated by the optimizer prior to execution.
STATISTICS_XML : The XML format of the Actual execution plan.

Now let's look at a query and how it's execution plan looks like. I am using the AdventureWorks sample database for the examples.

Here I am writing a full select on person table. And you can see the Actual Execution Plan below in which it does a full Clustered Index Scan. This is because SQL Server has to return all the values in the table using the clustered index.

Let's write a query which will do a restrict (WHERE filter) to the primary key, now you can see it is using a Seek to the clustered index to get the value.

Scan operations are very costly and Seek operations are performance friends since they use Indexes to get information. Simply a Clustered index scan means a full table scan which is the most costly operation to retrieve data.

Now let's look at how a non-clustered index being used is shown in an execution plan. Below query is asking for the people who has PersonType attribute is set to 'EM'. In the table PersonType column does not have an index so the Query Optimizer has no any other option but to go for a full clustered index scan to get the results.

Now let's ask SQL Server to return people who has first name Alan. Person table has a composite index for FirstName, MiddleName and LastName. As you can see it uses a index scan to on that composite index first and then do a Key_Lookup on the clustered index to present the data.

Let's ask SQL Server to get the people who has name Dylan A Millar. In this scenario it uses an Index Seek on the composite index. Which is fully utilizing the index for optimal performance. In simple terms Index Seek is the most optimal way to access the data but in the previous scenario it had to scan through the whole index since the first name didn't have a proper index.

In conclusion you must look at the execution plan when optimizing a bad performing query. If a Index is missing you can simply identify it by looking at the execution plan as explained above. We will be discussing more about reading different type of execution plans in the next chapter of this article series.

Tuesday, September 13, 2011

Round UP with CEILING

Rounding up in TSQL is bit tricky but if you want just to round up a decimal value to the nearest integer then you can use tsql Math function CEILING function to achieve this easily.

CEILING function return the next integer that is larger than the number entered.

SELECT CEILING(43.98) -- Returns 44
SELECT CEILING(43.18) -- Returns 44
SELECT CEILING(0.18) -- Returns 1
SELECT CEILING(-0.18) -- Returns 0
SELECT CEILING(-43.18) -- Returns -43

MSDN reference : http://msdn.microsoft.com/en-us/library/ms189818.aspx

Wednesday, August 24, 2011

Formatting Dates in SSIS

When you want to format a date in a SSIS package it is not straightforward. However there are two ways you can do this.

First method is formatting the date from your TSQL. You can use the CONVERT() function to do this;

CONVERT(VARCHAR(20),GETDATE(),101) -- 08/25/2011
CONVERT(VARCHAR(20),GETDATE(),103) -- 25/08/2011
CONVERT(VARCHAR(20),GETDATE(),105) -- 25-08-2011
CONVERT(VARCHAR(20),GETDATE(),112) -- 20110825

For more formatting codes with CONVERT() function please refer this http://www.sql-server-helper.com/tips/date-formats.aspx

The second method is converting it in a expression in SSIS. Below is a sample code for that;

(DT_WSTR,4)YEAR(GETDATE()) + "/" + RIGHT(“00″ + (DT_WSTR,2)MONTH(getdate()),2) + "/" + RIGHT(“00″ + (DT_WSTR,2)DAY(getdate()),2)

This will format the date to the format of YYYY/MM/DD. You can switch the statements and format as you desires.

Please leave a comment if you have any questions regarding this.

Tuesday, August 16, 2011

Create table script from Excel file

In practice always the database design is not done using CASE tools. Some times there are projects comes from clients with database table definitions in Excel files. Trying to create tables from these Excel definitions can be a tedious task for large tables. So here is a spreadsheet sample template you can use to minimize your effort. This helped me to create few tables within matter of minutes as oppose to few hours it would have taken.

The A1 Cell should contain the table name and the last row's G column should contain a close bracket ")". After pasting your table structure in between simply copy the whole G column values and run it on the SQL Server database and you are done.

Download the template from here

Monday, August 15, 2011

Update sequential number column over another column values

This is something I had to do recently. If you don't get this right this can be really complex.

Lets say you have a table named Invoice which has data as in the below screenshot. So you want to create a new ID column which will be auto incremented for each Customer. i.e CustomerID 1 will have 1,2,3,4 values and then for CustomerID 2 it will again start from 1..

You can do this using a after update trigger. But in order to populate column for existing data what would you do. Most of you might say use a Cursor, but more often than not there is a SET based query we can use to do things Cursors do. So here also you can use ROW_NUMBER() OVER to accomplish this task.

Lets create the table for this example and lets keep it simple;

CREATE TABLE dbo.Invoice
(
InvoiceID INT IDENTITY(1,1) PRIMARY KEY,
CustomerID INT,
Amount MONEY,
PAID BIT DEFAULT 0
)

Now lets insert some test data;

INSERT INTO dbo.Invoice VALUES 
(1,200,0),
(1,250,0),
(1,400,0),
(1,320,0),
(2,120,0),
(2,50,0),
(3,10,0),
(4,150,0),
(4,100,0),
(4,140,0)

Now lets add the new column to be populated;

ALTER TABLE dbo.Invoice 
ADD CustomerInvoiceNumber INT

Now is the real query which populates the column based on the CustomerID;

;WITH CTE (InvoiceID,CustomerInvoiceNumber) AS 
(
SELECT InvoiceID,ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY InvoiceID) FROM dbo.Invoice 
)

UPDATE I 
SET CustomerInvoiceNumber = CTE.CustomerInvoiceNumber
FROM 
dbo.Invoice I INNER JOIN CTE ON I.InvoiceID = CTE.InvoiceID

We are using a CTE because we can not use PARTITION BY in the Update clause directly. This will update the CustomerInvoiceNumber Column with the desired values as below;

Thursday, August 11, 2011

Find column names in a table

Thought this might save sometime going through Books Online to find this; There are few ways to do this in SQL Server 2005 onwards.

Useing Information_Schema.Columns is the best out of those. This is because it returns results in a readable format. Below is how you can do this;

SELECT * FROM information_schema.columns  WHERE Table_Name = 'YourTableName'

Alternately you can use SP_Columns as well, usage is as below;

EXEC sp_Columns 'YourTableName'

In addition to this if you want to say get all the columns in a database which has a particular name or part of the name in it, this can be also achieved using a script like this;

SELECT * FROM information_schema.columns
WHERE Column_Name LIKE  '%url%'

This will return all columns which are having url as part of the name.

Friday, May 20, 2011

Find referenced tables of a stored procedure with sys.sql_expression_dependencies

During fixing one of the performance issues of a stored procedure I wanted to find the tables being used in this complex stored procedure. Doing this manually would have been a nightmare, but there is a system view (Microsoft called them as Object Catalog Views) which we can query to get these results.

If you want to get the referenced tables of a particular stored procedure you can use below query;

SELECT

OBJECT_NAME(referencing_id) AS SP_NAME,

referenced_class_desc,

referenced_database_name,

referenced_entity_name

FROM sys.sql_expression_dependencies

WHERE

OBJECT_NAME(referencing_id) = 'YourStoredProcedureName'

There are quite a bit of additional columns with additional details in the view. You can find more details from this MSDN article.

http://msdn.microsoft.com/en-us/library/bb677315.aspx

This is only available in SQL Server versions 2008 and upwards.

Wednesday, May 18, 2011

Using WAITFOR to run a SQL Statement on a given time

Recently I came across a situation where I wanted to run the same SP at the same exact time from several sessions. The purpose of this was to simulate and see if there are any blocking or deadlock situations. As this is not possible to do by manually pressing F5 from few sessions, I looked out for a solution.

And I found this WAITFOR function in SQL Server very much do the trick. WAITFOR function has two behaviors one we can give a DELAY, a specific number of hours:minutes:seconds to wait till it executes the next statement(s) in the batch. Or by using TIME we can provide a specific time to run the next statement(s) in the batch.

Usage would be as follows;

WAITFOR DELAY '00:00:30'

EXECUTE spSampleSP @ID = 23945

When we run the above script the process will wait 30 seconds before it executes second statements which in this case the stored procedure.

WAITFOR TIME '13:11:30'

EXECUTE spSampleSP @ID = 23945

When we use the above script with the TIME function it will wait till the system time reaches 1 PM, 11 minutes and 30 seconds to execute the stored procedure.

So I used WAITFOR TIME in my situation where I could set same time for few sessions to execute the same stored procedure and could simulate the scenario. Hope you will also find this useful.

LEFT OUTER JOIN Vs RIGHT OUTER JOIN Vs FULL OUTER JOIN

There seems to be some confusion among some people about LEFT JOIN (Same as LEFT OUTER JOIN) , RIGHT JOIN (Same as RIGHT OUTER JOIN) and FULL JOIN (Same as FULL OUTER JOIN). Let me explain these three with a simple example

Lets say we have two tables called tblCustomer and tblInvoice where as tblInvoice holds a foriegn key for tblCustomer in fCustomerID column. Below are the data exists in our two sample tables;

tblCustomer

tblInvoice

You may be already aware of the most popular JOIN type we use INNER JOIN (Or just JOIN). In INNER JOIN what happens is on the given conditional column data it displays matching rows from both the tables. Below is an example of INNER JOIN results on our two sample tables.

How to write multiple INSERT INTO statements in a single statement

This is a very simple trick but can save you some valubale time. As DBA's and Database Developers we write lot of INSERT INTO statements in our day to day life. So when we want to insert multiple rows we tend to copy and paste the same statement and change the values. An example would be somthing like this;

INSERT INTO tblCustomer (ID,Customer_Name, Credit_Limit, Date_Modified) VALUES (1, 'Kaminda', 2000, GETDATE());
INSERT INTO tblCustomer (ID,Customer_Name, Credit_Limit, Date_Modified) VALUES (2, 'Hemali', 8500, GETDATE());
INSERT INTO tblCustomer (ID,Customer_Name, Credit_Limit, Date_Modified) VALUES (3, 'Jude', 3450, GETDATE());
INSERT INTO tblCustomer (ID,Customer_Name, Credit_Limit, Date_Modified) VALUES (4, 'Charith', 4500, GETDATE());
INSERT INTO tblCustomer (ID,Customer_Name, Credit_Limit, Date_Modified) VALUES (5, 'Lolitha', 2300, GETDATE());

So to do this in a single statement there are two different ways, first one is for SQL Server 2005 and the second method which is much more easier is only supported by SQL Server 2008 upwards;

In SQL SERVER 2005, you can use UNION ALL to accomplish this;

INSERT INTO tblCustomer (ID,Customer_Name, Credit_Limit, Date_Modified)
SELECT 1, 'Kaminda', 2000, GETDATE()
UNION ALL
SELECT 2, 'Hemali', 8500, GETDATE()
UNION ALL
SELECT 3, 'Jude', 3450, GETDATE()
UNION ALL
SELECT 4, 'Charith', 4500, GETDATE()
UNION ALL
SELECT 5, 'Lolitha', 2300, GETDATE();

This is supported only SQL Server 2008 onwards;

INSERT INTO tblCustomer (ID,Customer_Name, Credit_Limit, Date_Modified) VALUES
(1, 'Kaminda', 2000, GETDATE()),
(2, 'Hemali', 8500, GETDATE()),
(3, 'Jude', 3450, GETDATE()),
(4, 'Charith', 4500, GETDATE()),
(5, 'Lolitha', 2300, GETDATE());

Thursday, May 5, 2011

Why we use ANSI_NULLS ON/OFF

You may have seen many times SET ANSI_NULLS ON statement on top of MS SQL Server stored procedures sometime on functions. Have you ever wondered what is it really means? Well if you havent you may be not reading this blog post.

So the main purpose of this option is to give SQL Server the ISO complient standard to its queries which uses Equal (=) and Not Equal (<>) operators on nullable columns. We should note using these two operators on nullable columns is not ANSII (or SQL 92) standartd. If you set this option ON then in SELECT statements with any conditions in the WHERE cluase which uses column = NULL or column <> NULL can get affected.. These conditions will not return any data if there are any NULL values in that column.

I know this is not that clear to you, so lets do a simple example to clear things out.

First we'll create a sample table and populate it with some test data;

CREATE TABLE tblTestNulls (ID INT IDENTITY NOT NULL PRIMARY KEY, [Name] VARCHAR(20) NULL)

INSERT INTO tblTestNulls
SELECT 'Johon' UNION ALL
SELECT 'Mayor' UNION ALL
SELECT NULL UNION ALL
SELECT NULL Then we'll run below two queries with ANSI_NULLS ON.

SET ANSI_NULLS ON
SELECT * FROM tblTestNulls WHERE [Name] = NULL
SELECT
* FROM tblTestNulls WHERE [Name] <> NULL

As you can see it does not return anything since there are NULL values in Name column and we have set ANSI_NULLS ON.
So we'll run the same queries with ANSI_NULLS OFF
SET ANSI_NULLS OFF
SELECT * FROM tblTestNulls WHERE [Name] = NULL SELECT * FROM tblTestNulls WHERE [Name] <> NULL

So here we get the desired results. If we dont set any option for ANSI_NULLS default would be database default, usually OFF unless you have changed it. The database's default ANSI_NULLS settings you can find under database properties then going to options page.

Also if you use ANSII syntax for NULL comparison such is IS NULL you can get the desired results regardless of whether ANSII_NULLS is ON or OFF. So you can write our two queries as below;

SELECT * FROM tblTestNulls WHERE [Name] IS NULL
SELECT * FROM tblTestNulls WHERE [Name] IS NOT NULL

This is one of the good example why it is always better to use ANSII syntax in Transact SQL.
Conclusion here is dont use SET ANSII_NULL ON simply because it is there in the SQL Server template when you try to create a Stored Procedure or Function. If it is ON adhere to the ANSII syntaxes. Also it is important if your queries run on hetrogenius databases.

Please feel free to leave your comments.