This article is a continuation of Cursors : An In Depth Look – Part 1
Testing Cursors and Set-Based Operations (Single Query Execution)
To help speed up the testing (not the process its self) you can check the box “Discard Results After Execution” option in SSMS. This will ensure that when your query executes it will throw away the data and will not generate the output, hence giving you a little more time for testing.
We are first going to setup our test environment and then we will scan the same amount of data with set-based code and a cursor. We will be testing with 1 million records, each row containing a bit more than 200 bytes.
Note: Testing results may vary on the computer you are using. Hard drive and processor differences will drastically change the results of these operations.
If you would like to follow along with the examples I have step by step instructions on what you need to do. I entered these examples as I was doing them in SSMS so there should not be anything missing.
Setup the Environment
We are going to create a temporary database for these examples. This same database will be used in all parts of this article.
Create the database:
USE Master IF EXISTS ( SELECT NULL FROM sys.databases WHERE name = 'CursorTest' ) BEGIN ALTER DATABASE CursorTest SET SINGLE_USER WITH ROLLBACK IMMEDIATE DROP DATABASE CursorTest END GO CREATE DATABASE CursorTest GO USE CursorTest
Next we are going to create a table that we will be using to scan data from..
IF OBJECT_ID('dbo.T1') IS NOT NULL DROP TABLE dbo.T1 GO CREATE TABLE T1 ( keycol INT NOT NULL IDENTITY(1, 1) ,filler CHAR(200) )
Okay, now we are going to insert the 1,000,000 records into this table. You can do this simply by adding an insert statement with GO 1000000 after the statement (by the way, this is an undocumented feature of TSQL).
INSERT INTO T1 (filler) VALUES ('a') GO 1000000
When you execute this statement you will see in your messages window, after it has finished, the following message:
Beginning execution loop Batch execution completed 1000000 times.
Now that you have a million rows, you want to apply a clustered unique index to the table. This will improve your performance greatly. Not only that, we don’t want to wait hours for a simple test like this.
CREATE UNIQUE CLUSTERED INDEX idx_keycol on dbo.T1(keycol);
Okay, we now have our data setup. We can write the code for the cursor and the set-based operation. Before we run the code we want to make sure that our cache is cleared so we do not use memory for this operation. In a production environment, your queries will most likely hit the cache, but for demonstration purposes, we want to create as much as a what-if environment as possible.
So, how do you clear the cache in SQL? Run the following statement:
DBCC DROPCLEANBUFFERS;
We will be using this statement quite a bit during this process, so if I am referring to the “Clear Cache” statement, we will be running this statement.
Let the Testing Begin
Okay, we are finally to the fun part. The actual testing. We are going to run a query against what is called “cold cache” and then run the same query against what is called “warm cache”.
Make sure you have executed your clear cache statement first.
Execute the following query:
SELECT keycol, filler FROM dbo.T1
When running the query against cold cache, it took 6 seconds on my machine to execute.
Note: If you had not enabled the “Discard Results After Execution" your query would have taken substantially longer to produce the results in your output window.
Execute the query again. You are now running this against the warm cache. This time it only took 1 second to execute on my machine. So, as you can see, cached queries will outperform non cached queries quite a bit.
The Cursor
Now we are going to test our cursor operation.
Clear your cache with the clear cache statement and write the following code. We are going to execute the code twice, once against cold cache and the other against the warm cache. Initially we are only going to look at the speed of the query execution. We will dive into the statistics a bit later.
DECLARE @keycol AS INT ,@filler AS CHAR(200) DECLARE C CURSOR FAST_FORWARD FOR SELECT keycol ,filler FROM dbo.T1 OPEN C FETCH NEXT FROM C INTO @keycol, @filler WHILE @@fetch_status = 0 BEGIN FETCH NEXT FROM C INTO @keycol, @filler END CLOSE C DEALLOCATE C
Running against cold cache on my machine this query took 27 seconds to execute. When I executed it again against the warm cache, it took 15 seconds. Using the warm cache you can see that the query exhibited the same type of speed increase based on our initial query (1 second result). However, as you can see, we have a significantly higher execution time for both the cold and warm cache for the same query. The only difference is we now have the overhead of the cursor.
If you really look at the cursor, you will notice that it is not doing anything. It is just a simple cursor that loops. There is no complex row by row manipulation happening. Imagine the overhead if we added more complex code to it.
Set-Based Operation
First we want to clear our cache (using the clear cache statement above). We then want to execute the following set-based operation query.
DECLARE @keycol AS INT, @filler AS CHAR(200) SELECT @keycol = keycol, @filler = filler FROM (SELECT TOP (1) keycol,filler FROM dbo.T1 ORDER BY keycol) AS D WHILE @@rowcount = 1 BEGIN SELECT @keycol = keycol, @filler = filler FROM (SELECT TOP (1) keycol, filler FROM dbo.T1 WHERE keycol > @keycol ORDER BY keycol ) AS D END
Against the cold cache, this process took 29 seconds. This was 2 seconds slower then a cursor against the cold cache! Against the warm cache it took 11 seconds, which comes out to 4 seconds faster. At this point you are probably thinking there is not much of a difference between the set-based operation and the cursor.
You may also be thinking that a cursor might be faster then a set-based operation. Well, as I pointed out initially, they can be faster in some operations. Looking at the queries we have here, they are not full of complex operations inside of the loops.
The important part here is the process is slightly cleaner then working with a cursor. Also, it is not just the speed of the execution that is important, it is what goes on behind the scenes. We are getting to that, just be patient.
Temporary Table Option (Set-Based Operation Modified)
There is another way to write this query (which is not in the book) by using a temporary table. Using a temporary table would allow you to gather specific data first versus doing key column identification on a table.
Here is how the temporary table would work.
DECLARE @Counter INT DECLARE @TempValue CHAR(1) CREATE TABLE #tmpCursor ( id INT NOT NULL IDENTITY(1, 1) PRIMARY KEY, filler CHAR(200) ) CREATE NONCLUSTERED INDEX [IX_CursorDump] ON #tmpCursor ( filler ASC ) INSERT INTO #tmpCursor SELECT filler FROM dbo.T1 TABLESAMPLE(10 PERCENT) SELECT @Counter = COUNT(*) FROM #tmpCursor PRINT @Counter WHILE( @Counter > 0 ) BEGIN SET @Counter = @Counter - 1 END DROP TABLE #tmpCursor
As you can see, I used the TABLESAMPLE at 10 percent. This will gather ~10% of the rows in the table. Which is what you are most likely going to do when working with row by row operations. Now, if you really did need to process every single row in our 1 million rows in the table, I would not recommend using a temporary table as the overhead of filling up the tempdb would be tremendous versus using the TOP operator.
Running this query on cold cache took 10 seconds and returned 97432 rows. On warm cache it took 9 seconds at 98800 rows. Not much of a difference because we are gathering a random 10% of rows. If we use the REPEATTABLE(SEED) and gather the same rows each time, cold cache took 10 seconds at 100738 rows and warm cache took 5 seconds resulting in the same rows.
As you can see, if you need to do row by row operations with a specific set of query based data, you can get substantial improvements to your set-based operations.
Continue to Cursors : An In Depth Look Part 3

[...] Cursors : An In Depth Look – Part 2 [...]