Oct 302012
 

My co-worker Steve Wright (blog | @SQL_Steve) prodded me with a question recently on a strange result he was seeing. In order to test some functionality in our latest tool, SQL Sentry Plan Explorer PRO, he had manufactured a wide and large table, and was running a variety of queries against it. In one case he was returning a lot of data, but STATISTICS IO was showing that very few reads were taking place. I pinged some people on #sqlhelp and, since it seemed nobody had seen this issue, I thought I would blog about it.

TL;DR Version

In short, be very aware that there are some scenarios where you can't rely on STATISTICS IO to tell you the truth. In some cases (this one involving TOP and parallelism), it will vastly under-report logical reads. This can lead you to believe you have a very I/O-friendly query when you don't. There are other more obvious cases – such as when you have a bunch of I/O hidden away by the use of scalar user-defined functions. We think Plan Explorer makes those cases more obvious; this one, however, is a little trickier.

The problem query

The table has 37 million rows, up to 250 bytes per row, about 1 million pages, and very low fragmentation (0.42% on level 0, 15% on level 1, and 0 beyond that). There are no computed columns, no UDFs in play, and no indexes except a clustered primary key on the leading INT column. A simple query returning 500,000 rows, all columns, using TOP and SELECT *:

SET STATISTICS IO ON;
 
SELECT TOP 500000 * FROM dbo.OrderHistory 
WHERE OrderDate < (SELECT '19961029');

(And yes, I realize I am violating my own rules and using SELECT * and TOP without ORDER BY, but for the sake of simplicity I am trying my best to minimize my influence on the optimizer.)

Results:

(500000 row(s) affected)
Table 'OrderHistory'. Scan count 1, logical reads 23, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

We're returning 500,000 rows, and it takes about 10 seconds. I immediately know that something is wrong with the logical reads number. Even if I didn't already know about the underlying data, I can tell from the grid results in Management Studio that this is pulling more than 23 pages of data, whether they are from memory or cache, and this should be reflected somewhere in STATISTICS IO. Looking at the plan…

Original plan (click to enlarge)

…we see parallelism is in there, and that we've scanned the entire table. So how is it possible that there are only 23 logical reads?

Another "identical" query

One of my first questions back to Steve was: "What happens if you eliminate parallelism?" So I tried it out. I took the original subquery version and added MAXDOP 1:

SET STATISTICS IO ON;
 
SELECT TOP 500000 * FROM dbo.OrderHistory 
WHERE OrderDate < (SELECT '19961029') OPTION (MAXDOP 1);

Results and plan:

(500000 row(s) affected)
Table 'OrderHistory'. Scan count 1, logical reads 149589, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Results with MAXDOP 1 (click to enlarge)

We have a slightly less complex plan, and without the parallelism (for obvious reasons), STATISTICS IO is showing us much more believable numbers for logical read counts.

What is the truth?

It's not hard to see that one of these queries is not telling the whole truth. While STATISTICS IO might not tell us the whole story, maybe trace will. If we retrieve runtime metrics by generating an actual execution plan in Plan Explorer, we see that the magical low-read query is, in fact, pulling the data from memory or disk, and not from a cloud of magic pixie dust. In fact it has *more* reads than the other version:

Runtime results of both queries

So it is clear that reads are happening, they're just not appearing correctly in the STATISTICS IO output.

What is the problem?

Well, I'll be quite honest: I don't know, other than the fact that parallelism is definitely playing a role, and it seems to be some kind of race condition. STATISTICS IO (and, since that's where we get the data, our Table I/O tab) shows a very misleading number of reads. It's clear that the query returns all of the data we're looking for, and it's clear from the trace results that it uses reads and not osmosis to do so. I asked Paul White (blog | @SQL_Kiwi) about it and he suggested that only some of the pre-thread I/O counts are being included in the total (and agrees that this is a bug).

If you want to try this out at home, all you need is AdventureWorks (this should repro against 2008, 2008 R2 and 2012 versions), and the following query:

SET STATISTICS IO ON;
DBCC SETCPUWEIGHT(1000) WITH NO_INFOMSGS;
GO
 
SELECT TOP (15000) * 
FROM Sales.SalesOrderHeader 
WHERE OrderDate < (SELECT '20080101');
 
SELECT TOP (15000) * 
FROM Sales.SalesOrderHeader 
WHERE OrderDate < (SELECT '20080101') 
OPTION (MAXDOP 1);
 
DBCC SETCPUWEIGHT(1) WITH NO_INFOMSGS;

(Note that SETCPUWEIGHT is only used to coax parallelism. For more info, see Paul White's blog post on Plan Costing.)

Results:

Table 'SalesOrderHeader'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scan count 1, logical reads 333, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Paul pointed out an even simpler repro:

SET STATISTICS IO ON;
GO
 
SELECT TOP (15000) *
FROM Production.TransactionHistory
WHERE TransactionDate < (SELECT '20080101')
OPTION (QUERYTRACEON 8649, MAXDOP 4);
 
SELECT TOP (15000) *
FROM Production.TransactionHistory AS th
WHERE TransactionDate < (SELECT '20080101');

Results:

Table 'TransactionHistory'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TransactionHistory'. Scan count 1, logical reads 110, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

So it seems that we can easily reproduce this at will with a TOP operator and a low enough DOP. I've filed a bug:

And Paul has filed two other somewhat-related bugs involving parallelism, the first as a result of our conversation:

(For the nostalgic, here are six other parallelism bugs I pointed out a few years ago.)

What is the lesson?

Be careful about trusting a single source. If you look solely at STATISTICS IO after changing a query like this, you may be tempted to focus on the miraculous drop in reads instead of the increase in duration. At which point you may pat yourself on the back, leave work early and enjoy your weekend, thinking you have just made a tremendous performance impact on your query. When of course nothing could be further from the truth.

  5 Responses to “Beware misleading data from SET STATISTICS IO”

  1. Excellent post, Aaron, and very revealing. SET STATISTICS IO is one of my favorite tuning techniques. So now I know that I have to use it with a more skeptical eye.

    But you also highlight another tuning axiom that I sometimes forget – true performance problems are best revealed by cross-checking the performance metrics across more than one source. Any time you rely on just one source of tuning information, you're bound to encounter a weakness at some point or another.

    So it's better to use a combination of sources when tuning true important aspects of system performance.

    -Kev

  2. Great post, very interesting. What happens if you cold-start the question, do you get the same effect in the physical and read-ahead reads?
    Regards Ove

    • Thanks Ove, I haven't tested that yet. However, I'm not sure how relevant that would be, since 99% of the time you're not running queries with a cold cache. :-)

  3. Nice post Aaron, as usual. I have noticed eroneous statistics IO reporting when accessing column store indexes too.

  4. Interesting post.

 Leave a Reply

(required)

(required)

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped="" cssfile="">