Sep 182013
 

SQL Server 2014 CTP1 introduces low priority lock wait options for use with online index operations and partition switches.

For those leveraging online index management or index partitioning and partition switch operations in SQL Server 2012 Enterprise Edition, you may have at one point experienced blocking of your DDL operation as these operations still have some locking requirements.

To illustrate, imagine that I execute the following single-partition online index rebuild in SQL Server 2014 CTP1:

ALTER INDEX [ClusteredIndex_on_ps_ShipDate]
ON [dbo].[FactInternetSales]
REBUILD PARTITION = (37)
WITH (ONLINE= ON);

And let’s take a look at the locks acquired and released during this rebuild operation using Extended Events and the following session definition (this is target-less session and I watched the results via the “Watch Live Data” pane in SQL Server Management Studio):

CREATE EVENT SESSION [Online_Index_Rebuild_Locks_Taken] ON SERVER 
ADD EVENT sqlserver.lock_acquired(
    WHERE ([object_id]=(309576141))),
ADD EVENT sqlserver.lock_released(
    WHERE ([object_id]=(309576141)))
WITH 
(
  MAX_MEMORY=4096 KB, EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
  MAX_DISPATCH_LATENCY=30 SECONDS, MAX_EVENT_SIZE=0 KB,
  MEMORY_PARTITION_MODE=NONE, TRACK_CAUSALITY=OFF, STARTUP_STATE=OFF
);
GO

The value 309576141 represents the object ID of the FactInternetSales table.

My online index rebuild of a single partition took 56 seconds to complete and after completion, I saw the following lock acquire and release activity:

Lock activity for single-partition online rebuild
Lock activity for single-partition online rebuild

As you can see from the output, although the rebuild is an online operation, it does involve the acquiring of locks in various modes over the lifecycle of the operation. Ideally the lock duration is minimal (for example – the timestamp is identical for the first SCH_S lock acquired and released). But even with a minimal amount of locking you can certainly encounter concurrency issues depending on the transactions running against the index being rebuilt or switched into.

I mentioned at the beginning of this post that Microsoft introduced low priority lock wait options for online operations and partition switch operations in SQL Server 2014 CTP1. On the subject of partition switches, imagine I execute the following operation:

ALTER TABLE [AdventureWorksDW2012].[dbo].[FactInternetSales] 
SWITCH PARTITION 37 TO  [AdventureWorksDW2012].[dbo].[staging_FactInternetSales];

To see the locks acquired and released for this operation, I modified my previously defined Extended Event session to include the applicable objects (source and target table). I saw the following:

Lock activity for a partition switch operation
Lock activity for a partition switch operation

The switch operation to an empty partition occurred in less than a second, but we still see that SCH_S and SCH_M locks were required during the operation lifecycle on both the source and destination (309576141 being FactInternetSales and 398624463 being staging_FactInternetSales).

So again, while the duration of the locking can be extremely brief when there are no concurrent transactions accessing the objects in question, we know this isn’t always possible and so our online index rebuild and partition switch operations can indeed get blocked.

So with this reality, SQL Server 2014 introduces the WAIT_AT_LOW_PRIORITY argument that can be adjusted with MAX_DURATION and ABORT_AFTER_WAIT options for both the ALTER INDEX and ALTER TABLE commands that we can use for both online index and partition switch operations.

What does this allow us to do? First of all, let’s talk about what the behavior was prior to SQL Server 2014. As an example, imagine that I have the following transaction open and uncommitted:

BEGIN TRANSACTION;
DELETE [dbo].[staging_FactInternetSales];

If I tried to perform an ALTER TABLE SWITCH to the staging_FactInternetSales table as a destination in a separate session, I’m going to be blocked and the request is just going to wait. Specifically for this example, I would be waiting with a LCK_M_SCH_M wait type. Once I rollback or commit my transaction, the operation can move forward and complete.

Now if I’m using SQL Server 2014’s WAIT_AT_LOW_PRIORITY with MAX_DURATION and ABORT_AFTER_WAIT, I can leverage a few different options depending on my application requirements.

MAX_DURATION allows me to specify the number of minutes the online index rebuild or partition switch operation will wait. If the MAX_DURATION value is reached, we can set what happens next based on the setting of ABORT_AFTER_WAIT, which can be a value of NONE, SELF or BLOCKERS:

  • NONE means that the index operation will continue attempting the operation.
  • SELF means that if the MAX_DURATION is reached, the operation (the online index rebuild or partition switch) will be cancelled.
  • If BLOCKERS is used, it will kill any transactions that are blocking the online index rebuild or partition switch operation (not an option, in my opinion, to be used lightly). BLOCKERS also requires ALTER ANY CONNECTION permission for the request issuing the online index rebuild or partition switch operation.

The following code examples demonstrate different configuration variations.

    Pre-2014 default behavior (wait indefinitely)

      Executing the following will result in the behavior we’re used to seeing pre-SQL Server 2014 – and it might still be what you’ll want or expect for certain scenarios:

      ALTER TABLE [AdventureWorksDW2012].[dbo].[FactInternetSales] 
      SWITCH PARTITION 37 TO  [AdventureWorksDW2012].[dbo].[staging_FactInternetSales] 
      WITH (WAIT_AT_LOW_PRIORITY (MAX_DURATION = 0 MINUTES, ABORT_AFTER_WAIT = NONE));

    Wait 1 Minute and Cancel the DDL Operation

      The following example waits for 1 minute if there is a blocking transaction and will get a “lock request time out period exceeded” for the SWITCH operation if the maximum duration is reached:

      ALTER TABLE [AdventureWorksDW2012].[dbo].[FactInternetSales] 
      SWITCH PARTITION 37 TO  [AdventureWorksDW2012].[dbo].[staging_FactInternetSales] 
      WITH (WAIT_AT_LOW_PRIORITY (MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = SELF));

    Wait 1 Minute and Kill the Blocker(s)

      This example waits for 1 minute if there is a blocking transaction and then will kill the blocking transactions (source or destination included), allowing the SWITCH operation to complete.

      ALTER TABLE [AdventureWorksDW2012].[dbo].[staging_FactInternetSales] 
      SWITCH PARTITION 37 TO  [AdventureWorksDW2012].[dbo].[FactInternetSales] 
      WITH (WAIT_AT_LOW_PRIORITY (MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = BLOCKERS));

      In my example of a DELETE inside an uncommitted transaction, there was no error in my SQL Server Management Studio window as I didn’t have an actively running statement, but attempting another statement within that session returned the following error message (as my session had been killed):

      Msg 233, Level 20, State 0, Line 3
      A transport-level error has occurred when sending the request to the server. (provider: Shared Memory Provider, error: 0 – No process is on the other end of the pipe.)

    Kill the Blocker(s) Immediately (Source or Destination for SWITCH)

      The following is an example of killing the blocker immediately – and in my example the switch happened in sub-second and indeed the session that was the blocker did get killed:

      ALTER TABLE [AdventureWorksDW2012].[dbo].[FactInternetSales] 
      SWITCH PARTITION 37 TO  [AdventureWorksDW2012].[dbo].[staging_FactInternetSales]
      WITH (WAIT_AT_LOW_PRIORITY (MAX_DURATION = 0 MINUTES, ABORT_AFTER_WAIT = BLOCKERS));

One last positive aspect I wanted to call out…

The SQL Server Error Log provides some by-default auditing of the low priority lock wait usage, including information about the ABORT_AFTER_WAIT operation inline with the victim information:

Date       9/10/2013 1:37:15 PM
Log        SQL Server (Current – 9/10/2013 12:03:00 PM)
Source     spid51
Message
Process ID 57 was killed by an ABORT_AFTER_WAIT = BLOCKERS DDL statement on database_id = 5, object_id = 309576141.

And you’ll also see separate entries for the original operation itself. For example:

An ALTER TABLE SWITCH statement was executed on database 'AdventureWorksDW2012', table 'staging_FactInternetSales' by hostname 'WIN-4T7S36VMSD9', host process ID 1360 with target table 'AdventureWorksDW2012.dbo.FactInternetSales' using the WAIT_AT_LOW_PRIORITY options with MAX_DURATION = 1 and ABORT_AFTER_WAIT = BLOCKERS. Blocking user sessions will be killed after the max duration of waiting time.

This kind of logging is very useful for troubleshooting and auditing purposes and I’m glad to see it.

  One Response to “Exploring Low Priority Lock Wait Options in SQL Server 2014 CTP1”

 Leave a Reply

(required)

(required)

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped="" cssfile="">