Aaron Bertrand

T-SQL Tuesday #33 : Trick Shots : Schema Switch-A-Roo


This month's T-SQL Tuesday is being hosted by Mike Fal (blog | twitter), and the topic is Trick Shots, where we're invited to tell the community about some solution we used in SQL Server that felt, at least to us, like a sort of "trick shot" – something similar to using massé, "English" or complicated bank shots in billiards or snooker. After working with SQL Server for some 15 years, I've had the occasion to come up with tricks to solve some pretty interesting problems, but one that seems to be quite reusable, easily adapts to many situations, and is simple to implement, is something I call "schema switch-a-roo."

Let's say you have a large lookup table that needs to be refreshed periodically. This lookup table is needed across many servers, and can contain data populated from an external or third-party source (e.g. IP or domain data), or data that comes from within your own environment.

The first couple of scenarios where I needed a solution for this were making metadata and denormalized data available to read-only "data caches" – really just SQL Server MSDE (and later Express) instances installed on various web servers, so the web servers pulled this cached data locally instead of bothering the primary OLTP system. This may seem redundant, but off-loading read activity away from the primary OLTP system, and taking the network connection out of the equation completely, led to a real bump in all-around performance – most notably for end users.

These servers did not need up-to-the-minute copies of the data; in fact, a lot of the cache tables were only updated daily. But since the systems were 24×7, and some of these updates could take several minutes, they often got in the way of real customers doing real things on the system.

The Original Approach(es)

At the very beginning, the code was rather simplistic: we deleted rows that had been removed from the source, updated all the rows that we could tell had changed, and inserted all the new rows. It looked something like this (error handling etc. removed for brevity):

BEGIN TRANSACTION;

-- remove rows that no longer exist in the source
DELETE dbo.Lookup 
  WHERE [key] NOT IN 
  (SELECT [key] FROM [source]);

-- update any rows we could tell had changed
UPDATE d SET [col] = s.[col]
  FROM dbo.Lookup AS d
  INNER JOIN [source] AS s
  ON d.[key] = s.[key]
  -- AND [condition to detect change];

-- add rows that are new in the source
INSERT dbo.Lookup([cols]) 
  SELECT [cols] FROM [source]
  WHERE [key] NOT IN 
  (SELECT [key] FROM dbo.Lookup);

COMMIT TRANSACTION;

Needless to say, this transaction could cause some real performance issues when the system was in use. Surely there were other ways to do this, but every method we tried was equally slow and expensive. How slow and expensive? "Let me count the scans…"

Since this pre-dated MERGE, and we had already discarded "external" approaches like DTS, through some testing we determined that it would be more efficient to just wipe the table and re-populate it, rather than try to sync to the source:

BEGIN TRANSACTION;

TRUNCATE TABLE dbo.Lookup;

INSERT dbo.Lookup([cols]) 
  SELECT [cols] FROM [source];

COMMIT TRANSACTION;

Now, as I explained, this query from [source] could take a couple of minutes, especially if all of the web servers were being updated in parallel (we tried to stagger where we could). And if a customer was on the site and trying to run a query involving the lookup table, they had to wait for that transaction to finish. In most cases, if they were running that query at midnight, it wouldn't really matter whether they got yesterday's copy of the lookup data or today's; so making them wait for the refresh seemed silly, and it actually did lead to a number of support calls.

So while this was better, it was certainly far from perfect.

My Initial Solution : sp_rename

My initial solution, back when SQL Server 2000 was cool, was to create a "shadow" table:

CREATE TABLE dbo.Lookup_Shadow([cols]);

This way I could populate the shadow table without interrupting users at all, and then perform a three-way rename – a fast, metadata-only operation – only after the population was complete. Something like this (again, grossly simplified):

TRUNCATE TABLE dbo.Lookup_Shadow;

INSERT dbo.Lookup_Shadow([cols]) 
  SELECT [cols] FROM [source];

BEGIN TRANSACTION;

  EXEC sp_rename N'dbo.Lookup',        N'Lookup_Fake';
  EXEC sp_rename N'dbo.Lookup_Shadow', N'Lookup';

COMMIT TRANSACTION;

-- if successful:
EXEC sp_rename N'dbo.Lookup_Fake', N'Lookup_Shadow';

The downside to this initial approach was that sp_rename has a non-suppressible output message warning you about the dangers of renaming objects. In our case we performed this task through SQL Server Agent jobs, and we handled a lot of metadata and other cache tables, so the job history was flooded with all these useless messages and actually caused real errors to be truncated from the history details. (I complained about this in 2007, but my suggestion was ultimately dismissed and closed as "Won't Fix.")
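For reference, the warning in question reads something like this (exact wording may vary by version):

Caution: Changing any part of an object name could break scripts and stored procedures.

One of these per sp_rename call, across dozens of tables and jobs, adds up to a lot of noise.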

A Better Solution : Schemas

Once we upgraded to SQL Server 2005, I discovered this fantastic command called CREATE SCHEMA. It was trivial to implement the same type of solution using schemas instead of renaming tables, and now the Agent history wouldn't be polluted with all of these unhelpful messages. Basically I created two new schemas:

CREATE SCHEMA fake   AUTHORIZATION dbo;
CREATE SCHEMA shadow AUTHORIZATION dbo;

Then I moved the Lookup_Shadow table into the shadow schema, and renamed it:

ALTER SCHEMA shadow TRANSFER dbo.Lookup_Shadow;

EXEC sp_rename N'shadow.Lookup_Shadow', N'Lookup';

(If you are just implementing this solution, you'd be creating a new copy of the table in the shadow schema, not moving the existing table there and renaming it.)
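A minimal sketch of what that fresh copy might look like, assuming a hypothetical two-column lookup table (in reality the columns, indexes and so on should mirror dbo.Lookup exactly):

CREATE TABLE shadow.Lookup
(
  [key] int         NOT NULL PRIMARY KEY,
  [col] varchar(32) NOT NULL
);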

With those two schemas in place, and a copy of the Lookup table in the shadow schema, my three-way rename became a three-way schema transfer:

TRUNCATE TABLE shadow.Lookup;

INSERT shadow.Lookup([cols]) 
  SELECT [cols] FROM [source];

-- perhaps an explicit statistics update here

BEGIN TRANSACTION;

  ALTER SCHEMA fake TRANSFER     dbo.Lookup;
  ALTER SCHEMA dbo  TRANSFER  shadow.Lookup;

COMMIT TRANSACTION;

ALTER SCHEMA shadow TRANSFER fake.Lookup;

At this point you can, of course, empty out the shadow copy of the table; however, in some cases I found it useful to leave the "old" copy of the data around for troubleshooting purposes:

TRUNCATE TABLE shadow.Lookup;

Anything further that you do with the shadow copy should be done outside of the transaction – the two transfer operations should be as concise and quick as possible.

Some Caveats

  • Foreign Keys
    This won't work out of the box if the lookup table is referenced by foreign keys. In our case we didn't point any constraints at these cache tables, but if you do, you may have to stick with intrusive methods such as MERGE. Or use append-only methods and disable or drop the foreign keys before performing any data modifications, then re-create or re-enable them afterward (a rough sketch of that pattern appears after this list). If you stick with MERGE / UPSERT techniques and you're doing this between servers or, worse yet, from a remote system, I highly recommend pulling the raw data locally first rather than trying to run these methods across servers.
     
  • Statistics
    Switching the tables (using rename or schema transfer) will lead to statistics flipping back and forth between the two copies of the table, and this can obviously be an issue for plans. So you may consider adding explicit statistics updates as part of this process (see the sketch after this list).
     
  • Other Approaches
    There are of course other ways to do this that I simply haven't had the occasion to try. Partition switching and using a view + synonym are two approaches I may investigate in the future for a more thorough treatment of the topic. I'd be interested to hear your experiences and how you've solved this problem in your environment. And yes, I realize that this problem is largely solved by Availability Groups and readable secondaries in SQL Server 2012, but I consider it a "trick shot" if you can solve the problem without throwing high-end licenses at it, or replicating an entire database just to make a few tables redundant. :-)
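As a rough illustration of the disable / re-enable pattern mentioned under Foreign Keys above (the child table and constraint names here are purely hypothetical):

ALTER TABLE dbo.SomeChildTable NOCHECK CONSTRAINT FK_SomeChildTable_Lookup;

-- ... perform the data modifications against dbo.Lookup here ...

ALTER TABLE dbo.SomeChildTable WITH CHECK CHECK CONSTRAINT FK_SomeChildTable_Lookup;

Re-enabling with WITH CHECK makes SQL Server re-validate the existing rows, so the constraint stays trusted and usable by the optimizer.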
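And a minimal sketch of the explicit statistics update mentioned under Statistics, run against the freshly loaded shadow copy just before the transfer (FULLSCAN is an assumption; sample at whatever rate makes sense for your data volumes):

UPDATE STATISTICS shadow.Lookup WITH FULLSCAN;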

Conclusion

If you can live with the limitations here, this approach may well be a better performer than a scenario where you essentially take a table offline using SSIS or your own MERGE / UPSERT routine, but please be sure to test both techniques. The most significant point is that the end user accessing the table should have the exact same experience, any time of the day, even if they hit the table in the middle of your periodic update.