Saturday, August 17, 2019

Tracking Data Changes - Change Tracking, Part 2

In my last post, I showed you how to enable, configure, and disable change tracking at the database level. Very good, as far as that goes, but you're not going to get very far without tracking some actual data changes!

To recap, enabling change tracking at the database level is very simple:
ALTER DATABASE <database name here> 
SET CHANGE_TRACKING = <ON | OFF>
(CHANGE_RETENTION = <positive integer> <DAYS | HOURS | MINUTES>, AUTO_CLEANUP = <ON | OFF>);

(Note on the code snippets: I'm using Markdown for this post to simplify my blogging workflow; please let me know if you'd rather I keep using Gists for code samples.)

If you don't provide a retention period, SQL Server's default is 2 days. Auto-cleanup defaults to ON unless you tell it otherwise.

Easy!

The table level commands aren't any more complicated. Before we get started, please note that change tracking requires a primary key on the table you want to track. This is reasonable - you need some kind of unique identifier to tell you which row has changed.
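
As a minimal sketch of a table that meets this requirement: only the table name Demo_CT comes from this post, while the dbo schema, the column names, and the constraint name are assumptions I've made up for illustration.

```sql
-- Hypothetical demo table; the columns and constraint name are illustrative.
-- The PRIMARY KEY constraint is the part change tracking depends on.
CREATE TABLE dbo.Demo_CT
(
    DemoId    INT IDENTITY(1, 1) NOT NULL,
    DemoValue VARCHAR(50)        NOT NULL,
    CONSTRAINT PK_Demo_CT PRIMARY KEY CLUSTERED (DemoId)
);
```

If the table had no primary key, the ALTER TABLE ... ENABLE CHANGE_TRACKING statement would fail with an error rather than silently track nothing.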

With the PK requirement in mind, on we go!

Much as enabling change tracking at the database level uses a simple ALTER DATABASE command, enabling change tracking at the table level uses a simple ALTER TABLE command:
ALTER TABLE Demo_CT
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);

There's not a lot going on here either. You have three things to specify:
  1. Which table you want to track changes on
  2. Whether to enable or (spoiler alert!) disable change tracking
  3. Whether to track which column(s) were updated
As you may have guessed, if you want to disable change tracking, just change ENABLE to DISABLE and omit the WITH clause.

Speaking of that WITH, if you don't specify a setting for TRACK_COLUMNS_UPDATED, SQL Server will default it to OFF in order to reduce space requirements for change tracking.

While we're talking about TRACK_COLUMNS_UPDATED, it does just what it says on the tin. This is a good time to remind you that, unlike other methods of tracking changes, all this does is tell you that a given row was updated and which columns were updated; it does not capture the data values themselves. We'll cover this in more detail in a future post.

A word of warning: you can't modify the TRACK_COLUMNS_UPDATED setting without disabling and re-enabling change tracking on the table.
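
As a rough sketch of that disable-and-re-enable sequence, assuming Demo_CT lives in the dbo schema (an assumption on my part) and currently has TRACK_COLUMNS_UPDATED = ON:

```sql
-- Step 1: turn change tracking off for the table.
ALTER TABLE dbo.Demo_CT
DISABLE CHANGE_TRACKING;

-- Step 2: turn it back on with the new TRACK_COLUMNS_UPDATED setting.
ALTER TABLE dbo.Demo_CT
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = OFF);
```

Keep in mind that disabling change tracking on a table discards the change information collected for it so far, so anything consuming those changes should be caught up before you do this.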

To see which tables have change tracking enabled, query the catalog view sys.change_tracking_tables:
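
A minimal version of that query, with the view's columns listed explicitly rather than using SELECT *, might look like this:

```sql
-- Run this in the database whose tracked tables you want to list.
SELECT object_id,
       is_track_columns_updated_on,
       min_valid_version,
       begin_version,
       cleanup_version
FROM sys.change_tracking_tables;
```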

As with sys.change_tracking_databases, the information is fairly basic. Kendra Little's fantastic article I referenced in my last post has a great script for this, too:
-- Kendra Little made this too; it's also posted at https://www.brentozar.com/archive/2014/06/performance-tuning-sql-server-change-tracking/
SELECT sc.name AS tracked_schema_name,
    so.name AS tracked_table_name,
    ctt.is_track_columns_updated_on,
    ctt.begin_version /*when CT was enabled, or table was truncated */,
    ctt.min_valid_version /*syncing applications should only expect data on or after this version */ ,
    ctt.cleanup_version /*cleanup may have removed data up to this version */
FROM sys.change_tracking_tables AS ctt
JOIN sys.objects AS so 
    ON ctt.[object_id]=so.[object_id]
JOIN sys.schemas AS sc 
    ON so.schema_id=sc.schema_id;

Of course, feel free to write your own if you need different information!

For our simple example, it gives us:

Joining on sys.schemas and sys.objects, Kendra's script yet again gives us human-readable results. I can't emphasize enough how important it is to script for readability.

As I mentioned before, disabling change tracking at the table level is equally simple:
ALTER TABLE Demo_CT
DISABLE CHANGE_TRACKING;

Don't forget - if you wish to disable change tracking at the database level, you'll need to disable change tracking for all tables first.
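
If you have more than a handful of tracked tables, one way to handle this is to generate the DISABLE statements from sys.change_tracking_tables. This is just a sketch of that idea, not a polished utility:

```sql
-- Builds one ALTER TABLE ... DISABLE CHANGE_TRACKING statement per tracked
-- table in the current database. It only generates the text; review the
-- output and run the statements yourself.
SELECT N'ALTER TABLE '
       + QUOTENAME(OBJECT_SCHEMA_NAME(ctt.object_id))
       + N'.'
       + QUOTENAME(OBJECT_NAME(ctt.object_id))
       + N' DISABLE CHANGE_TRACKING;' AS disable_statement
FROM sys.change_tracking_tables AS ctt;
```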

Thank you for continuing to read this series! Please join me next time when I show you how to work with change tracking!

Sunday, August 11, 2019

Tracking Data Changes - Change Tracking, Part 1

Let's begin our adventure into tracking data changes in SQL Server with change tracking. Change tracking is a simple tool in SQL Server that does exactly what it says on the tin: tracks changes.

Microsoft provides excellent documentation on change tracking here. Microsoft has done a good job of retaining a lot of documentation for versions of SQL Server prior to 2014, but there are some gaps. Happily, change tracking hasn't changed much since it was introduced in SQL Server 2008, so most of this information is valid for SQL Server 2012, 2008 R2, and 2008.

SQL Server 2014 SP2 does introduce a documented stored procedure that you can invoke to clean up the internal change table (and don't worry, we'll cover this in due course!), but that's about it as far as documented functional changes go. (Note: if any of you hardcore internals folks know of under-the-hood changes between versions, even if they work the same on the surface, please let me know! I love that kind of detail!) That said, there's always the chance that future versions will introduce differences, so keep that in mind.

Anyway, moving forward! Here is a quick overview of change tracking:

  • The change source is an in-memory rowstore, flushed to disk with each checkpoint
  • It answers two questions:
    • Has a row changed?
    • Which rows have changed?
  • The only values from the source table stored in the change table are the primary key values for the changed rows
  • You have to use the built-in functions to get at the change data
  • The database you want to track changes in must have a compatibility level of at least 90
    • Filed under 'ways the database engine will let you score an own goal', SQL Server will still let you enable change tracking on a database if the compatibility level is lower than 90, but the functions used to retrieve change data will give you an error
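
Since the engine won't stop you here, it's worth checking the compatibility level yourself before enabling change tracking. A quick sketch, where DemoDB is a hypothetical database name:

```sql
-- DemoDB is a hypothetical database name; substitute your own.
-- A compatibility_level below 90 means the change tracking functions
-- will raise an error even though change tracking can be enabled.
SELECT name,
       compatibility_level
FROM sys.databases
WHERE name = N'DemoDB';
```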

Change tracking is a fairly simple tool to set up and use. In order to enable change tracking, you must first enable it at the database level via an ALTER DATABASE statement, and then at the table level via an ALTER TABLE statement.

Let's begin at the database level. The ALTER DATABASE statement looks something like...
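
As a concrete sketch, assuming a hypothetical database named DemoDB and spelling out the default values explicitly:

```sql
-- DemoDB is a hypothetical database name; the option values are illustrative.
ALTER DATABASE DemoDB
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
```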

There aren't a lot of parameters here. You can set change tracking on or off, you can specify your retention period, and you can specify whether to enable auto-cleanup or not.

For the retention period, you have the choice of DAYS, HOURS, or MINUTES. If you don't specify retention, SQL Server will configure change tracking with a retention period of 2 days. If you provide a number but don't specify the interval, SQL Server will default to days. The minimum retention period is 1 minute.

For auto-cleanup, if you specify OFF, SQL Server will not automatically clean up change tracking data. If you don't specify auto-cleanup, SQL Server will default to ON. Unless you want to be responsible for cleaning up after change tracking, ON is your best bet.

To see which databases have change tracking enabled, and to look at their configuration, query the sys.change_tracking_databases catalog view:
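
A minimal version of that query, listing the view's columns explicitly, might look like this:

```sql
-- Returns one row per database on the instance with change tracking enabled.
SELECT database_id,
       is_auto_cleanup_on,
       retention_period,
       retention_period_units,
       retention_period_units_desc,
       max_cleanup_version
FROM sys.change_tracking_databases;
```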



It covers the basics, but it could be more informative. Kendra Little has a nicer script:

For our simple example, it gives us:

By joining on sys.databases to provide the database name and omitting the retention_period_units in favor of the description, Kendra's script gives us much more readable results. You'll notice that this script omits the max_cleanup_version column. We'll come back to that, and when it's useful, in a future post.
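
To make that description concrete, here is a sketch of a query in the same spirit. It is my own approximation based on the description above, not a verbatim copy of Kendra's script, and the aliases are my choices:

```sql
-- Approximation of a more readable report: database name instead of ID,
-- and the retention unit description instead of its numeric code.
-- max_cleanup_version is deliberately left out.
SELECT db.name AS change_tracking_db,
       ctd.is_auto_cleanup_on,
       ctd.retention_period,
       ctd.retention_period_units_desc
FROM sys.change_tracking_databases AS ctd
JOIN sys.databases AS db
    ON ctd.database_id = db.database_id;
```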

As an aside, while it can sometimes be a pain to script for readability, it almost always pays off in the long run.

Disabling change tracking is similarly simple. As with enabling change tracking, you disable it with a very straightforward ALTER DATABASE command:

You will need to disable change tracking on all tables before disabling it at the database level, but as we haven't looked at the table level commands yet we'll just pretend we've already done that.
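
With that caveat in mind, a minimal sketch of the database-level disable, again assuming a hypothetical database named DemoDB:

```sql
-- DemoDB is a hypothetical database name.
-- Assumes change tracking has already been disabled on every table in it.
ALTER DATABASE DemoDB
SET CHANGE_TRACKING = OFF;
```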

You can change the database-level settings for change tracking with a similar ALTER DATABASE script. Just omit the = ON or OFF:

You can modify either or both the retention period and the auto-cleanup setting.
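
For example, assuming change tracking is already enabled on a hypothetical database named DemoDB, changing both settings at once might look something like this (the values are purely illustrative):

```sql
-- DemoDB is a hypothetical database name; the values are illustrative.
-- Note there is no "= ON" or "= OFF" after CHANGE_TRACKING here.
ALTER DATABASE DemoDB
SET CHANGE_TRACKING (CHANGE_RETENTION = 3 DAYS, AUTO_CLEANUP = OFF);
```

To change just one of the two, list only that option inside the parentheses.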

That wraps it up for this post! Join me next time when we look at enabling change tracking at the table level!