Sunday, April 21, 2019

Tracking Data Changes - Introduction

Unless you're managing a static archive database, the data in your databases will change. Odds are, at some point you're going to need to know about these changes for one reason or another. Perhaps you're using a data warehouse and need to retain history, or you need to keep track of slowly-changing dimensions. Perhaps you have an application that needs the latest data, and you want an easy way to feed it without querying the entire table for timestamps or update flags. Perhaps you have audit requirements and need to keep detailed change history.

Whatever the reasons, SQL Server has a few options to satisfy various use cases. Your options range from the 'roll up your sleeves and do it yourself' triggers, to Change Tracking, Change Data Capture, and Temporal Tables.  In this series, I'll show you these options and how to use them. We'll begin with Change Tracking, then move on to Change Data Capture, Temporal Tables, and conclude with triggers.

If you're using a platform other than SQL Server, please refer back to your RDBMS's documentation (or - let's be honest here - StackOverflow followed by your search engine of choice) for information about tracking data changes. Many platforms have similar functionality to handle this need, but if nothing else triggers are almost always a viable option, if a bit more manual work than we might like.

In SQL Server, some of the functionality depends on your version and edition. The following table provides a quick summary:

Tool Debut Edition
Change Tracking SQL Server 2008 All
Change Data Capture SQL Server 2008 Standard as of SQL Server 2016 SP1, Enterprise prior
Temporal Tables SQL Server 2016 All
Triggers Back in the dim mists All

Here is a (very) quick overview of the various options:
Change Tracking:
  • Uses an internal versioning mechanism to track changes
  • Just tells you that a change happened (it can tell you WHICH column was changed) - no history
  • Snapshot isolation, so be mindful of TempDB
  • Reasonably tolerant of schema changes
  • Requires a primary key
  • Automatically cleans up after itself
Change Data Capture:
  • Uses a log reader agent (hello sp_replcmds!) to capture change data
  • Can give you full change history
  • Does not play nicely with schema changes 
  • Allows two capture instances per table
  • Highly configurable - capture only columns you want
  • Automatically cleans up after itself
Temporal Tables:
  • Also requires a primary key
  • Also gives you the full change history
  • Uses period columns in the source table to feed the history table
  • Easy to query
  • In SQL Server 2017, introduced the capability to clean up after itself
Triggers:
  • Offer you amazing flexibility in designing your tracking
  • Let you capture metadata and stuff the other solutions may not allow
  • Require you to manage everything from soup to nuts
There is a lot more about each of these methods, of course, and in the course of this series I'll walk you through each in some detail. Please check back soon for an introduction to Change Tracking!

No comments: