Change Data Capture(CDC) in SQL Server

Change Data Capture (CDC) is a technology that can be used in incremental data transfers from OLTP systems to data warehouse systems. By storing the insert, update, and delete operations, it stores the first and last version of the record in a change table. This table where the changes are stored is called a change table.

It then reflects these changes to the destination that you want to transfer. We can do similar operations using trigger, but since CDC takes the changes from the log file, it both works more efficiently and makes our processes considerably easier.

Let’s continue with an example as we always do.

Check CDC Enabled Databases

First of all, with the help of the following script, let’s check if there is a database on the instance where CDC is enabled.

Enable CDC

Then, enable the CDC for the database we want to use the CDC with the help of the following script.

After running the above script, a schema named cdc and some tables will be created in this schema. The list of these tables is as follows. You can see these tables under the system tables in the database in which the CDC is enabled.

cdc.captured_columns This table stores the list of captured columns
cdc.change_tables This table stores the list of captured tables
cdc.ddl_history This table stores the history of all DDL operations after the CDC is enabled
cdc.index_columns This table stores the changes of the indexes of the columns captred with CDC.
cdc.lsn_time_mapping This table stores the change and time of transactions

List CDC Enabled Tables

CDC can be enabled based on the table in the database. You can use the following script to list the tables where the CDC was enabled.

Enable CDC on a Table

You can enable CDC on a table with the help of the following script

If the SQL Server Agent service is not started, you will receive a message as follows. To avoid this error, you must start the SQL Server Agent service. The reason you receive the error is that the CDC creates two jobs and the SQL Server Agent Service is not started, so these jobs cannot be executed. After starting the service, you don’t need to run the script above again.

SQLServerAgent is not currently running so it cannot be notified of this action.

After running the script, you will see that the CDC creates a new table named “cdc.dbo_Address_CT” under the system tables. In this table, you will see that there are some extra columns except the columns of the “dbo.Address” table.

 

Check if the CDC is working correctly

Perform an update as follows to check if the CDC is working correctly.

Check whether the change is reflected to the “change table”.

The values you will see in the _$operation column are as follows.

1 Delete Process
2 Insert Process
3 Before Update
4 After Update

When you enable CDC with the script I shared above on the table, it monitors and logs the changes made to all columns in the table. If you want to do this for specific columns in the table, you need to add the columns you want with the @captured_column_list parameter to the end of the script.

A function called “fn_cdc_get_all_changes_dbo_Address” has been created under the database when the CDC is enabled. You can also get specific results by querying this function. Example use;

Disable CDC

Finally, you can disable CDC as follows.

You can access Capture Instance by running the following stored procedure.

sys.sp_cdc_help_change_data_capture

After enabling the CDC, you can feed dataware house applications with SSIS packages. Below you will find examples.

https://docs.microsoft.com/en-us/sql/integration-services/change-data-capture/change-data-capture-ssis

dbtut
Author: dbtut

We are a team with over 10 years of database management and BI experience. Our Expertises: Oracle, SQL Server, PostgreSQL, MySQL, MongoDB, Elasticsearch, Kibana, Grafana.

Leave a Reply

Your email address will not be published. Required fields are marked *