Experience Building CDC with Debezium, Kafka, and Kafka Connect
Change Data Capture (or CDC) has been around for a while, and the most widely used way to achieve it is to read the source database's logs and replay the changes into a sink database. MySQL supports this through its binary log (binlog), and PostgreSQL through its Write-Ahead Log (WAL). However, it has not been possible to do this across database engines, say from MySQL to PostgreSQL or vice versa. To be precise, there are solutions, but they are commercial and may not use the log-reading approach. That has changed with the arrival of Debezium, an open-source project from Red Hat built as a set of Kafka Connect source connectors that read the logs of different database engines and produce change events in a universal JSON format, which another Kafka Connect connector can then consume to update the tables in the sink database.
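To make that pipeline a bit more concrete, here is a minimal sketch of a Debezium MySQL source connector configuration, the kind of JSON you would POST to the Kafka Connect REST API. The hostname, credentials, server name, and topic names below are placeholders, and some property names (for example `database.whitelist` vs. the later `database.include.list`) differ between Debezium versions:

```json
{
  "name": "inventory-source-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql.example.com",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "change-me",
    "database.server.id": "184054",
    "database.server.name": "inventory",
    "database.whitelist": "inventory",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory"
  }
}
```

On the other side, a sink connector (for example the JDBC sink connector) subscribes to the change-event topics Debezium produces, such as `inventory.inventory.customers`, and applies the rows to the target database; that is the setup described in the article I mention below.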
In this post, I will not talk about how to implement all of this (yet), both because it would be too long and because it has already been written up nicely in the article Streaming data to a downstream database by the Debezium team, so you can check that out. This post focuses on the problems that come up when deploying to production. Of course, I will show you the solutions I used, even though they may not be optimal, and there are some problems I have not solved yet. Still, I hope to share my knowledge with others, and I am glad to contribute something back to the community. So, no more introduction, let's get to the show!