Welcome to the latest edition of the Debezium community newsletter, in which we share all things CDC-related, including blog posts, group discussions, and Stack Overflow questions that are relevant to our user community.
In case you missed our last edition, you can check it out here.
Upcoming Events
Due to the coronavirus situation, many conferences the Debezium team had planned to attend have been postponed or even cancelled. For example, JavaDay Istanbul has been moved to September and QCon Sao Paulo to December. We hope the situation will have improved by then and look forward to eventually meeting the Debezium community in person again.
Until then, there are a few virtual events you can enjoy: there’ll be a Debezium session at the Red Hat Summit 2020 - Virtual Experience, and we’re also planning another episode on Debezium at DevNation Live. If you’d like to have a session on Debezium at your virtual meetup or conference, please get in touch!
Articles
There have been a number of blog posts about Debezium lately; here are some of the latest ones that you should not miss:
-
A two-part series discussing tailing a database transaction log using Debezium by Abdullah Yildirim: Part 1, Part 2
-
Streaming data changes to a Data Lake with Debezium and Delta Lake Pipeline by Yinon D. Nahamu.
-
Implementing the Outbox Pattern with CDC using Debezium by Thorben Janssen.
-
Debezium and Apache Camel integration scenario: Original blog by Jiri Pechanec (English), republished blog (Japanese)
-
Recording and slides from QCon and JokerConf where Gunnar Morling discusses practical CDC streaming use cases with Apache Kafka and Debezium.
-
Approaches to running Change Data Capture for Db2 by Luis Garcés-Erice, Sean Rooney, and Peter Urbanetz.
-
Lessons learned from running Debezium with PostgreSQL on Amazon RDS by Ashhar Hasan.
-
The 5 minute introduction to Log-based Change Data Capture with Debezium by Shekhar Gulati.
-
Distributed Data for Microservices — Event Sourcing vs. Change Data Capture: Original post by Eric Murphy, Japanese translation
-
Series of blog posts about Debezium by Bhuvanesh "The Data Guy":
-
From PostgreSQL to Data Lake using Kafka and Debezium (Portuguese)
-
Google Cloud Platform recently published this repository illustrating an example of how to capture data from a MySQL database and sync it with BigQuery using Cloud Dataflow and Debezium.
-
There has been a recent spike of interest in using Debezium with Cloud SQL, Google Cloud’s managed PostgreSQL service. If you’d like to see Cloud SQL support for Debezium, we recommend giving the issue an up-vote.
-
A very special episode of the Data Engineering Podcast by Tobias Macey, together with Debezium project founder Randall Hauch and Gunnar Morling
Please also check out our compiled list of resources around Debezium for even more related posts, articles and presentations.
Examples
An example is an excellent way to get a better understanding of how or why something behaves as it does. Debezium’s examples repository has recently undergone several changes that we’d like to highlight:
We also discovered a very helpful tool for visualizing the contents of Docker Compose files, so we’ve begun adding diagrams like this one for the kstreams-live-update demo to the examples, making them easier to get familiar with:
KStreams Live Update Example Topology
Time to Upgrade
Debezium version 1.1.0.Final was released last week. If you are using an older version, we urge you to check out the latest major release. For details on the bug fixes, enhancements, and improvements that spanned 5 releases, check out the release notes.
The Debezium team has also begun active development on the next major version, 1.2. The major focus in 1.2 is implementing a standalone container to run Debezium without Apache Kafka and Connect, enabling users to send change events to Kinesis and other platforms more easily.
Keep an eye on our releases page to get a jump start on what bug fixes, enhancements, and changes will be coming in 1.2 as they become available.
Questions and Answers
Using Debezium?
Our community users page includes a variety of organizations that are currently using Debezium. If you are a user of Debezium and would like to be included, please send us a GitHub pull request or reach out to us directly through our community channels found here.
And if you haven’t yet done so, please consider adding a ⭐ for the GitHub repo; keep them coming, we’re almost at 3,000 stars!
Getting Involved
It can often be overwhelming when starting to work on an existing code base.
We welcome community contributions and we want to make the process of getting started extremely easy.
Below is a list of open issues currently labeled easy-starter, in case you want to dive in quickly:
-
Configure Avro serialization automatically when detecting link to schema registry (DBZ-59)
-
Support CREATE TABLE … LIKE syntax for blacklisted source table (DBZ-1496)
-
Explore SMT for Externalizing large column values (DBZ-1541)
-
Update the tutorial to use the Debezium tooling container image (DBZ-1572)
-
Debezium for SQL Server does not support reconnecting after the connection is broken (DBZ-1882)
Feedback
We intend to publish new additions to this newsletter periodically. Should anyone have any suggestions on changes or what could be highlighted here, we welcome that feedback. You can reach out to us via any of our community channels found here.
And most importantly, stay safe and healthy wherever you are!
About Debezium
Debezium is an open source distributed platform that turns your existing databases into event streams, so applications can see and respond almost instantly to each committed row-level change in the databases. Debezium is built on top of Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, so your application can be stopped and restarted at any time and can easily consume all of the events it missed while it was not running, ensuring that all events are processed correctly and completely. Debezium is open source under the Apache License, Version 2.0.
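To make the stop-and-restart behaviour described above a bit more concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the Kafka or Debezium API: the `LOG` list and the `Consumer` class are hypothetical names made up for illustration. The events are heavily simplified versions of a change event, keeping only the `op` code (`c` for create, `u` for update, `d` for delete) and the `before`/`after` row state.

```python
# Toy model: an append-only log of change events plus a consumer that
# remembers its position. This is NOT the Kafka or Debezium API.

# Each event is a simplified change event: "op" is c (create), u (update)
# or d (delete); "before"/"after" hold the row state around the change.
LOG = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "Anne"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "Bob"}},
    {"op": "u", "before": {"id": 1, "name": "Anne"},
     "after": {"id": 1, "name": "Anne K."}},
    {"op": "d", "before": {"id": 2, "name": "Bob"}, "after": None},
]


class Consumer:
    """Hypothetical consumer that applies events to a local view."""

    def __init__(self):
        self.offset = 0   # index of the next event to read
        self.view = {}    # primary key -> latest row state

    def poll(self, log, max_events=None):
        """Apply up to max_events unread events, then advance the offset."""
        if max_events is None:
            end = len(log)
        else:
            end = min(len(log), self.offset + max_events)
        for event in log[self.offset:end]:
            if event["op"] == "d":
                # Deletes only carry the old row state in "before".
                self.view.pop(event["before"]["id"], None)
            else:
                # "c" and "u" both carry the new row state in "after".
                self.view[event["after"]["id"]] = event["after"]
        self.offset = end


consumer = Consumer()
consumer.poll(LOG, max_events=2)   # process two events, then "stop"
# ... the application is down; the remaining events stay in the log ...
consumer.poll(LOG)                 # "restart": resume from the stored offset
print(consumer.view)               # {1: {'id': 1, 'name': 'Anne K.'}}
```

Because the log keeps every event and the consumer only stores its own position, nothing is lost while the consumer is not running; it simply picks up where it left off.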
Get involved
We hope you find Debezium interesting and useful, and want to give it a try. Follow us on Twitter @debezium, chat with us on Zulip, or join our mailing list to talk with the community. All of the code is open source on GitHub, so build the code locally and help us improve our existing connectors and add even more connectors. If you find problems or have ideas for how we can improve Debezium, please let us know or log an issue.