Debezium logging
Debezium has extensive logging built into its connectors, and you can change the logging configuration to control which of these log statements appear in the logs and where those logs are sent. Debezium (as well as Kafka, Kafka Connect, and Zookeeper) uses the Log4j logging framework for Java.
By default, the connectors produce a fair amount of useful information when they start up, but then produce very few logs when the connector is keeping up with the source databases. This is often sufficient when the connector is operating normally, but may not be enough when the connector is behaving unexpectedly. In such cases, you can change the logging level so that the connector generates much more verbose log messages describing what the connector is doing and what it is not doing.
Logging concepts
Before configuring logging, you should understand what Log4j loggers, log levels, and appenders are.
Loggers
Each log message produced by the application is sent to a specific logger (for example, io.debezium.connector.mysql). Loggers are arranged in hierarchies. For example, the io.debezium.connector.mysql logger is the child of the io.debezium.connector logger, which is the child of the io.debezium logger. At the top of the hierarchy, the root logger defines the default logger configuration for all of the loggers beneath it.
Log levels
Every log message produced by the application also has a specific log level:
- ERROR - errors, exceptions, and other significant problems
- WARN - potential problems and issues
- INFO - status and general activity (usually low-volume)
- DEBUG - more detailed activity that would be useful in diagnosing unexpected behavior
- TRACE - very verbose and detailed activity (usually very high-volume)
Appenders
An appender is essentially a destination where log messages are written. Each appender controls the format of its log messages, giving you even more control over what the log messages look like.
To configure logging, you specify the desired level for each logger and the appender(s) where those log messages should be written. Since loggers are hierarchical, the configuration for the root logger serves as a default for all of the loggers below it, although you can override any child (or descendant) logger.
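The inheritance rule described above can be illustrated with a small model. The following is a minimal sketch in plain Java, not the Log4j API: the class name LoggerHierarchySketch and its methods are hypothetical and exist only for illustration. It assumes that a logger without an explicit level takes the level of its nearest configured ancestor, falling back to the root logger, and that a message is written only when it is no more verbose than that effective level.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative model of level inheritance between loggers; this is not the Log4j API. */
public class LoggerHierarchySketch {

    /** Levels ordered from least verbose to most verbose. */
    public enum Level { ERROR, WARN, INFO, DEBUG, TRACE }

    private final Map<String, Level> configured = new HashMap<>();
    private final Level rootLevel;

    public LoggerHierarchySketch(Level rootLevel) {
        this.rootLevel = rootLevel;
    }

    /** Explicitly configures a level for one logger name. */
    public void setLevel(String loggerName, Level level) {
        configured.put(loggerName, level);
    }

    /** Walks up the dotted name until a configured ancestor is found, else uses the root level. */
    public Level effectiveLevel(String loggerName) {
        String name = loggerName;
        while (true) {
            Level level = configured.get(name);
            if (level != null) {
                return level;
            }
            int lastDot = name.lastIndexOf('.');
            if (lastDot < 0) {
                return rootLevel;
            }
            name = name.substring(0, lastDot);
        }
    }

    /** A message is emitted when it is no more verbose than the logger's effective level. */
    public boolean isEnabled(String loggerName, Level messageLevel) {
        return messageLevel.ordinal() <= effectiveLevel(loggerName).ordinal();
    }

    public static void main(String[] args) {
        LoggerHierarchySketch config = new LoggerHierarchySketch(Level.INFO);
        config.setLevel("io.debezium.connector.mysql", Level.DEBUG);

        // Inherits DEBUG from its configured parent.
        System.out.println(config.effectiveLevel("io.debezium.connector.mysql.BinlogReader")); // DEBUG
        // No configured ancestor, so the root level applies.
        System.out.println(config.effectiveLevel("io.debezium.relational.history")); // INFO
        // TRACE is more verbose than DEBUG, so it is filtered out.
        System.out.println(config.isEnabled("io.debezium.connector.mysql", Level.TRACE)); // false
    }
}
```

The sketch leaves out appenders and additivity; it only shows why overriding one logger changes the output of every class beneath it.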
Understanding the default logging configuration
If you are running Debezium connectors in a Kafka Connect process, then Kafka Connect uses the Log4j configuration file (for example, /opt/kafka/config/connect-log4j.properties) in the Kafka installation. By default, this file contains the following configuration:
log4j.rootLogger=INFO, stdout (1)
log4j.appender.stdout=org.apache.log4j.ConsoleAppender (2)
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout (3)
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n (4)
...
1 | The root logger, which defines the default logger configuration. By default, loggers include INFO, WARN, and ERROR messages. These log messages are written to the stdout appender.
2 | The stdout appender writes log messages to the console (as opposed to a file).
3 | The stdout appender uses a pattern layout, which formats each log message according to a conversion pattern.
4 | The conversion pattern for the stdout appender (see the Log4j documentation for details).
Unless you configure other loggers, all of the loggers that Debezium uses inherit the rootLogger configuration.
Configuring logging
By default, Debezium connectors write all INFO, WARN, and ERROR messages to the console. However, you can change this configuration by changing the logging level or by adding mapped diagnostic contexts, as described in the sections that follow.
There are other methods that you can use to configure Debezium logging with Log4j. For more information, search for tutorials about setting up and using appenders to send log messages to specific destinations.
Changing the logging level
The default Debezium logging level provides sufficient information to show whether a connector is healthy or not. However, if a connector is not healthy, you can change its logging level to troubleshoot the issue.
In general, Debezium connectors send their log messages to loggers with names that match the fully-qualified name of the Java class that is generating the log message. Debezium uses packages to organize code with similar or related functions. This means that you can control all of the log messages for a specific class or for all of the classes within or under a specific package.
- Open the log4j.properties file.
- Configure a logger for the connector.
  This example configures loggers for the MySQL connector and the database history implementation used by the connector, and sets them to log DEBUG level messages:
  log4j.properties
  ...
  log4j.logger.io.debezium.connector.mysql=DEBUG, stdout (1)
  log4j.logger.io.debezium.relational.history=DEBUG, stdout (2)
  log4j.additivity.io.debezium.connector.mysql=false (3)
  log4j.additivity.io.debezium.relational.history=false (3)
  ...
  1 | Configures the logger named io.debezium.connector.mysql to send DEBUG, INFO, WARN, and ERROR messages to the stdout appender.
  2 | Configures the logger named io.debezium.relational.history to send DEBUG, INFO, WARN, and ERROR messages to the stdout appender.
  3 | Turns off additivity, which results in log messages not being sent to the appenders of parent loggers (this can prevent seeing duplicate log messages when using multiple appenders).
- If necessary, change the logging level for a specific subset of the classes within the connector.
  Increasing the logging level for the entire connector increases the log verbosity, which can make it difficult to understand what is happening. In these cases, you can change the logging level just for the subset of classes that are related to the issue that you are troubleshooting.
  - Set the connector’s logging level to either DEBUG or TRACE.
  - Review the connector’s log messages.
    Find the log messages that are related to the issue that you are troubleshooting. The end of each log message shows the name of the Java class that produced the message.
  - Set the connector’s logging level back to INFO.
  - Configure a logger for each Java class that you identified.
    For example, consider a scenario in which you are unsure why the MySQL connector is skipping some events when it is processing the binlog. Rather than turn on DEBUG or TRACE logging for the entire connector, you can keep the connector’s logging level at INFO and then configure DEBUG or TRACE on just the class that is reading the binlog:
    log4j.properties
    ...
    log4j.logger.io.debezium.connector.mysql=INFO, stdout
    log4j.logger.io.debezium.connector.mysql.BinlogReader=DEBUG, stdout
    log4j.logger.io.debezium.relational.history=INFO, stdout
    log4j.additivity.io.debezium.connector.mysql=false
    log4j.additivity.io.debezium.relational.history=false
    log4j.additivity.io.debezium.connector.mysql.BinlogReader=false
    ...
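When you review verbose output to decide which classes deserve their own logger, it can help to count how many messages each class produces. The following is a minimal sketch, assuming that log lines use the default conversion pattern shown earlier, which ends each line with the logger name in parentheses. The class name LoggerNameTally and the sample log lines are hypothetical and are not part of Debezium.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Counts log lines per logger name, assuming each line ends with "(logger.name)". */
public class LoggerNameTally {

    // Matches a trailing "(fully.qualified.Name)" at the end of a line.
    private static final Pattern TRAILING_LOGGER = Pattern.compile("\\(([\\w.$]+)\\)\\s*$");

    public static Map<String, Integer> countByLogger(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher matcher = TRAILING_LOGGER.matcher(line);
            if (matcher.find()) {
                counts.merge(matcher.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines in the "[%d] %p %m (%c)%n" format.
        List<String> lines = Arrays.asList(
            "[2017-02-07 20:49:37,692] DEBUG Reading event (io.debezium.connector.mysql.BinlogReader)",
            "[2017-02-07 20:49:37,693] DEBUG Skipping event (io.debezium.connector.mysql.BinlogReader)",
            "[2017-02-07 20:49:37,700] INFO Snapshot step done (io.debezium.connector.mysql.SnapshotReader)");
        countByLogger(lines).forEach((logger, count) -> System.out.println(count + " " + logger));
    }
}
```

The logger names that appear most often next to the suspicious behavior are reasonable candidates for a dedicated DEBUG or TRACE logger entry.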
Adding mapped diagnostic contexts
Most Debezium connectors (and the Kafka Connect workers) use multiple threads to perform different activities. This can make it difficult to look at a log file and find only those log messages for a particular logical activity. To make the log messages easier to find, Debezium provides several mapped diagnostic contexts (MDC) that provide additional information for each thread.
Debezium provides the following MDC properties:
- dbz.connectorType - A short alias for the type of connector. For example, MySql, Mongo, Postgres, and so on. All threads associated with the same type of connector use the same value, so you can use this to find all log messages produced by a given type of connector.
- dbz.connectorName - The name of the connector or database server as defined in the connector’s configuration. For example, products, serverA, and so on. All threads associated with a specific connector instance use the same value, so you can find all of the log messages produced by a specific connector instance.
- dbz.connectorContext - A short name for an activity that runs as a separate thread within the connector’s task. For example, main, binlog, snapshot, and so on. In some cases, when a connector assigns threads to specific resources (such as a table or collection), the name of that resource could be used instead. Each thread associated with a connector uses a distinct value, so you can find all of the log messages associated with this particular activity.
To enable MDC for a connector, you configure an appender in the log4j.properties file.
- Open the log4j.properties file.
- Configure an appender to use any of the supported Debezium MDC properties.
  In the following example, the stdout appender is configured to use these MDC properties:
  log4j.properties
  ...
  log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %X{dbz.connectorType}|%X{dbz.connectorName}|%X{dbz.connectorContext} %m [%c]%n
  ...
  The configuration in the preceding example produces log messages similar to the ones in the following output:
  ...
  2017-02-07 20:49:37,692 INFO MySQL|dbserver1|snapshot Starting snapshot for jdbc:mysql://mysql:3306/?useInformationSchema=true&nullCatalogMeansCurrent=false&useSSL=false&useUnicode=true&characterEncoding=UTF-8&characterSetResults=UTF-8&zeroDateTimeBehavior=convertToNull with user 'debezium' [io.debezium.connector.mysql.SnapshotReader]
  2017-02-07 20:49:37,696 INFO MySQL|dbserver1|snapshot Snapshot is using user 'debezium' with these MySQL grants: [io.debezium.connector.mysql.SnapshotReader]
  2017-02-07 20:49:37,697 INFO MySQL|dbserver1|snapshot GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'debezium'@'%' [io.debezium.connector.mysql.SnapshotReader]
  ...
  Each line in the log includes the connector type (for example, MySQL), the name of the connector (for example, dbserver1), and the activity of the thread (for example, snapshot).
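Once the MDC fields are part of each line, they can be used to isolate a single activity. The following is a minimal sketch, assuming log lines in the layout produced by the conversion pattern in the preceding example (timestamp, level, type|name|context, message, logger in square brackets). The class name MdcLogFilter is hypothetical, the first sample line is shortened from the output above, and the second sample line is invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Filters log lines by the connector name and context written by the MDC-aware pattern. */
public class MdcLogFilter {

    // Groups: 1 timestamp, 2 level, 3 connector type, 4 connector name,
    //         5 connector context, 6 message, 7 logger name.
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+ \\S+)\\s+(\\S+)\\s+([^|\\s]*)\\|([^|\\s]*)\\|(\\S*)\\s+(.*?)\\s+\\[([\\w.$]+)\\]\\s*$");

    /** Returns the lines whose connector name and context equal the given values. */
    public static List<String> filter(List<String> lines, String connectorName, String context) {
        List<String> matches = new ArrayList<>();
        for (String line : lines) {
            Matcher matcher = LINE.matcher(line);
            if (matcher.matches()
                    && matcher.group(4).equals(connectorName)
                    && matcher.group(5).equals(context)) {
                matches.add(line);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            // Shortened from the sample output in this section.
            "2017-02-07 20:49:37,692 INFO  MySQL|dbserver1|snapshot Starting snapshot [io.debezium.connector.mysql.SnapshotReader]",
            // Hypothetical line from a different thread of the same connector.
            "2017-02-07 20:49:38,001 INFO  MySQL|dbserver1|binlog Connected to binlog [io.debezium.connector.mysql.BinlogReader]");
        for (String line : filter(lines, "dbserver1", "snapshot")) {
            System.out.println(line);
        }
    }
}
```

Lines that do not follow the layout, such as stack trace continuation lines, are skipped by this sketch rather than attributed to the preceding message.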
Configuring the log level in the Debezium container images
The Debezium container images for Zookeeper, Kafka, and Kafka Connect all set up their log4j.properties file to configure the Debezium-related loggers. All log messages are sent to the Docker container’s console (and thus the Docker logs). The log messages are also written to files under the /kafka/logs directory.
The containers use a LOG_LEVEL environment variable to set the log level for the root logger. You can use this environment variable to set the log level for the service running in the container. When you start the container and set the value of this environment variable to a log level (for example, -e LOG_LEVEL=DEBUG), all of the code within the container then uses that log level.
There is also an option to override other log4j properties. If you want to configure log4j.rootLogger differently, then use the environment variable CONNECT_LOG4J_LOGGERS. For example, to log only to stdout (without the appender named appender), you can use CONNECT_LOG4J_LOGGERS=INFO, stdout. You can also set other supported log4j environment variables with the CONNECT_LOG4J prefix. Each such variable is mapped to a property in the log4j.properties file by removing the CONNECT_ prefix, lowercasing all characters, and converting all '_' characters to '.'.
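The naming rule above is mechanical, so it can be checked with a few lines of code. The following is a minimal sketch of that rule only, not of the startup script in the container images. The class name ConnectLog4jEnvMapping and the variable CONNECT_LOG4J_APPENDER_STDOUT_THRESHOLD are hypothetical examples. CONNECT_LOG4J_LOGGERS is described separately above as the way to set log4j.rootLogger, so the sketch does not model that variable.

```java
import java.util.Locale;

/** Applies the documented naming rule to a CONNECT_LOG4J_* environment variable name. */
public class ConnectLog4jEnvMapping {

    private static final String PREFIX = "CONNECT_";

    /** Removes the CONNECT_ prefix, lowercases the rest, and converts '_' to '.'. */
    public static String toPropertyName(String envVarName) {
        if (!envVarName.startsWith(PREFIX + "LOG4J")) {
            throw new IllegalArgumentException("Not a CONNECT_LOG4J variable: " + envVarName);
        }
        return envVarName.substring(PREFIX.length())
                .toLowerCase(Locale.ROOT)
                .replace('_', '.');
    }

    public static void main(String[] args) {
        // Hypothetical variable name, used only to show the transformation.
        System.out.println(toPropertyName("CONNECT_LOG4J_APPENDER_STDOUT_THRESHOLD"));
        // prints log4j.appender.stdout.threshold
    }
}
```

Because every character is lowercased, a variable name cannot express a mixed-case property key, which is worth keeping in mind when deciding between environment variables and a custom log4j.properties file.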
If you need more control over the logging configuration, create a new container image that is based on the Debezium image, and in your Dockerfile copy your own log4j.properties file into the image. For example:
...
COPY log4j.properties $KAFKA_HOME/config/log4j.properties
...