Kafka source connector exactly-once semantics
Added as a support case here: https://support.mongodb.com/case/00634630
We use the connector as a source, i.e. we capture change streams from the source MongoDB deployment and stream them to a Kafka endpoint.
Imagine these are updates to financial transactions in MongoDB, and they are NOT tolerant to
1) missed data and
2) duplicated data
in that order.
So we need to make sure that the change stream events we are observing (matching on) are delivered once and exactly once to the Kafka pipeline. (Blog on the same: https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/) With exactly-once semantics enabled, commits are transactional by default.
We were informed verbally that the connector supports exactly-once semantics. If that is not available, then in the worst case we would need at-least-once semantics enabled to make sure that we do not lose data under any circumstances.
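To illustrate the fallback: if only at-least-once delivery is available, duplicates could in principle be filtered on the consuming side by remembering each event's resume token (the `_id` field of a change stream event). The following is a minimal sketch under that assumption, not the connector's own behaviour. The function name `apply_once`, the in-memory `seen_ids` set, the `ledger` dict, and the sample field names `txn_id` and `amount` are all hypothetical.

```python
import json


def apply_once(events, seen_ids, ledger):
    """Apply each change event at most once, keyed by its resume token.

    Returns the number of events actually applied.
    """
    applied = 0
    for event in events:
        # Change stream events carry a resume token in `_id`.
        # Serialise it to a stable string so it can be used as a set key.
        key = json.dumps(event["_id"], sort_keys=True)
        if key in seen_ids:
            # Redelivered under at-least-once delivery: skip the duplicate.
            continue
        txn = event["fullDocument"]  # hypothetical document shape
        ledger[txn["txn_id"]] = txn["amount"]
        seen_ids.add(key)
        applied += 1
    return applied


# Sample events; the third one is a redelivery of the first.
events = [
    {"_id": {"_data": "8201"}, "fullDocument": {"txn_id": "t1", "amount": 100}},
    {"_id": {"_data": "8202"}, "fullDocument": {"txn_id": "t2", "amount": 250}},
    {"_id": {"_data": "8201"}, "fullDocument": {"txn_id": "t1", "amount": 100}},
]

seen_ids = set()
ledger = {}
applied = apply_once(events, seen_ids, ledger)
```

In a real pipeline the set of seen tokens would need to live in a durable store and be updated atomically with the ledger write; otherwise a crash between the two steps reintroduces the duplicate or the loss this is meant to prevent.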
-
Robert commented
Thank you for your feedback. We have clarified the message delivery guarantee in the documentation:
https://docs.mongodb.com/kafka-connector/current/kafka-source#message-delivery-guarantee