Connectors (BI, Kafka, Spark)
-
Kafka Sink Connector ObjectId Support
The sink connector should support ObjectIds for the document.id.strategy. In other words, if an ObjectId hex string is provided for _id, the sink connector should be able to convert it to an ObjectId (a workaround sketch is given below). I've scoured the documentation and the rest of the internet and have not been able to determine how to do this. I see an open PR on the GitHub repo, but there has been no movement in almost 3 years. It's astonishing that I cannot use an ObjectId...
1 vote -
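For the request above, a minimal workaround sketch assuming the sink connector's pluggable IdStrategy extension point (the package and class name of the custom strategy are illustrative): it turns a 24-character hex _id string in the record value into a real ObjectId.

```java
package com.example.sink;                                    // illustrative package name

import java.util.Map;
import java.util.Optional;

import org.apache.kafka.connect.errors.DataException;
import org.apache.kafka.connect.sink.SinkRecord;
import org.bson.BsonDocument;
import org.bson.BsonObjectId;
import org.bson.BsonValue;
import org.bson.types.ObjectId;

import com.mongodb.kafka.connect.sink.converter.SinkDocument;
import com.mongodb.kafka.connect.sink.processor.id.strategy.IdStrategy;

// Sketch of a custom id strategy: if the record value carries _id as a 24-char
// ObjectId hex string, emit a real BsonObjectId instead of a BsonString.
public class HexStringObjectIdStrategy implements IdStrategy {

  @Override
  public BsonValue generateId(final SinkDocument doc, final SinkRecord orig) {
    Optional<BsonDocument> value = doc.getValueDoc();
    if (value.isPresent() && value.get().containsKey("_id")) {
      BsonValue id = value.get().get("_id");
      if (id.isString() && ObjectId.isValid(id.asString().getValue())) {
        return new BsonObjectId(new ObjectId(id.asString().getValue()));
      }
      return id;                                              // pass through non-hex ids unchanged
    }
    throw new DataException("Sink record has no value or no _id field");
  }

  public void configure(final Map<String, ?> configs) {
    // no extra configuration needed for this sketch
  }
}
```

With the jar on the connector's plugin path, the strategy would be registered via document.id.strategy=com.example.sink.HexStringObjectIdStrategy (class name illustrative).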
Retry/reconnect mechanism for MongoDB Source Connectors on MongoTimeoutException
The MongoDB Kafka Source and Sink connectors for data streaming work seamlessly until an error occurs in the Kafka Source connectors.
If an error occurs, the connectors do not recover from the timeout exceptions and the connectors need to be reposted/restarted.
Exception:
com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches
3 votes -
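Until there is a built-in retry/reconnect, one interim workaround for the behaviour described above is to poll the Kafka Connect REST API and restart failed tasks automatically. A rough sketch, assuming a Connect worker at http://localhost:8083 and a connector named mongo-source (both illustrative):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: check the connector status and restart a failed task (e.g. after a
// MongoTimeoutException). A real implementation should parse the status JSON,
// iterate over all failed task ids, and add backoff/alerting.
public class RestartFailedConnectorTask {

  private static final String WORKER = "http://localhost:8083";  // illustrative worker URL
  private static final String CONNECTOR = "mongo-source";        // illustrative connector name

  public static void main(String[] args) throws Exception {
    HttpClient client = HttpClient.newHttpClient();

    HttpRequest status = HttpRequest.newBuilder()
        .uri(URI.create(WORKER + "/connectors/" + CONNECTOR + "/status"))
        .GET().build();
    String body = client.send(status, HttpResponse.BodyHandlers.ofString()).body();

    if (body.contains("\"state\":\"FAILED\"")) {                 // naive check; parse JSON in practice
      HttpRequest restart = HttpRequest.newBuilder()
          .uri(URI.create(WORKER + "/connectors/" + CONNECTOR + "/tasks/0/restart"))
          .POST(HttpRequest.BodyPublishers.noBody()).build();
      int code = client.send(restart, HttpResponse.BodyHandlers.ofString()).statusCode();
      System.out.println("Restarted task 0, HTTP " + code);      // task id 0 for brevity
    }
  }
}
```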
Support regex in topic.namespace.map (Kafka source connector)
Currently, topic.namespace.map supports the wildcard (*). It would be helpful if it supported regex as well.
Or, since that could be a breaking change, it would be nice if a new mapping option were introduced that supports a regex format. The workaround is to use a custom mapper (see the sketch below).
However, if there were a built-in version, the wheel could be invented once and used by many people.
1 vote -
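A sketch of that custom-mapper workaround, assuming the source connector's pluggable topic.mapper interface (a TopicMapper that maps each change stream document to a topic name); the rules are hardcoded for brevity and the exact configuration hook should be checked against the connector version in use:

```java
package com.example.source;                                     // illustrative package name

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

import org.bson.BsonDocument;
import org.bson.BsonString;

import com.mongodb.kafka.connect.source.topic.mapping.TopicMapper;

// Sketch of a regex-capable namespace-to-topic mapper: the first pattern that
// matches the "db.collection" namespace of a change event picks the topic.
// A production version would load the rules from the connector configuration.
public class RegexTopicMapper implements TopicMapper {

  private final Map<Pattern, String> rules = new LinkedHashMap<>();

  public RegexTopicMapper() {
    rules.put(Pattern.compile("^orders\\.archive_.*$"), "orders-archive");   // illustrative rules
    rules.put(Pattern.compile("^orders\\..*$"), "orders");
  }

  @Override
  public String getTopic(final BsonDocument changeStreamDocument) {
    BsonDocument ns = changeStreamDocument.getDocument("ns", new BsonDocument());
    String namespace = ns.getString("db", new BsonString("")).getValue()
        + "." + ns.getString("coll", new BsonString("")).getValue();

    return rules.entrySet().stream()
        .filter(e -> e.getKey().matcher(namespace).matches())
        .map(Map.Entry::getValue)
        .findFirst()
        .orElse(namespace.replace('.', '-'));                   // fallback topic name
  }
}
```

It would be wired in with topic.mapper=com.example.source.RegexTopicMapper (class name illustrative), which is exactly the wheel a built-in regex option would save everyone from reinventing.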
CSFLE support for Kafka connector
Using MongoDB CSFLE, the data needs to be passed downstream using the Kafka connector, where it is consumed by a downstream data lake for further processing. When data is pushed into Kafka it remains encrypted, and the same encrypted data lands in the data lake, since encryption and decryption happen at the driver level.
If the Kafka connector supported CSFLE encryption/decryption, this would work seamlessly.
4 votes -
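For context on where the boundary sits today: decryption only happens in a CSFLE-configured driver, so anything that merely reads the Kafka topic sees ciphertext. A minimal sketch of such a client (the local KMS provider and key vault namespace are illustrative); the request above is essentially for the connector to do the equivalent internally.

```java
import java.util.HashMap;
import java.util.Map;

import com.mongodb.AutoEncryptionSettings;
import com.mongodb.ConnectionString;
import com.mongodb.MongoClientSettings;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;

// Sketch: today, CSFLE encryption/decryption happens in the driver. A client
// configured like this reads decrypted documents; anything that only sees the
// Kafka topic (e.g. the downstream data lake) sees ciphertext instead.
public class CsfleClientSketch {

  public static MongoClient build(final byte[] localMasterKey) {
    Map<String, Object> localKms = new HashMap<>();
    localKms.put("key", localMasterKey);                        // 96-byte local master key

    Map<String, Map<String, Object>> kmsProviders = new HashMap<>();
    kmsProviders.put("local", localKms);                        // "local" KMS for illustration only

    AutoEncryptionSettings autoEncryption = AutoEncryptionSettings.builder()
        .keyVaultNamespace("encryption.__keyVault")             // illustrative key vault namespace
        .kmsProviders(kmsProviders)
        .build();

    return MongoClients.create(MongoClientSettings.builder()
        .applyConnectionString(new ConnectionString("mongodb://localhost:27017"))
        .autoEncryptionSettings(autoEncryption)
        .build());
  }
}
```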
Dynamic Topic Mapping on the basis of message content
Currently the MongoDB source connector only supports dynamic topic mapping on the basis of collections/databases. Can we extend it to support routing on the basis of message content?
Why is it important?
We currently have a connector set up for a very large collection, but since we wanted to route the data for different sections to different topics, we had to set up a separate connector for each and define the filter in the pipeline section (it runs as an aggregation query on change streams to filter out the relevant data). This obviously created performance concerns, as a large number of collscan queries started to run on these…
1 vote -
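A rough sketch of what content-based routing could look like through the same pluggable topic.mapper hook, so a single connector could fan out one large collection instead of running one connector-plus-pipeline per section (field and topic names are illustrative, and updates need change.stream.full.document=updateLookup so fullDocument is populated):

```java
package com.example.source;                                     // illustrative package name

import org.bson.BsonDocument;

import com.mongodb.kafka.connect.source.topic.mapping.TopicMapper;

// Sketch: route each change event to a topic derived from a field in the
// document itself, so a single connector can fan out a large collection.
public class SectionTopicMapper implements TopicMapper {

  @Override
  public String getTopic(final BsonDocument changeStreamDocument) {
    BsonDocument fullDocument =
        changeStreamDocument.getDocument("fullDocument", new BsonDocument());
    if (fullDocument.containsKey("section") && fullDocument.get("section").isString()) {
      // e.g. section = "billing" -> topic "events.billing" (naming is illustrative)
      return "events." + fullDocument.getString("section").getValue();
    }
    return "events.unrouted";                                   // fallback topic
  }
}
```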
Change stream total ordering within a transaction
Right now, a change stream flattens a transaction into individual operations, i.e., if we do two operations within a transaction, the change stream generates two events.
However, there is no total ordering of the events that happen within the transaction. This matters if we do two updates on the same object within a transaction: since both operations share the same optime (clusterTime), there is no way to establish the ordering between these two events.
It would be helpful to provide an ordering number on every event from the same transaction for this purpose.
1 vote -
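A small sketch that makes the gap visible, assuming the Java driver: it prints the metadata currently available for ordering, and two updates to the same document inside one transaction show up as two events with the same clusterTime.

```java
import org.bson.Document;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.model.changestream.ChangeStreamDocument;

// Sketch: watch a collection and print the ordering-related metadata of each
// event. Two updates to the same document inside one transaction arrive as two
// events that share the same clusterTime (and lsid/txnNumber), hence the ask
// for an explicit per-transaction sequence number.
public class TransactionOrderingDemo {
  public static void main(String[] args) {
    try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
      client.getDatabase("test").getCollection("orders").watch()
          .forEach((ChangeStreamDocument<Document> event) ->
              System.out.printf("op=%s clusterTime=%s txnNumber=%s resumeToken=%s%n",
                  event.getOperationType(),
                  event.getClusterTime(),
                  event.getTxnNumber(),
                  event.getResumeToken()));
    }
  }
}
```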
Kafka connector to support Kafka Schema Registry
One of the issues our team has been discussing is that when getting data from MongoDB via a Kafka connector and sending it through to Kafka, we try to enforce schemas in Kafka, but those schemas are not enforced on the MongoDB data. This leads to developers needing to make sure they let the Data Engineering team know when their schema evolves so we can accommodate that change in the Avro schema. Our thought is to potentially have the developers use the Confluent Schema Registry to serialize their data to Avro prior to writing it to MongoDB. This would…
2 votes -
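A minimal sketch of that idea, assuming plain Avro: validate the record against the registry-managed schema before it is ever written to MongoDB, so the MongoDB data cannot drift from what Kafka expects. The schema is inlined here; in the proposed setup it would come from the Confluent Schema Registry.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

// Sketch: validate a record against the Avro schema agreed with Kafka before
// writing it to MongoDB, so schema evolution is caught at write time instead
// of breaking downstream consumers.
public class AvroPreWriteValidation {

  private static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"Customer\",\"fields\":["
          + "{\"name\":\"name\",\"type\":\"string\"},"
          + "{\"name\":\"age\",\"type\":\"int\"}]}");           // illustrative schema

  public static boolean isValid(final String name, final int age) {
    GenericRecord record = new GenericData.Record(SCHEMA);
    record.put("name", name);
    record.put("age", age);
    return GenericData.get().validate(SCHEMA, record);          // structural check only
  }
}
```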
Built-in CDC to Kafka
Hi,
it is still hard to set up a MongoDB oplog CDC connection to Kafka to publish changes from e.g. a microservice-local MongoDB. You typically have to use Kafka Connect and either the official MongoDB Atlas Connector, or the Debezium Open Source Connector.
One of the databases competing with MongoDB, CockroachDB, has a built-in feature to publish "change feeds" to Kafka (see https://www.cockroachlabs.com/docs/stable/stream-data-out-of-cockroachdb-using-changefeeds.html).
I'd love to see a similar feature for MongoDB, since this would allow us to keep MongoDB and Kafka in sync much more easily and conveniently, without having to care about yet another (probably centralized)…
1 vote -
Ignore heartbeats-mongodb topic by default
As per KAFKA-208, SMTs can't be applied to the heartbeats-mongodb topic. Users should not have to configure each connector to ignore this topic. Please either ignore this topic by default or provide a command-line switch so it can be ignored.
4 votes -
Get schema validation "feedback" in Kafka Mongo Sink Connector
Objective:
We want to be able to validate that data matches some requirements. We would like to perform this data validation by adding a JSON schema in Mongo (as described here: https://docs.mongodb.com/manual/core/schema-validation/).
The problem is that the current implementation of the MongoDB Kafka Sink connector does not implement the required elements to benefit from the features brought by this KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-610%3A+Error+Reporting+in+Sink+Connectors
So if we define such a validation in Mongo and a message has a value that does not match the definition, it will not go to the dead letter queue, and the…
4 votes -
MongoDB Sink Connector CDC default handler
I would like to have a default CDC handler that can process data produced by the MongoDB Source Connector without Debezium: https://docs.mongodb.com/kafka-connector/master/kafka-sink-cdc#cdc-handler-configuration
3 votes -
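A sketch of such a handler, assuming the sink connector's CdcHandler extension point (a constructor taking the sink topic config and a handle(SinkDocument) method returning an optional write model, as in the shipped CDC handlers); newer connector releases may already include an equivalent built-in handler, so it is worth checking the current docs before rolling your own.

```java
package com.example.sink;                                       // illustrative package name

import java.util.Optional;

import org.bson.BsonDocument;
import org.bson.BsonString;

import com.mongodb.client.model.DeleteOneModel;
import com.mongodb.client.model.ReplaceOneModel;
import com.mongodb.client.model.ReplaceOptions;
import com.mongodb.client.model.WriteModel;
import com.mongodb.kafka.connect.sink.MongoSinkTopicConfig;
import com.mongodb.kafka.connect.sink.cdc.CdcHandler;
import com.mongodb.kafka.connect.sink.converter.SinkDocument;

// Sketch of a CDC handler for events produced by the MongoDB Source Connector
// itself (no Debezium): upsert on insert/replace/update-with-fullDocument,
// delete on delete. Other operation types are ignored.
public class ChangeStreamCdcHandler extends CdcHandler {

  public ChangeStreamCdcHandler(final MongoSinkTopicConfig config) {
    super(config);
  }

  @Override
  public Optional<WriteModel<BsonDocument>> handle(final SinkDocument doc) {
    BsonDocument event = doc.getValueDoc().orElse(new BsonDocument());
    BsonDocument filter = event.getDocument("documentKey", new BsonDocument());

    switch (event.getString("operationType", new BsonString("")).getValue()) {
      case "insert":
      case "replace":
      case "update":
        // assumes the source was configured with change.stream.full.document=updateLookup
        return Optional.of(new ReplaceOneModel<>(
            filter, event.getDocument("fullDocument"), new ReplaceOptions().upsert(true)));
      case "delete":
        return Optional.of(new DeleteOneModel<>(filter));
      default:
        return Optional.empty();
    }
  }
}
```

It would be enabled per topic with change.data.capture.handler=com.example.sink.ChangeStreamCdcHandler (class name illustrative).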
Kafka source connector exactly-once semantics
Added as a support case here: https://support.mongodb.com/case/00634630
This is about using the connector as a Source, i.e. we capture change streams from the source MongoDB and stream them to a Kafka endpoint.
Imagine these are updates on financial transactions in MongoDB and they are NOT tolerant to
1) missed data and
2) duplicated data
in that order. So we need to make sure that the change streams we are observing (matching) on are delivered once and exactly once to the Kafka pipeline. (Blog on the same: https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/). If exactly-once semantics are enabled, it makes commits transactional by default.
…
4 votes
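For reference, a hand-rolled sketch of the moving parts exactly-once publication from a change stream implies (this is not how the connector does it): a transactional Kafka producer that commits each change event together with its resume token, so a restart can resume from the last committed token without gaps or duplicates. Connection strings, topic names, and serialization are illustrative; error handling, resume-token reloading, and batching are omitted.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import org.bson.Document;

import com.mongodb.client.MongoChangeStreamCursor;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.model.changestream.ChangeStreamDocument;

// Sketch: publish change stream events with a transactional producer and commit
// the resume token in the same Kafka transaction, so after a crash the stream
// can be resumed from the last committed token without losing or duplicating
// events.
public class ExactlyOnceChangeStreamPublisher {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "mongo-cdc-publisher");   // illustrative id
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

    try (MongoClient mongo = MongoClients.create("mongodb://localhost:27017");
         KafkaProducer<String, String> producer = new KafkaProducer<>(props);
         MongoChangeStreamCursor<ChangeStreamDocument<Document>> cursor =
             mongo.getDatabase("bank").getCollection("transactions").watch().cursor()) {

      producer.initTransactions();
      while (cursor.hasNext()) {
        ChangeStreamDocument<Document> event = cursor.next();
        producer.beginTransaction();
        producer.send(new ProducerRecord<>("bank.transactions", event.getDocumentKey().toJson(),
            event.getFullDocument() == null ? null : event.getFullDocument().toJson()));
        producer.send(new ProducerRecord<>("bank.transactions.offsets", "resumeToken",
            cursor.getResumeToken().toJson()));
        producer.commitTransaction();                           // event + token succeed or fail together
      }
    }
  }
}
```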