Connectors (BI, Kafka, Spark)
41 results found
Kafka connector to support Kafka Schema Registry
One of the issues that our team has been talking about is when getting data from MongoDB, via a Kafka connector, and sending it through to Kafka we try to enforce schemas in Kafka but that schema is not enforced on the MongoDB data. This leads to developers needing to make sure they let the Data Engineering team know when their schema evolves so we can accommodate that change in the Avro schema. Our thought is to potentially have the developers use the Confluent Schema Registry to serialize their data to Avro prior to writing it to MongoDB. This would…
2 votes
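Enforcing Registry-backed Avro on the Kafka side of the source connector is normally done with the standard Kafka Connect converter settings rather than application code. Below is a minimal sketch of such a Connect payload, written as a Python dict for illustration; the connector class and converter property names are real Kafka Connect/Confluent identifiers, while the URIs, database, and collection names are assumptions:

```python
# Sketch of the JSON body POSTed to the Kafka Connect REST API to run
# the MongoDB source connector with Avro values validated against a
# Confluent Schema Registry. Hostnames, database, and collection are
# illustrative placeholders.
mongo_source_config = {
    "name": "mongo-source-avro",
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
        "connection.uri": "mongodb://localhost:27017",
        "database": "shop",
        "collection": "orders",
        # Avro serialization enforced through the Schema Registry:
        "value.converter": "io.confluent.connect.avro.AvroConverter",
        "value.converter.schema.registry.url": "http://localhost:8081",
    },
}
```

With this in place, records that do not match the registered Avro schema are rejected at the converter, so schema drift surfaces in Kafka rather than silently propagating from MongoDB.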
Provide an alternative sampling technique for views in MongoBI Connector
As stated in the docs (https://docs.mongodb.com/manual/reference/operator/aggregation/sample/#behavior), the $sample stage will perform a full COLLSCAN if certain criteria aren't met. The problem is that no view (https://docs.mongodb.com/manual/core/views/) will ever meet these criteria, because $sample won't be the first aggregation stage.
My proposal is to implement an additional, $skip + $limit based sampling technique.
2 votes
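The proposed $skip + $limit sampling can be sketched as a pipeline builder. This is a stdlib-only illustration of the idea, not the connector's actual sampling code; the helper name is hypothetical:

```python
import random

def skip_limit_sample_pipeline(doc_count: int, sample_size: int) -> list:
    """Build an aggregation pipeline that emulates random sampling on a
    view by skipping to a random offset and limiting the result size.
    Unlike $sample, $skip/$limit are valid at any pipeline position.
    """
    max_skip = max(doc_count - sample_size, 0)
    skip = random.randint(0, max_skip)  # inclusive random offset
    return [{"$skip": skip}, {"$limit": sample_size}]

# Usage against a view would look like (requires a live server):
# db.my_view.aggregate(skip_limit_sample_pipeline(count, 100))
```

The trade-off is that this samples one contiguous window per query rather than a uniform random subset, which is usually acceptable for schema discovery.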
Re-sample only part of the MongoBI Connector schema
It'd be great if we could run "FLUSH SAMPLE collectionName". It's rarely the case that a lot of collections have changed at once and such a "light re-sample" might be a good alternative.
2 votes
Implement error handling in MongoBI Connector
Right now, if the $sample query (used to create the DRDL schema) fails, it will re-run the query again, forever. The only place where the error is accessible is the "Profiler" tab of the given node. In our case - and I believe it's common - it was a secondary node, whose "Profiler" is far less accessible than the primary's.
We learned this the hard way, as our cluster was silently maxing out one CPU core for over a week. After a lot of debugging, it turned out that one of our views has…
2 votes
Can't fetch data on MongoDB ODBC via BI connector
Can't fetch data via the MongoDB ODBC driver through the BI connector. Test connections succeed, but we can't fetch data from the database servers; we only see the default 'information_schema' and 'mysql' databases.
2 votes
Deploy MongoDB BI Connector product using the MongoDB Kubernetes Operator
We would like the ability to deploy and run the MongoDB BI Connector as a container under the MongoDB Kubernetes Operator. Currently there is no support for such deployments.
2 votes
Kafka Sink Connector ObjectId Support
The sink connector should support ObjectIds for the document.id.strategy. In other words, if an ObjectId hex string is provided for _id, the sink connector should be able to convert it to an ObjectId. I've scoured the documentation and the rest of the internet and have not been able to determine how to do this. I see an open PR on the GitHub repo, but there has been no movement in almost 3 years. It's astonishing that I cannot use an ObjectId...
1 vote
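The detection step this request implies - deciding whether a provided _id string is really an ObjectId - can be sketched in a few lines. This is a stdlib-only illustration (the actual conversion would happen inside the sink's id strategy, e.g. via bson.ObjectId); the helper name is hypothetical:

```python
import re

# An ObjectId hex string is exactly 24 hexadecimal characters.
OBJECT_ID_HEX = re.compile(r"[0-9a-fA-F]{24}")

def looks_like_object_id(value) -> bool:
    """True if value should be converted to an ObjectId rather than
    stored as a plain string _id."""
    return isinstance(value, str) and OBJECT_ID_HEX.fullmatch(value) is not None
```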
Entity Framework (Preview 2) release date
Hello,
is there any ETA for Preview 2 (NuGet)?
Thank you
1 vote
Extend the support for the MongoDB ODBC driver to RHEL8
At present there is only support for RHEL7, which is effectively out of date. This would support more migrations to RHEL8.
1 vote
We have a MongoDB Sharded Cluster host thousands of collections. The schema file for bi connector is quite large. Every time we run a new qu
Current BI Connector behaviour is that it will run a listCollections command after an existing connection to the BI Connector has been idle for 1-2 minutes. For databases with thousands of collections, this takes several minutes to complete. After the connection to the BI Connector has been idle for a few minutes, the connection from the BI Connector to MongoDB is dropped, and when a new connection is created, the listCollections command is called again. This overhead is unnecessary.
Request: an option to keep the connection alive, or some form of connection-pooling mechanism, so that listCollections will not be called over and…
1 vote
Support regex in topic.namespace.map (Kafka source connector)
Currently, topic.namespace.map supports a wildcard (*). It would be helpful if it supported regex as well. Or, since that could be a breaking change, it would be nice if a new mapping option were introduced that supports a regex format.
A workaround is to use a custom mapper. However, if there were a built-in version, the wheel could be invented once and used by many people.
1 vote
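The custom-mapper workaround mentioned above can be sketched with ordinary regex matching. This is a stdlib-only illustration of the requested behaviour; the patterns, topic names, and function name are assumptions, not connector configuration:

```python
import re

# Ordered list of (pattern, topic): the first matching pattern wins,
# mirroring how a regex-capable topic.namespace.map might behave.
TOPIC_MAP = [
    (re.compile(r"^mydb\.orders_\d+$"), "orders-topic"),
    (re.compile(r"^mydb\..*"), "mydb-default-topic"),
]

def map_namespace(namespace: str, default: str = "unmapped") -> str:
    """Return the topic for the first pattern matching the namespace."""
    for pattern, topic in TOPIC_MAP:
        if pattern.match(namespace):
            return topic
    return default
```

Keeping the map ordered makes precedence explicit, which a plain wildcard map cannot express.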
Dynamic Topic Mapping on the basis of message content
Currently the MongoDB source connector only supports dynamic topic mapping on the basis of collections/databases. Can we extend it to support routing on the basis of message content?
Why is this important?
We've currently set up a connector for a very large collection, but since we wanted to route the data for different sections to different topics, we had to set up a separate connector for each and define the filter in the pipeline section (it runs as an aggregate query on change streams to filter out the relevant data). This obviously created performance concerns as a large number of COLLSCAN queries started to run on these…
1 vote
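The requested behaviour - one connector fanning events out by document content instead of one connector per filter - can be sketched as a small routing function over change stream events. Stdlib-only illustration; the "section" field and topic names are assumptions:

```python
# Hypothetical content-based router for change stream events. A single
# connector could apply this per event instead of running one filtered
# pipeline (and one collection scan) per destination topic.
def route_by_content(event: dict, default: str = "misc-topic") -> str:
    """Pick a Kafka topic from the changed document's content."""
    doc = event.get("fullDocument") or {}
    routes = {"billing": "billing-topic", "audit": "audit-topic"}
    return routes.get(doc.get("section"), default)
```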
Build a version of the BI Connector that is compatible with Alpine
We deploy the BI Connector within a container that runs Alpine (so it's small and clean). However, Alpine is bundled with musl rather than the glibc libs, which means the BI Connector cannot run in that container.
We have managed to work around this by using gcompat to layer in glibc, but that is not ideal for us, as it adds complexity and rather defeats the purpose of using Alpine.
You build several versions of the BI Connector binary - are you able to build an Alpine-compatible version as well?
1 vote
Change stream total ordering within a transaction
Right now, a change stream will flatten a transaction into individual operations, i.e., if we do two operations within a transaction, the change stream will generate two events.
However, there is no total ordering of the events that happen within the transaction. This matters when we do two updates on the same object within a transaction: since both operations share the same optime (clusterTime), there is no way to establish the ordering between these two events.
It would be helpful if an ordering number could be provided on every event from the same transaction for this purpose.
1 vote
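The requested ordering number can be sketched as a post-processing step that annotates events from the same transaction with a monotonically increasing ordinal. Stdlib-only illustration; "txnOpIndex" is a hypothetical field name, not an actual change stream field, and the event shapes in the usage are made up:

```python
# Annotate change stream events with a per-transaction sequence number,
# grouping by (lsid, txnNumber). repr() keeps the grouping key hashable
# even if lsid is itself a document.
def annotate_txn_order(events):
    counters = {}
    annotated = []
    for ev in events:
        key = (repr(ev.get("lsid")), ev.get("txnNumber"))
        seq = counters.get(key, 0)
        annotated.append({**ev, "txnOpIndex": seq})
        counters[key] = seq + 1
    return annotated
```

This only recovers arrival order on a single stream; a server-side ordinal, as requested, would make the ordering authoritative.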
Two tiered model for authentication
The BI Connector facilitates large numbers (1000s) of "personal data marts" by acting as a controlled go-between for end-user tools like Tableau and a "main" data collection. It is not practical, or even desirable, to have pass-through authentication of all these users to the backend MongoDB database. Instead, the BI Connector could use a special collection in a MongoDB instance (not necessarily the target!) to hold SHA(password), name, and a YAML equivalent. When started, mongosqld would verify command-line inputs of SHA(password), name, context, etc., and if OK, would expose an appropriately password-protected endpoint at 3307 with the config…
1 vote
Allow per-field length declaration for varchar and char types in BI Connector
Usually when defining schemas for SQL databases, you can specify a max length for a char or varchar column. It'd be nice to have the ability to do that in a schema passed to the mongosqld BI Connector process.
The only option now is to specify a max varchar size that applies to all varchar fields. It'd be nice to be able to define this on a per-field basis.
This is an issue for a customer I'm working with because of the way their BI tool allocates memory for temporary objects created when bridging the…
1 vote
built-in CDC to Kafka
Hi,
it is still hard to set up a MongoDB oplog CDC connection to Kafka to publish changes from, e.g., a microservice-local MongoDB. You typically have to use Kafka Connect and either the official MongoDB Atlas Connector or the Debezium open-source connector.
One of the databases competing with MongoDB, CockroachDB, has a built-in feature to publish "changefeeds" to Kafka (see https://www.cockroachlabs.com/docs/stable/stream-data-out-of-cockroachdb-using-changefeeds.html).
I'd love to see a similar feature for MongoDB, since it would let us keep MongoDB and Kafka in sync much more easily and conveniently - without having to care about yet another (probably centralized)…
1 vote
Atlas BI Connector Connection Timeout options
Default connection timeouts using the BI Connector in MongoDB Atlas are limited to 30 seconds, which can be limiting when running longer queries. By exposing maxTimeMs and related connection parameters in the Atlas web interface, users could increase the timeout for longer-running queries.
1 vote
Make BI Connector for Atlas pricing clear upfront
From this page https://docs.atlas.mongodb.com/bi-connection/ it appears that the BI Connector for Atlas is available if I have an M10 or larger cluster. I upgraded to an M10 cluster to get the BI Connector, only to discover that enabling it incurs additional costs. I could not find these additional BI Connector prices detailed anywhere on your website except in my account after I upgraded to the M10 cluster. And even these pricing details are not clear:
BI Connector
$1.47/day for sustained monthly usage
pricing for M10
or $3.84/day, up to $45.00/month maximum
What is…
1 vote
Dropbox
Lead with the drop that will very allow. Permit to be inside and all the time useful !!
stay tuned.
1 vote