Connectors (BI, Kafka, Spark)
41 results found
Kafka connector to support Kafka Schema Registry
One of the issues that our team has been talking about is when getting data from MongoDB, via a Kafka connector, and sending it through to Kafka we try to enforce schemas in Kafka but that schema is not enforced on the MongoDB data. This leads to developers needing to make sure they let the Data Engineering team know when their schema evolves so we can accommodate that change in the Avro schema. Our thought is to potentially have the developers use the Confluent Schema Registry to serialize their data to Avro prior to writing it to MongoDB. This would…
2 votes
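Enforcing Registry-backed Avro on the Kafka side of the source connector is normally done with the standard Kafka Connect converter settings rather than application code. Below is a minimal sketch of such a Connect payload, written as a Python dict for illustration; the connector class and converter property names are real Kafka Connect/Confluent identifiers, while the URIs, database, and collection names are assumptions:

```python
# Sketch of the JSON body POSTed to the Kafka Connect REST API to run
# the MongoDB source connector with Avro values validated against a
# Confluent Schema Registry. Hostnames, database, and collection are
# illustrative placeholders.
mongo_source_config = {
    "name": "mongo-source-avro",
    "config": {
        "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
        "connection.uri": "mongodb://localhost:27017",
        "database": "shop",
        "collection": "orders",
        # Avro serialization enforced through the Schema Registry:
        "value.converter": "io.confluent.connect.avro.AvroConverter",
        "value.converter.schema.registry.url": "http://localhost:8081",
    },
}
```

With this in place, records that do not match the registered Avro schema are rejected at the converter, so schema drift surfaces in Kafka rather than silently propagating from MongoDB.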
Provide an alternative sampling technique for views in MongoBI Connector
As stated in the docs (https://docs.mongodb.com/manual/reference/operator/aggregation/sample/#behavior), the $sample stage will perform a full COLLSCAN if certain criteria aren't met. The problem is that no view (https://docs.mongodb.com/manual/core/views/) will ever meet these criteria, because $sample won't be the first aggregation stage.
My proposal is to implement an additional, $skip + $limit based sampling technique.
2 votes
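The proposed $skip + $limit sampling can be sketched as a pipeline builder. This is a stdlib-only illustration of the idea, not the connector's actual sampling code; the helper name is hypothetical:

```python
import random

def skip_limit_sample_pipeline(doc_count: int, sample_size: int) -> list:
    """Build an aggregation pipeline that emulates random sampling on a
    view by skipping to a random offset and limiting the result size.
    Unlike $sample, $skip/$limit are valid at any pipeline position.
    """
    max_skip = max(doc_count - sample_size, 0)
    skip = random.randint(0, max_skip)  # inclusive random offset
    return [{"$skip": skip}, {"$limit": sample_size}]

# Usage against a view would look like (requires a live server):
# db.my_view.aggregate(skip_limit_sample_pipeline(count, 100))
```

The trade-off is that this samples one contiguous window per query rather than a uniform random subset, which is usually acceptable for schema discovery.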
Re-sample only part of the MongoBI Connector schema
It'd be great if we could run "FLUSH SAMPLE collectionName". It's rarely the case that a lot of collections have changed at once and such a "light re-sample" might be a good alternative.
2 votes
Implement error handling in MongoBI Connector
Right now, if the $sample query (used to create the DRDL schema) fails, it will re-run the query again, forever. The only place where the error is accessible is the "Profiler" tab of the given node. In our case - and I believe it's common - it was a secondary node, whose "Profiler" is far less accessible than the primary's.
We learned this the hard way, as our cluster was silently maxing out one CPU core for over a week. After a lot of debugging, it turned out that one of our views has…
2 votes
Can't fetch data on MongoDB ODBC via BI connector
Can't fetch data via the MongoDB ODBC driver through the BI connector. Test connections succeed, but we can't fetch data from the database servers; we only see the default 'information_schema' and 'mysql' databases.
2 votes
Deploy MongoDB BI Connector product using the MongoDB Kubernetes Operator
We would like the ability to deploy and run the MongoDB BI Connector as a container under the MongoDB Kubernetes Operator. Currently there is no support for such deployments.
2 votes
Kafka Sink Connector ObjectId Support
The sink connector should support ObjectIds for the document.id.strategy. In other words, if an ObjectId hex string is provided for _id, the sink connector should be able to convert it to an ObjectId. I've scoured the documentation and the rest of the internet and have not been able to determine how to do this. I see an open PR on the GitHub repo, but there has been no movement in almost 3 years. It's astonishing that I cannot use an ObjectId...
1 vote
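The detection step this request implies - deciding whether a provided _id string is really an ObjectId - can be sketched in a few lines. This is a stdlib-only illustration (the actual conversion would happen inside the sink's id strategy, e.g. via bson.ObjectId); the helper name is hypothetical:

```python
import re

# An ObjectId hex string is exactly 24 hexadecimal characters.
OBJECT_ID_HEX = re.compile(r"[0-9a-fA-F]{24}")

def looks_like_object_id(value) -> bool:
    """True if value should be converted to an ObjectId rather than
    stored as a plain string _id."""
    return isinstance(value, str) and OBJECT_ID_HEX.fullmatch(value) is not None
```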
Entity Framework (Preview 2) release date
Hello,
is there any ETA for Preview 2 (NuGet)?
Thank you
1 vote
Extend the support for the MongoDB ODBC driver to RHEL8
At present there is only support for RHEL7, which is effectively out of date. This would support more migrations to RHEL8.
1 vote
We have a MongoDB Sharded Cluster host thousands of collections. The schema file for bi connector is quite large. Every time we run a new qu
Current BI Connector behaviour is that it will run a listCollections command after an existing connection to the BI Connector has been idle for 1-2 minutes. For databases with thousands of collections, this takes several minutes to complete. After the connection to the BI Connector has been idle for a few minutes, the connection from the BI Connector to MongoDB is dropped, and when a new connection is created, the listCollections command is called again. This overhead is unnecessary.
Request: an option to keep the connection alive, or some form of connection-pooling mechanism, so that listCollections will not be called over and…
1 vote
Support regex in topic.namespace.map (Kafka source connector)
Currently, topic.namespace.map supports a wildcard (*). It would be helpful if it supported regex as well. Or, since that could be a breaking change, it would be nice if a new mapping option were introduced that supports a regex format.
A workaround is to use a custom mapper. However, if there were a built-in version, the wheel could be invented once and used by many people.
1 vote
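The custom-mapper workaround mentioned above can be sketched with ordinary regex matching. This is a stdlib-only illustration of the requested behaviour; the patterns, topic names, and function name are assumptions, not connector configuration:

```python
import re

# Ordered list of (pattern, topic): the first matching pattern wins,
# mirroring how a regex-capable topic.namespace.map might behave.
TOPIC_MAP = [
    (re.compile(r"^mydb\.orders_\d+$"), "orders-topic"),
    (re.compile(r"^mydb\..*"), "mydb-default-topic"),
]

def map_namespace(namespace: str, default: str = "unmapped") -> str:
    """Return the topic for the first pattern matching the namespace."""
    for pattern, topic in TOPIC_MAP:
        if pattern.match(namespace):
            return topic
    return default
```

Keeping the map ordered makes precedence explicit, which a plain wildcard map cannot express.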
Dynamic Topic Mapping on the basis of message content
Currently the MongoDB source connector only supports dynamic topic mapping on the basis of collections/databases. Can we extend it to support routing on the basis of message content?
Why is this important?
We've currently set up a connector for a very large collection, but since we wanted to route the data for different sections to different topics, we had to set up a separate connector for each and define the filter in the pipeline section (it runs as an aggregate query on change streams to filter out the relevant data). This obviously created performance concerns as a large number of COLLSCAN queries started to run on these…
1 vote
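The requested behaviour - one connector fanning events out by document content instead of one connector per filter - can be sketched as a small routing function over change stream events. Stdlib-only illustration; the "section" field and topic names are assumptions:

```python
# Hypothetical content-based router for change stream events. A single
# connector could apply this per event instead of running one filtered
# pipeline (and one collection scan) per destination topic.
def route_by_content(event: dict, default: str = "misc-topic") -> str:
    """Pick a Kafka topic from the changed document's content."""
    doc = event.get("fullDocument") or {}
    routes = {"billing": "billing-topic", "audit": "audit-topic"}
    return routes.get(doc.get("section"), default)
```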
Build a version of the BI Connector that is compatible with Alpine
We deploy the BI Connector within a container that runs Alpine (so it's small and clean). However, Alpine is bundled with musl rather than the glibc libs, which means the BI Connector cannot run in that container.
We have managed to work around this by using gcompat to layer in glibc, but that is not ideal for us, as it adds complexity and rather defeats the purpose of using Alpine.
You build several versions of the BI Connector binary - are you able to build an Alpine-compatible version as well?
1 vote
Change stream total ordering within a transaction
Right now, a change stream will flatten a transaction into individual operations, i.e., if we do two operations within a transaction, the change stream will generate two events.
However, there is no total ordering of the events that happen within the transaction. This matters when we do two updates on the same object within a transaction: since both operations share the same optime (clusterTime), there is no way to establish the ordering between these two events.
It would be helpful if an ordering number could be provided on every event from the same transaction for this purpose.
1 vote
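The requested ordering number can be sketched as a post-processing step that annotates events from the same transaction with a monotonically increasing ordinal. Stdlib-only illustration; "txnOpIndex" is a hypothetical field name, not an actual change stream field, and the event shapes in the usage are made up:

```python
# Annotate change stream events with a per-transaction sequence number,
# grouping by (lsid, txnNumber). repr() keeps the grouping key hashable
# even if lsid is itself a document.
def annotate_txn_order(events):
    counters = {}
    annotated = []
    for ev in events:
        key = (repr(ev.get("lsid")), ev.get("txnNumber"))
        seq = counters.get(key, 0)
        annotated.append({**ev, "txnOpIndex": seq})
        counters[key] = seq + 1
    return annotated
```

This only recovers arrival order on a single stream; a server-side ordinal, as requested, would make the ordering authoritative.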
Two tiered model for authentication
The BI Connector facilitates large numbers (1000s) of "personal data marts" by acting as a controlled go-between for end-user tools like Tableau and a "main" data collection. It is not practical, or even desirable, to have pass-through authentication of all these users to the backend MongoDB database. Instead, the BI Connector could use a special collection in a MongoDB instance (not necessarily the target!) to hold SHA(password), name, and a YAML equivalent. When started, mongosqld would verify command-line inputs of SHA(password), name, context, etc., and if OK, would expose an appropriately password-protected endpoint at 3307 with the config…
1 vote
Allow per-field length declaration for varchar and char types in BI Connector
Usually when defining schemas for SQL databases, you can specify a max length for a char or varchar column. It'd be nice to have the ability to do that in a schema passed to the mongosqld BI Connector process.
The only option now is to specify a max varchar size that applies to all varchar fields. It'd be nice to be able to define this on a per-field basis.
This is an issue for a customer I'm working with because of the way their BI tool allocates memory for temporary objects created when bridging the…
1 vote
built-in CDC to Kafka
Hi,
it is still hard to set up a MongoDB oplog CDC connection to Kafka to publish changes from, e.g., a microservice-local MongoDB. You typically have to use Kafka Connect and either the official MongoDB Atlas Connector or the Debezium open-source connector.
One of the databases competing with MongoDB, CockroachDB, has a built-in feature to publish "changefeeds" to Kafka (see https://www.cockroachlabs.com/docs/stable/stream-data-out-of-cockroachdb-using-changefeeds.html).
I'd love to see a similar feature for MongoDB, since it would let us keep MongoDB and Kafka in sync much more easily and conveniently - without having to care about yet another (probably centralized)…
1 vote
Atlas BI Connector Connection Timeout options
Default connection timeouts using the BI Connector in MongoDB Atlas are limited to 30 seconds, which can be limiting when running longer queries. By exposing maxTimeMs and related connection parameters in the Atlas web interface, users could increase the timeout for longer-running queries.
1 vote
Make BI Connector for Atlas pricing clear upfront
From this page https://docs.atlas.mongodb.com/bi-connection/ it appears that the BI Connector for Atlas is available if I have an M10 or larger cluster. I upgraded to an M10 cluster to get the BI Connector, only to discover that enabling it incurs additional costs. I could not find these additional BI Connector prices detailed anywhere on your website except in my account after I upgraded to the M10 cluster. And even these pricing details are not clear:
BI Connector
$1.47/day for sustained monthly usage
pricing for M10
or $3.84/day, up to $45.00/month maximum
What is…
1 vote
Dropbox
Lead with the drop that will very allow. Permit to be inside and all the time useful !!
stay tuned.
1 vote