Data Federation and Data Lake

51 results found

  1. Simplified JSON support for $out to S3

    The ability to $out to S3 from a federated database instance is a game-changer for those working with their own data warehouses and data lakes.

    One improvement would be to support simplified JSON for JSON exports. Currently, $out uses Extended JSON v2, which may not be compatible with systems that read from the destination S3 bucket and require simplified JSON (which aligns with other tools such as the Kafka source connector). Technically, you can make this conversion yourself with careful use of the $toString aggregation operator in stages preceding $out. However, there are several challenges:…
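
For illustration, the $toString workaround mentioned above might look roughly like the sketch below. It only builds the pipeline document; the field names, bucket, filename and region are hypothetical placeholders, and the helper `build_simplified_json_export` is not part of any MongoDB library.

```python
from typing import Any, Dict, List, Optional


def build_simplified_json_export(
    fields_to_stringify: List[str],
    bucket: str,
    filename: str,
    region: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Build a pipeline that stringifies the given fields, then writes JSON to S3."""
    # $toString converts values such as ObjectId and Date to plain strings, so
    # they are exported as "..." rather than {"$oid": ...} or {"$date": ...}.
    stringify_stage = {
        "$addFields": {
            name: {"$toString": "$" + name} for name in fields_to_stringify
        }
    }
    # Target description for $out to S3 from a federated database instance.
    s3_target: Dict[str, Any] = {
        "bucket": bucket,
        "filename": filename,
        "format": {"name": "json"},
    }
    if region is not None:
        s3_target["region"] = region
    return [stringify_stage, {"$out": {"s3": s3_target}}]


# Hypothetical field names and bucket, for illustration only.
pipeline = build_simplified_json_export(
    ["_id", "createdAt"],
    bucket="example-bucket",
    filename="exports/orders",
    region="us-east-1",
)
```

Every field has to be listed by hand, and values nested inside arrays would need extra stages, which is part of why built-in support would be preferable.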

    5 votes

    1 comment  ·  File Formats

  2. Add eu-west-3 as an option for AWS private endpoint

    Currently, Data Lake doesn't support setting up a private endpoint in eu-west-3 (Paris, France). It would be great to support eu-west-3 as well.

    7 votes

  3. A GUI for setting SQL query sampling size (like in the BI Connector Atlas console)

    Provide the ability to set the SQL query sampling size in a GUI (like in the BI Connector Atlas console). This would allow our business customers who use Power BI or Tableau to set and manage sampling easily, without having to run a CLI command (i.e., sqlGenerateSchema) whenever a new document is added to the database.
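
For context, the manual step this idea wants to replace looks roughly like the sketch below, which builds a sqlGenerateSchema command document with an explicit sample size. The namespace and sample size are hypothetical, `build_sql_generate_schema` is an illustrative helper rather than a library function, and the option names should be checked against the current Atlas SQL documentation.

```python
from typing import Any, Dict, List


def build_sql_generate_schema(
    namespaces: List[str], sample_size: int, set_schemas: bool = True
) -> Dict[str, Any]:
    """Build a sqlGenerateSchema command document for the given namespaces."""
    for ns in namespaces:
        # Namespaces are expected in "<database>.<collection>" form.
        if "." not in ns:
            raise ValueError(f"expected '<database>.<collection>', got {ns!r}")
    # The command name must be the first key; Python dicts keep insertion order.
    return {
        "sqlGenerateSchema": 1,
        "sampleNamespaces": list(namespaces),
        "sampleSize": sample_size,
        "setSchemas": set_schemas,
    }


# Hypothetical namespace and sample size, for illustration only.
cmd = build_sql_generate_schema(["sales.orders"], sample_size=1000)
# With PyMongo, a document like this could be sent with Database.command(cmd)
# over a connection to the federated database instance.
```

Someone has to rerun this whenever the document shape changes, which is the burden a GUI setting would remove.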

    3 votes

  4. sqlGetSchema Sampling

    There is currently no way to know what the current sampling size is on a collection. I recommend adding this to the sqlGetSchema output.

    3 votes

  5. Support for Superset and other Python DB-API / SQLAlchemy connections to SQL Atlas

    Superset uses SQLAlchemy and/or Python DB-API drivers, not JDBC or ODBC drivers. Superset is the most popular open-source Apache visualization tool.

    Others have made it work like this: https://preset.io/blog/building-database-connector/

    6 votes

    0 comments  ·  Connectors

  6. Modify SQL Query Schema

    Add a GUI front end to sqlSetSchema. It would be good for customers to be able to retrieve a schema and then 'surgically' add fields. This would reduce reliance on large sample sizes. With BI Connector in Atlas, we sometimes have to wait hours for the sqlDaemon to restart, resulting in long downtime.

    2 votes

  7. AWS IAM AuthN for Atlas SQL

    Support AWS IAM Authentication mechanism in JDBC and ODBC drivers (Atlas SQL)

    1 vote

    0 comments  ·  Connectors

  8. Support to export backups to Azure Blob Storage in Atlas

    I would like the capability to export my cloud snapshots to Azure blob storage.

    4 votes

  9. Create a read/write Data Federation connection string

    Some customers need a single connection string that covers both the cluster and Online Archive, with the ability to write to the cluster only.

    So far, the only option is to use more than one connection string in the application.

    1 vote

  10. Implement a feature to track data download volume per DB user

    In order to enhance data security and prevent unauthorized data exfiltration, our team proposes the implementation of a metric within MongoDB Atlas that allows administrators to monitor and measure the amount of data downloaded by each database user over a specified period. This feature would provide critical insights into user behavior, helping to identify unusual data access patterns or potential data breaches. By tracking network data usage at the user level, we can more effectively audit data access and transfer, ensuring that data is used appropriately and in compliance with organizational data governance policies. This granularity in monitoring would be…

    1 vote

    0 comments  ·  Reporting

  11. 6 votes

  12. Combine data lake snapshots into a single federated collection

    A common use case for data analytics is to analyse how your data evolves over time.
    For example, imagine you have an e-commerce database and your product prices change every day. You may only store the current price in your database, but you'd like to make a chart that shows how your product prices evolve over time (price on the y axis and time on the x axis).

    It is possible to do this today with a combination of Data Lake and Data Federation, but the Storage Configuration JSON needs to be manually updated like this:

    {
      "databases": [
    …

    1 vote

  13. Support AWS IAM for Data Federation Authentication

    We would like to be able to connect to the Federated Database Instance using AWS IAM for Authentication just like you can for Atlas Clusters.

    4 votes

  14. readPreference=Secondary for Federated Data Store

    We're using Online Archives to keep our cluster data size manageable, while still giving our data extraction process access to older data on an exception basis.

    Not being able to set the read preference on our mongoexport connection string for our federated data source (https://www.mongodb.com/docs/atlas/app-services/mongodb/read-preference/) is a significant issue for our use case.

    12 votes

  15. Add last modified timestamp to Data Federation Provenance for S3

    It would be great to have the last modified timestamp of a file in S3 returned with the provenance functionality in Atlas Data Federation.

    2 votes

  16. Schema inference

    Schemaless is flexible, but it has a big impact on downstream systems, especially for data exchange and DW/AI.

    It is essential to derive and infer the schema from the actual documents, so that we can understand, track, evolve, and translate the document schema.

    https://www.mongodb.com/blog/post/engblog-implementing-online-parquet-shredder is a great article.

    I'd like to propose an additional feature in ADL/ADF to make schema inference a first-class citizen, with faster turnaround and lower operational cost.

    After the $out operation of ADL/ADF, please collect the Parquet schema from each data file and union/unify them into a single schema. This schema will be stored in a .schema.json…
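
As a rough illustration of the union step described above, here is a minimal sketch. It assumes each file's schema has already been read into a flat mapping of field name to Parquet type name. That representation, the sample schemas and the `unify_schemas` helper are all assumptions for illustration, not an existing ADL/ADF feature.

```python
import json
from typing import Dict, Iterable, List, Set


def unify_schemas(file_schemas: Iterable[Dict[str, str]]) -> Dict[str, List[str]]:
    """Union per-file schemas into one mapping of field -> all types observed."""
    seen: Dict[str, Set[str]] = {}
    for schema in file_schemas:
        for field, type_name in schema.items():
            # A field with different types across files keeps every variant.
            seen.setdefault(field, set()).add(type_name)
    # Sort fields and types so the output is stable from run to run.
    return {field: sorted(types) for field, types in sorted(seen.items())}


# Hypothetical schemas from two data files written by $out.
schemas = [
    {"_id": "BYTE_ARRAY", "price": "DOUBLE"},
    {"_id": "BYTE_ARRAY", "price": "INT64", "sku": "BYTE_ARRAY"},
]
unified = unify_schemas(schemas)
print(json.dumps(unified, indent=2))
```

Fields that end up with more than one type (here, `price`) are the ones a downstream consumer would need to reconcile.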

    1 vote

  17. Make mongodump work with Online Archive

    Update mongodump so that it can be used against an Online Archive.

    2 votes

    0 comments  ·  Automation

  18. Support Geo Queries on Object Storage

    I'd like to be able to query using the Geo functionality inside of MongoDB Query Language on data stored in Object Storage.

    Maybe using a format like: https://github.com/opengeospatial/geoparquet

    3 votes

  19. Filtered Data Lake Ingestions

    Our applications are multi-tenant, so our immediate need is to create tenant-specific data lakes by setting constraints in the ingestion configuration (e.g., only ingest documents with tenantId = 'specificTenantId').
    Filtered data lake ingestions would be useful in other ways too: ingestion could be limited to documents with archived=false, documents with status=ACTIVE, etc.

    2 votes

  20. Atlas Data Explorer to support using Aggregation Builder against Atlas Data Lake

    You can use the Atlas Data Explorer and Aggregation Builder in the MongoDB Atlas web dashboard on regular collections and views. Unfortunately there appears to be no way to use them against a Data Lake within the web dashboard, either directly or while constructing new Data Sources for Charts. Attempting to use Aggregation Builder on a Data Lake while defining a Data Source forwards to a URL that returns 404.

    It would be great if the same functionality was available for Data Lake as well.

    3 votes
