Data Federation and Data Lake

14 results found

  1. Allow GridFS to use Atlas and object storage (via ADL) when connecting to the cloud MDB

    Many MongoDB users store metadata in MDB and keep PDFs and other files in object storage. Since GridFS is already built into the drivers, a natural enhancement would be for ADL to federate GridFS functionality across Atlas and the files in object storage.
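
    As a rough illustration of the split this would eliminate, here is today's pattern next to the requested one; the bucket, database, and file names are hypothetical, and the second half assumes the proposed federation exists:

    import gridfs
    import boto3
    from pymongo import MongoClient

    client = MongoClient("mongodb+srv://cluster.example.mongodb.net")
    db = client["files_app"]

    # Today: metadata lands in MongoDB while the file bytes go to S3.
    s3 = boto3.client("s3")
    with open("report.pdf", "rb") as fh:
        s3.put_object(Bucket="my-files", Key="report.pdf", Body=fh)
    db["file_metadata"].insert_one({"name": "report.pdf", "bucket": "my-files"})

    # Requested: the GridFS API the drivers already ship, with ADL storing
    # and retrieving the chunks from object storage behind the scenes.
    fs = gridfs.GridFS(db)
    with open("report.pdf", "rb") as fh:
        fs.put(fh, filename="report.pdf")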

    12 votes

  2. Add Incremental Materialized Views

    Add the ability to create a view where the result is pre-computed and is updated incrementally as more data becomes available.
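
    The closest approximation today is an on-demand materialized view refreshed by re-running a $merge pipeline; a minimal sketch with hypothetical collection names:

    from pymongo import MongoClient

    client = MongoClient("mongodb+srv://cluster.example.mongodb.net")
    db = client["sales"]

    # Re-running this pipeline recomputes the whole view; the idea is for
    # the server to maintain customer_totals incrementally instead.
    db["orders"].aggregate([
        {"$group": {"_id": "$customerId", "total": {"$sum": "$amount"}}},
        {"$merge": {
            "into": "customer_totals",
            "whenMatched": "replace",
            "whenNotMatched": "insert",
        }},
    ])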

    6 votes

  3. A GUI for setting the SQL query sampling size (like in the BI Connector Atlas console)

    Provide the ability to set the SQL query sampling size in the UI (like in the BI Connector Atlas console). This would let our business customers who use Power BI or Tableau easily set and manage sampling without having to run a CLI command (i.e., sqlGenerateSchema) whenever a new document is added to the database.
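
    For reference, the command such a GUI would wrap is roughly the following; this is a sketch assuming the documented sqlGenerateSchema options, with a hypothetical namespace and sample size:

    from pymongo import MongoClient

    # Connect to the federated database instance, not the cluster itself.
    client = MongoClient("mongodb://federateddatabaseinstance.example.mongodb.net")
    db = client["admin"]

    # Re-sample documents and regenerate the SQL schema.
    result = db.command({
        "sqlGenerateSchema": 1,
        "sampleNamespaces": ["mydb.mycollection"],
        "sampleSize": 1000,
        "setSchemas": True,
    })
    print(result)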

    5 votes

  4. sqlGetSchema Sampling

    There is currently no way to know what the current sampling size is for a collection. I would recommend adding this to the sqlGetSchema output.
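
    A sketch of the call in question; the collection name is illustrative, and the samplingSize field mentioned in the comment is the proposed addition, not current output:

    from pymongo import MongoClient

    client = MongoClient("mongodb://federateddatabaseinstance.example.mongodb.net")
    db = client["mydb"]

    result = db.command({"sqlGetSchema": "mycollection"})
    # Today the result reports the stored schema but not how many documents
    # were sampled to produce it; the request is to surface something like
    # result["metadata"]["samplingSize"].
    print(result)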

    3 votes

  5. Support Geo Queries on Object Storage

    I'd like to be able to query data stored in object storage using the geospatial functionality of the MongoDB Query Language.

    Maybe using a format like: https://github.com/opengeospatial/geoparquet
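
    A sketch of the kind of query this would enable against object-storage-backed collections; the collection, field names, and coordinates are illustrative:

    from pymongo import MongoClient

    # Connect to the federated database instance backed by object storage.
    client = MongoClient("mongodb://federateddatabaseinstance.example.mongodb.net")
    places = client["geodata"]["places"]

    # $geoWithin with $centerSphere: points within ~5 miles of a center,
    # with the radius expressed in radians (miles / 3963.2).
    for doc in places.find({
        "location": {
            "$geoWithin": {
                "$centerSphere": [[-73.99, 40.73], 5 / 3963.2]
            }
        }
    }):
        print(doc)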

    3 votes

  6. Atlas Data Explorer to support using Aggregation Builder against Atlas Data Lake

    You can use the Atlas Data Explorer and Aggregation Builder in the MongoDB Atlas web dashboard on regular collections and views. Unfortunately there appears to be no way to use them against a Data Lake within the web dashboard, either directly or while constructing new Data Sources for Charts. Attempting to use Aggregation Builder on a Data Lake while defining a Data Source forwards to a URL that returns 404.

    It would be great if the same functionality were available for Data Lake as well.

    3 votes

  7. Ability to use GUID field as a partition field for online archive

    Hi,

    Today there is no way to partition archive data on a field of type GUID (legacy GUID). For example, I tried selecting a field whose value was Binary('0TfYLb3Qg0WT2mZu0wbq8Q==', 3), but I got an error saying the field is not supported as a partition field. Supporting this makes sense because archived data is usually old, and at the time it was written most people were using legacy GUIDs rather than ObjectIds.
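
    For context, a legacy GUID is BSON binary subtype 3; a sketch of how such a field ends up in documents, with hypothetical names:

    import uuid
    from bson.binary import Binary
    from pymongo import MongoClient

    client = MongoClient("mongodb+srv://cluster.example.mongodb.net")
    events = client["app"]["events"]

    # Subtype 3 is the legacy UUID encoding that older drivers produced.
    legacy_guid = Binary(uuid.uuid4().bytes, 3)
    events.insert_one({"tenantId": legacy_guid, "payload": "..."})
    # Today, tenantId is rejected as an Online Archive partition field.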

    3 votes

  8. Simplify interface for query commands

    User-friendly data filtering, plus simpler query commands for updating or deleting data in collections.

    3 votes

  9. Modify SQL Query Schema

    Add a GUI front end to sqlSetSchema. It would be good for customers to retrieve a schema and then have the ability to 'surgically' add fields. This would reduce reliance on large sample sizes. With the BI Connector in Atlas, we sometimes have to wait hours for the sqlDaemon to restart, resulting in long downtime.
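
    Such a GUI would wrap something like the following sqlSetSchema call; this is a sketch assuming the schema document format used by the Atlas SQL schema commands, with illustrative names and a hand-added price field:

    from pymongo import MongoClient

    client = MongoClient("mongodb://federateddatabaseinstance.example.mongodb.net")
    db = client["mydb"]

    # Set the SQL schema surgically instead of re-sampling the collection.
    db.command({
        "sqlSetSchema": "mycollection",
        "schema": {
            "version": 1,
            "jsonSchema": {
                "bsonType": "object",
                "properties": {
                    "name": {"bsonType": "string"},
                    "price": {"bsonType": "double"},  # hand-added field
                },
            },
        },
    })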

    2 votes

  10. The "Date field to archive on" option under the Archiving Rule tab should also accept dates in timestamp format.

    The "Date field to archive on" option under Archiving Rule tab in Online Archive should also accept date field having timestamp format instead of only having date format.

    2 votes

  11. Allow a single timestamp field to be split into Year, Month, Day, and Hour folder segments, instead of just one segment (e.g., Year) in the file path for Azure

    I checked internally, and it has been confirmed that an attribute can only appear once in a template. If Atlas Data Federation (ADF) has a template like the one you are using, it wouldn't know what value to assign to StatusDatetime because it's being assigned multiple values. Unfortunately, ADF doesn't support defining a single field value across multiple segments of the path. Instead, each of those segments should be different attributes.

    {
      "path": "/HistoryCollection/{StatusDatetime isodate:Year}/{StatusDatetime isodate:Month}/{StatusDatetime isodate:Day}/{StatusDatetime isodate:Hour}/{RecordSource string}/{Status string}/*",
      "storeName": "sampledatabase"
    }

    We would like the store we are creating as an archive to be queryable by StatusDatetime…
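
    A sketch of the workaround described above, splitting the path into separate, hypothetically named attributes rather than reusing StatusDatetime:

    {
      "path": "/HistoryCollection/{StatusYear int}/{StatusMonth int}/{StatusDay int}/{StatusHour int}/{RecordSource string}/{Status string}/*",
      "storeName": "sampledatabase"
    }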

    1 vote

  12. 1 vote

  13. Create a read/write Data Federation connection string

    Some customers need a connection string that covers both the cluster and Online Archive, with the ability to write to the cluster only.

    So far, the only option is to use more than one connection string in the application.
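
    A sketch of the current two-connection-string workaround, with hypothetical URIs and namespaces:

    from pymongo import MongoClient

    # Read/write client pointed at the cluster itself.
    cluster = MongoClient("mongodb+srv://cluster.example.mongodb.net")

    # Read-only client pointed at the federated instance that unions the
    # cluster with its Online Archive.
    federated = MongoClient("mongodb://federateddatabaseinstance.example.mongodb.net")

    cluster["app"]["orders"].insert_one({"sku": "A-1", "qty": 2})  # writes
    results = federated["app"]["orders"].find({"sku": "A-1"})      # reads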

    1 vote

  14. Schema inference

    Schemaless design is flexible, but it has a big impact on downstream systems, especially for data exchange and DW/AI.

    Deriving and inferring the schema from the actual documents is a must-have, so that we can understand, track, evolve, and translate the document schema.

    https://www.mongodb.com/blog/post/engblog-implementing-online-parquet-shredder is a great article.

    I'd like to propose an additional feature in ADL/ADF that makes schema inference a first-class citizen, with faster turnaround and lower operational cost.

    After the $out operation of ADL/ADF, please collect the Parquet schema from each data file and union/unify them into a single schema. This schema would be stored in a .schema.json…
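
    A sketch of the proposed post-$out step using pyarrow; the file paths and output name are illustrative:

    import json
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Parquet files produced by the $out stage (paths hypothetical).
    paths = ["part-0000.parquet", "part-0001.parquet"]

    # Read each file's schema and unify them into one superset schema.
    unified = pa.unify_schemas([pq.read_schema(p) for p in paths])

    # Persist a simple JSON rendering alongside the data.
    summary = {field.name: str(field.type) for field in unified}
    with open("dataset.schema.json", "w") as fh:
        json.dump(summary, fh, indent=2)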

    1 vote
