Data Federation and Data Lake

57 results found

  1. Build Atlas on-prem

    Allow Atlas to control on-prem instances.

    4 votes
  2. Online Archive

    Hi Team, regarding Atlas Data Lake and Online Archive: a customer has requested archiving rules that combine time and query criteria (Time + Query), i.e. archive anything older than 60 days that matches query X.

    4 votes
    1 comment  ·  Automation  ·  Admin →
  3. sqlGetSchema Sampling

    There is currently no way to find out the sampling size in use for a collection. I would recommend adding this to the sqlGetSchema output.

    3 votes
  4. Support Geo Queries on Object Storage

    I'd like to be able to query using the Geo functionality inside of MongoDB Query Language on data stored in Object Storage.

    Maybe using a format like: https://github.com/opengeospatial/geoparquet

    3 votes
  5. Atlas Data Explorer to support using Aggregation Builder against Atlas Data Lake

    You can use the Atlas Data Explorer and Aggregation Builder in the MongoDB Atlas web dashboard on regular collections and views. Unfortunately there appears to be no way to use them against a Data Lake within the web dashboard, either directly or while constructing new Data Sources for Charts. Attempting to use Aggregation Builder on a Data Lake while defining a Data Source forwards to a URL that returns 404.

    It would be great if the same functionality was available for Data Lake as well.

    3 votes
  6. Add eu-north-1 as an option for AWS hosting

    Sweden is a very innovative country with many startups and scaleups, and AWS is used very often for hosting services and data. Sweden is also very strict about where and how data is stored, which is why AWS offers eu-north-1 (located in Sweden) as a region for storing data. Currently Data Lake doesn't support that option; the closest one is Germany. It would be great to support eu-north-1 as well, so that we don't have to live with the unnecessary latency.

    3 votes
  7. Online Archive survives region outage

    I understand that even with a geo-replicated cluster, if that cluster is configured with an Online Archive and there is a region outage, access to the Online Archive data is lost. It is still unclear to me whether queries against collections configured with Online Archive would fail in this scenario. In any case, it would make sense to allow the S3 bucket backing the Online Archive to itself be replicated using "Amazon S3 Cross-Region Replication (CRR)".

    3 votes
  8. Import and Export archiving rules

    Ability to import and export archiving rules so that we can restore them if and when we need to restore the cluster. This would also be useful when replicating prod clusters to our stage environment.

    3 votes
    0 comments  ·  Automation  ·  Admin →
  9. Ability to use GUID field as a partition field for online archive

    Hi,

    Today there is no way to partition the archive data based on a field of type GUID (legacy GUID). For example, I tried selecting a field which had Binary('0TfYLb3Qg0WT2mZu0wbq8Q==', 3) as the value, but I got an error saying that the field is not supported as a partition field. Supporting this makes sense because archived data is usually old, and at that time most people were using legacy GUIDs as opposed to ObjectIds.

    3 votes
  10. Support Online Archives in Charts

    We use Atlas Charts and would like to keep the data moved to Online Archive accessible for reporting and visualization purposes.

    3 votes
    0 comments  ·  Reporting  ·  Admin →
  11. Add support to $out to S3 for Standard JSON

    I'd like to be able to use $out but output to Standard JSON instead of Extended JSON as the tool I'm using needs to consume standard JSON.

    3 votes
  12. Add support for Text format files

    I have a custom log format that I'd like to be able to query. I imagine I would describe the format of the text files to Atlas Data Lake and then be able to query them.

    3 votes
  13. Simplify interface for query commands

    User-friendly data filtering and queries for updating or deleting data from collections.

    3 votes
  14. Specify Delimiter for CSV Files

    I need to specify the delimiter for my CSV files.

    3 votes
  15. Support the PDF File Format

    I would like to be able to query PDF files using Atlas Data Lake.

    3 votes
  16. Support for Iceberg Partitions

    Apache Iceberg uses URL encoded partitions, see:

    https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/PartitionSpec.java#L218

    Atlas Data Federation S3/Parquet currently does not support URL encoded partitions. This is a potential blocker to using Data Federation with Iceberg.

    2 votes
  17. Add last modified timestamp to Data Federation Provenance for S3

    It would be great to have the last modified timestamp of a file in S3 returned with the provenance functionality in Atlas Data Federation.

    2 votes
  18. Make mongodump work with Online Archive

    Update mongodump so that it can be used against an Online Archive.

    2 votes
    0 comments  ·  Automation  ·  Admin →
  19. Filtered Data Lake Ingestions

    Our applications are multi-tenant, so our immediate need is to create tenant-specific data lakes by setting constraints in the ingestion configuration (e.g. only ingest documents with tenantId = 'specificTenantId').
    Filtered data lake ingestion would be useful in other ways too. For example, ingestion could be limited to documents with archived=false, documents with status=ACTIVE, etc.

    2 votes
  20. Connect Atlas Data Lake to my self-managed cloud object storage (S3)

    I'd like to be able to connect Atlas Data Lake to my self-managed cloud object storage (S3-compatible) in my data center or private cloud.

    2 votes