Data Federation and Data Lake
57 results found
-
Online Archive
Hi Team, regarding Atlas Data Lake and Online Archive: a customer has requested the ability to archive on a combined (Time + Query) rule, i.e. anything older than 60 days that also matches query X.
4 votes -
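One way to picture the (Time + Query) rule requested above is as a single filter that combines an age cutoff with a custom match. The sketch below only builds such a filter as a plain Python dictionary in MongoDB query syntax. The helper `build_archive_filter` and the field names `createdAt` and `status` are hypothetical, and nothing here implies that Online Archive currently accepts a filter of this shape.

```python
from datetime import datetime, timedelta, timezone

def build_archive_filter(date_field, max_age_days, custom_query, now=None):
    """Combine an age cutoff with a custom query (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    # $and keeps both conditions explicit, even if custom_query
    # also happens to reference date_field.
    return {"$and": [{date_field: {"$lt": cutoff}}, custom_query]}

# Example: documents older than 60 days that also match a custom condition.
# "createdAt" and "status" are illustrative field names only.
example_filter = build_archive_filter("createdAt", 60, {"status": "CLOSED"})
```

Keeping the two conditions as separate `$and` clauses makes it easy to see which part is the time rule and which part is the custom query.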
Support Azure Data Federation private endpoint
Now that you support Azure blobs for Data Federation, it would be great to have a private endpoint connection to the storage account.
3 votes -
sqlGetSchema Sampling
There is currently no way to find out the current sampling size for a collection. I would recommend adding this to the sqlGetSchema output.
3 votes -
Support Geo Queries on Object Storage
I'd like to be able to query using the Geo functionality inside of MongoDB Query Language on data stored in Object Storage.
Maybe using a format like: https://github.com/opengeospatial/geoparquet
3 votes -
Atlas Data Explorer to support using Aggregation Builder against Atlas Data Lake
You can use the Atlas Data Explorer and Aggregation Builder in the MongoDB Atlas web dashboard on regular collections and views. Unfortunately there appears to be no way to use them against a Data Lake within the web dashboard, either directly or while constructing new Data Sources for Charts. Attempting to use Aggregation Builder on a Data Lake while defining a Data Source forwards to a URL that returns 404.
It would be great if the same functionality was available for Data Lake as well.
3 votes -
Add eu-north-1 as an option for AWS hosting
Sweden is a very innovative country with many startups and scaleups, and AWS is used very often there for hosting services and data. Sweden is also very strict about where and how data is stored, which is why AWS offers eu-north-1 (located in Sweden) as a region for storing data. Currently Data Lake doesn't support that option; the closest one is Germany. It would be great to support eu-north-1 as well, so that we don't have to live with the unnecessary latency.
3 votes -
On-line Archive survives region outage
I understand that even with a geo-replicated cluster, if that cluster is configured with an Online Archive and there's a region outage, access to the Online Archive data is lost. It is still unclear to me whether queries against collections configured with Online Archive would fail in this scenario. In any case, it would make sense to me to enable the S3 bucket backing the Online Archive to itself be replicated using "Amazon S3 Cross-Region Replication (CRR)".
3 votes -
Import and Export archiving rules
Add the ability to import and export archiving rules, so that we can restore them if/when we need to restore the cluster. This would also be useful when replicating prod clusters to our stage environment.
3 votes -
Ability to use GUID field as a partition field for online archive
Hi,
Today there is no way to partition the archive data based on a field that is of type GUID (legacy GUID). For example, I tried selecting a field which had
Binary('0TfYLb3Qg0WT2mZu0wbq8Q==', 3)
as the value, but I got an error saying that the field is not supported as a partition field. It makes sense to support this because archived data is usually old, and at that time most people were using legacy GUIDs as opposed to ObjectIds.
3 votes -
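For readers unfamiliar with the value quoted in the GUID partition field request above, the following sketch decodes the base64 payload with Python's standard library to show that it is a 16-byte GUID. It assumes the value was written by a .NET driver (the request does not say which driver produced it), so the variable name `guid_dotnet` and that byte-order interpretation are assumptions.

```python
import base64
import uuid

# Base64 payload copied from the request; binary subtype 3 marks a legacy UUID/GUID.
raw = base64.b64decode("0TfYLb3Qg0WT2mZu0wbq8Q==")

# Assumption: the value was written by a .NET driver, which stores the first
# three GUID fields little-endian. Other legacy encodings may order bytes differently.
guid_dotnet = uuid.UUID(bytes_le=raw)

# Reading the same 16 bytes in RFC 4122 (big-endian) order yields a different
# string, so the byte order matters when interpreting the value.
guid_standard = uuid.UUID(bytes=raw)

print(len(raw), guid_dotnet, guid_standard)
```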
Support Online Archives in Charts
We use Atlas Charts and would like to keep the data moved to Online Archive accessible for reporting and visualization purposes.
3 votes -
Add support to $out to S3 for Standard JSON
I'd like to be able to use $out but output to Standard JSON instead of Extended JSON as the tool I'm using needs to consume standard JSON.
3 votes -
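Until an option like the one requested above exists, one possible client-side workaround is to post-process the Extended JSON output. The sketch below is a minimal illustration using only Python's standard library: the helper `to_plain_json` is hypothetical, it handles only a handful of common Extended JSON wrappers (`$oid`, `$numberInt`, `$numberLong`, `$numberDouble`, `$date`), and it does not handle non-finite doubles or other types.

```python
import json

def to_plain_json(value):
    """Recursively replace a few common Extended JSON wrappers with plain values."""
    if isinstance(value, list):
        return [to_plain_json(v) for v in value]
    if isinstance(value, dict):
        if len(value) == 1:
            (key, inner), = value.items()
            if key == "$oid":
                return inner                      # keep the hex string
            if key in ("$numberInt", "$numberLong"):
                return int(inner)
            if key == "$numberDouble":
                return float(inner)
            if key == "$date":
                # Relaxed form is an ISO string; canonical form wraps $numberLong.
                return to_plain_json(inner)
        return {k: to_plain_json(v) for k, v in value.items()}
    return value

# Illustrative input line; the document contents are made up for this example.
line = ('{"_id": {"$oid": "507f1f77bcf86cd799439011"}, '
        '"qty": {"$numberInt": "5"}, "ts": {"$date": "2020-01-01T00:00:00Z"}}')
plain = json.dumps(to_plain_json(json.loads(line)))
```

The conversion is lossy by design: type information such as the distinction between 32-bit and 64-bit integers is dropped, which is the trade-off a standard JSON consumer implies.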
Add support for Text format files
I have a custom log format that I'd like to be able to query. I imagine I would describe the format of the text files to Atlas Data Lake and then be able to query them.
3 votes -
Simplify interface for query commands
User-friendly data filtering and queries for updating or deleting data from collections.
3 votes -
Specify Delimiter for CSV Files
I need to specify the delimiter for my CSV files.
3 votes -
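As a stopgap for the delimiter request above, files could be rewritten to comma-delimited form before they are placed in object storage. The sketch below uses Python's standard `csv` module; the helper `convert_delimiter` and the sample data are hypothetical, and the semicolon is only an assumed example of a custom delimiter.

```python
import csv
import io

def convert_delimiter(src, dst, delimiter=";"):
    """Re-write delimited text from src as comma-separated CSV into dst."""
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, lineterminator="\n")
    # The csv module handles quoting, so delimiters inside quoted fields survive.
    writer.writerows(reader)

# In-memory example; real files should be opened with newline="".
source = io.StringIO('id;name\n1;"Doe; Jane"\n')
target = io.StringIO()
convert_delimiter(source, target)
```

Parsing with a real CSV reader rather than a plain string replace avoids corrupting quoted fields that contain the delimiter character.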
Support the PDF File Format
I would like to be able to query PDF files using Atlas Data Lake.
3 votes -
Modify SQL Query Schema
Add a GUI front end to sqlSetSchema. It would be good for customers to be able to return a schema and then 'surgically' add fields. This would reduce reliance on large sample sizes. With BI Connector in Atlas, sometimes we have to wait hours for the sqlDaemon to restart, resulting in long downtime.
2 votes -
Add last modified timestamp to Data Federation Provenance for S3
It would be great to have the last modified timestamp of a file in S3 returned with the provenance functionality in Atlas Data Federation.
2 votes -
Make mongodump work with Online Archive
Update mongodump so that it can be used against an Online Archive.
2 votes -
Filtered Data Lake Ingestions
Our immediate need is that our applications are multi-tenant, so it would be very useful if we could create tenant-specific data lakes, by setting particular constraints in the ingestion configuration (ex. only ingest the documents with tenantId = 'specificTenantId').
However, the usefulness of filtered data lake ingestions can be multifaceted. The ingestion could be done only for archived=false documents, documents with status=ACTIVE, etc.
2 votes -
Connect Atlas Data Lake to my self-managed cloud object storage (S3)
I'd like to be able to connect Atlas Data Lake to my self-managed cloud object storage (S3-compatible) in my data center or private cloud.
2 votes