Data Federation and Data Lake

6 results found

  1. Cross Project Access to Atlas Clusters from Data Lake

    I would like my Data Lake in Project A to be able to query data in a cluster in Project B.

    15 votes

  2. S3 alternative provider support

    Many providers expose the same API as AWS S3, so I think integrating them should be simple.

    9 votes

  3. Ability to "rehydrate" Atlas cluster from online archive

    Consider an archive scenario where a user of a given app has not logged in for [x] weeks or months, so all of their data is moved to Online Archive. Once they log back in, their "cold" data should be considered "hot" again and moved back into Atlas. While we can use $out to copy data back to Atlas, there is currently no way to remove the "rehydrated" data from S3 once it has been copied back.
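
    For context, the copy-back step described in this idea can be sketched roughly as follows. This is a minimal sketch, not a confirmed workflow: it only builds the aggregation pipeline, the `userId` field and every argument value are hypothetical, and the `$out` shape (`atlas` with `projectId`, `clusterName`, `db`, `coll`) is the form I understand Atlas Data Federation to accept when writing to a cluster.

```python
def build_rehydrate_pipeline(user_id, project_id, cluster_name, db, coll):
    """Build a pipeline that copies one user's archived documents to a cluster.

    All names are illustrative assumptions; nothing here talks to Atlas.
    """
    return [
        # Select only the returning user's archived documents
        # ("userId" is a hypothetical field name).
        {"$match": {"userId": user_id}},
        # Write the matched documents to a collection on an Atlas cluster.
        {
            "$out": {
                "atlas": {
                    "projectId": project_id,
                    "clusterName": cluster_name,
                    "db": db,
                    "coll": coll,
                }
            }
        },
    ]


pipeline = build_rehydrate_pipeline(
    user_id="user-123",            # hypothetical user identifier
    project_id="<project-id>",     # placeholder, not a real project ID
    cluster_name="Cluster0",       # hypothetical cluster name
    db="app",
    coll="events_rehydrated",      # hypothetical staging collection
)
```

    The pipeline would be passed to `aggregate()` on the archived collection through the federated connection. Nothing in it deletes the archived copy in S3, which is the gap this idea describes.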

    5 votes

  4. Add last modified timestamp to Data Federation Provenance for S3

    It would be great to have the last modified timestamp of a file in S3 returned with the provenance functionality in Atlas Data Federation.

    2 votes

  5. Filtered Data Lake Ingestions

    Our immediate need is that our applications are multi-tenant, so it would be very useful to create tenant-specific data lakes by setting constraints in the ingestion configuration (e.g. ingest only the documents with tenantId = 'specificTenantId').
    Filtered data lake ingestion would be useful in other ways too: ingestion could be limited to documents with archived=false, documents with status=ACTIVE, and so on.
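
    To make the request concrete, here is a minimal sketch of the filtering behavior being asked for, written as plain Python over in-memory documents. It is purely illustrative: the feature does not exist as described, and `matches`, `select_for_ingestion`, and the equality-only filter format are hypothetical names and assumptions, not part of any Atlas API.

```python
def matches(doc, ingestion_filter):
    """Return True when every filter field equals the document's value."""
    return all(
        doc.get(field) == expected
        for field, expected in ingestion_filter.items()
    )


def select_for_ingestion(docs, ingestion_filter):
    """Keep only the documents that a filtered ingestion would pick up."""
    return [doc for doc in docs if matches(doc, ingestion_filter)]


# Hypothetical sample documents from a multi-tenant collection.
docs = [
    {"_id": 1, "tenantId": "specificTenantId", "archived": False},
    {"_id": 2, "tenantId": "otherTenant", "archived": False},
    {"_id": 3, "tenantId": "specificTenantId", "archived": True},
]

tenant_docs = select_for_ingestion(docs, {"tenantId": "specificTenantId"})
active_tenant_docs = select_for_ingestion(
    docs, {"tenantId": "specificTenantId", "archived": False}
)
```

    Here `tenant_docs` holds documents 1 and 3, and `active_tenant_docs` holds only document 1. A real implementation would presumably accept a richer query language than plain equality.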

    2 votes

  6. Combine data lake snapshots into a single federated collection

    A common use case for data analytics is to analyze how your data evolves over time.
    For example, imagine you have an e-commerce database in which product prices change every day. You may store only the current price in your database, but you would like to build a chart that shows how each product's price evolves over time (price on the y-axis, time on the x-axis).

    This is possible today by combining Data Lake and Data Federation, but the Storage Configuration JSON needs to be updated manually, like this:

    {
      "databases": [
    1 vote
