Drivers

10 results found

  1. Update PyArrow version required for PyMongoArrow to use with Fireducks

    Update the PyArrow version so it can be used with Fireducks. Right now I want to use these three libraries in one data science project:
    - fireducks-1.2.2 (pip install fireducks)
    - pyarrow-19.0.1 (pip install pyarrow) <-- required by Fireducks
    - pymongoarrow-1.6.4 (pip install pymongoarrow) <-- requires pyarrow 18.0.0

    My pipeline connects to MongoDB through PyMongoArrow, downloads big DataFrames, and preprocesses them with Fireducks' superb, optimized vector transformations. I tried to side-step the need for PyMongoArrow with code like the attached function, but the end result differs from what PyMongoArrow returns.

    What would be needed to update the PyMongoArrow requirements?
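
For anyone checking the same conflict locally: the pyarrow range that an installed PyMongoArrow release declares can be read from its package metadata. A minimal sketch using only the standard library; the helper name `declared_requirements` is made up for illustration, and it returns an empty list when the package is not installed.

```python
from importlib import metadata


def declared_requirements(package, dependency):
    """Return the requirement strings of `package` that start with `dependency`.

    Hypothetical helper: returns [] if `package` is not installed
    or declares no requirements at all.
    """
    try:
        requirements = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []
    prefix = dependency.lower()
    return [req for req in requirements if req.lower().startswith(prefix)]


if __name__ == "__main__":
    # Prints whatever the locally installed pymongoarrow declares for pyarrow.
    print(declared_requirements("pymongoarrow", "pyarrow"))
```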

    4 votes
    0 comments  ·  Python  ·  Admin →
  2. 2 votes
    0 comments  ·  Python  ·  Admin →
  3. pymongoarrow replaces existing codecs

    The api.write() method replaces any existing codecs.

    The collection TypeRegistry is replaced with a new instance, effectively removing any existing custom codecs.
    https://github.com/mongodb-labs/mongo-arrow/blob/1.6.0/bindings/python/pymongoarrow/api.py#L419

    A related issue is that pymongoarrow uses pyarrow.Table.to_pylist() to convert a pyarrow.Table to Python objects as an intermediate step before converting them to raw BSON.

    PyArrow converts Arrow decimal128 values to Python Decimal objects. However, PyMongo's BSON encoder cannot encode Python Decimal objects (it expects bson.Decimal128), necessitating the use of a custom codec.
    https://github.com/mongodb-labs/mongo-arrow/blob/1.6.0/bindings/python/pymongoarrow/api.py#L385

    1 vote
    0 comments  ·  Python  ·  Admin →
  4. pymongoarrow replaces existing codecs

    The api.write() method replaces any existing codecs.

    The collection TypeRegistry is replaced with a new instance, effectively removing any existing custom codecs.
    https://github.com/mongodb-labs/mongo-arrow/blob/1.6.0/bindings/python/pymongoarrow/api.py#L419

    A related issue is that pymongoarrow uses pyarrow.Table.to_pylist() to convert a pyarrow.Table to Python objects as an intermediate step before converting them to raw BSON.

    PyArrow converts Arrow decimal128 values to Python Decimal objects. However, PyMongo's BSON encoder cannot encode Python Decimal objects (it expects bson.Decimal128), necessitating the use of a custom codec.
    https://github.com/mongodb-labs/mongo-arrow/blob/1.6.0/bindings/python/pymongoarrow/api.py#L385

    1 vote
    0 comments  ·  Python  ·  Admin →
  5. pymongoarrow replaces existing codecs

    The api.write() method replaces any existing codecs.

    The collection TypeRegistry is replaced with a new instance, effectively removing any existing custom codecs.
    https://github.com/mongodb-labs/mongo-arrow/blob/1.6.0/bindings/python/pymongoarrow/api.py#L419

    A related issue is that pymongoarrow uses pyarrow.Table.to_pylist() to convert a pyarrow.Table to Python objects as an intermediate step before converting them to raw BSON.

    PyArrow converts Arrow decimal128 values to Python Decimal objects. However, PyMongo's BSON encoder cannot encode Python Decimal objects (it expects bson.Decimal128), necessitating the use of a custom codec.
    https://github.com/mongodb-labs/mongo-arrow/blob/1.6.0/bindings/python/pymongoarrow/api.py#L385

    1 vote
    0 comments  ·  Python  ·  Admin →
  6. Add Date support to pymongoarrow

    pymongoarrow already supports datetime, but that is not ideal for storing partition values. I assume we would ideally store dates as BSON Date in MongoDB and then get back a Parquet Date. Y'all already support many data types (https://github.com/mongodb-labs/mongo-arrow/blob/main/bindings/python/docs/source/data_types.rst). It's just that partitioning by date is very useful for us.
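
Until something like this exists, one possible stopgap is to store each date as a midnight-UTC datetime and truncate it back to a date after reading. This is a sketch of that workaround only, not a pymongoarrow feature; both function names are hypothetical, and naive datetimes are assumed to already be in UTC (PyMongo's default when decoding).

```python
from datetime import date, datetime, timezone


def date_to_datetime(d):
    """Represent a calendar date as midnight UTC, suitable for a BSON datetime."""
    return datetime(d.year, d.month, d.day, tzinfo=timezone.utc)


def datetime_to_date(dt):
    """Recover the calendar date; naive values are assumed to be UTC already."""
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc)
    return dt.date()
```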

    1 vote
    0 comments  ·  Python  ·  Admin →
  7. single-thread pymongo

    Hi.

    I am using MongoDB Atlas and have recently moved my site to Google App Engine.
    GAE supports only limited multithreading and outputs errors about PyMongo.

    An option to disable multithreading in PyMongo would help me.

    Thanks.

    1 vote
    0 comments  ·  Python  ·  Admin →
  8. Normalize maximum time in MS parameter name, maxTimeMS vs max_time_ms

    Currently, some operations such as aggregate() or count_documents() use maxTimeMS as the parameter name, but others such as find() use max_time_ms, so unless you memorize which method uses which nomenclature, you have to check the docs every time.
    This is especially confusing since, for example, find_one() uses max_time_ms but all the find_one_and_...() methods use maxTimeMS.
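
In the meantime, the mismatch can be hidden behind a small lookup on the caller's side. A rough sketch, assuming only the method/keyword pairs named in this post; `max_time_kwargs` is a hypothetical helper, not part of PyMongo.

```python
# Keyword spelling per Collection method, as listed in the post above.
_CAMEL_CASE_METHODS = {
    "aggregate",
    "count_documents",
    "find_one_and_delete",
    "find_one_and_replace",
    "find_one_and_update",
}
_SNAKE_CASE_METHODS = {"find", "find_one"}


def max_time_kwargs(method_name, milliseconds):
    """Return the time-limit keyword argument in the spelling the method expects."""
    if method_name in _CAMEL_CASE_METHODS:
        return {"maxTimeMS": milliseconds}
    if method_name in _SNAKE_CASE_METHODS:
        return {"max_time_ms": milliseconds}
    raise ValueError(f"no known time-limit keyword for {method_name!r}")


# Intended use (needs a real collection, so it is left as a comment):
#   collection.find(query, **max_time_kwargs("find", 500))
#   collection.aggregate(pipeline, **max_time_kwargs("aggregate", 500))
```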

    1 vote
    0 comments  ·  Python  ·  Admin →
  9. Generic count property in _WriteResult class

    Hi,

    I'm using MongoDB as a backend for my API and often run into cases where I want to validate whether I updated, inserted or deleted any records. At the moment I use the deleted_count and matched_count properties of the DeleteResult and UpdateResult objects. To provide a more generic way, I thought it might make sense to add a new property to the generic _WriteResult class called affectedrecords or something like that, so you can react generically to the result of your database transaction and maybe return a 404 if nothing was updated, deleted or…
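
What such a property might compute can be approximated today with duck typing over the existing result attributes. A minimal sketch for acknowledged single-operation results; `affected_count` is a hypothetical helper, and counting an upsert as one affected document is an assumption about the desired semantics.

```python
def affected_count(result):
    """Best-effort number of documents a single write result reports as affected."""
    if hasattr(result, "deleted_count"):  # DeleteResult
        return result.deleted_count
    if hasattr(result, "matched_count"):  # UpdateResult
        upserted = getattr(result, "upserted_id", None) is not None
        return result.matched_count + (1 if upserted else 0)
    if hasattr(result, "inserted_ids"):  # InsertManyResult
        return len(result.inserted_ids)
    if hasattr(result, "inserted_id"):  # InsertOneResult
        return 1
    raise TypeError(f"unsupported result type: {type(result).__name__}")
```

An API handler could then return a 404 whenever `affected_count(result) == 0`, whichever operation produced the result.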

    1 vote
    0 comments  ·  Python  ·  Admin →
  10. Add non-EJSON as option for json utils

    Python and other drivers offer to-JSON utilities. These make it easy to take a native, rich shape, complete with datetimes, byte[] and such, and turn it into BSON (the best way!) or JSON. The utilities offer options to modify the output representations of types but always do so in an EJSON way, namely with $date/$numberDecimal etc. Sometimes a consumer cannot (or will not) accept data in this fashion. I'd like to see a "safePureJSON" option (or similar) for bson.json_util.dumps() that emits the safe string or number equivalent of the EJSON.
    fld: {$date: "ISOdate"} becomes fld: "ISOdate"
    fld: {$numberDecimal: "99.9"} becomes fld: "99.9"…
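
The requested output shape can be approximated today with the standard `json` module and a `default` hook. This is a sketch of the idea rather than a `bson.json_util` option: `plain_default` and `dumps_plain` are hypothetical names, base64 for bytes is an assumed choice, and BSON-specific types such as ObjectId would need their own branches.

```python
import base64
import json
from datetime import date, datetime
from decimal import Decimal


def plain_default(value):
    """Convert rich Python values to plain JSON strings instead of EJSON wrappers."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, Decimal):
        return str(value)
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(value).decode("ascii")
    raise TypeError(f"cannot serialize {type(value).__name__}")


def dumps_plain(document, **kwargs):
    """json.dumps with the plain (non-EJSON) conversions applied."""
    return json.dumps(document, default=plain_default, **kwargs)
```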

    1 vote
    0 comments  ·  Python  ·  Admin →