Comments (7)
@ion-elgreco I was using version 0.15.1. But with 0.14.0, I see the same issue (although the __delta_rs_path is no longer listed in either the LogicalPlan or ExecutionPlan schemas).
from delta-rs.
Does the error in this case happen consistently? Would be nice to have the exact stack trace / error.
Since this data is partitioned on two columns, another possibility is that the distinct partition value scan might have an issue.
@Blajda yeah, it happens consistently. I'll try to find some time to reproduce it with an MRE.
I can try giving you the full stack trace, but I have to rename a bunch of stuff since it's confidential. I'll get back to you on that tomorrow :)
I am seeing the exact same behavior. My table is partitioned on two columns. When I change it to be partitioned by a single column, the error no longer occurs.
@sylvanayelda On which version do you see the issue? Could you try it on 0.14.0 as well?
@sylvanayelda Are you merging using the polars interface?
Please also provide the schema of the table you are using.
@Blajda I'm not using polars. Here is a sample of my code:
from deltalake import DeltaTable, write_deltalake
import pyarrow as pa
# schema is a pyarrow schema
source_table = pa.Table.from_pylist(records, schema=schema)
target_table = DeltaTable(table_path, storage_options=storage_opts)
(
    target_table.merge(
        source=source_table,
        source_alias=SOURCE_ALIAS,
        target_alias=TARGET_ALIAS,
        predicate=get_predicate(),
        large_dtypes=False,
    )
    .when_not_matched_insert(updates=get_inserts(schema.names))
    .execute()
)
The table has the following schema:
Schema(
    [
        Field(partition_col1, PrimitiveType("string"), nullable=True),
        Field(col2, PrimitiveType("string"), nullable=True),
        Field(col3, PrimitiveType("string"), nullable=True),
        Field(col4, PrimitiveType("long"), nullable=True),
        Field(col5, PrimitiveType("long"), nullable=True),
        Field(col6, PrimitiveType("string"), nullable=True),
        Field(col7, PrimitiveType("string"), nullable=True),
        Field(col8, PrimitiveType("long"), nullable=True),
        Field(col9, PrimitiveType("long"), nullable=True),
        Field(col10, PrimitiveType("long"), nullable=True),
        Field(col11, PrimitiveType("long"), nullable=True),
        Field(
            col12,
            ArrayType(PrimitiveType("long"), contains_null=True),
            nullable=True,
        ),
        Field(
            col13,
            ArrayType(PrimitiveType("long"), contains_null=True),
            nullable=True,
        ),
        Field(partition_date, PrimitiveType("string"), nullable=True),
    ]
)
I should point out that we are also storing the data in ADLS2. Could that be causing any issues here?
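For anyone trying to reproduce this: the sample above calls `get_predicate()` and `get_inserts()` without showing them. The sketch below is only a guess at what such helpers might look like, assuming the merge matches on both partition columns plus one key column. The alias values, the choice of `col2` as the key, and both function bodies are assumptions for illustration, not the code from this report.

```python
# Hypothetical stand-ins for the helpers referenced in the sample above.
# The alias values and key column are assumptions, not taken from the report.
SOURCE_ALIAS = "source"
TARGET_ALIAS = "target"

# Both partition columns from the posted schema, plus an assumed key column.
PARTITION_COLS = ["partition_col1", "partition_date"]
KEY_COLS = ["col2"]


def get_predicate(columns=None):
    """Build a SQL merge predicate that equates each column across aliases."""
    if columns is None:
        columns = PARTITION_COLS + KEY_COLS
    return " AND ".join(
        f"{TARGET_ALIAS}.{col} = {SOURCE_ALIAS}.{col}" for col in columns
    )


def get_inserts(names):
    """Map every target column name to the matching source column expression."""
    return {name: f"{SOURCE_ALIAS}.{name}" for name in names}


if __name__ == "__main__":
    print(get_predicate())
    print(get_inserts(["partition_col1", "col2", "partition_date"]))
```

Including both partition columns in the predicate is a deliberate choice here, since the failure reported in this thread only shows up when the table is partitioned on two columns.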