Comments (2)
Note also that if a column is missing in one CSV and present in the other, there will be a `ShapeError`. I expect it to just be present in the final LazyFrame, with null for the rows from the CSV where the column doesn't exist.
from polars.
> Note also that if a column is missing in one CSV and present in the other, there will be a `ShapeError`. I expect it to just be present in the final LazyFrame, with null for the rows from the CSV where the column doesn't exist.
For this, I would suggest to scan the files individually and then use a diagonal concat, i.e.:
pl.concat([pl.scan_csv(path) for path in files], how='diagonal')
I don't think we should add this to the CSV reader, as it would cause issues with other functionality that relies on the CSV files having the same number of columns.