
Comments (5)

mukunku commented on June 25, 2024

Appending a suffix might be the only way to handle these, but it's not straightforward. It might not be worth investing time if it's such a rare use case. Let's see if anyone else needs this kind of support. If demand increases, I can take a look.

from parquetviewer.

mukunku commented on June 25, 2024

I gave this a shot, but it turns out DataTables are case-insensitive when it comes to column names. So it's not possible to show two fields whose names differ only by case.

For now I've added logic to gracefully exclude duplicate fields from the output. It's not ideal but at least the utility won't crash when opening such files.
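To illustrate the idea of excluding duplicates, here is a minimal Python sketch. It is not the code ParquetViewer uses; the function name `drop_case_insensitive_duplicates` and the sample field names are hypothetical. It assumes the first occurrence of a name wins and later fields that differ only by case are skipped and reported.

```python
def drop_case_insensitive_duplicates(field_names):
    """Keep the first field for each case-insensitive name; collect the rest."""
    seen = set()
    kept, skipped = [], []
    for name in field_names:
        key = name.casefold()  # case-insensitive comparison key
        if key in seen:
            skipped.append(name)  # would collide in a case-insensitive table
        else:
            seen.add(key)
            kept.append(name)
    return kept, skipped


# Hypothetical field names, for illustration only.
kept, skipped = drop_case_insensitive_duplicates(["id", "Name", "name", "ID", "value"])
print(kept)     # ['id', 'Name', 'value']
print(skipped)  # ['name', 'ID']
```

The skipped list could be used to tell the user which fields were left out instead of dropping them silently.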

Give it a shot here if you get the chance: https://github.com/mukunku/ParquetViewer/releases/tag/v2.5.1

I'll leave this ticket open since the original issue hasn't been solved, and it should be possible, albeit difficult, to handle case-sensitive field names.


MCRE-BE commented on June 25, 2024

So the issue is not with you but with the underlying library you are using to parse Parquet files? I can open a bug report there.

I'll test the fix, but indeed it's a workaround...


mukunku commented on June 25, 2024

@MCRE-BE The issue is with the data structure the app uses to store the data in memory. It doesn't support multiple columns whose names differ only in casing because it's built to be case-insensitive.

In your original bug report you mentioned:

"The similar column names is a bug in my code, but should not make the program crash."

Is this a legitimate use case for your workflow, or was it a mistake and you don't normally have the same column names with different casing?

If this isn't a normal use case, maybe just gracefully warning the user of the problem is a sufficient solution here:
[image]


MCRE-BE commented on June 25, 2024

For me it was a mistake. So for me it's a sufficient solution, but might not be for others 🙄 But thanks for the fix 😄

I guess you can't easily change the column names (like appending a _x suffix)? That's how pandas solves the issue in its dataframes.
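For reference, a rough Python sketch of what suffix-based renaming could look like. This is an assumption-laden illustration, not ParquetViewer code: the function name `add_suffixes` is hypothetical, and it uses numeric suffixes (`_1`, `_2`) rather than the `_x`/`_y` suffixes pandas applies to overlapping columns in a merge. The loop also shows why it's not entirely trivial: a generated name can itself collide with an existing one.

```python
def add_suffixes(field_names):
    """Rename fields that collide case-insensitively by appending _1, _2, ..."""
    taken = set()  # casefolded names already assigned
    result = []
    for name in field_names:
        candidate = name
        n = 0
        # Keep bumping the suffix until the name is unique ignoring case.
        while candidate.casefold() in taken:
            n += 1
            candidate = f"{name}_{n}"
        taken.add(candidate.casefold())
        result.append(candidate)
    return result


# Hypothetical field names, for illustration only.
renamed = add_suffixes(["id", "Name", "name", "NAME"])
print(renamed)  # ['id', 'Name', 'name_1', 'NAME_2']
```

A viewer doing this would also need to remember the original names so it can still show the user what the file actually contains.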

