Comments (7)

quoing commented on August 14, 2024

Ok, I managed to fix the problem.

It would be better to fix the issue with file encoding rather than re-coding the output.

In my opinion, it would be better to open the file with the correct encoding. The following seems a much more generic solution. Could you retry?
https://github.com/setnicka/ulozto-downloader/blob/master/uldlib/frontend.py#L81

    self.logfile = open(logfile, 'a', encoding="utf-8")
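
As a rough, standalone illustration of why an explicit encoding helps (this is a sketch, not the project's code; `append_log_line`, the temporary path, and the sample filename are made up for the example), a log file opened with `encoding="utf-8"` accepts characters that a legacy Windows code page would reject:

```python
import os
import tempfile
from datetime import datetime


def append_log_line(path: str, msg: str) -> None:
    """Append one timestamped line to a UTF-8 encoded log file (hypothetical helper)."""
    t = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    # An explicit encoding avoids falling back to the platform default,
    # which on Windows may be a legacy code page.
    with open(path, "a", encoding="utf-8") as logfile:
        logfile.write(f"{t} {msg}\n")


# Hypothetical filename containing the character from the reported error (\u011b).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.log")
append_log_line(demo_path, "Downloading p\u011bkn\u00fd_soubor.zip")

with open(demo_path, encoding="utf-8") as f:
    logged = f.read()
```

Reading the file back with the same encoding returns the filename unchanged, so nothing that parses the log later loses information.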

from ulozto-downloader.

Scavy commented on August 14, 2024

Ok, I managed to fix the problem.

The original function:

    def _log_logfile(self, prefix: str, msg: str, progress: bool, level: LogLevel):
        if progress or self.logfile is None:
            return

        t = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        self.logfile.write(f"{t} {prefix}\t[{level.name}] {msg}\n")
        self.logfile.flush()

The changed function, with error handling that replaces non-ASCII characters with ?:

    def _log_logfile(self, prefix: str, msg: str, progress: bool, level: LogLevel):
        if progress or self.logfile is None:
            return

        t = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        log_msg = f"{t} {prefix}\t[{level.name}] {msg}\n"

        try:
            self.logfile.write(log_msg)
        except UnicodeEncodeError:
            # Replace unencodable characters with a placeholder
            log_msg = log_msg.encode('ascii', 'replace').decode('ascii')
            self.logfile.write(log_msg)
            
        self.logfile.flush()

Edit: This is just a quick hack to work around the problem. I think the optimal solution would be to open the logfile as UTF-8.
So I'll leave my quick fix here for someone to rewrite into a better solution.

Edit2: Just to be clear: this only affects how the filename is written to the logfile when it contains non-ASCII characters. So it will break anything that depends on reading the filename back from the logfile in these cases.
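
To make that caveat concrete, here is a small sketch of what the fallback does to a name (the helper name and the sample filename are invented for illustration, they are not from the project):

```python
def to_ascii_with_placeholders(text: str) -> str:
    """Replace every character that ASCII cannot encode with '?'."""
    return text.encode("ascii", "replace").decode("ascii")


# Hypothetical filename with two non-ASCII characters.
original = "p\u011bkn\u00fd.zip"
logged = to_ascii_with_placeholders(original)
print(logged)  # p?kn?.zip
```

The original characters cannot be recovered from the `?` placeholders, which is why reading filenames back from such a log is unreliable.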

pschonmann commented on August 14, 2024

And what's the URL to download? There is probably a "wrong" character in it.

Scavy commented on August 14, 2024

It seems to affect all files whose names contain characters that are "illegal" in the cp1251 code page.

I ran into another character that would trigger the same error:
UnicodeEncodeError: 'charmap' codec can't encode character '\u011b' in position 82: character maps to <undefined>
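
The error can be reproduced without the downloader. A minimal sketch (the helper `can_encode` is made up for this example) that checks which codecs can represent the character from the traceback:

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


char = "\u011b"  # the character from the error message above
results = {enc: can_encode(char, enc) for enc in ("cp1251", "cp1252", "utf-8")}
print(results)
```

Both legacy code pages report that the character cannot be encoded, while UTF-8 handles it.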

Scavy commented on August 14, 2024

It doesn't seem to be a problem with handling those filenames. The problem is when it tries to write the name to the logfile; that is when the error is triggered.

So what is needed is some kind of error handling at line 99 of the file "\uldlib\frontend.py". That should handle all kinds of unknown characters in that situation.
Unfortunately, I'm not proficient enough in Python yet to come up with a simple fix myself.

Scavy commented on August 14, 2024

Ok, I managed to fix the problem.

It would be better to fix the issue with file encoding rather than re-coding the output.

In my opinion, it would be better to open the file with the correct encoding. The following seems a much more generic solution. Could you retry?
https://github.com/setnicka/ulozto-downloader/blob/master/uldlib/frontend.py#L81

    self.logfile = open(logfile, 'a', encoding="utf-8")

This is a better solution! And works very well.
I was hoping for someone to pick it up and make a correct fix, rather than my way of circumventing the problem.
Thanks! :)
Can you do a PR with the solution so that @setnicka can accept it into the code?

setnicka commented on August 14, 2024

Fixed with #177 by @quoing, thank you.
