imagesize_py's Introduction

imagesize

This module analyzes JPEG/JPEG 2000/PNG/GIF/TIFF/SVG/Netpbm/WebP image headers and returns the image size or DPI.

This is a pure Python module. Besides a file path, you can pass a file-like object such as an open file or io.BytesIO.

API

  • imagesize.get(filepath)

    Returns image size (width, height).

  • imagesize.getDPI(filepath)

    Returns image DPI (width, height).

Benchmark

It only parses headers, and ignores pixel data. So it is much faster than Pillow.

module                    result
imagesize (pure Python)   1.077 seconds per 100 000 times
Pillow                    10.569 seconds per 100 000 times

I tested on MacBookPro (2014/Core i7) with 125kB PNG files.
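
To illustrate why reading only the header is cheap, here is a minimal sketch that extracts the size of a GIF from its first 10 bytes. It is not the module's actual code; the helper name `gif_size` is made up for this example.

```python
import io
import struct


def gif_size(fileobj):
    """Return (width, height) of a GIF by reading only its first 10 bytes."""
    head = fileobj.read(10)
    if len(head) < 10 or head[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    # The logical screen size is two little-endian unsigned 16-bit integers.
    return struct.unpack("<HH", head[6:10])


# Only the header is needed, so a 10-byte buffer is enough for the demonstration.
header = b"GIF89a" + struct.pack("<HH", 320, 200)
print(gif_size(io.BytesIO(header)))  # (320, 200)
```

No pixel data is decoded at any point, which is where the speed difference comes from.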

Development

Run test with the following command:

License

MIT License

Thanks

I referred to the following code:

I use sample image from here:

Thank you for feedback:

imagesize_py's Issues

MacPorts imagesize_py port

#51

MacPorts is not updated with 1.4.1 version...

py39-imagesize @1.3.0 (python, devel, graphics)

Description:          This module analyzes jpeg/jpeg2000/png/gif image headers and returns the image size.
Homepage:             https://github.com/shibukawa/imagesize_py

Build Dependencies:   py39-setuptools
Library Dependencies: python39
Test Dependencies:    py39-pytest
Platforms:            darwin
License:              MIT
Maintainers:          none

EXIF rotation tags

If an image contains an EXIF rotation flag the returned size has to be rotated accordingly, any plans to add this? 👀

Tag releases

Could you make git tags for the releases that you upload to PyPI? That would make it easy to find the exact commit a release was built from, and it would provide an alternative download location now that PyPI uses hashed URLs instead of predictable ones. The hashed URLs are really annoying when packaging software: on each new release I need to go to PyPI and copy the URL instead of just changing the version number.

Width and height of image are transposed when EXIF contains rotation metadata

EXIF supports 'Orientation' parameter, which may instruct the image to be opened after rotating by 90 degrees. This is respected by all software I have tested, but unfortunately not by imagesize, causing incorrect results.

Sample image:
example-exif
Same image with orientation EXIF set to rotate 90 degrees CW:
example-exif-rotated

However, when I run the following code:

import imageio
import imagesize

print(imagesize.get('example-exif.jpg')) # This prints (300, 100)
print(imagesize.get('example-exif-rotated.jpg')) # This prints (300, 100)

# Note that numpy image dimensions are (height, width, colors)
print(imageio.imread('example-exif.jpg').shape) # This prints (100, 300, 3)
print(imageio.imread('example-exif-rotated.jpg').shape) # This prints (300, 100, 3)
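
As a possible workaround on the caller's side, the stored size can be corrected once the EXIF Orientation value is known. The sketch below assumes the orientation has already been read by some other means (imagesize does not expose it); `apply_exif_orientation` is a hypothetical helper, not part of the library.

```python
def apply_exif_orientation(size, orientation):
    """Return the displayed (width, height) for a stored size and an EXIF Orientation value."""
    width, height = size
    # Orientation values 5 to 8 include a 90 or 270 degree rotation, so the axes swap.
    if orientation in (5, 6, 7, 8):
        return height, width
    return width, height


print(apply_exif_orientation((300, 100), 1))  # (300, 100): no rotation
print(apply_exif_orientation((300, 100), 6))  # (100, 300): rotate 90 degrees CW
```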

Getting the number of channels?

Is it possible to add functionality to also parse out the number of channels in the image? I'd like to distinguish between grayscale, RGB, and RGBA images. It would be OK if it got confused by things like color palettes.

The reason is that I'd like to incorporate this into my kwimage.load_image_shape function as it is stupidly faster than PIL and GDAL:

        >>> # For large files, PIL is much faster than GDAL
        >>> from osgeo import gdal
        >>> from PIL import Image
        >>> import timerit
        >>> #
        >>> import kwimage
        >>> fpath = kwimage.grab_test_image_fpath()
        >>> #
        >>> ti = timerit.Timerit(100, bestof=10, verbose=2)
        >>> for timer in ti.reset('gdal'):
        >>>     with timer:
        >>>         gdal_dset = gdal.Open(fpath, gdal.GA_ReadOnly)
        >>>         width = gdal_dset.RasterXSize
        >>>         height = gdal_dset.RasterYSize
        >>>         gdal_dset = None
        >>> #
        >>> for timer in ti.reset('PIL'):
        >>>     with timer:
        >>>         pil_img = Image.open(fpath)
        >>>         width, height = pil_img.size
        >>>         pil_img.close()
        >>> # The imagesize module is quite fast
        >>> import imagesize
        >>> for timer in ti.reset('imagesize'):
        >>>     with timer:
        >>>         width, height = imagesize.get(fpath)
Timed gdal for: 100 loops, best of 10
    time per loop: best=83.266 µs, mean=85.919 ± 2.1 µs
Timed PIL for: 100 loops, best of 10
    time per loop: best=38.191 µs, mean=38.981 ± 0.7 µs
Timed imagesize for: 100 loops, best of 10
    time per loop: best=8.269 µs, mean=8.516 ± 0.2 µs

But in those use-cases it's often important to know how many channels there will be as well. Is that possible to parse out of the headers?
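
For PNG at least, the channel count is recoverable from the header, because the IHDR chunk stores a colour type byte right after the width, height and bit depth. The following is a rough sketch of that idea, not something imagesize provides; `png_shape` and `PNG_CHANNELS` are names invented here, and other formats would need their own logic.

```python
import io
import struct

# Channels per PNG colour type; palette images (type 3) store one index per pixel.
PNG_CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def png_shape(fileobj):
    """Return (width, height, channels) using only the first 26 bytes of a PNG."""
    head = fileobj.read(26)
    if len(head) < 26 or not head.startswith(b"\x89PNG\r\n\x1a\n") or head[12:16] != b"IHDR":
        raise ValueError("not a PNG file with a leading IHDR chunk")
    # IHDR data: width, height (big-endian uint32), then bit depth and colour type (uint8).
    width, height, _bit_depth, color_type = struct.unpack(">LLBB", head[16:26])
    if color_type not in PNG_CHANNELS:
        raise ValueError("unknown PNG colour type: %d" % color_type)
    return width, height, PNG_CHANNELS[color_type]


# Minimal in-memory header for an 8-bit RGBA image of 640x480 pixels.
header = (b"\x89PNG\r\n\x1a\n" + struct.pack(">L", 13) + b"IHDR"
          + struct.pack(">LLBB", 640, 480, 8, 6))
print(png_shape(io.BytesIO(header)))  # (640, 480, 4)
```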

Returns (-1,-1) instead of exception when used on something unsupported

When using imagesize.get() on anything which is not supported (text files, empty files, etc.), the method just returns (-1,-1). I would suggest raising a ValueError instead (as is already the case when trying a random XML file, since it is parsed as SVG).

On the other hand, that would be a behavior change, so maybe either document the current behavior or bump the major version if this change is implemented?
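
Until the behavior is changed or documented, callers can get the stricter semantics with a thin wrapper. This is only a sketch: `get_strict` is a hypothetical name, and the stub `fake_getter` stands in for `imagesize.get` so the snippet runs without the package installed.

```python
def get_strict(getter, source):
    """Call getter(source) and raise ValueError if it reports the (-1, -1) sentinel."""
    width, height = getter(source)
    if width == -1 or height == -1:
        raise ValueError("unsupported or unrecognized image: %r" % (source,))
    return width, height


def fake_getter(source):
    # Stand-in for imagesize.get: pretends that .txt files are unsupported.
    return (-1, -1) if source.endswith(".txt") else (300, 100)


print(get_strict(fake_getter, "photo.png"))
try:
    get_strict(fake_getter, "notes.txt")
except ValueError as exc:
    print("rejected:", exc)
```

In real code the call would presumably look like `get_strict(imagesize.get, path)`.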

1.4.1: pep517 build fails

Source code from git tag

+ /usr/bin/python3 -sBm build -w --no-isolation
* Getting dependencies for wheel...
Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/pep517/in_process/_in_process.py", line 363, in <module>
    main()
  File "/usr/lib/python3.8/site-packages/pep517/in_process/_in_process.py", line 345, in main
    json_out['return_val'] = hook(**hook_input['kwargs'])
  File "/usr/lib/python3.8/site-packages/pep517/in_process/_in_process.py", line 130, in get_requires_for_build_wheel
    return hook(config_settings)
  File "/usr/lib/python3.8/site-packages/setuptools/build_meta.py", line 177, in get_requires_for_build_wheel
    return self._get_build_requires(
  File "/usr/lib/python3.8/site-packages/setuptools/build_meta.py", line 159, in _get_build_requires
    self.run_setup()
  File "/usr/lib/python3.8/site-packages/setuptools/build_meta.py", line 281, in run_setup
    super(_BuildMetaLegacyBackend,
  File "/usr/lib/python3.8/site-packages/setuptools/build_meta.py", line 174, in run_setup
    exec(compile(code, __file__, 'exec'), locals())
  File "setup.py", line 4, in <module>
    from imagesize import __version__
ImportError: cannot import name '__version__' from 'imagesize' (unknown location)
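
One common way to avoid this class of failure is for setup.py to read the version string from the source text instead of importing the package during the build. The sketch below shows the idea only; `read_version` is a hypothetical helper, and which file actually holds `__version__` in this project is an assumption the packager would have to check.

```python
import re


def read_version(source_text):
    """Extract the value of a top-level __version__ assignment without importing anything."""
    match = re.search(r"""^__version__\s*=\s*['"]([^'"]+)['"]""", source_text, re.MULTILINE)
    if match is None:
        raise RuntimeError("no __version__ assignment found")
    return match.group(1)


# In setup.py the text would come from reading the module file; here it is inlined.
sample = '__all__ = ("get", "getDPI")\n__version__ = "1.4.1"\n'
print(read_version(sample))  # 1.4.1
```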

Reading image size of remote file

Has anyone considered adding this functionality?
From my brief experimentation, it's a little more tricky than swapping

    # with open(str(filepath), 'rb') as fhandle:
    with urllib.request.urlopen(str(url)) as fhandle:

Though that works fine for png files, anything that needs to seek, like a jpeg, will fail.
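
One way around the missing seek support is to copy the beginning of the response into an io.BytesIO, which is seekable, and hand that to the parser. The sketch below assumes the size information lies within the first `limit` bytes, which may not hold for JPEGs with large metadata blocks; `buffered_prefix` and `ReadOnlyStream` are illustrative names, not part of imagesize or urllib.

```python
import io


def buffered_prefix(stream, limit=65536):
    """Copy up to `limit` bytes from a non-seekable stream into a seekable BytesIO."""
    chunks = []
    remaining = limit
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # end of stream
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return io.BytesIO(b"".join(chunks))


class ReadOnlyStream:
    """Tiny stand-in for an HTTP response: it can be read but not rewound."""

    def __init__(self, data):
        self._inner = io.BytesIO(data)

    def read(self, n=-1):
        return self._inner.read(n)


buf = buffered_prefix(ReadOnlyStream(b"\xff\xd8" + b"\x00" * 30), limit=16)
buf.seek(4)  # seeking now works
print(buf.tell(), len(buf.getvalue()))  # 4 16
```

With a real URL, the object returned by `urllib.request.urlopen(url)` would take the place of `ReadOnlyStream`, and the resulting buffer would be passed to `imagesize.get`.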

`_convertToPx` discards fractional units

In _convertToPx, all length values are cast to integers. This breaks SVG files that specify width and height as floats. For example, "25.4mm" becomes "25mm", so the length comes out as "94.488px" instead of "96px".
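
A float-preserving conversion could look roughly like the sketch below. It is not the library's `_convertToPx`; the name `length_to_px` and the unit table are written for this example, using the CSS convention of 96 px per inch.

```python
import re

# Pixels per unit at the CSS reference resolution of 96 px per inch.
PX_PER_UNIT = {
    "px": 1.0,
    "in": 96.0,
    "cm": 96.0 / 2.54,
    "mm": 96.0 / 25.4,
    "pt": 96.0 / 72.0,
    "pc": 16.0,
}


def length_to_px(length):
    """Convert a length such as '25.4mm' to pixels without truncating the number."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([a-z]*)\s*", length)
    if match is None:
        raise ValueError("unparsable length: %r" % (length,))
    value, unit = match.groups()
    if (unit or "px") not in PX_PER_UNIT:
        raise ValueError("unknown unit: %r" % (unit,))
    return float(value) * PX_PER_UNIT[unit or "px"]


print(round(length_to_px("25.4mm"), 3))  # 96.0
print(round(length_to_px("25mm"), 3))    # 94.488, the truncated result from the report
```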

missing test.jp2 file

see test/test_get.py:

    def test_load_jpeg2000(self):
        width, height = imagesize.get(os.path.join(imagedir, "test.jp2"))
        self.assertEqual(width, 802)
        self.assertEqual(height, 670)

Also, it would be great if you could set up Travis integration to run these tests.

cannot read a svg file

I am trying to get the size of an .svg file. The file displays just fine on my computer but raises an error in this library.
Do you have any idea why?

To reproduce:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg viewBox="0 0 1 1" xmlns="http://www.w3.org/2000/svg">
    <style> * { fill: black } </style>
    <polygon points="0,1 1,1 0.5,0" class="triangle" />
</svg>

import imagesize

w, h = imagesize.get("triangle.svg")

The full error traceback:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File ~/.pyenv/versions/3.8.3/lib/python3.8/site-packages/imagesize.py:210, in get(filepath)
    209 data = data.decode('utf-8')
--> 210 width = re.search(r'[^-]width="(.*?)"', data).group(1)
    211 height = re.search(r'[^-]height="(.*?)"', data).group(1)

AttributeError: 'NoneType' object has no attribute 'group'

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
/Users/pierrickrambaud/Documents/travail/FAO/app_buffer/sphinx-favicon/toto.ipynb Cellule 3 in <cell line: 3>()
1 import imagesize
----> 3 w, h = imagesize.get("/Users/pierrickrambaud/Documents/travail/FAO/app_buffer/sphinx-favicon/tests/roots/test-static_files/gfx/nested/triangle.svg")

File ~/.pyenv/versions/3.8.3/lib/python3.8/site-packages/imagesize.py:213, in get(filepath)
211 height = re.search(r'[^-]height="(.*?)"', data).group(1)
212 except Exception:
--> 213 raise ValueError("Invalid SVG file")
214 width = _convertToPx(width)
215 height = _convertToPx(height)

ValueError: Invalid SVG file
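
The traceback shows the lookup failing because the file has no width or height attribute, only a viewBox. A fallback that reads the last two viewBox numbers might look like the sketch below. Treating viewBox user units as pixels is an assumption, and `svg_size_from_viewbox` is a hypothetical helper rather than anything in imagesize.

```python
import re


def svg_size_from_viewbox(svg_text):
    """Return (width, height) taken from the viewBox attribute of an SVG document."""
    match = re.search(r'viewBox="([^"]*)"', svg_text)
    if match is None:
        raise ValueError("SVG has no viewBox attribute")
    # viewBox holds four numbers: min-x, min-y, width, height (space or comma separated).
    parts = re.split(r"[\s,]+", match.group(1).strip())
    if len(parts) != 4:
        raise ValueError("malformed viewBox: %r" % (match.group(1),))
    _min_x, _min_y, width, height = (float(part) for part in parts)
    return width, height


triangle = '''<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg viewBox="0 0 1 1" xmlns="http://www.w3.org/2000/svg">
    <polygon points="0,1 1,1 0.5,0" class="triangle" />
</svg>'''
print(svg_size_from_viewbox(triangle))  # (1.0, 1.0)
```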

1.4.1: pytest is failing in `test/test_get_filelike.py::test_get_filelike` unit

Looks like URL used in test suite fails in test/test_get_filelike.py::test_get_filelike unit

+ PYTHONPATH=/home/tkloczko/rpmbuild/BUILDROOT/python-imagesize-1.4.1-5.fc35.x86_64/usr/lib64/python3.8/site-packages:/home/tkloczko/rpmbuild/BUILDROOT/python-imagesize-1.4.1-5.fc35.x86_64/usr/lib/python3.8/site-packages
+ /usr/bin/pytest -ra -m 'not network'
============================= test session starts ==============================
platform linux -- Python 3.8.17, pytest-7.4.0, pluggy-1.2.0
rootdir: /home/tkloczko/rpmbuild/BUILD/imagesize_py-1.4.1
collected 45 items

test/test_get.py .......................                                 [ 51%]
test/test_get_filelike.py F                                              [ 53%]
test/test_getdpi.py .....................                                [100%]

=================================== FAILURES ===================================
______________________________ test_get_filelike _______________________________

    def test_get_filelike():
        """ test_get_filelike. """

        url = 'https://www.tsln.com/wp-content/uploads/2018/10/bears-tsln-101318-3-1240x826.jpg'
        try:
>           response = urlopen(url)

test/test_get_filelike.py:28:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/lib64/python3.8/urllib/request.py:222: in urlopen
    return opener.open(url, data, timeout)
/usr/lib64/python3.8/urllib/request.py:531: in open
    response = meth(req, response)
/usr/lib64/python3.8/urllib/request.py:640: in http_response
    response = self.parent.error(
/usr/lib64/python3.8/urllib/request.py:569: in error
    return self._call_chain(*args)
/usr/lib64/python3.8/urllib/request.py:502: in _call_chain
    result = func(*args)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <urllib.request.HTTPDefaultErrorHandler object at 0x7f3467737cd0>
req = <urllib.request.Request object at 0x7f34677378b0>
fp = <http.client.HTTPResponse object at 0x7f34677f60d0>, code = 404
msg = 'Not Found', hdrs = <http.client.HTTPMessage object at 0x7f34676b1760>

    def http_error_default(self, req, fp, code, msg, hdrs):
>       raise HTTPError(req.full_url, code, msg, hdrs, fp)
E       urllib.error.HTTPError: HTTP Error 404: Not Found

/usr/lib64/python3.8/urllib/request.py:649: HTTPError

During handling of the above exception, another exception occurred:

    def test_get_filelike():
        """ test_get_filelike. """

        url = 'https://www.tsln.com/wp-content/uploads/2018/10/bears-tsln-101318-3-1240x826.jpg'
        try:
            response = urlopen(url)
            raw = response.read()
        except Exception as exc:
>           raise SystemExit(exc)
E           SystemExit: HTTP Error 404: Not Found

test/test_get_filelike.py:31: SystemExit
=========================== short test summary info ============================
FAILED test/test_get_filelike.py::test_get_filelike - SystemExit: HTTP Error ...
========================= 1 failed, 44 passed in 1.40s =========================

Another thing: it would be good to mark every test that needs more than localhost with a network pytest mark, which is widely used in other modules' test suites.
https://docs.pytest.org/en/7.1.x/example/markers.html
Many distributions' build environments are intentionally cut off from the public network, and in such conditions running pytest -m "not network" would make it easy to skip those tests.

BufferedReader instances aren't used correctly

Hello! Thanks for the awesome package!

I tried doing something similar to:

with open(path, "rb") as f:
  imagesize.get(f)

But got an exception, since get expects the parameter to be either BytesIO or a PathLike/str, and it turns out open() returns a BufferedReader which is neither.

This is a simple fix since BufferedReader implements the same needed API as BytesIO.
I'd like to suggest a change: if the parameter is a str or PathLike, try to open it, in any other case try to use it as a buffer.

This would solve my use case and open more possibilities (i.e. a BufferedReader reading from HTTP could be used for e.g. #44 ).

Does this sound good? If so, I'd be happy to submit a PR.

Thanks!
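
The dispatch proposed above could be sketched as follows: paths are opened (and closed) by the helper, while anything else is assumed to already behave like a binary stream. `open_binary` is a hypothetical name used only for this illustration.

```python
import contextlib
import io
import os


@contextlib.contextmanager
def open_binary(source):
    """Yield a readable binary stream for a path or an already-open file-like object."""
    if isinstance(source, (str, os.PathLike)):
        # We opened it, so we are responsible for closing it.
        with open(source, "rb") as fhandle:
            yield fhandle
    else:
        # Assumed to provide read()/seek(); the caller keeps ownership and closes it.
        yield source


stream = io.BytesIO(b"GIF89a")
with open_binary(stream) as fhandle:
    print(fhandle.read(6))  # b'GIF89a'
print(stream.closed)  # False: the helper did not close the caller's object
```

The same code path would then serve BytesIO, BufferedReader and any other object with a compatible read interface.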

Extended function to support Buffer and io.BufferedReader.

I adapted your function to work with bytes and io.BufferedReader. Feel free to take it if you want,
since it is helpful when you work with buffers.

Note:
This version does not support XML, because ElementTree works with files.

Thanks for your repo.

import io
import struct


def image_size(src):
    """
    Implement: https://github.com/shibukawa/imagesize_py
    Return (width, height) for a given img file content
    no requirements
    :rtype Tuple[int, int]
    """
    assert isinstance(src, (bytes, io.BufferedReader, str))
    height = -1
    width = -1
    cursor = 0

    if type(src) is str:
        src = open(src, 'rb')

    if type(src) is io.BufferedReader:
        buffer = src.read()
        src.close()
    else:
        buffer = src

    head = buffer[:24]
    size = len(head)
    # handle GIFs
    if size >= 10 and head[:6] in (b'GIF87a', b'GIF89a'):
        # Check to see if content_type is correct
        try:
            width, height = struct.unpack("<hh", head[6:10])
        except struct.error:
            raise ValueError("Invalid GIF file")
    # handle PNGs: the 8-byte signature is followed by the IHDR chunk length, the chunk type, then width and height
    elif size >= 24 and head.startswith(b'\211PNG\r\n\032\n') and head[12:16] == b'IHDR':
        try:
            width, height = struct.unpack(">LL", head[16:24])
        except struct.error:
            raise ValueError("Invalid PNG file")
    # Maybe this is for an older PNG version.
    elif size >= 16 and head.startswith(b'\211PNG\r\n\032\n'):
        # Check to see if we have the right content type
        try:
            width, height = struct.unpack(">LL", head[8:16])
        except struct.error:
            raise ValueError("Invalid PNG file")
    # handle JPEGs
    elif size >= 2 and head.startswith(b'\377\330'):
        try:
            size = 2
            ftype = 0
            while not 0xc0 <= ftype <= 0xcf or ftype in [0xc4, 0xc8, 0xcc]:
                cursor += size
                byte = buffer[cursor:cursor+1]
                cursor += 1
                while ord(byte) == 0xff:
                    byte = buffer[cursor:cursor+1]
                    cursor += 1
                ftype = ord(byte)
                size = struct.unpack('>H', buffer[cursor:cursor+2])[0] - 2
                cursor += 2
            # We are at a SOFn block
            cursor += 1  # Skip `precision' byte.
            height, width = struct.unpack('>HH', buffer[cursor:cursor+4])
            cursor += 4
        except struct.error:
            raise ValueError("Invalid JPEG file")
    # handle JPEG2000s
    elif size >= 12 and head.startswith(b'\x00\x00\x00\x0cjP  \r\n\x87\n'):
        cursor = 48
        try:
            height, width = struct.unpack('>LL', buffer[cursor:cursor+8])
        except struct.error:
            raise ValueError("Invalid JPEG2000 file")
    # handle big endian TIFF
    elif size >= 8 and head.startswith(b"\x4d\x4d\x00\x2a"):
        offset = struct.unpack('>L', head[4:8])[0]
        cursor = offset
        ifdsize = struct.unpack(">H", buffer[cursor:cursor+2])[0]
        cursor += 2
        for i in range(ifdsize):
            tag, datatype, count, data = struct.unpack(">HHLL", buffer[cursor:cursor+12])
            cursor += 12  # advance to the next 12-byte IFD entry
            if tag == 256:
                if datatype == 3:
                    width = int(data / 65536)
                elif datatype == 4:
                    width = data
                else:
                    raise ValueError("Invalid TIFF file: width column data type should be SHORT/LONG.")
            elif tag == 257:
                if datatype == 3:
                    height = int(data / 65536)
                elif datatype == 4:
                    height = data
                else:
                    raise ValueError("Invalid TIFF file: height column data type should be SHORT/LONG.")
            if width != -1 and height != -1:
                break
        if width == -1 or height == -1:
            raise ValueError("Invalid TIFF file: width and/or height IDS entries are missing.")
    # handle little endian TIFF
    elif size >= 8 and head.startswith(b"\x49\x49\x2a\x00"):
        offset = struct.unpack('<L', head[4:8])[0]
        cursor = offset
        ifdsize = struct.unpack("<H", buffer[cursor:cursor+2])[0]
        cursor += 2
        for i in range(ifdsize):
            tag, datatype, count, data = struct.unpack("<HHLL", buffer[cursor:cursor+12])
            cursor += 12  # advance to the next 12-byte IFD entry
            if tag == 256:
                width = data
            elif tag == 257:
                height = data
            if width != -1 and height != -1:
                break
        if width == -1 or height == -1:
            raise ValueError("Invalid TIFF file: width and/or height IDS entries are missing.")
    return width, height

[1.2.0] Git tag and some 1.2.0 changes missing in Git master?

Hi!

I noticed that the tag-based release listing of this repository does not show a release 1.2.0 while the listing on PyPI does. So I had a closer look and found that the latest release on PyPI has some tiny changes that I cannot find in Git history. Maybe that can be fixed? Am I missing something?

Thanks and best, Sebastian

diff -ur imagesize_py/README.rst imagesize-1.2.0/README.rst
--- imagesize_py/README.rst     2020-03-05 01:49:29.201029087 +0100
+++ imagesize-1.2.0/README.rst  2019-12-26 17:09:43.000000000 +0100
@@ -21,6 +21,12 @@
 * ``imagesize.get(filepath)``
 
   Returns image size (width, height).
+  ``get_from_bytes(bytes)`` is for bytes.
+
+* ``imagesize.getDPI(filepath)``
+
+  Returns DPI value.
+  ``getDPI_from_bytes(bytes)`` is for bytes.
 
 Benchmark
 ------------
@@ -83,4 +89,6 @@
 * Jon Dufresne (https://github.com/jdufresne)
 * Geoff Lankow (https://github.com/darktrojan)
 * Hugo (https://github.com/hugovk)
-
+* Jack Cherng (https://github.com/jfcherng)
+* Tyler A. Young (https://github.com/s3cur3)
+* Mark Browning (https://github.com/mabrowning)
diff -ur imagesize_py/setup.cfg imagesize-1.2.0/setup.cfg
--- imagesize_py/setup.cfg      2020-03-05 01:49:29.201029087 +0100
+++ imagesize-1.2.0/setup.cfg   2019-12-26 17:13:14.000000000 +0100
@@ -2,4 +2,9 @@
 universal = 1
 
 [metadata]
-license_file = LICENSE.rst
\ No newline at end of file
+license_file = LICENSE.rst
+
+[egg_info]
+tag_build = 
+tag_date = 0
+
diff -ur imagesize_py/setup.py imagesize-1.2.0/setup.py
--- imagesize_py/setup.py       2020-03-05 01:49:29.201029087 +0100
+++ imagesize-1.2.0/setup.py    2019-12-26 17:10:15.000000000 +0100
@@ -3,7 +3,7 @@
 from setuptools import setup
 
 setup(name='imagesize',
-      version='1.1.0',
+      version='1.2.0',
       description='Getting image size from png/jpeg/jpeg2000/gif file',
       long_description='''
 It parses image files' header and return image size.
@@ -13,6 +13,7 @@
 * JPEG2000
 * GIF
 * TIFF (experimental)
+* SVG
 
 This is a pure Python library.
 ''',
@@ -37,6 +38,7 @@
           'Programming Language :: Python :: 3.5',
           'Programming Language :: Python :: 3.6',
           'Programming Language :: Python :: 3.7',
+          'Programming Language :: Python :: 3.8',
           'Programming Language :: Python :: Implementation :: CPython',
           'Programming Language :: Python :: Implementation :: PyPy',
           'Topic :: Multimedia :: Graphics'

FAILED test/test_get_filelike.py::test_get_filelike - assert (-1, -1) == (1240, 826)

Facing below test failure.

E assert (-1, -1) == (1240, 826)
E At index 0 diff: -1 != 1240
E Use -v to get more diff

test/test_get_filelike.py:35: AssertionError
====================short test summary info ===============================================
FAILED test/test_get_filelike.py::test_get_filelike - assert (-1, -1) == (1240, 826)
=================1 failed, 44 passed in 1.75s =================================================

This test tries to get the image size and asserts that it equals (1240, 826).
But the image link used in the code at "https://github.com/shibukawa/imagesize_py/blob/master/test/test_get_filelike.py#L26" no longer exists. As a result the image size is returned as (-1,-1) and the assertion fails.

Include the test images in the PyPI release?

Hi,

Currently, the test files such as 'test.png' aren't included in the PyPI tarball.

Will you include the images used by the test suite in the PyPI release tarball?

Or do you prefer that packagers use the tarballs from GitHub?
