justinfx / fileseq

A Python library for parsing frame ranges.

License: Other

Python 100.00%

python vfx filesequence imagesequence

fileseq's Introduction

A Python library for parsing frame ranges and file sequences commonly used in VFX and Animation applications.

Frame Range Shorthand

Support for:

Standard: 1-10
Comma Delimited: 1-10,10-20
Chunked: 1-100x5
Filled: 1-100y5
Staggered: 1-100:3 (1-100x3, 1-100x2, 1-100)
Negative frame numbers: -10-100
Subframes: 1001-1066x0.25, 1001.5-1066.0x0.5
Padding: #=4 padded, @=1 padded
Alternate padding: #=1 padded, @=1 padded
Printf Syntax Padding: %04d=4 padded, %01d=1 padded
Houdini Syntax Padding: $F4=4 padded, $F=1 padded
Udim Syntax Padding: <UDIM> or %(UDIM)d, always 4 padded
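
The chunked, filled and staggered modifiers can be easier to follow as plain list operations. The sketch below is not fileseq's implementation. `expand` is a hypothetical helper, and it assumes that a filled range (`y`) yields the frames a chunked range (`x`) would skip, and that a staggered range (`:`) walks the steps from N down to 1 while keeping frames in first-seen order.

```python
def expand(start, end, step=1, mode="x"):
    """Expand an inclusive frame range using one shorthand modifier."""
    frames = list(range(start, end + 1))
    if mode == "x":  # chunked: every Nth frame
        return frames[::step]
    if mode == "y":  # filled: the frames a chunked range skips
        chunked = set(frames[::step])
        return [f for f in frames if f not in chunked]
    if mode == ":":  # staggered: steps N, N-1, ..., 1 without repeats
        seen, ordered = set(), []
        for s in range(step, 0, -1):
            for f in frames[::s]:
                if f not in seen:
                    seen.add(f)
                    ordered.append(f)
        return ordered
    raise ValueError("unknown modifier: {!r}".format(mode))


print(expand(1, 10, 3, "x"))  # [1, 4, 7, 10]
print(expand(1, 10, 3, "y"))  # [2, 3, 5, 6, 8, 9]
print(expand(1, 10, 3, ":"))  # [1, 4, 7, 10, 3, 5, 9, 2, 6, 8]
```

Under these assumptions, `expand(1, 100, 8, ":")` starts at 1 and ends at 98, which agrees with the `FrameSet("1-100:8")` example in the Access Frames section.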

FrameSets

A FrameSet wraps a sequence of frames in a list container.

Iterate a FrameSet

fs = fileseq.FrameSet("1-5")
for f in fs:
  print(f)

Access Frames

Using Indices:

>>> fs = fileseq.FrameSet("1-100:8")
>>> fs[0] # First frame.
1
>>> fs[-1] # Last frame.
98

Using Convenience Methods:

>>> fs = fileseq.FrameSet("1-100:8")
>>> fs.start() # First frame.
1
>>> fs.end() # Last frame.
98

FileSequence

Instantiate from String

fileseq.FileSequence("/foo/bar.1-10#.exr")
fileseq.FileSequence("/foo/bar.1-10x0.25#.#.exr", allow_subframes=True)

Format Path for VFX Software

Using FileSequence.format Method:

>>> seq = fileseq.FileSequence("/foo/bar.1-10#.exr")
>>> seq.format(template='{dirname}{basename}{padding}{extension}') 
"/foo/bar.#.exr"
>>> seq = fileseq.FileSequence("/foo/bar.1-10#.#.exr", allow_subframes=True)
>>> seq.format(template='{dirname}{basename}{padding}{extension}')
"/foo/bar.#.#.exr"

Joining:

>>> seq = fileseq.FileSequence("/foo/bar.1-10#.exr")
>>> ''.join([seq.dirname(), seq.basename(), '%0{}d'.format(len(str(seq.end()))), seq.extension()])
"/foo/bar.%02d.exr"

Alternate Padding Styles:

>>> seq = fileseq.FileSequence("/foo/bar.1-10#.exr", pad_style=fileseq.PAD_STYLE_HASH1)
>>> list(seq)
['/foo/bar.1.exr',
 '/foo/bar.2.exr',
 '/foo/bar.3.exr',
 '/foo/bar.4.exr',
 '/foo/bar.5.exr',
 '/foo/bar.6.exr',
 '/foo/bar.7.exr',
 '/foo/bar.8.exr',
 '/foo/bar.9.exr',
 '/foo/bar.10.exr']
>>> seq = fileseq.FileSequence("/foo/bar.1-10#.exr", pad_style=fileseq.PAD_STYLE_HASH4)
>>> list(seq)
['/foo/bar.0001.exr',
 '/foo/bar.0002.exr',
 '/foo/bar.0003.exr',
 '/foo/bar.0004.exr',
 '/foo/bar.0005.exr',
 '/foo/bar.0006.exr',
 '/foo/bar.0007.exr',
 '/foo/bar.0008.exr',
 '/foo/bar.0009.exr',
 '/foo/bar.0010.exr']

Get List of File Paths

>>> seq = fileseq.FileSequence("/foo/bar.1-10#.exr")
>>> [seq[idx] for idx, fr in enumerate(seq.frameSet())]
['/foo/bar.0001.exr',
 '/foo/bar.0002.exr',
 '/foo/bar.0003.exr',
 '/foo/bar.0004.exr',
 '/foo/bar.0005.exr',
 '/foo/bar.0006.exr',
 '/foo/bar.0007.exr',
 '/foo/bar.0008.exr',
 '/foo/bar.0009.exr',
 '/foo/bar.0010.exr']

Finding Sequences on Disk

Check a Directory for All Existing Sequences

seqs = fileseq.findSequencesOnDisk("/show/shot/renders/bty_foo/v1")

Check a Directory for One Existing Sequence.

Use a '@' or '#' where you might expect to use '*' for a wildcard character.
For this method, it doesn't matter how many instances of the padding character you use, it will still find your sequence.

Yes:

fileseq.findSequenceOnDisk('/foo/bar.@.exr')

Yes:

fileseq.findSequenceOnDisk('/foo/bar.@@@@@.exr')

No:

fileseq.findSequenceOnDisk('/foo/bar.*.exr')
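
One way to picture why the number of padding characters does not matter is to treat any run of padding characters as a single wildcard before matching file names. The sketch below only illustrates that idea with the standard library. `padding_to_glob` is a hypothetical helper, not part of fileseq, and the file names are made up.

```python
import fnmatch
import re


def padding_to_glob(pattern):
    """Collapse each run of '#' or '@' into a single '*' wildcard."""
    return re.sub(r"[#@]+", "*", pattern)


# Made-up directory listing used only for illustration.
names = ["bar.0001.exr", "bar.0002.exr", "baz.0001.exr", "bar.0001.jpg"]

for pattern in ("bar.@.exr", "bar.@@@@@.exr", "bar.#.exr"):
    glob_pattern = padding_to_glob(pattern)
    print(pattern, "->", glob_pattern, fnmatch.filter(names, glob_pattern))
```

All three patterns reduce to `bar.*.exr` and select the same two files.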

To find subframe sequences you must explicitly opt-in

fileseq.findSequenceOnDisk('/foo/bar.#.#.exr', allow_subframes=True)

Limitations

While there may be many custom types of sequence patterns that could be considered a valid pipeline format, this library has taken an opinionated stance on acceptable sequence formats. This is done to keep parsing rules manageable and to not over-complicate the logic. The parsing rules can be, and have been, expanded in some ways over time, such as adding support for new padding format patterns like printf "%04d", houdini "$F" and udim "<UDIM>". But other rules remain the same, such as expecting a frame number component to be found just before the file extension component.

Language Ports


fileseq's Issues

findSequenceOnDisk doesn't work on windows - regex needs to be updated to deal with backslashes

Here is an updated regex for DISK_PATTERN that fixes the issue on Windows.

# Regular expression pattern for matching file names on disk.
DISK_PATTERN = r"^(.*[/\\])?(?:$|(.*?)(-?\d+)?(?:(\.[^.]*$)|$))"
DISK_RE = re.compile(DISK_PATTERN)

You use os.path.join, which injects a backslash into the path (even if the original path has only forward slashes). The regex chokes on it, resulting in a wrong basename(), which in turn breaks the comparison in findSequenceOnDisk.

This is never true on Windows:

for match in matches:
    if match.basename() == basename and match.extension() == ext:
        return match
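
The effect of the proposed pattern can be checked in isolation with the standard `re` module. The snippet below uses the DISK_PATTERN quoted in this issue. The sample paths are made up for illustration.

```python
import re

# Pattern quoted from the issue; the class [/\\] accepts both separators.
DISK_PATTERN = r"^(.*[/\\])?(?:$|(.*?)(-?\d+)?(?:(\.[^.]*$)|$))"
DISK_RE = re.compile(DISK_PATTERN)

samples = (
    "/foo/bar.0001.exr",      # forward slashes only
    "C:\\foo\\bar.0001.exr",  # backslashes only
    "C:/foo\\bar.0001.exr",   # mixed, as os.path.join can produce on Windows
)

for path in samples:
    dirname, basename, frame, ext = DISK_RE.match(path).groups()
    print(repr(dirname), repr(basename), repr(frame), repr(ext))
```

In each case the directory group ends at the last separator of either kind, so the basename comes out as `bar.` rather than including part of the directory.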

fails to install in python3

pip3 install fileseq

Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\bla\AppData\Local\Temp\pip-build-x7q1yu73\fileseq\setup.py", line 9, in <module>
    execfile(os.path.join(here, "src/fileseq/version.py"))
NameError: name 'execfile' is not defined
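
`execfile` was removed in Python 3. A common replacement that works on both Python 2 and 3 is to read the file and pass it to `exec`. The sketch below is one possible approach, not necessarily the fix applied to fileseq's setup.py. `load_version` is a hypothetical helper, and it assumes the version file defines a `__version__` name.

```python
import os
import tempfile


def load_version(path):
    """Execute a version file and return its __version__ value."""
    namespace = {}
    with open(path) as fh:
        exec(compile(fh.read(), path, "exec"), namespace)
    # Assumption: the file assigns a module-level __version__.
    return namespace["__version__"]


# A throwaway file stands in for src/fileseq/version.py.
tmp_dir = tempfile.mkdtemp()
version_file = os.path.join(tmp_dir, "version.py")
with open(version_file, "w") as fh:
    fh.write('__version__ = "0.0.0"\n')

print(load_version(version_file))
```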

Sequences with mixed padding are treated as if padding was consistent

Hello,
I ran into a bit of an issue today while testing some edge cases.
I have files on disk where padding isn't consistent (some files are padded ##, others ###).
While only one file is ## padded, the resulting FileSequence is also ## padded.

list_of_files = ["/test_data/photo.004.jpg",
                 "/test_data/photo.08.jpg",
                 "/test_data/photo.009.jpg",
                 "/test_data/photo.015.jpg"]

seqs = fileseq.findSequencesInList(list_of_files)
for seq in seqs:
    for path in seq:
        print(path)

Result:

/test_data/photo.04.jpg
/test_data/photo.08.jpg
/test_data/photo.09.jpg
/test_data/photo.15.jpg

I'm not too sure what the behavior should have been. Originally I was using findSequencesOnDisk, and thought using strict padding would have seen this as 2 sequences rather than 1, but that has no effect on the result.
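
To illustrate the "2 sequences rather than 1" expectation, the sketch below groups file names by the width of their frame number, using only the standard library. It is not how fileseq groups files. `group_by_padding` and its regex are hypothetical, and the regex assumes a "basename.frame.ext" layout.

```python
import re
from collections import defaultdict

# Assumed layout: <basename ending in a dot><digits><extension>.
FRAME_RE = re.compile(r"^(?P<base>.*?\.)(?P<frame>\d+)(?P<ext>\.[^.]+)$")


def group_by_padding(paths):
    """Group frame numbers by (basename, extension, frame width)."""
    groups = defaultdict(list)
    for path in paths:
        m = FRAME_RE.match(path)
        if m:
            key = (m.group("base"), m.group("ext"), len(m.group("frame")))
            groups[key].append(int(m.group("frame")))
    return dict(groups)


list_of_files = ["/test_data/photo.004.jpg",
                 "/test_data/photo.08.jpg",
                 "/test_data/photo.009.jpg",
                 "/test_data/photo.015.jpg"]

for key, frames in sorted(group_by_padding(list_of_files).items()):
    print(key, frames)
```

A frame number wider than its padding (for example 1000 in a three-digit sequence) would need extra handling that this sketch leaves out.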

FileSequence unicode printing error in python2

When printing a FileSequence object containing unicode, or explicitly converting it to a string, under python2 an exception is raised.

#!/usr/bin/env python2
# -*- coding: utf-8 -*-

import fileseq

highlight_list = [u'file_カ_Z.01.txt']
ret = fileseq.findSequencesInList(highlight_list)[0]

print(ret)

$ python2 seq_test.py
Traceback (most recent call last):
  File "seq_test.py", line 9, in <module>
    print(ret)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u30ab' in position 5: ordinal not in range(128)

The FileSequence class needs to have a __unicode__() method implemented and then the __str__() method would call that and encode the result before returning it. Since we are already using python-future, we can use an automatic decorator:
https://python-future.org/compatible_idioms.html#custom-str-methods

@futils.python_2_unicode_compatible
class FileSequence(object):

Add is_consecutive() to FrameSet class

Hi everyone!

We are trying to find a way to know if there are "holes" inside a given FrameSet.

The only way we have right now is to unroll the whole FrameSet and compare it to another FrameSet generated with start() and end():

>>> import fileseq
>>> fs = fileseq.FrameSet("410-470x30")
>>> set(list(fs)) == set(list(fileseq.FrameSet("{}-{}".format(fs.start(), fs.end()))))
False  # there are holes between start and end!

Maybe is_truncated() is not a good name, but you get the idea. What do you think about adding such a method?

Regards,

Dorian
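
The check described above does not need a second FrameSet: for integer frames, the set is free of holes exactly when the number of unique frames equals the span between the smallest and largest frame. The sketch below works on any iterable of integers. `is_consecutive` here is a hypothetical free function, not an existing FrameSet method.

```python
def is_consecutive(frames):
    """Return True if the integer frames have no holes between min and max."""
    unique = set(frames)
    if not unique:
        return True  # an empty set has no holes
    return len(unique) == max(unique) - min(unique) + 1


print(is_consecutive([410, 440, 470]))  # False
print(is_consecutive(range(1, 6)))      # True
```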

return padding format string

I'm using https://github.com/kkroening/ffmpeg-python in combination with fileseq to encode hundreds of folders into mp4 for posting. Because my artists used 4 padding I'm good but I was thinking it would be great if you could also output the original basepadding as a strf formatted attribute in the template attributes. Eg. a sequence that is padded with %04d would output a seqformat string of %04d so when I pipe it to another command it properly references the original sequence.
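
Until such an attribute exists, the padding characters can be converted by hand. The sketch below assumes the default padding widths listed in the README (# is 4 digits, @ is 1 digit). `padding_to_printf` is a hypothetical helper, not part of fileseq.

```python
# Widths taken from the README's default padding style.
PAD_WIDTHS = {"#": 4, "@": 1}


def padding_to_printf(padding):
    """Convert padding characters such as '#' or '@@@' to a printf token."""
    width = sum(PAD_WIDTHS[ch] for ch in padding)
    return "%0{}d".format(width)


print(padding_to_printf("#"))      # %04d
print(padding_to_printf("@@@"))    # %03d
print(padding_to_printf("#") % 7)  # 0007
```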

Add TravisCI integration.

Add Travis yaml file.
Lower verbosity setting.
Add badge to README.

findSequencesOnDisk() now breaks with non-sequence files mixed in same directory

Reported by @matthewkapfhammer

Reproduction:

test/
    file.json
    seq.0001.ext
    seq.0002.ext

$ fileseq.findSequencesOnDisk("test")
ValueError: invalid literal for int() with base 10: ''

The problem is that because the DISK_RE pattern was recently updated, and there aren't adequate tests for findSequencesOnDisk, an unchecked frames list containing an empty string is being returned and tries to get converted into an int. Need to skip these cases in the loop.

Add the README information as intro text to the API documentation

The generated API documentation does not contain enough information about the general behaviour of the fileseq library, such as the padding character definitions. We should merge the README into the API docs.

Multiple digit components confuses findSequenceOnDisk()

Repro:

mkdir test_seq && cd test_seq 
loop -t 1-9 "touch foofile_#.0001.exr"

fileseq.findSequenceOnDisk('/tmp/test_seq/foofile_#.0001.exr')

filesequence.py in findSequenceOnDisk(cls, pattern, strictPadding)
    699 
    700         msg = 'no sequence found on disk matching {0}'
--> 701         raise FileSeqException(msg.format(pattern))
    702 
    703     @classmethod

FileSeqException: no sequence found on disk matching /tmp/test_seq/foofile_#.0001.exr

Missing frame token in FileSequence string representation when it has no range

HI,

If I create a FileSequence without specifying the frame range, or set an existing FileSequence’s range to the empty string, and then get its string representation, the following happens:

import fileseq
seq = fileseq.FileSequence('/path/to/frames.1-5@@@.exr')
print(seq)
print(str(fileseq.FileSequence('/path/to/frames.@@@.exr')))
seq.setFrameRange('')
print(seq)

/path/to/frames.1-5@@@.exr
/path/to/frames..exr
/path/to/frames..exr

Is this the expected behaviour?

Thanks,
Unai

Support file extensions with multiple dot-components

Refs justinfx/gofileseq#10

File sequence parsing does not support file extension formats that have multiple dot-components. Examples are:

seq_archive.1-10#.tar.gz
SimCache.100-200#.bgeo.sc

Current implementation would see the extension as ".gz" or ".sc" and drop the prefix components.
For sequences that have frames, we can assume that everything after the frame number or padding is the extension. We expect to parse ".tar.gz" and ".bgeo.sc".

For path patterns that do not contain a frame or padding, we can only assume the extension is the last dot-components: "file.with.multiple.info.ext" -> ".ext"
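
The two rules above can be sketched with the standard library. This is only an illustration of the proposal, not the parser fileseq uses. `split_extension` is a hypothetical helper, and it only looks for padding characters (# or @), not for bare frame numbers.

```python
import os
import re

PADDING_RE = re.compile(r"[#@]+")


def split_extension(name):
    """Return the extension of a sequence pattern or a plain file name."""
    runs = list(PADDING_RE.finditer(name))
    if runs:
        # Everything after the last padding run is treated as the extension.
        return name[runs[-1].end():]
    # No padding: fall back to the last dot-component only.
    return os.path.splitext(name)[1]


print(split_extension("seq_archive.1-10#.tar.gz"))     # .tar.gz
print(split_extension("SimCache.100-200#.bgeo.sc"))    # .bgeo.sc
print(split_extension("file.with.multiple.info.ext"))  # .ext
```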

Docstring rtype is often wrong

Looks like there are many rtype when there should be type. Here is an example: https://github.com/sqlboy/fileseq/blob/e09461ae9a46860ee5c9c34f43fc9c5ca8caac87/src/fileseq/filesequence.py#L602

Param docstring is:

        :param pattern: the sequence pattern being searched for
        :rtype: str
        :param strictPadding: if True, ignore files with padding length different from pattern
        :rtype: bool
        :raises: :class:`fileseq.exceptions.FileSeqException` if no sequence is found on disk

Should be:

        :param pattern: the sequence pattern being searched for
        :type pattern: str
        :param strictPadding: if True, ignore files with padding length different from pattern
        :type strictPadding: bool
        :return: `FileSequence` instance matching the given `pattern`
        :rtype: :obj:`FileSequence`
        :raises: :class:`fileseq.exceptions.FileSeqException` if no sequence is found on disk

rtype documents the "returned type". type my_param documents the type of my_param.

Update utils.asString() to prevent derived bytes types from slipping past type check (py2)

This is an issue that only affects python2, which is still supported for legacy reasons.

Types that are derived from bytes on python2 are able to get past the checks in utils.asString():

fileseq/src/fileseq/utils.py

Lines 323 to 325 in 798c602

    
    elif isinstance(obj, bytes):
        if not futils.PY2:
            obj = obj.decode(FILESYSTEM_ENCODING)

This leads to the derived type making its way into FileSequence parsing and breaking things during string splitting when it potentially uses its own string semantics (ie removing trailing slashes from a dirname components).

We should patch this function to ensure all derived bytes types are converted to bytes:

    elif isinstance(obj, bytes):
        if futils.PY2:
            obj = bytes(obj)
        else:
            obj = obj.decode(FILESYSTEM_ENCODING)

FrameSet.padFrameRange() uses incorrect padding width for negative frames

FrameSet.padFrameRange() is using an incorrect padding width for negative numbers:

fileseq.padFrameRange('-3-1', 2)
# '-03-01'

It is considering the number without its leading hyphen. This does not align with other sources such as Nuke or python zfill, where the output would be '-3-01'.
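
The zfill behaviour referred to above counts the sign as part of the width. The sketch below shows the expected output for a simple "start-end" range using only `str.zfill`. `pad_range` is a hypothetical helper for illustration and handles just a single frame or a single range without steps.

```python
import re

# One frame, or "start-end" where either side may be negative.
RANGE_RE = re.compile(r"^(-?\d+)(?:-(-?\d+))?$")


def pad_range(frange, width):
    """Zero-pad each frame of a simple range; the sign counts toward width."""
    m = RANGE_RE.match(frange)
    if not m:
        raise ValueError("unsupported range: {!r}".format(frange))
    parts = [p.zfill(width) for p in m.groups() if p is not None]
    return "-".join(parts)


print("-3".zfill(2))         # -3
print(pad_range("-3-1", 2))  # -3-01
print(pad_range("1-10", 4))  # 0001-0010
```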

findSequenceOnDisk not honouring pad style

Hi,

Whenever I use findSequenceOnDisk on a path with hash marks, strict padding and PAD_STYLE_HASH1, the resulting sequence’s pad style is PAD_STYLE_HASH4 and its zfill() returns 4 times as much padding as it should have. See examples below:

import fileseq
seq = fileseq.FileSequence('/path/to/frames.1001-1024####.exr', pad_style=fileseq.PAD_STYLE_HASH1)
print(seq.padStyle() == fileseq.PAD_STYLE_HASH1, seq.zfill())
seq = fileseq.findSequenceOnDisk('/path/to/frames.####.exr', strictPadding=True, pad_style=fileseq.PAD_STYLE_HASH1)
print(seq.padStyle() == fileseq.PAD_STYLE_HASH4, seq.zfill())

(True, 4)
(True, 16)

Cheers,
Unai

FileSequence can be serialised and deserialised into different sequences

Depending on how you construct a FileSequence, you can generate a sequence that cannot be correctly deserialised into an equivalent object.

For example

>>> import fileseq
>>> fs = fileseq.FileSequence("myfile01.ext")
>>> fs.basename()
'myfile01'
>>> fs.setFrameRange("10-20")
>>> fs.start()
10
>>> fs.end()
20
>>> fs.setPadding("#")
>>> str(fs)
'myfile0110-20#.ext'
>>> fs2 = fileseq.FileSequence(str(fs))
>>> str(fs)
'myfile0110-20#.ext'
>>> fs2.basename()
'myfile'
>>> fs2.start()
110
>>> fs2.end()
20
>>> str(fs) == str(fs2)
True
>>> fs.frameSet() == fs2.frameSet()
False

Since the changes made for #47, it should only affect sequences where the basename ends in a digit or a dash.

Given there are a number of entry points for setting values, and a number of exit points for formatting, I'm not sure of the best place to enforce validation.

ParseException for partially matching frame range pattern

$ fs = fileseq.FileSequence("file_1-@.jpg")
ParseException: Failed to parse FileSequence: file_1-@.jpg

I'm not entirely sure if this should be a ParseException, or if the regular expression needs to be tightened up to avoid matching on a partial frame range. My thinking is that it should be treated more along the lines of how this would succeed in parsing...

$ fs = fileseq.FileSequence("[email protected]")

Fails if there's anything in-between frame-range and extension

Last version of the master branch:

The fileseq.findSequenceOnDisk method has trouble finding a sequence for a file sequence such as:
img#_suffix.jpg
will not be recognized. Even a single character after the frame-range specifier will make it fail, e.g:
img#_.jpg
Removing the suffix makes it work again:
img#.jpg

findSequenceOnDisk() returns wrong results with similar naming in same dir

As of fileseq 1.0.0, the modified logic in findSequenceOnDisk() has a bug where it returns the wrong sequence if there are multiple sequences in the same directory with slightly different endings to their basenames...

somedir/
    frames.1-100#.exr
    frames.foo.1-100#.exr

And calling like this:

findSequenceOnDisk("somedir/frames.#.exr")
# somedir/frames.foo.1-100#.exr

The reason for this is that the function uses a wildcard glob for the basename prefix, and doesn't check that the basename exactly matches the result before choosing to return it. The previous implementation did have this check:
https://github.com/sqlboy/fileseq/blob/v0.5.1/src/fileseq/all.py#L541

Update package on PyPi

Hey guys, can you please update the package on PyPi because it is rather outdated: https://pypi.python.org/pypi/Fileseq/1.1.5

Python2 exception if byte-encoded unicode is passed to FileSequence constructor

If a unicode string is encoded to bytes and then passed into a FileSequence constructor, the str/repr handling will fail in Python2

    def testStrUnicode(self):
        ret = FileSequence(u'file_カ_Z.01.txt')
        # make sure none of these raise a unicode exception
        _ = str(ret)
        _ = text_type(ret)
        _ = repr(ret)
        _ = ret.format()

        if PY2:
            # only test this in py2 since we don't accept bytes
            # in python3+
            ret = FileSequence(u'file_カ_Z.01.txt'.encode('utf-8'))
            # make sure none of these raise a unicode exception
            _ = str(ret)
            _ = text_type(ret)
            _ = repr(ret)
            _ = ret.format()

    def __str__(self):
        """
        String representation of this :class:`FileSequence`.
    
        Returns:
            str:
        """
        frameSet = utils.asString(self._frameSet or "")
        return "".join((
            self._dir,
            self._base,
            frameSet,
            self._pad if frameSet else "",
>           self._ext))
E       UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 5: ordinal not in range(128)

Need to ensure that __str__ returns a unicode object as needed, if python2

Refs #99 as some of the initial work on python2 unicode improvements

findSequenceOnDisk is case sensitive on windows

os.path on windows is case insensitive.
os.path.exists(some_file) returns True regardless of case
findSequenceOnDisk uses glob, which is case sensitive and returns no files if the case is incorrect. This leads to awkward code where you check that the file exists and, when it does, try to get the sequence, which raises an exception saying there are no files.

findSequenceOnDisk() failing on this particular sequence

So I have a sequence that findSequenceOnDisk () seems to fail on:

touch ./A001C009_220324BV_924360.tga
touch ./A001C009_220324BV_924361.tga
touch ./A001C009_220324BV_924362.tga

Then in python:

import fileseq
fileseq.findSequenceOnDisk('./A001C009_220324BV_924360.tga').frameRange() #returns empty
''

Can anybody please confirm ?

Handling of a single frame input path has changed in 1.0.0

The behaviour in parsing and handling single frame paths has changed between 0.5.1 and 1.0.0

src = "/path/to/file_v2.exr"
fs = fileseq.FileSequence(src)
print fs.start(), fs.end(), fs.padding(), str(fs)

#0.5.1
(0, 0, '', '/path/to/file_v2.exr')

#1.0.0
(2, 2, '@', '/path/to/file_v2@.exr')

I feel it is incorrect to interpret this pattern as a sequence, when it is really a version. I believe the old pattern handling required a naming scheme like this for seqs: "basename.frames.ext"

This change has caused some unexpected bugs when the str() rep, padding, and range values suddenly changed between versions.

Get frame ranges

Hi,

some exporters need a start and an end frame as parameters, so I need a method that can give me a list of start-end pairs from a FrameSet.
Something like:

>>> f = fileseq.FrameSet("1,5,7,8,10-13,14")
>>> f.frameRanges()
[(1, 1), (5, 5), (7, 8), (10, 14)]

Is there a method for this?
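
The grouping asked for here can be done on any iterable of integer frames by merging each frame into the previous pair when it is exactly one past that pair's end. The sketch below is a standalone illustration. `frame_ranges` is a hypothetical function, not an existing FrameSet method.

```python
def frame_ranges(frames):
    """Collapse integer frames into sorted (start, end) pairs of consecutive runs."""
    pairs = []
    for f in sorted(set(frames)):
        if pairs and f == pairs[-1][1] + 1:
            # Extend the current run by one frame.
            pairs[-1] = (pairs[-1][0], f)
        else:
            # Start a new run.
            pairs.append((f, f))
    return pairs


# The frames described by "1,5,7,8,10-13,14".
print(frame_ranges([1, 5, 7, 8, 10, 11, 12, 13, 14]))
# [(1, 1), (5, 5), (7, 8), (10, 14)]
```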

Parsing fails on single files that end in a extension containing a number

As of fileseq 1.0, it appears that a single file path will fail to parse, if the extension contains a number:

$ fs = fileseq.FileSequence("file.mp4")
$ print fs
TypeError: sequence item 4: expected string, NoneType found

The extension field doesn't parse, and causes __str__ to fail

Add FileSequence.getPaddingNum method

Add a method to get the padding number back from padding characters.

OverflowError with filenames with long numbers in it.

I'm (perhaps stupidly) using it when iterating through our file-system using a combination of os.walk and fileseq.findSequencesInList, and there are a couple of edge-cases that are causing it to error out with: OverflowError: Python int too large to convert to C long

Obviously I can catch it, but I'm wondering if should be handled internally.

Case 1:
A directory with the following splendidly named files:

-4780394256886551358.tmp
-5210526082881518455.tmp
-5413498426633446015.tmp

(Not quite sure where they are coming from, but I'm suspecting aspera). The drag is that this was in a directory with some sequences that I would like to pick up.

Case 2:

thumbcanon___063078065293.png
thumbcanon___063078065294.png
thumbcanon___242076169693.png
thumbcanon___242076169694.png
thumbcanon___242076169695.png

In this case these numbers are actually serial numbers.

Thanks for the work on this...

Sam.

Python future division can cause getPaddingChars() to break

If the python interpreter is using future division (int / int == float), then getPaddingChars() can break when it tries to multiply characters by a number.
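
The failure mode is that true division returns a float, and a string cannot be repeated a float number of times. Floor division (`//`) keeps the count an integer under both division modes. The sketch below assumes the README's default widths (# is 4 digits, @ is 1 digit). This `get_padding_chars` is a hypothetical stand-in, not the library's function.

```python
from __future__ import division  # force true division, as in the report


def get_padding_chars(num):
    """Return padding characters for a width, preferring '#' for multiples of 4."""
    if num > 0 and num % 4 == 0:
        return "#" * (num // 4)  # floor division keeps the count an int
    return "@" * num


try:
    "#" * (4 / 4)  # 4 / 4 is 1.0 under true division
except TypeError as exc:
    print("true division breaks repetition:", exc)

print(get_padding_chars(4))  # #
print(get_padding_chars(8))  # ##
print(get_padding_chars(3))  # @@@
```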

Add support for per frame meta data.

findSequenceOnDisk with mixed case on Windows

Hello,

This issue is most likely related to #74

I've had a strange log on a production that was producing the following error:

Traceback (most recent call last):
  File "test.py", line 6, in <module>
    result = filesequence.FileSequence.findSequenceOnDisk(pattern)
  File "path\to\1.12.3\fileseq\filesequence.py", line 1053, in findSequenceOnDisk
    globbed, using=seq, pad_style=pad_style, allow_subframes=allow_subframes
  File "path\to\1.12.3\fileseq\filesequence.py", line 715, in yield_sequences_in_list
    for path in filter(None, map(utils.asString, paths)):
  File "path\to\1.12.3\fileseq\filesequence.py", line 1124, in _filterByPaddingNum
    frame = get_frame(item) or ''
  File "path\to\1.12.3\fileseq\filesequence.py", line 1031, in <lambda>
    get_frame = lambda f: re.match(patt, f).group(1)
AttributeError: 'NoneType' object has no attribute 'group'

It turns out the error comes from an incorrect file sequence, that was created with two different cases. I have recreated it with a simple example:

Then I execute the following code:
(Note that I'm doing the query with the lowercase toto)

from fileseq import filesequence

pattern = r"C:\Users\hbaldzuhn\Documents\test\test_file_seq\super_toto_#.txt"
result = filesequence.FileSequence.findSequenceOnDisk(pattern)
print("result: {}".format(result))

This triggers the error.

Note that when using fileseq-1.9.0, which was on another production environment, the issue does not occur.

fileseq-1.9.0

result: C:\Users\hbaldzuhn\Documents\test\test_file_seq\super_toto_1-10@@@.txt

Linux

On linux, the command does not fail, but the resulting sequence is only between 1-6

Note

With both versions, on windows, findSequencesOnDisk() on the folder shows the 2 sequences:

C:\Users\hbaldzuhn\Documents\test\test_file_seq\super_toto_1-6@@@.txt
C:\Users\hbaldzuhn\Documents\test\test_file_seq\super_TOTO_7-10@@@.txt

The question about case sensitivity (seen on the other issue) is a subject on itself, but the behavior should be consistent.

Here, the error goes deep to the regex check, which is very hard to debug

findSequenceOnDisk() doesn't check the ext

When using findSequenceOnDisk() against a directory that contains multiple formats of the same sequence, only the first one matching the basename will be selected and returned.

For instance, given:

/path/to/images.#.exr
/path/to/images.#.jpg

A call to findSequenceOnDisk("/path/to/images.#.jpg") will return the exr sequence. It should probably also compare the extension, in addition to the basename.

DeprecationWarning: object() takes no parameters

On a recent upgrade of pytest, it started returning this warning being emitted from fileseq: DeprecationWarning: object() takes no parameters. This was from fileseq-1.7.1 running under python-2.7.15.

To reproduce

/virtualenv/bin/python -Wd
Python 2.7.15 (default, Feb 27 2019, 15:50:41)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import fileseq
>>> fileseq.FrameSet("1-10")
~/virtualenv/lib/python2.7/site-packages/fileseq/frameset.py:83: DeprecationWarning: object() takes no parameters
  self = super(cls, FrameSet).__new__(cls, *args, **kwargs)
FrameSet("1-10")

FrameSet.frange() docstring inconsistency

fileseq/src/fileseq/frameset.py

Line 262 in 2a63a9d

frozenset:

The docstring says frange returns a frozenset, but for all the tests I'm doing I can only get it to return a string, and the code seems to intend it to return a string.
Should the docstring be updated?

Slicing FileSequences is not supported

>>> fs = fileseq.FileSequence('/my/path/image.#.jpg')

>>> fs.setFrameRange('1-12')

>>> fs[0]
'/my/path\\image.0001.jpg'

>>> fs[1]
'/my/path\\image.0002.jpg'

>>> fs[-1]
'/my/path\\image.0012.jpg'

>>> fs[len(fs)-2]
'/my/path\\image.0011.jpg'

>>> fs[1:]
Traceback (most recent call last):

  File "<ipython-input-27-8cd8dd826378>", line 1, in <module>
    fs[1:]

  File "C:\Users\jordan\AppData\Local\Continuum\anaconda2\lib\site-packages\fileseq\filesequence.py", line 362, in __getitem__
    return self.index(idx)

  File "C:\Users\jordan\AppData\Local\Continuum\anaconda2\lib\site-packages\fileseq\filesequence.py", line 336, in index
    return self.frame(self._frameSet[idx])

  File "C:\Users\jordan\AppData\Local\Continuum\anaconda2\lib\site-packages\fileseq\filesequence.py", line 313, in frame
    zframe = str(int(frame)).zfill(self._zfill)

TypeError: int() argument must be a string or a number, not 'tuple'

handling of stereo formatting

long time user, first time caller!

Is there any plan on handling stereo or view based formatting? nuke uses %v or %V and maybe there are ways I don't know about? not sure if this got standardized along the way.

fsd = fileseq.findSequenceOnDisk(self._filename)
File "fileseq/filesequence.py", line 493, in findSequenceOnDisk
raise FileSeqException(msg.format(pattern))
fileseq.exceptions.FileSeqException: no sequence found on disk matching comp_v01-%v.0001-0464#.jpg

Removing cross platform tests

I have issues with the cross platform tests:

test_unit.TestFindSequencesOnDisk.testCrossPlatformPathSep
test_unit.TestFindSequenceOnDisk.testCrossPlatformPathSep

The tests are "mocking" os.path which breaks the behavior of os.walk on Linux (python 2.7).
Since the tests are done on both Windows and Linux, I don't see the point of keeping these tests.
Or maybe there is something I didn't grasp?

sky.exr

The "fill" pattern y. is causing a file called sky.exr to be treated as a sequence.

FrameSet constructor raises different exceptions between releases

When the FrameSet constructor is passed a sequence containing bad values (strings for instance), we see different exceptions between v1.10.0 and v1.11.0

Repro

import fileseq
fileseq.FrameSet(["a", "b"])

v1.10.0

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File ".../fileseq/frameset.py", line 128, in __init__
    self._order = tuple(order)
  File ".../fileseq/utils.py", line 119, in <genexpr>
    return (i for i in chain(*iterables) if i not in seen and not _add(i))
ValueError: invalid literal for int() with base 10: 'a'

v1.11.0

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File ".../fileseq/frameset.py", line 139, in __init__
    self._frange = self.framesToFrameRange(
  File ".../fileseq/frameset.py", line 1375, in framesToFrameRange
    ret = ','.join(FrameSet.framesToFrameRanges(frames, zfill))
  File ".../fileseq/frameset.py", line 1262, in framesToFrameRanges
    curr_stride = abs(curr_frame-curr_start)
TypeError: unsupported operand type(s) for -: 'str' and 'str'

In either case the proper exception should really be a fileseq.exceptions.ParseException just like that which is raised when passing a bad single string arg value. We should add the missing tests for bad sequence args, as strings, and ensure it raises the same exception.

Support for <UDIM> token?

Hello.

I have some need to support <UDIM> as a token for the frame padding, which should be interpreted as %04d.

Is that something you could see being useful as part of the library? If so I could try to implement and make a pull request, if not I might just do my own branch.

Find sequences of folders

Hello!

I wonder if it would be possible to find sequences of folders as it is possible to find sequences of files.

I managed to make it work by modifying the FileSequence.findSequencesOnDisk method.

# Get just the immediate files under the dir.
# Avoids testing the os.listdir() for files as
# a second step.
ret = next(os.walk(dirpath), None)

files = ret[-1] if ret else []

if allow_folders:
    files += ret[-2] if ret else []

Is there a good reason that would make this feature irrelevant and/or difficult to code/maintain?

Thanks in advance for your answers.

Error in DISK_PATTERN regex

The on-disk regex pattern fails for the following case:

>>> import fileseq, re
>>> pattern = re.compile(fileseq.all._ON_DISK_PATTERN)
>>> pattern.match('foo.r1.bar/fbx').groups()
(None, 'foo.r', '1', '.bar/fbx')

This causes errant values to be introduced in the findSequencesOnDisk() method, ultimately resulting in a parsing exception.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "fileseq/all.py", line 357, in findSequencesOnDisk
    result.append(FileSequence(seq))
  File "fileseq/all.py", line 167, in __init__
    self.__frameSet = FrameSet(m.group(3))
  File "fileseq/all.py", line 79, in __init__
    raise ParseException("Failed to parse frame range: %s on part '%s'" % (frange, part))
fileseq.all.ParseException: Failed to parse frame range: x on part 'x'

findSequenceOnDisk() should try to observe the provided padding chars when matching

When findSequenceOnDisk() is given a sequence such as "/path/to/foo.@@@@@.ext", it really just treats the padding as a single wildcard like "/path/to/foo.*.ext". This will work fine in cases where directories observe conventions of not containing mixed sequences. Even having mixed sequences in the same directory would be fine, but in cases where there are two sequences named exactly the same but using different frame padding, the results will not reflect the one that matches the originally specified padding.

Example:

/path/to/foo.1-100@@@@@.ext
/path/to/foo.1-100#.ext

findSequenceOnDisk("/path/to/foo.#.ext")
# /path/to/foo.1-100@@@@@.ext

FileSequence.basename cut at the wrong place with still frames

I use findSequencesOnDisk to retrieve my sequences, it also returns the files without padding as I want it. But with the still frames, the basename methods cut the name at the first point, not the last.

I have this kind of filename: PROJECT_010_110_CAT.RGB_color.exr
The basename method returns: PROJECT_010_110_CAT.
The extension method returns: .RGB_color.exr

With a filename with padding: PROJECT_010_110_CAT.RGB_color.0042.exr
The basename returned is: PROJECT_010_110_CAT.RGB_color.
And the extension is: .exr

findSequencesOnDisk should not find hidden files by default

Recent changes to yield_sequences_in_list have started to expose hidden files missed by earlier versions of fileseq. This changes the behavior of findSequencesOnDisk, however, which would not previously return hidden files, even if they existed in the directory being searched. Showing hidden files seems appropriate as an optional behavior, specifically asked for, rather than the default one.

FileSequence formatting breaks if setDirname path is missing trailing path sep

If FileSequence.setDirname(path) is called with a path value that is missing the trailing path separator, the subsequent formatting of the sequence will be joined in a broken way; the class assumes the dirname always ends in a path sep.

import fileseq

s = fileseq.FileSequence("/foo/bar/baz.1-10#.ext")
s.setDirname("/foo")
print s
# /foobaz.1-10#.ext
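
A caller-side workaround is to normalise the directory before passing it in. The sketch below is a standalone illustration. `ensure_trailing_sep` is a hypothetical helper, and appending `os.sep` is only one possible policy.

```python
import os


def ensure_trailing_sep(dirname):
    """Append the platform separator unless the path already ends in one."""
    if dirname and not dirname.endswith(("/", "\\")):
        dirname += os.sep
    return dirname


print(ensure_trailing_sep("/foo") + "baz.1-10#.ext")
print(ensure_trailing_sep("/foo/") + "baz.1-10#.ext")
```

With this in place, a call such as `s.setDirname(ensure_trailing_sep("/foo"))` would hand the class a dirname in the form it expects.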

Multiple step sizes lost when calling intersection()

I would expect the code below to print FrameSet("1-100x25,1-100x1"), as both FrameSets are identical, but it prints FrameSet("1-100"). Technically, both FrameSets (fs and fs2) contain the same frames, but FrameSet.order is different for each.

from fileseq import FrameSet
fs = FrameSet("1-10x3,1-10x1")
fs2 = fs.intersection(fs)
# This will be "1-10", not "1-10x3,1-10x1".
print fs2
# Note the difference in the order between the two.
print fs.order
print fs2.order

This seems to happen because FrameSet.items is a frozenset that is intersected with another frozenset, then converted back to a frame range. This could be the case for other functions, but intersection() was my use case. I am currently working around the issue by splitting one of the FrameSet's by ',', intersecting one by one, and joining them back together. Any chance for a fix?
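
An order-preserving intersection can be sketched on plain lists: walk the first operand in its original order and keep the frames that also appear in the second. This is only an illustration of the expected behaviour, not a patch for FrameSet. `ordered_intersection` is a hypothetical function.

```python
def ordered_intersection(first, second):
    """Intersect two frame iterables, keeping the order of the first."""
    keep = set(second)
    seen = set()
    result = []
    for f in first:
        if f in keep and f not in seen:
            seen.add(f)
            result.append(f)
    return result


# Frame order described by "1-10x3,1-10x1".
order = [1, 4, 7, 10, 2, 3, 5, 6, 8, 9]

print(ordered_intersection(order, order))
print(ordered_intersection(order, range(1, 6)))
```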

Allow regex patterns to be modified in subclass of FileSequence/FrameSet

For some edge cases, users would like the ability to use modified versions of some of the regexes to accommodate specialized situations. An example use-case is to make a FileSequence not detect negative frame ranges which could be mistaken when parsing "file-10.ext" and the basename/frame are expected to be "file-" and "10", respectively.

If we alias the regex constants into class attributes on FileSequence and FrameSet, then users have the option to subclass and replace these patterns. All direct references to the classes need to be replaced with cls references, and static methods need to be converted to class methods where appropriate.
The patterns will still need to be well-behaved in providing the same capture groups, but that is on the developer.

Init FrameSet from range/limits?

Hi Chad and thanks for the hard work.

frame_set = FrameSet('{}-{}'.format(frame_in, frame_out))

I think it's a little silly. I understand giving it an iterator generates a FrameSet with given explicit frames:

>>> FrameSet([100, 102])
FrameSet("100,102")

But I was wondering:

Does it make sense to have constructor:

>>> FrameSet.from_range(100, 102)
FrameSet("100-102")

Or :

>>> FrameSet.from_limits(100, 102)
FrameSet("100-102")

Any advice?

Thanks in advance!
