garcia / simfile
A modern simfile parsing & editing library for Python 3
License: MIT License
While using nine or null, one particular file failed to load. After looking through the files I found the issue:
[2023-06-29 21:37:37.085] ERROR stray 'x' encountered at start of document
Traceback (most recent call last):
File "nine_or_null\__init__.py", line 729, in batch_process
File "simfile\__init__.py", line 109, in open
File "simfile\__init__.py", line 152, in open_with_detected_encoding
File "simfile\__init__.py", line 83, in load
File "simfile\base.py", line 164, in __init__
File "simfile\ssc.py", line 217, in _parse
File "msdparser\parser.py", line 103, in parse_msd
File "msdparser\parser.py", line 72, in push_text
As it turns out, you can have leading text at the very top of an MSD file and it will still be usable in game (tested on ITGMania at least, but this is such an edge case that I wouldn't doubt it also works in sm5.1/etc). However, since this library does not expect such input from a proper file, it throws an error.
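As a client-side stopgap, the leading text could be dropped before the data reaches the parser. This is a minimal sketch, assuming that everything before the first # can be safely ignored (the report above does not confirm how the game treats it). strip_leading_text is a hypothetical helper, not part of simfile or msdparser:

```python
def strip_leading_text(msd_text: str) -> str:
    """Drop everything before the first '#' in an MSD document.

    Hypothetical helper: assumes that text preceding the first
    parameter carries no meaning.
    """
    start = msd_text.find("#")
    # No '#' at all means there are no parameters to keep.
    return msd_text[start:] if start != -1 else ""


cleaned = strip_leading_text("x\n#TITLE:Example;\n#ARTIST:Someone;\n")
print(cleaned)
```

Note that this would also discard a leading comment, so it is only suitable as a workaround for files known to have this problem.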
There are some idiosyncrasies in how simfiles are handled by StepMania that aren't modeled by this package at all currently. For example, if you want to load the banner for a simfile on disk, you can certainly try using os.path to Frankenstein the simfile's filename and sim.banner together... but that only works if the BANNER tag is filled in, which isn't guaranteed. (For example, quite a few files in UPSMH Online have a blank BANNER tag.) If you leave the field blank, StepMania will try to find the appropriate image file on its own! This also applies to the music path.
While this logic can be implemented in client code, I think it warrants inclusion as part of this package, given that processing simfile data together with audio or image data is a pretty noteworthy use case. It also helps us work toward the goal of handling any simfile that StepMania accepts.
I had an implementation section here previously, but I totally forgot that simfiles don't know their own filenames, so my proposed implementation was bogus. We would likely need a new submodule for loading a simfile's supplementary files.
Related to #5. The timing engine should handle negative BPMs as if they were warps. The best way to do this is almost certainly to convert them to warps internally, rather than trying to design the engine's internal data structures around this archaic hack.
Right now TimingEngine is almost certainly prone to raising exceptions if given negative BPMs (I haven't tested it yet). At minimum, it won't correctly flag negative BPM warp regions as being un-hittable(). As such, this qualifies as a bug, although fixing it will likely constitute a feature release (and thus minor version bump) for #5.
Two bugs in simfile 2.0.0–2.1.0's SSC implementation break multi-value properties, causing them to be truncated or mangled past the first value.
First, the DISPLAYBPM and ATTACKS properties of both simfiles and charts stop parsing at the first colon (:). For DISPLAYBPM, this means a BPM range of min:max will be incorrectly parsed as a static BPM of min. ATTACKS are completely broken, as they use the colon as a separator.
Second, a bug in SSCChart has the same effects described above, but it only affects values with colons that were written to the chart object manually (e.g. chart.displaybpm = "120:240"), since the first bug shadows this bug during deserialization.
Thanks for the great package!
I noticed that when parsing charts with 12ths or 24ths, the underlying Fraction representations of Beats are very strange.
Here's a minimal reproducing example where we parse an SM file with a single measure full of 12th notes and print the numerator and denominator of all Note objects:
https://gist.github.com/jcjohnson/603d9b5ab50016429f762cf9a5db2af7
When run, it gives the output:
(0, 1)
(6004799503160661, 18014398509481984)
(6004799503160661, 9007199254740992)
(1, 1)
(6004799503160661, 4503599627370496)
(7505999378950827, 4503599627370496)
(2, 1)
(5254199565265579, 2251799813685248)
(6004799503160661, 2251799813685248)
(3, 1)
(7505999378950827, 2251799813685248)
(8256599316845909, 2251799813685248)
The problem is here: https://github.com/garcia/simfile/blob/master/simfile/notes/__init__.py#L169
You are passing a float to the Fraction constructor rather than a pair of integers. This behaves fine when the denominator is a power of two (since such fractions can be represented exactly as floating-point values) but gives strange results otherwise.
The fix is simple; you just need to use the Fraction constructor that accepts a pair of integers for the numerator and denominator. When run with --apply-patch=1, my test script above monkey-patches NoteData.__iter__ to apply this fix; this then gives a much more normal result when parsing the chart with 12th notes:
(0, 1)
(1, 3)
(2, 3)
(1, 1)
(4, 3)
(5, 3)
(2, 1)
(7, 3)
(8, 3)
(3, 1)
(10, 3)
(11, 3)
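The difference between the two constructors can be reproduced without the library. The snippet below is only an illustration of the float-versus-integer behaviour described above, not the library's code; the variable names are made up for the example:

```python
from fractions import Fraction

# Position of the second 12th note in a measure: one third of a beat.
numerator, denominator = 1, 3

from_float = Fraction(numerator / denominator)  # float is rounded before conversion
from_ints = Fraction(numerator, denominator)    # exact rational value

print(from_float.numerator, from_float.denominator)  # → 6004799503160661 18014398509481984
print(from_ints.numerator, from_ints.denominator)    # → 1 3

# Quarter beats have a power-of-two denominator, so both constructors agree.
assert Fraction(1 / 4) == Fraction(1, 4)
```

The first printed pair matches the second row of the unpatched output above, and the second pair matches the patched output.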
Currently, lastsecondhint is only supported as a property of the entire simfile. However, it seems that Project Outfox (formerly StepMania 5.3) supports lastsecondhint as a chart property as well. Not only that, but it will try to dynamically determine the appropriate #lastsecondhint for charts that do not explicitly specify that property. When generating the cached simfile, it will also overwrite the ssc header #lastsecondhint (which is used in various capacities by themes) with the longest #lastsecondhint among all the charts contained within the ssc, including ones that were dynamically generated due to being empty.
Because of that, it'd be very useful if we could batch add a #lastsecondhint to all charts, avoiding the issue of one chart within the ssc messing up other charts' intended lastsecondhint.
I'd appreciate it if such functionality were added to the API. I believe this can probably be achieved by turning strict simfile parsing off, but I haven't been able to get it to work. For reference, here's the script I'm using: https://github.com/SheepyChris/PIU-Simfiles/blob/main/Scripts/ChartProcessing.py (currently only adds lastsecondhint to headers, not to each chart).
Hi, thank you for the module! For some code that I am writing, I need to align note time with song time. The simfile object has the attribute "offset", which as I understand it is the length of time until beat 0. Are the note times from time_notes() relative to the "song" or to the actual audio?
Thank you
Keysounded SSC charts store their note data in a NOTES2 property instead of the usual NOTES (the two properties are mutually exclusive to my knowledge). This poses an interesting design problem for simfile: should invoking .notes / ['NOTES'] on an SSCChart use the NOTES2 data when present, or should this logic be kept in NoteData?
Separation of concerns dictates that NoteData should decide where to find the note data, and SSCChart should stay free of "business logic" and act as a plain dictionary of properties. This is how the library works currently, but this functionality isn't particularly discoverable, and I can imagine a lot of users not understanding why they need to call notedata.update_chart(chart) instead of the more straightforward chart.notes = str(notedata). It would be nice if the straightforward solution just worked in all cases.
Any ideas on how to resolve this elegantly?
msdparser returns a value of None when a parameter has no colon separator (e.g. #KEY;). This deserializes into a simfile object correctly, but attempting to serialize it back to a string throws an AttributeError:
File "/lib/python3.10/site-packages/simfile/_private/serializable.py", line 19, in __str__
self.serialize(serialized)
File "/lib/python3.10/site-packages/simfile/base.py", line 191, in serialize
file.write(f"{param}\n")
File "/lib/python3.10/site-packages/msdparser/parameter.py", line 87, in __str__
self.serialize(output, escapes=escapes)
File "/lib/python3.10/site-packages/msdparser/parameter.py", line 80, in serialize
file.write(MSDParameter.serialize_component(component, escapes=escapes))
File "/lib/python3.10/site-packages/msdparser/parameter.py", line 57, in serialize_component
return reduce(
File "/lib/python3.10/site-packages/msdparser/parameter.py", line 58, in <lambda>
lambda key, esc: key.replace(esc, f"\\{esc}"),
AttributeError: 'NoneType' object has no attribute 'replace'
First off, thanks for building this library! It's been remarkably easy to use for simfile introspection.
I ran into this unfortunate edge case while parsing through DDR A20 PLUS content:
Traceback (most recent call last):
File "/Users/vyhd/.pyenv/versions/3.9.2/lib/python3.9/site-packages/simfile/__init__.py", line 89, in open
return open_with_detected_encoding(
File "/Users/vyhd/.pyenv/versions/3.9.2/lib/python3.9/site-packages/simfile/__init__.py", line 137, in open_with_detected_encoding
return (load(file, strict=strict), encoding)
File "/Users/vyhd/.pyenv/versions/3.9.2/lib/python3.9/site-packages/simfile/__init__.py", line 67, in load
return SMSimfile(file=file, strict=strict)
File "/Users/vyhd/.pyenv/versions/3.9.2/lib/python3.9/site-packages/simfile/base.py", line 139, in __init__
self._parse(parse_msd(
File "/Users/vyhd/.pyenv/versions/3.9.2/lib/python3.9/site-packages/simfile/sm.py", line 159, in _parse
for (key, value) in parser:
File "/Users/vyhd/.pyenv/versions/3.9.2/lib/python3.9/site-packages/msdparser/__init__.py", line 117, in parse_msd
ps.write(char)
File "/Users/vyhd/.pyenv/versions/3.9.2/lib/python3.9/site-packages/msdparser/__init__.py", line 62, in write
raise MSDParserError(f"stray {repr(text)} encountered")
msdparser.MSDParserError: stray 'G' encountered
which was triggered by this line: #SOURCE:ゲーム「STEINS\;GATE」;
in the SM file located at: https://zenius-i-vanisher.com/v5.2/viewsimfile.php?simfileid=45612
It looks like StepMania's MSD parser allows you to use \ to escape any control character (source), so this lib probably should be able to as well.
I... might be able to submit a PR for this sometime in the next few weeks? But I wanted to flag it now, in case someone gets to it first.
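To make the requested behaviour concrete, here is a small sketch of reading a value up to the first unescaped semicolon. It is a hypothetical illustration only (read_msd_value is not an msdparser function), and it ignores the rest of the MSD grammar such as #, : and comments:

```python
def read_msd_value(text: str) -> str:
    """Read characters up to the first unescaped ';'.

    Hypothetical illustration: a backslash makes the next character
    literal, so an escaped ';' does not end the value.
    """
    result = []
    chars = iter(text)
    for char in chars:
        if char == "\\":
            # Take the following character literally (if there is one).
            result.append(next(chars, ""))
        elif char == ";":
            break
        else:
            result.append(char)
    return "".join(result)


print(read_msd_value("ゲーム「STEINS\\;GATE」;"))  # → ゲーム「STEINS;GATE」
```

With this rule, the #SOURCE value quoted above is read in full instead of stopping at the escaped semicolon.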
>>> import simfile
>>> simfileobj = simfile.open('testdata/Springtime.ssc')
>>> simfileobj.charts = []
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
This really should work, but it doesn't currently. You can modify the charts list, but can't reassign it.
Found a couple of asset discovery discrepancies between simfile and StepMania:
bn.png is discovered by StepMania, but looking at the source code, I don't know why. It seems to look for a filename containing banner or ending with bn (with a space!), neither of which seems to match bn.png. Maybe FindFirstFilenameContaining is doing something overly complex?
#BACKGROUND:file.png; should resolve to a FILE.PNG, but it would be good to confirm this.
The SSC format includes a FAKES property (at both the simfile and chart level) that defines fake regions: beat ranges in which notes of all types are unhittable. Fake regions and warp segments are the two reasons why a note might not be hittable, as observed in TimingData.h. (StepMania uses the term judgable here, but hittable appears to be used synonymously elsewhere in the source code.)
Right now, TimingEngine only marks warp regions as unhittable. Fixing this cleanly will probably require adding a fakes attribute to TimingData. Currently, the attributes in TimingData are scoped to just those required for TimingEngine to do its job, under which rule this addition makes perfect sense. However, it does count as an expansion to the public API, which will incur a minor version bump.
Also of note: in the future, we may want to add all of the timing properties for which any chart-level data replaces the simfile-level data. This set of properties is documented in NotesLoaderSSC.cpp.
For example, OceanLab Megamix contains spurious control characters in some BPM values. StepMania appears to use the C++ function std::stof to parse each number, which stops when it reaches any character that can't be parsed into the number. The control characters only ever cut off zeros in the decimals, so there is no impact to gameplay in this case. As far as StepMania is concerned, there's nothing wrong with the file.
TimingData, on the other hand, throws an exception when presented with this simfile. This is because Python's Decimal class constructor expects the incoming string data to be clean. We should change our behavior to match StepMania to support this and similar files.
For sections that are warped over or fake segments, if they contain notes/mines/etc, the counts aren't removed. This is a deviation from StepMania's behavior where it correctly processes those counts.
I recognize that Warps and Fakes have limited support, but I think a lot of charts have been making use of them lately, so it might be something to consider.
simfile/simfile/timing/engine.py
Line 283 in caacf9c
I phrased this weirdly; at minimum, it should say "if and only if either of these conditions hold true". But there's probably a better way to phrase it to begin with.
The displaybpm function can't currently handle #DISPLAYBPM:; or malformed strings. I assume StepMania just ignores the field in these cases, but it would be good to verify.
Stumbled upon this when I created a SimfilePack using the ITG Level Asian pack and found that banner() gave cdtitle-tetaes.png instead of the expected bn.png. This doesn't seem to happen consistently, however.
I decided to try and investigate this. My C++ knowledge is pretty lacking, but from looking through the StepMania code, I suspect the data structure ultimately being iterated through to find the pack banner file is an std::set, which apparently maintains an order on its elements, in this case alphabetically by filename. Therefore, StepMania picks the alphabetically first file satisfying the required conditions to be the pack banner, which seems to match the behaviour I observed testing the game manually. In contrast, os.listdir() (what simfile uses) does not guarantee its results will be ordered at all, which I think is where this discrepancy came from. This might need a more thorough investigation to figure out the exact behaviours, though.
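If the ordering theory above holds, sorting the directory listing before searching would make the result deterministic. A minimal sketch follows; first_matching_file is a hypothetical helper, and the case-insensitive sort key is an assumption, since StepMania's exact comparison has not been verified here:

```python
import os
from typing import Callable, Optional


def first_matching_file(
    directory: str, predicate: Callable[[str], bool]
) -> Optional[str]:
    """Return the alphabetically first filename accepted by predicate."""
    # os.listdir() returns entries in arbitrary order, so sort explicitly.
    # Assumption: case-insensitive ordering.
    for name in sorted(os.listdir(directory), key=str.lower):
        if predicate(name):
            return name
    return None
```

For a pack folder containing both bn.png and cdtitle-tetaes.png, a predicate that accepts any .png file would then always return bn.png.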
We should add a stopgap for simfiles with warp events (WARPS tags for SSC, negative BPMs / stops for SM) so that we don't produce invalid simfiles. The lack of a stopgap necessitated a patch in ITL 2022. At minimum we should throw an exception in this scenario.
Processing one erroneous song causes a ValueError. Namely, there are too many commas in the BPMs; splitting on them produces empty strings, which cannot be split by = and unpacked into two values. Here is one of the errors; it occurs for every chart because the problem is in the main BPMs.
Miss You 0
Traceback (most recent call last):
File "...", line 8, in calculate_statistics
timing_data = TimingData(simfileInstance, simfileInstance.charts[chart])
File "D:\Meatball\venv\lib\site-packages\simfile\timing\__init__.py", line 210, in __init__
self.bpms = BeatValues.from_str(simfile_or_chart.bpms)
File "D:\Meatball\venv\lib\site-packages\simfile\timing\__init__.py", line 174, in from_str
beat, value = row.strip().split("=")
ValueError: not enough values to unpack (expected 2, got 1)
Also, here is the offending part of the file, which is attached below:
#BPMS:0.000000=132.000000
,172.000000=99.000000
,172.250000=49.500000
,172.500000=66.000000
,205.000000=-132,
,209.000000=132,
,213.000000=-132,
,217.000000=132,
,221.000000=-132,
,225.000000=132,
,229.000000=-132,
,232.875000=132
;
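A tolerant parser could simply skip the empty rows that the extra commas produce. The sketch below is hypothetical (parse_beat_values is not the library's BeatValues.from_str) and assumes that ignoring empty rows is acceptable, which has not been checked against StepMania:

```python
from decimal import Decimal
from typing import List, Tuple


def parse_beat_values(text: str) -> List[Tuple[Decimal, Decimal]]:
    """Parse 'beat=value' rows separated by commas, skipping empty rows."""
    pairs = []
    for row in text.split(","):
        row = row.strip()
        if not row:
            # A doubled or trailing comma produces an empty row; ignore it.
            continue
        beat, value = row.split("=")
        pairs.append((Decimal(beat), Decimal(value)))
    return pairs


bpms = parse_beat_values(
    "0.000000=132.000000\n,205.000000=-132,\n,209.000000=132,\n"
)
print(len(bpms))  # → 3
```

A row that is non-empty but still malformed would continue to raise ValueError, which keeps genuinely broken data visible.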
StepMania 3.9 infamously "supported" warp segments through a timing data exploit: setting the BPM to negative at some beat, then back to positive at a later beat, would warp the notes from the negative BPM change to... somewhere after the positive BPM change. To my knowledge, this exploit no longer works in StepMania 5, in part because the behavior is now officially supported through the WARPS timing data property. However, the StepMania editor still supports the legacy hack, converting any warps in the internal SSC data to negative BPMs when saving an SM file. This not only means that simfiles with warps created in SM5 will still work in 3.9, but that old files from the 3.9 era can be edited in SM5 non-destructively.
To achieve feature parity, simfile.convert.sm_to_ssc should convert negative BPMs to warps, and simfile.convert.ssc_to_sm should do the opposite (only for simfile warps - warps in the chart timing will raise an exception by default).
So, I was just going through files trying to count game elements and noticed that whenever an empty edit came up, the library crashed due to a list index being out of range. Here are three similar errors. For each, I give the file name and the index of the chart failing, followed by the stack trace. Find the files attached:
Dance Vibrations 2
Traceback (most recent call last):
File "...", line 10, in calculate_statistics
note_data = NoteData(simfileInstance.charts[chart])
File "D:\Meatball\venv\lib\site-packages\simfile\notes\__init__.py", line 136, in __init__
self._columns = NoteData._get_columns(self._notedata)
File "D:\Meatball\venv\lib\site-packages\simfile\notes\__init__.py", line 142, in _get_columns
first_line = first_measure.strip().splitlines()[0].strip()
IndexError: list index out of range
That Is Fair 4
Traceback (most recent call last):
File "...", line 10, in calculate_statistics
note_data = NoteData(simfileInstance.charts[chart])
File "D:\Meatball\venv\lib\site-packages\simfile\notes\__init__.py", line 136, in __init__
self._columns = NoteData._get_columns(self._notedata)
File "D:\Meatball\venv\lib\site-packages\simfile\notes\__init__.py", line 142, in _get_columns
first_line = first_measure.strip().splitlines()[0].strip()
IndexError: list index out of range
Virtual Emotion 0
Traceback (most recent call last):
File "...", line 10, in calculate_statistics
note_data = NoteData(simfileInstance.charts[chart])
File "D:\Meatball\venv\lib\site-packages\simfile\notes\__init__.py", line 136, in __init__
self._columns = NoteData._get_columns(self._notedata)
File "D:\Meatball\venv\lib\site-packages\simfile\notes\__init__.py", line 142, in _get_columns
first_line = first_measure.strip().splitlines()[0].strip()
IndexError: list index out of range
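The failing expression in the tracebacks indexes the first line of the first measure, which does not exist when the note data is empty. A guarded version might look like the following sketch; count_columns is a hypothetical stand-in for the library's private _get_columns, and returning 0 for an empty chart is an assumption about the desired behaviour:

```python
def count_columns(notedata: str) -> int:
    """Return the width of the first row of note data, or 0 if empty."""
    lines = notedata.strip().splitlines()
    if not lines:
        # "".splitlines() is an empty list, so indexing [0] would raise
        # IndexError; report zero columns instead.
        return 0
    return len(lines[0].strip())


print(count_columns(""))                          # → 0
print(count_columns("0000\n0000\n0000\n0000\n"))  # → 4
```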
Downgrading to msdparser==2.0.0b1 seems to resolve the issue for me.
Python 3.9.1 (tags/v3.9.1:1e5d33e, Dec 7 2020, 17:08:21) [MSC v.1927 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import simfile
>>> simfile.open("Xuxa.sm")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\w\AppData\Local\Programs\Python\Python39\lib\site-packages\simfile\__init__.py", line 89, in open
return open_with_detected_encoding(
File "C:\Users\w\AppData\Local\Programs\Python\Python39\lib\site-packages\simfile\__init__.py", line 137, in open_with_detected_encoding
return (load(file, strict=strict), encoding)
File "C:\Users\w\AppData\Local\Programs\Python\Python39\lib\site-packages\simfile\__init__.py", line 63, in load
is_ssc = _detect_ssc(peek_file)
File "C:\Users\w\AppData\Local\Programs\Python\Python39\lib\site-packages\simfile\__init__.py", line 46, in _detect_ssc
for first_key, _ in parser:
File "C:\Users\w\AppData\Local\Programs\Python\Python39\lib\site-packages\msdparser\__init__.py", line 202, in parse_msd
for token, value in lex_msd(
File "C:\Users\w\AppData\Local\Programs\Python\Python39\lib\site-packages\msdparser\__init__.py", line 366, in lex_msd
chunk = textio.read(4096)
File "C:\Users\w\AppData\Local\Programs\Python\Python39\lib\site-packages\simfile\_private\tee_file.py", line 13, in __getattr__
return getattr(self._file, name)
AttributeError: 'itertools._tee' object has no attribute 'read'
Very simple fix.
Workaround: replace assets = simfile_dir.assets() with assets = Assets(simfile_dir.simfile_dir, filesystem=simfile_dir.filesystem).
Hey, so I was using your tool to save files with new difficulty ratings, and it worked for almost all of my files, except for ECFA 2021\MAX 300 (SX 12)\max 300.ssc, which gives me an AttributeError: 'NoneType' object has no attribute 'replace'. I suspect this is due to line 47 (which, btw, is also slightly broken) where gimmick=none. This is the first line that was not stored. Find a stack trace attached.
error.txt
We should document simfile.convert with a little more grace than what the AutoAPI page currently has to offer.
The dedicated page should include examples for practical use cases, like saving to both the SM and SSC file in a simfile directory, validating that the two files on disk have matching base properties, and backfilling missing SM files for the sake of StepMania builds that don't support SSC (like NotITG).
Lines 38 to 41 in 272c837
Neither of these conditions will ever be true. Splitting with .rpartition('.') means the extension won't have the . prefix.
In practice, this rarely causes issues because detection by VERSION tag is reliable enough. But if you use simfile.open to open an empty file with a .ssc extension, it comes back as an SMSimfile. Noticed this while writing unit tests.
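The rpartition behaviour is easy to confirm in isolation. This snippet assumes, as the description implies, that the original checks compare against dotted extensions such as '.ssc'; the filename is made up for the example:

```python
import os.path

filename = "song.ssc"

# rpartition splits around the last '.', so the dot is not part of the tail.
_, _, extension = filename.rpartition(".")
print(extension)            # → ssc
print(extension == ".ssc")  # → False
print(extension == "ssc")   # → True

# os.path.splitext, by contrast, keeps the leading dot.
print(os.path.splitext(filename)[1])  # → .ssc
```

Either comparing against the undotted extension or switching to os.path.splitext would make the check reachable.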