Comments (8)
https://github.com/badboy/iso8601/blob/master/src/parsers.rs is probably not a bad place to start looking at text-based nom parsing.
from mack.
For `feat` in titles at least, we probably want `take_until!`
from mack.
I think one of the main downsides of nom is that the learning curve seems pretty steep. For example, I've been struggling just to do feat extraction for a while, which is made much worse by cryptic type mismatches from within the macros themselves:
mack % gd
diff --git src/fixers.rs src/fixers.rs
index 42265b7..f9a2f10 100644
--- src/fixers.rs
+++ src/fixers.rs
@@ -1,6 +1,6 @@
use regex::{Regex, RegexBuilder};
use taglib;
-use types::{Fixer, MackError, Track};
+use types::{Fixer, MackError, Track, TrackTitle};
use std::borrow::Cow;
lazy_static! {
@@ -9,6 +9,23 @@ lazy_static! {
).case_insensitive(true).build().unwrap();
}
+named!(open_bracket, alt!(tag!("(") | tag!("[")));
+named!(close_bracket, alt!(tag!(")") | tag!("]")));
+named!(feature_word, alt!(tag_no_case!("featuring") | tag_no_case!("feat") | tag_no_case!("ft")));
+named!(parse_track_title<&[u8], TrackTitle>,
+    do_parse!(
+        track_name: take_till!(alt!(opt!(open_bracket) | eof!())) >>
+        feature_word >> opt!(".") >>
+
+        featured_artists: take_till!(opt!(close_bracket)) >>
+
+        (TrackTitle {
+            track_name: track_name,
+            featured_artists: featured_artists,
+        })
+    )
+);
+
pub fn run_fixers(track: &mut Track, dry_run: bool) -> Result<Vec<Fixer>, MackError> {
let mut applied_fixers = Vec::new();
let mut tags = track.tag_file.tag()?;
diff --git src/main.rs src/main.rs
index 2901b45..f41bc87 100644
--- src/main.rs
+++ src/main.rs
@@ -4,6 +4,8 @@ extern crate ignore;
extern crate lazy_static;
extern crate regex;
extern crate taglib;
+#[macro_use]
+extern crate nom;
mod fixers;
mod track;
diff --git src/types.rs src/types.rs
index 001e3d0..c3109a0 100644
--- src/types.rs
+++ src/types.rs
@@ -7,6 +7,12 @@ pub struct Track {
pub tag_file: taglib::File,
}
+#[derive(Debug)]
+pub struct TrackTitle {
+    pub track_name: String,
+    pub featured_artists: Vec<String>,
+}
+
#[derive(Debug)]
pub enum Fixer {
FEAT,
from mack.
Using regexes as a replacement mechanism really shows its limitations when it comes to things like replacing the final "and" with ", &", since we don't know whether there are only two artists (in which case there should be no comma) or more.
I suspect the best balance for now is to extract using regex, and to have specific types for a title and for an artist (possibly containing "feat."). I don't think other fields need their own types yet, since I can't think of any that contain multiple pieces of metadata.
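To illustrate why the comma decision needs the artist count rather than a textual substitution, here is a minimal sketch. It assumes the artists have already been extracted into a list; `join_artists` is a hypothetical helper for illustration, not something that exists in mack.

```rust
/// Join artist names, inserting the comma before "&" only when there are
/// three or more artists. Hypothetical helper, not part of mack.
fn join_artists(artists: &[&str]) -> String {
    match artists {
        [] => String::new(),
        [only] => only.to_string(),
        // Exactly two artists: no comma.
        [first, second] => format!("{} & {}", first, second),
        // Three or more: comma-separate, then ", &" before the last one.
        [init @ .., last] => format!("{}, & {}", init.join(", "), last),
    }
}

fn main() {
    println!("{}", join_artists(&["A", "B"])); // A & B
    println!("{}", join_artists(&["A", "B", "C"])); // A, B, & C
}
```

Once the names are held as a list, the two-artist case is just another match arm, which a single regex replacement cannot express.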
from mack.
Well, there are other limitations with the `regex` crate for our needs. Greedy matching to try to extract particular things (like `(feat. X)`) while eliminating the surrounding context often leaves us unable to make things optional while still preferring to cut them out. Since the crate has no negative lookahead, we can't do much. Maybe `fancy-regex` really can help here.
from mack.
Or maybe it's just more judicious use of regexes that's the solution. We could explicitly look for our feat string and do the rest manually.
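As a rough sketch of what "look for the feat string and do the rest manually" could look like using only the standard library: `split_feat` is a hypothetical name, and the marker list simply mirrors the "featuring"/"feat"/"ft" alternatives from the nom attempt above. It assumes the feat part is bracketed and only inspects the first occurrence of each marker.

```rust
/// Split a title like "Song (feat. X & Y)" into ("Song", "X & Y").
/// Hypothetical helper for illustration; not mack's actual code.
fn split_feat(title: &str) -> Option<(&str, &str)> {
    // ASCII lowercasing leaves byte offsets identical to `title`.
    let lower = title.to_ascii_lowercase();
    for marker in &["featuring", "feat", "ft"] {
        for open in &['(', '['] {
            let needle = format!("{}{}", open, marker);
            let start = match lower.find(&needle) {
                Some(i) => i,
                None => continue,
            };
            let rest = &title[start + needle.len()..];
            // Reject words that merely begin with a marker, e.g. "(feather".
            if rest.chars().next().map_or(false, |c| c.is_alphanumeric()) {
                continue;
            }
            // Optional "." after the marker, then read up to the closing bracket.
            let rest = rest.strip_prefix('.').unwrap_or(rest);
            let end = rest
                .find(|c: char| c == ')' || c == ']')
                .unwrap_or(rest.len());
            return Some((title[..start].trim_end(), rest[..end].trim()));
        }
    }
    None
}

fn main() {
    println!("{:?}", split_feat("Song (feat. X & Y)"));
    println!("{:?}", split_feat("Plain title"));
}
```

The optional pieces (the trailing dot, a missing closing bracket) are handled by ordinary control flow here, which is the part that was awkward to express with greedy regex matching.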
from mack.
lalrpop possibly looks more promising: https://github.com/lalrpop/lalrpop
from mack.
The current TrackFeat approach seems the best compromise for now.
from mack.