sonos / rustling-ontology
Ontology for rustling
License: Other
Initial issue reported: https://github.com/snipsco/next-release/issues/809
"last wednesday between one thirty and three forty-five am" gives wrong resolution on the left side time-of-day
platform v1.2, v1.3
en - other languages may be impacted too
last wednesday between one thirty and three forty-five am
+----+------------+-------------+----------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | Output(OutputValue) |
+====+============+=============+====================================================+====================================================================================================================================================================================================+
| 0 | -3.3397157 | 0.035447035 | last wednesday between two and three forty-five am | DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2019-10-09T00:00:00+02:00, end: 2019-10-09T03:45:00+02:00, precision: Exact, latent: false }, datetime_kind: Datetime }) |
+----+------------+-------------+----------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+----+------------+-------------+----------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | Output(OutputValue) |
+====+============+=============+====================================================+====================================================================================================================================================================================================+
| 0 | -3.3397157 | 0.035447035 | last wednesday between two and three forty-five am | DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2019-10-09T01:30:00+02:00, end: 2019-10-09T03:45:00+02:00, precision: Exact, latent: false }, datetime_kind: Datetime }) |
+----+------------+-------------+----------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
0.13.1
zh
二月十号周天
+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+
| ix | text | kind | rule | childs |
+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+
| 5 | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | intersect + named-day |
| 4 | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | intersect + named-day |
| 3 | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | <integer> month + <day-of-month> <name |
| 2 | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | <integer> month + intersect |
| 1 | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | named-month + <day-of-month> <name |
| 0 | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | named-month + intersect |
+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+
+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+
| ix | text | kind | rule | childs |
+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+
| 5 | 二月十号周天 | Time(TimeOutput { moment: 2019-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | intersect + named-day |
| 4 | 二月十号周天 | Time(TimeOutput { moment: 2019-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | intersect + named-day |
| 3 | 二月十号周天 | Time(TimeOutput { moment: 2019-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | <integer> month + <day-of-month> <name |
| 2 | 二月十号周天 | Time(TimeOutput { moment: 2019-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | <integer> month + intersect |
| 1 | 二月十号周天 | Time(TimeOutput { moment: 2019-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | named-month + <day-of-month> <name |
| 0 | 二月十号周天 | Time(TimeOutput { moment: , grain: Day, precision: Exact }) | intersect | named-month + intersect |
+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+
周天
: <named-day>
-> helpers::day_of_week(Weekday::Sun)
十号
: <day-of-month>
-> helpers::day_of_month(integer.value().value as u32)
二月
: <named-month>
-> helpers::month(2)
十号周天
: <day-of-month> <named-day>
-> a.value().intersect(&b.value())
-> some
二月十号周天
: <time> <time>
-> |a, b| a.value().intersect(b.value())
0.18.0
ja
二十足
BuiltinEntity { value: "二十足", range: 0..3, entity: Number(NumberValue { value: 20.0 }), entity_kind: Number }
BuiltinEntity { value: "二十", range: 0..2, entity: Number(NumberValue { value: 20.0 }), entity_kind: Number }
We expect Rustling not to output the quantifier in the number.
It seems that users exclude quantifiers from tagging (or are told to exclude them). This creates inconsistencies between what the user tags and what Rustling recognizes, leaving the CRF without builtin entity match features, which makes it fail.
v0.17.6
all
zéro virgule quatre-vingt-cinq
+----+--------+---+--------------------------------+--------------------------------+
| ix | log(p) | p | text | value |
+====+========+===+================================+================================+
| 0 | 0 | 1 | zéro virgule quatre-vingt-cinq | Float(FloatOutput(0.84999996)) |
+----+--------+---+--------------------------------+--------------------------------+
+----+--------+---+--------------------------------+--------------------------------+
| ix | log(p) | p | text | value |
+====+========+===+================================+================================+
| 0 | 0 | 1 | zéro virgule quatre-vingt-cinq | Float(FloatOutput(0.85)) |
+----+--------+---+--------------------------------+--------------------------------+
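One plausible source of 0.84999996 is single-precision arithmetic when the digits after the decimal mark are scaled. The sketch below is an assumption about the cause, not rustling's actual code; `decimal_by_multiplying` and `decimal_by_dividing` are hypothetical helpers written only to show the rounding difference.

```rust
/// Hypothetical helper: scales the fractional digits by multiplying with
/// 0.01, which has no exact f32 representation, so the error is amplified.
fn decimal_by_multiplying(int_part: f32, frac_digits: f32) -> f32 {
    int_part + frac_digits * 0.01_f32
}

/// Hypothetical helper: divides by 100, which is exactly representable, so
/// the single rounding step yields the f32 nearest to the true quotient.
fn decimal_by_dividing(int_part: f32, frac_digits: f32) -> f32 {
    int_part + frac_digits / 100.0_f32
}

fn main() {
    // "zéro virgule quatre-vingt-cinq": integer part 0, fractional digits 85
    println!("{}", decimal_by_multiplying(0.0, 85.0));
    println!("{}", decimal_by_dividing(0.0, 85.0));
}
```

With these inputs the first line prints 0.84999996 and the second prints 0.85, which matches the observed and expected outputs above.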
Parsing "from 8am on august 3rd to 10pm on november 3rd 2019" results in a wrong resolution for the "from" date.
The resolution is correct when trying without the trailing "2019".
0.19.0
en
from 8am on august 3rd to 10pm on november 3rd 2019
DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2008-08-03T08:00:00+02:00, end: 2019-11-03T22:00:00+01:00, precision: Exact, latent: false }, datetime_kind: DatePeriod })
DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2019-08-03T08:00:00+02:00, end: 2019-11-03T22:00:00+01:00, precision: Exact, latent: false }, datetime_kind: DatePeriod })
0.15.3
en
cargo run -- --lang en parse "two months after summer"
+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+
| ix | text | kind | rule | childs |
+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+
| 0 | two months after summer | Time(TimeOutput { moment: 2013-08-21T00:00:00+02:00, grain: Day, precision: Exact, latent: false }) | <duration> after <time> | <integer> <unit-of-d + after + season |
+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+
+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+
| ix | text | kind | rule | childs |
+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+
| 0 | two months after summer | Time(TimeOutput { moment: 2013-11-23T00:00:00+02:00, grain: Day, precision: Exact, latent: false }) | <duration> after <time> | <integer> <unit-of-d + after + season |
+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+
At the moment, the supported currencies differ between languages. Currency support should be normalized across languages.
cf this issue: snipsco/snips-nlu#678
In the helpers provided for writing a grammar, the functions `smart_span_to` and `span_to` should be consistent. All ad hoc behaviour should be in `smart_span_to`, and `span_to` should only contain the classic behaviour.
v0.13.1
EN
./rustling-cli -l "en" parse -k "Number" "twelve point zero zero two"
+----+--------+---+----------------------------+---------------------------+
| ix | log(p) | p | text | value |
+====+========+===+============================+===========================+
| 2 | 0 | 1 | _______________________two | Integer(IntegerOutput(2)) |
+----+--------+---+----------------------------+---------------------------+
| 1 | 0 | 1 | __________________zero____ | Integer(IntegerOutput(0)) |
+----+--------+---+----------------------------+---------------------------+
| 0 | 0 | 1 | twelve point zero_________ | Float(FloatOutput(12)) |
+----+--------+---+----------------------------+---------------------------+
+----+--------+---+----------------------------+---------------------------+
| ix | log(p) | p | text | value |
+====+========+===+============================+===========================+
| 0 | 0 | 1 | twelve point zero zero two | Float(FloatOutput(12.002))|
+----+--------+---+----------------------------+---------------------------+
cf this issue: snipsco/snips-nlu#682
Right now, if I input "une demi douzaine d'oeufs", only "une" gets detected, not "demi douzaine".
Forwarding the issue snipsco/snips-nlu#833.
Hi,
Depending on how a question is asked, the timeframe identified for winter varies:
- "in winter":
"value": { "from": "2019-12-21 00:00:00 +01:00", "kind": "TimeInterval", "to": "2020-03-21 00:00:00 +01:00" }
- "in winter 2019" (dates in 2020 are removed)
"value": { "from": "2019-12-21 00:00:00 +01:00", "kind": "TimeInterval", "to": "2020-01-01 00:00:00 +01:00" }
Best,
Joffrey
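The truncated "in winter 2019" result is what a plain interval intersection between the season and the calendar year would produce. The sketch below illustrates that arithmetic only; whether rustling resolves the phrase this way is an assumption, and `Date`, `Interval` and `intersect` are hypothetical names.

```rust
/// A calendar date as (year, month, day). Tuples compare lexicographically,
/// which matches chronological order for this layout.
type Date = (i32, u32, u32);

/// Half-open interval [from, to).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Interval {
    from: Date,
    to: Date,
}

/// Intersection of two half-open intervals, or None if they do not overlap.
fn intersect(a: Interval, b: Interval) -> Option<Interval> {
    let from = a.from.max(b.from);
    let to = a.to.min(b.to);
    if from < to {
        Some(Interval { from, to })
    } else {
        None
    }
}

fn main() {
    let winter = Interval { from: (2019, 12, 21), to: (2020, 3, 21) };
    let year_2019 = Interval { from: (2019, 1, 1), to: (2020, 1, 1) };
    // The part of winter that falls in 2020 is cut off by the year boundary.
    println!("{:?}", intersect(winter, year_2019));
}
```

A season that crosses a year boundary would need different handling, for example selecting the season that starts in the given year instead of clipping it.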
Issue observed in the Carrefour demo.
ASR output: cinq cents grammes de fraises
Slots: cinq, grammes, fraises
Initial issue reported: https://github.com/snipsco/next-release/issues/808
The am/pm specifier from right side of time-of-day interval doesn't apply to elliptic left side
platform v1.2
en - other languages may be impacted too
between 2 and 3pm
+----+-----------+------------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | Output(OutputValue) |
+====+===========+============+==========================+======================================================================================================================================================================================================+
| 0 | -1.146101 | 0.31787375 | between two and three pm | DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2019-10-11T02:00:00+02:00, end: 2019-10-11T15:00:00+02:00, precision: Exact, latent: false }, datetime_kind: TimePeriod }) |
+----+-----------+------------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+----+-----------+------------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | Output(OutputValue) |
+====+===========+============+==========================+======================================================================================================================================================================================================+
| 0 | -1.146101 | 0.31787375 | between two and three pm | DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2019-10-11T14:00:00+02:00, end: 2019-10-11T15:00:00+02:00, precision: Exact, latent: false }, datetime_kind: TimePeriod }) |
+----+-----------+------------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
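The expected behaviour can be sketched as a small rule: when only the right side carries an am/pm marker, try the same marker on the left side first. This is a minimal illustration working on whole hours, not rustling's resolution logic; `to_24h` and `resolve_elliptic` are hypothetical helpers.

```rust
/// Converts a 12-hour clock reading to a 24-hour hour (12am -> 0, 12pm -> 12).
fn to_24h(hour12: u32, pm: bool) -> u32 {
    (hour12 % 12) + if pm { 12 } else { 0 }
}

/// Hypothetical resolution of "between <start> and <end> am/pm" where only
/// the end has a meridiem: reuse the end's meridiem for the start, and fall
/// back to the opposite one if that would place the start after the end.
fn resolve_elliptic(start12: u32, end12: u32, end_pm: bool) -> (u32, u32) {
    let end = to_24h(end12, end_pm);
    let same = to_24h(start12, end_pm);
    let start = if same <= end { same } else { to_24h(start12, !end_pm) };
    (start, end)
}

fn main() {
    // "between 2 and 3pm"
    println!("{:?}", resolve_elliptic(2, 3, true));
    // "between 11 and 1pm"
    println!("{:?}", resolve_elliptic(11, 1, true));
}
```

For "between 2 and 3pm" this gives 14:00 to 15:00, as in the expected output; for "between 11 and 1pm" the fallback keeps the start at 11:00.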
0.14.0
en
0.14.0
fr
In all languages, review handling of intervals and their resolution. This should be consistent across languages, taking into account various differences between languages such as "since" (en) vs. "depuis" (fr) and semantic differences for resolution e.g. "since" (in the past) ≠ "from" (next occurrence).
Consistent default behaviours should also be reviewed and fixed if needed for start and end resolutions depending on start/end grain (e.g. "after x" = "after the end boundary of x").
Simple dates with slashes like 2018/09/30 or 30/09/2018 are not parsed correctly.
0.17.7
All languages
2018/09/30
+----+--------------+-----------+------------+------------------------------+
| ix | log(p) | p | text | value |
+====+==============+===========+============+==============================+
| 2 | -0.072079904 | 0.9304565 | _____09___ | Integer(IntegerOutput(9)) |
+----+--------------+-----------+------------+------------------------------+
| 1 | -0.072079904 | 0.9304565 | ________30 | Integer(IntegerOutput(30)) |
+----+--------------+-----------+------------+------------------------------+
| 0 | -0.072079904 | 0.9304565 | 2018______ | Integer(IntegerOutput(2018)) |
+----+--------------+-----------+------------+------------------------------+
+----+--------+---+------------+-----------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+========+===+============+=====================================================================================================+
| 0 | 0 | 1 | 2018/09/30 | Time(TimeOutput { moment: 2018-09-30T00:00:00+02:00, grain: Day, precision: Exact, latent: false }) |
+----+--------+---+------------+-----------------------------------------------------------------------------------------------------+
en, ja, ko
30/09/2018
+----+--------------+-----------+------------+------------------------------+
| ix | log(p) | p | text | value |
+====+==============+===========+============+==============================+
| 2 | -0.072079904 | 0.9304565 | 30________ | Integer(IntegerOutput(30)) |
+----+--------------+-----------+------------+------------------------------+
| 1 | -0.072079904 | 0.9304565 | ___09_____ | Integer(IntegerOutput(9)) |
+----+--------------+-----------+------------+------------------------------+
| 0 | -0.072079904 | 0.9304565 | ______2018 | Integer(IntegerOutput(2018)) |
+----+--------------+-----------+------------+------------------------------+
+----+--------+---+------------+-----------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+========+===+============+=====================================================================================================+
| 0 | 0 | 1 | 2018/09/30 | Time(TimeOutput { moment: 2018-09-30T00:00:00+02:00, grain: Day, precision: Exact, latent: false }) |
+----+--------+---+------------+-----------------------------------------------------------------------------------------------------+
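As a rough illustration of the expected result, the two layouts can be told apart by which end holds the four-digit year. The function below is a standalone sketch, not a rustling rule; `parse_slash_date` is a hypothetical name, and reading the two-digit fields as day/month (rather than month/day) is an assumption that depends on the locale.

```rust
/// Parses "YYYY/MM/DD" or "DD/MM/YYYY" into (year, month, day).
/// The field order is guessed from which end holds the four-digit part.
fn parse_slash_date(input: &str) -> Option<(i32, u32, u32)> {
    let parts: Vec<&str> = input.split('/').collect();
    let all_digits = |p: &&str| !p.is_empty() && p.bytes().all(|b| b.is_ascii_digit());
    if parts.len() != 3 || !parts.iter().all(all_digits) {
        return None;
    }
    let (y, m, d) = match (parts[0].len(), parts[2].len()) {
        (4, 1..=2) => (parts[0], parts[1], parts[2]),
        (1..=2, 4) => (parts[2], parts[1], parts[0]),
        _ => return None,
    };
    let year: i32 = y.parse().ok()?;
    let month: u32 = m.parse().ok()?;
    let day: u32 = d.parse().ok()?;
    // Coarse range check only; it does not validate the days of each month.
    if (1..=12).contains(&month) && (1..=31).contains(&day) {
        Some((year, month, day))
    } else {
        None
    }
}

fn main() {
    println!("{:?}", parse_slash_date("2018/09/30"));
    println!("{:?}", parse_slash_date("30/09/2018"));
}
```

Both inputs resolve to the same day, 30 September 2018, which is what the expected outputs above show.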
cf this issue --> snipsco/snips-nlu#689
Rustling doesn't identify the year in dates before the Unix epoch (January 1 1970).
This is similar to #102 and the same issue as in my comment there, but I am filing it as a new issue because I found the exact date where it goes wrong.
0.17.7
en
december 31 1969
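+----+--------------+------------+------------------+-----------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |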
+====+==============+============+==================+=====================================================================================================+
| 1 | -0.072079904 | 0.9304565 | ____________1969 | Integer(IntegerOutput(1969)) |
+----+--------------+------------+------------------+-----------------------------------------------------------------------------------------------------+
| 0 | -0.17216337 | 0.84184164 | december 31_____ | Time(TimeOutput { moment: 2019-12-31T00:00:00+01:00, grain: Day, precision: Exact, latent: false }) |
+----+--------------+------------+------------------+-----------------------------------------------------------------------------------------------------+
+----+------------+-----------+------------------+-----------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+============+===========+==================+=====================================================================================================+
| 0 | -0.4431368 | 0.6420194 | december 31 1969 | Time(TimeOutput { moment: 1969-12-31T00:00:00+01:00, grain: Day, precision: Exact, latent: false }) |
+----+------------+-----------+------------------+-----------------------------------------------------------------------------------------------------+
The algo can be improved for these entities:
b.rule_1_terminal("named-day",
    b.reg(r#"星期日|星期天|礼拜天|周日|禮拜天|週日|禮拜日"#)?,
    |_| helpers::day_of_week(Weekday::Sun)
);
周天 is missing.
b.rule_1_terminal("hundred",
    b.reg(r#"百|仟"#)?,
    |_| IntegerValue::new_with_grain(100, 2)
);
b.rule_1_terminal("thousand",
    b.reg(r#"千|佰"#)?,
    |_| IntegerValue::new_with_grain(1000, 3)
);
The formal characters are swapped between the two rules. It should be:
百|佰: hundred
千|仟: thousand
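The pairing listed above can be checked with a small standalone lookup. This is only an illustration of the intended mapping, not the grammar rule itself; `magnitude` is a hypothetical function, and the second tuple element mirrors the grain argument of `new_with_grain`.

```rust
/// Returns (value, grain) for a Chinese magnitude character, using the
/// pairing 百/佰 -> hundred and 千/仟 -> thousand.
/// The grain is the power of ten, as in `new_with_grain(100, 2)`.
fn magnitude(ch: char) -> Option<(i64, u8)> {
    match ch {
        '百' | '佰' => Some((100, 2)),
        '千' | '仟' => Some((1000, 3)),
        _ => None,
    }
}

fn main() {
    for ch in ['百', '佰', '千', '仟'] {
        println!("{} -> {:?}", ch, magnitude(ch));
    }
}
```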
Seasons not supported:
春(天|季)?: spring
夏(天|季)?: summer
秋(天|季)?: fall
冬(天|季)?: winter
b.rule_1_terminal("afternoon",
    b.reg(r#"下午|中午|晏晝"#)?,
    |_| {
        Ok(helpers::hour(12, false)?
            .span_to(&helpers::hour(19, false)?, false)?
            .latent()
            .form(Form::PartOfDay))
    }
);
Only 下午 is right.
b.rule_1_terminal("morning",
    b.reg(r#"早上|早晨|朝頭?早"#)?,
    |_| {
        Ok(helpers::hour(4, false)?
            .span_to(&helpers::hour(12, false)?, false)?
            .latent()
            .form(Form::PartOfDay))
    }
);
early morning: 早上
朝頭?早 makes no sense at first sight.
Hi there!
I'm wondering whether the parser grammars can be used in a reverse way, as generative grammars (e.g. to generate text sequences which are being parsed by these grammars). Is this a supported feature? If not, do you think it is viable to implement it (and could you suggest where to begin)?
Thanks.
On the current platform (en), "in 8 hours" gives a TimeInterval, but the timestamp is now + 8h +/- granularity (for instance, the last minute is rounded). In some cases, the user might want to build an assistant that sets a timer exactly 8h after the command, without any rounding. In this case it would be best if the default timestamp were the exact "now + 8h" time, with granularity provided optionally in a separate field.
Expected: a TimeInterval where the timestamp is now + 8h.
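One way to read this request is an output that carries the exact instant and keeps the grain as separate, optional metadata. The sketch below is a hypothetical shape, not rustling's output type; `ResolvedTime`, `Grain`, `in_duration` and `truncated` are names invented for illustration, and timestamps are plain seconds since the Unix epoch.

```rust
/// Granularity of a resolved time, kept separate from the timestamp itself.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Grain {
    Minute,
    Hour,
}

/// Hypothetical output shape: the exact instant plus an optional grain.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ResolvedTime {
    /// Seconds since the Unix epoch, not rounded.
    timestamp: i64,
    grain: Option<Grain>,
}

/// "in <offset>": the exact instant, with the grain carried alongside.
fn in_duration(now: i64, offset_secs: i64, grain: Option<Grain>) -> ResolvedTime {
    ResolvedTime { timestamp: now + offset_secs, grain }
}

/// Rounds down to the grain, for consumers that want the rounded value.
fn truncated(t: ResolvedTime) -> i64 {
    let step = match t.grain {
        Some(Grain::Hour) => 3600,
        Some(Grain::Minute) => 60,
        None => 1,
    };
    t.timestamp - t.timestamp.rem_euclid(step)
}

fn main() {
    let now = 1_000_000_042; // arbitrary example instant
    let timer = in_duration(now, 8 * 3600, Some(Grain::Minute));
    println!("exact: {}, rounded: {}", timer.timestamp, truncated(timer));
}
```

A timer application would read `timestamp` directly, while a calendar-style consumer could still call `truncated` to get the rounded value.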
0.15.3
FR
rustc 1.31.0-nightly
error: failed to run custom build command for `crfsuite-sys v0.2.0-pre (https://github.com/snipsco/crfsuite-rs?rev=30b2ea6#30b2ea6f)`
process didn't exit successfully: `/nff/tools/dack/target/release/build/crfsuite-sys-c76bddaf3c451c85/build-script-build` (exit code: 101)
thread 'main' panicked at 'Unable to find libclang: "couldn't find any of ['libclang.so', 'libclang.so.', 'libclang-.so'], set the LIBCLANG_PATH environment variable to a path where one of these files can be found (skipped: [])"', libcore/result.rs:1009:5
0.15.2
en
rustling crashes when called with inputs of the form "<number> <nth> quarter", where <number> is a number greater than 1050000 and <nth> is an ordinal (1st, 2nd, 3rd ...).
The crash doesn't happen for <number> values smaller than 1040000; however, the closer <number> gets to 1040000, the longer the parsing takes. On my MacBook Pro (Core i7), the parsing time goes up to 1m45s for 1040000.
0.18.1
en
1050000 2nd quarter
$ RUST_BACKTRACE=1 cargo run -p rustling-cli -- --lang en parse "1050000 2nd quarter"
Finished dev [unoptimized + debuginfo] target(s) in 0.17s
Running `target/debug/rustling-cli --lang en parse '1050000 2nd quarter'`
thread 'main' panicked at 'No such local time', /Users/adrien/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.6/src/offset/mod.rs:145:34
stack backtrace:
0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:39
1: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:70
2: std::panicking::default_hook::{{closure}}
at src/libstd/sys_common/backtrace.rs:58
at src/libstd/panicking.rs:200
3: std::panicking::default_hook
at src/libstd/panicking.rs:215
4: <std::panicking::begin_panic::PanicPayload<A> as core::panic::BoxMeUp>::get
at src/libstd/panicking.rs:478
5: std::sync::once::Once::is_completed
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libstd/panicking.rs:412
6: alloc::raw_vec::alloc_guard
at ./<::std::macros::panic macros>:3
7: core::ptr::real_drop_in_place
at /Users/adrien/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.6/src/offset/mod.rs:186
8: <time::duration::Duration as core::cmp::PartialOrd>::lt
at moment/src/lib.rs:122
9: <time::duration::Duration as core::cmp::PartialOrd>::lt
at moment/src/lib.rs:186
10: <time::duration::Duration as core::cmp::PartialOrd>::lt
at moment/src/lib.rs:177
11: rustling_ontology_values::helpers::easter::offset
at ./moment/src/lib.rs:352
12: core::clone::impls::<impl core::clone::Clone for usize>::clone
at ./moment/src/interval_constraints.rs:675
13: <bool as core::default::Default>::default
at ./moment/src/walker.rs:146
14: <bool as core::default::Default>::default
at ./moment/src/walker.rs:233
15: core::clone::impls::<impl core::clone::Clone for usize>::clone
at ./moment/src/interval_constraints.rs:857
16: core::clone::impls::<impl core::clone::Clone for usize>::clone
at ./moment/src/interval_constraints.rs:1032
17: <bool as core::default::Default>::default
at ./moment/src/walker.rs:169
18: <bool as core::default::Default>::default
at ./moment/src/walker.rs:201
19: <bool as core::default::Default>::default
at ./moment/src/walker.rs:266
20: <alloc::string::String as core::ops::deref::Deref>::deref
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/liballoc/vec.rs:1813
21: <alloc::string::String as core::ops::deref::Deref>::deref
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/liballoc/vec.rs:1725
22: <bool as core::default::Default>::default
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/iterator.rs:1468
23: core::clone::impls::<impl core::clone::Clone for usize>::clone
at ./moment/src/interval_constraints.rs:1034
24: core::clone::impls::<impl core::clone::Clone for usize>::clone
at ./moment/src/interval_constraints.rs:864
25: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::AllocErr>>::from
at values/src/context.rs:54
26: <rustling_ontology::tagger::CandidateTagger<'a, C> as rustling::MaxElementTagger<rustling_ontology_values::dimension::Dimension>>::tag::{{closure}}
at src/tagger.rs:62
27: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/mod.rs:1447
28: <f32 as core::ops::arith::Mul<&'a f32>>::mul
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/traits.rs:582
29: core::hint::unreachable_unchecked
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/traits.rs:519
30: core::hint::unreachable_unchecked
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/traits.rs:582
31: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/mod.rs:436
32: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/mod.rs:1447
33: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/iterator.rs:606
34: core::hint::unreachable_unchecked
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/liballoc/vec.rs:1856
35: core::hint::unreachable_unchecked
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/liballoc/vec.rs:1839
36: core::hint::unreachable_unchecked
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/liballoc/vec.rs:1725
37: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/iterator.rs:1468
38: core::hint::unreachable_unchecked
at src/tagger.rs:60
39: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
at /Users/adrien/.cargo/git/checkouts/rustling-281bdd3bf97d4e1e/fd8084b/src/lib.rs:140
40: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
at /Users/adrien/.cargo/git/checkouts/rustling-281bdd3bf97d4e1e/fd8084b/src/lib.rs:144
41: rustling_ontology::Parser
at src/lib.rs:64
42: rustling_ontology::Parser
at src/lib.rs:89
43: rustling_cli::main
at cli/src/main.rs:54
44: std::rt::lang_start::{{closure}}
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libstd/rt.rs:64
45: std::panicking::try::do_call
at src/libstd/rt.rs:49
at src/libstd/panicking.rs:297
46: panic_unwind::dwarf::eh::read_encoded_pointer
at src/libpanic_unwind/lib.rs:92
47: <std::panicking::begin_panic::PanicPayload<A> as core::panic::BoxMeUp>::get
at src/libstd/panicking.rs:276
at src/libstd/panic.rs:388
at src/libstd/rt.rs:48
48: std::rt::lang_start
at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libstd/rt.rs:64
49: rustling_cli::main
$ RUST_BACKTRACE=1 cargo run -p rustling-cli -- --lang en parse "1050000 2nd quarter"
Finished dev [unoptimized + debuginfo] target(s) in 0.14s
Running `target/debug/rustling-cli --lang en parse '1050000 2nd quarter'`
+----+-------------+------------+------------------+---------------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+=============+============+==================+=========================================================================================================+
| 1 | -0.07018268 | 0.9322235 | 1050000_________ | Integer(IntegerOutput(1050000)) |
+----+-------------+------------+------------------+---------------------------------------------------------------------------------------------------------+
| 0 | -0.6190392 | 0.53846157 | _____2nd quarter | Time(TimeOutput { moment: 2019-04-01T00:00:00+02:00, grain: Quarter, precision: Exact, latent: false }) |
+----+-------------+------------+------------------+---------------------------------------------------------------------------------------------------------+
0.18.0
en
fr
two hundred twenty-three million three hundred one thousand two hundred eleven euros
trois cent deux millions quatre cent trente milles deux cent trente euros
+----+-------------+-----------+--------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+=============+===========+======================================================================================+================================================================================================+
| 0 | -0.11813116 | 0.8885795 | two hundred twenty-three million three hundred one thousand two hundred eleven euros | AmountOfMoney(AmountOfMoneyOutput { value: 223301220.0, precision: Exact, unit: Some("EUR") }) |
+----+-------------+-----------+--------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+============+============+===========================================================================+================================================================================================+
| 0 | -1.9776177 | 0.13839854 | trois cent deux millions quatre cent trente milles deux cent trente euros | AmountOfMoney(AmountOfMoneyOutput { value: 302430240.0, precision: Exact, unit: Some("EUR") }) |
+----+------------+------------+---------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
+----+-------------+-----------+--------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+=============+===========+======================================================================================+================================================================================================+
| 0 | -0.11813116 | 0.8885795 | two hundred twenty-three million three hundred one thousand two hundred eleven euros | AmountOfMoney(AmountOfMoneyOutput { value: 223301211.0, precision: Exact, unit: Some("EUR") }) |
+----+-------------+-----------+--------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
+----+------------+------------+---------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+============+============+===========================================================================+================================================================================================+
| 0 | -1.9776177 | 0.13839854 | trois cent deux millions quatre cent trente milles deux cent trente euros | AmountOfMoney(AmountOfMoneyOutput { value: 302430230.0, precision: Exact, unit: Some("EUR") }) |
+----+------------+------------+---------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
At the moment, finance is not supported in Chinese. It needs to be added.
0.17.0
DE
"das jahr 1905"
"das jähr 1890"
Time(TimeOutput { moment: 1970-01-01T00:59:59+01:00, grain: Year, precision: Exact, latent: false })
Time(TimeOutput { moment: 1890-01-01T00:00:00+01:00, grain: Year, precision: Exact, latent: false })
I'm new to Rust, please forgive me if I'm wrong :-).
I have a question regarding the moment module: in lib.rs, why are ONLY these 3 items of the chrono crate re-exported?
pub use chrono::{Weekday, Local, TimeZone};
Perhaps we also need Utc, so that it can be used with Moment, e.g. Moment(Utc::now())?
Trying to build the weather example of snips-nlu-lib, I got the following error:
$ cargo run
Compiling snips-nlu-resources-packed v0.55.2 (https://github.com/snipsco/snips-nlu-rs#6a48f869)
Compiling crfsuite-sys v0.2.0-pre (https://github.com/snipsco/crfsuite-rs?rev=b18d95c#b18d95cf)
Compiling rustling-ontology v0.17.0 (https://github.com/snipsco/rustling-ontology?tag=0.17.0#2efa4c01)
error: unexpected close delimiter: `]`
--> /home/arousseau/dev/personnel/rust/test_snip/target/debug/build/crfsuite-sys-1b9f39ab8b7d8fbd/out/crfsuite.rs:201:32
|
201 | #[derive(Copy, Clone)]; 4usize ] , _bindgen_union_align : u32 , }#[test]
| ^
I just created a project with cargo new --bin test_snip, with the following files:
# Cargo.toml
[package]
name = "test_snip"
version = "0.1.0"
[dependencies]
snips-nlu-lib = { git = "https://github.com/snipsco/snips-nlu-rs", branch = "master" }
// src/main.rs
extern crate serde_json;
extern crate snips_nlu_lib;

use std::env;
use snips_nlu_lib::{FileBasedConfiguration, SnipsNluEngine};

fn main() {
    // Expects two arguments: the model file path and the query to parse.
    let args: Vec<String> = env::args().collect();
    let model_file = &args[1];
    let query = &args[2];

    // Load the engine configuration, aborting with the error message on failure.
    let configuration = match FileBasedConfiguration::from_path(model_file, false) {
        Ok(conf) => conf,
        Err(e) => panic!("{}", e),
    };

    let nlu_engine = SnipsNluEngine::new(configuration).unwrap();
    let result = nlu_engine.parse(query, None).unwrap();
    println!("{}", serde_json::to_string_pretty(&result).unwrap());
}
At the moment, a number (or temperature, or amount of money) starting with an explicit + is not parsed entirely: the prefix is omitted from the result.
0.18.0
All languages
+100
or
+100°C
+----+--------------+-----------+------+-----------------------------+
| ix | log(p) | p | text | value |
+====+==============+===========+======+=============================+
| 0 | -0.072079904 | 0.9304565 | _100 | Integer(IntegerOutput(100)) |
+----+--------------+-----------+------+-----------------------------+
or
+----+------------+-----------+--------+---------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+============+===========+========+=======================================================================================+
| 0 | -1.1418271 | 0.3192352 | _100°c | Temperature(TemperatureOutput { value: 100.0, unit: Some("celsius"), latent: false }) |
+----+------------+-----------+--------+---------------------------------------------------------------------------------------+
+----+--------------+-----------+------+-----------------------------+
| ix | log(p) | p | text | value |
+====+==============+===========+======+=============================+
| 0 | -0.072079904 | 0.9304565 | +100 | Integer(IntegerOutput(100)) |
+----+--------------+-----------+------+-----------------------------+
or
+----+------------+-----------+--------+---------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+============+===========+========+=======================================================================================+
| 0 | -1.1418271 | 0.3192352 | +100°c | Temperature(TemperatureOutput { value: 100.0, unit: Some("celsius"), latent: false }) |
+----+------------+-----------+--------+---------------------------------------------------------------------------------------+
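The expected behaviour above amounts to keeping the sign inside the matched span. Here is a minimal, standard-library-only sketch of that idea. It is not how rustling's rule engine is implemented, and `match_signed_integer` is a hypothetical helper introduced purely for illustration:

```rust
use std::ops::Range;

/// Hypothetical matcher (not rustling's rule engine): finds the first run of
/// ASCII digits in `text` and reports its byte range and value, keeping an
/// explicit leading '+' or '-' inside the range.
fn match_signed_integer(text: &str) -> Option<(Range<usize>, i64)> {
    let bytes = text.as_bytes();
    let first_digit = bytes.iter().position(|b| b.is_ascii_digit())?;
    let mut end = first_digit;
    while end < bytes.len() && bytes[end].is_ascii_digit() {
        end += 1;
    }
    // Widen the span by one byte when a sign directly precedes the digits.
    let has_sign = first_digit > 0 && matches!(bytes[first_digit - 1], b'+' | b'-');
    let start = if has_sign { first_digit - 1 } else { first_digit };
    // `str::parse::<i64>` accepts an optional leading sign.
    let value = text[start..end].parse::<i64>().ok()?;
    Some((start..end, value))
}

fn main() {
    for input in ["+100", "+100°C", "100"].iter() {
        println!("{:?} -> {:?}", input, match_signed_integer(input));
    }
}
```

With this approach the reported range for "+100" starts at the sign rather than at the first digit, which corresponds to the second pair of tables above.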
0.15.3
ja
29パーセント
Japanese numbers lead to a regex crash
29%
Hi Rustling team!
I have issues compiling the latest version on Windows 10.
Compiling rustling-ontology v0.17.0 (file:///C:/Users/romai/projects/rustling/rustling-ontology)
error: failed to run custom build command for `rustling-ontology v0.17.0 (file:///C:/Users/romai/projects/rustling/rustling-ontology)`
process didn't exit successfully: `C:\Users\romai\projects\rustling\rustling-ontology\target\debug\build\rustling-ontology-670eb07658abcf73\build-script-build` (exit code: 101)
--- stdout
cargo:rerun-if-changed=grammar/de/src/rules.rs
--- stderr
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: RustlingError(Msg("example: "montag morgens" matched no rule"), State { next_error: None })', libcore\result.rs:945:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.
Any hints?
commit a24b00e (HEAD -> develop, tag: 0.17.4, origin/develop, origin/HEAD)
en
Okay, um, your total is one billion three dollars and that is it?
+----+-------------+-----------+-------------------------------------------------------------------+-----------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+=============+===========+===================================================================+===============================================================================================+
| 0 | -0.11813116 | 0.8885795 | ________________________one billion three dollars________________ | AmountOfMoney(AmountOfMoneyOutput { value: 1000000000.0, precision: Exact, unit: Some("$") }) |
+----+-------------+-----------+-------------------------------------------------------------------+-----------------------------------------------------------------------------------------------+
+----+-------------+-----------+-------------------------------------------------------------------+-----------------------------------------------------------------------------------------------+
| ix | log(p) | p | text | value |
+====+=============+===========+===================================================================+===============================================================================================+
| 0 | -0.11813116 | 0.8885795 | ________________________one billion three dollars________________ | AmountOfMoney(AmountOfMoneyOutput { value: 1000000003.0, precision: Exact, unit: Some("$") }) |
+----+-------------+-----------+-------------------------------------------------------------------+-----------------------------------------------------------------------------------------------+
This issue was initially raised here: snipsco/snips-nlu#677
0.17.5
en
Q1 2018
Integer(IntegerOutput(2018))
Time(TimeOutput { moment: 2018-01-01T00:00:00+01:00, grain: Quarter, precision: Exact, latent: false })
Fr: depuis - à partir de
En: since - from
De: seit - ab
French: "depuis" combined with durations, but not with simple dates
"depuis le 5 janvier": KO
recognized as "le 5 janvier"
"depuis 3 mois": OK
TimeInterval(Between { start: 2018-07-24T00:00:00+02:00, end: 2018-10-24T12:40:43.829472+02:00, precision: Exact, latent: false })
French: "à partir de": OK
"à partir du 5 janvier": OK
TimeInterval(After(TimeOutput { moment: 2019-01-05T00:00:00+01:00, grain: Day, precision: Exact, latent: false }))
English: "from", "from...on": KO
"from october 13th": KO
"from october 13th on": KO
recognized as "october 13th"
German: "seit" combined with simple dates, but not with durations
"seit dem 11. januar" : OK
TimeInterval(After(TimeOutput { moment: 2018-01-11T00:00:00+01:00, grain: Day, precision: Exact, latent: false }))
"seit 3 monaten": KO
recognized as "3 monaten"
German: "ab" (= from, depuis): OK
"ab dem 11. Januar"
TimeInterval(After(TimeOutput { moment: 2018-01-11T00:00:00+01:00, grain: Day, precision: Exact, latent: false }))
Rustling should not interpret values like "2K" as datetime.
0.17.5
en
cargo run -- --lang en parse "2K" -k Time
Time(TimeOutput { moment: 2000-01-01T00:00:00+01:00, grain: Year, precision: Exact, latent: false })
Nothing
until + date => final boundary = date at 00:00:00 => INCORRECT
from + date1 + until + date2 => final boundary = date2 + 1 at 00:00:00 => CORRECT
context:
2019-03-28T10:00:00
until tomorrow
from yesterday until tomorrow
until the twenty-ninth
from yesterday until the twenty-ninth
until tomorrow | TimeInterval(Before(TimeOutput { moment: 2019-03-29T00:00:00+01:00, grain: Day, precision: Exact, latent: false }))
from yesterday until tomorrow | TimeInterval(Between { start: 2019-03-27T00:00:00+01:00, end: 2019-03-30T00:00:00+01:00, precision: Exact, latent: false })
until the twenty-ninth | TimeInterval(Before(TimeOutput { moment: 2019-03-29T00:00:00+01:00, grain: Day, precision: Exact, latent: true }))
from yesterday until the twenty-ninth | TimeInterval(Between { start: 2019-03-27T00:00:00+01:00, end: 2019-03-30T00:00:00+01:00, precision: Exact, latent: false })
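The rule the report asks for is that an inclusive "until <date>" should produce the same exclusive end boundary whether or not a "from" part is present. The sketch below only illustrates that rule. `Day`, `Interval`, `end_boundary`, `until` and `from_until` are hypothetical names, not rustling-ontology types, and days are plain integers instead of real datetimes:

```rust
/// Hypothetical day-grain value: days counted from an arbitrary origin.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Day(i64);

/// Hypothetical interval shapes, loosely mirroring the outputs above.
/// Each boundary stands for the start (00:00:00) of the given day.
#[derive(Debug, PartialEq)]
enum Interval {
    Before(Day),
    Between { start: Day, end: Day },
}

/// Exclusive end boundary for an inclusive "until <day>":
/// the start of the following day.
fn end_boundary(until: Day) -> Day {
    Day(until.0 + 1)
}

/// "until <day>" on its own.
fn until(day: Day) -> Interval {
    Interval::Before(end_boundary(day))
}

/// "from <start> until <day>", sharing the same end boundary rule.
fn from_until(start: Day, day: Day) -> Interval {
    Interval::Between { start, end: end_boundary(day) }
}

fn main() {
    // Day-of-month numbers in March 2019 used as stand-ins (context: the 28th).
    let yesterday = Day(27);
    let tomorrow = Day(29);
    println!("until tomorrow                -> {:?}", until(tomorrow));
    println!("from yesterday until tomorrow -> {:?}", from_until(yesterday, tomorrow));
}
```

Routing both forms through one `end_boundary` function is the design point: "until tomorrow" then ends at the start of the 30th, the same boundary the report marks as CORRECT for "from yesterday until tomorrow".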
The lib panics when given the input "31 2".
For similar inputs like "31 1" or "31 3", it works fine, interpreting the value as January 31st or March 31st. Thus it seems that the invalid February 31st causes the crash.
$ cargo run -- --lang fr parse "31 2"
Finished dev [unoptimized + debuginfo] target(s) in 0.15s
Running `/Users/Adrien/dev/test/rustling-ontology/target/debug/rustling-cli --lang fr parse '31 2'`
thread 'main' panicked at 'No such local time', /Users/Adrien/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.3.0/src/offset/mod.rs:151:34
note: Run with `RUST_BACKTRACE=1` for a backtrace.
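If the diagnosis above is right, one way to avoid the panic would be to reject impossible day/month pairs before any date is built. The following is a minimal sketch of such a guard using only the standard library. `days_in_month` and `checked_day_month` are hypothetical helpers, not functions from rustling-ontology or chrono:

```rust
/// Hypothetical helper: number of days in `month` (1-12) of `year`,
/// or None when the month number itself is invalid.
fn days_in_month(year: i32, month: u32) -> Option<u32> {
    let leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => Some(31),
        4 | 6 | 9 | 11 => Some(30),
        2 => Some(if leap { 29 } else { 28 }),
        _ => None,
    }
}

/// Returns Some((day, month)) only when that day exists in `year`.
fn checked_day_month(year: i32, day: u32, month: u32) -> Option<(u32, u32)> {
    let max = days_in_month(year, month)?;
    if day >= 1 && day <= max {
        Some((day, month))
    } else {
        None
    }
}

fn main() {
    for &(day, month) in [(31, 1), (31, 2), (31, 3)].iter() {
        match checked_day_month(2018, day, month) {
            Some(_) => println!("{} {} -> valid date", day, month),
            None => println!("{} {} -> rejected instead of panicking", day, month),
        }
    }
}
```

Within chrono itself, the non-panicking `ymd_opt` constructor on `TimeZone` plays a comparable role. Whether the grammar rules can be switched to it is an assumption that would need checking against the code.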
These rules don't seem to differentiate between midnight and noon:
"hour"
"hour and minutes"
"hour and minutes and seconds"
"next <day-of-week>" should return either the actual next one or the one after, depending on the current <day-of-week>
Example:
Date is Tuesday the 12th:
your.tag.number
EN
one two three
parser = RustlingParser(u"en")
parser.parse(u"one two three")
>>> [{'char_range': {'end': 13, 'start': 4},
'dim': 'Time',
'latent': False,
'value': {'grain': u'minute',
'latent': False,
'precision': u'exact',
'type': 'value',
'value': '2017-12-28 14:03:00 +01:00'}},
{'char_range': {'end': 3, 'start': 0},
'dim': 'Number',
'latent': False,
'value': {'type': 'value', 'value': 1}}]
We expect each number to be parsed separately as 1, 2 and 3.