
rustling-ontology's People

Contributors

adrienball, anaisaurus, clemdoum, hdlj, hubertdelajonquieresonos, hywan, jimregan, johannasimoens, kali, odiledevismessonos, proemke, rosastern, rosasternsonos, thadguidry

rustling-ontology's Issues

Wrong time-of-day in "last <date> <time-of-day interval>"

Initial issue reported: https://github.com/snipsco/next-release/issues/809

Parsing Error

"last wednesday between one thirty and three forty-five am" gives wrong resolution on the left side time-of-day

Version

platform v1.2, v1.3

Language

en - other languages may be impacted too

Parser input

last wednesday between one thirty and three forty-five am

Parser output

+----+------------+-------------+----------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ix | log(p)     | p           | text                                               | Output(OutputValue)                                                                                                                                                                                |
+====+============+=============+====================================================+====================================================================================================================================================================================================+
| 0  | -3.3397157 | 0.035447035 | last wednesday between two and three forty-five am | DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2019-10-09T00:00:00+02:00, end: 2019-10-09T03:45:00+02:00, precision: Exact, latent: false }, datetime_kind: Datetime }) |
+----+------------+-------------+----------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Parser expected output (Optional)

+----+------------+-------------+----------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ix | log(p)     | p           | text                                               | Output(OutputValue)                                                                                                                                                                                |
+====+============+=============+====================================================+====================================================================================================================================================================================================+
| 0  | -3.3397157 | 0.035447035 | last wednesday between two and three forty-five am | DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2019-10-09T01:30:00+02:00, end: 2019-10-09T03:45:00+02:00, precision: Exact, latent: false }, datetime_kind: Datetime }) |
+----+------------+-------------+----------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Intersection gives a value in the past

Parsing Error

Version

0.13.1

Language

zh

Parser input

二月十号周天

Parser output

+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+
| ix | text         | kind                                                                                 | rule      | childs                                 |
+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+
| 5  | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | intersect + named-day                  |
| 4  | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | intersect + named-day                  |
| 3  | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | <integer> month + <day-of-month> <name |
| 2  | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | <integer> month + intersect            |
| 1  | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | named-month + <day-of-month> <name     |
| 0  | 二月十号周天 | Time(TimeOutput { moment: 2013-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | named-month + intersect                |
+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+

Parser expected output (Optional)

+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+
| ix | text         | kind                                                                                 | rule      | childs                                 |
+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+
| 5  | 二月十号周天 | Time(TimeOutput { moment: 2019-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | intersect + named-day                  |
| 4  | 二月十号周天 | Time(TimeOutput { moment: 2019-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | intersect + named-day                  |
| 3  | 二月十号周天 | Time(TimeOutput { moment: 2019-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | <integer> month + <day-of-month> <name |
| 2  | 二月十号周天 | Time(TimeOutput { moment: 2019-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | <integer> month + intersect            |
| 1  | 二月十号周天 | Time(TimeOutput { moment: 2019-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | named-month + <day-of-month> <name     |
| 0  | 二月十号周天 | Time(TimeOutput { moment: 2019-02-10T00:00:00+01:00, grain: Day, precision: Exact }) | intersect | named-month + intersect                |
+----+--------------+--------------------------------------------------------------------------------------+-----------+----------------------------------------+

Rules applied

周天: <named-day> -> helpers::day_of_week(Weekday::Sun)
十号: <day-of-month> -> helpers::day_of_month(integer.value().value as u32)
二月: <named-month> -> helpers::month(2)
十号周天: <day-of-month> <named-day> -> a.value().intersect(&b.value()) -> some
二月十号周天: <time> <time> -> |a, b| a.value().intersect(b.value())
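
The expected resolution can be sketched as a search for the first future year in which the month/day pair falls on the requested weekday. This is only an illustration, not Rustling's intersect implementation: the function names are hypothetical, and the reference date 2013-02-12 used in the usage note is an assumption suggested by the 2013 output above. The sketch also assumes the month/day pair exists in every year (so not February 29).

```rust
/// Day of week via Sakamoto's method: 0 = Sunday, ..., 6 = Saturday.
fn weekday(year: i32, month: u32, day: u32) -> u32 {
    const OFFSETS: [i32; 12] = [0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4];
    // January and February are counted as part of the previous year.
    let y = if month < 3 { year - 1 } else { year };
    let sum = y + y / 4 - y / 100 + y / 400 + OFFSETS[(month - 1) as usize] + day as i32;
    sum.rem_euclid(7) as u32
}

/// First year, at or after the reference date, in which `month`/`day`
/// falls on `wanted_weekday`. The search is bounded to 28 years.
/// (Hypothetical helper, not part of Rustling.)
fn next_future_year(
    reference: (i32, u32, u32),
    month: u32,
    day: u32,
    wanted_weekday: u32,
) -> Option<i32> {
    let (ref_year, ref_month, ref_day) = reference;
    (ref_year..ref_year + 28).find(|&year| {
        // Tuples compare lexicographically, so this is a date comparison.
        let not_in_past = (year, month, day) >= (ref_year, ref_month, ref_day);
        not_in_past && weekday(year, month, day) == wanted_weekday
    })
}
```

With the assumed reference date, `next_future_year((2013, 2, 12), 2, 10, 0)` skips 2013-02-10 because it lies in the past and returns `Some(2019)`, the year shown in the expected output.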

Exclude quantifiers from JA number

Parsing Error

Version

0.18.0

Language

ja

Parser input

二十足

Parser output

BuiltinEntity { value: "二十足", range: 0..3, entity: Number(NumberValue { value: 20.0 }), entity_kind: Number }

Parser expected output (Optional)

BuiltinEntity { value: "二十", range: 0..2, entity: Number(NumberValue { value: 20.0 }), entity_kind: Number }

We expect Rustling not to include the quantifier in the number.
It seems that users exclude quantifiers from tagging (or are told to exclude them). This creates inconsistencies between what the user tags and what Rustling recognizes, which leaves the CRF without builtin entity match features and makes it fail.

cc @mayukauenogayer
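
A minimal sketch of the requested behaviour, assuming the range in the output above is counted in characters and that a trailing counter can be detected with a simple lookup. The counter list and the function name are hypothetical and far from complete.

```rust
use std::ops::Range;

/// Hypothetical, deliberately tiny list of Japanese counters (quantifiers).
const COUNTERS: [char; 4] = ['足', '個', '本', '枚'];

/// Returns the number text and its character range with a trailing
/// counter removed, if one is present.
fn strip_counter(matched: &str) -> (String, Range<usize>) {
    let chars: Vec<char> = matched.chars().collect();
    let mut end = chars.len();
    // Only strip when something is left over to serve as the number.
    if end > 1 && COUNTERS.contains(&chars[end - 1]) {
        end -= 1;
    }
    (chars[..end].iter().collect(), 0..end)
}
```

For the input above, `strip_counter("二十足")` yields the text "二十" with range `0..2`, which is the value and range given in the expected output.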

Wrong computation in resolution of decimal numbers > 0 && < 1

Parsing Error

Version

v0.17.6

Language

all

Parser input

zéro virgule quatre-vingt-cinq

Parser output

+----+--------+---+--------------------------------+--------------------------------+
| ix | log(p) | p | text                           | value                          |
+====+========+===+================================+================================+
| 0  | 0      | 1 | zéro virgule quatre-vingt-cinq | Float(FloatOutput(0.84999996)) |
+----+--------+---+--------------------------------+--------------------------------+

Parser expected output (Optional)

+----+--------+---+--------------------------------+--------------------------------+
| ix | log(p) | p | text                           | value                          |
+====+========+===+================================+================================+
| 0  | 0      | 1 | zéro virgule quatre-vingt-cinq | Float(FloatOutput(0.85))       |
+----+--------+---+--------------------------------+--------------------------------+
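
One way such a value can arise (an assumption; the issue does not say how Rustling computes the fraction) is multiplying the fractional digits by an inexact single-precision constant such as 0.01. The sketch below contrasts that with a single division, which IEEE 754 rounds correctly. Both function names are hypothetical.

```rust
/// Two-digit fractional part computed by multiplying with 0.01,
/// a constant that is not exactly representable in f32.
fn hundredths_by_multiplication(n: u32) -> f32 {
    n as f32 * 0.01_f32
}

/// The same fractional part computed with one division. The exact
/// quotient is rounded once, so the result is the f32 nearest to n/100.
fn hundredths_by_division(n: u32) -> f32 {
    n as f32 / 100.0_f32
}
```

With `n = 85`, the multiplication prints as 0.84999996, while the division gives the same value as the literal `0.85_f32`.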

TimeInterval resolution issue

Parsing Error

Parsing "from 8am on august 3rd to 10pm on november 3rd 2019" resolves the start ("from") date to the wrong year (2008 instead of 2019).
The resolution is correct without the trailing "2019".

Version

0.19.0

Language

en

Parser input

from 8am on august 3rd to 10pm on november 3rd 2019

Parser output

DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2008-08-03T08:00:00+02:00, end: 2019-11-03T22:00:00+01:00, precision: Exact, latent: false }, datetime_kind: DatePeriod })

Parser expected output (Optional)

DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2019-08-03T08:00:00+02:00, end: 2019-11-03T22:00:00+01:00, precision: Exact, latent: false }, datetime_kind: DatePeriod })
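
The expected behaviour amounts to letting the start of the interval inherit the year stated on the end. A minimal sketch, assuming a simplified date type with an optional year; `PartialDate` and `resolve_start_year` are hypothetical names, not Rustling types.

```rust
/// Simplified date whose year may be left unspecified (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct PartialDate {
    year: Option<i32>,
    month: u32,
    day: u32,
}

/// Year to use for the interval start when only the end carries one.
fn resolve_start_year(start: PartialDate, end: PartialDate) -> Option<i32> {
    match (start.year, end.year) {
        // An explicit year on the start always wins.
        (Some(year), _) => Some(year),
        (None, Some(end_year)) => {
            // A start that falls later in the calendar than the end
            // must belong to the previous year (e.g. Dec 20 to Jan 5).
            if (start.month, start.day) > (end.month, end.day) {
                Some(end_year - 1)
            } else {
                Some(end_year)
            }
        }
        (None, None) => None,
    }
}
```

For "august 3rd to november 3rd 2019" this returns `Some(2019)` for the start, in line with the expected output.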

"Two months after summer" should begin at the end of summer

Parsing Error

Version

0.15.3

Language

en

Parser input

cargo run -- --lang en parse "two months after summer"

Parser output

+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+
| ix | text                    | kind                                                                                                                        | rule                         | childs                                 |
+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+
| 0  | two months after summer | Time(TimeOutput { moment: 2013-08-21T00:00:00+02:00, grain: Day, precision: Exact, latent: false })                         | <duration> after <time>      | <integer> <unit-of-d + after + season  |
+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+

Parser expected output (Optional)

+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+
| ix | text                    | kind                                                                                                                        | rule                         | childs                                 |
+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+
| 0  | two months after summer | Time(TimeOutput { moment: 2013-11-23T00:00:00+02:00, grain: Day, precision: Exact, latent: false })                         | <duration> after <time>      | <integer> <unit-of-d + after + season  |
+----+-------------------------+-----------------------------------------------------------------------------------------------------------------------------+------------------------------+----------------------------------------+
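
The difference between the two outputs is the anchor used for the offset: the start of the season versus its end. A sketch under stated assumptions: the summer boundaries (June 21 to September 23) are inferred from the two outputs above, and `Interval`, `add_months` and `months_after` are hypothetical names rather than Rustling helpers.

```rust
type Date = (i32, u32, u32); // (year, month, day)

fn days_in_month(year: i32, month: u32) -> u32 {
    match month {
        4 | 6 | 9 | 11 => 30,
        2 => {
            if (year % 4 == 0 && year % 100 != 0) || year % 400 == 0 { 29 } else { 28 }
        }
        _ => 31,
    }
}

/// Adds calendar months, clamping the day to the length of the target month.
fn add_months(date: Date, months: u32) -> Date {
    let (year, month, day) = date;
    let zero_based = (month - 1) + months;
    let new_year = year + (zero_based / 12) as i32;
    let new_month = zero_based % 12 + 1;
    (new_year, new_month, day.min(days_in_month(new_year, new_month)))
}

struct Interval {
    start: Date,
    end: Date,
}

/// "<duration> after <interval>" anchored at the END of the interval.
fn months_after(interval: &Interval, months: u32) -> Date {
    add_months(interval.end, months)
}
```

With `Interval { start: (2013, 6, 21), end: (2013, 9, 23) }`, `months_after(&summer, 2)` gives `(2013, 11, 23)`, whereas `add_months(summer.start, 2)` gives the reported `(2013, 8, 21)`.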

Inconsistent behaviour in helpers time interval

In the helpers provided for writing a grammar, the functions smart_span_to and span_to should behave consistently. All ad hoc behaviour should live in smart_span_to, and span_to should contain only the classic behaviour.

Parsing error: decimals with leading 0

Parsing Error

Version

v0.13.1

Language

EN

Parser input

./rustling-cli -l "en" parse -k "Number" "twelve point zero zero two"

Parser output

+----+--------+---+----------------------------+---------------------------+
| ix | log(p) | p | text                       | value                     |
+====+========+===+============================+===========================+
| 2  | 0      | 1 | _______________________two | Integer(IntegerOutput(2)) |
+----+--------+---+----------------------------+---------------------------+
| 1  | 0      | 1 | __________________zero____ | Integer(IntegerOutput(0)) |
+----+--------+---+----------------------------+---------------------------+
| 0  | 0      | 1 | twelve point zero_________ | Float(FloatOutput(12))    |
+----+--------+---+----------------------------+---------------------------+

Parser expected output (Optional)

+----+--------+---+----------------------------+---------------------------+
| ix | log(p) | p | text                       | value                     |
+====+========+===+============================+===========================+
| 0  | 0      | 1 | twelve point zero zero two | Float(FloatOutput(12.002))|
+----+--------+---+----------------------------+---------------------------+
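
A minimal sketch of one way to keep leading zeros: treat everything spoken after "point" as a sequence of digits and assemble the decimal textually before converting it. The function name is hypothetical and this is not how the Rustling rule is written.

```rust
/// Builds a decimal from an integer part and the digits spoken after
/// "point". Leading zeros survive because the digits are concatenated
/// as text before the single conversion to f64.
fn compose_decimal(integer_part: u64, fraction_digits: &[u8]) -> Option<f64> {
    if fraction_digits.is_empty() || fraction_digits.iter().any(|&d| d > 9) {
        return None;
    }
    let fraction: String = fraction_digits.iter().map(|d| d.to_string()).collect();
    format!("{}.{}", integer_part, fraction).parse().ok()
}
```

For "twelve point zero zero two", `compose_decimal(12, &[0, 0, 2])` returns `Some(12.002)`.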

Wrong dates for "winter" if year is provided

Forwarding the issue snipsco/snips-nlu#833.

Hi,

Depending on how a question is asked, the timeframe identified for winter varies:

  • "in winter":
"value": {
        "from": "2019-12-21 00:00:00 +01:00",
        "kind": "TimeInterval",
        "to": "2020-03-21 00:00:00 +01:00"
}
  • "in winter 2019" (dates in 2020 are removed)
"value": {
        "from": "2019-12-21 00:00:00 +01:00",
        "kind": "TimeInterval",
        "to": "2020-01-01 00:00:00 +01:00"
}

Best,
Joffrey
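
The second result looks like what an intersection of the winter interval with calendar year 2019 would produce. That is an assumption about the cause, not something stated in the report. The sketch below reproduces the truncation and shows the alternative of naming a winter by the year it starts in; both function names are hypothetical.

```rust
type Date = (i32, u32, u32); // (year, month, day)

/// Winter named by the year it starts in, using the boundaries
/// that appear in the report (December 21 to March 21).
fn winter(year: i32) -> (Date, Date) {
    ((year, 12, 21), (year + 1, 3, 21))
}

/// Intersection with the calendar year [Jan 1 of year, Jan 1 of year + 1).
/// Applied to a winter, this cuts off everything after New Year.
fn clip_to_year(interval: (Date, Date), year: i32) -> Option<(Date, Date)> {
    let start = interval.0.max((year, 1, 1));
    let end = interval.1.min((year + 1, 1, 1));
    if start < end { Some((start, end)) } else { None }
}
```

`winter(2019)` spans 2019-12-21 to 2020-03-21, matching the "in winter" result, while `clip_to_year(winter(2019), 2019)` ends at 2020-01-01, matching the "in winter 2019" result.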

"cinq cents" > "cinq"

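Issue observed in the Carrefour demo.
ASR output: cinq cents grammes de fraises (five hundred grams of strawberries)
Slots: cinq, grammes, fraises (the number slot keeps only "cinq", five, instead of "cinq cents", five hundred)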

Resolve am/pm in elliptic left side of time-of-day interval

Initial issue reported: https://github.com/snipsco/next-release/issues/808

Parsing Error

The am/pm specifier on the right side of a time-of-day interval is not applied to the elliptic left side.

Version

platform v1.2

Language

en - other languages may be impacted too

Parser input

between 2 and 3pm

Parser output

+----+-----------+------------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ix | log(p)    | p          | text                     | Output(OutputValue)                                                                                                                                                                                  |
+====+===========+============+==========================+======================================================================================================================================================================================================+
| 0  | -1.146101 | 0.31787375 | between two and three pm | DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2019-10-11T02:00:00+02:00, end: 2019-10-11T15:00:00+02:00, precision: Exact, latent: false }, datetime_kind: TimePeriod }) |
+----+-----------+------------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Parser expected output (Optional)

+----+-----------+------------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ix | log(p)    | p          | text                     | Output(OutputValue)                                                                                                                                                                                  |
+====+===========+============+==========================+======================================================================================================================================================================================================+
| 0  | -1.146101 | 0.31787375 | between two and three pm | DatetimeInterval(DatetimeIntervalOutput { interval_kind: Between { start: 2019-10-11T14:00:00+02:00, end: 2019-10-11T15:00:00+02:00, precision: Exact, latent: false }, datetime_kind: TimePeriod }) |
+----+-----------+------------+--------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
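
The expected resolution can be sketched as propagating the right-hand am/pm to the left-hand hour whenever that keeps the start before the end. This is an illustrative heuristic with hypothetical names, not the rule Rustling uses, and it ignores minutes and intervals that cross midnight.

```rust
/// Converts a 12-hour clock value to 24-hour time (12am -> 0, 12pm -> 12).
fn to_24h(hour12: u32, pm: bool) -> u32 {
    hour12 % 12 + if pm { 12 } else { 0 }
}

/// Resolves "between <left> and <right> am/pm" where only the right
/// side carries the specifier. Returns (start, end) as 24-hour values.
fn resolve_interval(left_hour12: u32, right_hour12: u32, right_pm: bool) -> (u32, u32) {
    let end = to_24h(right_hour12, right_pm);
    // First try the same half of the day as the right side.
    let same_half = to_24h(left_hour12, right_pm);
    let start = if same_half <= end {
        same_half
    } else {
        // Otherwise fall back to the other half ("between 11 and 1pm").
        to_24h(left_hour12, !right_pm)
    };
    (start, end)
}
```

For "between 2 and 3pm", `resolve_interval(2, 3, true)` returns `(14, 15)`, the hours in the expected output; "between 11 and 1pm" resolves to `(11, 13)`.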

[EN] Improve grammar snips/datetime for better sync with generative grammars

Parsing Error

Version

0.14.0

Language

en

Not parsed queries

  • on thursday march thirteen
  • twenty hours and an half hour from now
  • on monday november the first
  • fifteen hours and an half hour from now
  • on august seventeenth two thousand and thirty-nine (year too big)
  • at oneish
  • on thursday june thirty-one (impossible date)
  • at twoish
  • at tenish
  • on wednesday july the fourteenth
  • on saturday september eleventh
  • on saturday february five
  • nine weeks and a half from now
  • on monday november fifteen
  • on sunday december twelfth
  • twelve hours and an half hour from now
  • on wednesday january twelve
  • at dinner time
  • two hours and an half hour from now
  • on sunday march thirty
  • around tea time
  • on monday august thirtieth
  • on wednesday january nineteen
  • three and a half months from now
  • since the end of the week
  • on thursday october twenty-eight
  • on monday march the seventeenth
  • four years and a half from now
  • by the end of the week
  • on saturday march eight
  • on saturday december twenty-five
  • nineteen hours and an half hour from now
  • on august the seventeenth two thousand and forty (year too big)
  • on saturday july twenty-four
  • on thursday september the thirtieth
  • on october the fourteenth two thousand and forty (year too big)
  • before the end of the week
  • during work
  • eleven weeks and a half from now
  • at twelveish
  • on friday november nineteen
  • one week and a half from now
  • on friday november thirty-one (impossible)
  • on tuesday september seven
  • on november twenty-third two thousand and forty (impossible)
  • on monday august sixteenth
  • on tuesday july sixth
  • on tuesday february the eighth
  • one hour and an half hour from now
  • on monday june thirty-one (impossible)
  • eleven hours and an half hour from now
  • on sunday june twenty
  • twelve years and a half from now
  • on tuesday march the eleventh
  • eleven and a half months from now
  • on friday september tenth
  • on june the sixth two thousand and thirty-nine (year too big)
  • on june tenth two thousand and forty (year too big)
  • on tuesday october five
  • on april the twenty-fifth two thousand and forty (year too big)
  • on tuesday march four
  • on may fourth two thousand and forty (year too big)
  • until the end of the week
  • three weeks and a half from now
  • on saturday october thirtieth
  • six years and a half from now
  • on tuesday december twenty-one
  • on thursday april the twenty-second
  • sixteen hours and an half hour from now
  • on friday july two
  • on sunday october thirty-one
  • eighteen hours and an half hour from now
  • on sunday february thirteenth
  • on sunday march the second
  • on sunday april thirty-one
  • on may seventh two thousand and thirty-nine
  • twenty-six days and a half from now
  • on february the twenty-ninth two thousand and thirty-nine
  • eleven years and a half from now
  • on january the eighteenth two thousand and forty
  • on wednesday july twenty-eight
  • on sunday may the twenty-third
  • on wednesday april the second
  • on thursday january twenty
  • on tuesday december the twenty-eighth
  • four weeks and a half from now
  • on thursday january twenty-seventh
  • yesterday for lunch
  • on october the twenty-eighth two thousand and forty
  • on sunday march sixteen
  • on friday september ten
  • ten weeks and a half from now
  • at fourish
  • on monday september the twentieth
  • two and a half months from now
  • on saturday august the seventh
  • on thursday june the tenth
  • thirteen hours and an half hour from now
  • on saturday may eight
  • on sunday june thirteen
  • one day and a half from now
  • on may the twenty-sixth two thousand and forty
  • on saturday march the eighth
  • around meal time
  • on saturday march twenty-two
  • on tuesday august three
  • on thursday january the twenty-seventh
  • for first sunday of advent
  • on september sixteenth two thousand and forty
  • on september the thirtieth two thousand and thirty-nine
  • on october the twenty-ninth two thousand and thirty-nine
  • on saturday october two
  • on august twenty-ninth two thousand and thirty-nine
  • nine years and a half from now
  • on saturday october thirty
  • on monday january three
  • seven weeks and a half from now
  • one year and a half from now
  • on april the twenty-ninth two thousand and forty (year too big)
  • on wednesday october twentieth
  • on thursday august the twelfth
  • eight weeks and a half from now
  • on tuesday december fourteen
  • nine hours and an half hour from now
  • on thursday december two
  • six hours and an half hour from now
  • at eightish
  • on march the tenth two thousand and forty (year too big)
  • on august seventh two thousand and forty (year too big)
  • ten hours and an half hour from now
  • on friday march fourteenth
  • on november the nineteenth two thousand and thirty-nine (year too big)
  • on february thirtieth two thousand and twenty-two
  • at tea time
  • around dinner time
  • around supper time
  • on july twenty-seventh two thousand and thirty-nine (year too big)
  • on friday january seven
  • on february first two thousand and forty (year too big)
  • before work
  • on sunday september twenty-six
  • two years and a half from now
  • on saturday february twenty-six
  • twenty-three hours and an half hour from now
  • on sunday august the first
  • at supper time
  • at breakfast time
  • on january sixth two thousand and forty (year too big)
  • at sunrise
  • on sunday july four
  • five hours and an half hour from now
  • seven hours and an half hour from now
  • on friday december three
  • seven years and a half from now
  • on tuesday june one
  • on february the fourth two thousand and forty
  • on may the third two thousand and thirty-nine (year too big)
  • at brunch time
  • at fiveish
  • four hours and an half hour from now
  • on thursday january twenty-seven
  • on sunday november thirty-one
  • on tuesday february twenty-two
  • on friday april twenty-three
  • on thursday january thirteen
  • on november twenty-fourth two thousand and thirty-nine (year too big)
  • on wednesday march the fifth
  • on wednesday april nine
  • twenty-one hours and an half hour from now
  • seventeen hours and an half hour from now
  • five years and a half from now
  • on monday august thirty
  • on february the twenty-ninth two thousand and thirty-one (impossible)
  • twenty-two hours and an half hour from now
  • on monday march the third
  • after the end of the week
  • on saturday december fourth
  • on september the twentieth two thousand and thirty-nine (year too big)
  • on february thirtieth twenty seventeen
  • nine and a half months from now
  • on sunday june the sixth
  • on august the twenty-fifth two thousand and forty (year too big)
  • on tuesday march the fourth
  • on tuesday november ninth
  • at meal time
  • at threeish
  • at sixish
  • on sunday november the twenty-eighth
  • at nineish
  • around breakfast time
  • twelve days and a half from now
  • on april twenty-eighth two thousand and thirty-nine (year too big)
  • on wednesday march twelve
  • five weeks and a half from now
  • at elevenish
  • at sunset
  • on wednesday january the nineteenth
  • on august the fourth two thousand and thirty-nine (year too big)
  • around brunch time
  • at sevenish
  • at midday
  • on september seventeenth two thousand and thirty-eight from one thirteen am to five twenty-three pm
  • on december sixteenth at eleven minutes past six
  • around fifteen twenty-three on june tenth twenty eighteen
  • from sevenish to tenish
  • at lunch time on thursday march the seventh
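
As an illustration of one family in this list, the "<hour>ish" forms could be handled by stripping the suffix and marking the result as approximate. The sketch below is an assumption about a possible approach; the `Precision` enum here is a local stand-in modelled on the `precision` field visible in parser outputs, and `parse_hour_word` is a hypothetical name.

```rust
/// Local stand-in for the precision attached to a parsed time.
#[derive(Debug, PartialEq)]
enum Precision {
    Exact,
    Approximate,
}

const HOUR_WORDS: [&str; 12] = [
    "one", "two", "three", "four", "five", "six",
    "seven", "eight", "nine", "ten", "eleven", "twelve",
];

/// Maps "two" to (2, Exact) and "twoish" to (2, Approximate).
fn parse_hour_word(token: &str) -> Option<(u32, Precision)> {
    let (stem, precision) = match token.strip_suffix("ish") {
        Some(stem) => (stem, Precision::Approximate),
        None => (token, Precision::Exact),
    };
    HOUR_WORDS
        .iter()
        .position(|&word| word == stem)
        .map(|index| (index as u32 + 1, precision))
}
```

All twelve "-ish" forms in the list ("oneish" through "twelveish") are plain concatenations of an hour word and the suffix, so a lookup of this kind covers them.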

[FR] Improve grammar snips/datetime for better sync with generative grammars

Version

0.14.0

Language

fr

Not parsed queries

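  • à la fin de la journée (at the end of the day)
  • pour l'assomption (for Assumption)
  • à l'heure du café (at coffee time)
  • à dix heures de l'après-midi (at ten in the afternoon)
  • à l'heure du souper (at supper time)
  • cinquante-cinq heures quatorze minutes plus tard (fifty-five hours fourteen minutes later)
  • pour la fête des mères (for Mother's Day)
  • à l'heure du brunch (at brunch time)
  • pour la fête des pères (for Father's Day)
  • au moment du thé (at tea time)
  • à la tombée de la nuit (at nightfall)
  • à dix-huit heures passées de vingt-quatre de l'après-midi (at twenty-four past eighteen in the afternoon)
  • à douze heures passées de cinquante de l'après-midi (at fifty past twelve in the afternoon)
  • quarante heures neuf minutes plus tard (forty hours nine minutes later)
  • une heure trente minutes plus tard (one hour thirty minutes later)
  • à la tombée du jour (at dusk)
  • au moment du petit-déjeuner (at breakfast time)
  • au levé du soleil (at sunrise)
  • pour la veille de noël (for Christmas Eve)
  • à sept heures passées de une de l'après-midi (at one past seven in the afternoon)
  • le dernier week-end du mois de décembre (the last weekend of December)
  • cette nuit (tonight)
  • au levé du soleil (at sunrise)
  • le dernier week-end du mois de janvier (the last weekend of January)
  • pour pâques (for Easter)
  • pendant le travail (during work)
  • à treize heures passées de seize minutes (at sixteen minutes past thirteen)
  • pour l'armistice (for Armistice Day)
  • au moment du dîner (at dinner time)
  • à vingt-trois heures passées de huit minutes (at eight minutes past twenty-three)
  • au crépuscule (at twilight)
  • avant le travail (before work)
  • pour la fête nationale (for the national holiday)
  • à l'heure du goûter (at afternoon snack time)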

Consistent interval handling

In all languages, review handling of intervals and their resolution. This should be consistent across languages, taking into account various differences between languages such as "since" (en) vs. "depuis" (fr) and semantic differences for resolution e.g. "since" (in the past) ≠ "from" (next occurrence).
Consistent default behaviours should also be reviewed and fixed if needed for start and end resolutions depending on start/end grain (e.g. "after x" = "after the end boundary of x").

Dates with slashes not recognized

Parsing Error

Simple dates with slashes like 2018/09/30 or 30/09/2018 are not parsed correctly.

Version

0.17.7

Language 1

All languages

Parser input 1

2018/09/30

Parser output 1

+----+--------------+-----------+------------+------------------------------+
| ix | log(p)       | p         | text       | value                        |
+====+==============+===========+============+==============================+
| 2  | -0.072079904 | 0.9304565 | _____09___ | Integer(IntegerOutput(9))    |
+----+--------------+-----------+------------+------------------------------+
| 1  | -0.072079904 | 0.9304565 | ________30 | Integer(IntegerOutput(30))   |
+----+--------------+-----------+------------+------------------------------+
| 0  | -0.072079904 | 0.9304565 | 2018______ | Integer(IntegerOutput(2018)) |
+----+--------------+-----------+------------+------------------------------+

Parser expected output 1

+----+--------+---+------------+-----------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text       | value                                                                                               |
+====+========+===+============+=====================================================================================================+
| 0  | 0      | 1 | 2018/09/30 | Time(TimeOutput { moment: 2018-09-30T00:00:00+02:00, grain: Day, precision: Exact, latent: false }) |
+----+--------+---+------------+-----------------------------------------------------------------------------------------------------+

Language 2

en, ja, ko

Parser input 2

30/09/2018

Parser output 2

+----+--------------+-----------+------------+------------------------------+
| ix | log(p)       | p         | text       | value                        |
+====+==============+===========+============+==============================+
| 2  | -0.072079904 | 0.9304565 | 30________ | Integer(IntegerOutput(30))   |
+----+--------------+-----------+------------+------------------------------+
| 1  | -0.072079904 | 0.9304565 | ___09_____ | Integer(IntegerOutput(9))    |
+----+--------------+-----------+------------+------------------------------+
| 0  | -0.072079904 | 0.9304565 | ______2018 | Integer(IntegerOutput(2018)) |
+----+--------------+-----------+------------+------------------------------+

Parser expected output 2

+----+--------+---+------------+-----------------------------------------------------------------------------------------------------+
| ix | log(p) | p | text       | value                                                                                               |
+====+========+===+============+=====================================================================================================+
| 0  | 0      | 1 | 30/09/2018 | Time(TimeOutput { moment: 2018-09-30T00:00:00+02:00, grain: Day, precision: Exact, latent: false }) |
+----+--------+---+------------+-----------------------------------------------------------------------------------------------------+
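
Both inputs can be recognized with a small amount of string handling. The sketch below is a hedged illustration, not a Rustling rule: the function name is hypothetical, it accepts only a four-digit year in first or last position, and it assumes day-first order for the second form, as in the example above. Per-month day limits are not checked.

```rust
/// Parses "YYYY/MM/DD" or "DD/MM/YYYY" into (year, month, day).
fn parse_slash_date(text: &str) -> Option<(i32, u32, u32)> {
    let parts: Vec<&str> = text.split('/').collect();
    let all_digits = |p: &&str| !p.is_empty() && p.bytes().all(|b| b.is_ascii_digit());
    if parts.len() != 3 || !parts.iter().all(all_digits) {
        return None;
    }
    // The position of the four-digit field decides the order.
    let (year, month, day) = if parts[0].len() == 4 {
        (parts[0], parts[1], parts[2])
    } else if parts[2].len() == 4 {
        (parts[2], parts[1], parts[0])
    } else {
        return None;
    };
    let year: i32 = year.parse().ok()?;
    let month: u32 = month.parse().ok()?;
    let day: u32 = day.parse().ok()?;
    // Coarse range check only.
    if (1..=12).contains(&month) && (1..=31).contains(&day) {
        Some((year, month, day))
    } else {
        None
    }
}
```

Both `parse_slash_date("2018/09/30")` and `parse_slash_date("30/09/2018")` return `Some((2018, 9, 30))`.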

Year not identified for dates before the Unix epoch (January 1 1970)

Parsing Error

Rustling doesn't identify the year in dates before the Unix epoch (January 1 1970).

This is similar to #102 and is the same problem I described in my comment there; I am opening a new issue because I found the exact date at which it goes wrong.

Version

0.17.7

Language

en

Parser input

december 31 1969

Parser output

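+----+--------------+------------+------------------+-----------------------------------------------------------------------------------------------------+
| ix | log(p)       | p          | text             | value                                                                                               |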
+====+==============+============+==================+=====================================================================================================+
| 1  | -0.072079904 | 0.9304565  | ____________1969 | Integer(IntegerOutput(1969))                                                                        |
+----+--------------+------------+------------------+-----------------------------------------------------------------------------------------------------+
| 0  | -0.17216337  | 0.84184164 | december 31_____ | Time(TimeOutput { moment: 2019-12-31T00:00:00+01:00, grain: Day, precision: Exact, latent: false }) |
+----+--------------+------------+------------------+-----------------------------------------------------------------------------------------------------+

Parser expected output

+----+------------+-----------+------------------+-----------------------------------------------------------------------------------------------------+
| ix | log(p)     | p         | text             | value                                                                                               |
+====+============+===========+==================+=====================================================================================================+
| 0  | -0.4431368 | 0.6420194 | december 31 1969 | Time(TimeOutput { moment: 1969-12-31T00:00:00+01:00, grain: Day, precision: Exact, latent: false }) |
+----+------------+-----------+------------------+-----------------------------------------------------------------------------------------------------+
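The issue does not identify the cause. As a neutral illustration only, the sketch below shows that dates before the Unix epoch are representable when the day count is signed. It uses the well-known civil-calendar day-count arithmetic; `days_from_civil` is a hypothetical helper, not a function from rustling-ontology or chrono.

```rust
/// Days between 1970-01-01 and the given proleptic Gregorian date.
/// The result is negative for dates before the Unix epoch.
/// Hypothetical helper for illustration; not part of rustling-ontology.
fn days_from_civil(year: i64, month: u32, day: u32) -> i64 {
    // Treat January and February as months 11 and 12 of the previous year,
    // so that the leap day falls at the end of the shifted year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400); // 400-year cycle index
    let yoe = y.rem_euclid(400); // year of era, in [0, 399]
    let mp = (month as i64 + 9) % 12; // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + day as i64 - 1; // day of shifted year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468 // shift so that 1970-01-01 maps to 0
}
```

With this helper, `days_from_civil(1969, 12, 31)` is `-1` and `days_from_civil(1970, 1, 1)` is `0`, so the date in this report sits exactly one day before the epoch.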

Algorithms to be optimised

The algorithm can be improved for these entities:

  • am freitag den dreißigsten juni um mittag (on friday june the thirtieth at noon)
  • am sonntag den neunten mai zweitausendzweiunddreißig von ein uhr bis zwei uhr morgens (on sunday may the ninth 2032 from one to two in the morning)
  • am montag den fünften juli zweitausendzweiunddreißig um ein uhr eins (on monday july the fifth 2032 at one minute past one)
  • ein uhr morgens (one in the morning)

[Chinese] Grammar Issues

    b.rule_1_terminal("named-day",
                      b.reg(r#"星期日|星期天|礼拜天|周日|禮拜天|週日|禮拜日"#)?,
                      |_| helpers::day_of_week(Weekday::Sun)
    );

周天 is missing

    b.rule_1_terminal("hundred",
                      b.reg(r#"百|仟"#)?,
                      |_| IntegerValue::new_with_grain(100, 2)
    );

    b.rule_1_terminal("thousand",
                      b.reg(r#"千|佰"#)?,
                      |_| IntegerValue::new_with_grain(1000, 3)
    );

The characters are swapped between the two rules. The correct grouping is:

百|佰: hundred
千|仟: thousand
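A minimal sketch of the intended mapping, written without the rustling rule builder so that it stands alone. The function name `magnitude` is hypothetical; the `(value, grain)` pairs mirror the arguments passed to `IntegerValue::new_with_grain` in the rules above.

```rust
/// Maps a Chinese magnitude character to a (value, grain) pair.
/// Hypothetical helper for illustration; not part of rustling-ontology.
fn magnitude(token: &str) -> Option<(i64, u8)> {
    match token {
        // 百 and its formal (financial) variant 佰 both mean one hundred.
        "百" | "佰" => Some((100, 2)),
        // 千 and its formal (financial) variant 仟 both mean one thousand.
        "千" | "仟" => Some((1000, 3)),
        _ => None,
    }
}
```

In the grammar itself, the equivalent change would be to swap 仟 and 佰 between the two regexes.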

Seasons not supported:

春(天|季)? : spring
夏(天|季)?: summer
秋(天|季)?: fall
冬(天|季)?: winter
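The four patterns above could be recognised along these lines. This is a standalone sketch under the assumption that the token has already been isolated; `Season` and `parse_season` are hypothetical names, and a real rule would go through the rustling builder and return a time value instead.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Season {
    Spring,
    Summer,
    Fall,
    Winter,
}

/// Recognises 春/夏/秋/冬 with an optional 天 or 季 suffix.
/// Hypothetical helper for illustration; not part of rustling-ontology.
fn parse_season(token: &str) -> Option<Season> {
    // Drop the optional suffix, mirroring the `(天|季)?` group.
    let stem = token
        .strip_suffix('天')
        .or_else(|| token.strip_suffix('季'))
        .unwrap_or(token);
    match stem {
        "春" => Some(Season::Spring),
        "夏" => Some(Season::Summer),
        "秋" => Some(Season::Fall),
        "冬" => Some(Season::Winter),
        _ => None,
    }
}
```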

    b.rule_1_terminal("afternoon",
                      b.reg(r#"下午|中午|晏晝"#)?,
                      |_| {
                          Ok(helpers::hour(12, false)?
                              .span_to(&helpers::hour(19, false)?, false)?
                              .latent()
                              .form(Form::PartOfDay))
                      }
    );

only 下午 is right

    b.rule_1_terminal("morning",
                      b.reg(r#"早上|早晨|朝頭?早"#)?,
                      |_| {
                          Ok(helpers::hour(4, false)?
                              .span_to(&helpers::hour(12, false)?, false)?
                              .latent()
                              .form(Form::PartOfDay))
                      }
    );

early morning: 早上
朝頭?早 makes no sense at first sight

Generative grammars

Hi there!

I'm wondering whether the parser grammars can be used in a reverse way, as generative grammars (e.g. to generate text sequences which are being parsed by these grammars). Is this a supported feature? If not, do you think it is viable to implement it (and could you suggest where to begin)?

Thanks.

Datetime granularity should be optional

Parsing Error

Version

On the current platform

Language

en

Parser input

"in 8 hours"

Parser output

This gives a TimeInterval, but the timestamp is now + 8h give or take the granularity (for instance, it is rounded to the minute). In some cases the user might want to build an assistant that sets a timer exactly 8h after the command, without any rounding. It would be best if the default timestamp were the exact "now + 8h" time, with the granularity provided as an optional, separate field.

Parser expected output (Optional)

TimeInterval where the timestamp is now + 8h
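One possible shape for this request, sketched with plain second counts instead of chrono types. All names here (`Grain`, `Resolved`, `resolve_in`) are hypothetical and only illustrate keeping the exact instant while making the grain optional.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Grain {
    Second,
    Minute,
    Hour,
}

impl Grain {
    /// Length of the grain in seconds.
    fn seconds(self) -> i64 {
        match self {
            Grain::Second => 1,
            Grain::Minute => 60,
            Grain::Hour => 3600,
        }
    }
}

/// A resolved instant that keeps the exact value and an optional grain.
#[derive(Debug, PartialEq)]
struct Resolved {
    exact: i64, // seconds since some origin, never rounded
    grain: Option<Grain>,
}

impl Resolved {
    /// The instant truncated to the grain, or the exact value if no grain is set.
    fn rounded(&self) -> i64 {
        match self.grain {
            Some(g) => self.exact - self.exact.rem_euclid(g.seconds()),
            None => self.exact,
        }
    }
}

/// Resolves "in <offset>" relative to `now` without discarding precision.
fn resolve_in(now: i64, offset_secs: i64, grain: Option<Grain>) -> Resolved {
    Resolved { exact: now + offset_secs, grain }
}
```

A caller that wants a precise timer reads `exact`; a caller that wants the current behaviour asks for `rounded()`.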

[FR] Duration

Parsing Error

Version

0.15.3

Language

FR

Parser input

  • "Pour une durée de 5 minutes"
  • "Pour 5 minutes"

Parser output

"error: failed to run custom build command for `crfsuite-sys v0.2.0-pre`" while trying to set up snips-nlu on a Docker image

Parsing Error

Version

rustc 1.31.0-nightly

your.tag.number

Language

your_language_code

Parser input

your input here

Parser output

error: failed to run custom build command for `crfsuite-sys v0.2.0-pre (https://github.com/snipsco/crfsuite-rs?rev=30b2ea6#30b2ea6f)`
process didn't exit successfully: `/nff/tools/dack/target/release/build/crfsuite-sys-c76bddaf3c451c85/build-script-build` (exit code: 101)

Paste the output of the `rustling-cli` here

thread 'main' panicked at 'Unable to find libclang: "couldn't find any of ['libclang.so', 'libclang.so.', 'libclang-.so'], set the LIBCLANG_PATH environment variable to a path where one of these files can be found (skipped: [])"', libcore/result.rs:1009:5

Parser expected output (Optional)

Paste the expected output of the `rustling-cli` here

Crash in EN for inputs of the form "<big_number> <nth> quarter"

Parsing Error

rustling crashes when called with inputs of the form "<number> <nth> quarter" where <number> is a number greater than 1050000 and <nth> is an ordinal (1st, 2nd, 3rd ...).

The crash doesn't happen for <number> values smaller than 1040000; however, the closer the value is to 1040000, the longer the parsing takes. On my MacBook Pro (Core i7), the parsing time goes up to 1m45s for 1040000.

Version

0.18.1

Language

en

Parser input

1050000 2nd quarter

Parser output

$ RUST_BACKTRACE=1 cargo run -p rustling-cli -- --lang en parse "1050000 2nd quarter"
    Finished dev [unoptimized + debuginfo] target(s) in 0.17s
     Running `target/debug/rustling-cli --lang en parse '1050000 2nd quarter'`
thread 'main' panicked at 'No such local time', /Users/adrien/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.6/src/offset/mod.rs:145:34
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:39
   1: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:70
   2: std::panicking::default_hook::{{closure}}
             at src/libstd/sys_common/backtrace.rs:58
             at src/libstd/panicking.rs:200
   3: std::panicking::default_hook
             at src/libstd/panicking.rs:215
   4: <std::panicking::begin_panic::PanicPayload<A> as core::panic::BoxMeUp>::get
             at src/libstd/panicking.rs:478
   5: std::sync::once::Once::is_completed
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libstd/panicking.rs:412
   6: alloc::raw_vec::alloc_guard
             at ./<::std::macros::panic macros>:3
   7: core::ptr::real_drop_in_place
             at /Users/adrien/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.6/src/offset/mod.rs:186
   8: <time::duration::Duration as core::cmp::PartialOrd>::lt
             at moment/src/lib.rs:122
   9: <time::duration::Duration as core::cmp::PartialOrd>::lt
             at moment/src/lib.rs:186
  10: <time::duration::Duration as core::cmp::PartialOrd>::lt
             at moment/src/lib.rs:177
  11: rustling_ontology_values::helpers::easter::offset
             at ./moment/src/lib.rs:352
  12: core::clone::impls::<impl core::clone::Clone for usize>::clone
             at ./moment/src/interval_constraints.rs:675
  13: <bool as core::default::Default>::default
             at ./moment/src/walker.rs:146
  14: <bool as core::default::Default>::default
             at ./moment/src/walker.rs:233
  15: core::clone::impls::<impl core::clone::Clone for usize>::clone
             at ./moment/src/interval_constraints.rs:857
  16: core::clone::impls::<impl core::clone::Clone for usize>::clone
             at ./moment/src/interval_constraints.rs:1032
  17: <bool as core::default::Default>::default
             at ./moment/src/walker.rs:169
  18: <bool as core::default::Default>::default
             at ./moment/src/walker.rs:201
  19: <bool as core::default::Default>::default
             at ./moment/src/walker.rs:266
  20: <alloc::string::String as core::ops::deref::Deref>::deref
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/liballoc/vec.rs:1813
  21: <alloc::string::String as core::ops::deref::Deref>::deref
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/liballoc/vec.rs:1725
  22: <bool as core::default::Default>::default
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/iterator.rs:1468
  23: core::clone::impls::<impl core::clone::Clone for usize>::clone
             at ./moment/src/interval_constraints.rs:1034
  24: core::clone::impls::<impl core::clone::Clone for usize>::clone
             at ./moment/src/interval_constraints.rs:864
  25: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::AllocErr>>::from
             at values/src/context.rs:54
  26: <rustling_ontology::tagger::CandidateTagger<'a, C> as rustling::MaxElementTagger<rustling_ontology_values::dimension::Dimension>>::tag::{{closure}}
             at src/tagger.rs:62
  27: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/mod.rs:1447
  28: <f32 as core::ops::arith::Mul<&'a f32>>::mul
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/traits.rs:582
  29: core::hint::unreachable_unchecked
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/traits.rs:519
  30: core::hint::unreachable_unchecked
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/traits.rs:582
  31: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/mod.rs:436
  32: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/mod.rs:1447
  33: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/iterator.rs:606
  34: core::hint::unreachable_unchecked
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/liballoc/vec.rs:1856
  35: core::hint::unreachable_unchecked
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/liballoc/vec.rs:1839
  36: core::hint::unreachable_unchecked
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/liballoc/vec.rs:1725
  37: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libcore/iter/iterator.rs:1468
  38: core::hint::unreachable_unchecked
             at src/tagger.rs:60
  39: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
             at /Users/adrien/.cargo/git/checkouts/rustling-281bdd3bf97d4e1e/fd8084b/src/lib.rs:140
  40: <alloc::collections::CollectionAllocErr as core::convert::From<core::alloc::LayoutErr>>::from
             at /Users/adrien/.cargo/git/checkouts/rustling-281bdd3bf97d4e1e/fd8084b/src/lib.rs:144
  41: rustling_ontology::Parser
             at src/lib.rs:64
  42: rustling_ontology::Parser
             at src/lib.rs:89
  43: rustling_cli::main
             at cli/src/main.rs:54
  44: std::rt::lang_start::{{closure}}
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libstd/rt.rs:64
  45: std::panicking::try::do_call
             at src/libstd/rt.rs:49
             at src/libstd/panicking.rs:297
  46: panic_unwind::dwarf::eh::read_encoded_pointer
             at src/libpanic_unwind/lib.rs:92
  47: <std::panicking::begin_panic::PanicPayload<A> as core::panic::BoxMeUp>::get
             at src/libstd/panicking.rs:276
             at src/libstd/panic.rs:388
             at src/libstd/rt.rs:48
  48: std::rt::lang_start
             at /rustc/2aa4c46cfdd726e97360c2734835aa3515e8c858/src/libstd/rt.rs:64
  49: rustling_cli::main

Parser expected output (Optional)

$ RUST_BACKTRACE=1 cargo run -p rustling-cli -- --lang en parse "1050000 2nd quarter"
    Finished dev [unoptimized + debuginfo] target(s) in 0.14s
     Running `target/debug/rustling-cli --lang en parse '1050000 2nd quarter'`
+----+-------------+------------+------------------+---------------------------------------------------------------------------------------------------------+
| ix | log(p)      | p          | text             | value                                                                                                   |
+====+=============+============+==================+=========================================================================================================+
| 1  | -0.07018268 | 0.9322235  | 1050000_________ | Integer(IntegerOutput(1050000))                                                                         |
+----+-------------+------------+------------------+---------------------------------------------------------------------------------------------------------+
| 0  | -0.6190392  | 0.53846157 | _____2nd quarter | Time(TimeOutput { moment: 2019-04-01T00:00:00+02:00, grain: Quarter, precision: Exact, latent: false }) |
+----+-------------+------------+------------------+---------------------------------------------------------------------------------------------------------+

Wrong resolution of big numbers (at the end) when number + currency

Parsing Error

Version

0.18.0

Language

en
fr

Parser input

two hundred twenty-three million three hundred one thousand two hundred eleven euros
trois cent deux millions quatre cent trente milles deux cent trente euros

Parser output

+----+-------------+-----------+--------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| ix | log(p)      | p         | text                                                                                 | value                                                                                          |
+====+=============+===========+======================================================================================+================================================================================================+
| 0  | -0.11813116 | 0.8885795 | two hundred twenty-three million three hundred one thousand two hundred eleven euros | AmountOfMoney(AmountOfMoneyOutput { value: 223301220.0, precision: Exact, unit: Some("EUR") }) |
+----+-------------+-----------+--------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+

| ix | log(p)     | p          | text                                                                      | value                                                                                          |
+====+============+============+===========================================================================+================================================================================================+
| 0  | -1.9776177 | 0.13839854 | trois cent deux millions quatre cent trente milles deux cent trente euros | AmountOfMoney(AmountOfMoneyOutput { value: 302430240.0, precision: Exact, unit: Some("EUR") }) |
+----+------------+------------+---------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+

Parser expected output (Optional)

+----+-------------+-----------+--------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| ix | log(p)      | p         | text                                                                                 | value                                                                                          |
+====+=============+===========+======================================================================================+================================================================================================+
| 0  | -0.11813116 | 0.8885795 | two hundred twenty-three million three hundred one thousand two hundred eleven euros | AmountOfMoney(AmountOfMoneyOutput { value: 223301211.0, precision: Exact, unit: Some("EUR") }) |
+----+-------------+-----------+--------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
+----+------------+------------+---------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
| ix | log(p)     | p          | text                                                                      | value                                                                                          |
+====+============+============+===========================================================================+================================================================================================+
| 0  | -1.9776177 | 0.13839854 | trois cent deux millions quatre cent trente milles deux cent trente euros | AmountOfMoney(AmountOfMoneyOutput { value: 302430230.0, precision: Exact, unit: Some("EUR") }) |
+----+------------+------------+---------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
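The issue does not state a cause. One hypothesis, offered only as an assumption, is that the amount is held in single precision: an f32 has a 24-bit significand, so integers of this size are rounded to the nearest multiple of 16 or 32. The sketch below round-trips the two spoken amounts (223301211 and 302430230) through f32 and f64; the helper names are hypothetical and not part of rustling-ontology.

```rust
/// Round-trips an integer amount through f32, as would happen if the
/// parser stored monetary values in single precision (an assumption).
fn through_f32(amount: u32) -> f64 {
    (amount as f32) as f64
}

/// Same round trip through f64, which represents every u32 exactly.
fn through_f64(amount: u32) -> f64 {
    amount as f64
}

/// Debug rendering of the f32 value, using Rust's shortest round-trip digits.
fn debug_f32(amount: u32) -> String {
    format!("{:?}", amount as f32)
}
```

Under this assumption, 223301211 becomes 223301216 and 302430230 becomes 302430240 in f32, while f64 keeps both exactly. Rust prints the first f32 value as 223301220.0, which matches the output reported above, so storing the value as f64 (or as an integer) may be worth checking.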

[DE] Issue with years before 1900

Parsing Error

Version

0.17.0

Language

DE

Parser input

"das jahr 1905"
"das jähr 1890"

Parser output

Time(TimeOutput { moment: 1970-01-01T00:59:59+01:00, grain: Year, precision: Exact, latent: false })

Parser expected output (Optional)

Time(TimeOutput { moment: 1890-01-01T00:00:00+01:00, grain: Year, precision: Exact, latent: false })

Question regarding moment module

I'm new to Rust, so please forgive me if I'm wrong :-).

I have a question about the moment module: in lib.rs, why are ONLY these three items of the chrono crate re-exported: pub use chrono::{Weekday, Local, TimeZone}; ?

Perhaps Utc should be re-exported as well, so that it can be used with Moment, e.g. Moment(Utc::now())?

error: Could not compile `crfsuite-sys`

Trying to build the weather example of snips-nlu-lib, I got the following error:

$ cargo run
   Compiling snips-nlu-resources-packed v0.55.2 (https://github.com/snipsco/snips-nlu-rs#6a48f869)
   Compiling crfsuite-sys v0.2.0-pre (https://github.com/snipsco/crfsuite-rs?rev=b18d95c#b18d95cf)
   Compiling rustling-ontology v0.17.0 (https://github.com/snipsco/rustling-ontology?tag=0.17.0#2efa4c01)
error: unexpected close delimiter: `]`
   --> /home/arousseau/dev/personnel/rust/test_snip/target/debug/build/crfsuite-sys-1b9f39ab8b7d8fbd/out/crfsuite.rs:201:32
    |
201 | #[derive(Copy, Clone)]; 4usize ] , _bindgen_union_align : u32 , }#[test]
    |                                ^

I just created a project with cargo new --bin test_snip and added the following files:

# Cargo.toml

[package]
name = "test_snip"
version = "0.1.0"

[dependencies]
snips-nlu-lib = { git = "https://github.com/snipsco/snips-nlu-rs", branch = "master" }

// src/main.rs
extern crate serde_json;
extern crate snips_nlu_lib;

use std::env;

use snips_nlu_lib::{FileBasedConfiguration, SnipsNluEngine};

fn main() {
    let args: Vec<String> = env::args().collect();
    let model_file = &args[1];
    let query = &args[2];
    let configuration = match FileBasedConfiguration::from_path(model_file, false) {
        Ok(conf) => conf,
        Err(e) => panic!("{}", e),
    };
    let nlu_engine = SnipsNluEngine::new(configuration).unwrap();

    let result = nlu_engine.parse(query, None).unwrap();

    println!("{}", serde_json::to_string_pretty(&result).unwrap());
}

+ prefix should be included in the parsing

Parsing Error

At the moment, a number (or temperature, amount of money) starting with an explicit + is not parsed entirely as the prefix is omitted in the result.

Version

0.18.0

Language

All languages

Parser input

+100
or
+100°C

Parser output

+----+--------------+-----------+------+-----------------------------+
| ix | log(p)       | p         | text | value                       |
+====+==============+===========+======+=============================+
| 0  | -0.072079904 | 0.9304565 | _100 | Integer(IntegerOutput(100)) |
+----+--------------+-----------+------+-----------------------------+

or

+----+------------+-----------+--------+---------------------------------------------------------------------------------------+
| ix | log(p)     | p         | text   | value                                                                                 |
+====+============+===========+========+=======================================================================================+
| 0  | -1.1418271 | 0.3192352 | _100°c | Temperature(TemperatureOutput { value: 100.0, unit: Some("celsius"), latent: false }) |
+----+------------+-----------+--------+---------------------------------------------------------------------------------------+

Parser expected output (Optional)

+----+--------------+-----------+------+-----------------------------+
| ix | log(p)       | p         | text | value                       |
+====+==============+===========+======+=============================+
| 0  | -0.072079904 | 0.9304565 | +100 | Integer(IntegerOutput(100)) |
+----+--------------+-----------+------+-----------------------------+

or

+----+------------+-----------+--------+---------------------------------------------------------------------------------------+
| ix | log(p)     | p         | text   | value                                                                                 |
+====+============+===========+========+=======================================================================================+
| 0  | -1.1418271 | 0.3192352 | +100°c | Temperature(TemperatureOutput { value: 100.0, unit: Some("celsius"), latent: false }) |
+----+------------+-----------+--------+---------------------------------------------------------------------------------------+

Parsing Japanese numbers

Parsing Error

Version

0.15.3

Language

ja

Parser input

29パーセント

Parser output

Japanese numbers lead to a regex crash

Parser expected output (Optional)

29%

Issue while compiling

Hi Rustling team !

I have issues compiling the latest version on Windows 10.

Compiling rustling-ontology v0.17.0 (file:///C:/Users/romai/projects/rustling/rustling-ontology)
error: failed to run custom build command for rustling-ontology v0.17.0 (file:///C:/Users/romai/projects/rustling/rustling-ontology)
process didn't exit successfully: C:\Users\romai\projects\rustling\rustling-ontology\target\debug\build\rustling-ontology-670eb07658abcf73\build-script-build (exit code: 101)
--- stdout
cargo:rerun-if-changed=grammar/de/src/rules.rs

--- stderr
thread 'main' panicked at 'called Result::unwrap() on an Err value: RustlingError(Msg("example: "montag morgens" matched no rule"), State { next_error: None })', libcore\result.rs:945:5
note: Run with RUST_BACKTRACE=1 for a backtrace.

Any hints ?

Error in parsing money with billion + small number at the end

Parsing Error

Version

commit a24b00e (HEAD -> develop, tag: 0.17.4, origin/develop, origin/HEAD)

Language

en

Parser input

Okay, um, your total is one billion three dollars and that is it?

Parser output

+----+-------------+-----------+-------------------------------------------------------------------+-----------------------------------------------------------------------------------------------+
| ix | log(p)      | p         | text                                                              | value                                                                                         |
+====+=============+===========+===================================================================+===============================================================================================+
| 0  | -0.11813116 | 0.8885795 | ________________________one billion three dollars________________ | AmountOfMoney(AmountOfMoneyOutput { value: 1000000000.0, precision: Exact, unit: Some("$") }) |
+----+-------------+-----------+-------------------------------------------------------------------+-----------------------------------------------------------------------------------------------+

Parser expected output (Optional)

+----+-------------+-----------+-------------------------------------------------------------------+-----------------------------------------------------------------------------------------------+
| ix | log(p)      | p         | text                                                              | value                                                                                         |
+====+=============+===========+===================================================================+===============================================================================================+
| 0  | -0.11813116 | 0.8885795 | ________________________one billion three dollars________________ | AmountOfMoney(AmountOfMoneyOutput { value: 1000000003.0, precision: Exact, unit: Some("$") }) |
+----+-------------+-----------+-------------------------------------------------------------------+-----------------------------------------------------------------------------------------------+

Handling of "since <time>"

Fr: depuis - à partir de
En: since - from
De: seit - ab

French: "depuis" combined with durations, but not with simple dates
"depuis le 5 janvier": KO
recognized as "le 5 janvier"

"depuis 3 mois": OK
TimeInterval(Between { start: 2018-07-24T00:00:00+02:00, end: 2018-10-24T12:40:43.829472+02:00, precision: Exact, latent: false })

French: "à partir de": OK
"à partir du 5 janvier": OK
TimeInterval(After(TimeOutput { moment: 2019-01-05T00:00:00+01:00, grain: Day, precision: Exact, latent: false }))

English: "from", "from...on": KO
"from october 13th": KO
"from october 13th on": KO
recognized as "october 13th"

German: "seit" combined with simple dates, but not with durations
"seit dem 11. januar" : OK
TimeInterval(After(TimeOutput { moment: 2018-01-11T00:00:00+01:00, grain: Day, precision: Exact, latent: false }))

"seit 3 monaten": KO
recognized as "3 monaten"

German: "ab" (= from, depuis): OK
"ab dem 11. Januar"
TimeInterval(After(TimeOutput { moment: 2018-01-11T00:00:00+01:00, grain: Day, precision: Exact, latent: false }))

2K interpreted as datetime

Parsing Error

Rustling should not interpret values like "2K" as datetime.

Version

0.17.5

Language

en

Parser input

cargo run -- --lang en parse "2K" -k Time

Parser output

Time(TimeOutput { moment: 2000-01-01T00:00:00+01:00, grain: Year, precision: Exact, latent: false })

Parser expected output (Optional)

Nothing

until + date (with no starting date) => wrong resolution

until + date => final boundary = date at 00:00:00 => INCORRECT (the named day itself is left out)
from + date1 + until + date2 => final boundary = date2 + 1 day at 00:00:00 => CORRECT

context:
2019-03-28T10:00:00

Language: en

Parser input

until tomorrow
from yesterday until tomorrow
until the twenty-ninth
from yesterday until the twenty-ninth

Parser output

until tomorrow | TimeInterval(Before(TimeOutput { moment: 2019-03-29T00:00:00+01:00, grain: Day, precision: Exact, latent: false }))

from yesterday until tomorrow | TimeInterval(Between { start: 2019-03-27T00:00:00+01:00, end: 2019-03-30T00:00:00+01:00, precision: Exact, latent: false })

until the twenty-ninth | TimeInterval(Before(TimeOutput { moment: 2019-03-29T00:00:00+01:00, grain: Day, precision: Exact, latent: true }))

from yesterday until the twenty-ninth | TimeInterval(Between { start: 2019-03-27T00:00:00+01:00, end: 2019-03-30T00:00:00+01:00, precision: Exact, latent: false })

Parser expected output (Optional)
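The issue leaves this section empty. Going only by the rule stated at the top of the issue, the expected boundary can be sketched as follows: both forms should end at the start of the day after the named date. `Day` and `Interval` are hypothetical stand-ins for the real output types, with days counted as plain integers.

```rust
/// A day-grain date, counted in days from an arbitrary origin.
/// Hypothetical model for illustration; not part of rustling-ontology.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Day(i64);

#[derive(Debug, PartialEq)]
enum Interval {
    /// Everything strictly before the start of `end_exclusive`.
    Before { end_exclusive: Day },
    /// From the start of `start` up to the start of `end_exclusive`.
    Between { start: Day, end_exclusive: Day },
}

/// "until <date>": the named day is included, so the bound is date + 1.
fn until(date: Day) -> Interval {
    Interval::Before { end_exclusive: Day(date.0 + 1) }
}

/// "from <date1> until <date2>": same upper bound as `until`.
fn from_until(start: Day, end: Day) -> Interval {
    Interval::Between { start, end_exclusive: Day(end.0 + 1) }
}
```

With the context above (the 28th), "until tomorrow" would then end at the start of the 30th, the same boundary that "from yesterday until tomorrow" already produces.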

panic with 'No such local time' for specific input

The lib panics when given the input "31 2".
For similar inputs like "31 1" or "31 3", it works fine, interpreting the value as January 31st or March 31st. Thus it seems that the invalid February 31st causes the crash.

$ cargo run -- --lang fr parse "31 2"
    Finished dev [unoptimized + debuginfo] target(s) in 0.15s
     Running `/Users/Adrien/dev/test/rustling-ontology/target/debug/rustling-cli --lang fr parse '31 2'`
thread 'main' panicked at 'No such local time', /Users/Adrien/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.3.0/src/offset/mod.rs:151:34
note: Run with `RUST_BACKTRACE=1` for a backtrace.
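The panic message points into chrono, which suggests an invalid calendar date was turned into a local time without a check. A minimal sketch of validating the day against the month before building the date is shown below; the helper names are hypothetical and the real fix may belong elsewhere in the moment module.

```rust
/// True for leap years in the proleptic Gregorian calendar.
fn is_leap_year(year: i32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Number of days in the given month, or None for an invalid month.
fn days_in_month(year: i32, month: u32) -> Option<u32> {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => Some(31),
        4 | 6 | 9 | 11 => Some(30),
        2 => Some(if is_leap_year(year) { 29 } else { 28 }),
        _ => None,
    }
}

/// Returns the date only if it exists, so callers can skip candidates
/// such as February 31st instead of panicking.
/// Hypothetical helper for illustration; not part of rustling-ontology.
fn checked_date(year: i32, month: u32, day: u32) -> Option<(i32, u32, u32)> {
    let max_day = days_in_month(year, month)?;
    if day >= 1 && day <= max_day {
        Some((year, month, day))
    } else {
        None
    }
}
```

Here `checked_date(2019, 2, 31)` is `None`, so "31 2" could be dropped as a date candidate, while "31 1" and "31 3" still resolve.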

Dynamic date processing (next <time-of-day>)

"next <day-of-week>" should return either the upcoming occurrence or the one after it, depending on the current day of the week.

Example:
Date is Tuesday the 12th:

  • next Wednesday should be the 20th;
  • next Thursday should be the 14th;
  • next Friday should be the 15th

"One two three" should be parsed as 1, 2 and 3

Parsing Error

Version

your.tag.number

Language

EN

Parser input

one two three

Parser output

parser = RustlingParser(u"en")
parser.parse(u"one two three")
>>> [{'char_range': {'end': 13, 'start': 4},
  'dim': 'Time',
  'latent': False,
  'value': {'grain': u'minute',
   'latent': False,
   'precision': u'exact',
   'type': 'value',
   'value': '2017-12-28 14:03:00 +01:00'}},
 {'char_range': {'end': 3, 'start': 0},
  'dim': 'Number',
  'latent': False,
  'value': {'type': 'value', 'value': 1}}]

Parser expected output (Optional)

We expect each number to be parsed separately as 1, 2 and 3
