This might be an issue with retree rather than with the dateparser, though.
The following test (which you cannot execute via the public API) fails:
@Test
public void parserWithLimitedPatterns() {
    List<String> rules = Arrays.asList(
        "(?<year>\\d{4})\\W{1}(?<month>\\d{1,2})\\W{1}(?<day>\\d{1,2})[^\\d]?",
        "\\W*(?:at )?(?<hour>\\d{1,2}):(?<minute>\\d{1,2})(?::(?<second>\\d{1,2}))?(?:[.,](?<ns>\\d{1,9}))?(?<zero>z)?",
        " ?(?<zoneOffset>[-+]\\d{1,2}:?(?:\\d{2})?)"
    );
    DateParser dateParser = new DateParser(rules, new HashSet<>(rules), Collections.emptyMap(), true, false);
    String input = "2022-08-09 19:04:31.600000+00:00";
    Date date = dateParser.parseDate(input);
    assertEquals(parser.parseDate(input), date);
}
Note that these three rules should be sufficient to parse the date:
- There is a rule for the year-month-day part
- There is a rule for the hours:minutes:seconds.ns part
- There is a rule for the zone offset part
However, during parsing the zone offset rule is never used. Instead, the rule for the hours is applied twice.
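As a sanity check outside of retree and the dateparser, here is a minimal sketch using only plain `java.util.regex`. It does not reproduce how retree matches internally; it only shows two things about the rules themselves: applied one after the other, the three rules consume the whole input, and at the position where the zone offset starts, the hours rule matches as well (its leading `\W*` consumes the `+`). The class and method names (`RuleCoverageSketch`, `consumeInOrder`, `rulesMatchingAt`) are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the dateparser or retree.
class RuleCoverageSketch {
    static final List<String> RULES = Arrays.asList(
        "(?<year>\\d{4})\\W{1}(?<month>\\d{1,2})\\W{1}(?<day>\\d{1,2})[^\\d]?",
        "\\W*(?:at )?(?<hour>\\d{1,2}):(?<minute>\\d{1,2})(?::(?<second>\\d{1,2}))?(?:[.,](?<ns>\\d{1,9}))?(?<zero>z)?",
        " ?(?<zoneOffset>[-+]\\d{1,2}:?(?:\\d{2})?)"
    );

    // Applies the rules in the given order, each one starting where the
    // previous one ended. Returns the final offset, or -1 if a rule fails.
    static int consumeInOrder(String input, int... order) {
        int pos = 0;
        for (int idx : order) {
            Matcher m = Pattern.compile(RULES.get(idx)).matcher(input);
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                return -1;
            }
            pos = m.end();
        }
        return pos;
    }

    // Returns the indexes of all rules that match starting exactly at pos.
    static List<Integer> rulesMatchingAt(String input, int pos) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < RULES.size(); i++) {
            Matcher m = Pattern.compile(RULES.get(i)).matcher(input);
            m.region(pos, input.length());
            if (m.lookingAt()) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String input = "2022-08-09 19:04:31.600000+00:00";
        // date rule, then hours rule, then zone offset rule
        System.out.println(consumeInOrder(input, 0, 1, 2) == input.length()); // true
        // offset 26 is where "+00:00" starts
        System.out.println(rulesMatchingAt(input, 26)); // [1, 2]
    }
}
```

So with plain regexes the expected sequence of rules works, but the hours rule and the zone offset rule are both candidates for `+00:00`. That ambiguity may be related to the hours rule being picked twice, although this sketch cannot confirm that.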
The weird thing is that when I add a rule that should not be used (`" ?(?<year>\d{4})$"`), the test suddenly succeeds:
@Test
public void parserWithLimitedPatterns() {
    List<String> rules = Arrays.asList(
        "(?<year>\\d{4})\\W{1}(?<month>\\d{1,2})\\W{1}(?<day>\\d{1,2})[^\\d]?",
        " ?(?<year>\\\\d{4})$",
        "\\W*(?:at )?(?<hour>\\d{1,2}):(?<minute>\\d{1,2})(?::(?<second>\\d{1,2}))?(?:[.,](?<ns>\\d{1,9}))?(?<zero>z)?",
        " ?(?<zoneOffset>[-+]\\d{1,2}:?(?:\\d{2})?)"
    );
    DateParser dateParser = new DateParser(rules, new HashSet<>(rules), Collections.emptyMap(), true, false);
    String input = "2022-08-09 19:04:31.600000+00:00";
    Date date = dateParser.parseDate(input);
    assertEquals(parser.parseDate(input), date);
}
The position of that additional rule matters. For example, adding it at the end of the list instead of at index 1 makes the test fail again.
I ran into this issue while working on PR #28, where I am trying to reduce the number of rules used for parsing in order to improve performance.