I recently switched my parser from consuming the stream of characters produced by str::chars() to consuming the one produced by ariadne::Source::chars(). This makes it a lot easier to load all of my source files through a cache which ariadne can use directly for printing reports.
Unfortunately, it isn't well documented that str::chars() and ariadne::Source::chars() don't produce the same, or even roughly equivalent, output.
I spent some time debugging why none of my line-ending parsers were succeeding before I realized that ariadne::Source::chars() doesn't produce any line endings (or even whitespace). Here is a minimal reproduction:

    fn main() {
        let raw_src = "a \nb";
        let src = ariadne::Source::from(raw_src);

        let raw_char_vec = raw_src.chars().collect::<Vec<_>>();
        let src_char_vec = src.chars().collect::<Vec<_>>();

        // Print both character streams so they can be compared by eye.
        println!("{:#?}", raw_char_vec);
        println!("{:#?}", src_char_vec);

        assert_eq!(raw_src.len(), src.len());
        // Fails: Source::chars() yields fewer characters than the raw string contains.
        assert_eq!(raw_src.len(), src.chars().count());
    }

It would be nice if ariadne::Source::chars() were documented as not producing any whitespace.
It would also be nice if ariadne::Source provided another method that returns a stream of characters, including line endings, with appropriate spans.
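
As a rough illustration of what such a stream could look like, here is a std-only sketch that works on a plain &str instead of an ariadne::Source. The name chars_with_spans is hypothetical and not part of ariadne's API, and counting spans in characters rather than bytes is an assumption made for this sketch.

```rust
use std::ops::Range;

/// Hypothetical helper (NOT part of ariadne): yields every character of `text`,
/// including whitespace and line endings, paired with a one-character span.
/// Assumption: spans are counted in characters, not bytes.
fn chars_with_spans(text: &str) -> impl Iterator<Item = (Range<usize>, char)> + '_ {
    text.chars().enumerate().map(|(i, c)| (i..i + 1, c))
}
```

Collecting chars_with_spans("a \nb") gives four items, one per character, with the space at 1..2 and the newline at 2..3 both present.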
I mocked up an adapter that reconstructs a character stream which is mostly equivalent to str::chars(), as long as your parser doesn't care about whitespace. It returns all of the characters in the file with their correct indices, and also injects a newline character at the last index of each line's span. This works as a replacement for str::chars().enumerate() in my parsing code.

    struct Renumerate<'a> {
        // Number of characters already produced from the current line.
        produced: usize,
        // (span start, span end, characters) of the current line, or None when done.
        line_info: Option<(usize, usize, Vec<char>)>,
        // Remaining lines of the source.
        lines: Box<dyn Iterator<Item = &'a ariadne::Line> + 'a>,
    }

    impl<'a> Renumerate<'a> {
        pub fn new(src: &'a ariadne::Source) -> Renumerate<'a> {
            let mut lines = src.lines();
            Renumerate {
                produced: 0,
                line_info: lines
                    .next()
                    .map(|line| (line.span().start, line.span().end, line.chars().collect())),
                lines: Box::new(lines),
            }
        }
    }

    impl<'a> Iterator for Renumerate<'a> {
        type Item = (usize, char);

        fn next(&mut self) -> Option<Self::Item> {
            if let Some((start, end, ref line_chars)) = self.line_info {
                if self.produced < line_chars.len() {
                    // Still inside the current line: yield its next character.
                    self.produced += 1;
                    Some((start + self.produced - 1, line_chars[self.produced - 1]))
                } else {
                    // Current line exhausted: move on to the next line and inject
                    // a newline at the last index of the span just finished.
                    self.line_info = self
                        .lines
                        .next()
                        .map(|line| (line.span().start, line.span().end, line.chars().collect()));
                    self.produced = 0;
                    Some((end - 1, '\n'))
                }
            } else {
                None
            }
        }
    }
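
For anyone who wants to experiment with the newline re-injection idea without pulling in ariadne, here is a std-only sketch of the same approach. LineRecord, split_lines and renumerate are hypothetical names. The record layout (character offset, length including the terminator, text with trailing whitespace removed) is an assumption modelled on the behaviour described above, not ariadne's actual internals. The sketch also assumes '\n' is the only line terminator. Unlike the adapter above, it injects a newline only for lines that really ended in one.

```rust
/// Hypothetical stand-in for a per-line record; NOT ariadne's real type.
struct LineRecord {
    start: usize,     // character offset of the line's first character
    len: usize,       // length in characters, including '\n' if present
    text: String,     // line content with trailing whitespace removed
    terminated: bool, // whether the line ended with '\n'
}

/// Split `src` into line records. Assumes '\n' is the only line terminator.
fn split_lines(src: &str) -> Vec<LineRecord> {
    let mut start = 0;
    src.split_inclusive('\n')
        .map(|line| {
            let len = line.chars().count();
            let rec = LineRecord {
                start,
                len,
                text: line.trim_end().to_owned(),
                terminated: line.ends_with('\n'),
            };
            start += len;
            rec
        })
        .collect()
}

/// Rebuild an enumerate()-like stream: the kept characters at their original
/// indices, plus a '\n' at the last index of every terminated line.
fn renumerate(lines: &[LineRecord]) -> Vec<(usize, char)> {
    let mut out = Vec::new();
    for line in lines {
        out.extend(line.text.chars().enumerate().map(|(i, c)| (line.start + i, c)));
        if line.terminated {
            out.push((line.start + line.len - 1, '\n'));
        }
    }
    out
}
```

For the input "a \nb" this yields (0, 'a'), (2, '\n'), (3, 'b'). The trailing space at index 1 is still lost, but every remaining index matches what str::chars().enumerate() would report.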