All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- `select!` and `select_ref!` now support cfg attributes.
- `MapExtra::emit`, which allows emitting secondary errors during mapping operations
- `InputRef::emit`, which allows emitting secondary errors within `custom` parsers
- `labelled_with`, which avoids the need to implement `Clone` for labels
- `Input::split_token_span`, a convenience function for splitting `(Token, Span)` inputs so that chumsky can understand them
- `Input::split_spanned`, which does the same as above, but for implementors of `WrappingSpan`
- `spanned`, a combinator which automatically annotates a parser output with a span
- Experimental:
  - `IterParser::parse_iter`, a way to turn an `IterParser` (like `x.repeated()`) into an iterator
  - `Parser::debug`, which provides access to various parser debugging utilities.
- Made `nested_in` more flexible by allowing it to map between different input types instead of requiring the same input as the outer parser
- A prioritisation bug with `nested_in`
- Implement `Default`, `PartialOrd`, and `Ord` for `SimpleSpan`
- Implement `PartialOrd` and `Ord` for `Rich`
- Patched compilation error that only appeared in release builds
- The `set(...)` combinator, which may be used to conveniently parse a set of patterns, in any order
- Non-associative infix operators are now supported by the `.pratt(...)` combinator
- `Parser::try_foldl`, allowing folding operations to short-circuit in a manner similar to `Parser::try_map`
- `Parser::contextual`, which allows a parser to be disabled or enabled in a context-sensitive manner
- Implemented `ValueInput` for `IterInput`, which was previously missing
- More types that implement the `State` trait:
  - `RollbackState`, which reverts parser state when a parser rewinds and may be used to, for example, count the number of times a pattern appears in the input
  - `TruncateState`, which truncates a vector when a parser rewinds, and may be used to implement an arena allocator for AST nodes
- Implemented `IterParser` for `a.then(b)` when `a` and `b` are both `IterParser`s that produce the same output type
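
The idea behind truncating a vector on rewind can be illustrated without chumsky at all. The sketch below is a standalone model, not the crate's `TruncateState` API: the `NodeArena` type, its methods, and the `attempt` helper are hypothetical names invented for this illustration. It shows how remembering the vector's length as a checkpoint lets a failed speculative parse discard the nodes it allocated.

```rust
/// Hypothetical arena of AST nodes backed by a `Vec`.
/// This models the idea only; it is not chumsky's `TruncateState`.
#[derive(Debug, Default)]
pub struct NodeArena {
    nodes: Vec<String>,
}

impl NodeArena {
    /// A checkpoint is simply the number of nodes allocated so far.
    pub fn checkpoint(&self) -> usize {
        self.nodes.len()
    }

    /// Allocate a node and return its index in the arena.
    pub fn alloc(&mut self, node: &str) -> usize {
        self.nodes.push(node.to_string());
        self.nodes.len() - 1
    }

    /// Rewind to a checkpoint, discarding every node allocated after it.
    pub fn rewind(&mut self, checkpoint: usize) {
        self.nodes.truncate(checkpoint);
    }

    /// Number of live nodes.
    pub fn len(&self) -> usize {
        self.nodes.len()
    }
}

/// Run one alternative speculatively; if it fails, drop the nodes it allocated.
pub fn attempt<F>(arena: &mut NodeArena, alternative: F) -> bool
where
    F: FnOnce(&mut NodeArena) -> bool,
{
    let checkpoint = arena.checkpoint();
    let succeeded = alternative(arena);
    if !succeeded {
        arena.rewind(checkpoint);
    }
    succeeded
}
```

Because indices handed out after a checkpoint become invalid once the arena is rewound, a design like this only works if nothing outside the failed alternative keeps those indices.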
- Made lifetime bounds on `recursive` and `ParserExtra` more permissive
- Improved support for grapheme parsing
- Text parsers now report labels
- `Parser::filter` now generates a `DefaultExpected::SomethingElse` label instead of nothing (this can be overridden with the `.labelled(...)` function)
- Improved areas of documentation
- Make whitespace parsers reject codepoints that are vulnerable to CVE-2021-42574
- Made the `select!` parser more permissive, accepting any implementor of `Input` instead of requiring `ValueInput` too
- Many minor incorrect debug-only sanity checks have been fixed
- Many minor span and error prioritisation behavioural problems have been fixed (most notably, `Parser::try_map`)
- The `.rewind()` parser no longer rewinds any secondary errors that were encountered
- Accidental `text::ascii::keyword` lifetime regression
- Implemented `Container` for `VecDeque`
- New section covering recursion in the guide
- `Boxed` types now have a default type parameter of `extra::Default`, like `Parser` and `IterParser`
- The tutorial has been updated for `0.10` and has been moved to the guide
- Nonsense spans occasionally generated for non-existent tokens
- Improved docs have been added for several items
- Many minor documentation issues have been fixed
Note: version 0.10 is a from-scratch rewrite of chumsky with innumerable small changes. To avoid this changelog being longer than the compiled works of Douglas Adams, the following is a high-level overview of the major feature additions and does not include small details.
- Support for zero-copy parsing (i.e., parser outputs that hold references to the parser input)
- Support for parsing nested inputs like token trees
- Support for parsing context-sensitive grammars such as Python-style indentation, Rust-style raw strings, and much more
- Support for parsing by graphemes as well as unicode codepoints
- Support for caching parsers independent of the lifetime of the parser
- A new trait, `IterParser`, that allows expressing parsers that generate many outputs
- Added the ability to collect iterable parsers into fixed-size arrays, along with a plethora of other container types
- Support for manipulating shared state during parsing, elegantly allowing support for arena allocators, cstrees, interners, and much more
- Support for a vast array of new input types: slices, strings, arrays, `impl Read`ers, iterators, etc.
- Experimental support for memoization, allowing chumsky to parse left-recursive grammars and reducing the computational complexity of parsing certain grammars
- An extension API, allowing third-party crates to extend chumsky's capabilities and introduce new combinators
- A `pratt` parser combinator, allowing for conveniently and simply creating expression parsers with precise operator precedence
- A `regex` combinator, allowing the parsing of terms based on a specific regex pattern
- Properly differentiated ASCII and Unicode text parsers
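
For readers unfamiliar with Pratt parsing, the following standalone sketch shows the underlying technique on a tiny token slice. It does not use chumsky's `pratt` combinator: the `Tok` type, the binding-power values, and `parse_expr` are all hypothetical choices made for this illustration. Each operator gets a left and right binding power, and the difference between the two encodes associativity.

```rust
use std::convert::TryFrom;

/// Hypothetical token type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tok {
    Num(i64),
    Plus,
    Minus,
    Star,
    Caret,
}

/// (left, right) binding powers. The values are arbitrary choices for this sketch:
/// left < right gives left associativity, left > right gives right associativity.
fn binding_power(op: Tok) -> Option<(u8, u8)> {
    match op {
        Tok::Plus | Tok::Minus => Some((1, 2)),
        Tok::Star => Some((3, 4)),
        Tok::Caret => Some((6, 5)),
        Tok::Num(_) => None,
    }
}

/// Parse and evaluate an expression starting at `*pos`, consuming only
/// operators whose left binding power is at least `min_bp`.
pub fn parse_expr(tokens: &[Tok], pos: &mut usize, min_bp: u8) -> Option<i64> {
    // Every expression in this tiny grammar starts with a number.
    let mut lhs = match tokens.get(*pos)? {
        Tok::Num(n) => {
            *pos += 1;
            *n
        }
        _ => return None,
    };

    while let Some(&op) = tokens.get(*pos) {
        // A number directly after a number is a syntax error.
        let (left_bp, right_bp) = binding_power(op)?;
        if left_bp < min_bp {
            break; // The operator belongs to an enclosing call.
        }
        *pos += 1;
        let rhs = parse_expr(tokens, pos, right_bp)?;
        lhs = match op {
            Tok::Plus => lhs.checked_add(rhs)?,
            Tok::Minus => lhs.checked_sub(rhs)?,
            Tok::Star => lhs.checked_mul(rhs)?,
            // Negative or overflowing exponents are rejected rather than panicking.
            Tok::Caret => lhs.checked_pow(u32::try_from(rhs).ok()?)?,
            Tok::Num(_) => return None,
        };
    }
    Some(lhs)
}
```

Calling `parse_expr(&tokens, &mut 0, 0)` evaluates the whole slice. A non-associative operator could be modelled in the same scheme by rejecting a second operator of equal precedence instead of folding it in.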
- `Parser::then_with` has been removed in favour of the new context-sensitive combinators
- Performance has radically improved
- Error generation and handling is now significantly more flexible
- Properly fixed `skip_then_retry_until` regression
- Regression in `skip_then_retry_until` recovery strategy
- A `spill-stack` feature that uses `stacker` to avoid stack overflow errors for deeply recursive parsers
- The ability to access the token span when using `select!` like `select! { |span| Token::Num(x) => (x, span) }`
- Added a `skip_parser` recovery strategy that allows you to implement your own recovery strategies in terms of other parsers. For example, `.recover_with(skip_parser(take_until(just(';'))))` skips tokens until after the next semicolon
- A `not` combinator that consumes a single token if it is not the start of a given pattern. For example, `just("\\n").or(just('"')).not()` matches any `char` that is neither the final quote of a string nor the start of a newline escape sequence
- A `semantic_indentation` parser for parsing indentation-sensitive languages. Note that this is likely to be deprecated/removed in the future in favour of a more powerful solution
- `#[must_use]` attribute for parsers to ensure that they're not accidentally created without being used
- `Option<Vec<T>>` and `Vec<Option<T>>` now implement `Chain<T>` and `Option<String>` implements `Chain<char>`
- `choice` now supports both arrays and vectors of parsers in addition to tuples
- The `Simple` error type now implements `Eq`
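
The behaviour described for the `not` combinator (consume one token only when the input does not begin with a given pattern) can be modelled on plain strings. The sketch below is a standalone illustration rather than chumsky code: `take_if_not` is a hypothetical helper, and the patterns in the usage note mirror the newline-escape and closing-quote example from the entry above.

```rust
/// Consume a single `char` from `input` unless `input` starts with one of
/// `patterns`. Returns the consumed `char` and the remaining input.
/// Hypothetical helper; it models the idea and is not part of chumsky.
pub fn take_if_not<'a>(input: &'a str, patterns: &[&str]) -> Option<(char, &'a str)> {
    // Refuse to consume anything if a forbidden pattern starts here.
    if patterns.iter().any(|p| input.starts_with(*p)) {
        return None;
    }
    // Otherwise consume exactly one character (not one byte).
    let c = input.chars().next()?;
    Some((c, &input[c.len_utf8()..]))
}
```

With `patterns = ["\\n", "\""]`, an ordinary character or a backslash that begins some other escape is consumed, while a closing quote or the two-character sequence `\n` stops the helper without consuming anything.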
- `text::whitespace` returns a `Repeated` instead of an `impl Parser`, allowing you to call methods like `at_least` and `exactly` on it.
- Improved `no_std` support
- Improved examples and documentation
- Use zero-width spans for EoI by default
- Don't allow defining a recursive parser more than once
- Various minor bug fixes
- Improved `Display` implementations for various built-in error types and `SimpleReason`
- Use an `OrderedContainer` trait to avoid unexpected behaviour for unordered containers in combination with `just`
- Made several parsers (`todo`, `unwrapped`, etc.) more useful by reporting the parser's location on panic
- Boxing a parser that is already boxed just gives you the original parser to avoid double indirection
- Improved compilation speeds
- `then_with` combinator to allow limited support for parsing nested patterns
- `impl From<&[T; N]> for Stream`
- `SkipUntil/SkipThenRetryUntil::skip_start/consume_end` for more precise control over skip-based recovery
- Allowed `Validate` to map the output type
- Switched to zero-size End Of Input spans for default implementations of `Stream`
- Made `delimited_by` take combinators instead of specific tokens
- Minor optimisations
- Documentation improvements
- Compilation error with `--no-default-features`
- Made default behaviour of `skip_until` more sensible
- A new tutorial to help new users
- `select` macro, a wrapper over `filter_map` that makes extracting data from specific tokens easy
- `choice` parser, a better alternative to long `or` chains (which sometimes have poor compilation performance)
- `todo` parser, that panics when used (but not when created) (akin to Rust's `todo!` macro, but for parsers)
- `keyword` parser, that parses exact identifiers
- `from_str` combinator to allow converting a pattern to a value inline, using `std::str::FromStr`
- `unwrapped` combinator, to automatically unwrap an output value inline
- `rewind` combinator, that allows reverting the input stream on success. It's most useful when requiring that a pattern is followed by some terminating pattern without the first parser greedily consuming it
- `map_err_with_span` combinator, to allow fetching the span of the input that was parsed by a parser before an error was encountered
- `or_else` combinator, to allow processing and potentially recovering from a parser error
- `SeparatedBy::at_most` to require that a separated pattern appear at most a specific number of times
- `SeparatedBy::exactly` to require that a separated pattern be repeated exactly a specific number of times
- `Repeated::exactly` to require that a pattern be repeated exactly a specific number of times
- More trait implementations for various things, making the crate more useful
- Made `just`, `one_of`, and `none_of` significantly more useful. They can now accept strings, arrays, slices, vectors, sets, or just single tokens as before
- Added the return type of each parser to its documentation
- More explicit documentation of parser behaviour
- More doc examples
- Deprecated `seq` (`just` has been generalised and can now be used to parse specific input sequences)
- Sealed the `Character` trait so that future changes are not breaking
- Sealed the `Chain` trait and made it more powerful
- Moved trait constraints on `Parser` to where clauses for improved readability
- Fixed a subtle bug that allowed `separated_by` to parse an extra trailing separator when it shouldn't
- Filled a 'hole' in the `Error` trait's API that conflated a lack of expected tokens with expectation of end of input
- Made recursive parsers use weak reference-counting to avoid memory leaks
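
Why weak reference counting matters for a self-referential structure can be shown with the standard library alone. The sketch below is not chumsky's implementation: `Node` and the two constructor functions are hypothetical stand-ins for a recursive parser that refers back to itself. A strong self-reference keeps the value alive forever, whereas a weak one lets it be freed once the last outside owner goes away.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical stand-in for a recursive parser that refers to itself.
pub struct Node {
    /// Strong self-reference: creates a cycle that is never freed.
    pub strong_self: RefCell<Option<Rc<Node>>>,
    /// Weak self-reference: does not keep the node alive.
    pub weak_self: RefCell<Weak<Node>>,
}

fn new_node() -> Rc<Node> {
    Rc::new(Node {
        strong_self: RefCell::new(None),
        weak_self: RefCell::new(Weak::new()),
    })
}

/// Builds a node that owns itself, then drops the only outside handle.
/// The returned `Weak` can still be upgraded: the node has leaked.
pub fn make_strong_cycle() -> Weak<Node> {
    let node = new_node();
    *node.strong_self.borrow_mut() = Some(Rc::clone(&node));
    Rc::downgrade(&node)
}

/// Builds a node that refers to itself weakly, then drops the outside handle.
/// The returned `Weak` can no longer be upgraded: the node was freed.
pub fn make_weak_cycle() -> Weak<Node> {
    let node = new_node();
    *node.weak_self.borrow_mut() = Rc::downgrade(&node);
    Rc::downgrade(&node)
}
```

The trade-off is that a weak self-reference must be upgraded before use, so something else has to keep the value alive for as long as it is needed.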
- `skip_until` error recovery strategy
- `SeparatedBy::at_least` and `SeparatedBy::at_most` for parsing a specific number of separated items
- `Parser::validate` for integrated AST validation
- `Recursive::declare` and `Recursive::define` for more precise control over recursive declarations
- Improved `separated_by` error messages
- Improved documentation
- Hid a few (probably) unused implementation details
- `take_until` primitive
- Added span to fallback output function in `nested_delimiters`
- Support for LL(k) parsing
- Custom error recovery strategies
- Debug mode
- Nested input flattening
- Radically improved error quality