SHell-compatible WILDcards, for Rust.
shwild is a small, standalone library, implemented in C++ with a C and a C++ API, that provides shell-compatible wildcard matching.
shwild.Rust is a Rust port, with minimal API differences. The design emphasis is on simplicity-of-use, modularity, and performance.
let pattern = r"Where are the* [🐼🐻]s\?";
assert_eq!(Ok(false), shwild_matches!(pattern, ""));
assert_eq!(Ok(false), shwild_matches!(pattern, "Where are the bears?"));
assert_eq!(Ok(true), shwild_matches!(pattern, "Where are the 🐻s?"));
assert_eq!(Ok(true), shwild_matches!(pattern, "Where are the 🐼s?"));
assert_eq!(Ok(true), shwild_matches!(pattern, "Where are their 🐻s?"));
assert_eq!(Ok(true), shwild_matches!(pattern, "Where are the big brown 🐻s?"));
assert_eq!(Ok(false), shwild_matches!(pattern, "Where are the teddy-🐻s?"));
(See Examples section for more examples.)
The library (like the other shwild variants) supports the following pattern elements:
- Literal - a non-empty string fragment, as in "Where are the", which matches exactly the same string fragment in the input;
- Wild-1 - represented by the single character '?' in the pattern, which matches exactly one character, whatever it is. In the above example r"Where are the* [🐼🐻]s\?" the '?' is not interpreted as a wild-1 because it is escaped by the '\' character, and is instead part of the literal fragment "s?";
- Wild-N - represented by the single character '*' in the pattern, which matches any number of characters (including none);
- Range - represented by a sequence of characters within '[' and ']', as in the "[🐼🐻]" fragment in the above example, which matches any one of the range's characters in the input. As well as an unordered sequence of literal characters, ranges may also contain contiguous sequences, as in "[zc-aja]" (any of the characters 'a', 'b', 'c', 'j', 'z') or in "[abm-PrZ]" (any of the characters 'a', 'b', 'm', 'M', 'n', 'N', 'o', 'O', 'p', 'P', 'r', 'Z');
- Not-range - represented in the same form as a Range but where the first range character is '^', and the remaining characters represent the set of characters that cannot appear (at the requisite position) in the input.
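The two range examples above can be checked with a small standalone sketch. It is not the library's implementation: the function name expand_range is hypothetical, and the handling of reversed and cross-case sequences is inferred only from the two examples given (ASCII letters, no escapes, no leading '^').

```rust
use std::collections::BTreeSet;

/// Expands the inside of a range, such as "zc-aja", into its set of characters.
/// Illustrative only: assumes ASCII input with no escapes and no leading '^'.
fn expand_range(spec: &str) -> BTreeSet<char> {
    let chars: Vec<char> = spec.chars().collect();
    let mut set = BTreeSet::new();
    let mut i = 0;
    while i < chars.len() {
        // A contiguous sequence has the form <from>-<to>.
        if i + 2 < chars.len() && chars[i + 1] == '-' {
            let (from, to) = (chars[i], chars[i + 2]);
            let cross_case = from.is_ascii_alphabetic()
                && to.is_ascii_alphabetic()
                && from.is_ascii_lowercase() != to.is_ascii_lowercase();
            if cross_case {
                // Endpoints differ in case: take both cases of every letter between them.
                let (a, b) = (from.to_ascii_lowercase(), to.to_ascii_lowercase());
                for c in a.min(b)..=a.max(b) {
                    set.insert(c);
                    set.insert(c.to_ascii_uppercase());
                }
            } else {
                // Same case: the endpoints may be given in either order.
                for c in from.min(to)..=from.max(to) {
                    set.insert(c);
                }
            }
            i += 3;
        } else {
            // A plain literal member of the range.
            set.insert(chars[i]);
            i += 1;
        }
    }
    set
}
```

Under these assumptions, expand_range("zc-aja") yields the five characters listed for "[zc-aja]", and expand_range("abm-PrZ") yields the twelve characters listed for "[abm-PrZ]".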
Reference in Cargo.toml in the usual way:
shwild = { version = "~0.1" }
The constant IGNORE_CASE causes matching to ignore case.
The shwild::Error enum represents a failure to parse a pattern, and is defined as:
pub enum Error {
    /// Parse error encountered.
    ParseError {
        line : usize,
        column : usize,
        message : String,
    },
}
The shwild::Result type is a specialized std::result::Result type for shwild, defined as:
pub type Result<T> = std_result::Result<T, shwild::Error>;
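As a hedged illustration of consuming these two types, the sketch below handles all three outcomes of a match attempt. So that it compiles on its own, it uses local stand-ins (SketchError and SketchResult, both hypothetical names) that mirror the definitions above; the report function and its message wording are likewise illustrative and not part of the crate.

```rust
// Local stand-in mirroring the shwild::Error definition shown above;
// real code would use shwild::Error instead.
#[derive(Debug)]
pub enum SketchError {
    ParseError {
        line: usize,
        column: usize,
        message: String,
    },
}

// Local stand-in mirroring shwild::Result.
pub type SketchResult<T> = std::result::Result<T, SketchError>;

// Hypothetical helper: turns the outcome of a match attempt into a message.
fn report(result: &SketchResult<bool>) -> String {
    match result {
        Ok(true) => "matched".to_string(),
        Ok(false) => "did not match".to_string(),
        // The only variant carries the position and a description of the problem.
        Err(SketchError::ParseError { line, column, message }) => {
            format!("invalid pattern at {line}:{column}: {message}")
        }
    }
}
```

The point of the design is that a failed match (Ok(false)) and an unparsable pattern (Err) stay distinct, so callers can decide whether a bad pattern is fatal.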
The following crate features are defined:
| Name | Effect | Is "default"? | Dependent feature(s) |
|---|---|---|---|
| "lookup-ranges" | Causes match/non-match ranges to be implemented in terms of UnicodePointMap (from collect-rs crate), resulting in significant performance improvements in parsing and matching | Yes | |
| "test-regex" | Introduces a dependency to regex crate to support benchmark/example program(s) | No | |
The shwild::matches() function attempts to parse a pattern according to flags and then match the string input against it.
pub mod shwild {
    pub fn matches(
        pattern : &str,
        input : &str,
        flags : i64,
    ) -> Result<bool>;
}
The shwild::shwild_matches!() macro is a shorthand for the shwild::matches() function, providing 2-parameter and 3-parameter forms. The 2-parameter form passes 0 for the flags parameter.
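To make the two forms concrete, here is a minimal sketch of how such a macro could be written with macro_rules!. It is not the crate's actual definition: matches_sketch!, matches_stub(), and the value of SKETCH_IGNORE_CASE are all hypothetical, and the stub only compares whole strings so that the macro has something to call.

```rust
// Assumed flag value, for illustration only; the real IGNORE_CASE may differ.
const SKETCH_IGNORE_CASE: i64 = 1;

// Hypothetical stand-in for shwild::matches(): compares whole strings only,
// optionally ignoring ASCII case. It performs no wildcard matching.
fn matches_stub(pattern: &str, input: &str, flags: i64) -> Result<bool, String> {
    if (flags & SKETCH_IGNORE_CASE) != 0 {
        Ok(pattern.eq_ignore_ascii_case(input))
    } else {
        Ok(pattern == input)
    }
}

macro_rules! matches_sketch {
    // 2-parameter form: passes 0 for the flags.
    ($pattern:expr, $input:expr) => {
        matches_stub($pattern, $input, 0)
    };
    // 3-parameter form: passes the given flags through.
    ($pattern:expr, $input:expr, $flags:expr) => {
        matches_stub($pattern, $input, $flags)
    };
}
```

With this sketch, matches_sketch!("abc", "ABC") expands to a call with flags 0, while matches_sketch!("abc", "ABC", SKETCH_IGNORE_CASE) forwards the flag unchanged.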
The shwild::CompiledMatcher structure is the data structure that is used to parse the pattern and then test the input string. There is a small, but non-zero, cost to parsing patterns (and complex patterns more so, of course), so if matching is to be repeated in a context where performance matters you may prefer to create an instance of CompiledMatcher once and then test inputs against it, as in:
let pattern = r"Where are the* [🐼🐻]s\?";
let flags = 0;
let matcher = shwild::CompiledMatcher::from_pattern_and_flags(pattern, flags).unwrap();
assert!(!matcher.matches(""));
assert!(!matcher.matches("Where are the bears?"));
assert!( matcher.matches("Where are the 🐻s?"));
assert!( matcher.matches("Where are the 🐼s?"));
assert!( matcher.matches("Where are their 🐻s?"));
assert!( matcher.matches("Where are the big brown 🐻s?"));
assert!(!matcher.matches("Where are the teddy-🐻s?"));
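The parse-once, match-many idea can be illustrated with a deliberately simplified standalone sketch. It is not how CompiledMatcher is implemented: SketchMatcher, Element, and match_from are hypothetical names, only literals, '?' and '*' are handled (no ranges, escapes, or flags), and matching uses naive backtracking.

```rust
// Simplified pattern elements; ranges, escapes, and flags are omitted.
#[derive(Debug)]
enum Element {
    Literal(String),
    Wild1,
    WildN,
}

#[derive(Debug)]
struct SketchMatcher {
    elements: Vec<Element>,
}

impl SketchMatcher {
    // Parses the pattern once; every later matches() call reuses the elements.
    fn from_pattern(pattern: &str) -> Self {
        let mut elements = Vec::new();
        let mut literal = String::new();
        for c in pattern.chars() {
            if c == '?' || c == '*' {
                // A wildcard ends any literal fragment collected so far.
                if !literal.is_empty() {
                    elements.push(Element::Literal(std::mem::take(&mut literal)));
                }
                elements.push(if c == '?' { Element::Wild1 } else { Element::WildN });
            } else {
                literal.push(c);
            }
        }
        if !literal.is_empty() {
            elements.push(Element::Literal(literal));
        }
        SketchMatcher { elements }
    }

    fn matches(&self, input: &str) -> bool {
        match_from(&self.elements, input)
    }
}

// Naive backtracking match of the remaining elements against the remaining input.
fn match_from(elements: &[Element], input: &str) -> bool {
    match elements.split_first() {
        // No elements left: succeed only if the input is also exhausted.
        None => input.is_empty(),
        Some((Element::Literal(s), rest)) => input
            .strip_prefix(s.as_str())
            .map_or(false, |tail| match_from(rest, tail)),
        Some((Element::Wild1, rest)) => {
            // Consume exactly one character, of any kind.
            let mut chars = input.chars();
            chars.next().is_some() && match_from(rest, chars.as_str())
        }
        Some((Element::WildN, rest)) => input
            // Try every split point, from the empty match up to the end of input.
            .char_indices()
            .map(|(i, _)| i)
            .chain(std::iter::once(input.len()))
            .any(|i| match_from(rest, &input[i..])),
    }
}
```

A matcher built once with SketchMatcher::from_pattern("Where are the* ?s") can then be applied to any number of inputs without reparsing, which is the cost the CompiledMatcher approach avoids.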
If you ever need to understand the parsed state, you can use the Debug implementation for the CompiledMatcher, as in:
// a pattern for rudimentary Windows path names
let pattern = r"[A-Z]\?*\?*.[ce][ox][em]";
let flags = 0;
let matcher = shwild::CompiledMatcher::from_pattern_and_flags(pattern, flags).unwrap();
// print the parsed state to standard error
eprintln!("matcher={matcher:?}");
No public traits are defined at this time.
T.B.C.
Defect reports, feature requests, and pull requests are welcome on https://github.com/synesissoftware/shwild.Rust.
shwild.Rust has two dependencies, both optional:
- collect-rs - required, for more efficient range matching, if feature "lookup-ranges" is specified;
- regex - required, by some benchmark/example programs only, if feature "test-regex" is specified.
Crates upon which shwild has development dependencies:
None at this time.
shwild is released under the 3-clause BSD license. See LICENSE for details.