Basic usage:
You can either create a new RegX object to start building a regex:
RegX regx = new RegX();
regx.anyChar().anyTimes();
or you can simply import the static method regx()
and start building right away:
import static com.gmanjon.regx.RegX.regx;
...
regx().anyChar().anyTimes();
- Create a regular expression meaning "any number of a's followed by an even number of b's"
Regular Expression: a*(bb)*
RegX Syntax:
String regx = regx().literal("a").anyTimes().followedBy(regx().literalGroup("bb").anyTimes()).toString();
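
To see what this expression accepts, the plain pattern can be run through java.util.regex. This is only a sketch: it assumes the RegX chain above renders to a*(bb)*, and the class name EvenBsDemo is made up for illustration.

```java
import java.util.regex.Pattern;

class EvenBsDemo {
    // The plain regular expression the RegX chain above is expected to produce.
    static final Pattern EVEN_BS = Pattern.compile("a*(bb)*");

    static boolean accepts(String input) {
        // matches() requires the whole input to match, not just a prefix.
        return EVEN_BS.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("aabbbb")); // true: two a's, four b's
        System.out.println(accepts("abbb"));   // false: odd number of b's
        System.out.println(accepts(""));       // true: zero a's and zero b's
    }
}
```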
- Create a regular expression for validating emails (it could follow the mail specifications more closely, but it will do for example purposes)
Regular Expression: [a-zA-Z](\.?\w+)*@(\w+)*\.[a-zA-Z]{2,5}
RegX Syntax:
RegX startAccountName = regx().alphabeticChar();
RegX continueAccountName = regx().literal('.').optional().anyWord().group().anyTimes();
RegX at = regx().literal('@');
RegX domain = regx().anyWord().group().anyTimes().literal('.').alphabeticChar(2, 5);
String regx = startAccountName.followedBy(continueAccountName).followedBy(at).followedBy(domain).toString();
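
The behaviour of the email expression can be probed with java.util.regex alone. The sketch below assumes the RegX chain renders to the expression shown above; EmailPatternDemo and isValid are illustrative names, not part of RegX.

```java
import java.util.regex.Pattern;

class EmailPatternDemo {
    // Same expression as above, with backslashes doubled for a Java string literal.
    static final Pattern EMAIL =
            Pattern.compile("[a-zA-Z](\\.?\\w+)*@(\\w+)*\\.[a-zA-Z]{2,5}");

    static boolean isValid(String candidate) {
        return EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("john.doe@example.com"));  // true
        System.out.println(isValid("john@example"));          // false: no top-level domain
        System.out.println(isValid("1john@example.com"));     // false: must start with a letter
        System.out.println(isValid("john@mail.example.com")); // false: only one dot after '@'
    }
}
```

The last case shows one way the expression is stricter than real addresses: subdomains are rejected.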
All examples assume the RegX object has already been created in a variable called r, either via the static method or the constructor.

🔴 Not yet supported. Be patient, it will be.

Characters

| Construct | Matches | Example |
| --- | --- | --- |
| x | The character x | ```r.literal('x')``` |
| \\ | The backslash character | ```r.literal('\\')``` |
| \0n | The character with octal value 0n (0 <= n <= 7) | ```r.literal(0n)``` |
| \0nn | The character with octal value 0nn (0 <= n <= 7) | ```r.literal(0nn)``` |
| \0mnn | The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7) | ```r.literal(0mnn)``` |
| \xhh | The character with hexadecimal value 0xhh | ```r.literal(0xhh)``` |
| \uhhhh | The character with hexadecimal value 0xhhhh | ```r.literal(0xhhhh)``` |
| \x{h...h} | The character with hexadecimal value 0xh...h (Character.MIN_CODE_POINT <= 0xh...h <= Character.MAX_CODE_POINT) | ```r.literal(0xh...h)``` |
| \t | The tab character ('\u0009') | ```r.literal('\t')``` |
| \n | The newline (line feed) character ('\u000A') | ```r.literal('\n')``` |
| \r | The carriage-return character ('\u000D') | ```r.literal('\r')``` |
| \f | The form-feed character ('\u000C') | ```r.literal('\f')``` |
| \a | The alert (bell) character ('\u0007') | ```r.literal(0x0007)``` or ```r.regex("\\a")``` |
| \e | The escape character ('\u001B') | ```r.literal(0x001B)``` or ```r.regex("\\e")``` |
| \cx | The control character corresponding to x | ```r.literal('\x')``` |
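
The escapes in this table can be tried directly against java.util.regex, which is where the constructs come from. A minimal sketch follows; it does not use RegX, and CharacterEscapeDemo is a hypothetical class name.

```java
import java.util.regex.Pattern;

class CharacterEscapeDemo {
    // True when the whole input is matched by the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\0101", "A"));     // octal 101 is 'A'
        System.out.println(matches("\\x41", "A"));      // hexadecimal 41 is 'A'
        System.out.println(matches("\\u0041", "A"));    // four hex digits
        // A supplementary code point is a surrogate pair in a Java string.
        System.out.println(matches("\\x{1F600}", "\uD83D\uDE00"));
        System.out.println(matches("\\a", "\u0007"));   // alert (bell)
        System.out.println(matches("\\e", "\u001B"));   // escape
        System.out.println(matches("\\cA", "\u0001"));  // control character Ctrl-A
    }
}
```

Every line prints true. Note the doubled backslashes: the Java compiler consumes one level of escaping before the regex engine sees the pattern.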

Character classes

| Construct | Matches | Example |
| --- | --- | --- |
| [abc] | a, b, or c (simple class) | ```r.oneOf("abc")``` |
| [^abc] | Any character except a, b, or c (negation) | ```r.noneOf("abc")``` |
| [a-zA-Z] | a through z or A through Z, inclusive (range) | ```r.range('a','z','A','Z')``` |
| [a-d[m-p]] | a through d, or m through p: [a-dm-p] (union) | ```r.range('a','d','m','p')``` |
| [a-z&&[def]] | d, e, or f (intersection) | ```r.range('a','z').intersectOneOf("def")``` |
| [a-z&&[^bc]] | a through z, except for b and c: [ad-z] (subtraction) | ```r.range('a','z').intersectNoneOf("bc")``` |
| [a-z&&[^m-p]] | a through z, and not m through p: [a-lq-z] (subtraction) | ```r.range('a','z').intersectExclude('m','p')``` |
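
Union, intersection and subtraction are easier to follow with concrete characters. The sketch below checks the plain constructs with java.util.regex; it makes no claim about what the RegX methods generate, and CharacterClassDemo is an illustrative name.

```java
import java.util.regex.Pattern;

class CharacterClassDemo {
    // True when the single character c belongs to the given character class.
    static boolean inClass(String characterClass, char c) {
        return Pattern.matches(characterClass, String.valueOf(c));
    }

    public static void main(String[] args) {
        System.out.println(inClass("[a-d[m-p]]", 'n'));    // true: union
        System.out.println(inClass("[a-d[m-p]]", 'f'));    // false: in neither range
        System.out.println(inClass("[a-z&&[def]]", 'e'));  // true: intersection
        System.out.println(inClass("[a-z&&[^bc]]", 'b'));  // false: b is subtracted
        System.out.println(inClass("[a-z&&[^m-p]]", 'q')); // true: outside m-p
    }
}
```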

Predefined character classes

| Construct | Matches | Example |
| --- | --- | --- |
| . | Any character (may or may not match line terminators) | ```r.anyChar()``` |
| \d | A digit: [0-9] | ```r.digit()``` |
| \D | A non-digit: [^0-9] | ```r.nonDigit()``` |
| \s | A whitespace character: [ \t\n\x0B\f\r] | |
| \S | A non-whitespace character: [^\s] | |
| \w | A word character: [a-zA-Z_0-9] | |
| \W | A non-word character: [^\w] | |
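
The "may or may not match line terminators" remark about the dot depends on the DOTALL flag in java.util.regex. A short sketch of that, together with the other predefined classes, using a hypothetical PredefinedClassDemo class:

```java
import java.util.regex.Pattern;

class PredefinedClassDemo {
    // Whether '.' matches a line feed, with or without the DOTALL flag.
    static boolean dotMatchesNewline(boolean dotAll) {
        int flags = dotAll ? Pattern.DOTALL : 0;
        return Pattern.compile(".", flags).matcher("\n").matches();
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("\\d+\\s\\w+", "42 items")); // true
        System.out.println(Pattern.matches("\\w", "_"));  // true: underscore is a word character
        System.out.println(Pattern.matches("\\W", "-"));  // true: hyphen is not
        System.out.println(dotMatchesNewline(false));     // false by default
        System.out.println(dotMatchesNewline(true));      // true with DOTALL
    }
}
```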

POSIX character classes (US-ASCII only)

| Construct | Matches | Example |
| --- | --- | --- |
| \p{Lower} | A lower-case alphabetic character: [a-z] | 🔴 |
| \p{Upper} | An upper-case alphabetic character: [A-Z] | 🔴 |
| \p{ASCII} | All ASCII: [\x00-\x7F] | 🔴 |
| \p{Alpha} | An alphabetic character: [\p{Lower}\p{Upper}] | 🔴 |
| \p{Digit} | A decimal digit: [0-9] | 🔴 |
| \p{Alnum} | An alphanumeric character: [\p{Alpha}\p{Digit}] | 🔴 |
| \p{Punct} | Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{\|}~ | 🔴 |
| \p{Graph} | A visible character: [\p{Alnum}\p{Punct}] | 🔴 |
| \p{Print} | A printable character: [\p{Graph}\x20] | 🔴 |
| \p{Blank} | A space or a tab: [ \t] | 🔴 |
| \p{Cntrl} | A control character: [\x00-\x1F\x7F] | 🔴 |
| \p{XDigit} | A hexadecimal digit: [0-9a-fA-F] | 🔴 |
| \p{Space} | A whitespace character: [ \t\n\x0B\f\r] | 🔴 |
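
Until RegX supports these classes, they can still be used through java.util.regex. The sketch below also illustrates the "US-ASCII only" caveat: an accented letter is not \p{Alpha} unless the standard UNICODE_CHARACTER_CLASS flag is set. PosixClassDemo is an illustrative name.

```java
import java.util.regex.Pattern;

class PosixClassDemo {
    // Whether the letter e-acute (U+00E9) counts as \p{Alpha} under the given flags.
    static boolean accentedIsAlpha(int flags) {
        return Pattern.compile("\\p{Alpha}", flags).matcher("\u00E9").matches();
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("\\p{XDigit}+", "1aF")); // true
        System.out.println(Pattern.matches("\\p{Punct}", "!"));     // true
        System.out.println(Pattern.matches("\\p{Blank}", "\t"));    // true
        System.out.println(accentedIsAlpha(0));                     // false: ASCII only
        System.out.println(accentedIsAlpha(Pattern.UNICODE_CHARACTER_CLASS)); // true
    }
}
```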

java.lang.Character classes (simple java character type)

| Construct | Matches | Example |
| --- | --- | --- |
| \p{javaLowerCase} | Equivalent to java.lang.Character.isLowerCase() | 🔴 |
| \p{javaUpperCase} | Equivalent to java.lang.Character.isUpperCase() | 🔴 |
| \p{javaWhitespace} | Equivalent to java.lang.Character.isWhitespace() | 🔴 |
| \p{javaMirrored} | Equivalent to java.lang.Character.isMirrored() | 🔴 |
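
The stated equivalence with the java.lang.Character methods can be checked mechanically. This sketch compares each class against its Character counterpart for a single char; JavaCharacterClassDemo and agrees are hypothetical names.

```java
import java.util.regex.Pattern;

class JavaCharacterClassDemo {
    // True when all four classes give the same answer as the Character methods for c.
    static boolean agrees(char c) {
        String s = String.valueOf(c);
        return Pattern.matches("\\p{javaLowerCase}", s) == Character.isLowerCase(c)
            && Pattern.matches("\\p{javaUpperCase}", s) == Character.isUpperCase(c)
            && Pattern.matches("\\p{javaWhitespace}", s) == Character.isWhitespace(c)
            && Pattern.matches("\\p{javaMirrored}", s) == Character.isMirrored(c);
    }

    public static void main(String[] args) {
        // A few sample characters: letters, a space, a bracket and a no-break space.
        for (char c : new char[] {'a', 'A', ' ', '(', '\u00A0'}) {
            System.out.println((int) c + " -> " + agrees(c));
        }
    }
}
```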

Classes for Unicode scripts, blocks, categories and binary properties

| Construct | Matches | Example |
| --- | --- | --- |
| \p{IsLatin} | A Latin script character (script) | 🔴 |
| \p{InGreek} | A character in the Greek block (block) | 🔴 |
| \p{Lu} | An uppercase letter (category) | 🔴 |
| \p{IsAlphabetic} | An alphabetic character (binary property) | 🔴 |
| \p{Sc} | A currency symbol | 🔴 |
| \P{InGreek} | Any character except one in the Greek block (negation) | 🔴 |
| [\p{L}&&[^\p{Lu}]] | Any letter except an uppercase letter (subtraction) | 🔴 |
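
A few concrete characters make the script, block and category distinction clearer. The following sketch uses java.util.regex directly; UnicodeClassDemo is an illustrative name, and the Greek letter alpha is written as the escape \u03B1.

```java
import java.util.regex.Pattern;

class UnicodeClassDemo {
    // True when the whole input is matched by the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\p{IsLatin}", "a"));       // true: Latin script
        System.out.println(matches("\\p{IsLatin}", "\u03B1"));  // false: Greek alpha
        System.out.println(matches("\\p{InGreek}", "\u03B1"));  // true: Greek block
        System.out.println(matches("\\P{InGreek}", "a"));       // true: negation
        System.out.println(matches("\\p{Lu}", "A"));            // true: uppercase letter
        System.out.println(matches("\\p{Sc}", "$"));            // true: currency symbol
        System.out.println(matches("[\\p{L}&&[^\\p{Lu}]]", "A")); // false: uppercase removed
    }
}
```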

Boundary matchers

| Construct | Matches | Example |
| --- | --- | --- |
| ^ | The beginning of a line | ```r.startOfLine()``` |
| $ | The end of a line | ```r.endOfLine()``` |
| \b | A word boundary | |
| \B | A non-word boundary | |
| \A | The beginning of the input | |
| \G | The end of the previous match | |
| \Z | The end of the input but for the final terminator, if any | |
| \z | The end of the input | |
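
Boundary matchers consume no characters, so their effect is easiest to see with find(). The sketch below, using java.util.regex and a hypothetical BoundaryDemo class, shows that ^ and $ only act per line when MULTILINE is set, and how \Z and \z differ on a trailing line terminator.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Index of the first occurrence of word as a whole word, or -1 if there is none.
    static int indexOfWholeWord(String word, String text) {
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
        return m.find() ? m.start() : -1;
    }

    // True when the expression, compiled with the given flags, occurs somewhere in text.
    static boolean found(String regex, int flags, String text) {
        return Pattern.compile(regex, flags).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(indexOfWholeWord("cat", "concat cat"));      // 7
        System.out.println(found("^b$", 0, "a\nb\nc"));                 // false
        System.out.println(found("^b$", Pattern.MULTILINE, "a\nb\nc")); // true
        System.out.println(found("end\\Z", 0, "end\n"));                // true
        System.out.println(found("end\\z", 0, "end\n"));                // false
    }
}
```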

Greedy quantifiers

| Construct | Matches | Example |
| --- | --- | --- |
| X? | X, once or not at all | ```r.literal('X').optional()``` |
| X* | X, zero or more times | ```r.literal('X').anyTimes()``` |
| X+ | X, one or more times | ```r.literal('X').oneOrMoreTimes()``` |
| X{n} | X, exactly n times | ```r.literal('X').times(n)``` |
| X{n,} | X, at least n times | ```r.literal('X').times(n, -1)``` |
| X{n,m} | X, at least n but not more than m times | ```r.literal('X').times(n, m)``` |
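
The bounds of each quantifier can be confirmed with a handful of inputs. This sketch tests the plain constructs with java.util.regex, assuming nothing about the strings RegX emits; GreedyQuantifierDemo is an illustrative name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyQuantifierDemo {
    // The text consumed by the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("X?", ""));         // true: zero occurrences allowed
        System.out.println(Pattern.matches("X{2}", "XX"));     // true
        System.out.println(Pattern.matches("X{2,}", "XXXXX")); // true: no upper bound
        System.out.println(Pattern.matches("X{2,3}", "XXXX")); // false: more than three
        System.out.println(firstMatch("X*", "XXXY"));          // XXX: takes as many as it can
    }
}
```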

Reluctant quantifiers

| Construct | Matches | Example |
| --- | --- | --- |
| X?? | X, once or not at all | ```r.literal('X').optional()``` |
| X*? | X, zero or more times | ```r.literal('X').anyTimesLazy()``` |
| X+? | X, one or more times | ```r.literal('X').oneOrMoreTimesLazy()``` |
| X{n}? | X, exactly n times | ```r.literal('X').times(n)``` |
| X{n,}? | X, at least n times | |
| X{n,m}? | X, at least n but not more than m times | |
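
The "Matches" column reads the same as for the greedy forms; the difference is how much text is consumed when several lengths would work. A minimal java.util.regex sketch, with a hypothetical ReluctantQuantifierDemo class:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantQuantifierDemo {
    // The text consumed by the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("<.+>", "<a><b>"));   // <a><b>  greedy: longest match
        System.out.println(firstMatch("<.+?>", "<a><b>"));  // <a>     reluctant: shortest match
        System.out.println(firstMatch("X{2,4}", "XXXX"));   // XXXX
        System.out.println(firstMatch("X{2,4}?", "XXXX"));  // XX
    }
}
```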

Possessive quantifiers

| Construct | Matches | Example |
| --- | --- | --- |
| X?+ | X, once or not at all | |
| X*+ | X, zero or more times | |
| X++ | X, one or more times | |
| X{n}+ | X, exactly n times | |
| X{n,}+ | X, at least n times | |
| X{n,m}+ | X, at least n but not more than m times | |
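
A possessive quantifier behaves like a greedy one that never gives characters back, which can turn a match into a failure. The sketch below shows this with java.util.regex; PossessiveQuantifierDemo is an illustrative name.

```java
import java.util.regex.Pattern;

class PossessiveQuantifierDemo {
    // True when the whole input is matched by the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Greedy .* first takes everything, then backtracks so that "foo" can match.
        System.out.println(matches(".*foo", "xfoo"));   // true
        // Possessive .*+ takes everything and keeps it, so "foo" has nothing left.
        System.out.println(matches(".*+foo", "xfoo"));  // false
        // When no backtracking is needed, the possessive form matches as usual.
        System.out.println(matches("X++Y", "XXY"));     // true
    }
}
```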

Logical operators

| Construct | Matches | Example |
| --- | --- | --- |
| XY | X followed by Y | |
| X\|Y | Either X or Y | |
| (X) | X, as a capturing group | |
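
Concatenation, alternation and grouping usually appear together. In this java.util.regex sketch a capturing group wraps an alternation so the chosen branch can be read back; LogicalOperatorDemo and animalOf are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogicalOperatorDemo {
    // "cat" or "dog" (captured as group 1), followed by a literal "s".
    static final Pattern PLURAL_PET = Pattern.compile("(cat|dog)s");

    // Returns the captured animal, or null when the input does not match.
    static String animalOf(String input) {
        Matcher m = PLURAL_PET.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(animalOf("dogs"));  // dog
        System.out.println(animalOf("cats"));  // cat
        System.out.println(animalOf("cows"));  // null
    }
}
```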

Back references

| Construct | Matches | Example |
| --- | --- | --- |
| \n | Whatever the nth capturing group matched | |
| `\k<name>` | Whatever the named-capturing group "name" matched | |
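
Back references repeat what a group actually matched, not the group's pattern. The sketch below uses a numbered reference to find doubled letters and a named one to require matching quote characters; it relies on java.util.regex only, and BackReferenceDemo is an illustrative name.

```java
import java.util.regex.Pattern;

class BackReferenceDemo {
    // A word character immediately followed by the same character again.
    static boolean hasDoubledLetter(String input) {
        return Pattern.compile("(\\w)\\1").matcher(input).find();
    }

    // Text wrapped in quotes, where the closing quote must equal the opening one.
    static boolean isQuoted(String input) {
        return Pattern.matches("(?<q>['\"]).*?\\k<q>", input);
    }

    public static void main(String[] args) {
        System.out.println(hasDoubledLetter("hello"));  // true: "ll"
        System.out.println(hasDoubledLetter("world"));  // false
        System.out.println(isQuoted("'abc'"));          // true
        System.out.println(isQuoted("'abc\""));         // false: quotes differ
    }
}
```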

Quotation

| Construct | Matches | Example |
| --- | --- | --- |
| \ | Nothing, but quotes the following character | |
| \Q | Nothing, but quotes all characters until \E | |
| \E | Nothing, but ends quoting started by \Q | |
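
Quoting matters whenever literal text contains metacharacters. This java.util.regex sketch compares an unquoted expression with the three ways of quoting it; the standard Pattern.quote method wraps its argument in \Q and \E, and QuotationDemo is a hypothetical class name.

```java
import java.util.regex.Pattern;

class QuotationDemo {
    // True when the whole input is matched by the given regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Unquoted, '+' is a quantifier: the expression means "one or more 1s, then 1=2".
        System.out.println(matches("1+1=2", "1+1=2"));         // false
        System.out.println(matches("1+1=2", "11=2"));          // true
        // Three ways to treat the '+' literally.
        System.out.println(matches("1\\+1=2", "1+1=2"));       // true
        System.out.println(matches("\\Q1+1=2\\E", "1+1=2"));   // true
        System.out.println(matches(Pattern.quote("1+1=2"), "1+1=2")); // true
    }
}
```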

Special constructs (named-capturing and non-capturing)

| Construct | Matches | Example |
| --- | --- | --- |
| `(?<name>X)` | X, as a named-capturing group | |
| (?:X) | X, as a non-capturing group | |
| (?idmsuxU-idmsuxU) | Nothing, but turns match flags i d m s u x U on - off | |
| (?idmsux-idmsux:X) | X, as a non-capturing group with the given flags i d m s u x on - off | |
| (?=X) | X, via zero-width positive lookahead | |
| (?!X) | X, via zero-width negative lookahead | |
| `(?<=X)` | X, via zero-width positive lookbehind | |
| `(?<!X)` | X, via zero-width negative lookbehind | |
| (?>X) | X, as an independent, non-capturing group | |
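
Several of these constructs are zero-width or non-capturing, so their effect shows up in what is captured or consumed instead of in what matches. A sketch with java.util.regex follows; SpecialConstructDemo and its helper methods are illustrative names, not RegX API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpecialConstructDemo {
    // Value of the named group when regex matches the whole input, else null.
    static String namedGroup(String regex, String input, String name) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(name) : null;
    }

    // The text consumed by the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Named-capturing group.
        System.out.println(namedGroup("(?<year>\\d{4})-(?<month>\\d{2})", "2024-05", "year")); // 2024
        // A non-capturing group does not count as a group.
        System.out.println(Pattern.compile("(?:ab)+(c)").matcher("ababc").groupCount());       // 1
        // Inline flag: case-insensitive matching.
        System.out.println(Pattern.matches("(?i)abc", "ABC"));                                 // true
        // Positive lookahead: the '@' is required but not consumed.
        System.out.println(firstMatch("\\w+(?=@)", "user@host"));                              // user
        // Negative lookbehind: a number not preceded by '$'.
        System.out.println(firstMatch("(?<!\\$)\\b\\d+", "cost $30 or 40"));                   // 40
        // Independent group: a+ keeps all the a's, so the following "ab" cannot match.
        System.out.println(Pattern.matches("(?>a+)ab", "aaab"));                               // false
    }
}
```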