
irregular's People

Contributors

cadamini, chadmando, figueroadavid, startautomating

irregular's Issues

Use-RegEx: -IncludeInputObject

Describe the Problem
When matching a series of files or functions, Use-Regex does not make it easy to pair the match (or extracted match) with the input.

This makes an object pipeline using both pieces of information considerably less elegant.

Describe the solution you'd like

Use-Regex should have a parameter -IncludeInputObject (aliased to -OutputInputObject). This should include the input object in the output when using -Extract.

Describe alternatives you've considered

Use-Regex: -IncludeMatch could signal to include the input object.

Write-RegEx -If -Then does not work with empty lookahead

Describe the bug

Write-RegEx -If -Then could be useful for creating balancing groups, if it produced valid RegEx.

To Reproduce

Write-RegEx -If Foo -Then ?!  # produces an invalid RegEx:  (?(Foo)(?:?!))

Expected behavior

(?(Foo)(?!))
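
One way to check the two patterns quoted above is to hand them straight to the .NET regex constructor. This is only a sketch: `Test-RegexSyntax` is a made-up helper name, and it reports nothing more than whether a pattern parses.

```powershell
# Hypothetical helper: returns $true when .NET accepts the pattern, $false otherwise.
function Test-RegexSyntax {
    param([string]$Pattern)
    try {
        $null = [regex]::new($Pattern)
        $true
    } catch {
        $false
    }
}

$produced = Test-RegexSyntax '(?(Foo)(?:?!))'   # pattern currently produced
$expected = Test-RegexSyntax '(?(Foo)(?!))'     # pattern this issue expects
```

The produced pattern should be rejected because the `?` inside `(?:?!)` is a quantifier with nothing to repeat, while `(?!)` is an empty negative lookahead that always fails, which is what a balancing construct needs.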

Improve readability on the console when using `-Extract` to create PSCustomObjects

PSCustomObjects created by -Extract are hard to read on the command line

The default output is hard to read on the command line, unless you filter properties.

Wanted Behavior

Output where named capture groups become object properties, excluding 0 and Match

Example using -match

$regex = @'
(?x)
# parse output from: "netstat -a -n -o"
^\s+
(?<Protocol>\S+)
\s+
(?<LocalAddress>\S+)
\s+
(?<ForeignAddress>\S+)
\s+
(?<State>\S{0,})?
\s+
(?<Pid>\S+)$
'@

netstat -a -n -o | %{
    if($_ -match $regex) {
      $matches.remove(0)
      [pscustomobject]$Matches
    }
} | select -Last 2 -ov 'last'
| ft

$last | fl


Current Behavior


netstat -a -n -o 
| Use-RegEx -Extract -Pattern $regex -Match { $_ } -ea SilentlyContinue
| select -Last 2 -ov last
|Ft

$last | fl

Change

Remove or hide the [Match] and 0 properties by default when using -Extract

netstat -a -n -o 
| Use-RegEx -Extract -Pattern $regex -Match { $_ } -ea SilentlyContinue
| select -ExcludeProperty 'match', '0'
| select -Last 2 -ov last
| ft

$last | fl


Regular Expression for GCode

Describe the Desired Expression

A regular expression that can extract out GCode instructions.

Provide some Sample Source Text

; generated by Slic3r 1.3.0 on 2021-10-05 at 19:28:57

; external perimeters extrusion width = 0.44mm (3.38mm^3/s)
; perimeters extrusion width = 0.48mm (7.54mm^3/s)
; infill extrusion width = 0.48mm (10.05mm^3/s)
; solid infill extrusion width = 0.48mm (2.51mm^3/s)
; top infill extrusion width = 0.48mm (1.88mm^3/s)

M127
M118 X38.97 Y23.08 Z3.00 T0
M140 S50 T0
M104 S230 T0
M104 S0 T1
M107
G90
G28
M132 X Y Z A B
G1 Z50.000 F420
G161 X Y F3300
M7 T0
M6 T0
M651
M907 X100 Y100 Z40 A100 B20
M108 T0
M106
; Filament gcode

G21 ; set units to millimeters
G90 ; use absolute coordinates
M73 P0
G1 Z0.400 F7800.000
G1 E-2.00000 F2400.00000
G1 X16.663 Y-10.414 F7800.000
G1 E0.00000 F2400.00000
G1 F1800
G1 X18.593 Y-9.147 E0.27422
G1 X19.941 Y-7.377 E0.53847
G1 X29.941 Y11.623 E3.08846
G1 X30.714 Y14.750 E3.47102
G1 X30.714 Y19.750 E4.06485
G1 X30.317 Y22.025 E4.33907
G1 X29.173 Y24.030 E4.61330
G1 X27.417 Y25.530 E4.88752
G1 X25.257 Y26.345 E5.16175
G1 X24.000 Y26.464 E5.31171
G1 X-6.000 Y26.464 E8.87467

Highlight What you'd like to match

Each line should be matched. If the line starts with a ;, it should be considered a comment. Otherwise, the first word is the instruction and all subsequent words are arguments. Anything after ; on a given line should be considered a comment.
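
A minimal sketch of the rules above, applied one line at a time. The pattern, the group names and the `ConvertFrom-GCodeLine` helper are assumptions made for illustration; in particular, an instruction is assumed to be one letter followed by digits.

```powershell
# Assumed pattern: optional instruction, zero or more arguments, optional ';' comment.
$gcodeLine = [regex]::new(
    '^\s*(?:(?<Instruction>[A-Za-z]\d+)(?:\s+(?<Argument>[^\s;]+))*)?\s*(?:;\s*(?<Comment>.*))?$'
)

# Hypothetical helper: turns one line of GCode into an object.
function ConvertFrom-GCodeLine {
    param([Parameter(ValueFromPipeline)][string]$Line)
    process {
        if ([string]::IsNullOrWhiteSpace($Line)) { return }
        $m = $gcodeLine.Match($Line)
        if (-not $m.Success) { return }
        [pscustomobject]@{
            Instruction = $m.Groups['Instruction'].Value
            # .NET keeps every capture of a repeated group, so each argument is preserved.
            Arguments   = @($m.Groups['Argument'].Captures | ForEach-Object Value)
            Comment     = $m.Groups['Comment'].Value
        }
    }
}

$move    = 'G1 X16.663 Y-10.414 F7800.000' | ConvertFrom-GCodeLine
$units   = 'G21 ; set units to millimeters' | ConvertFrom-GCodeLine
$comment = '; Filament gcode' | ConvertFrom-GCodeLine
```

Comment-only lines come back with an empty Instruction, so a caller can filter on that property.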

Regular Expression for Cron Intervals

Describe the Desired Expression
It would be great to match cron intervals

Provide some Sample Source Text
0 0-23 * * *
11 11 * * 1,2,3,4,5

Highlight What you'd like to match
Minute, Hour, DayOfMonth, Month, DayOfWeek
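
A rough sketch of such an expression, assuming the standard five-field cron order (minute, hour, day of month, month, day of week) and fields built only from digits, `*`, `,`, `-` and `/`. Month and weekday names are not handled, and the group names are placeholders.

```powershell
# Assumption: every field uses the same small character set.
$field = '[0-9*,/-]+'
$cron  = [regex]::new(
    ('^\s*(?<Minute>{0})\s+(?<Hour>{0})\s+(?<DayOfMonth>{0})\s+(?<Month>{0})\s+(?<DayOfWeek>{0})\s*$' -f $field)
)

$parsed = foreach ($line in '0 0-23 * * *', '11 11 * * 1,2,3,4,5') {
    $m = $cron.Match($line)
    [pscustomobject]@{
        Minute     = $m.Groups['Minute'].Value
        Hour       = $m.Groups['Hour'].Value
        DayOfMonth = $m.Groups['DayOfMonth'].Value
        Month      = $m.Groups['Month'].Value
        DayOfWeek  = $m.Groups['DayOfWeek'].Value
    }
}
$parsed
```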

Set-Regex: Specify how to know the path which is used to save the file if I do not use the Path parameter

The documentation :

    -Path <String>
        The path to the file.  If this is not provided, it will save regular expressions to the user's Irregular
        module path.

1 - When I save a regex, I can't tell where the cmdlet saves the file.

2 - When the module was not installed with Install-Module, how the path is chosen is not documented.

Describe the solution you'd like

For 1: add a Verbose parameter.
For 2: document the behavior in this case.

Additional context

The Irregular module was not installed by Install-Module (no internet access on the server):

IPMO C:\Users\MyAccount\Downloads\irregular\Irregular.psd1

$env:psmodulepath
C:\Users\MyAccount\Documents\WindowsPowerShell\Modules;C:\Program Files\WindowsPowerShell\Modules;C:\Windows\system32\WindowsPowerShell\v1.0\Modules ...

In my case, the file is saved under the first path listed in $env:psmodulepath:

C:\Users\MyAccount\Documents\WindowsPowerShell\Modules\Irregular\Regex

Found with Trace-Command:

Trace-Command PathResolution -expression {
$StartAnchor='\[1'
$EndAnchor ='P]]'
Write-RegEx -Name ExtractDataBetweenMarkers -After ${StartAnchor} -CharacterClass Any  -Greedy -Lazy| 
 write-regex -Before $EndAnchor |Set-Regex } -pshost


#DEBUG: PathResolution Information: 0 :     RESOLVED PATH:
#C:\Users\MyAccount\Documents\WindowsPowerShell\Modules\Irregular\RegEx\ExtractDataBetweenMarkers.regex.txt

The cmdlet creates the '\Irregular\RegEx' path.

Write-RegEx.ps1 help example #4 doesn't work for all email addresses

In Write-RegEx.ps1, the fourth .Example provides code for capturing an e-mail address, but the sample code provided fails if the email address contains subdomains (e.g. [email protected] or [email protected]).

I'm also not sure what would happen if an email address or domain contains multiple hyphens ([email protected] or [email protected]).

FWIW, emailRegex.com has a pretty exhaustive .NET RegEx string for email addresses, but it might be too much for an example.
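
As a middle ground for a help example, a pattern along these lines tolerates subdomains and hyphens without trying to be exhaustive. It is a sketch only: the pattern, the group names and the sample address are made up for illustration and are not taken from Write-RegEx.ps1.

```powershell
# Assumed pattern: one or more dot-separated domain labels after the first one.
$email = [regex]::new('(?<UserName>[\w.+-]+)@(?<Domain>[\w-]+(?:\.[\w-]+)+)')

# Made-up sample address with a subdomain and a hyphenated label.
$m = $email.Match('Contact: first.last@mail.example-site.co.uk today')
$m.Value
$m.Groups['Domain'].Value
```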

To Reproduce
emailRegex.Match("[email protected]")

Expected behavior
Returns the entire email address

Actual behavior
Returns [email protected]

Additional context
This came up during your talk to the NY PowerShell MeetUp.

Cool talk, btw!

Export-RegEx -As EmbeddedEngine is a no-op

Describe the bug

Export-RegEx -As EmbeddedEngine does nothing

To Reproduce

Export-RegEx -As EmbeddedEngine -Name * -Path .\Test.ps1

Does not create test.ps1, or error out.

Additional context

It appears that this has been internally renamed to 'Embedded', but not updated in the [ValidateSet()] of Get-RegEx or Export-RegEx.

?<PowerShell_HelpField> is slightly incorrect

?<PowerShell_HelpField> is slightly incorrect.

It matches any whitespace after the field.

It should match any whitespace except a newline or carriage return.

Then it should match the carriage returns and newlines, and then match the Content of the field.

Additionally, ?<PowerShell_HelpField> does not locate the end correctly.

It looks for . followed by any word characters.

It should instead look for dot followed by any valid help field names.
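
The behavior described above could look roughly like the sketch below. It is not the module's actual pattern: the keyword list is a subset of the comment-based help keywords, and the group names `Field`, `Argument` and `Content` are assumptions.

```powershell
# Subset of the comment-based help keywords; extend as needed.
$names = 'SYNOPSIS|DESCRIPTION|PARAMETER|EXAMPLE|INPUTS|OUTPUTS|NOTES|LINK|COMPONENT|ROLE|FUNCTIONALITY'

# [^\S\r\n] reads as "whitespace other than carriage return or newline".
$helpField = [regex]::new((@(
    '(?im)^[^\S\r\n]*\.(?<Field>' + $names + ')\b'               # .FIELDNAME at the start of a line
    '[^\S\r\n]*(?<Argument>[^\r\n]*?)[^\S\r\n]*'                 # optional argument, e.g. a parameter name
    '(?:\r?\n)+'                                                 # then the line break(s)
    '(?<Content>(?s:.*?))'                                       # then the content, lazily
    '(?=^[^\S\r\n]*\.(?:' + $names + ')\b|^[^\S\r\n]*#>|\z)'     # stop at the next field, "#>", or end of text
) -join ''))

$sample = @'
<#
.SYNOPSIS
    Does a thing
.DESCRIPTION
    Longer text that mentions .NET in passing.
#>
'@

$fields = foreach ($m in $helpField.Matches($sample)) {
    [pscustomobject]@{
        Field   = $m.Groups['Field'].Value
        Content = $m.Groups['Content'].Value.Trim()
    }
}
$fields
```

Because the end condition only accepts a known field name at the start of a line, the ".NET" inside the description does not cut the content short.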

Markdown Lists

Describe the Desired Expression

Handle Markdown Lists

Provide some Sample Source Text

  • foo
    • bar
      • baz
  • ThisIsDone
  • ThisIsnot
  1. One thing
  2. Another Thing
  3. One More Thing

Highlight What you'd like to match

Each List Item

What parts of the samples should match? Which pieces of data are important to extract from the match?

  • The list item line
  • The indentation level
  • If the list is a task
  • If the list is a number
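
A sketch that extracts the four pieces of data listed above from a single line. It assumes the usual Markdown source syntax (`-`, `*` or `+` bullets, `1.` or `1)` numbers, `[ ]` and `[x]` task boxes); the pattern and the `Get-MarkdownListItem` helper are hypothetical.

```powershell
$listItem = [regex]::new(
    '^(?<Indent>[\t ]*)(?:(?<Bullet>[-*+])|(?<Number>\d+)[.)])[\t ]+(?:\[(?<Task>[ xX])\][\t ]+)?(?<Text>.*)$'
)

# Hypothetical helper: emits one object per list-item line, nothing for other lines.
function Get-MarkdownListItem {
    param([Parameter(ValueFromPipeline)][string]$Line)
    process {
        $m = $listItem.Match($Line)
        if (-not $m.Success) { return }
        [pscustomobject]@{
            Indent     = $m.Groups['Indent'].Length   # indentation in characters
            IsNumbered = $m.Groups['Number'].Success
            IsTask     = $m.Groups['Task'].Success
            IsDone     = $m.Groups['Task'].Value -eq 'x'
            Text       = $m.Groups['Text'].Value
        }
    }
}

$done     = '  - [x] ThisIsDone' | Get-MarkdownListItem
$numbered = '  1. One thing' | Get-MarkdownListItem
$nested   = '      - baz' | Get-MarkdownListItem
$plain    = 'Not a list item' | Get-MarkdownListItem
```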

Requested RegEx: ?<C_IfDef>

Describe the Desired Expression

Should Match a C/C++ #ifdef preprocessor statement, up until the #endif.

Provide some Sample Source Text

#ifdef Windows
// compile this code
#endif 

#ifndef Windows
// don't compile, it's not Windows
#endif

But not:

//#ifdef Windows
// this really doesn't matter
//#endif

What parts of the samples should match? Which pieces of data are important to extract from the match?

Each if statement should match.

The type of if statement and the remainder of the line will be important.
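
One possible shape for this expression, as a sketch. It assumes the directive starts its line (after optional whitespace), which is what keeps the commented-out sample from matching, and it does not handle nested `#ifdef` blocks. The group names are placeholders.

```powershell
$ifDef = [regex]::new(
    '(?ms)^[\t ]*#[\t ]*(?<Type>ifn?def)[\t ]+(?<Condition>[^\r\n]*?)[\t ]*\r?\n(?<Body>.*?)^[\t ]*#[\t ]*endif\b'
)

$source = @'
#ifdef Windows
// compile this code
#endif

#ifndef Windows
// don't compile, it's not Windows
#endif

//#ifdef Windows
// this really doesn't matter
//#endif
'@

$blocks = foreach ($m in $ifDef.Matches($source)) {
    [pscustomobject]@{
        Type      = $m.Groups['Type'].Value
        Condition = $m.Groups['Condition'].Value
        Body      = $m.Groups['Body'].Value.Trim()
    }
}
$blocks
```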

New-RegEx: -NotCharacterClass, -NotLiteralCharacter

Is your feature request related to a problem? Please describe.

Regex character class subtraction is annoying.

New-Regex should include something to abstract it.

Describe the solution you'd like

New-Regex should add -NotCharacterClass and -NotLiteralCharacter.

Without additional parameters, these should act like -CharacterClass and -LiteralCharacter, except that the selection set should be prefixed with ^ (to indicate that it is not those characters)

When provided with -CharacterClass or -LiteralCharacter, these should create a character class subtraction.

# Match anything but punctuation
New-RegEx -NotCharacterClass Punctuation 

# Match any punctuation except open/close/quote/endquote, and comma.
New-RegEx -CharacterClass Punctuation -NotCharacterClass PunctuationOpen, PunctuationClose, PunctuationInitialQuote, PunctuationFinalQuote -NotLiteralCharacter ','  
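
For reference, the raw .NET syntax these two calls would need to produce might look like the sketch below. The exact text New-RegEx would emit is an assumption; only the underlying constructs (`\P{...}` negation and `[base-[excluded]]` subtraction) are standard .NET regex features.

```powershell
# Anything but punctuation: a negated Unicode category.
$notPunctuation = [regex]::new('\P{P}')

# Punctuation minus open, close, initial-quote and final-quote punctuation and the comma,
# written as a .NET character class subtraction.
$somePunctuation = [regex]::new('[\p{P}-[\p{Ps}\p{Pe}\p{Pi}\p{Pf},]]')

$somePunctuation.IsMatch('!')
$somePunctuation.IsMatch('(')
$somePunctuation.IsMatch(',')
```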

Regexes for Subtitles

Describe the Desired Expression

It would be great to have Regexes to parse subtitle files. SRT and WebVTT are both relatively straightforward.

Provide some Sample Source Text

SRT and WebVTT are both relatively straightforward.
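
As a starting point, the cue timing line is the part the two formats share: SRT writes milliseconds after a comma, WebVTT after a dot. The sketch below assumes timestamps always include hours (WebVTT also allows a shorter form, which is not handled), and the sample line and helper name are made up.

```powershell
$cueTiming = [regex]::new(
    '(?<Start>\d{2,}:\d{2}:\d{2}[.,]\d{3})[\t ]+-->[\t ]+(?<End>\d{2,}:\d{2}:\d{2}[.,]\d{3})'
)

# Hypothetical helper: normalizes the SRT comma, then parses as a [timespan].
function ConvertTo-CueTime {
    param([string]$Text)
    [timespan]::Parse($Text.Replace(',', '.'), [cultureinfo]::InvariantCulture)
}

$m     = $cueTiming.Match('00:00:01,500 --> 00:00:04,000')
$start = ConvertTo-CueTime $m.Groups['Start'].Value
$end   = ConvertTo-CueTime $m.Groups['End'].Value
```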

Markdown YAML Headers

Describe the Desired Expression

Markdown YAML Headers

Provide some Sample Source Text


This: "Is a YamlHeader If It's at the top of a doc"

Highlight What you'd like to match

The entire header.

What parts of the samples should match? Which pieces of data are important to extract from the match?

The header and its content.
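
A sketch of one way to match this, assuming the header is fenced by `---` lines (the usual front matter convention) and must begin at the very first character of the document. The group name `Yaml` is a placeholder.

```powershell
$yamlHeader = [regex]::new('(?s)\A---[\t ]*\r?\n(?<Yaml>.*?)\r?\n---[\t ]*(?:\r?\n|\z)')

$doc = @'
---
This: "Is a YamlHeader If It's at the top of a doc"
---
# Heading
'@

$m = $yamlHeader.Match($doc)
$m.Groups['Yaml'].Value
```

Anchoring with `\A` rather than `^` means the same block further down the document is left alone.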

Make Use-RegEx -Extract smarter

Is your feature request related to a problem? Please describe.

Use-RegEx -Extract is great for creating a property bag. It could be even more helpful if it auto-converted primitive types, such as [Timespan], [Datetime], [float], [int], [bool] . This functionality is possible with -Coerce, but requires knowledge of each capture.

Describe the solution you'd like

Either:

  • Make -Extract auto-coerce primitive types
  • Make a [switch] parameter that controls auto-coercion
  • Make a parameter that allows one to control the method of -Extraction, with a value that allows auto-coercion.
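
Whichever option is chosen, the coercion itself could be sketched as a chain of TryParse calls. `ConvertTo-Primitive` is a hypothetical helper, not part of Irregular, and the order of attempts is an assumption: integers come first because a bare number also parses as a [Timespan] in days, and [Timespan] is tried before [DateTime].

```powershell
# Hypothetical helper: returns the first primitive type the text parses as, else the text itself.
function ConvertTo-Primitive {
    param([string]$Text)
    $invariant = [cultureinfo]::InvariantCulture

    [int]$int = 0
    if ([int]::TryParse($Text, [ref]$int)) { return $int }

    [double]$double = 0
    if ([double]::TryParse($Text, [System.Globalization.NumberStyles]::Float, $invariant, [ref]$double)) { return $double }

    [bool]$bool = $false
    if ([bool]::TryParse($Text, [ref]$bool)) { return $bool }

    [timespan]$timeSpan = [timespan]::Zero
    if ([timespan]::TryParse($Text, $invariant, [ref]$timeSpan)) { return $timeSpan }

    [datetime]$dateTime = [datetime]::MinValue
    if ([datetime]::TryParse($Text, $invariant, [System.Globalization.DateTimeStyles]::None, [ref]$dateTime)) { return $dateTime }

    $Text
}

'10674', '28.0', 'true', '00:07:04.80', '2021-10-05', 'hello' |
    ForEach-Object { (ConvertTo-Primitive $_).GetType().Name }
```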

?<FFMpeg_Progress> should have optional whitespace after each field name

Describe the bug

This should have matched ?<FFMpeg_Progress>:

To Reproduce

"frame=10674 fps=1333 q=28.0 size=   20736kB time=00:07:04.80 bitrate= 399.9kbits/s speed=53.1x" | ?<FFmpeg_Progress>

Expected behavior

It matches

Actual behavior

It does not match, because the Regex expects at least one whitespace after frame=
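
A sketch of a more tolerant pattern, checked against the line from the repro. It is not the module's current expression: the group names and the assumed field order are taken from the sample line only, and `\s*` after each `=` accepts zero or more spaces.

```powershell
$ffmpegProgress = [regex]::new(
    'frame=\s*(?<Frame>\d+)\s+fps=\s*(?<Fps>[\d.]+)\s+q=\s*(?<Q>-?[\d.]+)\s+' +
    'size=\s*(?<Size>\d+)kB\s+time=\s*(?<Time>[\d:.]+)\s+' +
    'bitrate=\s*(?<Bitrate>[\d.]+)kbits/s\s+speed=\s*(?<Speed>[\d.]+)x'
)

$line = 'frame=10674 fps=1333 q=28.0 size=   20736kB time=00:07:04.80 bitrate= 399.9kbits/s speed=53.1x'
$m = $ffmpegProgress.Match($line)
$m.Groups['Frame'].Value
$m.Groups['Time'].Value
```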

[Question] About the best way to get information about the "matched" file

Hello,

When I pass files as parameters I would like the match to show also the information of the file.
For example if I execute:

Get-ChildItem -Path '.\' -Filter *.ps1|ForEach-Object{ $_|?<PowerShell_Requires> }

I get something like:

StartIndex EndIndex Value
---------- -------- -----
       153      192 #requires -Version 2.0 -Modules ShowUI
         0       23 #requires -Version 2.0
         0       23 #requires -Version 1.0
         0       23 #Requires -Version 3.0
        24       59 #Requires -Modules ActiveDirectory

Checking the type I see that it is System.Text.RegularExpressions.Match.

To show which file the match corresponds to I used:

Get-ChildItem -Path '.\' -Filter *.ps1|ForEach-Object{ $_.FullName; $_|?<PowerShell_Requires> }

but honestly that's not what I would want to get.

Is there a "neater" way to get the match with the file info included?
For example some type like System.Text.RegularExpressions.FileMatch that in addition to StartIndex, EndIndex and Value contains the File System Object.
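
Until the module offers something like that, the pairing can be sketched with plain .NET regex calls. Everything here is an assumption made for illustration: `Get-FileMatch` is a hypothetical helper, the pattern is a simplified stand-in for ?<PowerShell_Requires>, and EndIndex is computed as Index + Length.

```powershell
# Simplified stand-in for ?<PowerShell_Requires>.
$requires = [regex]::new('(?im)^[\t ]*#requires\b[^\r\n]*')

# Hypothetical helper: emits one object per match, carrying the file along.
function Get-FileMatch {
    param(
        [Parameter(Mandatory, ValueFromPipeline)][System.IO.FileInfo]$File,
        [Parameter(Mandatory)][regex]$Pattern
    )
    process {
        # -Raw yields nothing for an empty file, so cast to [string] first.
        $text = [string](Get-Content -LiteralPath $File.FullName -Raw)
        foreach ($m in $Pattern.Matches($text)) {
            [pscustomobject]@{
                File       = $File
                StartIndex = $m.Index
                EndIndex   = $m.Index + $m.Length
                Value      = $m.Value
            }
        }
    }
}

Get-ChildItem -Path . -Filter *.ps1 -File | Get-FileMatch -Pattern $requires
```

The built-in Select-String cmdlet also reports the path of each match, which may be enough when only the file name is needed.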

Best regards,
Claudio Salvio

Need discoverability of the "smart aliases"

Basically, the problem is that when one reads a script or a blog post that uses the module, they use a bunch of commands like ?<Whatever> that don't exist (or at least, aren't discoverable when the module isn't imported).

I think the best thing would be to put the whole list of aliases in the module manifest.

Last example in README.md does not return values, just keys

Describe the bug

The last example of PowerShell code in the README.md does not return values.

"number: 1
string: 'hello'" | ? -Split |
Foreach-Object {
$key = $_ | ? -Until -Trim -IncludeMatch
$value = $key | ? -Until -Trim
@{$key.Trim(':')=$value}
}

To Reproduce

Steps to reproduce the behavior:

  1. Go to main code page on GitHub for Irregular
  2. Scroll down to bottom of page
  3. Click on icon to copy example code
  4. Paste PowerShell code into PowerShell terminal window
  5. Observe that the value content is missing

Expected behavior
I expect keys and values to be displayed.

Actual behavior
Name                           Value
----                           -----
number
string

Additional context
This behavior was seen in both Windows PowerShell 5.1 and PowerShell 7.2.5
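
For comparison, the result the README example seems to be aiming for can be produced with the built-in -split operator. This is a sketch of the expected output only; it does not use the `?` alias and says nothing about where the README example goes wrong.

```powershell
$text = "number: 1`nstring: 'hello'"

$pairs = $text -split '\r?\n' | ForEach-Object {
    # Split each line once, at the first colon and any spaces after it.
    $key, $value = $_ -split ':\s*', 2
    @{ $key = $value }
}
$pairs
```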

?<C_Enum>

Describe the Desired Expression

Match an enum in C/C++

Add a Remove-Regex cmdlet

Is your feature request related to a problem? Please describe.

When I delete a module containing a regex, it is still present in the list returned by the Get-Regex cmdlet.

Describe the solution you'd like

In case the imported file contains only one regex, removing the module should remove the regex from the list returned by Get-RegEx.

Additional context

Repro :

Write-RegEx -Name TestExtractDataBetweenMarkers -After ${StartAnchor} -CharacterClass Any  -Greedy -Lazy|  write-regex -Before $EndAnchor |Set-Regex -Path c:\temp\regex
Import-Regex  C:\temp\regex\TestExtractDataBetweenMarkers.regex.txt
Get-RegEx -Name TestExtractDataBetweenMarkers

# Name                          Description
# ----                          -----------
# TestExtractDataBetweenMarkers

Get-Module
# ModuleType Version    Name                                ExportedCommands
# ---------- -------    ----                                ----------------
# Script     0.0        ?<TestExtractDataBetweenMarkers>    ?<TestExtractDataBetweenMarkers>
# Script     0.6        Irregular                           {Export-RegEx, Get-RegEx, Import-RegEx, Set-Regex, Show-Re...
# ...

Remove-Module ?<TestExtractDataBetweenMarkers>
Get-RegEx -Name TestExtractDataBetweenMarkers
# Name                          Description
# ----                          -----------
# TestExtractDataBetweenMarkers


?<TestExtractDataBetweenMarkers>
# ?<TestExtractDataBetweenMarkers> : The term '?<TestExtractDataBetweenMarkers>' is not recognized as the name of a cmdlet,
# function, script file, or operable program. Check the spelling of the name, or if a path was included,
# verify that the path is correct and try again.
# At line:1 char:1
# + ?<TestExtractDataBetweenMarkers>
# + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#     + CategoryInfo          : ObjectNotFound: (?<TestExtractDataBetweenMarkers>:String) [], CommandNotFoundException
#     + FullyQualifiedErrorId : CommandNotFoundException

Set-Regex throws WriteErrorException when the -Modifier parameter is used with Write-RegEx

Describe the bug
Set-Regex throws WriteErrorException when the -Modifier parameter is used with Write-RegEx

To Reproduce

$StartAnchor='\[1'
$EndAnchor ='P]]'
Write-RegEx -Name ExtractDataBetweenMarkers -Modifier 'SingleLine' -After ${StartAnchor} -CharacterClass Any  -Greedy -Lazy |
Write-Regex -Before $EndAnchor |Set-Regex

Expected behavior
No error (?) :

 Write-RegEx -Name ExtractDataBetweenMarkers -Modifier 'SingleLine' -After ${StartAnchor} -CharacterClass Any  -Greedy -Lazy |
Write-Regex -Before $EndAnchor |Set-Regex

Actual behavior

Set-Regex : Must provide a -Name, or start the pattern with a named capture.
At line:2 char:34
+  Write-Regex -Before $EndAnchor |Set-Regex
+                                  ~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Irregular.Missing.Name,Set-Regex

My regex seems to have a capture name :

(?s)(?<ExtractDataBetweenMarkers>(?<=\[1).*?)(?=P]])

When I remove the -Modifier parameter, I get no error:

 Write-RegEx -Name ExtractDataBetweenMarkers -After ${StartAnchor} -CharacterClass Any  -Greedy -Lazy | Write-Regex -Before $EndAnchor |Set-Regex
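
The error text suggests the name is looked up only at the very start of the pattern, where the `(?s)` modifier now sits. That is a guess from the message, not a reading of Set-Regex's source. Under that assumption, a lookup that skips leading inline options could be sketched like this:

```powershell
# Assumed check: a leading named capture, optionally preceded by inline options such as (?s).
$leadingName = [regex]::new('^(?:\(\?[imnsx-]+\))*\(\?<(?<Name>\w+)>')

$withModifier    = '(?s)(?<ExtractDataBetweenMarkers>(?<=\[1).*?)(?=P]])'
$withoutModifier = '(?<ExtractDataBetweenMarkers>(?<=\[1).*?)(?=P]])'

$leadingName.Match($withModifier).Groups['Name'].Value
$leadingName.Match($withoutModifier).Groups['Name'].Value
```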

Additional context
PS v5.1 on Windows Server 2012.
The Irregular module was not installed by Install-Module (no internet access on the server) :

IPMO C:\Users\MyAccount\Downloads\irregular\Irregular.psd1

?<Network_MACAddress>

Describe the Desired Expression

An expression to match a MAC address.

Provide some Sample Source Text

3C-9C-0F-8C-34-21

Highlight What you'd like to match

What parts of the samples should match? Which pieces of data are important to extract from the match?

It would be nice to extract out the OUI and NIC portions of the MAC.
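
A sketch of such an expression, assuming six hexadecimal octets separated by `-` or `:`, with the first three octets forming the OUI and the last three the NIC-specific part. The group names follow the wording above.

```powershell
$mac = [regex]::new(
    '\b(?<OUI>[0-9A-Fa-f]{2}(?:[-:][0-9A-Fa-f]{2}){2})[-:](?<NIC>[0-9A-Fa-f]{2}(?:[-:][0-9A-Fa-f]{2}){2})\b'
)

$m = $mac.Match('3C-9C-0F-8C-34-21')
$m.Groups['OUI'].Value
$m.Groups['NIC'].Value
```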

?<CamelCaseSpace>

There should be a Regex to find the location a space would exist if text were interpreted as CamelCase

An ASCII version of this Regex could be written as:

New-Regex -Not -Modifier IgnoreCase |
    New-RegEx -After '[a-z]' |
    New-RegEx -Before '[A-Z]'
    
# Which would produce:
'(?-i)(?<=[a-z])(?=[A-Z])'
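
The pattern is zero-width, so it is mostly useful with -replace or -split. A short usage sketch follows; the Unicode-aware variant is an assumption that swaps the ASCII ranges for general categories.

```powershell
$camelCaseSpace = '(?-i)(?<=[a-z])(?=[A-Z])'

# Insert a space at every match position.
$spaced = 'ThisIsCamelCase' -replace $camelCaseSpace, ' '   # This Is Camel Case

# Or split into words at the same positions.
$words = 'ThisIsCamelCase' -split $camelCaseSpace

# Assumed Unicode-aware variant: lowercase letter before, uppercase letter after.
$unicodeCamelCaseSpace = '(?-i)(?<=\p{Ll})(?=\p{Lu})'
```

The `(?-i)` prefix matters here because PowerShell's -replace and -split are case-insensitive by default.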

More FFMpeg Regexes

Describe the Desired Expression

?<FFMpeg_Progress> isn't the only useful piece of information to extract from FFMpeg:

There's also:

  • Configuration
  • Streams
  • Inputs
  • Outputs
  • Metadata

IPv4Address RegEx matches invalid IPs

When matching IPv4 groups of numbers, invalid IPs are matched as well as valid IPs

$ipRegex = Write-RegEx -Pattern ?
$ipRegex.Matches('192.168.86.153,256.199.381.31,10.0.0.1')

StartIndex EndIndex Value
---------- -------- -----
         0       14 192.168.86.153
        15       29 256.199.381.31 <-- should not match
        30       38 10.0.0.1
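
A stricter pattern could constrain each octet to 0-255 and refuse to start or end next to another digit or dot. The sketch below is an assumption about what a fix might look like, not the module's expression.

```powershell
# Each octet: 250-255, 200-249, 100-199, or 0-99.
$octet = '(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])'

# The lookarounds stop the match from being carved out of a longer run of digits and dots.
$ipv4 = [regex]::new('(?<![0-9.])' + $octet + '(?:\.' + $octet + '){3}(?![0-9.])')

$found = @($ipv4.Matches('192.168.86.153,256.199.381.31,10.0.0.1') | ForEach-Object Value)
$found
```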

piping git log to ?<Git_Log>

Describe the bug

A clear and concise description of what the bug is.

To Reproduce
On Mac, Powershell v7.1
Steps to reproduce the behavior:

cd ./Irregular
$log = git log -n 10
$log | ?<Git_Log>

  1. See error
    Use-RegEx: Cannot bind argument to parameter ‘Match’ because it is an empty string.

Expected behavior
PoSH object of the git log

Actual behavior
error as above for each log entry ( 20 )

Additional context
module is installed

get-module irregular

ModuleType Version PreRelease Name      ExportedCommands
---------- ------- ---------- ----      ----------------
Script     0.5.6              Irregular {Expor…

The module irregular seems to work, confirmed by just typing ?<Git_Log> returns the 33 lines of the actual regex as expected.

$PSVersionTable

Name                      Value
----                      -----
PSVersion                 7.1.3
PSEdition                 Core
GitCommitId               7.1.3
OS                        Darwin 20.4.0 Darwin Kernel Version 20.4.0: …
Platform                  Unix
PSCompatibleVersions      {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion 2.3
SerializationVersion      1.1.0.1
WSManStackVersion         3.0
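
A possible explanation, offered as a guess: PowerShell captures native command output as an array of lines, and git log output contains blank lines (around each commit message). Each blank line would then reach the -Match parameter as an empty string, which fits an error count of roughly two per commit. The sketch uses stand-in lines rather than real git output.

```powershell
# Stand-in for: $log = git log -n 1   (captured as an array of lines)
$log = @(
    'commit 0000000000000000000000000000000000000000'
    'Author: Example Author'
    'Date:   Mon Jan 1 00:00:00 2024 +0000'
    ''
    '    Example commit message'
    ''
)

# Blank lines are the elements an "empty string" binding error would complain about.
$blankLines = @($log | Where-Object { $_ -eq '' }).Count

# Joining the lines first passes one multi-line string down the pipeline instead.
$joined = $log -join "`n"
```

If that is the cause, piping `$joined` (or the output of `git log -n 10 | Out-String`) rather than `$log` might be worth trying.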

Regular Expression for ISO-8601 Intervals

Describe the Desired Expression
Have a regular expression that matches time intervals, as described in ISO-8601

Provide some Sample Source Text
PT5M
P5M
P3Y6M4DT12H30M5S

Highlight What you'd like to match
All of the above

What parts of the samples should match? Which pieces of data are important to extract from the match?

  • Year
  • Month
  • Week
  • Day
  • Hour
  • Minute
  • Second
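
A sketch that covers the three samples above. The pattern is an assumption: it accepts whole-number fields (plus fractional seconds) in the usual designator order, and it requires at least one field after `P` and after `T`. `ConvertFrom-IsoDuration` is a hypothetical helper.

```powershell
$duration = [regex]::new(
    '^P(?=[0-9T])' +
    '(?:(?<Year>[0-9]+)Y)?(?:(?<Month>[0-9]+)M)?(?:(?<Week>[0-9]+)W)?(?:(?<Day>[0-9]+)D)?' +
    '(?:T(?=[0-9])(?:(?<Hour>[0-9]+)H)?(?:(?<Minute>[0-9]+)M)?(?:(?<Second>[0-9]+(?:\.[0-9]+)?)S)?)?$'
)

# Hypothetical helper: returns one property per field; unmatched fields are empty strings.
function ConvertFrom-IsoDuration {
    param([string]$Text)
    $m = $duration.Match($Text)
    if (-not $m.Success) { return }
    $result = [ordered]@{}
    foreach ($name in 'Year', 'Month', 'Week', 'Day', 'Hour', 'Minute', 'Second') {
        $result[$name] = $m.Groups[$name].Value
    }
    [pscustomobject]$result
}

'PT5M', 'P5M', 'P3Y6M4DT12H30M5S' | ForEach-Object { ConvertFrom-IsoDuration $_ }
```

The `T` separator is what tells the two meanings of `M` apart: `P5M` is five months, `PT5M` is five minutes.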

Rename Suggested: Write-RegEx should be New-RegEx

PREFACE: I love this. Thx.

The documentation (readme and help) for Write-Regex states that:

Write-Regex -CharacterClass Digit -Repeat # This writes the Regex (\d+)

However, this only "writes" 3 blank lines.

To Reproduce
Write-Regex -CharacterClass Digit -Repeat # This writes the Regex (\d+)

Expected behavior
Expect the pattern '\d+' to appear as the comment indicates.

Actual behavior
3 blank lines.
What is actually being "written" is the Regex object itself, since apparently there is a default formatter for regexes that produces this output.

Additional context
The following will show all of the object properties (as will | Format-Table *)

Write-Regex -CharacterClass Digit -Repeat | Select-Object *

The pattern can also be seen by accessing the "Pattern" property or using the ToString() method of the regex object.
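
The kind of explicit example the help could carry might look like this sketch. It uses a plain [regex] object as a stand-in for the value Write-Regex returns, on the assumption that both behave the same way when displayed.

```powershell
# Stand-in for the object returned by: Write-Regex -CharacterClass Digit -Repeat
$digits = [regex]::new('\d+')

$asText       = $digits.ToString()   # the pattern text
$interpolated = "$digits"            # string interpolation calls ToString() as well

# Every public property, instead of the default (blank-looking) view.
$digits | Select-Object *
```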

This is not a bug in Irregular's cmdlets (IMO) but rather a serious issue in the documentation and help.

Write-* cmdlets (practically) all produce (some) screen output in the default case -- due to the formatters.

Those who don't understand GetType(), Get-Member, or using Format explicitly or Select-Object will have a hard time understanding what Write-Regex is actually doing.

I can't decide if this is truly non-standard behavior for a Write-* cmdlet or simply surprising because the result is so different (and the text is hidden).

However, it did cost me a few minutes, first thinking that maybe my 7.2 RC PowerShell was buggy, then trying it in 5.1, and finally investigating deeper.

I suspect many people will give up and walk away from it, and maybe even from this excellent module.

STRONGLY recommend

  1. Adding (much) more to the Description area of the Write-Regex Help
  2. Adding explanations in the existing examples
  3. Adding explicit examples to show how to display the fields.

-Extract should coerce to [Timespan] before [DateTime]

Describe the bug

-Extract should coerce to [Timespan] before [DateTime]

To Reproduce

Use-RegEx -Match "00:00:01.01" -Pattern "(?[\d:.]+)" -Extract

Expected behavior

Timespan is returned as a [Timespan]

Actual behavior

Timespan is returned as a [DateTime]
