
regular-expressions-workshop's Introduction

This material was prepared for the workshop held on 2022-04-07 at the Gymnasium Thun (Thun, Switzerland), and updated for the workshop held on 2022-09-02 at EPFL (Lausanne, Switzerland).

Regex 101

| Pattern | Meaning | Usage | Matches | Doesn't match |
|---|---|---|---|---|
| `\b` | word boundary (beginning or end of a word) | `apfel\b` | `apfel` in "apfel" or "erdapfel" | `apfel` in "apfelsaft" |
| `\w` | word character (any ASCII letter, digit, or underscore) | `\wier` | vier, Bier, hier, Tier | Klavier as a whole word (only the "vier" inside it matches) |
| `[abc]` | set (any one of the characters between the brackets) | `[vTB]ier` | vier, Bier, Tier | hier; Klavier as a whole word (only the "vier" inside it matches) |
| `(abc\|def)` | group (any one of the alternatives separated by `\|`) | `(Klav\|T)ier` | Klavier, Tier | Bier |
| `[a]*` | 0 or more occurrences | `[\w]*ier` | Klavier, Tier, vier, Bier, Stier, ier | |
| `[a]+` | 1 or more occurrences | `[\w]+ier` | Klavier, Tier, vier, Bier, Stier | ier |
| `[a]?` | 0 or 1 occurrence | `[s]?tier` | tier, stier | vier |
| `^` | beginning of a line | `^Tiere` | `Tiere` at the beginning of a line ("Tiere sind Lebensformen") | `Tiere` in the middle or at the end of a line ("So viele Tiere!") |
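
The patterns in the table can also be tried outside RegExr. The following is a minimal JavaScript sketch, not part of the workshop material: the helper `findAll` and the `words` string are assumptions introduced here for illustration. It also shows that a pattern without `\b` is searched anywhere in the text, so `\wier` finds the "vier" inside "Klavier".

```javascript
// Hypothetical helper (not part of the workshop): list every match of `re` in `text`.
// `re` must carry the "g" flag so that all matches are returned, not only the first.
function findAll(re, text) {
  return text.match(re) || [];
}

// Sample words taken from the table.
const words = "vier Bier hier Tier Klavier Stier ier";

// Without \b the pattern is searched anywhere, so it finds "vier" inside "Klavier".
console.log(findAll(/\wier/g, "Klavier"));        // [ 'vier' ]

// With \b on both sides, only whole words count.
console.log(findAll(/\b\wier\b/g, words));        // [ 'vier', 'Bier', 'hier', 'Tier' ]
console.log(findAll(/\b[vTB]ier\b/g, words));     // [ 'vier', 'Bier', 'Tier' ]
console.log(findAll(/\b(Klav|T)ier\b/g, words));  // [ 'Tier', 'Klavier' ]
console.log(findAll(/\b[\w]+ier\b/g, words));     // every word except "ier"
console.log(findAll(/\b[\w]*ier\b/g, words));     // all seven words
```

Wrapping a pattern in `\b...\b` is one way to get the whole-word behaviour that the "matches" and "doesn't match" columns describe.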

Set Up

  1. Open the RegExr webpage https://regexr.com/ in your browser.
  2. Set the RegEx engine to "JavaScript (Browser)".
  3. Set the Tools to "List".
  4. Set the Flags to "global" and "multiline".
  5. Open scraped_wiki_neuron.md and copy-paste its contents into the text box of the RegExr website.
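
In JavaScript, the "global" and "multiline" flags chosen in RegExr correspond to the `g` and `m` flags of a regular expression. The sketch below is an illustration under that assumption, using a two-line `sample` string built from the example sentences in the table; the variable names are hypothetical.

```javascript
// Two sample lines borrowed from the "^" row of the table.
const sample = "So viele Tiere!\nTiere sind Lebensformen";

// "g" (global): return all matches, not just the first one.
// "m" (multiline): let ^ match at the start of every line,
// not only at the very start of the text.
const withoutMultiline = sample.match(/^Tiere/g) || [];
const withMultiline = sample.match(/^Tiere/gm) || [];
const anywhere = sample.match(/Tiere/g) || [];

console.log(withoutMultiline.length); // 0: the text itself starts with "So"
console.log(withMultiline.length);    // 1: the second line starts with "Tiere"
console.log(anywhere.length);         // 2: both occurrences, regardless of position
```

Without the "multiline" flag, patterns that start with `^` would only be tested against the very beginning of the pasted text.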

Exercises

Try to write regular expressions that extract the following contents from the text.

  1. Extract every exact mention of the word "neuron". [34 matches]
  2. What about uppercase, e.g. at the beginning of a sentence? Extract every mention of the word "neuron" or "Neuron". [35 matches]
  3. What about plurals? Try to extract "neurons", "Neurons", "neuron", or "Neuron". [105 matches]
  4. Now extract any word containing "neur" or "Neur" (e.g. "interneuronal"). [147 matches]
  5. Certain words use the root "nerv" to express similar concepts (e.g. "nervous system"). Extract all words containing "neur", "Neur", "nerv", or "Nerv". [170 matches]
  6. Under the section "Classification" we find a bulleted list of neuron types. They all follow the format "Xxxx cells", where "Xxxx" is an adjective starting with a capital letter, as in "Basket cells" or "Granule cells". Try to extract all of them. [8 matches]
  7. Under the section "Neurotransmitters" we find another bulleted list of neuron types. They all follow the format "Xxxx neurons", where "Xxxx" is an adjective starting with a capital letter, as in "Cholinergic neurons" or "Purinergic neurons". Try to extract all of them. [9 matches]

Solutions

  1. \bneuron\b
  2. \b[nN]euron\b
  3. \b[nN]euron[s]?\b
  4. \b[\w]*[nN]eur[\w]*\b
  5. \b[\w]*[nN]e(ur|rv)[\w]*\b
  6. ^\w+ cells\b
  7. ^\w+ neurons\b
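
The solutions can be run in JavaScript with the same "global" and "multiline" flags. The sketch below is only an illustration: `miniText` is a short made-up sample, not the contents of scraped_wiki_neuron.md, so its match counts differ from those listed in the exercises. The names `solutions` and `matchesOf` are hypothetical.

```javascript
// Made-up sample text (NOT the workshop file); counts will differ from the exercises.
const miniText = [
  "Neuron types differ across the nervous system.",
  "A neuron sends signals to other neurons.",
  "Basket cells",
  "Cholinergic neurons",
  "Interneuronal links connect each nerve.",
].join("\n");

// The seven solution patterns, with the "g" and "m" flags set as in RegExr.
const solutions = {
  1: /\bneuron\b/gm,
  2: /\b[nN]euron\b/gm,
  3: /\b[nN]euron[s]?\b/gm,
  4: /\b[\w]*[nN]eur[\w]*\b/gm,
  5: /\b[\w]*[nN]e(ur|rv)[\w]*\b/gm,
  6: /^\w+ cells\b/gm,
  7: /^\w+ neurons\b/gm,
};

// Hypothetical helper: return all matches of solution `n` in `text`.
function matchesOf(n, text) {
  return text.match(solutions[n]) || [];
}

for (const n of Object.keys(solutions)) {
  const found = matchesOf(n, miniText);
  console.log(`Solution ${n}: ${found.length} match(es)`, found);
}
```

On this sample, solutions 1 to 5 find 1, 2, 4, 5, and 7 matches respectively, which mirrors how each exercise widens the pattern of the previous one.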

Contributors

francescocasalegno
