The grep method lets you select elements from a list:
generaciondecodigos@nereida:~/src/groovy/strings$ cat grep.groovy
regexp = ~ ".*${args[0]}.*"

args[1 .. -1].each { filename ->

  file = new File(filename)
  list = file.readLines()

  select = list.grep(regexp)
  select.each {
    println it
  }
}
An example run follows:
generaciondecodigos@nereida:~/src/groovy/strings$ groovy grep.groovy '\d+' c2f.groovy grep.groovy
print "Enter a temperature (i.e. 32F, 100C): ";
def (num, type) = m[0][1..2]
farenheit = (celsius * 9/5)+32
celsius = (farenheit -32)*5/9;
printf "%.2f C = %.2f F\n", celsius, farenheit;
print "Enter a temperature (i.e. 32F, 100C): ";
regexp = ~ ".*${args[0]}.*"
args[1 .. -1].each { filename ->
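The script wraps its argument in ".*" on both sides because grep with a Pattern only keeps items that the pattern matches in full. Here is a minimal sketch of that behaviour on an in-memory list, so it runs without any files; the sample lines and the names lines, fragment and selected are made up for illustration.

```groovy
// Hypothetical sample lines standing in for file.readLines()
def lines = [
    'farenheit = (celsius * 9/5)+32',
    'println "done"',
    'def (num, type) = m[0][1..2]'
]

// Same wrapping trick as grep.groovy: allow anything before and after the fragment
def fragment = /\d+/
def regexp = ~".*${fragment}.*"

// Only the lines that contain at least one digit survive
def selected = lines.grep(regexp)
assert selected == ['farenheit = (celsius * 9/5)+32', 'def (num, type) = m[0][1..2]']

// Without the wrapping, no line consists solely of digits, so nothing is selected
assert lines.grep(~/\d+/) == []
```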
Groovy also makes significant additions to what you can do with Collections. In addition to each, collect, inject, etc., there is a regular expression aware iterator called grep that will pass each item in the Collection through a filter and return a subset of items that match the filter. We can use a regular expression as a filter:
// regular expression says 0 or more characters (".*") followed by the string "bar" that is at the end of the string ("$")
assert ["foobar", "bazbar"] == ["foobar", "bazbar", "barquux"].grep(~/.*bar$/)
You can achieve the same thing with findAll but it takes a little more typing:
assert ["foobar", "bazbar"] == ["foobar", "bazbar", "barquux"].findAll { it ==~ /.*bar$/ }
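A regular expression is only one kind of filter that grep understands. As a rough sketch (the items list is invented for this example), the same call also accepts a class, a closure or a collection, because grep asks the filter whether each item is a "case" of it:

```groovy
// Hypothetical mixed list, just for illustration
def items = ["foobar", 42, "bazbar", 7, "barquux"]

// Class filter: keep instances of that class
assert items.grep(String) == ["foobar", "bazbar", "barquux"]
assert items.grep(Integer) == [42, 7]

// Closure filter: keep items for which the closure returns true
assert items.grep { it.toString().size() > 6 } == ["barquux"]

// Collection filter: keep items contained in the collection
assert items.grep(["foobar", 7]) == ["foobar", 7]
```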
Working with Matchers
As we’ve seen, using the =~ operator will return a Matcher object. Many of the existing regular expression examples on the web work by treating the Matcher as a list and getting the first (zero-based) element out of the list:
def matcher = "foobazaarquux" =~ "o(b.*r)q"
assert ["obazaarq", "bazaar"] == matcher[0]
assert "bazaar" == matcher[0][1] // get the first grouping of the first match
This is a little fragile as matcher[0] will throw an error if there was not actually a match. Calling matches() doesn’t help as matches only checks if the regular expression matches the WHOLE string:
("foobazaarquux" =~ "o(b.*r)q").matches() // returns false!
("foobazaarquux" =~ ".*(b.*r).*").matches() // returns true, ".*" matches 0 or more chars of any type
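To make the difference concrete, here is a small sketch comparing matches() with the standard java.util.regex find() method, which looks for the pattern anywhere in the string. The names text and partial are just for this example.

```groovy
def text = "foobazaarquux"
def partial = /o(b.*r)q/

// matches() wants the pattern to cover the whole string, so this fails
assert !(text =~ partial).matches()

// find() only needs the pattern to occur somewhere in the string
assert (text =~ partial).find()

// The ==~ operator does a whole-string match, like matches()
assert !(text ==~ partial)
assert text ==~ /.*(b.*r).*/
```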
You can check getCount() to see how many matches there were for some safety:
def m = "foobar" =~ /quux/
if (m.getCount()) {
    // example won't get here as "quux" doesn't exist in "foobar", the count is 0
    println m[0]
}
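If you do this check often, it can be tucked into a small helper. The sketch below is one possible way to do it, not something from the Groovy library: firstGroup is a hypothetical name, and it relies on the plain java.util.regex find() and group() calls rather than getCount().

```groovy
// Hypothetical helper: returns the first capture group of the first match,
// or null when the regular expression does not occur in the text at all
def firstGroup(String text, String regex) {
    def m = text =~ regex
    m.find() ? m.group(1) : null
}

assert firstGroup("foobazaarquux", "o(b.*r)q") == "bazaar"
assert firstGroup("foobar", "(quux)") == null   // no match, no exception
```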
A groovier way to work with Matchers leverages collection iterators and the built in closures that Groovy provides to them. Matcher supports the iterator() method and with that, gets everything else that any groovy List or Collection would have, including collect, inject, findAll, etc.
def paragraph = """
Lorem ipsum dolor 12:30 AM sit amet, consectetuer adipiscing 1:15 AM elit.
Nunc rutrum diam sagittis nisi 9:22 PM.
"""
def HOUR = /10|11|12|[0-9]/
def MINUTE = /[0-5][0-9]/
def AM_PM = /AM|PM/
def time = /($HOUR):($MINUTE) ($AM_PM)/
assert ["12:30 AM", "1:15 AM", "9:22 PM"] == (paragraph =~ time).collect { it }
assert ["12:30 AM", "1:15 AM"] == (paragraph =~ time).grep(~/.*AM$/)
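Here is another small sketch of the same idea with collect, findAll and inject. The log string is invented for the example, and the pattern deliberately has no capture groups, so each item handed to the closure is simply the matched string.

```groovy
// Hypothetical input text, just for illustration
def log = "retry 3 times, wait 15 s, give up after 120 s"

// collect: turn every match into an Integer
def numbers = (log =~ /\d+/).collect { it.toInteger() }
assert numbers == [3, 15, 120]

// findAll: keep only the matches with more than one digit
assert (log =~ /\d+/).findAll { it.size() > 1 } == ["15", "120"]

// inject: fold the matches into a running total
assert (log =~ /\d+/).inject(0) { sum, n -> sum + n.toInteger() } == 138
```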
A limitation of the iterator-based methods is that they don’t give you access to the individual groups (hour, minute, am/pm), just the full matched string ("12:30 AM"). The each method is more powerful because as it iterates through, it passes the full match as well as each of the individual groups into the closure.
("foo1 bar30 foo27 baz9 foo600" =~ /foo(\d+)/).each { match, digit -> println "+$digit" }
// result:
// +1
// +27
// +600
Another example (using the paragraph and time pattern from above) shows how to pretty-print all of the timestamps:
(paragraph =~ time).each { match, hour, minute, amPm ->
    println "$hour:$minute ${amPm == 'AM' ? 'this morning' : 'this evening' }"
}
// result:
// 12:30 this morning
// 1:15 this morning
// 9:22 this evening
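Because each hands the groups to the closure as separate parameters, you can also compute with them instead of just printing. The following is a self-contained sketch under made-up assumptions: the text string, the simplified time pattern and the minutesAfterMidnight list are all invented for this example.

```groovy
// Hypothetical schedule text and a simplified time pattern
def text = "standup 9:05 AM, lunch 12:30 PM, review 4:45 PM"
def time = /(\d{1,2}):([0-5][0-9]) (AM|PM)/

def minutesAfterMidnight = []
(text =~ time).each { match, hour, minute, amPm ->
    // 12 AM is hour 0 and 12 PM is hour 12 on a 24-hour clock
    def h = (hour.toInteger() % 12) + (amPm == 'PM' ? 12 : 0)
    minutesAfterMidnight << h * 60 + minute.toInteger()
}

assert minutesAfterMidnight == [545, 750, 1005]
```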
Regular expressions are a powerful tool that Groovy makes as accessible as any other top-tier scripting language. Using techniques to break more complicated regular expressions into their component pieces can make them much more readable (as in the time example above).
If you’re doing any sort of string processing beyond a simple contains or split, regular expressions in Groovy can turn mountains of Java into a couple of lines of code.