Regular Expressions
Regular Expressions (or RegExs) provide a way for the user to search through text. Of course, you can do this in APL anyway by using the Find (⍷) primitive:
      text←'Hello mellow felon'
      'ello' ⍷ text
0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
      'lo' ⍷ text
0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0
But what if you wanted to do something more sophisticated, like find all occurrences of either 'Hello' or 'Hallo' in the text above? Or how about finding all occurrences which match 'lo' but not 'llo'?
You could write some APL code to do it. For the first case:
⊃∨/('Hello' 'Hallo') ⍷ ¨⊂text 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
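The second question can be handled in plain APL along the same lines. The following is only a sketch, not the only way to do it: it marks every 'lo' and then discards any that is immediately preceded by an 'l', that is, any that starts one position after a match of 'llo'. The name shifted is introduced here purely for illustration.

```apl
text←'Hello mellow felon'

⍝ Start positions of 'llo', moved one place to the right so that
⍝ they line up with the 'lo' contained in each 'llo'
shifted←0,¯1↓'llo'⍷text

⍝ Keep only the 'lo' matches that do not coincide with a shifted 'llo' match
('lo'⍷text)∧~shifted
```

For the text above this leaves a single 1, at position 16, which is the 'lo' in 'felon'. Each extra condition of this kind needs another hand-written mask.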
However, as the search gets more complicated it becomes easier to search using Regular Expressions.
A RegEx search takes the text to be searched and one or more pattern strings. For example, 'H[ea]llo' would match all occurrences of 'Hello' or 'Hallo', and '[^l]lo' would match every 'lo' that is preceded by a character other than 'l' (e.g. in 'felon' but not in 'mellow'). Note that in the second case the preceding character is part of the match.
Regular Expressions are enormously powerful, allowing you to do complicated pattern matching and text substitution, but they are also a complex subject. Wikipedia has a good introduction to the subject.
The way to perform Regular Expression searching depends on which version of APL you are using:
- For Dyalog APL, you can use the built-in system operator ⎕S on all operating systems or, under Windows, the System.Text.RegularExpressions namespace that is part of Microsoft's .NET Framework.
- For other APLs, contact the vendor for further information.
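As a rough sketch of the ⎕S route in Dyalog APL, the two patterns discussed earlier can be applied to the same text. It assumes that a right operand of 0 requests the offset of each match, counted from 0, and that '&' requests the matched text itself. Check the ⎕S documentation for your interpreter version before relying on either.

```apl
text←'Hello mellow felon'

⍝ Offsets (counted from 0) of each match of 'Hello' or 'Hallo'
('H[ea]llo' ⎕S 0) text

⍝ Offsets of each 'lo' preceded by a character other than 'l'.
⍝ The offset is that of the preceding character, which is part of the match.
('[^l]lo' ⎕S 0) text

⍝ The matched text itself, one item per match
('[^l]lo' ⎕S '&') text
```

On this text the first expression should report a single match at offset 0, and the second a single match at offset 14, the 'elo' in 'felon'.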
Author: SimonMarsden