How to Compile Regular Expressions
Regex Performance
- Regular expression methods compile the regex pattern before using it
- Compilation can be a time-consuming process
Precompiling Regex
- Option for patterns that do not change
- Ideal when working with a single pattern and multiple strings
- Improves efficiency and speed
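As a minimal sketch of the single-pattern, many-strings case described above, the pattern can be compiled once and reused inside a loop. The list of strings and the variable names here are hypothetical sample data, not part of the original notes.

```python
import re

# Compile once, outside the loop, then reuse the same pattern object.
word_pattern = re.compile('ware')

# Hypothetical sample data for illustration.
strings = ['software', 'hardware', 'network', 'firmware']

# Keep only the strings in which the pattern is found somewhere.
matching = [s for s in strings if word_pattern.search(s)]
print(matching)  # ['software', 'hardware', 'firmware']
```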
Using the Regex Module
=> import the re module => compile the regular expression pattern => get a pattern object => call search methods on the pattern object with a string
- if no match is found, it returns None
- if a match is found, it returns a match object
Pattern Object Methods
re.compile(pattern, flags=0)
# compiles a regex pattern and returns a pattern object
# takes optional flags
- Similar in form and usage to the re search functions
- already contain a pattern, so there is no need to pass it as an argument
- search and match also return a match object
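The points above can be illustrated with a short comparison, offered as a sketch rather than a definitive recipe: the module-level re.search takes the pattern as its first argument, while the pattern object's search takes only the string. The second half shows one possible use of the optional flags argument, with re.IGNORECASE chosen purely as an example.

```python
import re

s = 'software hardware'

# Module-level function: the pattern is passed on every call.
module_match = re.search('ware', s)

# Pattern object method: the pattern is already stored in the object.
pattern = re.compile('ware')
object_match = pattern.search(s)

print(module_match.span() == object_match.span())  # True

# Optional flags change how the pattern is interpreted.
ignore_case = re.compile('ware', flags=re.IGNORECASE)
print(pattern.search('SOFTWARE'))              # None
print(ignore_case.search('SOFTWARE').group())  # WARE
```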
Search Method
pattern_object.search(string, start, stop)
# returns a match object if the pattern is found anywhere in string
# returns None if not found
# start and stop are optional indexes for the search range in string
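A small sketch of how the optional search range might be used. The sample string is taken from the example later in this article; note that the standard library documentation names these two arguments pos and endpos, so they are passed positionally here.

```python
import re

pattern = re.compile('ware')
s = 'software hardware'

# No range: the first occurrence in the whole string is returned.
first = pattern.search(s)
print(first.span())   # (4, 8)

# Start at index 8: the search skips the first word.
later = pattern.search(s, 8)
print(later.span())   # (13, 17)

# Stop at index 6: only 'softwa' is searched, so nothing is found.
limited = pattern.search(s, 0, 6)
print(limited)        # None
```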
Match Method
pattern_object.match(string, start, stop)
# returns a match object if the pattern is found at the beginning of string
# returns None if not found
# start and stop are optional indexes for the search range in string
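The difference between match and search is easiest to see side by side. The following is an illustrative sketch; the word 'warehouse' is an added sample string, not from the original notes.

```python
import re

pattern = re.compile('ware')

# 'ware' is inside 'software' but not at its beginning.
at_start = pattern.match('software')
anywhere = pattern.search('software')
print(at_start)          # None
print(anywhere.group())  # ware

# 'warehouse' begins with 'ware', so match succeeds.
prefix = pattern.match('warehouse')
print(prefix.group())    # ware

# With a start index, match anchors at that index instead of index 0.
from_index = pattern.match('software', 4)
print(from_index.span()) # (4, 8)
```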
Find All Method
pattern_object.findall(string, start, stop)
# returns a list of the substrings in string that match the pattern
# returns an empty list if nothing matches
# start and stop are optional indexes for the search range in string
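A brief sketch of findall, showing that it returns a list in every case, including an empty list when nothing matches. The sample strings are the ones used elsewhere in this article.

```python
import re

pattern = re.compile('ware')

# Every non-overlapping match is collected into a list.
all_matches = pattern.findall('software hardware')
print(all_matches)   # ['ware', 'ware']

# No match gives an empty list, which is falsy like None.
no_matches = pattern.findall('hello world')
print(no_matches)    # []

# A start index of 8 skips the first word.
second_only = pattern.findall('software hardware', 8)
print(second_only)   # ['ware']
```

Because an empty list is falsy, a check such as `if pattern.findall(s):` still works as a found/not-found test.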
Precompilation Example
import re

s = 'software hardware'
pattern = re.compile('ware')  # create a pattern object
match = pattern.search(s)     # match object, or None if not found
if match:
    print("matched")
# Output: matched
Let’s test
# test_regex.py
import unittest
import re


class TestRegexMatch(unittest.TestCase):
    def test_match_found(self):
        s = 'software hardware'
        pattern = re.compile('ware')
        match = pattern.search(s)
        self.assertIsNotNone(match, "Match should be found")

    def test_match_not_found(self):
        s = 'hello world'
        pattern = re.compile('ware')
        match = pattern.search(s)
        self.assertIsNone(match, "Match should not be found")


if __name__ == '__main__':
    unittest.main()
Run the tests with:
python -m unittest test_regex.py
Test Class
class TestRegexMatch(unittest.TestCase): Defines a class that inherits from unittest.TestCase.
Test Methods:
test_match_found():
- Sets up a string (s) that should contain the match.
- Creates the regular expression pattern (pattern).
- Uses pattern.search(s) to search for the pattern.
- self.assertIsNotNone(match, "Match should be found"): This assertion checks that the match object is not None, indicating a successful match.
test_match_not_found():
- Sets up a string (s) that should not contain the match.
- The rest of the logic is similar to the previous test.
- self.assertIsNone(match, "Match should not be found"): This assertion checks that the match object is None, indicating that the match was not found.
Running the Tests:
- if __name__ == '__main__': unittest.main() runs the tests when you execute the script directly.
- Verify the expected outcomes with assertions like assertIsNotNone and assertIsNone.
- The output will show you the results of each test, indicating whether they passed or failed, along with any error messages.