String Anchors and Python Regex

In this blog post you will learn how to use string anchors from the regular expression syntax of Python's re module. The re module provides regular expression operations similar to those found in Perl. If you'd like to know more about this library, please see its source: Lib/re.py. Regular expression syntax is a valuable tool for every programmer to know. It lets you check whether a string matches a particular pattern. This is helpful for tasks like email verification, password verification, or scraping websites, where the input you're receiving is likely to contain countless exceptions. In this blog post we will take a look at string anchors, specifically ones that tie a match to the beginning or end of lines, words, and elements, plus a few other types of matching for nested text.

The first string anchor we'll take a look at is the sequence \A. This sequence restricts matching to the start of the string. The text you are looking for is placed after the sequence. From the re library we use a function called search, which takes up to three arguments: the pattern, the string to check against the pattern, and any optional flags. Below is an example that finds 'uni' at the start of each string stored in a list. Note that \A does not match after a newline: 'the\nuniversity' is a single string that starts with 'the', so it is excluded.

Python

                                
>>> import re
>>> words = ['university', 'the\nuniversity', 'unicycle', 'clown\nunit']
>>> [e for e in words if re.search(r'\Auni', e)]
['university', 'unicycle']
                                
                            
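To make the newline behavior concrete, here is a minimal sketch that contrasts \A with the ^ anchor used together with the re.MULTILINE flag, passed as the third argument to search. The variable names start_of_string and start_of_line are only illustrative.

```python
import re

words = ['university', 'the\nuniversity', 'unicycle', 'clown\nunit']

# \A only ever matches at the very start of the whole string.
start_of_string = [w for w in words if re.search(r'\Auni', w)]

# With re.MULTILINE, ^ also matches right after each newline,
# so 'uni' at the start of any line counts.
start_of_line = [w for w in words if re.search(r'^uni', w, flags=re.MULTILINE)]

print(start_of_string)  # ['university', 'unicycle']
print(start_of_line)    # ['university', 'the\nuniversity', 'unicycle', 'clown\nunit']
```

If you need to match at the start of every line rather than only the start of the string, ^ with re.MULTILINE is probably the better fit; \A ignores the flag entirely.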

The next string anchor does the opposite of the previous one. The sequence \Z restricts matching to the end of the string, so the text pattern we are searching for is placed to the left of \Z.

Python

                                
>>> import re
>>> words = ['university', 'the\nuniversity', 'unicycle', 'clown\nunit']
>>> [e for e in words if re.search(r'sity\Z', e)]
['university', 'the\nuniversity']
                               
                           
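One detail worth illustrating is how \Z differs from the more common $ anchor when a string ends with a newline. The sketch below uses a made-up string with a trailing newline; the names strict_end and lenient_end are only illustrative.

```python
import re

text = 'clown\nunit\n'  # note the trailing newline

# \Z matches only at the very end of the string,
# so the trailing newline prevents a match.
strict_end = re.search(r'unit\Z', text)

# $ also matches just before a newline that ends the string.
lenient_end = re.search(r'unit$', text)

print(strict_end is None)   # True
print(lenient_end is None)  # False
```

This matters when reading lines from a file, where each line usually still carries its newline character.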

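Lastly, we can take a look at using search in a conditional statement. re.search returns a match object when the pattern is found and None otherwise. Since a match object evaluates as True and None evaluates as False in a boolean context, the result of re.search can be used directly as a condition.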

Python

                                
>>> import re
>>> sentence = 'This is a string'  # avoid shadowing the built-in str
>>> if re.search(r'his', sentence):
...     print('found')
...
found
>>> if not re.search(r'zed', sentence):
...     print('did not find')
...
did not find
                                
                            
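The same truthiness check also gives access to the match object itself. As a rough sketch, the hypothetical helper describe below (not part of the re module) reports what was matched and where, and it works with anchored patterns as well.

```python
import re

sentence = 'This is a string'

def describe(pattern, text):
    """Return a short message saying whether pattern occurs in text."""
    match = re.search(pattern, text)
    if match:  # a match object is truthy, None is falsy
        return f'found {match.group()!r} at index {match.start()}'
    return 'did not find'

print(describe(r'his', sentence))     # found 'his' at index 1
print(describe(r'\AThis', sentence))  # found 'This' at index 0
print(describe(r'zed', sentence))     # did not find
```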

In conclusion, re.search combined with string anchors will be useful when we take a look at scraping websites or verifying emails.