
How to use lookahead, lookbehind, and negative lookahead in Python regular expressions, explained in detail
Python / Posted by the administrator 8 years ago · 284 views

Preface

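In many cases, the things we want to match either appear together or do not appear at all. Paired brackets such as 《》 and < > are an example: half a pair is never used on its own. Regular expressions can enforce the same consistency, requiring either both angle brackets or neither. To do this we need a new piece of syntax, the lookahead assertion (?=pattern). It checks whether the text following the current position matches pattern, and the overall match fails if it does not. The assertion consumes no input characters: it only peeks ahead, and matching then continues from the same position. (The corresponding lookbehind form, which checks the text before the current position, is written (?<=pattern).)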
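As a minimal sketch of what "consumes no input" means, the snippet below compares a lookahead with an ordinary group. The strings 'foobar' and 'foobaz' are made up for illustration and do not come from the article's example.

```python
import re

text = 'foobar'  # hypothetical input, for illustration only

# Lookahead: match 'foo' only when 'bar' follows; 'bar' itself is not consumed.
lookahead = re.match(r'foo(?=bar)', text)
print(lookahead.group())   # foo
print(lookahead.end())     # 3

# Ordinary group for comparison: 'bar' becomes part of the match.
consuming = re.match(r'foo(bar)', text)
print(consuming.group())   # foobar
print(consuming.end())     # 6

# The lookahead fails when the following text is different.
rejected = re.match(r'foo(?=bar)', 'foobaz')
print(rejected)            # None
```

Because the lookahead leaves the match position untouched, the rest of a pattern can go on to match the same characters that the assertion has just inspected.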

Here is an example:

#python 3.6
#蔡军生
#http://blog.csdn.net/caimouse/article/details/51749579
#
import re

# Raw string so that \w, \s and \. reach the regex engine unchanged.
address = re.compile(
    r'''
    # A name is made up of letters, and may include "."
    # for title abbreviations and middle initials.
    ((?P<name>
       ([\w.,]+\s+)*[\w.,]+
     )
     \s+
    ) # name is no longer optional

    # LOOKAHEAD
    # Email addresses are wrapped in angle brackets, but only
    # if both are present or neither is.
    (?= (<.*>$)       # remainder wrapped in angle brackets
        |
        ([^<].*[^>]$) # remainder *not* wrapped in angle brackets
      )

    <? # optional opening angle bracket

    # The address itself: [email protected]
    (?P<email>
      [\w\d.+-]+       # username
      @
      ([\w\d.]+\.)+    # domain name prefix
      (com|org|edu)    # limit the allowed top-level domains
    )

    >? # optional closing angle bracket
    ''',
    re.VERBOSE)

candidates = [
    'First Last <[email protected]>',
    'No Brackets [email protected]',
    'Open Bracket <[email protected]',
    'Close Bracket [email protected]>',
]

for candidate in candidates:
    print('Candidate:', candidate)
    match = address.search(candidate)
    if match:
        print(' Name :', match.groupdict()['name'])
        print(' Email:', match.groupdict()['email'])
    else:
        print(' No match')

The output is:

Candidate: First Last <[email protected]>
 Name : First Last
 Email: [email protected]
Candidate: No Brackets [email protected]
 Name : No Brackets
 Email: [email protected]
Candidate: Open Bracket <[email protected]
 No match
Candidate: Close Bracket [email protected]>
 No match

Negative lookahead in Python regular expressions

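The section above covered the lookahead pattern (?=pattern). The equals sign in it means that the text that follows must match. Lookahead also has a negated form, which requires that the text that follows does not match. Suppose you need to recognize an email address such as [email protected]. Most addresses of this kind do not accept replies, so we want to identify them and discard them. This is where negative lookahead comes in. Its syntax is (?!pattern), where the exclamation mark means "not". Given a string such as [email protected], the assertion checks whether the string begins with noreply@; if it does, the match is abandoned.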

Here is an example:

#python 3.6
#蔡军生
#http://blog.csdn.net/caimouse/article/details/51749579
#
import re

# Raw string so that \w, \d and \. reach the regex engine unchanged.
address = re.compile(
    r'''
    ^

    # An address: [email protected]

    # Ignore noreply addresses
    (?!noreply@.*$)

    [\w\d.+-]+       # username
    @
    ([\w\d.]+\.)+    # domain name prefix
    (com|org|edu)    # limit the allowed top-level domains

    $
    ''',
    re.VERBOSE)

candidates = [
    '[email protected]',
    '[email protected]',
]

for candidate in candidates:
    print('Candidate:', candidate)
    match = address.search(candidate)
    if match:
        print(' Match:', candidate[match.start():match.end()])
    else:
        print(' No match')

The output is:

Candidate: [email protected]
 Match: [email protected]
Candidate: [email protected]
 No match
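The same negative lookahead can also be used to filter a list of addresses. The sketch below is one possible way to do that, not the original author's code. The example.com addresses are hypothetical and chosen only for illustration. It also shows that an address whose username merely starts with "noreply" still passes, because the assertion looks for the exact prefix noreply@.

```python
import re

# Same structure as the pattern above, with a shorter lookahead.
address = re.compile(
    r'''
    ^
    (?!noreply@)        # reject addresses that begin with "noreply@"
    [\w\d.+-]+          # username
    @
    ([\w\d.]+\.)+       # domain name prefix
    (com|org|edu)       # limit the allowed top-level domains
    $
    ''',
    re.VERBOSE)

# Hypothetical addresses, for illustration only.
candidates = [
    'first.last@example.com',
    'noreply@example.com',
    'noreply.team@example.com',
]

# Keep only the addresses that the pattern accepts.
accepted = [c for c in candidates if address.search(c)]
print(accepted)
# ['first.last@example.com', 'noreply.team@example.com']
```

Dropping the trailing .*$ from the lookahead does not change which addresses are rejected here, since the prefix noreply@ alone is enough to make the assertion fail.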

Summary

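That is all for this article. I hope it is a useful reference for your study or work. If you have questions, feel free to leave a comment. Thank you for your support.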
