Using a Python regular expression to get the sentence after a pattern

November 15, 2023 · 268 views

In my string (an example adapted from this tutorial), I want to get everything that follows a generic "(year)." pattern, up to the first following period:

str = 'purple alice@google.com, (2002).blah monkey. (1991).@abc.com blah dishwasher'

I think my code is almost there, but not quite:

test = re.findall(r'[\(\d\d\d\d\).-]+([^.]*)', str)

...which returns: ['com, (2002)', 'blah monkey', ' (1991)', '@abc', 'com blah dishwasher']

The desired output is:

['blah monkey', '@abc']

In other words, I want to find everything between the year pattern and the next period.

Best answer:

If you want everything between "(year)." and the first following period, you can use this:

\(\d{4}\)\.([^.]*)
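
As a minimal sketch of how this pattern could be applied in Python, the snippet below runs it with `re.findall` against the sample string from the question. The names `text` and `year_pattern` are illustrative choices, not part of the original post; `text` is used instead of `str` so the built-in type is not shadowed.

```python
import re

# Sample string from the question (renamed from `str` to avoid shadowing the built-in).
text = 'purple alice@google.com, (2002).blah monkey. (1991).@abc.com blah dishwasher'

# Literal "(", exactly four digits, literal ")", literal ".",
# then capture every character up to (not including) the next ".".
year_pattern = re.compile(r'\(\d{4}\)\.([^.]*)')

# With a single capturing group, findall returns the captured text of each match.
matches = year_pattern.findall(text)
print(matches)  # ['blah monkey', '@abc']
```

The attempt in the question returns extra items because square brackets define a character class: `[\(\d\d\d\d\).-]+` matches any run of parentheses, digits, periods, or hyphens, rather than the sequence "(", four digits, ")", ".".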

See the live demo.

And here is the explanation:

"\(\d{4}\)\.([^.]*)"g

\( matches the character ( literally
\d{4} matches a digit [0-9]
    Quantifier: {4} Exactly 4 times
\) matches the character ) literally
\. matches the character . literally
1st Capturing group ([^.]*)
    [^.]* matches a single character not present in the list below
        Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
        . the literal character .
g modifier: global. All matches (don't return on first match)
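
The `g` modifier in the explanation is notation from the regex tester; Python's `re` module has no such flag. A rough sketch of the equivalent behaviour, assuming the same sample string as in the question: `search` stops at the first match, while `finditer` (or `findall`) walks through all of them. The variable names here are illustrative only.

```python
import re

text = 'purple alice@google.com, (2002).blah monkey. (1991).@abc.com blah dishwasher'
pattern = re.compile(r'\(\d{4}\)\.([^.]*)')

# Without "global" behaviour: search returns only the first match (or None).
first_match = pattern.search(text)
first_group = first_match.group(1) if first_match else None
print(first_group)  # blah monkey

# "Global" behaviour: iterate over every non-overlapping match.
all_groups = [m.group(1) for m in pattern.finditer(text)]
print(all_groups)  # ['blah monkey', '@abc']
```

`finditer` yields match objects, which is convenient when the position of each match is also needed; `findall` is shorter when only the captured text matters.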