python – pandas ValueError: pattern contains no capture groups

Using a regular expression, I get:

import re
string = r'http://www.example.com/abc.html'
result = re.search('^.*com', string).group()

In pandas, I wrote:

import pandas as pd

df = pd.DataFrame(columns = ['index', 'url'])
df.loc[len(df), :] = [1, 'http://www.example.com/abc.html']
df.loc[len(df), :] = [2, 'http://www.hello.com/def.html']
df.url.str.extract('^.*com')

ValueError: pattern contains no capture groups

How can I fix this?

Thanks.

Best answer: According to the docs, you need to specify a capture group (i.e., parentheses) for str.extract to have something to extract.

Series.str.extract(pat, flags=0, expand=True)
For each subject string in the Series, extract groups from the first match of regular expression pat.

Each capture group becomes its own column in the output.

df.url.str.extract(r'(.*\.com)')

                        0
0  http://www.example.com
1    http://www.hello.com

# If you need named capture groups,
df.url.str.extract(r'(?P<URL>.*\.com)')

                      URL
0  http://www.example.com
1    http://www.hello.com

Or, if you need a Series,

df.url.str.extract(r'(.*\.com)', expand=False)

0    http://www.example.com
1      http://www.hello.com
Name: url, dtype: object
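
To illustrate the point that each capture group gets its own column, here is a minimal sketch with two groups. The group names site and page are made up for this example, and the pattern assumes every URL has a path after ".com". The last part reproduces the error from the question by calling extract without any group.

```python
import pandas as pd

df = pd.DataFrame({
    'url': ['http://www.example.com/abc.html',
            'http://www.hello.com/def.html'],
})

# Two named capture groups -> a DataFrame with two columns.
# 'site' and 'page' are hypothetical names chosen for illustration.
parts = df['url'].str.extract(r'^(?P<site>.*\.com)/(?P<page>.*)$')
print(parts)

# With no capture group at all, pandas raises the ValueError
# shown in the question.
try:
    df['url'].str.extract(r'^.*\.com')
except ValueError as exc:
    print('ValueError:', exc)
```

Here parts ends up with one column per group (site and page), one row per URL. With more than one group, expand=False still returns a DataFrame, so the Series form only applies to single-group patterns.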