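Contents
Chapter 2 (pandas)
Python from Scratch, Chapter 3, Data Processing and Analysis: dplyr in Python (1)
Python from Scratch, Chapter 3, Data Processing and Analysis: dplyr in Python (2)
Python from Scratch, Chapter 3, Data Processing and Analysis: dplyr in Python (3)
Python from Scratch, Chapter 3, Data Processing and Analysis: dplyr in Python (4)
Python from Scratch, Chapter 3, Data Processing and Analysis: dplyr in Python (5)
Python from Scratch, Chapter 3, Data Processing and Analysis: the pandas DataFrame.query() function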
===============================================
This article shows how to use the pandas DataFrame.query() function to select (filter) rows of a data frame.
- Build the data frame
import pandas as pd

d = {
    'name': ['a', 'n', 'c', 'd', 'e', 'f'],
    'Gender': ['male', 'female', 'male', 'male', 'female', 'female'],
    'age': [23, 24, 24, 22, 21, 20],
    'hight': [173, 174, 164, 172, 161, 160],
    'weight1': [53, 74, 44, 62, 71, 60],
    'weight2': [53, 64, 54, 66, 81, 50]
}
df = pd.DataFrame(d)
df
Out[6]:
name Gender age hight weight1 weight2
0 a male 23 173 53 53
1 n female 24 174 74 64
2 c male 24 164 44 54
3 d male 22 172 62 66
4 e female 21 161 71 81
5 f female 20 160 60 50
- Rows are usually selected with boolean indexing:
df[df.age==24]
Out[13]:
name Gender age hight weight1 weight2
1 n female 24 174 74 64
2 c male 24 164 44 54
df[(df.age == 24) & (df.hight == 174)]
Out[14]:
name Gender age hight weight1 weight2
1 n female 24 174 74 64
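To make the boolean-indexing form above a little more transparent, here is a minimal sketch that builds the two conditions as separate masks before combining them. The variable names age_mask, hight_mask, mask and selected are only illustrative and do not come from the original post.

```python
import pandas as pd

# Same sample data as in the article (column names kept as written there).
df = pd.DataFrame({
    'name': ['a', 'n', 'c', 'd', 'e', 'f'],
    'Gender': ['male', 'female', 'male', 'male', 'female', 'female'],
    'age': [23, 24, 24, 22, 21, 20],
    'hight': [173, 174, 164, 172, 161, 160],
    'weight1': [53, 74, 44, 62, 71, 60],
    'weight2': [53, 64, 54, 66, 81, 50],
})

# Each comparison yields a boolean Series aligned with df's index.
age_mask = df.age == 24
hight_mask = df.hight == 174

# & combines the masks element-wise. In the one-line form the parentheses
# are required because & binds more tightly than == in Python.
mask = age_mask & hight_mask
selected = df[mask]
print(selected)
```

Writing the masks out like this shows why the one-line version needs a pair of parentheses around each condition.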
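- With DataFrame.query(), the same filters are written as an expression string, which reads more naturally and keeps the code more concise: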
df.query("age==24")
Out[20]:
name Gender age hight weight1 weight2
1 n female 24 174 74 64
2 c male 24 164 44 54
df.query("age==24").query('hight==174')
Out[21]:
name Gender age hight weight1 weight2
1 n female 24 174 74 64
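As a small sketch of two variations on the calls above: a chained query() can usually be written as a single expression joined with and, and a local Python variable can be referenced inside the expression with the @ prefix. The names chained, combined, target_age and by_variable are hypothetical and used only for this illustration.

```python
import pandas as pd

# Same sample data as in the article.
df = pd.DataFrame({
    'name': ['a', 'n', 'c', 'd', 'e', 'f'],
    'Gender': ['male', 'female', 'male', 'male', 'female', 'female'],
    'age': [23, 24, 24, 22, 21, 20],
    'hight': [173, 174, 164, 172, 161, 160],
    'weight1': [53, 74, 44, 62, 71, 60],
    'weight2': [53, 64, 54, 66, 81, 50],
})

# Two chained calls, each filtering the result of the previous one.
chained = df.query("age == 24").query("hight == 174")

# The same filter as one expression.
combined = df.query("age == 24 and hight == 174")

# @ pulls in a variable from the surrounding Python scope.
target_age = 24
by_variable = df.query("age == @target_age")

print(chained.equals(combined))
print(by_variable)
```

The single-expression form avoids building an intermediate data frame, while chaining can be easier to read when the steps are added one at a time.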
df.query('Gender == "male"')
Out[29]:
name Gender age hight weight1 weight2
0 a male 23 173 53 53
2 c male 24 164 44 54
3 d male 22 172 62 66
df.query('index > 2')
Out[47]:
name Gender age hight weight1 weight2
3 d male 22 172 62 66
4 e female 21 161 71 81
5 f female 20 160 60 50
df.query('Gender =="male" and name =="a"')
Out[30]:
name Gender age hight weight1 weight2
0 a male 23 173 53 53
df.query('Gender =="male" and age<24')
Out[31]:
name Gender age hight weight1 weight2
0 a male 23 173 53 53
3 d male 22 172 62 66
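The examples above combine conditions with and. As a hedged sketch of two related forms, query() expressions also accept or, and membership tests with in against a list literal. The variable names either and picked are only illustrative.

```python
import pandas as pd

# Same sample data as in the article.
df = pd.DataFrame({
    'name': ['a', 'n', 'c', 'd', 'e', 'f'],
    'Gender': ['male', 'female', 'male', 'male', 'female', 'female'],
    'age': [23, 24, 24, 22, 21, 20],
    'hight': [173, 174, 164, 172, 161, 160],
    'weight1': [53, 74, 44, 62, 71, 60],
    'weight2': [53, 64, 54, 66, 81, 50],
})

# Rows that satisfy at least one of the two conditions.
either = df.query('Gender == "female" or age > 23')

# Rows whose name appears in the given list.
picked = df.query('name in ["a", "c"]')

print(either)
print(picked)
```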
- In addition, query() can compare the values of different columns with each other:
df.query('weight1 > weight2')
Out[22]:
name Gender age hight weight1 weight2
1 n female 24 174 74 64
5 f female 20 160 60 50
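Column comparisons are not limited to a bare > between two columns. The sketch below assumes the same data frame and shows that the query() form matches plain boolean indexing, and that arithmetic between columns can appear inside the expression. The names via_query, via_mask and gained are hypothetical.

```python
import pandas as pd

# Same sample data as in the article.
df = pd.DataFrame({
    'name': ['a', 'n', 'c', 'd', 'e', 'f'],
    'Gender': ['male', 'female', 'male', 'male', 'female', 'female'],
    'age': [23, 24, 24, 22, 21, 20],
    'hight': [173, 174, 164, 172, 161, 160],
    'weight1': [53, 74, 44, 62, 71, 60],
    'weight2': [53, 64, 54, 66, 81, 50],
})

# query() form and the equivalent boolean-indexing form.
via_query = df.query('weight1 > weight2')
via_mask = df[df.weight1 > df.weight2]

# Arithmetic on columns inside the expression:
# rows where weight2 exceeds weight1 by at least 10.
gained = df.query('weight2 - weight1 >= 10')

print(via_query.equals(via_mask))
print(gained)
```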