Pandas Series | Part 3: Accessing Data in Pandas
1. Ways to access data in pandas
- df.loc: select by row and column labels
- df.iloc: select by integer row and column positions
- df.query: select rows with a string expression
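As a minimal sketch of the three access styles listed above, the following example reads the same data in three ways. The small `scores` DataFrame and its column names are illustrative assumptions, not data from this article.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
scores = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cid"], "score": [72, 88, 95]},
    index=["a", "b", "c"],
)

by_label = scores.loc["b", "score"]      # row label "b", column label "score"
by_position = scores.iloc[1, 1]          # second row, second column (0-based)
by_query = scores.query("score > 80")    # rows matching a string expression

print(by_label, by_position)
print(by_query)
```

The label and the position happen to point at the same cell here, which is why both lookups return the same value.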
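2. Of these, .loc can both query data and overwrite values in place, so it is strongly recommended!
Ways to query data with df.loc:
- Query by a single label value
- Batch query with a list of values
- Range query on a numeric interval
- Query with a conditional expression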
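A range query often has two bounds rather than one. The sketch below, which assumes a hypothetical `temps` DataFrame, shows two common ways to express that with `.loc`: combining two comparisons with `&`, or using `Series.between`, which includes both endpoints by default.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
temps = pd.DataFrame(
    {"low": [-3, 5, 12, 20], "high": [4, 14, 21, 31]},
    index=["Jan", "Apr", "Jul", "Aug"],
)

# Two comparisons joined with &; each one needs its own parentheses.
mild = temps.loc[(temps["high"] >= 10) & (temps["high"] <= 25)]

# Series.between is inclusive on both ends by default.
mild_between = temps.loc[temps["high"].between(10, 25)]

# A row condition plus a list of columns to keep.
mild_lows = temps.loc[temps["high"].between(10, 25), ["low"]]

print(mild)
print(mild_lows)
```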
3. Hands-on use in enterprise projects
.loc is used mostly for assignment
.iloc is used mostly for reading data
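To make these two habits concrete, here is a small sketch that assumes a hypothetical `orders` table. A boolean mask with `.loc` rewrites only the matching cells, while `.iloc` reads rows by position whatever the index labels are.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
orders = pd.DataFrame(
    {"status": ["new", "paid", "new"], "amount": [10.0, 25.0, 40.0]},
    index=[101, 102, 103],
)

# .loc with a mask and a column label: overwrite the matching cells in place.
orders.loc[orders["amount"] > 20, "status"] = "review"

# .iloc reads by position, so the integer labels 101..103 play no role here.
first_two = orders.iloc[:2]
last_row = orders.iloc[-1]

print(orders)
print(first_two)
print(last_row)
```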
import pandas as pd

area = pd.Series({'California': 423967, 'Texas': 695662, 'New York': 141297, 'Florida': 170312, 'Illinois': 149995})
pop = pd.Series({'California': 38332521, 'Texas': 26448193, 'New York': 19651127, 'Florida': 19552860, 'Illinois': 12882135})
df = pd.DataFrame({'city': area.index, 'area': area, 'pop': pop})
df['density'] = df['pop'] / df['area']
print(df.values)
print(df.values[0])
print('------area------')
print(df['area'])
print('------Common uses of .loc in enterprise projects------')
print('------1. Query by a single label value------')
print(df.loc['Texas'])
print('------2. Batch query with a list of values------')
print(df.loc[['Florida', 'Illinois']])
print(df.loc[:'Illinois', :'pop'])  # label-based slices include the end point
print('------3. Range query on a numeric interval------')
print(df.loc[df.density > 100])
print(df.loc[df.density > 100, ['pop', 'density']])
print('------4. Query with a conditional expression------')
print(df.loc[df.city == 'Illinois'])
df.loc[df.city == 'Illinois', 'city'] = 'Illinois1'  # assignment
print(df)
print('------Common uses of .iloc in enterprise projects------')
print(df.iloc[1:3])
print(df.iloc[:3, :2])  # position-based slices exclude the end point
df.iloc[0, 2] = 90  # assignment
print(df)
print('------Common uses of query in enterprise projects------')
result = df.query('pop > 20000000')
print(result)
Output:
[['California' 423967 38332521 90.41392608386974]
 ['Texas' 695662 26448193 38.01874042279153]
 ['New York' 141297 19651127 139.07674614464568]
 ['Florida' 170312 19552860 114.80612053173]
 ['Illinois' 149995 12882135 85.88376279209307]]
['California' 423967 38332521 90.41392608386974]
------area------
California    423967
Texas         695662
New York      141297
Florida       170312
Illinois      149995
Name: area, dtype: int64
------Common uses of .loc in enterprise projects------
------1. Query by a single label value------
city          Texas
area         695662
pop        26448193
density    38.01874
Name: Texas, dtype: object
------2. Batch query with a list of values------
              city    area       pop     density
Florida    Florida  170312  19552860  114.806121
Illinois  Illinois  149995  12882135   85.883763
                  city    area       pop
California  California  423967  38332521
Texas            Texas  695662  26448193
New York      New York  141297  19651127
Florida        Florida  170312  19552860
Illinois      Illinois  149995  12882135
------3. Range query on a numeric interval------
              city    area       pop     density
New York  New York  141297  19651127  139.076746
Florida    Florida  170312  19552860  114.806121
               pop     density
New York  19651127  139.076746
Florida   19552860  114.806121
------4. Query with a conditional expression------
              city    area       pop    density
Illinois  Illinois  149995  12882135  85.883763
                  city    area       pop     density
California  California  423967  38332521   90.413926
Texas            Texas  695662  26448193   38.018740
New York      New York  141297  19651127  139.076746
Florida        Florida  170312  19552860  114.806121
Illinois     Illinois1  149995  12882135   85.883763
------Common uses of .iloc in enterprise projects------
              city    area       pop     density
Texas        Texas  695662  26448193   38.018740
New York  New York  141297  19651127  139.076746
                  city    area
California  California  423967
Texas            Texas  695662
New York      New York  141297
                  city    area       pop     density
California  California  423967        90   90.413926
Texas            Texas  695662  26448193   38.018740
New York      New York  141297  19651127  139.076746
Florida        Florida  170312  19552860  114.806121
Illinois     Illinois1  149995  12882135   85.883763
------Common uses of query in enterprise projects------
        city    area       pop   density
Texas  Texas  695662  26448193  38.01874
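The two slicing comments in the code (label-based slices include the end point, position-based slices exclude it) can be checked in isolation with a short sketch. The `letters` Series is a hypothetical example.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
letters = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

label_slice = letters.loc["a":"c"]    # stop label "c" is included
position_slice = letters.iloc[0:2]    # stop position 2 is excluded

print(label_slice.tolist())     # [10, 20, 30]
print(position_slice.tolist())  # [10, 20]
```

This is why `df.loc[:'Illinois', :'pop']` returns the `pop` column while `df.iloc[:3, :2]` stops before the third column.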
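`df.query` can also refer to a Python variable by prefixing its name with `@`, which avoids building the expression string by hand. A minimal sketch, assuming a hypothetical `cities` DataFrame and threshold:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
cities = pd.DataFrame(
    {"city": ["A", "B", "C"], "population": [5_000, 80_000, 250_000]}
)

min_population = 50_000

# @min_population refers to the local Python variable defined above.
large = cities.query("population >= @min_population")

# The same selection written with .loc and a boolean mask.
large_via_loc = cities.loc[cities["population"] >= min_population]

print(large)
```

Both forms return the same rows; which one reads better is largely a matter of taste and of how long the condition is.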