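Pandas: Difference between revisions

Revision as of 17:21, 4 October 2020 (Sunday)

Pandas is an open-source Python library for data analysis. It provides tools for processing, computing on, analyzing, storing, and visualizing data.

Introduction

Timeline

In 2008, developer Wes McKinney started building pandas at AQR Capital Management to meet the need for a high-performance, flexible tool for quantitative analysis of financial data. Before leaving AQR, he persuaded management to let him open-source the library.
In 2012, another AQR employee, Chang She, joined the effort and became the library's second major contributor.
In 2015, pandas became a fiscally sponsored project of NumFOCUS, a 501(c)(3) nonprofit charity in the United States.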

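Installation and import

Install pandas with pip:

pip install pandas

Scientific computing distributions such as Anaconda ship with pandas already installed.

Import pandas at the top of the script. The conventional form is: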

import pandas as pd

Check the pandas version:

pd.__version__

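Data structures

pandas defines two core data structures, Series and DataFrame, and most operations work on one of the two.

Learn more >> Pandas User Guide: Data structures

Series

A Series is a one-dimensional array with axis labels (an index) that can hold any data type (integers, strings, floating-point numbers, Python objects, and so on). The axis labels are collectively called the index. Because it maps labels to values, a Series behaves much like a Python dict.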
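The dict-like behavior can be illustrated with a minimal sketch. The labels and values below are made up for illustration; only standard Series operations are used.

```python
import pandas as pd

# A Series built from a dict: the keys become the index labels.
s = pd.Series({"a": 1, "b": 2, "c": 3})

value_b = s["b"]          # look up a value by label, as with a dict
has_a = "a" in s          # membership is tested against the index labels
missing = s.get("z", 0)   # .get() with a default, like dict.get
labels = list(s.keys())   # keys() returns the index

print(value_b, has_a, missing, labels)
```

Unlike a dict, a Series keeps its labels in order and also supports positional and vectorized access.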

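Creating a Series

The basic way to create a Series is to instantiate the pandas.Series class. The signature is:

pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)

The index argument is optional. If it is omitted, the labels default to an integer range starting at 0. Some examples: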

s = pd.Series(["foo", "bar", "foba"])
print(type(s))   #<class 'pandas.core.series.Series'>

s2 = pd.Series(["foo", "bar", "foba"], index=['b','d','c'])

# Create a date index
date_index = pd.date_range("2020-01-01", periods=3, freq="D")
s3 = pd.Series(["foo", "bar", "foba"], index=date_index)

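Series data operations

DataFrame

A DataFrame is a labeled two-dimensional data structure whose columns may have different types. It consists of the data, the row labels (index), and the column labels (columns). You can think of it as a spreadsheet, a SQL table, or a dict of Series objects. It is generally the most commonly used pandas object.

Learn more >> Pandas User Guide: DataFrame

Creating a DataFrame

There are several ways to create a DataFrame object:

Use the pandas.DataFrame() constructor.
Use the pandas.DataFrame.from_dict() method, which works much like the constructor.
Use the pandas.DataFrame.from_records() method, which works much like the constructor.
Use a reader function to build one from a file, for example pandas.read_csv() to load a CSV file into a DataFrame.

The signature of the pandas.DataFrame() constructor is: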

pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
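The four creation routes listed above can be sketched side by side. The sample rows are illustrative, and the CSV is read from an in-memory string (an assumption made so the example needs no file on disk).

```python
import io

import pandas as pd

rows = [["foo", 22, 3], ["bar", 25, 6], ["test", 18, 7]]
cols = ["name", "age", "number"]

# 1. Constructor: a list of rows plus column labels.
df1 = pd.DataFrame(rows, columns=cols)

# 2. from_dict: one key per column.
df2 = pd.DataFrame.from_dict(
    {"name": ["foo", "bar", "test"], "age": [22, 25, 18], "number": [3, 6, 7]}
)

# 3. from_records: a sequence of record tuples.
df3 = pd.DataFrame.from_records([tuple(r) for r in rows], columns=cols)

# 4. read_csv: here fed from a string buffer instead of a file path.
csv_text = "name,age,number\nfoo,22,3\nbar,25,6\ntest,18,7\n"
df4 = pd.read_csv(io.StringIO(csv_text))

print(df1)
```

All four frames hold the same three rows; which route is most convenient depends on the shape the data already has.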

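Attributes and methods

The attributes and methods of Series and DataFrame are listed below, grouped by purpose.

In the table examples, s is a Series object and df is a DataFrame object: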

>>> s = pd.Series(['a', 'b', 'c'])
>>> s
0    a
1    b
2    c
dtype: object

Constructors

Method	Description	Series	DataFrame	Example
Constructor	Creates a Series or DataFrame object	pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)	pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)	`s = pd.Series(["a", "b", "c"])` `df = pd.DataFrame([['foo', 22, 3], ['bar', 25, 6], ['test', 18, 7]],columns=['name', 'age', 'number'])`

Attributes and basic information

Attribute/Method	Description	Series	DataFrame	Example
index	The index (row labels)	Series.index	DataFrame.index	`s.index` `df.index`
columns	The column labels; not available on Series	−	DataFrame.columns	`df.columns`
dtypes	Returns the NumPy data type(s) of the data (dtype objects)	Series.dtypes	DataFrame.dtypes	`s.dtypes` `df.dtypes`
dtype	Returns the NumPy data type of the data (a dtype object)	Series.dtype	−	`s.dtype`
array	Returns the array backing the Series or Index data, as a pandas extension array.	Series.array	−	`s.array` returns: <PandasArray> ['a', 'b', 'c'] Length: 3, dtype: object
attrs	Dictionary of global attributes of this object.	Series.attrs	DataFrame.attrs	`s.attrs` returns {}
axes	Returns a list of the axis labels. A Series returns [index]; a DataFrame returns [index, columns].	Series.axes	DataFrame.axes	`s.axes` returns [RangeIndex(start=0, stop=3, step=1)]
hasnans	Returns True if there are any missing values (such as Python's None or np.nan), otherwise False.	Series.hasnans	−	`s2 = pd.Series(['a', None, 'c'])` `s2.hasnans` returns True
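The attributes in the table can be tried out with a short sketch. It reuses the s and df sample objects from the constructor examples; the variable names on the left are illustrative only.

```python
import pandas as pd

s = pd.Series(["a", "b", "c"])
df = pd.DataFrame(
    [["foo", 22, 3], ["bar", 25, 6], ["test", 18, 7]],
    columns=["name", "age", "number"],
)

index_labels = list(s.index)      # row labels of the Series
column_labels = list(df.columns)  # column labels of the DataFrame
series_axes = s.axes              # a one-element list: [index]
frame_axes = df.axes              # a two-element list: [index, columns]
age_dtype = df.dtypes["age"]      # df.dtypes holds one dtype per column

# hasnans is only defined on Series (and Index), not on DataFrame.
has_missing = pd.Series(["a", None, "c"]).hasnans
no_missing = s.hasnans

print(index_labels, column_labels, age_dtype, has_missing, no_missing)
```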

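Data selection / index labels / iteration

Attribute/Method	Description	Series	DataFrame	Example
at	Gets or sets a single value by a row/column label pair.	Series.at	DataFrame.at	`s.at[1]` returns 'b' `s.at[2]='d'` sets the value at label 2 (the third element) to 'd'
iat	Gets or sets a single value by integer row and column position.	Series.iat	DataFrame.iat	`s.iat[1]` `s.iat[2]='d'`
iloc	Gets or sets values by integer position along the index (row axis).	Series.iloc	DataFrame.iloc	`s.iloc[2]` returns 'c' `s.iloc[:2]` selects positions 0 up to, but not including, 2 `s.iloc[[True,False,True]]` selects the positions marked True `s.iloc[lambda x: x.index % 2 == 0]` selects the elements whose index label is even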
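A small sketch of the three accessors on both object types follows. The DataFrame and its row labels are made-up sample data.

```python
import pandas as pd

s = pd.Series(["a", "b", "c"])
df = pd.DataFrame(
    {"age": [22, 25, 18], "number": [3, 6, 7]},
    index=["foo", "bar", "test"],
)

by_label = s.at[1]        # single value by index label
by_position = s.iat[1]    # single value by integer position
last = s.iloc[2]          # third element by position

first_two = s.iloc[:2].tolist()                     # positions 0 and 1
masked = s.iloc[[True, False, True]].tolist()       # boolean mask
even = s.iloc[lambda x: x.index % 2 == 0].tolist()  # callable returning a mask

cell_by_label = df.at["bar", "age"]  # row label and column label
cell_by_pos = df.iat[2, 1]           # row position and column position
df.at["test", "age"] = 19            # at can also assign a single cell

print(by_label, by_position, last, first_two, masked, even)
```

at and iat only ever address one cell, which makes them a good fit for scalar reads and writes, while iloc also accepts slices, lists, masks, and callables.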

Computation / descriptive statistics

Attribute/Method	Description	Series	DataFrame	Example
abs()	Returns the absolute value of each element of a Series/DataFrame.	Series.abs()	DataFrame.abs()	`s.abs()` `df.abs()`
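Since abs() only applies to numeric data, a sketch with numeric sample values (chosen here for illustration) shows the element-wise behavior:

```python
import pandas as pd

s = pd.Series([-1.5, 2.0, -3.0])
df = pd.DataFrame({"x": [-1, 2], "y": [3, -4]})

abs_s = s.abs()    # element-wise absolute value of a Series
abs_df = df.abs()  # element-wise absolute value of every column

print(abs_s.tolist())
print(abs_df)
```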

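Plotting with pandas

pandas plotting is built on Matplotlib. Both DataFrame and Series have a plot method that generates many kinds of charts quickly.

Learn more >> pandas documentation: User Guide - Visualization

Basic charts

Line chart

The plot method draws a line chart by default. For example, if prices is a DataFrame with a close column of closing prices, the closing-price line chart is drawn like this: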

s = prices['close']
s.plot() 

# Set the figure size with the figsize parameter
s.plot(figsize=(20,10))
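For a self-contained version of the line chart, the sketch below builds a synthetic prices DataFrame, since the original data is not shown. The random-walk values and the Agg backend are assumptions made so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import numpy as np
import pandas as pd

# Hypothetical closing prices: a random walk around 100.
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=30, freq="D")
prices = pd.DataFrame({"close": 100 + rng.standard_normal(30).cumsum()}, index=dates)

# plot() returns a Matplotlib Axes that can be customized further.
ax = prices["close"].plot(figsize=(10, 5), title="Closing price")
```

The returned Axes can be saved with ax.figure.savefig("close.png") or shown interactively when a GUI backend is active.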

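Bar chart

For data with discrete labels rather than a time series, a bar chart is a good fit. There are two ways to draw one:

Call plot() with the kind parameter set to 'bar' or 'barh'.
Call plot.bar() or plot.barh().

df.plot(kind='bar')    # assume df holds daily stock data
df.plot.bar()          
df.resample('A-DEC').mean().volume.plot(kind='bar')    # resample to the yearly mean volume and draw it as a bar chart (volume is the volume column of df)

df.plot.bar(stacked=True)    # stacked=True draws a stacked bar chart
df.plot.barh(stacked=True)    # barh draws a horizontal bar chart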

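Histogram

Histograms are drawn with the plot.hist() method. They usually show a frequency distribution, with value intervals on the x axis and counts on the y axis. The bins parameter controls the number of intervals, for example bins=20 for 20 bins.

df.volume.plot.hist()    # frequency histogram of the volume column in the stock data df
df.plot.hist(alpha=0.5)    # alpha=0.5 sets the bar opacity to 0.5
df.plot.hist(stacked=True, bins=20)    # stacked=True stacks the columns; bins=20 uses 20 bins
df.plot.hist(orientation='horizontal')    # orientation='horizontal' draws a horizontal histogram
df.plot.hist(cumulative=True)    # draws a cumulative histogram

df['close'].diff().hist()    # apply diff to the closing price, then draw a histogram
df.hist(color='k', bins=50)     # DataFrame.hist draws each column on its own subplot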

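Box plot

Box plots can be drawn with plot.box() or with DataFrame.boxplot(). Parameters:

color sets the colors and takes a dict, for example color={'boxes': 'DarkGreen', 'whiskers': 'DarkOrange', 'medians': 'DarkBlue', 'caps': 'Gray'}
sym sets the outlier marker style, for example sym='r+' draws outliers as red plus signs.

df.plot.box()
df[['close','open', 'high']].plot.box()
# Change the box colors by passing a color dict
color={'boxes': 'DarkGreen', 'whiskers': 'DarkOrange', 'medians': 'DarkBlue', 'caps': 'Gray'}
df.plot.box(color=color, sym='r+')    # sym sets the outlier style; 'r+' means a red plus sign
df.plot.box(positions=[1, 4, 5, 6, 8])    # positions sets where each box is drawn: with 5 columns in df, the first sits at x=1, the second at x=4, and so on
df.plot.box(vert=False)    # draws horizontal box plots
df.boxplot()   

# Draw grouped box plots: the by keyword creates groups, and a box plot is drawn per group. In the example below, each column is drawn separately for group A and group B.
import numpy as np
df = pd.DataFrame(np.random.rand(10, 2), columns=['Col1', 'Col2'])
df['x'] = pd.Series(['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'])
df.boxplot(by='x')

# A second grouping column can be passed to split the groups further (this assumes df also has grouping columns 'X' and 'Y'):
df.boxplot(column=['Col1', 'Col2'], by=['X', 'Y'])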

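Scatter plot

Scatter plots are drawn with DataFrame.plot.scatter(). The x and y parameters name the columns used for the x axis and y axis.

df.plot.scatter(x='close', y='volume')    # if df holds daily stock data, this plots closing price against volume

# To draw two groups of points on one chart, pass the first plot's axes through the ax parameter:
ax = df.plot.scatter(x='close', y='volume', color='DarkBlue', label='Group 1')    # label sets the legend label
df.plot.scatter(x='open', y='value', color='DarkGreen', label='Group 2', ax=ax)

# The c parameter colors each point on a gradient according to the volume column.
df.plot.scatter(x='close', y='open', c='volume', s=50)    # s sets the marker size
df.plot.scatter(x='close', y='open', s=df['volume']/50000)  # the marker size can also follow the values of a column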

Pie chart

Pie charts are drawn with DataFrame.plot.pie() or Series.plot.pie(). Missing values in the data are automatically filled with 0.
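A minimal pie chart sketch, using a made-up Series of volume per sector (the data and names are assumptions for illustration):

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import pandas as pd

# Hypothetical share of trading volume per sector.
volume_by_sector = pd.Series(
    [40, 35, 25], index=["tech", "finance", "energy"], name="volume"
)

# The index labels become the wedge labels; autopct prints the percentages.
ax = volume_by_sector.plot.pie(autopct="%.1f%%", figsize=(5, 5))
```

For a DataFrame, plot.pie() needs either a y argument naming the column to draw or subplots=True to draw one pie per column.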

Other plotting functions

These plotting functions come from the pandas.plotting module.

Scatter matrix plot

A scatter matrix plot is drawn with the scatter_matrix() function:

from pandas.plotting import scatter_matrix     # import the function from the module before use
scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal='kde')    # assuming df holds daily stock data, this draws a scatter plot of every column against every other column

Density plot

Density plots are drawn with Series.plot.kde() and DataFrame.plot.kde().

df.plot.kde()

Andrews curves

Andrews curves

Parallel coordinates

Lag plot

Autocorrelation plot

Autocorrelation plot

Bootstrap plot

Plot formatting

Presetting the plot style

Since version 1.5, matplotlib supports preset styles: call matplotlib.style.use(my_plot_style) before plotting. For example, matplotlib.style.use('ggplot') produces ggplot-style plots.
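As a brief sketch of presetting a style: matplotlib.style.available lists the style names that can be passed to matplotlib.style.use. The sample Series is illustrative only.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import matplotlib.style
import pandas as pd

available = matplotlib.style.available  # names of the built-in styles

# Apply the style before plotting; it affects every plot drawn afterwards.
matplotlib.style.use("ggplot")
ax = pd.Series([1, 3, 2, 4]).plot()
```

Because the style changes global Matplotlib settings, it is usually set once near the top of a script.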

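Style parameters

Most plotting functions accept a set of parameters for setting colors.

Legend settings

Set the legend parameter to False to hide the legend, for example: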

df.plot(legend=False)

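Scale

The logy parameter puts the y axis on a logarithmic scale.
The logx parameter puts the x axis on a logarithmic scale.
The loglog parameter puts both the x axis and the y axis on a logarithmic scale.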

ts.plot(logy=True)
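The ts object above is not defined in the text. A minimal sketch, assuming ts is a time-indexed Series with exponentially growing values:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import numpy as np
import pandas as pd

# Hypothetical series that grows exponentially over 50 days.
ts = pd.Series(
    np.exp(np.linspace(0, 5, 50)),
    index=pd.date_range("2020-01-01", periods=50, freq="D"),
)

# On a log-scaled y axis, exponential growth appears as a straight line.
ax = ts.plot(logy=True)
```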

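Secondary y-axis plots

When two series share an x axis but their y values are on different scales, pass secondary_y=True when plotting the second series to give it a second y axis.

# For example, to show the CCI indicator on top of the closing-price chart:
prices['close'].plot()
prices['cci'].plot(secondary_y=True)

# To put several columns on the secondary axis, pass the column names directly
ax = df.plot(secondary_y=['cci', 'RSI'], mark_right=False)    # by default the legend labels of right-axis columns get a "(right)" suffix; mark_right=False turns it off
ax.set_ylabel('CD scale')     # set the left y-axis label
ax.right_ax.set_ylabel('AB scale')    # set the right y-axis label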
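A runnable sketch of the secondary axis, using synthetic data in place of the real prices table. The close and cci values below are random placeholders, not a real indicator calculation.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.date_range("2020-01-01", periods=60, freq="D")
prices = pd.DataFrame(
    {
        "close": 100 + rng.standard_normal(60).cumsum(),  # placeholder prices
        "cci": rng.uniform(-200, 200, size=60),           # placeholder indicator
    },
    index=dates,
)

# close stays on the left axis; cci is drawn against the right axis.
ax = prices.plot(secondary_y=["cci"], mark_right=False)
ax.set_ylabel("Close")         # left y-axis label
ax.right_ax.set_ylabel("CCI")  # right y-axis label
```

When only some columns are listed in secondary_y, the returned Axes is the left one and the right one is reachable through its right_ax attribute.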

Subplots

Each column of a DataFrame can be drawn on its own axes by setting the subplots parameter, for example:

df.plot(subplots=True, figsize=(6, 6))

Subplot layout

The arrangement of the subplots is set with the layout keyword.
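A short sketch of the layout keyword, which takes a (rows, columns) tuple. The four-column DataFrame is synthetic sample data.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
df = pd.DataFrame(
    rng.standard_normal((20, 4)).cumsum(axis=0),
    columns=["open", "high", "low", "close"],
)

# Four columns arranged on a 2 x 2 grid; the result is a 2 x 2 array of Axes.
axes = df.plot(subplots=True, layout=(2, 2), figsize=(6, 6), sharex=False)
```

The grid must have at least as many cells as there are columns to plot.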

Resources

Official website

Books

Python for Data Analysis, 2nd Edition - Wes McKinney

References

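@@ Line 44: / Line 44: @@
 ====Series data operations====
-====Series attributes====
+===DataFrame===
-In the table examples below, s is a Series object:
+A DataFrame is a labeled two-dimensional data structure whose columns may have different types. It consists of the data, the row labels (index), and the column labels (columns). You can think of it as a spreadsheet, a SQL table, or a dict of Series objects. It is generally the most commonly used pandas object.
+{{了解更多|[https://pandas.pydata.org/docs/user_guide/dsintro.html#dataframe Pandas User Guide: DataFrame]}}
+====Creating a DataFrame====
+There are several ways to create a DataFrame object:
+* Use the <code>pandas.DataFrame()</code> constructor.
+* Use the <code>pandas.DataFrame.from_dict()</code> method, which works much like the constructor.
+* Use the <code>pandas.DataFrame.from_records()</code> method, which works much like the constructor.
+* Use a reader function to build one from a file, for example <code>pandas.read_csv()</code> to load a CSV file into a DataFrame.
+The signature of the <code>pandas.DataFrame()</code> constructor is:
+ pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
+===Attributes and methods===
+The attributes and methods of Series and DataFrame are listed below, grouped by purpose.
+In the table examples, s is a Series object and df is a DataFrame object: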
 <syntaxhighlight lang="python" >
 >>> s = pd.Series(['a', 'b', 'c'])
@@ Line 53: / Line 70: @@
     c
 dtype: object
 </syntaxhighlight>
+{{了解更多
+|[https://pandas.pydata.org/docs/reference/frame.html  Pandas API: DataFrame]
+|[https://pandas.pydata.org/docs/reference/series.html Pandas API: Series]}}
+====Constructors====
 {| class="wikitable"
 |-
-!Attribute name
+!Method name
 !Description
+!Series
+!DataFrame
 !Example
-!Result
 |-
-| T
+|Constructor
-| Returns the transpose; by definition, the transpose of a Series is itself.
+|Creates a Series or DataFrame object
-| s.T
+|pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)
-| Itself
+|pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
+|<code>s = pd.Series(["a", "b", "c"])</code>  <br \><br \><code>df = pd.DataFrame([['foo', 22, 3], ['bar', 25, 6], ['test', 18, 7]],columns=['name', 'age', 'number'])</code>
+|-
+|}
+====Attributes and basic information====
+{| class="wikitable"
+|-
+!Attribute/Method
+!Description
+!Series
+!DataFrame
+!Example
+|-
+| index
+| The index (row labels)
+|Series.index
+|DataFrame.index
+| <code>s.index</code> <br \> <code>df.index</code>
+|-
+| columns
+| The column labels; not available on Series
+| &minus;
+|DataFrame.columns
+| <code>df.columns</code>
+|-
+| dtypes
+| Returns the NumPy data type(s) of the data (dtype objects)
+|Series.dtypes
+|DataFrame.dtypes
+| <code>s.dtypes</code><br \> <code>df.dtypes</code>
+|-
+| dtype
+| Returns the NumPy data type of the data (a dtype object)
+| Series.dtype
+| &minus;
+| <code>s.dtype</code>
 |-
 | array
 | Returns the array backing the Series or Index data, as a pandas extension array.
-| s.array
+| Series.array
-| <PandasArray><br \>['a', 'b', 'c']<br \>Length: 3, dtype: object
+| &minus;
-|-
+| <code>s.array</code> <br \>returns: <PandasArray><br \>['a', 'b', 'c']<br \>Length: 3, dtype: object
-| at
-| Gets or sets a single value by row and column labels.
-| s.at[1]<br \>s.at[2]='d'
-|'b'
 |-
 | attrs
 | Dictionary of global attributes of this object.
-| s.attrs
+| Series.attrs
-| {}
+| DataFrame.attrs
+| <code>s.attrs</code> returns {}
 |-
 | axes
-| Returns a list of the row axis labels.
+| Returns a list of the axis labels.<br \>A Series returns [index] <br \>A DataFrame returns [index, columns]
-| s.axes
+| Series.axes
-| [RangeIndex(start=0, stop=3, step=1)]
+| DataFrame.axes
+| <code>s.axes</code> returns [RangeIndex(start=0, stop=3, step=1)]
 |-
-| dtype
+| hasnans
-| Returns the NumPy data type of the data
+| Returns True if there are any missing values (such as Python's None or np.nan), otherwise False.
-| s.dtype
+| Series.hasnans
-| dtype('O')
+| &minus;
+| <code>s2 = pd.Series(['a', None, 'c'])</code> <br \><code>s2.hasnans</code> <br \>returns True
+|-
+|}
+====Data selection / index labels / iteration====
+{| class="wikitable"
 |-
-| dtypes
+!Attribute/Method
-| Returns the NumPy data type of the data
+!Description
-| s.dtypes
+!Series
-| dtype('O')
+!DataFrame
+!Example
 |-
-| hasnans
+| at
-| Returns True if there are any missing values (such as Python's None or np.nan), otherwise False.
+| Gets or sets a single value by a row/column label pair.
-| s2 = pd.Series(['a', None, 'c']) <br \>s2.hasnans
+| Series.at
-| True
+| DataFrame.at
+| <code>s.at[1]</code> returns 'b'<br \><code>s.at[2]='d'</code> sets the value at label 2 (the third element) to 'd'
 |-
 | iat
 | Gets or sets a single value by integer row and column position.
-| s.iat[1]<br \>s.iat[2]='d'
+| Series.iat
-|'b'
+| DataFrame.iat
+| <code>s.iat[1]</code><br \><code>s.iat[2]='d'</code>
 |-
 | iloc
 |Gets or sets values by integer position along the index (row axis).
-|1. <code>s.iloc[2]</code> <br \>2. <code>s.iloc[:2]</code> <br \>3. <code><nowiki>s.iloc[[True,False,True]]</nowiki></code> <br \>4. <code>s.iloc[lambda x: x.index % 2 == 0]</code>
+| Series.iloc
-|1. 'c'<br \>2. selects positions 0 up to, but not including, 2<br \>3. selects the positions marked True <br \>4. selects the elements whose index label is even
+| DataFrame.iloc
-|-
+|<code>s.iloc[2]</code> returns 'c' <br \><code>s.iloc[:2]</code> selects positions 0 up to, but not including, 2 <br \><code><nowiki>s.iloc[[True,False,True]]</nowiki></code> selects the positions marked True <br \><code>s.iloc[lambda x: x.index % 2 == 0]</code> selects the elements whose index label is even
-| index
-| The index (axis labels) of the Series.
-|-
-| is_monotonic
-| Return boolean if values in the object are monotonic_increasing.
-|-
-| is_monotonic_decreasing
-| Return boolean if values in the object are monotonic_decreasing.
-|-
-| is_monotonic_increasing
-| Alias for is_monotonic.
-|-
-| is_unique
-| Return boolean if values in the object are unique.
-|-
-| loc
-| Access a group of rows and columns by label(s) or a boolean array.
-|-
-| name
-| Return the name of the Series.
-|-
-| nbytes
-| Return the number of bytes in the underlying data.
-|-
-| ndim
-| Number of dimensions of the underlying data, by definition 1.
-|-
-| shape
-| Return a tuple of the shape of the underlying data.
-|-
-| size
-| Return the number of elements in the underlying data.
 |-
-| values
-| Return Series as ndarray or ndarray-like depending on the dtype.
 |}
-{{了解更多|[https://pandas.pydata.org/docs/reference/api/pandas.Series.html#pandas.Series Pandas API: pandas.Series]}}
+====Computation / descriptive statistics====
-====Series methods====
 {| class="wikitable"
 |-
-! Method
+!Attribute/Method
-! Description
+!Description
-! Example
+!Series
-! Result
+!DataFrame
+!Example
 |-
 | abs()
 | Returns the absolute value of each element of a Series/DataFrame.
-| s.abs()
+| Series.abs()
-|
+| DataFrame.abs()
-|-
+| <code>s.abs()</code> <br \> <code>df.abs()</code>
-| add(other[, level, fill_value, axis])
-| Return Addition of series and other, element-wise (binary operator add).
-|
-|
-|-
-| add_prefix(prefix)
-| Prefix labels with string prefix.
-|-
-| add_suffix(suffix)
-| Suffix labels with string suffix.
-|-
-| agg([func, axis])
-| Aggregate using one or more operations over the specified axis.
 |-
-| aggregate([func, axis])
-| Aggregate using one or more operations over the specified axis.
-|-
-| align(other[, join, axis, level, copy, …])
-| Align two objects on their axes with the specified join method.
-|-
-| all([axis, bool_only, skipna, level])
-| Return whether all elements are True, potentially over an axis.
-|-
-| any([axis, bool_only, skipna, level])
-| Return whether any element is True, potentially over an axis.
-|-
-| append(to_append[, ignore_index, …])
-| Concatenate two or more Series.
-|-
-| apply(func[, convert_dtype, args])
-| Invoke function on values of Series.
-|-
-| argmax([axis, skipna])
-| Return int position of the largest value in the Series.
-|-
-| argmin([axis, skipna])
-| Return int position of the smallest value in the Series.
-|-
-| argsort([axis, kind, order])
-| Return the integer indices that would sort the Series values.
-|-
-| asfreq(freq[, method, how, normalize, …])
-| Convert TimeSeries to specified frequency.
-|-
-| asof(where[, subset])
-| Return the last row(s) without any NaNs before where.
-|-
-| astype(dtype[, copy, errors])
-| Cast a pandas object to a specified dtype dtype.
-|-
-| at_time(time[, asof, axis])
-| Select values at particular time of day (e.g., 9:30AM).
-|-
-| autocorr([lag])
-| Compute the lag-N autocorrelation.
-|-
-| backfill([axis, inplace, limit, downcast])
-| Synonym for DataFrame.fillna() with method='bfill'.
-|-
-| between(left, right[, inclusive])
-| Return boolean Series equivalent to left <= series <= right.
-|-
-| between_time(start_time, end_time[, …])
-| Select values between particular times of the day (e.g., 9:00-9:30 AM).
-|-
-| bfill([axis, inplace, limit, downcast])
-| Synonym for DataFrame.fillna() with method='bfill'.
-|-
-| bool()
-| Return the bool of a single element Series or DataFrame.
-|-
-| cat
-| alias of pandas.core.arrays.categorical.CategoricalAccessor
-|-
-| clip([lower, upper, axis, inplace])
-| Trim values at input threshold(s).
-|-
-| combine(other, func[, fill_value])
-| Combine the Series with a Series or scalar according to func.
-|-
-| combine_first(other)
-| Combine Series values, choosing the calling Series’s values first.
-|-
-| compare(other[, align_axis, keep_shape, …])
-| Compare to another Series and show the differences.
-|-
-| convert_dtypes([infer_objects, …])
-| Convert columns to best possible dtypes using dtypes supporting pd.NA.
-|-
-| copy([deep])
-| Make a copy of this object’s indices and data.
-|-
-| corr(other[, method, min_periods])
-| Compute correlation with other Series, excluding missing values.
-|-
-| count([level])
-| Return number of non-NA/null observations in the Series.
-|-
-| cov(other[, min_periods, ddof])
-| Compute covariance with Series, excluding missing values.
-|-
-| cummax([axis, skipna])
-| Return cumulative maximum over a DataFrame or Series axis.
-|-
-| cummin([axis, skipna])
-| Return cumulative minimum over a DataFrame or Series axis.
-|-
-| cumprod([axis, skipna])
-| Return cumulative product over a DataFrame or Series axis.
-|-
-| cumsum([axis, skipna])
-| Return cumulative sum over a DataFrame or Series axis.
-|-
-| describe([percentiles, include, exclude, …])
-| Generate descriptive statistics.
-|-
-| diff([periods])
-| First discrete difference of element.
-|-
-| div(other[, level, fill_value, axis])
-| Return Floating division of series and other, element-wise (binary operator truediv).
-|-
-| divide(other[, level, fill_value, axis])
-| Return Floating division of series and other, element-wise (binary operator truediv).
-|-
-| divmod(other[, level, fill_value, axis])
-| Return Integer division and modulo of series and other, element-wise (binary operator divmod).
-|-
-| dot(other)
-| Compute the dot product between the Series and the columns of other.
-|-
-| drop([labels, axis, index, columns, level, …])
-| Return Series with specified index labels removed.
-|-
-| drop_duplicates([keep, inplace])
-| Return Series with duplicate values removed.
-|-
-| droplevel(level[, axis])
-| Return DataFrame with requested index / column level(s) removed.
-|-
-| dropna([axis, inplace, how])
-| Return a new Series with missing values removed.
-|-
-| dt
-| alias of pandas.core.indexes.accessors.CombinedDatetimelikeProperties
-|-
-| duplicated([keep])
-| Indicate duplicate Series values.
-|-
-| eq(other[, level, fill_value, axis])
-| Return Equal to of series and other, element-wise (binary operator eq).
-|-
-| equals(other)
-| Test whether two objects contain the same elements.
-|-
-| ewm([com, span, halflife, alpha, …])
-| Provide exponential weighted (EW) functions.
-|-
-| expanding([min_periods, center, axis])
-| Provide expanding transformations.
-|-
-| explode([ignore_index])
-| Transform each element of a list-like to a row.
-|-
-| factorize([sort, na_sentinel])
-| Encode the object as an enumerated type or categorical variable.
-|-
-| ffill([axis, inplace, limit, downcast])
-| Synonym for DataFrame.fillna() with method='ffill'.
-|-
-| fillna([value, method, axis, inplace, …])
-| Fill NA/NaN values using the specified method.
-|-
-| filter([items, like, regex, axis])
-| Subset the dataframe rows or columns according to the specified index labels.
-|-
-| first(offset)
-| Select initial periods of time series data based on a date offset.
-|-
-| first_valid_index()
-| Return index for first non-NA/null value.
-|-
-| floordiv(other[, level, fill_value, axis])
-| Return Integer division of series and other, element-wise (binary operator floordiv).
-|-
-| ge(other[, level, fill_value, axis])
-| Return Greater than or equal to of series and other, element-wise (binary operator ge).
-|-
-| get(key[, default])
-| Get item from object for given key (ex: DataFrame column).
-|-
-| groupby([by, axis, level, as_index, sort, …])
-| Group Series using a mapper or by a Series of columns.
-|-
-| gt(other[, level, fill_value, axis])
-| Return Greater than of series and other, element-wise (binary operator gt).
-|-
-| head([n])
-| Return the first n rows.
-|-
-| hist([by, ax, grid, xlabelsize, xrot, …])
-| Draw histogram of the input series using matplotlib.
-|-
-| idxmax([axis, skipna])
-| Return the row label of the maximum value.
-|-
-| idxmin([axis, skipna])
-| Return the row label of the minimum value.
-|-
-| infer_objects()
-| Attempt to infer better dtypes for object columns.
-|-
-| interpolate([method, axis, limit, inplace, …])
-| Please note that only method='linear' is supported for DataFrame/Series with a MultiIndex.
-|-
-| isin(values)
-| Whether elements in Series are contained in values.
-|-
-| isna()
-| Detect missing values.
-|-
-| isnull()
-| Detect missing values.
-|-
-| item()
-| Return the first element of the underlying data as a python scalar.
-|-
-| items()
-| Lazily iterate over (index, value) tuples.
-|-
-| iteritems()
-| Lazily iterate over (index, value) tuples.
-|-
-| keys()
-| Return alias for index.
-|-
-| kurt([axis, skipna, level, numeric_only])
-| Return unbiased kurtosis over requested axis.
-|-
-| kurtosis([axis, skipna, level, numeric_only])
-| Return unbiased kurtosis over requested axis.
-|-
-| last(offset)
-| Select final periods of time series data based on a date offset.
-|-
-| last_valid_index()
-| Return index for last non-NA/null value.
-|-
-| le(other[, level, fill_value, axis])
-| Return Less than or equal to of series and other, element-wise (binary operator le).
-|-
-| lt(other[, level, fill_value, axis])
-| Return Less than of series and other, element-wise (binary operator lt).
-|-
-| mad([axis, skipna, level])
-| Return the mean absolute deviation of the values for the requested axis.
-|-
-| map(arg[, na_action])
-| Map values of Series according to input correspondence.
-|-
-| mask(cond[, other, inplace, axis, level, …])
-| Replace values where the condition is True.
-|-
-| max([axis, skipna, level, numeric_only])
-| Return the maximum of the values for the requested axis.
-|-
-| mean([axis, skipna, level, numeric_only])
-| Return the mean of the values for the requested axis.
-|-
-| median([axis, skipna, level, numeric_only])
-| Return the median of the values for the requested axis.
-|-
-| memory_usage([index, deep])
-| Return the memory usage of the Series.
-|-
-| min([axis, skipna, level, numeric_only])
-| Return the minimum of the values for the requested axis.
-|-
-| mod(other[, level, fill_value, axis])
-| Return Modulo of series and other, element-wise (binary operator mod).
-|-
-| mode([dropna])
-| Return the mode(s) of the dataset.
-|-
-| mul(other[, level, fill_value, axis])
-| Return Multiplication of series and other, element-wise (binary operator mul).
-|-
-| multiply(other[, level, fill_value, axis])
-| Return Multiplication of series and other, element-wise (binary operator mul).
-|-
-| ne(other[, level, fill_value, axis])
-| Return Not equal to of series and other, element-wise (binary operator ne).
-|-
-| nlargest([n, keep])
-| Return the largest n elements.
-|-
-| notna()
-| Detect existing (non-missing) values.
-|-
-| notnull()
-| Detect existing (non-missing) values.
-|-
-| nsmallest([n, keep])
-| Return the smallest n elements.
-|-
-| nunique([dropna])
-| Return number of unique elements in the object.
-|-
-| pad([axis, inplace, limit, downcast])
-| Synonym for DataFrame.fillna() with method='ffill'.
-|-
-| pct_change([periods, fill_method, limit, freq])
-| Percentage change between the current and a prior element.
-|-
-| pipe(func, *args, **kwargs)
-| Apply func(self, *args, **kwargs).
-|-
-| plot
-| alias of pandas.plotting._core.PlotAccessor
-|-
-| pop(item)
-| Return item and drops from series.
-|-
-| pow(other[, level, fill_value, axis])
-| Return Exponential power of series and other, element-wise (binary operator pow).
-|-
-| prod([axis, skipna, level, numeric_only, …])
-| Return the product of the values for the requested axis.
-|-
-| product([axis, skipna, level, numeric_only, …])
-| Return the product of the values for the requested axis.
-|-
-| quantile([q, interpolation])
-| Return value at the given quantile.
-|-
-| radd(other[, level, fill_value, axis])
-| Return Addition of series and other, element-wise (binary operator radd).
-|-
-| rank([axis, method, numeric_only, …])
-| Compute numerical data ranks (1 through n) along axis.
-|-
-| ravel([order])
-| Return the flattened underlying data as an ndarray.
-|-
-| rdiv(other[, level, fill_value, axis])
-| Return Floating division of series and other, element-wise (binary operator rtruediv).
-|-
-| rdivmod(other[, level, fill_value, axis])
-| Return Integer division and modulo of series and other, element-wise (binary operator rdivmod).
-|-
-| reindex([index])
-| Conform Series to new index with optional filling logic.
-|-
-| reindex_like(other[, method, copy, limit, …])
-| Return an object with matching indices as other object.
-|-
-| rename([index, axis, copy, inplace, level, …])
-| Alter Series index labels or name.
-|-
-| rename_axis(**kwargs)
-| Set the name of the axis for the index or columns.
-|-
-| reorder_levels(order)
-| Rearrange index levels using input order.
-|-
-| repeat(repeats[, axis])
-| Repeat elements of a Series.
-|-
-| replace([to_replace, value, inplace, limit, …])
-| Replace values given in to_replace with value.
-|-
-| resample(rule[, axis, closed, label, …])
-| Resample time-series data.
-|-
-| reset_index([level, drop, name, inplace])
-| Generate a new DataFrame or Series with the index reset.
-|-
-| rfloordiv(other[, level, fill_value, axis])
-| Return Integer division of series and other, element-wise (binary operator rfloordiv).
-|-
-| rmod(other[, level, fill_value, axis])
-| Return Modulo of series and other, element-wise (binary operator rmod).
-|-
-| rmul(other[, level, fill_value, axis])
-| Return Multiplication of series and other, element-wise (binary operator rmul).
-|-
-| rolling(window[, min_periods, center, …])
-| Provide rolling window calculations.
-|-
-| round([decimals])
-| Round each value in a Series to the given number of decimals.
-|-
-| rpow(other[, level, fill_value, axis])
-| Return Exponential power of series and other, element-wise (binary operator rpow).
-|-
-| rsub(other[, level, fill_value, axis])
-| Return Subtraction of series and other, element-wise (binary operator rsub).
-|-
-| rtruediv(other[, level, fill_value, axis])
-| Return Floating division of series and other, element-wise (binary operator rtruediv).
-|-
-| sample([n, frac, replace, weights, …])
-| Return a random sample of items from an axis of object.
-|-
-| searchsorted(value[, side, sorter])
-| Find indices where elements should be inserted to maintain order.
-|-
-| sem([axis, skipna, level, ddof, numeric_only])
-| Return unbiased standard error of the mean over requested axis.
-|-
-| set_axis(labels[, axis, inplace])
-| Assign desired index to given axis.
-|-
-| shift([periods, freq, axis, fill_value])
-| Shift index by desired number of periods with an optional time freq.
-|-
-| skew([axis, skipna, level, numeric_only])
-| Return unbiased skew over requested axis.
-|-
-| slice_shift([periods, axis])
-| Equivalent to shift without copying data.
-|-
-| sort_index([axis, level, ascending, …])
-| Sort Series by index labels.
-|-
-| sort_values([axis, ascending, inplace, …])
-| Sort by the values.
-|-
-| sparse
-| alias of pandas.core.arrays.sparse.accessor.SparseAccessor
-|-
-| squeeze([axis])
-| Squeeze 1 dimensional axis objects into scalars.
-|-
-| std([axis, skipna, level, ddof, numeric_only])
-| Return sample standard deviation over requested axis.
-|-
-| str
-| alias of pandas.core.strings.StringMethods
-|-
-| sub(other[, level, fill_value, axis])
-| Return Subtraction of series and other, element-wise (binary operator sub).
-|-
-| subtract(other[, level, fill_value, axis])
-| Return Subtraction of series and other, element-wise (binary operator sub).
-|-
-| sum([axis, skipna, level, numeric_only, …])
-| Return the sum of the values for the requested axis.
-|-
-| swapaxes(axis1, axis2[, copy])
-| Interchange axes and swap values axes appropriately.
-|-
-| swaplevel([i, j, copy])
-| Swap levels i and j in a MultiIndex.
-|-
-| tail([n])
-| Return the last n rows.
-|-
-| take(indices[, axis, is_copy])
-| Return the elements in the given positional indices along an axis.
-|-
-| to_clipboard([excel, sep])
-| Copy object to the system clipboard.
-|-
-| to_csv([path_or_buf, sep, na_rep, …])
-| Write object to a comma-separated values (csv) file.
-|-
-| to_dict([into])
-| Convert Series to {label -> value} dict or dict-like object.
-|-
-| to_excel(excel_writer[, sheet_name, na_rep, …])
-| Write object to an Excel sheet.
-|-
-| to_frame([name])
-| Convert Series to DataFrame.
-|-
-| to_hdf(path_or_buf, key[, mode, complevel, …])
-| Write the contained data to an HDF5 file using HDFStore.
-|-
-| to_json([path_or_buf, orient, date_format, …])
-| Convert the object to a JSON string.
-|-
-| to_latex([buf, columns, col_space, header, …])
-| Render object to a LaTeX tabular, longtable, or nested table/tabular.
-|-
-| to_list()
-| Return a list of the values.
-|-
-| to_markdown([buf, mode, index])
-| Print Series in Markdown-friendly format.
-|-
-| to_numpy([dtype, copy, na_value])
-| A NumPy ndarray representing the values in this Series or Index.
-|-
-| to_period([freq, copy])
-| Convert Series from DatetimeIndex to PeriodIndex.
-|-
-| to_pickle(path[, compression, protocol])
-| Pickle (serialize) object to file.
-|-
-| to_sql(name, con[, schema, if_exists, …])
-| Write records stored in a DataFrame to a SQL database.
-|-
-| to_string([buf, na_rep, float_format, …])
-| Render a string representation of the Series.
-|-
-| to_timestamp([freq, how, copy])
-| Cast to DatetimeIndex of Timestamps, at beginning of period.
-|-
-| to_xarray()
-| Return an xarray object from the pandas object.
-|-
-| tolist()
-| Return a list of the values.
-|-
-| transform(func[, axis])
-| Call func on self producing a Series with transformed values.
-|-
-| transpose(*args, **kwargs)
-| Return the transpose, which is by definition self.
-|-
-| truediv(other[, level, fill_value, axis])
-| Return Floating division of series and other, element-wise (binary operator truediv).
-|-
-| truncate([before, after, axis, copy])
-| Truncate a Series or DataFrame before and after some index value.
-|-
-| tshift([periods, freq, axis])
-| (DEPRECATED) Shift the time index, using the index’s frequency if available.
-|-
-| tz_convert(tz[, axis, level, copy])
-| Convert tz-aware axis to target time zone.
-|-
-| tz_localize(tz[, axis, level, copy, …])
-| Localize tz-naive index of a Series or DataFrame to target time zone.
-|-
-| unique()
-| Return unique values of Series object.
-|-
-| unstack([level, fill_value])
-| Unstack, also known as pivot, Series with MultiIndex to produce DataFrame.
-|-
-| update(other)
-| Modify Series in place using values from passed Series.
-|-
-| value_counts([normalize, sort, ascending, …])
-| Return a Series containing counts of unique values.
-|-
-| var([axis, skipna, level, ddof, numeric_only])
-| Return unbiased variance over requested axis.
-|-
-| view([dtype])
-| Create a new view of the Series.
-|-
-| where(cond[, other, inplace, axis, level, …])
-| Replace values where the condition is False.
-|-
-| xs(key[, axis, level, drop_level])
-| Return cross-section from the Series/DataFrame.
 |}
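-{{了解更多|[https://pandas.pydata.org/docs/reference/api/pandas.Series.html#pandas.Series Pandas API: pandas.Series]}}
-===DataFrame===
-A DataFrame is a labeled two-dimensional data structure whose columns may have different types. It consists of the data, the row labels (index), and the column labels (columns). You can think of it as a spreadsheet, a SQL table, or a dict of Series objects. It is generally the most commonly used pandas object.
-{{了解更多|[https://pandas.pydata.org/docs/user_guide/dsintro.html#dataframe pandas documentation: User Guide - DataFrame]}}
-====Creating a DataFrame====
-There are several ways to create a DataFrame object:
-* Use the <code>pandas.DataFrame()</code> constructor.
-* Use the <code>pandas.DataFrame.from_dict()</code> method, which works much like the constructor.
-* Use the <code>pandas.DataFrame.from_records()</code> method, which works much like the constructor.
-* Use a reader function to build one from a file, for example <code>pandas.read_csv()</code> to load a CSV file into a DataFrame.
-The signature of the <code>pandas.DataFrame()</code> constructor is:
- pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
 ==Plotting with pandas==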