pandas Basics (Part 1): Series

Reader submission, 2022-09-26

Study notes, built mostly around examples. Development tool: Spyder

Contents

Introduction to pandas
Series
    Creating a Series
    Accessing data in a Series
Date handling in pandas
    DatetimeIndex

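Introduction to pandas

pandas is a library built on NumPy and created for data-analysis tasks. It incorporates a number of libraries and some standard data models, and it provides the tools needed to work efficiently with large structured datasets.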

Series

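A Series can be thought of as a one-dimensional array whose index labels you can set yourself. It resembles a fixed-length, ordered dictionary made up of an index and values.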
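The dictionary analogy can be sketched as follows. The subject names and scores are made up for illustration and do not come from the original notes.

```python
import pandas as pd

# A Series pairs each value with an index label.
s = pd.Series([90, 85, 77], index=["math", "physics", "art"])

print(s.index.tolist())   # the labels
print(s.values.tolist())  # the underlying values

# Dictionary-like behaviour: membership tests and lookups go through the labels.
print("math" in s)
print(s["physics"])

# The labels can be reassigned later, as long as the new list has the same length.
s.index = ["m", "p", "a"]
print(s)
```

Unlike a plain dict, the values keep their order and support positional slicing, which the later sections rely on.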

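Creating a Series

Syntax

    import numpy as np
    import pandas as pd
    # Create an empty Series (an explicit dtype avoids a warning on some pandas versions)
    s = pd.Series(dtype='float64')
    # Create a Series from an ndarray
    data = np.array(['a', 'b', 'c', 'd'])
    s = pd.Series(data)
    s = pd.Series(data, index=[100, 101, 102, 103])
    # Create a Series from a dictionary
    data = {'a': 0., 'b': 1., 'c': 2.}
    s = pd.Series(data)
    # Create a Series from a scalar
    s = pd.Series(5, index=[0, 1, 2, 3])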

Examples

Code 1 (creating a Series from an ndarray):

    import numpy as np
    import pandas as pd
    data = np.array(['Ada', 'Bunny', 'Jack', 'Black'])
    s1 = pd.Series(data)
    print(s1)

Output 1:

    0      Ada
    1    Bunny
    2     Jack
    3    Black
    dtype: object

Code 2 (custom index):

    s2 = pd.Series(data, index=[10, 20, 30, 40])
    print(s2)

Output 2:

    10      Ada
    20    Bunny
    30     Jack
    40    Black
    dtype: object

Code 3 (creating a Series from a dictionary):

    data = {"a": 0, "b": 1, "c": 2, 'e': 3}
    # The dictionary keys become the index of the Series
    s3 = pd.Series(data)
    print(s3)

Output 3:

    a    0
    b    1
    c    2
    e    3
    dtype: int64

Code 4 (creating a Series from a scalar):

    s4 = pd.Series(10, index=[0, 1, 2, 3])
    print(s4)

Output 4:

    0    10
    1    10
    2    10
    3    10
    dtype: int64
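One case the examples above do not cover is passing both a dictionary and an explicit index. As a small sketch (the labels here are chosen only for illustration): pandas keeps the labels listed in `index`, and any label missing from the dictionary gets NaN.

```python
import pandas as pd

data = {"a": 0, "b": 1, "c": 2}

# Only the labels listed in `index` are kept, in that order.
# "d" is not a key of `data`, so its value becomes NaN,
# which also turns the dtype from int64 into float64.
s = pd.Series(data, index=["b", "c", "d"])
print(s)
```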

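Accessing data in a Series

Syntax

    import pandas as pd
    # Retrieve elements by position
    # (s[0] on a label-indexed Series is deprecated in recent pandas; use .iloc)
    s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
    print(s.iloc[0], s[:3], s[-3:])
    # Retrieve data by label
    print(s['a'], s[['a', 'c', 'd']])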

Examples

Code 1:

    import numpy as np
    import pandas as pd
    data = np.array(['Ada', 'Bunny', 'Jack', 'Black'])
    s = pd.Series(data, index=["a", "b", "c", "d"])
    # .iloc[0] selects by position; s[0] is deprecated for a label-indexed Series
    print(s.iloc[0], '\n\n', s[:3], '\n\n', s[-3:])

Output 1:

    Ada

     a      Ada
    b    Bunny
    c     Jack
    dtype: object

     b    Bunny
    c     Jack
    d    Black
    dtype: object

Code 2:

    print(s["a"], '\n\n', s[["a", "b", "c"]])

Output 2:

    Ada

     a      Ada
    b    Bunny
    c     Jack
    dtype: object
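The two ways of reaching into a Series, by position and by label, can be made explicit with `.iloc` and `.loc`. This is a minimal sketch reusing the names from the examples. The variable names are illustrative.

```python
import pandas as pd

s = pd.Series(["Ada", "Bunny", "Jack", "Black"], index=["a", "b", "c", "d"])

first = s.iloc[0]             # by position
head = s.iloc[:3]             # positional slice: the end position is excluded
by_label = s.loc["a"]         # by label
label_slice = s.loc["a":"c"]  # label slice: the end label is included

print(first, by_label)
print(head)
print(label_slice)
```

Note that the positional slice `[:3]` and the label slice `"a":"c"` select the same three rows here, because label slices include their end point.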

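Date handling in pandas

Syntax

    import pandas as pd
    # Date string formats that pandas can recognize
    dates = pd.Series(['2011', '2011-02', '2011-03-01', '2011/04/01', '2011/05/01 01:01:01', '01 Jun 2011'])
    # to_datetime() converts them to a datetime dtype
    # (pandas 2.0+ needs format='mixed' when the strings differ in format; omit it on older versions)
    dates = pd.to_datetime(dates, format='mixed')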

Examples

Code 1 (recognizing dates):

    import numpy as np
    import pandas as pd
    dates = pd.Series(['1997', '2015-09', '2019-03-01', '2019/04/01', '2019/05/01 01:01:01', '01 Jun 2019'])
    print(dates)
    print("-"*20)
    # pandas 2.0+ needs format='mixed' for strings in different formats; omit it on older versions
    dates = pd.to_datetime(dates, format='mixed')
    print(dates)

Output 1:

    0                   1997
    1                2015-09
    2             2019-03-01
    3             2019/04/01
    4    2019/05/01 01:01:01
    5            01 Jun 2019
    dtype: object
    --------------------
    0   1997-01-01 00:00:00
    1   2015-09-01 00:00:00
    2   2019-03-01 00:00:00
    3   2019-04-01 00:00:00
    4   2019-05-01 01:01:01
    5   2019-06-01 00:00:00
    dtype: datetime64[ns]

Code 2 (date arithmetic):

    delta = dates - pd.to_datetime('1970-01-01')
    print(delta)
    print("-"*20)
    # The Series .dt accessor exposes the components of the offsets
    print(delta.dt.days)

Output 2:

    0    9862 days 00:00:00
    1   16679 days 00:00:00
    2   17956 days 00:00:00
    3   17987 days 00:00:00
    4   18017 days 01:01:01
    5   18048 days 00:00:00
    dtype: timedelta64[ns]
    --------------------
    0     9862
    1    16679
    2    17956
    3    17987
    4    18017
    5    18048
    dtype: int64
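Note that `.dt.days` keeps only the whole days and drops the `01:01:01` part of the fifth row. A short sketch of two related operations, using two illustrative dates rather than the full list above: `total_seconds()` keeps the full duration, and adding a `pd.Timedelta` shifts every date.

```python
import pandas as pd

# Both strings use the same format, so no format argument is needed.
dates = pd.Series(pd.to_datetime(["2019-03-01 00:00:00", "2019-05-01 01:01:01"]))
delta = dates - pd.Timestamp("2019-01-01")

days = delta.dt.days                      # whole days only
hours = delta.dt.total_seconds() / 3600   # full duration, expressed in hours
shifted = dates + pd.Timedelta(days=7)    # every date moved one week later

print(days.tolist())
print(hours.tolist())
print(shifted)
```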

Series.dt provides many date-related operations. Some of them are listed below:

    Series.dt attribute            Meaning
    Series.dt.year                 The year of the datetime.
    Series.dt.month                The month as January=1, December=12.
    Series.dt.day                  The day of the datetime.
    Series.dt.hour                 The hours of the datetime.
    Series.dt.minute               The minutes of the datetime.
    Series.dt.second               The seconds of the datetime.
    Series.dt.microsecond          The microseconds of the datetime.
    Series.dt.week                 The week ordinal of the year (deprecated in pandas 1.1, removed in 2.0; use Series.dt.isocalendar().week).
    Series.dt.weekofyear           The week ordinal of the year (deprecated in pandas 1.1, removed in 2.0; use Series.dt.isocalendar().week).
    Series.dt.dayofweek            The day of the week with Monday=0, Sunday=6.
    Series.dt.weekday              The day of the week with Monday=0, Sunday=6.
    Series.dt.dayofyear            The ordinal day of the year.
    Series.dt.quarter              The quarter of the date.
    Series.dt.is_month_start       Indicates whether the date is the first day of the month.
    Series.dt.is_month_end         Indicates whether the date is the last day of the month.
    Series.dt.is_quarter_start     Indicates whether the date is the first day of a quarter.
    Series.dt.is_quarter_end       Indicates whether the date is the last day of a quarter.
    Series.dt.is_year_start        Indicates whether the date is the first day of a year.
    Series.dt.is_year_end          Indicates whether the date is the last day of the year.
    Series.dt.is_leap_year         Indicates whether the date belongs to a leap year.
    Series.dt.days_in_month        The number of days in the month.

Code 3 (demonstrating the dt accessor):

    print(dates.dt.month)

Output 3:

    0    1
    1    9
    2    3
    3    4
    4    5
    5    6
    dtype: int64
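The other attributes work the same way as `.dt.month`. Below is a short sketch on three illustrative dates (not the list used above), including `isocalendar().week`, which replaces `.dt.week` on newer pandas versions.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["1997-01-01", "2019-05-01", "2020-02-29"]))

years = dates.dt.year
quarters = dates.dt.quarter
weekdays = dates.dt.dayofweek           # Monday=0, Sunday=6
month_end = dates.dt.is_month_end
leap = dates.dt.is_leap_year
iso_week = dates.dt.isocalendar().week  # ISO 8601 week number

print(years.tolist())      # [1997, 2019, 2020]
print(quarters.tolist())   # [1, 2, 1]
print(weekdays.tolist())   # [2, 2, 5]
print(month_end.tolist())  # [False, False, True]
print(leap.tolist())       # [False, False, True]
print(iso_week.tolist())   # [1, 18, 9]
```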

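DatetimeIndex

By specifying the number of periods and the frequency, pd.date_range() can create a sequence of dates.

Syntax

    import pandas as pd
    from datetime import datetime
    # Daily frequency (the default), starting at 2019/08/21, 5 dates
    datelist = pd.date_range('2019/08/21', periods=5)
    # Monthly (month-end) frequency; pandas 2.2+ spells this alias 'ME'
    datelist = pd.date_range('2019/08/21', periods=5, freq='M')
    # Build a date sequence over an interval
    # (pd.datetime was removed in pandas 2.0; use the standard library's datetime)
    start = datetime(2017, 11, 1)
    end = datetime(2017, 11, 5)
    dates = pd.date_range(start, end)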

Examples

Code 1:

    import numpy as np
    import pandas as pd
    from datetime import datetime
    dates1 = pd.date_range('2020-01-01', periods=5, freq='D')
    print(dates1)
    print("-"*20)
    # 'M' is the month-end alias; pandas 2.2+ spells it 'ME'
    dates2 = pd.date_range('2015-01-10', periods=5, freq='M')
    print(dates2)
    print("-"*20)
    # pd.datetime was removed in pandas 2.0; use the standard library's datetime
    start_num = datetime(2019, 1, 1)
    end_num = datetime(2019, 1, 5)
    dates3 = pd.date_range(start_num, end_num)
    print(dates3)

Output 1:

    DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
                   '2020-01-05'],
                  dtype='datetime64[ns]', freq='D')
    --------------------
    DatetimeIndex(['2015-01-31', '2015-02-28', '2015-03-31', '2015-04-30',
                   '2015-05-31'],
                  dtype='datetime64[ns]', freq='M')
    --------------------
    DatetimeIndex(['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04',
                   '2019-01-05'],
                  dtype='datetime64[ns]', freq='D')

Code 2:

    dates1 = pd.bdate_range('2020-01-01', periods=10)
    print(dates1)

Note: bdate_range() produces a range of business dates. Unlike date_range(), it excludes Saturdays and Sundays.

Output 2:

    DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-06',
                   '2020-01-07', '2020-01-08', '2020-01-09', '2020-01-10',
                   '2020-01-13', '2020-01-14'],
                  dtype='datetime64[ns]', freq='B')
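To see the difference between the two functions directly, the sketch below builds both ranges from the same start date and lists the days that only date_range() produces. It also shows one common use of a DatetimeIndex, as the index of a Series. The variable names are illustrative.

```python
import pandas as pd

all_days = pd.date_range("2020-01-01", periods=7)        # every calendar day
business_days = pd.bdate_range("2020-01-01", periods=7)  # Monday to Friday only

# Dates present in all_days but absent from business_days
skipped = all_days.difference(business_days)
print(skipped.day_name().tolist())

# A DatetimeIndex can label a Series, which allows lookups by date string.
s = pd.Series(range(7), index=all_days)
print(s.loc["2020-01-04"])
```

Because both ranges use `periods=7`, the business-day range reaches further into the calendar (to 2020-01-09) than the plain daily range (to 2020-01-07).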
