2022-09-26
pandas Basics (Part 1): Series
Study notes, built mainly around examples. Development tool: Spyder
Contents
Introduction to pandas
Series
Creating a Series
Accessing data in a Series
Date handling in pandas
DatetimeIndex
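Introduction to pandas
pandas is a tool built on NumPy and created for data analysis tasks. It incorporates a large number of libraries and some standard data models, and provides the tools needed to work efficiently with large structured datasets.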
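Series
A Series can be thought of as a one-dimensional array whose index labels you can set yourself. It resembles a fixed-length ordered dictionary, with an index and values.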
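To make the "array plus dictionary" description concrete, here is a minimal sketch. The names and scores are made-up illustration data, not taken from the notes.

```python
import pandas as pd

# A small Series with custom string labels (illustrative data)
s = pd.Series([90, 85, 77], index=["Ada", "Bunny", "Jack"])

labels = list(s.index)    # the index holds the labels
values = s.to_numpy()     # the values are stored as a NumPy array
has_ada = "Ada" in s      # membership tests look at the index, like dict keys

print(labels)
print(values)
print(has_ada)
```

The length is fixed once the Series is created, which is why the dictionary comparison says "fixed-length".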
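Creating a Series
Syntax
    import numpy as np
    import pandas as pd

    # Create an empty Series (an explicit dtype avoids a warning on some pandas versions)
    s = pd.Series(dtype="float64")

    # Create a Series from an ndarray
    data = np.array(['a', 'b', 'c', 'd'])
    s = pd.Series(data)
    s = pd.Series(data, index=[100, 101, 102, 103])

    # Create a Series from a dict
    data = {'a': 0., 'b': 1., 'c': 2.}
    s = pd.Series(data)

    # Create a Series from a scalar
    s = pd.Series(5, index=[0, 1, 2, 3])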
Examples
Code 1 (create a Series from an ndarray):
    import numpy as np
    import pandas as pd

    data = np.array(['Ada', 'Bunny', 'Jack', 'Black'])
    s1 = pd.Series(data)
    print(s1)
Result 1:
    0      Ada
    1    Bunny
    2     Jack
    3    Black
    dtype: object
Code 2 (custom index):
    s2 = pd.Series(data, index=[10, 20, 30, 40])
    print(s2)
Result 2:
    10      Ada
    20    Bunny
    30     Jack
    40    Black
    dtype: object
Code 3 (create a Series from a dict):
    data = {"a": 0, "b": 1, "c": 2, 'e': 3}
    # The dict keys become the index of the Series
    s3 = pd.Series(data)
    print(s3)
Result 3:
    a    0
    b    1
    c    2
    e    3
    dtype: int64
Code 4 (create a Series from a scalar):
    s4 = pd.Series(10, index=[0, 1, 2, 3])
    print(s4)
Result 4:
    0    10
    1    10
    2    10
    3    10
    dtype: int64
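One detail the dict example does not show is what happens when a dict and an explicit index are passed together. The following is a small sketch with made-up keys: pandas looks up each index label in the dict, and labels with no matching key get NaN.

```python
import pandas as pd

data = {"a": 0, "b": 1, "c": 2}

# The index selects and orders the dict entries; "z" is not a key in data
s = pd.Series(data, index=["c", "a", "z"])

print(s)
```

Because NaN is a floating-point value, the resulting Series has dtype float64 rather than int64.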
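Accessing data in a Series
Syntax
    # Retrieve elements by position
    # (.iloc makes positional access explicit; s[0] on a non-integer index is deprecated in recent pandas)
    s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
    print(s.iloc[0], s.iloc[:3], s.iloc[-3:])
    # Retrieve elements by label
    print(s['a'], s[['a', 'c', 'd']])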
Examples
Code 1:
    import numpy as np
    import pandas as pd

    data = np.array(['Ada', 'Bunny', 'Jack', 'Black'])
    s = pd.Series(data, index=["a", "b", "c", "d"])
    print(s.iloc[0], '\n\n', s.iloc[:3], '\n\n', s.iloc[-3:])
Result 1:
    Ada

     a      Ada
    b    Bunny
    c     Jack
    dtype: object

     b    Bunny
    c     Jack
    d    Black
    dtype: object
Code 2:
    print(s["a"], '\n\n', s[["a", "b", "c"]])
Result 2:
    Ada

     a      Ada
    b    Bunny
    c     Jack
    dtype: object
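The two access styles above can also be written with the explicit .iloc (position) and .loc (label) accessors. This is a minimal sketch using the same four names. Note that a positional slice excludes its end, while a label slice includes it.

```python
import pandas as pd

s = pd.Series(["Ada", "Bunny", "Jack", "Black"], index=["a", "b", "c", "d"])

first_by_position = s.iloc[0]     # position 0
first_by_label = s.loc["a"]       # label "a"
position_slice = s.iloc[:3]       # positions 0, 1, 2 (end excluded)
label_slice = s.loc["a":"c"]      # labels "a" through "c" (end included)

print(first_by_position, first_by_label)
print(position_slice)
print(label_slice)
```

Both slices return the same three rows here, even though one stops before position 3 and the other stops at label "c".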
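Date handling in pandas
Syntax
    # Date string formats that pandas can recognize
    dates = pd.Series(['2011', '2011-02', '2011-03-01', '2011/04/01', '2011/05/01 01:01:01', '01 Jun 2011'])
    # to_datetime() converts them to a datetime dtype.
    # On pandas 2.0 and later, mixed formats need format='mixed'; on older versions, omit the format argument.
    dates = pd.to_datetime(dates, format='mixed')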
Examples
Code 1 (recognizing dates):
    import pandas as pd

    dates = pd.Series(['1997', '2015-09', '2019-03-01', '2019/04/01', '2019/05/01 01:01:01', '01 Jun 2019'])
    print(dates)
    print("-"*20)
    # format='mixed' requires pandas 2.0 or later; omit it on older versions
    dates = pd.to_datetime(dates, format='mixed')
    print(dates)
Result 1:
    0                   1997
    1                2015-09
    2             2019-03-01
    3             2019/04/01
    4    2019/05/01 01:01:01
    5            01 Jun 2019
    dtype: object
    --------------------
    0   1997-01-01 00:00:00
    1   2015-09-01 00:00:00
    2   2019-03-01 00:00:00
    3   2019-04-01 00:00:00
    4   2019-05-01 01:01:01
    5   2019-06-01 00:00:00
    dtype: datetime64[ns]
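When every string is expected to follow one format, passing that format explicitly is a common alternative to format inference. The sketch below uses made-up input, including one deliberately invalid entry, and relies on errors="coerce" to turn unparseable strings into NaT instead of raising.

```python
import pandas as pd

raw = pd.Series(["2019-03-01", "2019-04-01", "not a date"])

# Parse with a fixed format; entries that do not match become NaT
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")
n_bad = int(parsed.isna().sum())

print(parsed)
print("unparseable entries:", n_bad)
```

Counting the NaT values afterwards is a simple way to check how much of the input failed to parse.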
Code 2 (date arithmetic):
    delta = dates - pd.to_datetime('1970-01-01')
    print(delta)
    print("-"*20)
    # The dt accessor of a Series exposes the components of each offset
    print(delta.dt.days)
Result 2:
    0    9862 days 00:00:00
    1   16679 days 00:00:00
    2   17956 days 00:00:00
    3   17987 days 00:00:00
    4   18017 days 01:01:01
    5   18048 days 00:00:00
    dtype: timedelta64[ns]
    --------------------
    0     9862
    1    16679
    2    17956
    3    17987
    4    18017
    5    18048
    dtype: int64
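Date arithmetic also works in the other direction: adding a Timedelta shifts every date. This is a small sketch on three of the dates from the example, written in ISO format so that it does not depend on format inference.

```python
import pandas as pd

dates = pd.to_datetime(pd.Series(["2019-03-01", "2019-04-01", "2019-05-01"]))
epoch = pd.Timestamp("1970-01-01")

# Subtracting a timestamp gives a timedelta Series
days_since_epoch = (dates - epoch).dt.days.tolist()

# Adding a Timedelta shifts each date by one week
one_week_later = dates + pd.Timedelta(days=7)

print(days_since_epoch)
print(one_week_later)
```

The day counts match rows 2 to 4 of Result 2, apart from the 01:01:01 time component on the last date there.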
Series.dt provides many date-related operations; some of them are listed below:
Series.dt attribute | Meaning |
Series.dt.year | The year of the datetime. |
Series.dt.month | The month as January=1, December=12. |
Series.dt.day | The days of the datetime. |
Series.dt.hour | The hours of the datetime. |
Series.dt.minute | The minutes of the datetime. |
Series.dt.second | The seconds of the datetime. |
Series.dt.microsecond | The microseconds of the datetime. |
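Series.dt.week | The week ordinal of the year. Removed in pandas 2.0; use Series.dt.isocalendar().week instead. |
Series.dt.weekofyear | The week ordinal of the year. Removed in pandas 2.0; use Series.dt.isocalendar().week instead. |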
Series.dt.dayofweek | The day of the week with Monday=0, Sunday=6. |
Series.dt.weekday | The day of the week with Monday=0, Sunday=6. |
Series.dt.dayofyear | The ordinal day of the year. |
Series.dt.quarter | The quarter of the date. |
Series.dt.is_month_start | Indicates whether the date is the first day of the month. |
Series.dt.is_month_end | Indicates whether the date is the last day of the month. |
Series.dt.is_quarter_start | Indicator for whether the date is the first day of a quarter. |
Series.dt.is_quarter_end | Indicator for whether the date is the last day of a quarter. |
Series.dt.is_year_start | Indicate whether the date is the first day of a year. |
Series.dt.is_year_end | Indicate whether the date is the last day of the year. |
Series.dt.is_leap_year | Boolean indicator if the date belongs to a leap year. |
Series.dt.days_in_month | The number of days in the month. |
Code 3 (demonstrating the dt accessor):
    print(dates.dt.month)
Result 3:
    0    1
    1    9
    2    3
    3    4
    4    5
    5    6
    dtype: int64
(pandas 2.0 and later report dtype: int32 here.)
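A few more of the listed attributes can be tried the same way. The three dates below are chosen for illustration (a leap day, a month start, and a year end). The week number is read through isocalendar().week, since Series.dt.week is not available in recent pandas versions.

```python
import pandas as pd

dates = pd.to_datetime(pd.Series(["2020-02-29", "2019-03-01", "2019-12-31"]))

years = dates.dt.year.tolist()
weekdays = dates.dt.dayofweek.tolist()            # Monday=0, Sunday=6
month_starts = dates.dt.is_month_start.tolist()
leap_years = dates.dt.is_leap_year.tolist()
month_lengths = dates.dt.days_in_month.tolist()
iso_weeks = dates.dt.isocalendar().week.tolist()  # ISO 8601 week number

print(years, weekdays, month_starts, leap_years, month_lengths, iso_weeks)
```

Note that 2019-12-31 falls in ISO week 1, because ISO weeks start on Monday and that week belongs to 2020.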
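DatetimeIndex
By specifying a number of periods and a frequency, pd.date_range() creates a sequence of dates.
Syntax
    from datetime import datetime
    import pandas as pd

    # Daily frequency (the default), starting at 2019/08/21, 5 timestamps
    datelist = pd.date_range('2019/08/21', periods=5)
    # Month-end frequency ('M' is spelled 'ME' from pandas 2.2 on)
    datelist = pd.date_range('2019/08/21', periods=5, freq='M')
    # Build a time series over an interval
    # (pd.datetime is no longer available in recent pandas; use the standard library datetime)
    start = datetime(2017, 11, 1)
    end = datetime(2017, 11, 5)
    dates = pd.date_range(start, end)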
Examples
Code 1:
    from datetime import datetime
    import pandas as pd

    dates1 = pd.date_range('2020-01-01', periods=5, freq='D')
    print(dates1)
    print("-"*20)
    # 'M' means month end; pandas 2.2 and later spell it 'ME'
    dates2 = pd.date_range('2015-01-10', periods=5, freq='M')
    print(dates2)
    print("-"*20)
    # pd.datetime is no longer available in recent pandas; use datetime.datetime
    start_num = datetime(2019, 1, 1)
    end_num = datetime(2019, 1, 5)
    dates3 = pd.date_range(start_num, end_num)
    print(dates3)
Result 1:
    DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04', '2020-01-05'], dtype='datetime64[ns]', freq='D')
    --------------------
    DatetimeIndex(['2015-01-31', '2015-02-28', '2015-03-31', '2015-04-30', '2015-05-31'], dtype='datetime64[ns]', freq='M')
    --------------------
    DatetimeIndex(['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04', '2019-01-05'], dtype='datetime64[ns]', freq='D')
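A DatetimeIndex is typically used as the index of a Series. The sketch below assumes illustrative integer values. It checks that a start/end range and a start/periods range describe the same dates, builds a range with a two-day step, and then selects rows by date strings.

```python
import pandas as pd

# Two ways to describe the same five days
by_range = pd.date_range("2019-01-01", "2019-01-05")
by_periods = pd.date_range("2019-01-01", periods=5)
same = by_range.equals(by_periods)

# A multiple of a frequency: every second day
every_other = pd.date_range("2019-01-01", periods=3, freq="2D")

# Use the dates as a Series index and slice by date strings (end included)
s = pd.Series(range(5), index=by_periods)
window = s.loc["2019-01-02":"2019-01-03"]

print(same)
print(every_other)
print(window)
```

As with other label slices, the date-string slice includes its end point.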
Code 2:
    dates1 = pd.bdate_range('2020-01-01', periods=10)
    print(dates1)
Note: bdate_range() produces a range of business days. Unlike date_range(), it excludes Saturdays and Sundays.
Result 2:
    DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-06', '2020-01-07', '2020-01-08', '2020-01-09', '2020-01-10', '2020-01-13', '2020-01-14'], dtype='datetime64[ns]', freq='B')
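To confirm that the weekends are really skipped, the business-day range can be compared with the full calendar range over the same span. This is a minimal sketch using the same start date and number of periods as Code 2.

```python
import pandas as pd

business_days = pd.bdate_range("2020-01-01", periods=10)

# dayofweek uses Monday=0 ... Sunday=6, so business days are all below 5
weekdays = business_days.dayofweek.tolist()

# Calendar days in the same span that bdate_range left out
all_days = pd.date_range(business_days[0], business_days[-1])
skipped = [d.strftime("%Y-%m-%d") for d in all_days.difference(business_days)]

print(weekdays)
print(skipped)
```

The skipped dates are the two weekends that fall between 2020-01-01 and 2020-01-14.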