Pandas DataFrame reindex: Resetting the Row Index


Reader submission 1461 2022-05-30

Course name and link

[AI Basics Course: Common Frameworks and Tools]

Environment

* ModelArts

* notebook - Multi-Engine 2.0 (python3)

* JupyterLab - Notebook - Conda-python3


* pandas 0.22.0

Resetting the row index with reindex

    import pandas as pd
    import numpy as np

    my_df = pd.DataFrame(data=np.arange(20).reshape(4, 5),  # 4x5 matrix
                         index=list("acef"),                # row labels; b and d are missing and will be added by reindex
                         columns=list("ABCDE"))             # column labels
    print("my_df\n", my_df)

    '''
    reindex(
        labels=None, index=None, columns=None, axis=None, method=None, copy=True, level=None, fill_value=nan, limit=None, tolerance=None)
    '''

    # Reindex the rows.
    # Rows that are new (b, d, g) are filled with NaN.
    # Be sure to read the help text for the full parameter list.
    print(my_df.reindex(list("abcdefg")))

    my_df
         A   B   C   D   E
    a   0   1   2   3   4
    c   5   6   7   8   9
    e  10  11  12  13  14
    f  15  16  17  18  19
          A     B     C     D     E
    a   0.0   1.0   2.0   3.0   4.0
    b   NaN   NaN   NaN   NaN   NaN
    c   5.0   6.0   7.0   8.0   9.0
    d   NaN   NaN   NaN   NaN   NaN
    e  10.0  11.0  12.0  13.0  14.0
    f  15.0  16.0  17.0  18.0  19.0
    g   NaN   NaN   NaN   NaN   NaN
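In the output above, every column changes from integer to float after reindexing, because NaN cannot be stored in an integer column. The following is a minimal sketch, assuming the same `my_df` as in the example, of how `fill_value` can avoid that. The names `with_nan` and `with_zero` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Same frame as in the example above: rows a, c, e, f and columns A..E.
my_df = pd.DataFrame(
    data=np.arange(20).reshape(4, 5),
    index=list("acef"),
    columns=list("ABCDE"),
)

# Default behaviour: the new rows b, d, g hold NaN, so the columns become float.
with_nan = my_df.reindex(index=list("abcdefg"))

# With fill_value=0 the new rows hold 0, so the integer dtype is kept.
with_zero = my_df.reindex(index=list("abcdefg"), fill_value=0)

print(with_nan.dtypes)
print(with_zero.dtypes)

# reindex returns a new object; my_df itself still has its original 4 rows.
print(my_df.shape)
```

Passing the labels through the `index=` keyword rather than positionally follows the recommendation in the help text to state the target axis explicitly.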

help

    help(my_df.reindex)

    Help on method reindex in module pandas.core.frame:

    reindex(labels=None, index=None, columns=None, axis=None, method=None, copy=True, level=None, fill_value=nan, limit=None, tolerance=None) method of pandas.core.frame.DataFrame instance
        Conform DataFrame to new index with optional filling logic, placing
        NA/NaN in locations having no value in the previous index. A new object
        is produced unless the new index is equivalent to the current one and
        copy=False

        Parameters
        ----------
        labels : array-like, optional
            New labels / index to conform the axis specified by 'axis' to.
        index, columns : array-like, optional (should be specified using keywords)
            New labels / index to conform to. Preferably an Index object to
            avoid duplicating data
        axis : int or str, optional
            Axis to target. Can be either the axis name ('index', 'columns')
            or number (0, 1).
        method : {None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}, optional
            method to use for filling holes in reindexed DataFrame.
            Please note: this is only applicable to DataFrames/Series with a
            monotonically increasing/decreasing index.

            * default: don't fill gaps
            * pad / ffill: propagate last valid observation forward to next
              valid
            * backfill / bfill: use next valid observation to fill gap
            * nearest: use nearest valid observations to fill gap

        copy : boolean, default True
            Return a new object, even if the passed indexes are the same
        level : int or name
            Broadcast across a level, matching Index values on the
            passed MultiIndex level
        fill_value : scalar, default np.NaN
            Value to use for missing values. Defaults to NaN, but can be any
            "compatible" value
        limit : int, default None
            Maximum number of consecutive elements to forward or backward fill
        tolerance : optional
            Maximum distance between original and new labels for inexact
            matches. The values of the index at the matching locations most
            satisfy the equation ``abs(index[indexer] - target) <= tolerance``.

            Tolerance may be a scalar value, which applies the same tolerance
            to all values, or list-like, which applies variable tolerance per
            element. List-like includes list, tuple, array, Series, and must be
            the same size as the index and its dtype must exactly match the
            index's type.

            .. versionadded:: 0.17.0
            .. versionadded:: 0.21.0 (list-like tolerance)

        Examples
        --------

        ``DataFrame.reindex`` supports two calling conventions

        * ``(index=index_labels, columns=column_labels, ...)``
        * ``(labels, axis={'index', 'columns'}, ...)``

        We *highly* recommend using keyword arguments to clarify your
        intent.

        Create a dataframe with some fictional data.

        >>> index = ['Firefox', 'Chrome', 'Safari', 'IE10', 'Konqueror']
        >>> df = pd.DataFrame({
        ...      'http_status': [200,200,404,404,301],
        ...      'response_time': [0.04, 0.02, 0.07, 0.08, 1.0]},
        ...       index=index)
        >>> df
                   http_status  response_time
        Firefox            200           0.04
        Chrome             200           0.02
        Safari             404           0.07
        IE10               404           0.08
        Konqueror          301           1.00

        Create a new index and reindex the dataframe. By default
        values in the new index that do not have corresponding
        records in the dataframe are assigned ``NaN``.

        >>> new_index= ['Safari', 'Iceweasel', 'Comodo Dragon', 'IE10',
        ...             'Chrome']
        >>> df.reindex(new_index)
                       http_status  response_time
        Safari               404.0           0.07
        Iceweasel              NaN            NaN
        Comodo Dragon          NaN            NaN
        IE10                 404.0           0.08
        Chrome               200.0           0.02

        We can fill in the missing values by passing a value to
        the keyword ``fill_value``. Because the index is not monotonically
        increasing or decreasing, we cannot use arguments to the keyword
        ``method`` to fill the ``NaN`` values.

        >>> df.reindex(new_index, fill_value=0)
                       http_status  response_time
        Safari                 404           0.07
        Iceweasel                0           0.00
        Comodo Dragon            0           0.00
        IE10                   404           0.08
        Chrome                 200           0.02

        >>> df.reindex(new_index, fill_value='missing')
                      http_status response_time
        Safari                404          0.07
        Iceweasel         missing       missing
        Comodo Dragon     missing       missing
        IE10                  404          0.08
        Chrome                200          0.02

        We can also reindex the columns.

        >>> df.reindex(columns=['http_status', 'user_agent'])
                   http_status  user_agent
        Firefox            200         NaN
        Chrome             200         NaN
        Safari             404         NaN
        IE10               404         NaN
        Konqueror          301         NaN

        Or we can use "axis-style" keyword arguments

        >>> df.reindex(['http_status', 'user_agent'], axis="columns")
                   http_status  user_agent
        Firefox            200         NaN
        Chrome             200         NaN
        Safari             404         NaN
        IE10               404         NaN
        Konqueror          301         NaN

        To further illustrate the filling functionality in
        ``reindex``, we will create a dataframe with a
        monotonically increasing index (for example, a sequence
        of dates).

        >>> date_index = pd.date_range('1/1/2010', periods=6, freq='D')
        >>> df2 = pd.DataFrame({"prices": [100, 101, np.nan, 100, 89, 88]},
        ...                    index=date_index)
        >>> df2
                    prices
        2010-01-01     100
        2010-01-02     101
        2010-01-03     NaN
        2010-01-04     100
        2010-01-05      89
        2010-01-06      88

        Suppose we decide to expand the dataframe to cover a wider
        date range.

        >>> date_index2 = pd.date_range('12/29/2009', periods=10, freq='D')
        >>> df2.reindex(date_index2)
                    prices
        2009-12-29     NaN
        2009-12-30     NaN
        2009-12-31     NaN
        2010-01-01     100
        2010-01-02     101
        2010-01-03     NaN
        2010-01-04     100
        2010-01-05      89
        2010-01-06      88
        2010-01-07     NaN

        The index entries that did not have a value in the original data frame
        (for example, '2009-12-29') are by default filled with ``NaN``.
        If desired, we can fill in the missing values using one of several
        options.

        For example, to backpropagate the last valid value to fill the ``NaN``
        values, pass ``bfill`` as an argument to the ``method`` keyword.

        >>> df2.reindex(date_index2, method='bfill')
                    prices
        2009-12-29     100
        2009-12-30     100
        2009-12-31     100
        2010-01-01     100
        2010-01-02     101
        2010-01-03     NaN
        2010-01-04     100
        2010-01-05      89
        2010-01-06      88
        2010-01-07     NaN

        Please note that the ``NaN`` value present in the original dataframe
        (at index value 2010-01-03) will not be filled by any of the
        value propagation schemes. This is because filling while reindexing
        does not look at dataframe values, but only compares the original and
        desired indexes. If you do want to fill in the ``NaN`` values present
        in the original dataframe, use the ``fillna()`` method.

        See the :ref:`user guide ` for more.

        Returns
        -------
        reindexed : DataFrame
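The help text states that `method` filling compares only the old and new index labels and never looks at the stored values. The sketch below illustrates that point on a small frame with a monotonically increasing integer index. The frame `s_df` and the names `ffilled`, `nearest`, and `filled` are hypothetical and chosen only for this illustration.

```python
import numpy as np
import pandas as pd

# A monotonically increasing index; the value at label 20 is already NaN.
s_df = pd.DataFrame({"prices": [100.0, np.nan, 89.0]}, index=[10, 20, 30])

new_index = [5, 10, 15, 20, 25, 30, 35]

# ffill: each new label takes the row of the closest original label at or before it.
# Label 5 has no earlier label, so it stays NaN.
# Label 25 copies the row at label 20, whose value is NaN, so it stays NaN too.
ffilled = s_df.reindex(new_index, method="ffill")
print(ffilled)

# nearest with tolerance: a new label is matched only if an original label
# lies within 2 of it; label 24 is 4 away from 20, so it gets NaN.
nearest = s_df.reindex([11, 24, 29], method="nearest", tolerance=2)
print(nearest)

# To replace NaN values regardless of where they came from, use fillna afterwards.
filled = ffilled.fillna(0)
print(filled)
```

The NaN at label 25 in `ffilled` is the case the help text warns about: the label was matched successfully, but the matched row itself held a missing value.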

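Notes

1. Thanks to the instructor for the teaching and the course materials.

2. Fellow students are welcome to share what they have learned ^_^

3. The sandbox labs, certifications, forum, and live streams contain a lot of high-quality content and are worth exploring.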
