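Pandas User Guide: Table of Contents
The User Guide is organized by topic and covers nearly all of pandas' functionality. Each section introduces a topic (such as "working with missing data") and discusses how pandas approaches the problem, with many examples throughout.
Users new to pandas should start with 10 Minutes to pandas.
For more information on any specific method, see the API reference.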
- IO tools (text, CSV, HDF5, …)
- CSV & text files
- JSON
- HTML
- Excel files
- OpenDocument Spreadsheets
- Clipboard
- Pickling
- msgpack
- HDF5 (PyTables)
- Feather
- Parquet
- SQL queries
- Google BigQuery
- Stata format
- SAS formats
- Other file formats
- Performance considerations
- Indexing and selecting data
- Different choices for indexing
- Basics
- Attribute access
- Slicing ranges
- Selection by label
- Selection by position
- Selection by callable
- IX indexer is deprecated
- Indexing with list with missing labels is deprecated
- Selecting random samples
- Setting with enlargement
- Fast scalar value getting and setting
- Boolean indexing
- Indexing with isin
- The where() Method and Masking
- The query() Method
- Duplicate data
- Dictionary-like get() method
- The lookup() method
- Index objects
- Set / reset index
- Returning a view versus a copy
- MultiIndex / advanced indexing
- Hierarchical indexing (MultiIndex)
- Advanced indexing with hierarchical index
- Sorting a MultiIndex
- Take methods
- Index types
- Miscellaneous indexing FAQ
- Merge, join, and concatenate
- Concatenating objects
- Database-style DataFrame or named Series joining/merging
- Timeseries friendly merging
- Reshaping and pivot tables
- Reshaping by pivoting DataFrame objects
- Reshaping by stacking and unstacking
- Reshaping by Melt
- Combining with stats and GroupBy
- Pivot tables
- Cross tabulations
- Tiling
- Computing indicator / dummy variables
- Factorizing values
- Examples
- Exploding a list-like column
- Working with text data
- Splitting and replacing strings
- Concatenation
- Indexing with .str
- Extracting substrings
- Testing for Strings that match or contain a pattern
- Creating indicator variables
- Method summary
- Working with missing data
- Values considered “missing”
- Sum/prod of empties/nans
- NA values in GroupBy
- Filling missing values: fillna
- Filling with a PandasObject
- Dropping axis labels with missing data: dropna
- Interpolation
- Replacing generic values
- String/regular expression replacement
- Numeric replacement
- Categorical data
- Object creation
- CategoricalDtype
- Description
- Working with categories
- Sorting and order
- Comparisons
- Operations
- Data munging
- Getting data in/out
- Missing data
- Differences to R’s factor
- Gotchas
- Nullable integer data type
- Visualization
- Basic plotting: plot
- Other plots
- Plotting with missing data
- Plotting Tools
- Plot Formatting
- Plotting directly with matplotlib
- Trellis plotting interface
- Computational tools
- Statistical functions
- Window Functions
- Aggregation
- Expanding windows
- Exponentially weighted windows
- Group by: split-apply-combine
- Splitting an object into groups
- Iterating through groups
- Selecting a group
- Aggregation
- Transformation
- Filtration
- Dispatching to instance methods
- Flexible apply
- Other useful features
- Examples
- Time series / date functionality
- Overview
- Timestamps vs. Time Spans
- Converting to timestamps
- Generating ranges of timestamps
- Timestamp limitations
- Indexing
- Time/date components
- DateOffset objects
- Time Series-Related Instance Methods
- Resampling
- Time span representation
- Converting between representations
- Representing out-of-bounds spans
- Time zone handling
- Time deltas
- Parsing
- Operations
- Reductions
- Frequency conversion
- Attributes
- TimedeltaIndex
- Resampling
- Styling
- Building styles
- Finer control: slicing
- Finer Control: Display Values
- Builtin styles
- Sharing styles
- Other Options
- Fun stuff
- Export to Excel
- Extensibility
- Options and settings
- Overview
- Getting and setting options
- Setting startup options in Python/IPython environment
- Frequently Used Options
- Available options
- Number formatting
- Unicode formatting
- Table schema display
- Enhancing performance
- Cython (writing C extensions for pandas)
- Using Numba
- Expression evaluation via eval()
- Sparse data structures
- SparseArray
- SparseDtype
- Sparse accessor
- Sparse calculation
- Migrating
- Interaction with scipy.sparse
- Sparse subclasses
- Frequently Asked Questions (FAQ)
- DataFrame memory usage
- Using if/truth statements with pandas
- NaN, Integer NA values and NA type promotions
- Differences with NumPy
- Thread-safety
- Byte-Ordering issues
- Cookbook
- Idioms
- Selection
- MultiIndexing
- Missing data
- Grouping
- Timeseries
- Merge
- Plotting
- Data In/Out
- Computation
- Timedeltas
- Aliasing axis names
- Creating example data