Python Pandas cheat sheet

Updated: Oct 2, 2022

Python Pandas Library Cheat Sheet for a Beginner

This is a Python pandas cheat sheet for quick lookup and review. For more information and explanations, see the pandas documentation: https://pandas.pydata.org/docs/reference/frame.html

Import: import pandas as pd
Read CSV: pd.read_csv('file_name.csv')
Union all multiple dataframes: pd.concat([df1, df2, df3, …])
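
A minimal sketch of reading a CSV and stacking dataframes. The CSV text, column names, and values are made up for illustration, and io.StringIO stands in for a real file path. Like SQL's UNION ALL, pd.concat keeps duplicate rows.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice pass a file path to pd.read_csv.
csv_text = "id,score\n1,90\n2,85\n"
df1 = pd.read_csv(io.StringIO(csv_text))

# A second dataframe with the same columns.
df2 = pd.DataFrame({"id": [3], "score": [70]})

# Stack the rows; ignore_index=True renumbers the result from 0.
combined = pd.concat([df1, df2], ignore_index=True)
print(combined)
```

Without ignore_index=True, each dataframe keeps its own index labels, so the combined index can contain repeats.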
Merge:
pd.merge(table1, table2)
table1.merge(table2)
Using left_on and right_on (when the key columns have different names): pd.merge(table1, table2, left_on='table1_column', right_on='table2_column')
Useful option: suffixes=('table1_suffix', 'table2_suffix'), added to non-key column names that appear in both tables
Join types: pd.merge(table1, table2, how='join_type')
Join types = ['left', 'right', 'outer']
Default join type = 'inner'
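
The merge options above can be sketched as follows. The tables, column names, and suffixes are hypothetical and chosen only to show how left_on, right_on, how, and suffixes interact.

```python
import pandas as pd

# Hypothetical tables whose key columns have different names.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
orders = pd.DataFrame({"cust_id": [1, 1, 4], "name": ["pen", "ink", "cup"]})

# Default inner join: only keys present in both tables survive.
inner = pd.merge(
    customers,
    orders,
    left_on="customer_id",
    right_on="cust_id",
    suffixes=("_customer", "_order"),  # disambiguates the shared "name" column
)

# Left join: every customer is kept; unmatched order columns become NaN.
left = customers.merge(
    orders,
    left_on="customer_id",
    right_on="cust_id",
    how="left",
    suffixes=("_customer", "_order"),
)
print(inner)
print(left)
```

Because the key columns have different names, both customer_id and cust_id appear in the result.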

Column name change: dataframe.rename(columns={'old_column_name': 'new_column_name', 'old_column_name2': 'new_column_name2'}) (returns a new dataframe; assign the result to keep it)
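
A short sketch of renaming, with made-up column names. It shows that rename returns a new dataframe and leaves the original unchanged unless the result is assigned back.

```python
import pandas as pd

df = pd.DataFrame({"old_a": [1, 2], "old_b": [3, 4]})

# rename returns a new dataframe; df itself still has the old names.
renamed = df.rename(columns={"old_a": "new_a", "old_b": "new_b"})

print(list(df.columns))
print(list(renamed.columns))
```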
Create a new column:
New_column_value can be one value: ex) dataframe['new_column_name'] = 'Yes'
New_column_value can be a simple calc: ex) dataframe['new_column_name'] = dataframe['column1'] * dataframe['column2']
New_column_value can be a more complex calc using lambda (axis=1 applies the function to each row): ex) dataframe['new_column_name'] = dataframe.apply(lambda x: x['column1']**2 if x['column2'] > (x['column1']*2) else x['column2']*3, axis=1)
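
The three ways of creating a column can be sketched on a small made-up dataframe. The new column names flag, product, and custom are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"column1": [1, 2, 3], "column2": [5, 3, 7]})

# One constant value for every row.
df["flag"] = "Yes"

# Element-wise calculation on two columns.
df["product"] = df["column1"] * df["column2"]

# Row-wise logic: axis=1 passes each row to the lambda as a Series.
df["custom"] = df.apply(
    lambda x: x["column1"] ** 2 if x["column2"] > x["column1"] * 2 else x["column2"] * 3,
    axis=1,
)
print(df)
```

Without axis=1, apply passes whole columns to the lambda, and the lookup x['column1'] fails with a KeyError.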
Dealing with missing data:
Drop them: dataframe.dropna()
Fill them:
One column: dataframe.column.fillna(value)
Multiple columns: dataframe.fillna(value={'column1': value1, 'column2': value2})
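
A minimal sketch of the missing-data options, using a made-up dataframe and arbitrary fill values. Both dropna and fillna return new objects here, so the original dataframe keeps its missing values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"column1": [1.0, np.nan, 3.0], "column2": ["a", None, "c"]})

# Drop every row that contains at least one missing value.
dropped = df.dropna()

# Fill a single column.
one_col = df["column1"].fillna(0)

# Fill several columns, each with its own value.
filled = df.fillna(value={"column1": 0, "column2": "missing"})
print(filled)
```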
Pivot:
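Long to wide: dataframe.pivot(index='index_column', columns='column_to_spread', values='value_column').reset_index()
Wide to long (unpivot): pd.melt(dataframe, id_vars=['id_column'], var_name='variable_column_name', value_name='value_column_name')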
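
A sketch of a pivot followed by a melt that reverses it. The city, year, and sales columns and their values are made up for illustration.

```python
import pandas as pd

# Long format: one row per (city, year) pair.
long_df = pd.DataFrame(
    {
        "city": ["A", "A", "B", "B"],
        "year": [2020, 2021, 2020, 2021],
        "sales": [10, 12, 7, 9],
    }
)

# Long to wide: one row per city, one column per year.
wide = long_df.pivot(index="city", columns="year", values="sales").reset_index()

# Wide to long: the year columns are folded back into rows.
back = pd.melt(wide, id_vars="city", var_name="year", value_name="sales")
print(wide)
print(back)
```

pivot raises a ValueError if any index/columns pair appears more than once, so each (city, year) combination must be unique.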