apply function to rows with extra arguments and expand multiple return values to columns

def function_name(row, arg1, arg2):
    # do stuff with row.column_name to compute out1 and out2
    return [out1, out2]

dfnew = df.apply(function_name, args=(arg1, arg2), result_type='expand', axis=1)
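A minimal runnable sketch of the pattern above. The helper `low_high`, the columns `a` and `b`, and the scale/offset arguments are made up for illustration only.

```python
import pandas as pd

def low_high(row, scale, offset):
    # Hypothetical helper: returns two values per row.
    low = min(row.a, row.b) * scale + offset
    high = max(row.a, row.b) * scale + offset
    return [low, high]

df = pd.DataFrame({"a": [1, 5], "b": [4, 2]})

# args are passed to the function after the row itself;
# result_type='expand' turns the returned list into columns 0 and 1.
dfnew = df.apply(low_high, args=(10, 1), result_type="expand", axis=1)
dfnew.columns = ["low", "high"]
```

The expanded columns are numbered 0, 1, ... by default, so renaming them afterwards is usually worthwhile.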

plotly in pandas

import cufflinks as cf

cf.go_offline()

df.iplot(kind='scatter')

rename columns

df2 = df.rename(columns={'int_col' : 'some_other_name'})

df2.rename(columns={'some_other_name' : 'int_col'}, inplace = True)

concatenate dataframes

result = pd.concat([df1, df2, df3, df4])
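A small sketch, assuming two toy frames with the same column. It shows that concat keeps each frame's original index unless `ignore_index=True` is passed.

```python
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2]})
df2 = pd.DataFrame({"x": [3]})

stacked = pd.concat([df1, df2])                        # index is 0, 1, 0
renumbered = pd.concat([df1, df2], ignore_index=True)  # index is 0, 1, 2
```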

deal with date string / datetime

pd.to_datetime(z.datestring)  # date strings: no unit; pass format=... if needed

pd.to_datetime(unix_time_stamp, unit='ms')
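A quick sketch of the two cases, using made-up sample values: `unit` is for numeric epoch timestamps, while date strings are parsed without it.

```python
import pandas as pd

# Date strings: parsed directly.
from_string = pd.to_datetime(pd.Series(["2021-03-01", "2021-03-02"]))

# Unix timestamps in milliseconds: unit='ms' says how to read the numbers.
from_millis = pd.to_datetime(pd.Series([1614556800000]), unit="ms")
```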

Index issues

df.reindex(new_index)

df.reset_index()
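A sketch of the difference between the two, on a toy frame with string labels. Both are DataFrame methods, not top-level pandas functions.

```python
import pandas as pd

df = pd.DataFrame({"v": [10, 20, 30]}, index=["a", "b", "c"])

# reindex: conform to a new set of labels; labels not present get NaN.
reordered = df.reindex(["c", "a", "z"])

# reset_index: move the current index into a column and renumber from 0.
flat = df.reset_index()

# drop=True discards the old index instead of keeping it as a column.
renumbered = df.reset_index(drop=True)
```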

missing values

df2.dropna()

df3['float_col'].fillna(df3['float_col'].mean())
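A minimal sketch on made-up data. Both calls return new objects and leave the original frame unchanged.

```python
import numpy as np
import pandas as pd

df3 = pd.DataFrame({"float_col": [1.0, np.nan, 3.0]})

dropped = df3.dropna()  # removes rows containing any NaN

# Fill NaN with the column mean (mean() skips NaN by default).
filled = df3["float_col"].fillna(df3["float_col"].mean())
```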

delete a row in pandas

# delete rows where col_name == 'some val'

df1 = df1[df1.col_name != 'some val']

df1 = df1.query("col_name != 'some val'")
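A runnable sketch with toy data. The boolean mask and the query string should select the same rows; `drop` is shown as a third option.

```python
import pandas as pd

df1 = pd.DataFrame({"col_name": ["some val", "other", "some val"],
                    "n": [1, 2, 3]})

by_mask = df1[df1.col_name != "some val"]       # boolean mask
by_query = df1.query("col_name != 'some val'")  # query takes a string expression
by_drop = df1.drop(df1.index[df1.col_name == "some val"])  # drop by index label
```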

pivot tables / plotting after groupby

call .reset_index() on the groupby / pivot result so the group keys become regular columns again
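A small sketch, assuming a made-up `city` / `sales` frame. After aggregation the group key sits in the index; `reset_index` turns it back into a column, which is usually easier to plot or pivot.

```python
import pandas as pd

df = pd.DataFrame({"city": ["A", "A", "B"], "sales": [1, 2, 5]})

grouped = df.groupby("city")["sales"].sum()  # Series indexed by city
flat = grouped.reset_index()                 # DataFrame with columns city, sales
```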

enumerate groupby

for k, gdf in df.groupby('col_name'):
    print(k, len(gdf))  # k = group key, gdf = DataFrame with that group's rows
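A runnable sketch of the loop on toy data. Each iteration yields the group key and the sub-DataFrame for that key.

```python
import pandas as pd

df = pd.DataFrame({"col_name": ["x", "y", "x"], "val": [1, 2, 3]})

sizes = {}
for k, gdf in df.groupby("col_name"):
    # k is the group key, gdf holds only the rows belonging to that key
    sizes[k] = len(gdf)
```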

count occurrences of each unique value in a column (or each unique combination in a set of columns)

df.value_counts(subset=None, normalize=False, sort=True, ascending=False)

subset -> list of columns to count combinations over

normalize -> return proportions (fractions that sum to 1) instead of counts
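A quick sketch on a made-up two-column frame. The result is a Series indexed by the value combinations.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "x", "y"]})

counts = df.value_counts(subset=["a", "b"])                  # rows per (a, b) combination
shares = df.value_counts(subset=["a", "b"], normalize=True)  # fraction of rows instead
```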

number of unique values per row or column

df.nunique(axis=0)

axis=0 -> count per column (default); axis=1 -> count per row
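A minimal sketch with toy numbers showing both axis settings.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2], "b": [1, 3, 2]})

per_column = df.nunique()     # axis=0: distinct values within each column
per_row = df.nunique(axis=1)  # axis=1: distinct values within each row
```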