Pandas - DataFrames
Create a dataframe using random numbers, index and columns
Select the values of a particular column
Select the values of multiple columns
Check the type of columns
Create a new column (sum of two columns)
Remove a column using axis -- drop
Remove a row using axis -- drop
inplace parameter
Shape of the dataframe
Select the values of a row -- loc and iloc
Select a particular value from the dataframe -- [row, column]
Select the values of particular rows from specific columns
Conditional selection -- boolean values and real values
Conditional selection on dataframe
Conditional selection on a particular column
Select values of a particular column after conditional selection
Select the data based on two conditions
pandas operators (&, |)
reset_index()
set_index()
import numpy as np
import pandas as pd
from numpy.random import randn
randn(1)
df = pd.DataFrame(data = randn(5,4), index = ['A','B','C','D','E'], columns=['W','X','Y','Z'])
print(df)
print(df['Y'])
print(df[['Y','Z']])
type(df['Y'])
type(df)
df['S'] = df['W'] + df['X']
print(df)
df['T'] = [1,2,3,4,5]
print(df)
del df['T']
print(df)
df.drop('S', axis=1)  # returns a new DataFrame; df itself is unchanged
print(df)
df.drop('S', axis=1, inplace=True)  # inplace=True modifies df directly
print(df)
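The difference between the two drop calls above can be checked directly. The following is a minimal sketch on a small hand-made frame; the names `demo`, `dropped` and `before` are hypothetical and not part of the notebook.

```python
import pandas as pd

# Hypothetical demo frame, separate from the df used in the notebook
demo = pd.DataFrame({'W': [1, 2], 'X': [3, 4], 'S': [4, 6]}, index=['A', 'B'])

# Without inplace, drop returns a new DataFrame and leaves demo untouched
dropped = demo.drop('S', axis=1)
before = list(demo.columns)
print(list(dropped.columns))
print(before)

# Reassigning the result is a common alternative to inplace=True
demo = demo.drop('S', axis=1)
print(list(demo.columns))
```

Reassignment and `inplace=True` should leave `demo` in the same state here; which one to use is largely a matter of style.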
print(df.loc['E'])
print(df.iloc[4])
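The two lines above return the same row because 'E' is the label of the row at integer position 4. As a rough sketch of how `loc` (label based) and `iloc` (position based) differ, using a hypothetical three-row frame named `demo`:

```python
import pandas as pd

# Hypothetical demo frame, separate from the df used in the notebook
demo = pd.DataFrame({'W': [10, 20, 30], 'X': [1, 2, 3]}, index=['A', 'B', 'C'])

by_label = demo.loc['C']      # row whose index label is 'C'
by_position = demo.iloc[2]    # row at integer position 2 (the third row)
print(by_label.equals(by_position))

# Slicing differs: loc includes the end label, iloc excludes the end position
label_slice = demo.loc['A':'B'].index.tolist()
position_slice = demo.iloc[0:1].index.tolist()
print(label_slice)
print(position_slice)
```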
df.loc['F'] = [1,2,3,4]
print(df)
df1 = pd.DataFrame([[5,6,7,8],[9,10,11,12]],columns=['W','X','Y','Z'])
df1
df2 = pd.concat([df, df1], ignore_index=True)  # DataFrame.append was removed in pandas 2.0
df
df2
df.shape
df.drop('F', axis=0, inplace=True)  # axis=0 drops a row by its index label
print(df)
print(df.loc['A','Y'])
print(df['Y'].loc['A'])
print(df.loc['A']['Y'])
print(df.loc[['A','B'],['Y','Z']])
print(df)
bool_df = df > 0
bool_df
df[bool_df]
df[df>0]
df['Y'] > 0
df[df['Y']>0]
df['Y'][df['Y']>0]
result = df[df['X']>0]
result
result[result['Z'] > 0 ]
result[result['Z'] > 0][['X','Y']]
(df['X'] > 0) & (df['Z']>0)
print(df)
df[(df['X'] > 0) & (df['Z']>0)]
df[(df['X'] > 0) & (df['Z']>0)][['X','Z']]
df[['X','Z']][(df['X'] > 0) & (df['Z']>0)]
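The outline lists both `&` and `|`, but the cells above only exercise `&`. Here is a small sketch of the two side by side, on a hypothetical frame `demo` with fixed values so the result is predictable:

```python
import pandas as pd

# Hypothetical demo frame with fixed values instead of random numbers
demo = pd.DataFrame(
    {'X': [1, -1, 2, -2], 'Z': [1, 1, -1, -1]},
    index=['A', 'B', 'C', 'D'],
)

# & keeps rows where both conditions hold
both = demo[(demo['X'] > 0) & (demo['Z'] > 0)]
# | keeps rows where at least one condition holds
either = demo[(demo['X'] > 0) | (demo['Z'] > 0)]

print(both.index.tolist())
print(either.index.tolist())
```

The parentheses around each comparison are needed because `&` and `|` bind more tightly than `>`. Python's `and` / `or` cannot be used here, since they try to reduce a whole Series to a single truth value and raise a ValueError.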
df
df.reset_index()
df['States'] = ['IL', 'CA', 'TX', 'OH', 'FL']
df
df2 = df.set_index('States')
print(df2)
df2.loc['FL']
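Both `set_index()` and `reset_index()` return a new DataFrame rather than changing the original, which is why the result is assigned to `df2` above. A minimal sketch of the round trip, using hypothetical names `demo`, `by_state` and `restored`:

```python
import pandas as pd

# Hypothetical demo frame, separate from the df used in the notebook
demo = pd.DataFrame(
    {'W': [1, 2, 3], 'States': ['IL', 'CA', 'TX']},
    index=['A', 'B', 'C'],
)

# set_index promotes a column to the index; the old labels A, B, C are discarded
by_state = demo.set_index('States')
# reset_index moves the index back into a column and restores a 0..n-1 index
restored = by_state.reset_index()

print(by_state.loc['CA', 'W'])
print(list(restored.columns))
print(list(restored.index))
print(list(demo.index))  # demo itself still has its original index
```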