32. pandas groupby

Groupby

Create a DataFrame
Group by a particular column (Company)
Sum of sales for every company
Mean of sales for each company
Standard deviation of sales for each company
Sum of sales for a particular company
Other groupby functions: count, max, min, describe

import pandas as pd
data = {'Company':['GOOG','GOOG','MFST','MFST','FB','FB'],
       'Person':['Sam','Charlie','Amy','Vanessa','Carl','Sarah'],
       'Sales':[200,120,340,124,243,350]}
df = pd.DataFrame(data)
df
  Company   Person  Sales
0    GOOG      Sam    200
1    GOOG  Charlie    120
2    MFST      Amy    340
3    MFST  Vanessa    124
4      FB     Carl    243
5      FB    Sarah    350
df.groupby('Company')
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7fda8180ae10>
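The output above shows that groupby by itself returns a DataFrameGroupBy object, not a table: it only records how the rows are split, and nothing is aggregated until a method such as sum() is called. As a minimal sketch (group_sizes and goog_rows are illustrative names, not part of the original notebook), the groups can be inspected by iterating over the object or by calling get_group:

```python
import pandas as pd

data = {'Company': ['GOOG', 'GOOG', 'MFST', 'MFST', 'FB', 'FB'],
        'Person': ['Sam', 'Charlie', 'Amy', 'Vanessa', 'Carl', 'Sarah'],
        'Sales': [200, 120, 340, 124, 243, 350]}
df = pd.DataFrame(data)
byComp = df.groupby('Company')

# Iterating yields (group key, sub-DataFrame) pairs, sorted by key.
group_sizes = {name: len(group) for name, group in byComp}

# get_group returns the rows belonging to one key.
goog_rows = byComp.get_group('GOOG')

print(group_sizes)
print(goog_rows)
```

Each sub-DataFrame keeps the original row index, so goog_rows contains rows 0 and 1 of df.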
byComp = df.groupby('Company')
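# numeric_only=True restricts the sum to numeric columns; pandas 2.0+ would otherwise concatenate the Person strings
byComp.sum(numeric_only=True)
         Sales
Company
FB         593
GOOG       320
MFST       464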
byComp.count()
         Person  Sales
Company
FB            2      2
GOOG          2      2
MFST          2      2
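# numeric_only=True skips the non-numeric Person column; pandas 2.0+ raises a TypeError without it
byComp.mean(numeric_only=True)
         Sales
Company
FB       296.5
GOOG     160.0
MFST     232.0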
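# numeric_only=True skips the non-numeric Person column; pandas 2.0+ raises an error without it
byComp.std(numeric_only=True)
              Sales
Company
FB        75.660426
GOOG      56.568542
MFST     152.735065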
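The std() values above are sample standard deviations: pandas divides by n - 1 by default (ddof=1). The sketch below checks the GOOG value by hand and shows the population version with ddof=0. The variable names are illustrative and not part of the original notebook.

```python
import math
import pandas as pd

data = {'Company': ['GOOG', 'GOOG', 'MFST', 'MFST', 'FB', 'FB'],
        'Person': ['Sam', 'Charlie', 'Amy', 'Vanessa', 'Carl', 'Sarah'],
        'Sales': [200, 120, 340, 124, 243, 350]}
df = pd.DataFrame(data)
sales_by_comp = df.groupby('Company')['Sales']

sample_std = sales_by_comp.std()            # default ddof=1, divides by n - 1
population_std = sales_by_comp.std(ddof=0)  # divides by n

# GOOG sales are 200 and 120, with mean 160.
goog_by_hand = math.sqrt(((200 - 160) ** 2 + (120 - 160) ** 2) / (2 - 1))

print(sample_std['GOOG'], goog_by_hand, population_std['GOOG'])
```

With only two rows per company the two conventions differ noticeably, so it is worth knowing which one a report expects.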
df
  Company   Person  Sales
0    GOOG      Sam    200
1    GOOG  Charlie    120
2    MFST      Amy    340
3    MFST  Vanessa    124
4      FB     Carl    243
5      FB    Sarah    350
byComp.min()
          Person  Sales
Company
FB          Carl    243
GOOG     Charlie    120
MFST         Amy    124
byComp.max()
          Person  Sales
Company
FB         Sarah    350
GOOG         Sam    200
MFST     Vanessa    340
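Note that min() and max() work column by column, so a row of the result need not match any single row of df. For MFST, max() pairs Vanessa (the alphabetically last name) with 340, which is Amy's sale. If the goal is the actual top-selling row per company, one possible approach is idxmax. This is a sketch, and top_idx and top_rows are illustrative names:

```python
import pandas as pd

data = {'Company': ['GOOG', 'GOOG', 'MFST', 'MFST', 'FB', 'FB'],
        'Person': ['Sam', 'Charlie', 'Amy', 'Vanessa', 'Carl', 'Sarah'],
        'Sales': [200, 120, 340, 124, 243, 350]}
df = pd.DataFrame(data)

# idxmax returns, for each company, the row label of its largest Sales value.
top_idx = df.groupby('Company')['Sales'].idxmax()

# Use those labels to pull the complete original rows.
top_rows = df.loc[top_idx]

print(top_rows)
```

Here the MFST row is Amy with 340, because the whole row is taken from df rather than assembled from separate column maxima.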
byComp.max().loc['FB']
Person    Sarah
Sales       350
Name: FB, dtype: object
byComp.max().loc['FB', 'Sales']
350
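The same lookup pattern gives the sum of sales for a particular company. A minimal sketch, selecting the Sales column before aggregating so the Person column is never involved (total_by_company and fb_total are illustrative names):

```python
import pandas as pd

data = {'Company': ['GOOG', 'GOOG', 'MFST', 'MFST', 'FB', 'FB'],
        'Person': ['Sam', 'Charlie', 'Amy', 'Vanessa', 'Carl', 'Sarah'],
        'Sales': [200, 120, 340, 124, 243, 350]}
df = pd.DataFrame(data)

# Selecting a single column yields a Series indexed by Company.
total_by_company = df.groupby('Company')['Sales'].sum()

# Look up one company by its index label.
fb_total = total_by_company.loc['FB']

print(fb_total)  # 243 + 350 = 593
```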
byComp.describe()
        Sales
        count   mean         std    min     25%    50%     75%    max
Company
FB        2.0  296.5   75.660426  243.0  269.75  296.5  323.25  350.0
GOOG      2.0  160.0   56.568542  120.0  140.00  160.0  180.00  200.0
MFST      2.0  232.0  152.735065  124.0  178.00  232.0  286.00  340.0
byComp.describe().transpose()
Company              FB        GOOG        MFST
Sales count    2.000000    2.000000    2.000000
      mean   296.500000  160.000000  232.000000
      std     75.660426   56.568542  152.735065
      min    243.000000  120.000000  124.000000
      25%    269.750000  140.000000  178.000000
      50%    296.500000  160.000000  232.000000
      75%    323.250000  180.000000  286.000000
      max    350.000000  200.000000  340.000000
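When only a few of the statistics from describe() are needed, agg with a list of function names is one way to request just those. A small sketch under that assumption (summary is an illustrative name, not from the original notebook):

```python
import pandas as pd

data = {'Company': ['GOOG', 'GOOG', 'MFST', 'MFST', 'FB', 'FB'],
        'Person': ['Sam', 'Charlie', 'Amy', 'Vanessa', 'Carl', 'Sarah'],
        'Sales': [200, 120, 340, 124, 243, 350]}
df = pd.DataFrame(data)

# One output column per requested aggregation, one row per company.
summary = df.groupby('Company')['Sales'].agg(['count', 'min', 'max'])

print(summary)
```

The result is an ordinary DataFrame, so a single value can be read with summary.loc['GOOG', 'max'].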