Problem
This data frame is what I’m working with:
Fruit Date Name Number
Apples 10/6/2016 Bob 7
Apples 10/6/2016 Bob 8
Apples 10/6/2016 Mike 9
Apples 10/7/2016 Steve 10
Apples 10/7/2016 Bob 1
Oranges 10/7/2016 Bob 2
Oranges 10/6/2016 Tom 15
Oranges 10/6/2016 Mike 57
Oranges 10/6/2016 Bob 65
Oranges 10/7/2016 Tony 1
Grapes 10/7/2016 Bob 1
Grapes 10/7/2016 Tom 87
Grapes 10/7/2016 Bob 22
Grapes 10/7/2016 Bob 12
Grapes 10/7/2016 Tony 15
I want to aggregate this by Name and then by Fruit to get the total Number of each Fruit per Name. For example:
Bob,Apples,16
I tried grouping by Name and Fruit, but I can’t figure out how to get the total number of fruit for each group.
Asked by Trying_hard
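To make the answers below reproducible, here is a minimal sketch that rebuilds the data frame from the question and checks the Bob/Apples example. The variable names `rows`, `df`, and `totals` are chosen here for illustration. Date is kept as a plain string, which is an assumption, since the question does not say how it is stored.

```python
import pandas as pd

# Rows copied from the question's table; Date is stored as a string (assumption).
rows = [
    ("Apples", "10/6/2016", "Bob", 7),
    ("Apples", "10/6/2016", "Bob", 8),
    ("Apples", "10/6/2016", "Mike", 9),
    ("Apples", "10/7/2016", "Steve", 10),
    ("Apples", "10/7/2016", "Bob", 1),
    ("Oranges", "10/7/2016", "Bob", 2),
    ("Oranges", "10/6/2016", "Tom", 15),
    ("Oranges", "10/6/2016", "Mike", 57),
    ("Oranges", "10/6/2016", "Bob", 65),
    ("Oranges", "10/7/2016", "Tony", 1),
    ("Grapes", "10/7/2016", "Bob", 1),
    ("Grapes", "10/7/2016", "Tom", 87),
    ("Grapes", "10/7/2016", "Bob", 22),
    ("Grapes", "10/7/2016", "Bob", 12),
    ("Grapes", "10/7/2016", "Tony", 15),
]
df = pd.DataFrame(rows, columns=["Fruit", "Date", "Name", "Number"])

# Group by Name, then Fruit, and add up Number within each group.
totals = df.groupby(["Name", "Fruit"])["Number"].sum()

# Bob's three Apples rows (7 + 8 + 1) add up to the value the question expects.
print(totals.loc[("Bob", "Apples")])  # 16
```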
Solution #1
Use GroupBy.sum. Passing numeric_only=True keeps the non-numeric Date column out of the result:
df.groupby(['Fruit','Name']).sum(numeric_only=True)
Out[31]:
               Number
Fruit   Name
Apples  Bob        16
        Mike        9
        Steve      10
Grapes  Bob        35
        Tom        87
        Tony       15
Oranges Bob        67
        Mike       57
        Tom        15
        Tony        1
Answered by Steven G
Solution #2
You can also use the agg function.
df.groupby(['Name', 'Fruit'])['Number'].agg('sum')
Answered by Saurabh
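As a sketch of what agg adds over a plain sum(), the example below first checks that agg('sum') gives the same totals as sum() and then passes a list of aggregation names to compute several statistics at once. The names `grouped`, `via_agg`, `via_sum`, and `summary` are illustrative, and the data frame is rebuilt from the question's table so the block runs on its own.

```python
import pandas as pd

# Same data as in the question, rebuilt so this block is self-contained.
rows = [
    ("Apples", "10/6/2016", "Bob", 7), ("Apples", "10/6/2016", "Bob", 8),
    ("Apples", "10/6/2016", "Mike", 9), ("Apples", "10/7/2016", "Steve", 10),
    ("Apples", "10/7/2016", "Bob", 1), ("Oranges", "10/7/2016", "Bob", 2),
    ("Oranges", "10/6/2016", "Tom", 15), ("Oranges", "10/6/2016", "Mike", 57),
    ("Oranges", "10/6/2016", "Bob", 65), ("Oranges", "10/7/2016", "Tony", 1),
    ("Grapes", "10/7/2016", "Bob", 1), ("Grapes", "10/7/2016", "Tom", 87),
    ("Grapes", "10/7/2016", "Bob", 22), ("Grapes", "10/7/2016", "Bob", 12),
    ("Grapes", "10/7/2016", "Tony", 15),
]
df = pd.DataFrame(rows, columns=["Fruit", "Date", "Name", "Number"])

grouped = df.groupby(["Name", "Fruit"])["Number"]

# agg with a single function name produces the same Series as calling sum().
via_agg = grouped.agg("sum")
via_sum = grouped.sum()
print(via_agg.equals(via_sum))

# agg with a list of names returns one column per aggregation.
summary = grouped.agg(["sum", "count", "max"])
print(summary.loc[("Bob", "Grapes")])
```

For a single total, sum() and agg('sum') are interchangeable; agg becomes the more convenient choice once more than one statistic per group is needed.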
Solution #3
If you want Fruit and Name to remain ordinary columns, call reset_index() on the result. Otherwise they become the levels of the result's index.
df.groupby(['Fruit','Name'])['Number'].sum().reset_index()
Fruit Name Number
Apples Bob 16
Apples Mike 9
Apples Steve 10
Grapes Bob 35
Grapes Tom 87
Grapes Tony 15
Oranges Bob 67
Oranges Mike 57
Oranges Tom 15
Oranges Tony 1
Without reset_index(), as in the other answers, Fruit and Name end up in the index:
df.groupby(['Fruit','Name'])['Number'].sum()
Fruit    Name
Apples   Bob      16
         Mike      9
         Steve    10
Grapes   Bob      35
         Tom      87
         Tony     15
Oranges  Bob      67
         Mike     57
         Tom      15
         Tony      1
Name: Number, dtype: int64
Answered by Gazala Muhamed
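The difference between the indexed and the flat result can be sketched as follows. The example also shows as_index=False, a groupby parameter that produces the flat layout directly; it is offered here as an alternative and is not part of the answer above. The names `indexed`, `flat`, and `flat_alt` are illustrative.

```python
import pandas as pd

# Same data as in the question, rebuilt so this block is self-contained.
rows = [
    ("Apples", "10/6/2016", "Bob", 7), ("Apples", "10/6/2016", "Bob", 8),
    ("Apples", "10/6/2016", "Mike", 9), ("Apples", "10/7/2016", "Steve", 10),
    ("Apples", "10/7/2016", "Bob", 1), ("Oranges", "10/7/2016", "Bob", 2),
    ("Oranges", "10/6/2016", "Tom", 15), ("Oranges", "10/6/2016", "Mike", 57),
    ("Oranges", "10/6/2016", "Bob", 65), ("Oranges", "10/7/2016", "Tony", 1),
    ("Grapes", "10/7/2016", "Bob", 1), ("Grapes", "10/7/2016", "Tom", 87),
    ("Grapes", "10/7/2016", "Bob", 22), ("Grapes", "10/7/2016", "Bob", 12),
    ("Grapes", "10/7/2016", "Tony", 15),
]
df = pd.DataFrame(rows, columns=["Fruit", "Date", "Name", "Number"])

# Series whose index has two levels, Fruit and Name.
indexed = df.groupby(["Fruit", "Name"])["Number"].sum()
print(indexed.index.names)

# reset_index() moves both index levels back into regular columns.
flat = indexed.reset_index()
print(list(flat.columns))

# as_index=False asks groupby for the flat layout from the start.
flat_alt = df.groupby(["Fruit", "Name"], as_index=False)["Number"].sum()
print(list(flat_alt.columns))
```

The flat layout is usually easier to merge or export, while the indexed layout allows lookups such as indexed.loc[("Apples", "Bob")].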
Solution #4
Both of the other answers accomplish what you want.
You can additionally use pivot to arrange the totals in a table with one row per Fruit and one column per Name:
df.groupby(['Fruit','Name'], as_index=False)['Number'].sum().pivot(index='Fruit', columns='Name', values='Number').fillna(0)
Name     Bob  Mike  Steve   Tom  Tony
Fruit
Apples  16.0   9.0   10.0   0.0   0.0
Grapes  35.0   0.0    0.0  87.0  15.0
Oranges 67.0  57.0    0.0  15.0   1.0
Answered by Demetri Pananos
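The wide table above can also be sketched with two other pandas calls, neither of which is used in the answer itself: Series.unstack on the grouped totals, and DataFrame.pivot_table, which groups and reshapes in one step. Both accept fill_value=0, so missing Fruit/Name combinations are filled when the table is built. The names `wide_unstack` and `wide_pivot` are illustrative.

```python
import pandas as pd

# Same data as in the question, rebuilt so this block is self-contained.
rows = [
    ("Apples", "10/6/2016", "Bob", 7), ("Apples", "10/6/2016", "Bob", 8),
    ("Apples", "10/6/2016", "Mike", 9), ("Apples", "10/7/2016", "Steve", 10),
    ("Apples", "10/7/2016", "Bob", 1), ("Oranges", "10/7/2016", "Bob", 2),
    ("Oranges", "10/6/2016", "Tom", 15), ("Oranges", "10/6/2016", "Mike", 57),
    ("Oranges", "10/6/2016", "Bob", 65), ("Oranges", "10/7/2016", "Tony", 1),
    ("Grapes", "10/7/2016", "Bob", 1), ("Grapes", "10/7/2016", "Tom", 87),
    ("Grapes", "10/7/2016", "Bob", 22), ("Grapes", "10/7/2016", "Bob", 12),
    ("Grapes", "10/7/2016", "Tony", 15),
]
df = pd.DataFrame(rows, columns=["Fruit", "Date", "Name", "Number"])

# Option 1: sum per (Fruit, Name), then move the Name level into columns.
wide_unstack = (
    df.groupby(["Fruit", "Name"])["Number"].sum().unstack(fill_value=0)
)

# Option 2: let pivot_table do the grouping and the reshaping together.
wide_pivot = df.pivot_table(
    index="Fruit", columns="Name", values="Number",
    aggfunc="sum", fill_value=0,
)

print(wide_unstack)
print(wide_pivot)
```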
Solution #5
df.groupby(['Fruit','Name'])['Number'].sum()
You can choose which column or columns to sum by selecting them from the grouped object before calling sum().
Answered by jared
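Column selection on a grouped object can be sketched like this. The question's data has only one numeric column, so the example adds a hypothetical Weight column (half of Number, invented purely for illustration) to show the difference between selecting one column and selecting a list of columns. The names `one` and `both` are also illustrative.

```python
import pandas as pd

# Same data as in the question, rebuilt so this block is self-contained.
rows = [
    ("Apples", "10/6/2016", "Bob", 7), ("Apples", "10/6/2016", "Bob", 8),
    ("Apples", "10/6/2016", "Mike", 9), ("Apples", "10/7/2016", "Steve", 10),
    ("Apples", "10/7/2016", "Bob", 1), ("Oranges", "10/7/2016", "Bob", 2),
    ("Oranges", "10/6/2016", "Tom", 15), ("Oranges", "10/6/2016", "Mike", 57),
    ("Oranges", "10/6/2016", "Bob", 65), ("Oranges", "10/7/2016", "Tony", 1),
    ("Grapes", "10/7/2016", "Bob", 1), ("Grapes", "10/7/2016", "Tom", 87),
    ("Grapes", "10/7/2016", "Bob", 22), ("Grapes", "10/7/2016", "Bob", 12),
    ("Grapes", "10/7/2016", "Tony", 15),
]
df = pd.DataFrame(rows, columns=["Fruit", "Date", "Name", "Number"])

# Hypothetical second numeric column, not present in the original question.
df["Weight"] = df["Number"] * 0.5

# A single column name (a string) yields a Series of totals.
one = df.groupby(["Fruit", "Name"])["Number"].sum()

# A list of column names yields a DataFrame with one total per column.
both = df.groupby(["Fruit", "Name"])[["Number", "Weight"]].sum()

print(type(one).__name__, type(both).__name__)
print(both.loc[("Apples", "Bob")])
```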
Post is based on https://stackoverflow.com/questions/39922986/how-do-i-pandas-group-by-to-get-sum