How to speed up a conditional groupby sum in Python pandas?

One option is a double merge and a groupby:

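import pandas as pd

# df is assumed to have the columns Case, Id, Date1, Date2 and Quantity
date = pd.Series(df.Date1.unique(), name='Date')
# first merge: attach a Date column wherever Date2 equals one of the unique Date1 values
step1 = df.merge(date, left_on='Date2', right_on='Date', how='outer')
# keep only the rows with Date1 strictly before Date, then sum Quantity per group
step2 = step1.loc[step1.Date1 < step1.Date]
step2 = step2.groupby(['Case', 'Id', 'Date']).agg(sum=('Quantity', 'sum'))
# second merge: groups with no qualifying rows come back as NaN and are set to 0
(df
    .loc[:, ['Case', 'Id', 'Date2']]
    .drop_duplicates()
    .rename(columns={'Date2': 'Date'})
    .merge(step2, how='left', on=['Case', 'Id', 'Date'])
    .fillna({'sum': 0})
    # explicit cast instead of fillna(downcast='infer'), which newer pandas deprecates;
    # assumes Quantity is integer-valued, drop the cast otherwise
    .astype({'sum': 'int64'})
)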

   Case  Id       Date  sum
0     1   1 2020-01-01    0
1     1   1 2020-02-01  100
2     1   2 2020-01-01    0
3     1   2 2020-02-01   35
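
To try the approach end to end, here is a minimal self-contained sketch. The input frame is made up for illustration (it is not the data from the question), and the function name `conditional_sum` is hypothetical. The sketch also assumes `Quantity` holds integers, so it casts the result with `astype` rather than relying on `fillna(..., downcast='infer')`.

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch.
df = pd.DataFrame({
    'Case': [1, 1, 1, 1, 1],
    'Id': [1, 1, 1, 2, 2],
    'Date1': pd.to_datetime(['2020-01-01', '2020-01-01', '2020-02-01',
                             '2020-01-01', '2020-01-01']),
    'Date2': pd.to_datetime(['2020-01-01', '2020-02-01', '2020-02-01',
                             '2020-01-01', '2020-02-01']),
    'Quantity': [50, 100, 20, 10, 35],
})


def conditional_sum(df):
    """Double merge plus groupby, following the steps in the answer above."""
    # Unique Date1 values, exposed as a column named 'Date'.
    date = pd.Series(df['Date1'].unique(), name='Date')

    # First merge: a row gets a Date value when its Date2 matches a Date1 value.
    step1 = df.merge(date, left_on='Date2', right_on='Date', how='outer')

    # Keep rows whose Date1 is strictly earlier than Date, then sum per group.
    step2 = step1.loc[step1['Date1'] < step1['Date']]
    step2 = step2.groupby(['Case', 'Id', 'Date']).agg(sum=('Quantity', 'sum'))

    # Second merge: bring the sums back onto every (Case, Id, Date2) combination.
    result = (
        df.loc[:, ['Case', 'Id', 'Date2']]
        .drop_duplicates()
        .rename(columns={'Date2': 'Date'})
        .merge(step2, how='left', on=['Case', 'Id', 'Date'])
        .fillna({'sum': 0})        # combinations with no qualifying rows
        .astype({'sum': 'int64'})  # assumption: integer quantities
    )
    return result


result = conditional_sum(df)
print(result)
```

With this sample input the `sum` column comes out as 0, 100, 0, 35. If `Quantity` can hold fractional values, the final `astype` should be dropped so the sums stay as floats.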
