I have a pandas DataFrame and I want to compute days_until_next_event for it:

import pandas as pd

df = pd.DataFrame({'message_count': [1, 3, 5, 6, 2, 8, 10, 2], 'event_date': ['2016-01-05', '2016-01-05', '2016-01-05', '2016-01-13', '2016-01-13', '2016-01-13', '2016-01-28', '2016-01-28'], 'message_date': ['2016-01-05', '2016-01-06', '2016-01-10', '2016-01-13', '2016-01-16', '2016-01-22', '2016-01-28', '2016-01-30']})
 
event_date  message_count   message_date 
2016-01-05       1           2016-01-05 
2016-01-05       3           2016-01-06 
2016-01-05       5           2016-01-10 
2016-01-13       6           2016-01-13 
2016-01-13       2           2016-01-16 
2016-01-13       8           2016-01-22 
2016-01-28       10          2016-01-28 
2016-01-28       2           2016-01-30 

The expected DataFrame looks like this:

days_until_next_event   event_date  message_count   message_date     
      0 days            2016-01-05       1           2016-01-05  
      7 days            2016-01-05       3           2016-01-06  
      3 days            2016-01-05       5           2016-01-10  
      0 days            2016-01-13       6           2016-01-13  
      12 days           2016-01-13       2           2016-01-16  
      6 days            2016-01-13       8           2016-01-22  
      0 days            2016-01-28      10           2016-01-28  
      NaT               2016-01-28       2           2016-01-30  

days_until_next_event is the difference between message_date and the next new event_date. If the two dates are the same, its value is 0. I can get the number of days since the last event like this:

df2['days_since_last_dte'] = [(message - event) for message, event in zip(df2['message_date'], df2['event_date'])] 

But I can't work out the last piece: comparing message_date with the next "new" event_date.

Please refer to the following approach:

IIUC (PS: this assumes your df is sorted; if it is not, call sort_values first)

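# Map each distinct event_date to the distinct event_date that follows it
df['New']=df.event_date.map(pd.Series(df.event_date.unique()[1:],index=df.event_date.unique()[:-1])) 
 
# Difference between the next event_date and message_date
df['DiffDays']=pd.to_datetime(df['New'])-pd.to_datetime(df['message_date']) 
 
# The first row of each event_date group is the event itself, so set it to 0
df.loc[df.groupby('event_date').head(1).index,'DiffDays']=0 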
 
df 
Out[1191]:  
   event_date  message_count message_date        New          DiffDays 
0  2016-01-05              1   2016-01-05 2016-01-13                 0 
1  2016-01-05              3   2016-01-06 2016-01-13   7 days 00:00:00 
2  2016-01-05              5   2016-01-10 2016-01-13   3 days 00:00:00 
3  2016-01-13              6   2016-01-13 2016-01-28                 0 
4  2016-01-13              2   2016-01-16 2016-01-28  12 days 00:00:00 
5  2016-01-13              8   2016-01-22 2016-01-28   6 days 00:00:00 
6  2016-01-28             10   2016-01-28        NaT                 0 
7  2016-01-28              2   2016-01-30        NaT               NaT 
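
The steps above can also be put together as a self-contained script. This is only a sketch under a few assumptions: the date columns start out as strings (as in the constructor from the question) and are parsed with pd.to_datetime, the "same date" rows are detected by comparing message_date with event_date rather than by taking the first row of each group, and the names next_event_date and same_day are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'message_count': [1, 3, 5, 6, 2, 8, 10, 2],
    'event_date': ['2016-01-05', '2016-01-05', '2016-01-05', '2016-01-13',
                   '2016-01-13', '2016-01-13', '2016-01-28', '2016-01-28'],
    'message_date': ['2016-01-05', '2016-01-06', '2016-01-10', '2016-01-13',
                     '2016-01-16', '2016-01-22', '2016-01-28', '2016-01-30'],
})

# Parse the string columns so that date arithmetic works.
df['event_date'] = pd.to_datetime(df['event_date'])
df['message_date'] = pd.to_datetime(df['message_date'])

# The mapping below relies on the rows being in chronological order.
df = df.sort_values(['event_date', 'message_date']).reset_index(drop=True)

# Map each distinct event_date to the distinct event_date that follows it.
# The last event has no successor, so it maps to NaT.
events = pd.Series(df['event_date'].unique())
next_event = pd.Series(events.shift(-1).values, index=events.values)
df['next_event_date'] = df['event_date'].map(next_event)  # hypothetical column name

# Time remaining from each message until the next event.
df['days_until_next_event'] = df['next_event_date'] - df['message_date']

# A message sent on its own event date counts as 0 days.
same_day = df['message_date'] == df['event_date']
df.loc[same_day, 'days_until_next_event'] = pd.Timedelta(0)

print(df[['days_until_next_event', 'event_date', 'message_count', 'message_date']])
```

On the sample data this should give 0, 7, 3, 0, 12, 6 and 0 days for the first seven rows and NaT for the last one, which matches the expected frame in the question. Using pd.Timedelta(0) instead of a plain 0 keeps the column as a timedelta dtype, so the zeros display as "0 days" rather than 0.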

