Assign a group number based on time series data in Python

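You can use np.where with a boolean mask to get 1 on every row where the ID changes, or where Status is 1 and the previous row's Status was not 1, and 0 everywhere else. Then use cumsum so the group number increments at each of those rows. For the rows where Status is 0 and the previous Status was also 0, which you want shown as -, overwrite the value afterwards using loc with a second mask.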

import numpy as np

# Flag rows that start a new group, then take a running total of the flags
df['gr'] = np.cumsum(
    np.where(df['ID'].ne(df['ID'].shift())  # new ID
             | (df['Status'].eq(1)  # status is 1
                & df['Status'].ne(df['Status'].shift())),  # and previous status differs
             1, 0))

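# I would rather use np.nan than '-' to keep the column numeric, but that is up to you.
# Cast to object first: recent pandas versions warn or raise when a string is written into an integer column.
df['gr'] = df['gr'].astype(object)
df.loc[df['Status'].eq(0)
       & df['Status'].eq(df['Status'].shift()), 'gr'] = '-'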

print(df)
        ID               Timestamp  Value  Status gr
103177  64 2010-09-21 23:13:21.090   21.5     1.0  1
252019  64 2010-09-22 00:44:14.890   21.5     1.0  1
271381  64 2010-09-22 00:44:15.890   21.5     0.0  1
268939  64 2010-09-22 00:44:17.890   23.0     0.0  -
259875  64 2010-09-22 00:44:18.440   23.0     1.0  2
18870   64 2010-09-22 00:44:19.890   24.5     1.0  2
205910  32 2010-09-22 00:44:23.440   24.5     1.0  3
103865  32 2010-09-22 01:04:33.440   23.5     0.0  3
152281  32 2010-09-22 01:27:01.790   22.5     1.0  4
138988  32 2010-09-22 02:18:52.850   21.5     0.0  4
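
If you prefer to keep the column numeric, as suggested in the comment above, here is a minimal sketch of that variant. It is not the code from the answer: it sums the boolean mask directly instead of going through np.where, and it stores a missing value instead of '-'. The sample frame is rebuilt from the ID and Status columns of the printed output (index, Timestamp and Value are left out), and the names prev_status, new_id, status_turns_on and repeated_zero are only illustrative.

```python
import pandas as pd

# ID and Status copied from the printed output; other columns omitted
df = pd.DataFrame({
    "ID":     [64, 64, 64, 64, 64, 64, 32, 32, 32, 32],
    "Status": [1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0],
})

prev_status = df["Status"].shift()

# A group starts when the ID changes, or when Status becomes 1
# after a row whose Status was not 1
new_id = df["ID"].ne(df["ID"].shift())
status_turns_on = df["Status"].eq(1) & df["Status"].ne(prev_status)

# Summing the boolean mask cumulatively numbers the groups 1, 2, 3, ...
gr = (new_id | status_turns_on).cumsum()

# Rows with Status 0 that follow another Status 0 get no group number.
# The nullable Int64 dtype keeps the remaining values as integers.
repeated_zero = df["Status"].eq(0) & df["Status"].eq(prev_status)
df["gr"] = gr.mask(repeated_zero).astype("Int64")

print(df)
```

With this version the fourth row holds a missing value rather than '-', so the column can still be used for numeric operations such as groupby or comparisons.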
