Python – pandas – find most frequent combination with tie-resolution – performance

Let us try groupby with transform , then get the count of most common value, then sort_values with drop_duplicates

df['help'] = df.groupby(['id','string_col_A','string_col_B'])['string_col_A'].transform('count')
out = df.sort_values(['help','creation_date'],na_position='first').drop_duplicates('id',keep='last').drop(['help','creation_date'],1)
out
Out[122]: 
      id string_col_A string_col_B
3  x21ab       STR_X4       STR_Y4
5  x11aa       STR_X3       STR_Y3
0  x12ga       STR_X1       STR_Y1

CLICK HERE to find out more related problems solutions.

Leave a Comment

Your email address will not be published.

Scroll to Top