Create a new column by counting the values of several items in another column in pandas

Use GroupBy.transform with SeriesGroupBy.nunique to count the distinct COL2 values in each (COL1_1, COL1_3) group, then Series.mask to replace 1 with 0:

df['COL3'] = (df.groupby(['COL1_1', 'COL1_3']).COL2.transform('nunique')
                .mask(lambda x: x == 1, 0))

Or use Series.replace to map 1 to 0:

df['COL3'] = df.groupby(['COL1_1', 'COL1_3']).COL2.transform('nunique').replace({1:0})

print(df)
    COL1_1        COL1_3 COL2  COL3
0   Chr1_0   Canis_lupus    A     2
1   Chr1_0   Canis_lupus    A     2
2   Chr1_0   Canis_lupus    B     2
3   Chr1_0   Canis_lupus    B     2
4   Chr1_0   Canis_lupus    B     2
5   Chr1_0  Felis_cattus    B     0
6   Chr1_0  Felis_cattus    B     0
7   Chr2_0  Felis_cattus    A     2
8   Chr2_0  Felis_cattus    B     2
9   Chr2_1  Felis_cattus    C     3
10  Chr2_1  Felis_cattus    D     3
11  Chr2_1  Felis_cattus    E     3
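
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The input frame is an assumption: it is rebuilt from the printed output above, since the original data is not shown in the answer. The names counts, col3_mask and col3_replace are illustrative only.

```python
import pandas as pd

# Assumed input: rebuilt from the printed output above, without COL3.
df = pd.DataFrame({
    "COL1_1": ["Chr1_0"] * 7 + ["Chr2_0"] * 2 + ["Chr2_1"] * 3,
    "COL1_3": ["Canis_lupus"] * 5 + ["Felis_cattus"] * 7,
    "COL2": list("AABBBBBABCDE"),
})

# Number of distinct COL2 values per (COL1_1, COL1_3) group,
# broadcast back to every row of that group.
counts = df.groupby(["COL1_1", "COL1_3"])["COL2"].transform("nunique")

# Variant 1: mask replaces values where the condition is True.
col3_mask = counts.mask(counts == 1, 0)

# Variant 2: replace maps the value 1 to 0.
col3_replace = counts.replace({1: 0})

df["COL3"] = col3_mask
print(df)
```

Both variants should give the same column here. mask is the more general of the two, because the condition can be any boolean expression rather than a fixed value.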
