Fastest way to remove rows that contain substrings of values in the same column of a pandas DataFrame

You could use a comprehension to check whether each row's string appears inside another row's string in the same column, then drop the rows that match:

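# True where the value in B is a substring of a different value in B
m = df['B'].apply(lambda x: any(x in y for y in df['B'] if x != y))
# Keep only the rows whose value is not contained in another row's value
df = df[~m]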
    A            B
2  44         abcd
5  77     john Doe
7  99  john hi Doe
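
For reference, here is a minimal self-contained sketch of the same idea. The sample frame is hypothetical (it is not the data from the original question), and the column names A and B are simply carried over from the snippet above. The values are pulled into a plain list once so the inner loop does not iterate the Series again for every row; the approach is still quadratic in the number of rows.

```python
import pandas as pd

# Hypothetical sample data, chosen only to illustrate the filter.
df = pd.DataFrame({
    "A": [11, 22, 33, 44],
    "B": ["ab", "abcd", "john", "john Doe"],
})

# Materialise the column once as a Python list.
values = df["B"].tolist()

# True where a value is a substring of a *different* value in the column.
# Exact duplicates are not treated as substrings of each other (x != y).
mask = df["B"].map(lambda x: any(x in y for y in values if x != y))

# Keep only rows whose value is not contained in any other value.
result = df[~mask]
print(result)
```

With this sample, "ab" is dropped because it occurs inside "abcd", and "john" is dropped because it occurs inside "john Doe".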
