Finding the difference between two DataFrames in Python

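Based on a related answer, you can try the pd.concat method: stack the two DataFrames, then use drop_duplicates(keep=False) to drop every row that occurs more than once, so that only the differing rows remain: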

pd.concat([A,B]).drop_duplicates(keep=False)['column1'].unique().tolist()

Output:

# if you just want to see the differences between the two DataFrames
>>> pd.concat([A,B]).drop_duplicates(keep=False)
  column1  column2
1     def        2
1     def        1
# if you just want to see the differences, restricted to 'column1'
>>> pd.concat([A,B]).drop_duplicates(keep=False)['column1']
1    def
1    def
Name: column1, dtype: object
# if you want the unique values in column1 as a NumPy array after taking the differences
>>> pd.concat([A,B]).drop_duplicates(keep=False)['column1'].unique()
array(['def'], dtype=object)
# if you want the unique values in column1 as a list after taking the differences
>>> pd.concat([A,B]).drop_duplicates(keep=False)['column1'].unique().tolist()
['def']
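
The answer does not show how A and B are defined, so here is a minimal, self-contained sketch you can run to reproduce the steps above. The two input frames are hypothetical; their values were chosen only so that the result matches the output shown, and the names diff and diff_values are illustrative.

```python
import pandas as pd

# Hypothetical inputs (not from the original answer), chosen to match the output above.
A = pd.DataFrame({"column1": ["abc", "def"], "column2": [1, 2]})
B = pd.DataFrame({"column1": ["abc", "def"], "column2": [1, 1]})

# Stack both frames, then drop every row that occurs more than once.
# Rows shared by A and B disappear; rows unique to either frame remain.
diff = pd.concat([A, B]).drop_duplicates(keep=False)

# Unique values of column1 among the differing rows, as a plain Python list.
diff_values = diff["column1"].unique().tolist()

print(diff)
print(diff_values)
```

One thing to keep in mind: drop_duplicates(keep=False) does not know which frame a row came from, so a row that is repeated inside A alone (or inside B alone) is dropped as well, even though it never appears in the other frame.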
