Conditional substitution in two data frames

Do a left join with df_1 on the left side. Campaigns that have no match in df_2 get a null in the joined goal column, so fill those nulls with the existing goal column from df_1.

import pandas as pd

df_1 = pd.DataFrame()
df_2 = pd.DataFrame()

df_1['campaign'] = ['a', 'b', 'c', 'd']
df_1['goal'] = ['order', 'order', 'off', 'order']

df_2['campaign'] = ['a', 'b', 'c']
df_2['goal'] = ['Subscription', 'order', 'Subscription']

# left join
df = df_1.merge(df_2.rename(columns={'goal': 'new_goal'}), on=['campaign'], how='left')
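# fill the nulls left by the join with the original goal
# (assign the result back; an inplace fillna on a selected column may not update df in newer pandas versions)
df['new_goal'] = df['new_goal'].fillna(df['goal'])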

df

+---+----------+-------+--------------+
|   | campaign | goal  |   new_goal   |
+---+----------+-------+--------------+
| 0 |    a     | order | Subscription |
| 1 |    b     | order |    order     |
| 2 |    c     |  off  | Subscription |
| 3 |    d     | order |    order     |
+---+----------+-------+--------------+

You can then select only the columns you need and rename new_goal back to goal:

df_final = df[['campaign', 'new_goal']].rename(columns={'new_goal': 'goal'})
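The steps above can also be collected into one small helper. This is only a sketch: the function name substitute_goals is hypothetical (it is not part of pandas), and it assumes both frames have exactly the columns campaign and goal, with each campaign appearing at most once in the second frame.

```python
import pandas as pd


def substitute_goals(base, updates):
    """Hypothetical helper: take goal from `updates` where the campaign
    exists there, otherwise keep the goal already present in `base`."""
    # Left join keeps every row of `base`; unmatched campaigns get NaN in new_goal.
    merged = base.merge(
        updates.rename(columns={"goal": "new_goal"}),
        on="campaign",
        how="left",
    )
    # Prefer the new goal and fall back to the original one where it is missing.
    merged["goal"] = merged["new_goal"].fillna(merged["goal"])
    return merged[["campaign", "goal"]]


df_1 = pd.DataFrame({
    "campaign": ["a", "b", "c", "d"],
    "goal": ["order", "order", "off", "order"],
})
df_2 = pd.DataFrame({
    "campaign": ["a", "b", "c"],
    "goal": ["Subscription", "order", "Subscription"],
})

result = substitute_goals(df_1, df_2)
print(result)
```

If a campaign could appear more than once in the second frame, the left join would duplicate rows, so you would probably want to de-duplicate that frame first.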
