In general I believe your approach works. Here are a few improvements:
# no need to set_index. Do so on smaller/filtered data if needed
# df = df.set_index('A')
# this is good
df['sum'] = df.groupby('A')['D'].transform('sum')
# there's a small difference between `'max'` and `max`:
# the string uses pandas' vectorized implementation, the Python function may not
idx = df.groupby(['A'])['C'].transform('max') == df['C']
df = df[idx]
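To illustrate the remark about `'max'` above, here is a minimal sketch comparing the string name with a plain Python callable. The sample data is made up, and the lambda is only a stand-in for the non-vectorized path (it is not taken from the question). Both forms should give the same values; the difference is in how pandas computes them.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"A": ["x", "x", "y"], "C": [1, 3, 2]})

# String name: pandas dispatches to its built-in grouped max.
by_name = df.groupby("A")["C"].transform("max")

# Python callable: pandas calls the function once per group.
by_callable = df.groupby("A")["C"].transform(lambda s: s.max())

print(by_name.tolist())
print(by_callable.tolist())
```

On large frames with many groups the string form is usually the faster of the two, so it is worth timing both on your own data.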
Another improvement is to create the groupby object once and reuse it, since groupby is lazy and does no work until you call something on it:
groups = df.groupby('A')
df['sum'] = groups['D'].transform('sum')
idx = groups['C'].transform('max') == df['C']
df = df[idx]
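For a quick check, here is a small self-contained sketch of the same steps. The column names follow the code above, but the data is invented for illustration. Note that rows tied for the group maximum of `C` are all kept.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer.
df = pd.DataFrame({
    "A": ["x", "x", "y", "y", "y"],
    "C": [1, 3, 2, 5, 5],
    "D": [10, 20, 30, 40, 50],
})

# Build the groupby object once and reuse it.
groups = df.groupby("A")

# Per-group sum of D, broadcast back to every row.
df["sum"] = groups["D"].transform("sum")

# Keep only the rows where C equals its group maximum.
idx = groups["C"].transform("max") == df["C"]
result = df[idx]

print(result)
```

If you need exactly one row per group, you would have to break ties yourself, for example with `drop_duplicates('A')` on the result.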