Merge and unfold reference row values in pandas

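Let’s create a function, unfold, that uses networkx to build a DiGraph with an edge from each out_item to its in_item. For every out_item that never appears as an in_item (a root item), it collects all items reachable from that root and looks up their costs: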

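import networkx as nx
import numpy as np
import pandas as pd

def unfold():
    # Directed graph with one edge out_item -> in_item per row of df
    G = nx.from_pandas_edgelist(df, 'out_item', 'in_item', create_using=nx.DiGraph())
    # Lookup table: in_item -> in_item_cost
    M = df.drop_duplicates('in_item').set_index('in_item')['in_item_cost']
    # Root items: out_item values that never appear as an in_item
    for i in df['out_item'].loc[lambda x: ~x.isin(df['in_item'])].unique():
        # All items reachable from the root, as a list (recent pandas rejects set indexers)
        n = list(nx.descendants(G, i))
        yield [[i]*len(n), n, M.loc[n].tolist()]

# Recent NumPy requires a sequence here, not a generator
out = pd.DataFrame(np.hstack(list(unfold())).T, columns=df.columns)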

print(out)

   out_item  in_item  in_item_cost
0         1        2             5
1         1        3            10
2         1        4            15
3         1        5            20
4         7        8            54
5         7        3            10
6         7        4            15
7         7        5            20
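
The input df is not shown above, so here is a minimal, self-contained sketch of the same idea that you can run as-is. The sample rows are an assumption, chosen only because they are consistent with the output shown, and unfold_frame is a hypothetical name, not part of pandas or networkx. It builds the rows as plain tuples instead of going through np.hstack, and sorts the descendants so the row order is stable (nx.descendants returns a set, so its order is otherwise not guaranteed):

```python
import networkx as nx
import pandas as pd

# Hypothetical input: the original df is not shown, so these rows are an assumption
df = pd.DataFrame({
    "out_item":     [1, 2, 3, 4, 7, 8],
    "in_item":      [2, 3, 4, 5, 8, 3],
    "in_item_cost": [5, 10, 15, 20, 54, 10],
})

def unfold_frame(df):
    """Return one row per (root item, reachable item) pair with that item's cost."""
    # Directed graph with one edge out_item -> in_item per row
    G = nx.from_pandas_edgelist(df, "out_item", "in_item", create_using=nx.DiGraph())
    # Lookup table: in_item -> in_item_cost
    cost = df.drop_duplicates("in_item").set_index("in_item")["in_item_cost"]
    # Roots are out_item values that never appear as an in_item
    roots = df.loc[~df["out_item"].isin(df["in_item"]), "out_item"].unique().tolist()
    rows = [
        (root, item, cost.loc[item])
        for root in roots
        for item in sorted(nx.descendants(G, root))  # sorted for a stable row order
    ]
    return pd.DataFrame(rows, columns=df.columns)

out = unfold_frame(df)
print(out)
```

Because the descendants are sorted here, the rows for item 7 come out as 3, 4, 5, 8 rather than in the order shown above; the set of rows is the same.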
