Python: sort dataframe columns into groups by an integer tied to strings



I hope I understood your problem correctly.
I am a little drunk, so there may be a mistake; weeks of mandatory home office are harder than expected xD.

Anyway, here is the solution:
# Import pandas and numpy
import pandas as pd
import numpy as np

# Sample df
product = ['blue', 'pink', 'cyan']
v1_vendor = ['shop1', 'shop3', 'shop1']
v2_vendor = ['shop3', 'shop2', 'shop2']
v3_vendor = ['shop2', 'shop1', 'shop3']
price_shop1 = [500, 700, 0]
price_shop2 = [600, 650, 200]
price_shop3 = [550, 600, 300]
url_shop1 = ['1.com/blue', '1.com/pink', '1.com/cyan']
url_shop2 = ['2.com/blue', '2.com/pink', '2.com/cyan']
url_shop3 = ['3.com/blue', '3.com/pink', '3.com/cyan']

df = pd.DataFrame({'product':product, '1_vendor' : v1_vendor, '2_vendor' : v2_vendor, '3_vendor' : v3_vendor, 'price_shop1' : price_shop1, 'price_shop2' : price_shop2, 'price_shop3' : price_shop3,'url_shop1' : url_shop1,'url_shop2' : url_shop2,'url_shop3' : url_shop3})

# Create second dataframe that we will fill with final data
# (vendor and url columns start as None so they are object columns;
# newer pandas versions complain when strings are written into float NaN columns)
df_f = pd.DataFrame({'product':product})
df_f['1_vendor'] = None
df_f['1_price'] = np.nan
df_f['1_url'] = None
df_f['2_vendor'] = None
df_f['2_price'] = np.nan
df_f['2_url'] = None
df_f['3_vendor'] = None
df_f['3_price'] = np.nan
df_f['3_url'] = None

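Now we can use a simple for loop to go through the original df row by row and pull out the results. For each row, the vendor name stored in N_vendor (e.g. 'shop3') tells us which price_ and url_ columns to read from.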

# For loop to fill in the final dataframe
for i in list(df.index.values):
    df_f.loc[i, '1_vendor'] = df.loc[i,'1_vendor']
    df_f.loc[i, '2_vendor'] = df.loc[i,'2_vendor']
    df_f.loc[i, '3_vendor'] = df.loc[i,'3_vendor']
    df_f.loc[i, '1_price'] = df.loc[i, 'price_'+df_f.loc[i,'1_vendor']]
    df_f.loc[i, '2_price'] = df.loc[i, 'price_'+df_f.loc[i,'2_vendor']]
    df_f.loc[i, '3_price'] = df.loc[i, 'price_'+df_f.loc[i,'3_vendor']]
    df_f.loc[i, '1_url'] = df.loc[i, 'url_'+df_f.loc[i,'1_vendor']]
    df_f.loc[i, '2_url'] = df.loc[i, 'url_'+df_f.loc[i,'2_vendor']]
    df_f.loc[i, '3_url'] = df.loc[i, 'url_'+df_f.loc[i,'3_vendor']]
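
If you have more than three vendors, the same idea can be written without repeating a line for every column. This is only a sketch of an alternative, not something from the question: the helper name regroup and its ranks parameter are made up for illustration, and it assumes the columns follow the N_vendor / price_shopX / url_shopX naming used above.

```python
import pandas as pd

# Same sample data as above, repeated so this block runs on its own
df = pd.DataFrame({
    'product': ['blue', 'pink', 'cyan'],
    '1_vendor': ['shop1', 'shop3', 'shop1'],
    '2_vendor': ['shop3', 'shop2', 'shop2'],
    '3_vendor': ['shop2', 'shop1', 'shop3'],
    'price_shop1': [500, 700, 0],
    'price_shop2': [600, 650, 200],
    'price_shop3': [550, 600, 300],
    'url_shop1': ['1.com/blue', '1.com/pink', '1.com/cyan'],
    'url_shop2': ['2.com/blue', '2.com/pink', '2.com/cyan'],
    'url_shop3': ['3.com/blue', '3.com/pink', '3.com/cyan'],
})

def regroup(df, ranks=('1', '2', '3')):
    """Hypothetical helper: build one output row per product, grouped by rank."""
    rows = []
    for _, row in df.iterrows():
        out = {'product': row['product']}
        for r in ranks:
            shop = row[r + '_vendor']                 # e.g. 'shop3'
            out[r + '_vendor'] = shop
            out[r + '_price'] = row['price_' + shop]  # price of that shop
            out[r + '_url'] = row['url_' + shop]      # url of that shop
        rows.append(out)
    # Column order follows the order in which the keys were added above
    return pd.DataFrame(rows)

df_alt = regroup(df)
print(df_alt)
```

Like the loop above, this still goes row by row, so it is fine for small tables but will be slow on very large ones.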

EDIT: For the export, just use the to_csv method of the final dataframe; if you have issues, let me know.
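
A minimal sketch of the export, assuming you do not want the row index in the file. The tiny dataframe here is just a stand-in for the final one, and the in-memory buffer is only there so the example runs without touching the disk; in practice you would pass a file name of your choice (for example 'result.csv', which is a made-up name).

```python
import io
import pandas as pd

# Stand-in for the final dataframe (first vendor group only)
df_f = pd.DataFrame({
    'product': ['blue', 'pink'],
    '1_vendor': ['shop1', 'shop3'],
    '1_price': [500, 600],
    '1_url': ['1.com/blue', '3.com/pink'],
})

# index=False leaves out the 0, 1, 2... row labels
buffer = io.StringIO()
df_f.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# To write a real file instead: df_f.to_csv('result.csv', index=False)
```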

Ok, that should be it.
If I didn't get the question or you have any questions, let me know.
Good luck!

(if it is the correct answer please mark it, thanks)
