I have three CSV files: the first has 1M records with columns cust_id, fname, lname; the second has 2M records with columns cust_id, prod_id, price, date; the third has 5M records with columns prod_id, prod_code, price, quantity.
I want to select the details of 10 customers from these three files and write the results to three new CSV files. That is, for each of the 10 customers: take cust_id, fname, lname from file 1 and put the result in one new CSV; take the matching cust_id, prod_id, price, date rows from file 2 and put them in a second new CSV; and take the matching prod_id, prod_code, price, quantity rows from file 3 and put them in a third new CSV.
code:
import pandas as pd

# The 10 customers whose details we want
customers = pd.read_csv("customers10.csv")

customer_details = pd.read_csv("file1.csv")   # cust_id, fname, lname
products = pd.read_csv("file2.csv")           # cust_id, prod_id, price, date
product_items = pd.read_csv("file3.csv")      # prod_id, prod_code, price, quantity

# Keep only rows for the selected customers
table1 = customer_details[customer_details['cust_id'].isin(customers['cust_id'])]
table2 = products[products['cust_id'].isin(customers['cust_id'])]

# Keep only the products that appear in those customers' orders
table3 = product_items[product_items['prod_id'].isin(table2['prod_id'])]

table1.to_csv("output1.csv", index=False)
table2.to_csv("output2.csv", index=False)
table3.to_csv("output3.csv", index=False)
I want to run this on files with millions of records. Is this approach efficient, or is there a better way?
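One way to bound memory on files this size is to stream each input with `read_csv(..., chunksize=...)` and filter chunk by chunk, appending matches to the output file, so only one chunk is ever in memory. A minimal sketch of the idea for file 1 (the tiny sample-building lines and the `out1.csv` name are illustrative, not part of the original setup):

```python
import pandas as pd

# Illustrative sample data so the sketch runs end-to-end;
# with real files these two lines would not be needed.
pd.DataFrame({"cust_id": [1, 2]}).to_csv("customers10.csv", index=False)
pd.DataFrame({"cust_id": [1, 2, 3, 4],
              "fname": ["a", "b", "c", "d"],
              "lname": ["w", "x", "y", "z"]}).to_csv("file1.csv", index=False)

# The 10 target ids fit easily in memory; a set makes isin lookups fast
wanted = set(pd.read_csv("customers10.csv")["cust_id"])

# Stream file1 in chunks: only `chunksize` rows are held at once.
# Matching rows are appended to the output as each chunk is filtered.
first = True
for chunk in pd.read_csv("file1.csv", chunksize=100_000):
    match = chunk[chunk["cust_id"].isin(wanted)]
    if match.empty:
        continue
    match.to_csv("out1.csv", mode="w" if first else "a",
                 header=first, index=False)
    first = False
```

The same loop works for files 2 and 3 (for file 3, first collect the selected customers' prod_ids from the file 2 pass, then filter on those). For repeated queries like this, loading the data into a database with indexed columns would likely beat rescanning CSVs each time.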