I have a TSV file with ten or more columns, but the ones that matter to me are user_name, shift_id, and url_id. I want to create a data frame that first splits the entire file by user_name, i.e. only rows with the same user_name are grouped together. From each of those chunks I want to make further chunks in which only rows with the same shift_id are grouped together, and then from each of those, chunks with the same url_id. Unfortunately I cannot share the data because of company rules, and making up an imaginary data table might be more confusing.
Two of the other columns hold timestamps. I want to calculate the time duration of each chunk, but only after the rows have been grouped by user_name, shift_id, and url_id.
I have seen answers that split a data frame by a specific column value, but in my case I have three columns, and the order in which the data is split by them matters too.
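For illustration, the nested grouping described above could be sketched with pandas roughly as follows. This is only a sketch under assumptions: the sample rows are made up, and the timestamp column names `start_time` and `end_time` are hypothetical, since the question does not name them. Passing the three keys to `groupby` as a list, in order, groups by user_name first, then shift_id within each user, then url_id within each shift.

```python
import io
import pandas as pd

# Made-up sample rows; the real file is not shown in the question.
# `start_time` and `end_time` are hypothetical names for the two timestamp columns.
SAMPLE_TSV = (
    "user_name\tshift_id\turl_id\tstart_time\tend_time\n"
    "alice\t1\t10\t2021-01-01 09:00:00\t2021-01-01 09:05:00\n"
    "alice\t1\t10\t2021-01-01 09:10:00\t2021-01-01 09:20:00\n"
    "alice\t2\t10\t2021-01-01 13:00:00\t2021-01-01 13:30:00\n"
    "bob\t1\t11\t2021-01-01 09:00:00\t2021-01-01 09:01:00\n"
)

# sep="\t" reads tab-separated data; parse_dates converts the timestamp columns.
df = pd.read_csv(
    io.StringIO(SAMPLE_TSV), sep="\t", parse_dates=["start_time", "end_time"]
)

# The order of the keys is the order of the nesting.
keys = ["user_name", "shift_id", "url_id"]

# One row per (user_name, shift_id, url_id) chunk, with its earliest start
# and latest end; sort=False keeps chunks in order of first appearance.
durations = (
    df.groupby(keys, sort=False)
    .agg(first_start=("start_time", "min"), last_end=("end_time", "max"))
    .reset_index()
)
durations["duration"] = durations["last_end"] - durations["first_start"]

# The chunks themselves are also available as separate data frames.
chunks = {key: chunk for key, chunk in df.groupby(keys, sort=False)}
```

Here the duration of a chunk is taken to be the latest end minus the earliest start within that chunk; if a different definition is intended (for example, the sum of per-row durations), the aggregation step would need to change accordingly.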
Thank you for your help!
dataframe that selects only 3 columns that are important?
dataframe with rows where username = x, col2 = y and col3 = z?
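The two readings asked about here differ in pandas, and a small sketch may make the distinction concrete. The data below is made up, `other_col` is a hypothetical stand-in for the remaining columns, and mapping col2 and col3 to shift_id and url_id is an assumption.

```python
import pandas as pd

# Made-up rows; `other_col` stands in for the columns that are not of interest.
df = pd.DataFrame(
    {
        "user_name": ["alice", "alice", "bob"],
        "shift_id": [1, 1, 1],
        "url_id": [10, 10, 11],
        "other_col": ["a", "b", "c"],
    }
)

# Reading 1: keep every row, but only the three important columns.
only_three = df[["user_name", "shift_id", "url_id"]]

# Reading 2: keep every column, but only rows matching specific values.
# The values "alice", 1 and 10 are placeholders for x, y and z.
matching = df[
    (df["user_name"] == "alice") & (df["shift_id"] == 1) & (df["url_id"] == 10)
]
```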