
Below is the scenario:

Given a file that contains a log (timestamp, customer id, page id), please write a script to parse it and output the list of pages visited by each customer.

Input CSV File:

Time, Customer ID, Page ID

1, C1, P1
2, C2, P2
3, C3, P3
4, C2, P1
5, C2, P3
6, C2, P2
7, C1, P3
8, C1, P2
9, C3, P1
10, C2, P1
11, C2, P3
12, C2, P2
13, C1, P1
14, C1, P3
15, C1, P2

Example execution of the script: the customer ID must be passed as a parameter, for example ./script "C1"

Output:

P1, P3, P2, P1, P3, P2

So far, I have the following code to parse the CSV file:

Code:

INPUT=/filepath/customers.csv
CUSTOMER_NAME=$1
OLDIFS=$IFS
IFS=','
[ ! -f "$INPUT" ] && { echo "$INPUT file not found"; exit 99; }
while read -r f1 f2 f3
do
        echo "Time : $f1"
        echo "Customer ID : $f2"
        echo "Page_ID : $f3"
done < "$INPUT"
IFS=$OLDIFS

How can I write the logic to filter the data based on the customer input?

    Why don't you use awk for this? For example: awk -F', ' -vq=C1 '$2==q{printf "%s%s",sep,$3;sep=FS} END{print ""}' file. Alternatively, keep the pages in an array and join them in the END block. (Commented Nov 27, 2019 at 10:05)

1 Answer

Your script was not far from what you wanted.

Let's see what is missing:

  • You read the customer ID into $f2, but because IFS contains only the comma, the space between the comma and the customer name is stored in the variable as well. (Check it with echo "f2 is: \"$f2\"".)

    To remove the extra space, you can use tr: CNAME=$(echo "$f2" | tr -d ' \t') deletes spaces and tabs from f2 and stores the result in CNAME.

  • Once you have the customer name from the file, you can compare it with CUSTOMER_NAME.

  • For the output, you can accumulate the page IDs in a RESULT variable, inserting a comma between them.
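The leading space described in the first point can be reproduced in isolation. This is a minimal sketch that feeds one sample row from the question to read through a here-document; the row and the printf labels are only for illustration.

```sh
#!/bin/sh
# One sample row from the log; only the comma acts as a field separator here,
# so the blank after each comma stays in the field.
IFS=',' read -r f1 f2 f3 <<EOF
1, C1, P1
EOF

printf 'raw:     "%s"\n' "$f2"      # prints: raw:     " C1"

# Delete spaces and tabs from the field before comparing it.
CNAME=$(echo "$f2" | tr -d ' \t')
printf 'trimmed: "%s"\n' "$CNAME"   # prints: trimmed: "C1"
```

Because IFS is set only as a prefix to read, it is not changed for the rest of the script.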

So your script could look like:

#!/bin/sh
INPUT=customers.csv
CUSTOMER_NAME=$1
OLDIFS=$IFS
IFS=','
RESULT=""
[ ! -f "$INPUT" ] && { echo "$INPUT file not found"; exit 99; }
while read -r f1 f2 f3
do
    # Strip the blanks that follow each comma in the input
    CNAME=$(echo "$f2" | tr -d ' \t')
    PAGE=$(echo "$f3" | tr -d ' \t')
    if [ "$CNAME" = "$CUSTOMER_NAME" ]
    then
        if [ -z "$RESULT" ]
        then
            RESULT="$PAGE"
        else
            # Join with ", " to match the expected output format
            RESULT="$RESULT, $PAGE"
        fi
    fi
done < "$INPUT"
IFS=$OLDIFS
echo "$RESULT"

Note that this script won't work if a customer ID contains a space, because tr deletes every space in the field, not only the leading one.

You should consider using awk, as suggested in the comments.
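As a rough sketch of that awk approach, wrapped in a shell function: the function name pages_for_customer and its argument order are made up for this example, and it assumes the first line of the file is the header row shown in the question. Unlike the tr version, it trims blanks only at the ends of each field, so an ID with an inner space would still match.

```sh
#!/bin/sh
# pages_for_customer ID FILE
# Print the pages visited by customer ID, in log order, separated by ", ".
# (Hypothetical helper name; not part of the original question or answer.)
pages_for_customer() {
    awk -F ',' -v id="$1" '
        NR == 1 { next }                  # skip the header row (assumed present)
        {
            # trim leading and trailing blanks from every field
            for (i = 1; i <= NF; i++)
                gsub(/^[ \t]+|[ \t]+$/, "", $i)
        }
        $2 == id {
            printf "%s%s", sep, $3        # sep is empty before the first match
            sep = ", "
        }
        END { print "" }                  # terminate the output line
    ' "$2"
}
```

Inside a script it could be called as pages_for_customer "$1" customers.csv, which for the sample data and the argument C1 should print P1, P3, P2, P1, P3, P2.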
