0

I have a function that extracts email addresses from a large text file and prints them as follows:

[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected], [email protected]
[email protected]
[email protected], [email protected]
[email protected]
[email protected], [email protected]

I need to convert this output into a comma-delimited array of email addresses that I can iterate over in a for loop. I would also like to remove duplicates.

I've tried a few variations of sed but haven't been able to get what I want. Any tips would be brilliant.

  • Please add your desired output (no description) for that sample input to your question (no comment). Commented Jul 13, 2020 at 10:48
  • Please show what you tried. Commented Jul 13, 2020 at 10:49
  • "Array" is a data structure, "comma delimited" describes a string. Data structures aren't delimited, they're abstract. Commented Jul 13, 2020 at 10:54
  • If you want help with your problem then please show your attempt in solving this. Commented Jul 13, 2020 at 20:41

2 Answers

1

Here is a quick and dirty awk command that will do this for you:

awk 'BEGIN{FS="[[:blank:],]+"; OFS=","}{for(i=1;i<=NF;++i) a[tolower($i)]}
     END{s=""; for(i in a) s=s (s?OFS:"") i; print s}' file

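This takes care of duplicate emails that differ only in capitalisation. It does not sort the list, and the output order is unspecified because for (i in a) visits the keys in an implementation-defined order.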

If you want to keep the addresses in the order in which they first appear, I would do this:

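awk 'BEGIN{FS="[[:blank:],]+"; OFS=","}
     { for (i=1;i<=NF;++i) {
         e=tolower($i)
         # print each address only the first time it is seen
         if (!(e in a)) { printf "%s%s", (p?OFS:""), e; a[e]; p=1 }
     }}
     # finish the output line with a newline
     END{ if (p) print "" }' file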
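
To iterate over the result in a for loop, as the question asks, the comma-separated string can be split back into separate words in the shell. The following is a minimal sketch rather than a definitive implementation: the example.com addresses are made-up sample data, the names csv, seen, out and email are arbitrary, and it assumes no address contains a comma or whitespace.

```shell
#!/bin/sh
# Made-up sample input standing in for the function's output.
csv=$(printf '%s\n' 'Alice@example.com' \
                    'bob@example.com, alice@example.com' \
                    'carol@example.com' |
  awk 'BEGIN { FS = "[[:blank:],]+" }
       { for (i = 1; i <= NF; ++i) {
           e = tolower($i)
           # keep only the first occurrence of each non-empty address
           if (e != "" && !(e in seen)) { seen[e]; out = out (out == "" ? "" : ",") e }
       } }
       END { print out }')

# Split on commas only, with globbing disabled so no character is expanded.
old_ifs=$IFS
IFS=,
set -f
set -- $csv
set +f
IFS=$old_ifs

for email in "$@"; do
  printf 'processing %s\n' "$email"
done
```

In bash, IFS=, read -r -a emails <<< "$csv" would load the same string into a real array instead of the positional parameters.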

0

Have you tried simply replacing each '\n' with ','? Running sort -u before the sed removes duplicate lines.

I copied your email list and stuck it into email.txt

sort -u email.txt -o email.txt && sed -i ':a;N;$!ba;s/\n/,/g' email.txt

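Here is the content of the file after running the above command. Lines that held two addresses are kept as single entries, hence the ", " in places, because sort -u compares whole lines: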

[email protected],[email protected],[email protected], [email protected],[email protected],[email protected], [email protected],[email protected]
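
The sed expression above is terse, so here is the same loop spelled out with one command per -e, a form that also works with sed implementations that do not accept a semicolon after a label. This is only an illustrative sketch on made-up example.com addresses, and the variable name joined is arbitrary.

```shell
#!/bin/sh
joined=$(printf '%s\n' 'a@example.com' 'b@example.com' 'c@example.com' |
  sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g')
#   :a        label marking the start of the loop
#   N         append the next input line to the pattern space
#   $!ba      unless this is the last line, jump back to :a
#   s/\n/,/g  replace every embedded newline with a comma
printf '%s\n' "$joined"
```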

Using tr should work as well, although it also turns the final newline into a trailing comma:

tr '\n' ',' < email.txt > csv.out
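
Because sort -u compares whole lines, a line holding two addresses survives as a single entry, and tr '\n' ',' leaves a trailing comma. One possible variation, sketched here on made-up example.com addresses, puts every address on its own line first, lowercases it, and joins the result with paste. Note that it sorts the list rather than keeping the original order, and it assumes addresses contain no commas or whitespace.

```shell
#!/bin/sh
csv=$(printf '%s\n' 'Alice@example.com' \
                    'bob@example.com, alice@example.com' \
                    'carol@example.com' |
  tr -s ', \t' '\n\n\n' |          # commas, spaces and tabs become single newlines
  tr '[:upper:]' '[:lower:]' |     # make the comparison case-insensitive
  sort -u |                        # keep one copy of each address
  paste -sd, -)                    # join the lines with commas, no trailing comma
printf '%s\n' "$csv"
```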
