1

I want to split a file according to its content.
My dummy file looks like this:

info   info    info    cat
info   info    info    cow
info   info    info    dog
info   info    info    dinosaur 
info   info    info    bat

The words in column 4 ($4) start with different letters (c, d, b). I want to split the file into multiple files according to the first letter of $4.
The preferred output (3 different files) looks like this:

file_c

info   info    info    cat  
info   info    info    cow

file_d

info   info    info    dog
info   info    info    dinosaur 

file_b

info   info    info    bat

Hope someone can help me with this.

1

4 Answers

5

This one-liner should work:

awk '{print $0 > "file_"substr($4,0,1)}' yourfile

2 Comments

Thanks it works! Though I had to replace substr($4,0,1) with substr($4,1,1) - don't know why.
Because some awk implementations silently treat a start index of 0 as 1, but not every awk does. String positions in awk start at 1, so substr($4,1,1) is the portable form.
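To make the point from the comments concrete, here is a minimal sketch of the same one-liner with the 1-based index. The scratch directory and the file name dummy.txt are assumptions for illustration only. On awk implementations that do not map a start index of 0 to 1, substr($4,0,1) can return an empty string, which would send every line to a single file named file_.

```shell
# Scratch directory (assumed setup) so the file_* outputs are easy to inspect.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the sample input from the question.
cat > dummy.txt <<'EOF'
info   info    info    cat
info   info    info    cow
info   info    info    dog
info   info    info    dinosaur
info   info    info    bat
EOF

# awk string positions start at 1, so substr($4, 1, 1) is the first character of field 4.
# Building the name in a variable first keeps the redirection target unambiguous.
awk '{ out = "file_" substr($4, 1, 1); print > out }' dummy.txt
```

Within a single awk run, `>` truncates each output file the first time it is opened and appends after that, so running the command again does not duplicate lines.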
3
awk '{name="file_"substr($4,0,1);print >name}' your_file

Tested below:

> cat temp
info   info    info    cat
info   info    info    cow
info   info    info    dog
info   info    info    dinosaur 
info   info    info    bat
> awk '{name="file_"substr($4,0,1);print >name}' temp
> cat file_b
info   info    info    bat
> cat file_c
info   info    info    cat
info   info    info    cow
> cat file_d
info   info    info    dog
info   info    info    dinosaur 

1 Comment

Thanks it works! Though I had to replace substr($4,0,1) with substr($4,1,1) - don't know why.
2
$ while read a b c d; do echo $a $b $c $d >> file_${d:0:1}; done < dummy.txt 

2 Comments

Something went wrong. There was only one file, file_, and its content was "info info info bat". No file for a or d.
@Poe Strange, it works for me. I tried to change spaces to tabs in the input, but it's still working for me. Are you sure you're on bash? Are there backslashes in the real file?
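In case the shell or backslashes are the problem, here is a sketch of the same loop that avoids the bash-only `${d:0:1}` substring syntax and uses `read -r`. It assumes the input is in dummy.txt and runs in a scratch directory; the variable name `first` is made up for the example. Like the original loop, it collapses the column spacing to single spaces.

```shell
# Scratch directory (assumed setup): >> appends, so start from a clean location.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the sample input from the question.
cat > dummy.txt <<'EOF'
info   info    info    cat
info   info    info    cow
info   info    info    dog
info   info    info    dinosaur
info   info    info    bat
EOF

# read -r keeps backslashes literal; d receives the fourth field and anything after it.
while read -r a b c d; do
    # Skip blank or short lines that have no fourth field.
    [ -n "$d" ] || continue
    # POSIX parameter expansion: remove everything after the first character of d.
    first=${d%"${d#?}"}
    printf '%s %s %s %s\n' "$a" "$b" "$c" "$d" >> "file_$first"
done < dummy.txt
```

Because the loop appends, running it twice in the same directory doubles the contents of each file_* output; remove the old outputs first if that matters.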
0

Using Python

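with open("temp.txt", "r") as f:
    for line in f:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or short lines
        # Name the output after the first letter of column 4.
        filename = "file_" + fields[3][0]
        # Append mode: delete old file_* outputs before re-running.
        with open(filename, "a") as f2:
            f2.write(line)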
