I have this file that is constantly gathering data from website visitors:

IP-ADDR : DATE : BITCOIN-ADDR

I was wondering if there is a way to find lines that have the same IP-ADDR but different BITCOIN-ADDR and print them.

For example, running the script on this file:

11.11.11.11 : 19-04-2017 08:01:33am  : 3N1zXzkjYYNcUSZHD98wcG7UXjNxkCXXXX
22.22.22.22 : 19-04-2017 08:01:35am  : 1HSJDWp5gLybnhowBZcnoYTBBmuJxBXXXX
12.12.12.12 : 19-04-2017 08:02:24am  : 1HSJDWp5gLybnhowBZcnoYTBBmuJxBYYYY

every line is different, so no output is printed.

Also, it is very important that running on

11.11.11.11 : 19-04-2017 08:01:33am  : 3N1zXzkjYYNcUSZHD98wcG7UXjNxkCXXXX
22.22.22.22 : 19-04-2017 08:01:35am  : 1HSJDWp5gLybnhowBZcnoYTBBmuJxBXXXX
22.22.22.22 : 19-04-2017 08:02:24am  : 1HSJDWp5gLybnhowBZcnoYTBBmuJxBXXXX
22.22.22.22 : 19-04-2017 08:01:35am  : 1HSJDWp5gLybnhowBZcnoYTBBmuJxBXXXX
22.22.22.22 : 19-04-2017 08:02:24am  : 1HSJDWp5gLybnhowBZcnoYTBBmuJxBXXXX

won't print anything.

BUT, running on

11.11.11.11 : 19-04-2017 08:01:33am  : 3N1zXzkjYYNcUSZHD98wcG7UXjNxkCXXXX
22.22.22.22 : 19-04-2017 08:01:35am  : 1HSJDWp5gLybnhowBZcnoYTBBmuJxBXXXX
22.22.22.22 : 19-04-2017 08:02:24am  : 1HSJDWp5gLybnhowBZcnoYTBBmuJxBYYYY

will see that IP 22.22.22.22 has a different bitcoin address and will print:

1HSJDWp5gLybnhowBZcnoYTBBmuJxBXXXX
1HSJDWp5gLybnhowBZcnoYTBBmuJxBYYYY

I'm using some code that someone here helped me with a while ago:

awk -F " : " '{ printf "%s_%s\n" , $1, $3 }' test.txt | sort | sed 's/\(\s*\)\(.*\)\(\s\)/\2/' | uniq | perl -pe 's/(\s*)(.*?)_(.*)/\2/' | uniq -d

which, if run on the last example, will print

22.22.22.22

but I can't wrap my head around it to make it work for bitcoin addresses.

Here are three more examples:

1.1.1.1 : 19-04-2017 08:01:33am : aaaaa
2.2.2.2 : 19-04-2017 08:01:33am : bbbbb

3.3.3.3 : 19-04-2017 08:01:33am : ccccc
3.3.3.3 : 19-04-2017 08:01:33am : ccccc

4.4.4.4 : 19-04-2017 08:01:33am : ddddd
4.4.4.4 : 19-04-2017 08:01:33am : eeeee

In the first example, every IP and BTC address is different, so I don't mind.

In the second example, the IP is the same but so is the BTC address. I don't mind that either: it's just an honest returning visitor using the same BTC address over and over, and I don't want the script to show that.

In the third example, a visitor is abusing the rules by using different BTC addresses from the same IP address. Using the script I posted above, I am able to print his IP and, through another script, add it to an iptables firewall. But I need another script (the one I'm asking for help with here) to print the following output:

ddddd
eeeee

That way I can use another script to block his access.

Some help, please? Thanks!

LE: Found the solution (thanks to @danielbmartin):

awk '{if (index(a[$1],$NF)==0) a[$1]=a[$1]" " $NF}   # append each address not yet listed for this IP
  END{for (j in a)
  {n=split(a[j],b);                                  # n = number of addresses collected for IP j
   if (n>1) print j" references "a[j]}}' \
"$InFile" >"$OutFile"

1 Answer

$ cat ip.txt 
1.1.1.1 : 19-04-2017 08:01:33am : aaaaa
2.2.2.2 : 19-04-2017 08:01:33am : bbbbb

3.3.3.3 : 19-04-2017 08:01:33am : ccccc
3.3.3.3 : 19-04-2017 08:01:33am : ccccc

4.4.4.4 : 19-04-2017 08:01:33am : ddddd
4.4.4.4 : 19-04-2017 08:01:33am : eeeee

$ awk -F: '($1 in a) && a[$1]!=$NF{print $1} {a[$1]=$NF}' ip.txt 
4.4.4.4 
  • -F: uses : as the field separator
  • {a[$1]=$NF} builds an array with the first column as the key and the last column as the value
  • ($1 in a) && a[$1]!=$NF is true if the first column is already present as a key but its stored value doesn't match
    • print $1 prints the first column
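
Because the timestamp itself contains colons, -F: splits each line into more than three fields; the one-liner above still works because it only looks at $1 and $NF. The following is a small sketch for inspecting that split. The function name show_fields is hypothetical and only used for illustration.

```shell
#!/bin/sh
# show_fields (hypothetical helper): for each input line, print the number of
# fields awk sees with -F:, then the first and last field in brackets so that
# leading and trailing spaces are visible.
show_fields() {
    awk -F: '{ printf "%d [%s] [%s]\n", NF, $1, $NF }'
}
```

Running `printf '%s\n' '4.4.4.4 : 19-04-2017 08:01:33am : ddddd' | show_fields` prints `5 [4.4.4.4 ] [ ddddd]`: five fields, with a trailing space on the IP and a leading space on the address. That is why the outputs in this answer carry stray spaces.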


To print the last column instead:

$ awk -F: '($1 in a) && a[$1]!=$NF{print a[$1]"\n"$NF} {a[$1]=$NF}' ip.txt 
 ddddd
 eeeee

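Note: this code compares each line only with the most recent address seen for that IP, so it doesn't handle more than one mismatch cleanly. With three or more differing addresses for one IP, some addresses are printed more than once.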
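
One possible way around the one-mismatch limitation noted above is to remember every (IP, address) pair rather than only the last address. This is a minimal sketch, not the method from the answer: it assumes the " : " separator shown in the question, and the function name list_conflicting_addrs and the array names are made up for illustration.

```shell
#!/bin/sh
# list_conflicting_addrs (hypothetical helper): print each distinct address,
# one per line, for every IP that appears with more than one address.
# Reads the files given as arguments, or standard input if there are none.
list_conflicting_addrs() {
    awk -F ' : ' '
        NF < 3 { next }                        # skip blank or malformed lines
        {
            ip = $1; addr = $NF
            gsub(/^[ \t]+|[ \t]+$/, "", ip)    # trim stray whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", addr)
            if (!((ip, addr) in seen)) {       # first time this pair appears
                seen[ip, addr] = 1
                count[ip]++                    # distinct addresses for this IP
                list[ip] = list[ip] addr "\n"  # kept in first-seen order
            }
        }
        END {
            for (ip in count)
                if (count[ip] > 1)
                    printf "%s", list[ip]
        }
    ' "$@"
}
```

Called as `list_conflicting_addrs ip.txt`, this should print ddddd and eeeee for the sample file. Addresses for one IP come out in first-seen order, but when several IPs offend, the order of the groups is unspecified because awk does not define the iteration order of `for (ip in count)`.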

1 Comment

upvoted, well put together answer. any chance you could check out my recent question on iptables / networking? stackoverflow.com/questions/43508741/… your wisdom would be ever so helpful!
