1

I'm using curl to send a POST request from a Debian Linux terminal, and it works properly. This is the curl command:

curl --data "ping=8.8.8.8" -s http://www.ipvoid.com/ping/

Now I want to capture the content between the <textarea> tags by executing this command:

curl --data "ping=8.8.8.8" -s http://www.ipvoid.com/ping/ | grep -ioE '<textarea.*>(.*(\n.*)*)<\/textarea>' 

But it returns nothing. I tested the regex and it works properly:

regex101.com

Is the problem with the regex or grep syntax?

4
  • The regex works when the input is a single multiline string, but grep parses input line by line. Commented Apr 5, 2018 at 17:18
  • The xmllint answer is obviously the best, but the perl version is: perl -0 -ne 'print for /<textarea.*>([\s\S]*?)<\/textarea>/gi' (Your regex works too, if you make the inner group non-capturing with (?:).) Commented Apr 5, 2018 at 17:40
  • Using grep -Pzo "<textarea.*>(.*(\n.*)*)<\/textarea>" works, but the <textarea> tags are still in the output. Commented Apr 5, 2018 at 18:56
  • @zzxyz It worked perfectly with perl. Commented Apr 5, 2018 at 18:58

2 Answers 2

2

Since the result of the HTTP request is an HTML document, the right way is to use an XML/HTML parser.

xmllint is one such tool:

curl -d "ping=8.8.8.8" -s http://www.ipvoid.com/ping/ \
| xmllint --html --xpath '//textarea/text()' - 2>/dev/null

The output:

PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=61 time=1.12 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=61 time=1.05 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=61 time=1.14 ms

--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 1.052/1.107/1.144/0.039 ms

http://xmlsoft.org/xmllint.html


0

By default, grep processes its input one line at a time, and your textarea contains newlines, so your regex never matches. But you can (ab)use the --null-data option: grep then splits the input on NUL bytes instead of newlines, and since there are no NUL bytes in your textarea, it works!

curl --data "ping=8.8.8.8" -s http://www.ipvoid.com/ping/ | grep -ioE '<textarea.*>(.*(\n.*)*)<\/textarea>' --null-data

(But I recommend using a proper HTML parser instead. The xmllint approach recommended by @RomanPerekhrest is probably a better solution, if it's available to you.)

1 Comment

This returns the <textarea> tags too. I just want the content between the tags. The xmllint approach seems to work, but I want to do this with grep or perl, which is possible. It's already done with perl in @zzxyz's comment; now I'm trying with grep.
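
For a grep-only way to drop the tags, one possible sketch is to combine -z with PCRE's \K and a lookahead, so that only the text between the tags counts as the match. This assumes GNU grep built with -P support; the `sample` string is a made-up stand-in for the page's HTML, not the real response.

```shell
# Hypothetical sample standing in for the page's HTML (not the real response).
sample='<html><body><textarea rows="3">PING 8.8.8.8
64 bytes from 8.8.8.8: icmp_seq=1
</textarea></body></html>'

# -z    : treat NUL (not newline) as the record separator, so the input is one record
# (?s)  : let . match newlines in PCRE mode
# \K    : discard everything matched so far (the opening tag) from the output
# (?=..): require the closing tag without including it in the match
# tr    : strip the NUL terminator that -z adds after each match
content=$(printf '%s\n' "$sample" \
  | grep -Pzoi '(?s)<textarea[^>]*>\K.*?(?=</textarea>)' \
  | tr -d '\0')

printf '%s\n' "$content"
```

Against the real page, the printf line would be replaced by the curl command from the question.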
