I'm trying to learn bash scripting. As an exercise, I'm getting the Alt text and URL of the Google doodle.

I am stuck on using perl to parse out the link URL. I have it finding and outputting the alt text and URL, but it is also outputting the whole web page. It does the same thing when I run the command directly in the shell.

curl -s google.com --location | perl -pe 's|.*<img.*alt="(.*?)".*src="(.*?)".*>.*|\1 http://google.com\2|'

How can I get this to stop outputting the whole web page?

Note that I tried running the two commands separately to make sure it was perl outputting the page and not something to do with curl; it is definitely the perl part. If there is a better way to do this, let me know. The goal is to output the alt text and URL of the doodle.

1 Answer

This is an ugly way to do things, but it may work if you print only those lines from the web page where a substitution has been made:

perl -ne 'print if s|<img.*alt="(.*?)".*src="(.*?)".*>|$1 http://google.com$2|'
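
For example, fed from the curl command in the question (using the -L short flag, which is the same as --location), one possible way to wire it up is

curl -sL google.com | perl -ne 'print if s|<img.*alt="(.*?)".*src="(.*?)".*>|$1 http://google.com$2|'

The -n switch reads the input line by line but, unlike -p, prints nothing by default, so only the lines where the substitution succeeded make it to the output.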

But it would be cleaner to do just a regex match, using negated character classes instead of non-greedy quantifiers:

perl -ne 'print "$1 http://google.com$2\n" if /<img[^<>]+alt="([^"]+)"[^<>]+src="([^"]+)"/'

But both of these rely on (amongst other things) all of the contents of the opening <img> tag appearing on a single line, which isn't necessarily true. They will also report the contents of every <img> element in the page that has both an alt and a src attribute.
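
If the tag does span several lines, one possible workaround that keeps the same regex is to slurp the whole page into a single string with -0777; the [^<>] classes already match newlines, so the pattern still works across line breaks, and a while loop reports every match:

curl -sL google.com | perl -0777 -ne 'print "$1 http://google.com$2\n" while /<img[^<>]+alt="([^"]+)"[^<>]+src="([^"]+)"/g'

For anything beyond an exercise like this, a real HTML parser (for example Perl's Mojo::DOM or HTML::TreeBuilder modules, if they are installed) is the safer choice.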
