Python noob here. I'm trying to print the lines of an HTML file that contain a given substring. I know the string is in the file because I can find it with Ctrl+F when I open the HTML. However, when I run my code, it doesn't print the desired result. Could someone explain what I'm doing wrong?

import requests
import datetime


from BeautifulSoup import BeautifulSoup

now = datetime.datetime.now()

cmonth = now.month
cday = now.day
cyear = now.year
find = 'boxscores/201'


url = 'http://www.basketball-reference.com/boxscores/index.cgi?lid=header_dateoutput&month={0}&day=17&year={2}'.format(cmonth,cday,cyear)
response = requests.get(url)
html = response.content
print html

for line in html:
    if find in line:
        print line

2 Answers

As snakecharmerb said, when html is a string,

for line in html:

iterates over its characters, not its lines. To iterate over the lines, split the string first:

for line in html.split("\n"):

Each item is then a whole line, so the substring test can match.
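
A minimal sketch of the line-based search, assuming a short made-up snippet (sample_html is hypothetical and stands in for the downloaded page) so it runs without a network request. It uses splitlines() rather than split("\n") only because splitlines() also copes with "\r\n" line endings; either works for this purpose.

```python
# Hypothetical stand-in for the downloaded page; not the real site's HTML.
sample_html = (
    '<html>\n'
    '<a href="/boxscores/201501170ATL.html">Final</a>\n'
    '<p>No link here</p>\n'
    '<a href="/boxscores/201501170BOS.html">Final</a>\n'
    '</html>'
)

find = 'boxscores/201'

# splitlines() yields whole lines, so the substring test can succeed.
matching_lines = [line for line in sample_html.splitlines() if find in line]

for line in matching_lines:
    print(line)
```

With this sample input, only the two anchor lines are printed.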

1 Comment

In addition to both answers: response.content holds the raw bytes of the response. Use response.text to get a decoded string, then call .split("\n") on it.
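
To illustrate the bytes-versus-text point without a network call, here is a rough sketch that uses a hard-coded bytes value (raw, an assumption) in place of response.content. Decoding it gives a text string, which is the kind of value response.text returns, and that string can be split into lines and searched with an ordinary string pattern.

```python
# Hypothetical bytes payload standing in for response.content.
raw = b'<a href="/boxscores/201501170ATL.html">Final</a>\n<p>other</p>\n'

find = 'boxscores/201'

# Decode to text first; UTF-8 is an assumption made for this sample.
# With requests, response.text does the decoding for you.
text = raw.decode('utf-8')

# Keep only the lines that contain the substring.
matches = [line for line in text.splitlines() if find in line]
print(matches)
```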

In the requests package, response.content holds the whole response body as a single byte string (a plain str on Python 2, which the question uses), so you can search it directly:

if find in html:
    # do something

By iterating over response.content with

for line in html:

you are iterating over the individual characters in the string, not over lines.
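
A small sketch of the difference, assuming a short made-up text string in place of the real page body: looping over the string yields single characters, and no single character can contain a multi-character substring, while a test against the whole string succeeds.

```python
# Hypothetical short page body; the real page is far longer.
html = 'line one\nboxscores/2015\n'
find = 'boxscores/201'

# Iterating a text string yields one character at a time.
pieces = list(html)
print(pieces[:4])  # ['l', 'i', 'n', 'e']

# A single character never contains the 13-character substring.
found_by_char = any(find in piece for piece in html)
print(found_by_char)  # False

# Testing the whole string finds it.
found_in_whole = find in html
print(found_in_whole)  # True
```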
