
Do you know why I am getting "ID ÐоÑРееÑÑÑа" instead of "ID ГосРеестра"? I know it is an encoding issue, because the text is Cyrillic, but I have no idea how to solve it.

The page I am scraping is the one assigned to dfo_url in the code below.

My code is:

import requests
from lxml import html

dfo_url = "https://opi.dfo.kz/p/ru/DfoObjects/objects/teaser-view/26730?OptionName=ExtraData"
r = requests.get(dfo_url)

tree = html.fromstring(r.content)
tr_elements = tree.xpath('//tr')
#Create empty list
col=[]
i=0
#For each row, store each first element (header) and an empty list
for t in tr_elements[2]:
    i+=1
    name=t.text_content()

    print ('%d:"%s"'%(i,name))
    col.append((name,[]))

1 Answer


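This may fix it. The garbled text looks like UTF-8 bytes that were decoded as Latin-1, which can happen when lxml is given raw bytes (r.content) and guesses the charset wrong. Note that calling name.encode(encoding='UTF-8', errors='strict') right before the print would not help by itself, since it returns new bytes and leaves name unchanged. Assuming the page is served as UTF-8, try decoding the response before parsing instead:

tree = html.fromstring(r.content.decode('utf-8'))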

Or try this link.
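
To see where output like this comes from, the effect can be reproduced offline with only the standard library. This is a minimal sketch, assuming the garbling is UTF-8 bytes decoded as Latin-1; the helper names to_mojibake and repair_mojibake are made up for illustration and are not part of requests or lxml.

```python
def to_mojibake(text: str) -> str:
    """Simulate the bug: encode as UTF-8, then wrongly decode as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair_mojibake(garbled: str) -> str:
    """Reverse the wrong step: recover the raw bytes, then decode as UTF-8.

    Assumption: the wrong codec was Latin-1. If it was something else
    (for example cp1252), this round trip may raise or give wrong text.
    """
    return garbled.encode("latin-1").decode("utf-8")


original = "ID ГосРеестра"
garbled = to_mojibake(original)      # looks like the output in the question
repaired = repair_mojibake(garbled)  # back to readable Cyrillic

print(repr(garbled))
print(repaired)
```

Repairing strings after the fact is a fallback. Decoding the response with the right charset before it reaches the parser avoids the problem at its source.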


2 Comments

@Dias take a look at this
No problem. If you want, you can accept my answer, and I will update it with the link :) @Dias
