I have the following code working:

import html
import re

# Read the saved page source.
with open('html.txt') as f:
    source = f.read()

# Capture every double-quoted http(s) URL.
links = re.findall(r'"(https?://.*?)"', source)
for url in links:
    # Decode HTML entities such as &#038; back to & so the URL works.
    print(html.unescape(url))

Sample from the HTML text file:

<td class="download-file" data-title="Download">
      <a href="https://URL.com/?download_file=259&#038;order=wc_order_xBxDxBxD&#038;emailtestmail%40gmail.com&#038;key=1234-1234-1234-1234-12345678" class="woocommerce-MyAccount-downloads-file button alt">
    INSTRUCTION</a>                 

</td>

Problem:

There are a couple of these links in the html.txt file I created.

I also have a list of strings that match the link text, for example: [Instruction, File2, File3, etc...]

Now I would like to match the strings in the list with the corresponding URLs in my .txt file.

Basically, I want to create a second list that holds the URLs for the matching strings.

The order of that list is not important. I just want to make sure each string in my list [Instruction, File2, File3, etc...] finds its matching URL in the text file.

I have struggled with this a lot and can't find a solution, so I would really appreciate your help.
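
To make the goal concrete, here is a minimal sketch of one way the matching could work with the regex approach already in use. It assumes every link looks like the sample above (a double-quoted href followed by the link text) and that names should match regardless of case, since the list has 'Instruction' while the page shows INSTRUCTION. The SAMPLE string, the second link, and the helper name map_names_to_urls are made up for illustration.

```python
import html
import re

# Hypothetical sample standing in for the contents of html.txt.
SAMPLE = '''
<td class="download-file" data-title="Download">
      <a href="https://URL.com/?download_file=259&#038;order=wc_order_xBxDxBxD&#038;key=1234" class="button alt">
    INSTRUCTION</a>
</td>
<td class="download-file" data-title="Download">
      <a href="https://URL.com/?download_file=260&#038;key=5678" class="button alt">
    FILE2</a>
</td>
'''

# Capture the href value and the visible text of each <a> element.
LINK_RE = re.compile(
    r'<a\s[^>]*?href="(https?://[^"]*)"[^>]*>(.*?)</a>',
    re.DOTALL | re.IGNORECASE,
)


def map_names_to_urls(source, names):
    """Return {name: url} for every name whose link text matches, ignoring case."""
    by_text = {}
    for href, text in LINK_RE.findall(source):
        # html.unescape turns &#038; back into & so the URL is usable.
        by_text[text.strip().casefold()] = html.unescape(href)
    return {
        name: by_text[name.casefold()]
        for name in names
        if name.casefold() in by_text
    }


names = ['Instruction', 'File2', 'File3']
matches = map_names_to_urls(SAMPLE, names)
urls = list(matches.values())  # the "second list" of matching URLs
print(matches)
```

Names without a matching link (File3 in this sample) are simply left out of the result, so comparing the keys of matches against names shows what is still missing.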

  • My list looks like this: ['Instruction', 'File2', 'File3', ...] Commented May 8, 2020 at 15:32

1 Answer

Consider using the BeautifulSoup library to parse the HTML instead of regular expressions. Note also that the content you are parsing is HTML, even though it is saved in a .txt file. (Unfortunately, I do not have enough reputation to post this as a comment.)
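
As a rough sketch of what that could look like, the snippet below parses the saved source with BeautifulSoup (the third-party bs4 package, using Python's built-in "html.parser" backend) and looks up each name by its link text. It assumes the link text matches the names apart from letter case. The SAMPLE string, the second link, and the helper name links_by_text are made up for illustration; in practice the source would come from reading html.txt.

```python
from bs4 import BeautifulSoup

# Hypothetical sample; in practice: source = open('html.txt').read()
SAMPLE = '''
<td class="download-file" data-title="Download">
      <a href="https://URL.com/?download_file=259&#038;order=wc_order_xBxDxBxD&#038;key=1234" class="button alt">
    INSTRUCTION</a>
</td>
<td class="download-file" data-title="Download">
      <a href="https://URL.com/?download_file=260&#038;key=5678" class="button alt">
    FILE2</a>
</td>
'''


def links_by_text(source):
    """Map each link's case-folded text to its href."""
    soup = BeautifulSoup(source, "html.parser")
    # The parser decodes entities in attributes, so &#038; arrives as &.
    return {
        a.get_text(strip=True).casefold(): a["href"]
        for a in soup.find_all("a", href=True)
    }


wanted = ['Instruction', 'File2', 'File3']
found = links_by_text(SAMPLE)

# URLs for the names that have a link, plus the names that do not.
urls = [found[name.casefold()] for name in wanted if name.casefold() in found]
missing = [name for name in wanted if name.casefold() not in found]
print(urls)
print(missing)
```

Because the parser handles the entity decoding, no manual replace on "#038;" is needed here.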

1 Comment

Hey there, I was initially using BeautifulSoup, but I couldn't get the HTML source after logging in. I really tried, but it didn't work out, which is why I just copy-pasted the source into the .txt file. Parsing a .txt file is intentional in this case.
