I have an array of data and noticed that every item appears twice. Is there a way to remove the duplicates to simplify the array? Below is the Python code I wrote:

import requests
import re
import bs4

r = requests.get("http://as.com/tag/moto_gp/a/")

r.raise_for_status()

html = r.text


matches = re.findall(r"http://motor\.as\.com/motor/\d+/\d+/\d+/motociclismo/\d+_\d+\.html", html)

print(matches)

1 Answer

Assuming matches is a list (re.findall returns one), you can remove the duplicates by converting it to a set and back to a list:

In [1]: a = [1,1,2,2,3,3,4,4,5]
In [2]: list(set(a))
Out[2]: [1, 2, 3, 4, 5]

For your code, only one change is needed:

matches = list(set(re.findall(r"http://motor\.as\.com/motor/\d+/\d+/\d+/motociclismo/\d+_\d+\.html", html)))
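
One caveat: a set does not keep the order in which the links appeared on the page. If that order matters, here is a minimal sketch of an order-preserving alternative, assuming Python 3.7+ (where dict keeps insertion order). The helper name dedupe_keep_order and the sample URLs are made up for illustration and are not taken from the page.

```python
def dedupe_keep_order(items):
    """Return a list of the unique items, in first-seen order."""
    # dict keys are unique and, since Python 3.7, keep insertion order
    return list(dict.fromkeys(items))


# Hypothetical sample data standing in for the re.findall result
sample = [
    "http://motor.as.com/motor/2016/01/02/motociclismo/111_222.html",
    "http://motor.as.com/motor/2016/01/02/motociclismo/111_222.html",
    "http://motor.as.com/motor/2016/01/03/motociclismo/333_444.html",
    "http://motor.as.com/motor/2016/01/03/motociclismo/333_444.html",
]

unique = dedupe_keep_order(sample)
print(unique)
```

In the original script this would be used as matches = dedupe_keep_order(matches) after the re.findall call.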

1 Comment

@SergeiLebedev You are right. It will work with any iterable.
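
To illustrate the point in the comment, here is a small sketch showing set() applied to a few different kinds of iterable. The variable names and values are only for illustration.

```python
# set() accepts any iterable, not just a list
from_tuple = set((1, 1, 2, 2))
from_generator = set(n % 3 for n in range(10))
from_string = set("aabbc")

print(from_tuple, from_generator, from_string)
```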
