I am trying to extract data from https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population for my project. I want to load the top 20 cities into a pandas DataFrame with the following columns: RANK | CITY | LATITUDE | LONGITUDE
This is so that I can use the coordinates later in my code to calculate the various parameters I need. This is what I have come up with so far, but it seems to be failing:
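import requests
from bs4 import BeautifulSoup

rank = []
city = []
state = []
population_present = []
population_past = []
changepercent = []
latitude = []  # was appended to below without ever being defined (NameError)

info = requests.get('https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population').text
bs = BeautifulSoup(info, 'html.parser')

# Walk the rows of the first table on the page; rows made only of <th> cells yield no <td>.
for row in bs.find('table').find_all('tr'):
    p = row.find_all('td')
    if len(p) > 2:  # need at least three cells before indexing p[2]
        rank.append(p[0].text)
        city.append(p[1].text)
        # p[2] is simply the third cell of the row; whether it holds a latitude depends on the table layout.
        latitude.append(p[2].text.rstrip('\n'))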
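
For the LATITUDE and LONGITUDE columns, the cell text still has to be turned into numbers. Below is a minimal sketch of that step only, assuming the table has a location cell whose text contains decimal degrees with hemisphere letters, such as "40.6635°N 73.9387°W" (possibly preceded by a degrees/minutes/seconds form). The function name parse_decimal_coords and the sample row are hypothetical and are not taken from the page itself.

```python
import re

# Decimal degrees directly followed by a hemisphere letter, e.g. "40.6635°N".
# A degrees/minutes/seconds form such as 40°39′49″N does not match, because
# its degree sign is followed by digits rather than a letter.
_COORD = re.compile(r"(\d+(?:\.\d+)?)°\s*([NSEW])")


def parse_decimal_coords(cell_text):
    """Return (latitude, longitude) as signed floats, or None if either is missing."""
    lat = lon = None
    for value, hemisphere in _COORD.findall(cell_text):
        number = float(value)
        if hemisphere in "SW":
            number = -number  # south and west are negative by convention
        if hemisphere in "NS" and lat is None:
            lat = number
        elif hemisphere in "EW" and lon is None:
            lon = number
    if lat is None or lon is None:
        return None
    return lat, lon


# Hypothetical sample row: (rank, city, location cell text), for illustration only.
sample_rows = [("1", "New York", "40°39′49″N 73°56′19″W / 40.6635°N 73.9387°W")]

records = []
for rank_text, city_text, location_text in sample_rows:
    coords = parse_decimal_coords(location_text)
    if coords is not None:
        records.append((rank_text, city_text, coords[0], coords[1]))
```

A list of tuples like records could then be passed to pandas.DataFrame with columns=['RANK', 'CITY', 'LATITUDE', 'LONGITUDE']. Which cell index actually holds the location text depends on the table layout and needs to be checked against the page.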