0

I have these two dataframe: 1) data here are grouped by station_id( from 1 up to 98) and time( a data every hour from 27-01-2020 to 26-05-2020)

enter image description here

  1. In the second dataframe I have a latitude and longitude value for every station_id.

enter image description here

My aim is to create a list of list in this format:

     latitude           longitude      flow   hour  month  day
[[53.37947845458979, -1.46990168094635, 278.0, 0.0, 1.0, 27.0], 
 [53.379791259765604, -1.46999669075012, 122.0, 0.0, 1.0, 27.0], 
 [53.380035400390604, -1.47001004219055, 58.0, 0.0, 1.0, 27.0], ...]

In order to have a list [latitude, longitude, flow, month, day] for every row in the first dataframe. I tried with the following code:

import pandas as pd
import datetime as dt

df = pd.read_csv("readings_by_hour.csv")
df['time'] = pd.to_datetime(df['time'])
df1 = pd.read_csv("stations_info.csv")

i = 0
a = []
b = []
count = df1['station_id'].count()

while i < count:
    if df['station_id'][i] == df1['station_id'][i]:
        a = print(df1['latitude'][i] + ", " + df1['longitude'][i] + ", " + df['flow'][i] + ", " + df['time'].dt.hour + ", " + df['time'].dt.month + ", " + df['time'].dt.day)
        b += [a]
        i += 1

print(b)

but it seems it doesn't work, indeed didn't giving any output though it didn't give any error.

2 Answers 2

1

You could merge the two dataframes on the station_id column, then create your list of lists like so:

merged_df = pd.merge(df, df1, left_on = 'station_id', right_on = 'station_id')

list_of_lists =[] 
  
# Iterate over each row 
for index, row in merged_df.iterrows():

    # Create list for the current row 
    rowlist =[row.latitude, row.longitude, row.flow, row.hour, row.month, row.day] 
      
    # append the list to the final list 
    list_of_lists.append(rowlist) 

You can use the datetime module to extract the month, day, hour from the Date column

See the pandas docs on pd.merge for more info: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html

Sign up to request clarification or add additional context in comments.

Comments

1

In the given code, you are trying to assign return value of print function to a and then adding it to b. Here, the value of a is null. So, when you try to print the value, you will get empty string.

I have made corrections so that it works. Hope it helps..

while i < count:
    if df['station_id'][i] == df1['station_id'][i]:
        a = [df1['latitude'][i],df1['longitude'][i], df['flow'][i], df['time'][i].hour,df['time'][i].month,df['time'][i].day]
        b.append(a)
        i += 1

print(b)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.