0

We're working on a python project, where we're using SQLalchemy to retrieve data from our MySQL database. We have succeeded in establishing a connection to the database:

engine = create_engine("mysql://xxx:xxx@localhost/xxx")
conn = engine.connect()

When we are connected, we are retrieving first and last timestamp, so we can set up a range and an interval of 20 minutes. We are doing so by using following line of code:

lastLpnTime = pd.read_sql('SELECT MAX(timestamp) FROM Raw_Data WHERE topic = "lpn1"', conn).astype(str).values
firstLpnTime = pd.read_sql('SELECT MIN(timestamp) FROM Raw_Data WHERE topic = "lpn1"', conn).astype(str).values.tolist()

We then want to round down the timestamp for firstLpnTime, so the minutes will be round down to either xx.00, xx.20, xx:40, so for example 15:09:25 will round down to 15:00:00

firstLpnTime = pd.to_datetime(firstLpnTime).round('20T')

And then we set up a range between firstLpnTime and lastLpnTime and an interval of 20 min

index = pd.date_range(start=firstLpnTime.min(), end=lastLpnTime.max(), freq='20T')

However... we are receiving an error for this line of code:

firstLpnTime = pd.to_datetime(firstLpnTime).round('20T')

Error message is saying:

TypeError: arg must be a string, datetime, list, tuple, 1-d array, or Series

Variable firstLpnTime has an astype(str), so it should be a string, so we don't get why it's not working. Maybe the datatype is not changing to string?

The output, when I print firstLpnTime, is: [['2019-07-26 15:09:25']]

2 Answers 2

1

It looks like your firstLpnTime is nested within an array.

So, you can try taking first index from this variable and then casting it using pandas.

firstLpnTime = pd.to_datetime(firstLpnTime[0]).round('20T')

Then, firstLpnTime should give this result:

DatetimeIndex(['2019-07-26 15:00:00'], dtype='datetime64[ns]', freq=None)

I hope it solves your problem.

Sign up to request clarification or add additional context in comments.

6 Comments

pandas.read_sql returns a dataframe. Then, you are using dataframe.values which returns numpy.ndarray. Hence, you are getting nested array.
Can this be prevented? And if so, how do I do that?
You can try using dataframe directly. It should be much easier as you can extract single value easily then perform casting.
I'm not sure if I understand what to look for in the answer from the link you provided. Do you mind elaborating? That would make my day!
|
0

In order to avoid value nested within array, you can use DataFrame directly as shown in below code:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('mysql://xxxx:xxxx@localhost:3306/test')
conn = engine.connect()

lastLpnTime = pd.read_sql('SELECT MAX(timestamp) as max_time FROM 
     test.test_table WHERE topic = "lpn1";', conn).iloc[0]['max_time']
firstLpnTime = pd.read_sql('SELECT MIN(timestamp) as min_time FROM 
     test.test_table WHERE topic = "lpn1"', conn).iloc[0]['min_time']

firstLpnTime = firstLpnTime.round('20T')
firstLpnTime

This will provide the expected output in Pandas Timestamp class.

Timestamp('2019-07-26 15:00:00')

Hope it helps! :)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.