0

I'm trying to pull a single value back from a MSSQL DB, the code looks like:

import pandas as pd
import pypyodbc
query = 'SELECT TOP(1) value FROM [dbo].[table]'
connection = pypyodbc.connect(...)
df = pd.read_sql_query(query, connection)

But this returns a dataframe object, is there a method to instead just return a single string value, possibly without Pandas. Key priority being speed. Is there a quicker way generally of connecting/querying?

3
  • 1
    A few things in your question are not clear to me. 1) There is no "WHERE" clause - don't you care about what single row to select? 2) Why speed is important? It looks like you want to use this approach in a loop - am i right? Can you explain bit more about what you want to achieve? Commented Jul 26, 2017 at 10:49
  • Sorry, no WHERE for brevity, I'm not concerned by the SQL syntax just connection. Yes, it will be used in a while loop to constantly acquire new values according to an algorithm. Commented Jul 26, 2017 at 10:59
  • In this case I would suggest you to put more details about your algorithm and post a small reproducible data set (in CSV/TEXT form) and your desired data set in the question. Most probably there is a way to implement your algorithm in vectorized Pandas/Numpy/SciPy/SKLearn way. In this case you will not need to loop through your data row by row - you will read all data (or a chunk that fits in memory) and process the whole data set. Commented Jul 26, 2017 at 11:06

1 Answer 1

1

Consider implementing your algorithm in a vectorized Pandas/NumPy/SciPy/SKLearn way if processing speed is important for you.

Using vectorized approach usually means using internal functions that are working with vectors and matrices instead of scalars and that are implemented with C, C-Python, etc. (optimized) instead of writing loops.

If your algorithm can't be vectorized you still can speed up your algorithm - read all the data you want to process at once instead of doing it in a loop:

query = 'SELECT value FROM [dbo].[table]'  # <-- NOTE: i have intentionally removed `TOP(1)`
connection = pypyodbc.connect(...)
df = pd.read_sql_query(query, connection)

# process your data (from the DataFrame) here...
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks and good point, I didn't think of retrieving the data prior to analysis. Seems obvious now...

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.