I'm trying to convert a pandas DataFrame into a list of tuples, but I'm having difficulty getting the index (which is the Date) into each tuple along with the values. My first stop was the question linked below, but its answers do not include the index in the tuple.

Pandas convert dataframe to array of tuples

My only problem is accessing the index for each row in the numpy array. I have one solution shown below, but it uses an additional counter indexCounter and it looks sloppy. I feel like there should be a more elegant solution to retrieving an index from a particular numpy array.

import quandl

def get_Quandl_daily_data(ticker, start, end):
    prices = []
    symbol = format_ticker(ticker)  # helper defined elsewhere

    try:
        data = quandl.get("WIKI/" + symbol, start_date=start, end_date=end)
    except Exception as e:
        print("Could not download QUANDL data: %s" % e)
        return prices  # nothing to convert if the download failed

    subset = data[['Open','High','Low','Close','Adj. Close','Volume']]

    # The counter tracks the position in the index while looping over the values
    indexCounter = 0
    for row in subset.values:
        dateIndex = subset.index.values[indexCounter]
        tup = (dateIndex, "%.4f" % row[0], "%.4f" % row[1], "%.4f" % row[2], "%.4f" % row[3], "%.4f" % row[4], row[5])
        prices.append(tup)
        indexCounter += 1

    return prices

Thanks in advance for any help!

1 Answer

You can iterate over the result of to_records(index=True).

Say you start with this:

In [6]: df = pd.DataFrame({'a': range(3, 7), 'b': range(1, 5), 'c': range(2, 6)}).set_index('a')

In [7]: df
Out[7]: 
   b  c
a      
3  1  2
4  2  3
5  3  4
6  4  5

then this works, except that it does not include the index (a):

In [8]: [tuple(x) for x in df.to_records(index=False)]
Out[8]: [(1, 2), (2, 3), (3, 4), (4, 5)]

However, if you pass index=True, then it does what you want:

In [9]: [tuple(x) for x in df.to_records(index=True)]
Out[9]: [(3, 1, 2), (4, 2, 3), (5, 3, 4), (6, 4, 5)]
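
Applied to a date-indexed frame like the one in the question, the same approach might look like the sketch below. The sample values, the reduced column set, and the index name "Date" are assumptions made for illustration; they stand in for the real Quandl download.

```python
import pandas as pd

# Hypothetical sample data standing in for the Quandl download.
subset = pd.DataFrame(
    {"Open": [10.0, 10.5], "Close": [10.25, 10.75], "Volume": [1000, 1200]},
    index=pd.to_datetime(["2016-01-04", "2016-01-05"]),
)
subset.index.name = "Date"  # assumed name; it becomes the first record field

# With index=True the index is the first field of every record.
records = subset.to_records(index=True)

# Build the tuples without a manual counter, formatting the prices
# the same way the question does.
prices = [
    (rec["Date"], "%.4f" % rec["Open"], "%.4f" % rec["Close"], rec["Volume"])
    for rec in records
]
```

Each tuple starts with the row's date as a numpy datetime64 value, followed by the formatted prices and the volume.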

2 Comments

Thanks for your reply, Ami. Your answer is very helpful! I am curious, however, whether I can use the reset_index() function, because the index is not arbitrary. Each index value is the Date for the Open, High, Low, Close, and Volume price data of a specific stock, so I would like to keep the existing index values from the 'subset' DataFrame. Would this still achieve the functionality I am trying to create?
@user3547551 I've shortened the steps a bit so that it doesn't use reset_index in any case.
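
On the question raised in these comments: the Date index can be carried into each tuple without calling reset_index. As an alternative to the answer's method, the sketch below uses DataFrame.itertuples with name=None, which yields plain tuples with the index value first. The sample frame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical date-indexed frame; the values are made up for illustration.
df = pd.DataFrame(
    {"Open": [10.0, 10.5], "Close": [10.25, 10.75], "Volume": [1000, 1200]},
    index=pd.to_datetime(["2016-01-04", "2016-01-05"]),
)

# index=True puts the index value first; name=None gives plain tuples
# instead of namedtuples.
rows = list(df.itertuples(index=True, name=None))

# Unpack each row to format the prices, keeping the date as the first element.
formatted = [
    (date, "%.4f" % open_, "%.4f" % close, volume)
    for date, open_, close, volume in rows
]
```

Here the first element of each tuple is a pandas Timestamp taken directly from the existing index, so the dates are preserved as they were.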
