I have the following .dat file:

https://github.com/lukepolson/School/blob/master/Phys%20411/Assignment%205/JamesBay_temperature.dat

When I open it in pandas using

import pandas as pd

df_james = pd.read_csv('JamesBay_temperature.dat', sep=" ",
                        skiprows=[0,1,2], names=['Temperature'])

the values it contains are an array of arrays:

In [18]: df_james.values
Out[18]:
array([[ 4.89],
       [ 4.89],
       [ 4.89],
       ...,
       [14.77],
       [14.67],
       [14.67]])

Why is pandas doing this? Is it something about the file I'm opening, or am I using pd.read_csv wrong?

1 Answer

The result you obtained is not an array of arrays. It is a single two-dimensional NumPy array with float entries, with one row per record and one column per DataFrame column:

In [1]: arr = df_james.values

In [2]: type(arr)  # Show object type
Out[2]: numpy.ndarray

In [3]: arr.dtype  # Show data type of array entries
Out[3]: dtype('float64')

In [4]: arr.shape  # Show number of rows and columns
Out[4]: (2979360, 1)
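
If a flat, one-dimensional array is what you need (for plotting, say), selecting the single column or flattening the 2-D array should give you one. The following is only a minimal sketch: the small DataFrame is a hypothetical stand-in for the real file, and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_james: a one-column DataFrame.
df = pd.DataFrame({"Temperature": [4.89, 4.89, 14.77, 14.67]})

two_d = df.values                     # whole frame: 2-D array, one column
one_d = df["Temperature"].to_numpy()  # single column: 1-D array
flat = df.values.ravel()              # flatten the 2-D array to 1-D

print(two_d.shape)
print(one_d.shape)
print(flat.shape)
```

Either 1-D form can be passed to a plotting call; selecting the column by name is usually the more readable of the two.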

2 Comments

I respectfully beg to differ. When I type df_james.values[0] I get the output array([4.89]). This is an issue, because I want to plot the values.
The expression df_james.values[0] returns the first row as a one-element array, not a float. To extract the first float value directly, use df_james.values[0, 0].
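
The difference between the two indexing forms can be sketched as follows. The three-row DataFrame is a hypothetical stand-in for the real data, and the names first_row and first_value are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_james.
df = pd.DataFrame({"Temperature": [4.89, 4.89, 14.77]})
values = df.values

first_row = values[0]       # one index: the first row, still an array
first_value = values[0, 0]  # row and column index: a single float

print(first_row.shape)
print(first_value)

# The same scalar can also be read from the column directly.
print(df["Temperature"].iloc[0])
```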
