
I have this dataframe:

     Begin    End    Duration  ID
42   40680    40846    167     18

and I want to convert it to a NumPy array in this form:

array([40680, 40846, 167, 18])

I am using the as_matrix function for the conversion, followed by reshape(1,4), but it is not working. It gives me this format: [[40680 40846 167 18]]. Any suggestions? I need the array in that form so I can apply the 'precision_recall_curve' function.

  • Try array.ravel() instead of reshape. So, if the dataframe is df: df.values.ravel(), or simply np.ravel(df). Commented Dec 24, 2016 at 9:17
  • Sorry, but even this solution is not working. It gives me an array like this: [40680 40846 167 18] Commented Dec 24, 2016 at 9:37
  • Isn't that what you were expecting? Commented Dec 24, 2016 at 9:38
  • No. I want it to look like this: array([40680, 40846, 167, 18]) Commented Dec 24, 2016 at 9:43
  • @jaouaemna, you seem to be confused by the output of print(array), which by default doesn't show commas. Try @Divakar's solution, np.ravel(df), in IPython or Jupyter. Commented Dec 24, 2016 at 13:14
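
To illustrate the point about print made in the last comment, here is a minimal sketch. The frame is a hypothetical reconstruction of the row shown in the question, and the names `df` and `flat` are introduced only for this example. The same 1-D array is displayed twice: print uses the str form (no commas), while repr, which is what IPython or Jupyter echoes, uses the array([...]) form.

```python
import pandas as pd

# Hypothetical reconstruction of the one-row frame from the question.
df = pd.DataFrame(
    {"Begin": [40680], "End": [40846], "Duration": [167], "ID": [18]},
    index=[42],
)

# Flatten the (1, 4) values into a 1-D array of shape (4,).
flat = df.to_numpy().ravel()

print(flat)        # str form: square brackets, values separated by spaces
print(repr(flat))  # repr form: array([...]) with commas
```

Both lines show the same object; only the text representation differs, so nothing further needs converting.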

1 Answer


You have something like this:

import pandas as pd

df = pd.DataFrame({'a':[1],'b':[2],'c':[3]}, index=[42])
df
Out[27]: 
    a  b  c
42  1  2  3

You want to get a single row as a NumPy array:

df.loc[42].values
Out[30]: array([1, 2, 3])
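
As a hedged aside for newer pandas versions: as_matrix was removed in pandas 1.0, and to_numpy() (available since 0.24) is the documented alternative to .values. The sketch below assumes a frame shaped like the one in the question. The column names and values are copied from it, `eval_seg` is the name used in the comments, and the other variable names are illustrative only.

```python
import pandas as pd

# Assumed stand-in for the asker's frame (values taken from the question).
eval_seg = pd.DataFrame(
    {"Begin": [40680], "End": [40846], "Duration": [167], "ID": [18]},
    index=[42],
)

row = eval_seg.loc[42].to_numpy()  # select by index label: 1-D, shape (4,)
whole = eval_seg.to_numpy()        # whole frame: 2-D, shape (1, 4)
flattened = whole.ravel()          # drop the extra dimension: shape (4,)

print(row.shape, whole.shape, flattened.shape)
```

Selecting the row by label first gives a 1-D array directly, whereas converting the whole frame keeps the row dimension and needs ravel() to flatten it.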

5 Comments

Sorry, but it is not working. This is what I get: [40680 40846 167 18], after applying y_true = eval_seg.loc[42].values. I am sure the input is a dataframe: printing type(eval_seg) gives <class 'pandas.core.frame.DataFrame'>.
@jaouaemna: Sorry but I have no idea what you are saying now. Maybe if you can add more detail with full executable code to your question....
Well, I have a dataframe as input, and I want to convert it to a NumPy array in the format I mentioned before, because I want to pass that array to the function 'precision_recall_curve', which calculates the precision and recall between two arrays. When I use this array, [40680 40846 167 18], the function raises this error: "ValueError: Data is not binary and pos_label is not specified"
@jaouaemna: I see. That's a different question--you need to read the docs at scikit-learn.org/stable/modules/generated/… and note that it requires "binary" input, not any numbers as you are using.
Oh, I see. I didn't pay attention to that. I will see how I can calculate the precision and recall in a different way, or maybe I could implement the formula directly in Python.
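
For the last idea, implementing the formula directly, a minimal sketch might look like the following. It assumes the data has already been reduced to binary labels (1 = positive, 0 = negative). The arrays `y_true` and `y_pred` are made-up illustrations, not derived from the asker's segments, and how to map Begin/End values onto such labels is left open here.

```python
import numpy as np

# Hypothetical binary labels; 1 marks the positive class.
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1])

# Count the confusion-matrix cells that precision and recall depend on.
tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # true positives
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false positives
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # false negatives

# precision = TP / (TP + FP), recall = TP / (TP + FN); guard empty denominators.
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(precision, recall)  # 0.75 0.75
```

Raw values such as 40680 are not class labels, which is consistent with the "Data is not binary" error quoted above.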
