python: import data from text

Question

I tried to import float numbers from the file P-I curve.txt, which contains my data. However, I get an error when converting the values to float. I used the following code.

import csv

with open('C:/Users/Kevin/Documents/4e Jaar/fotonica/Metingen/P-I curve.txt') as csvfile:
    data= csv.reader(csvfile, delimiter = '\t')
    current=[]

    P_15=[]
    P_20=[]
    P_25=[]
    P_30=[]
    P_35=[]
    P_40=[]
    P_45=[]
    P_50=[]

    for row in data:

        current.append(float(row[0].replace(',','.')))  
        P_15.append(float(row[2].replace(',','.')))
        P_20.append(float(row[4].replace(',','.')))
        P_25.append(float(row[6].replace(',','.')))
        P_30.append(float(row[8].replace(',','.')))
        P_35.append(float(row[10].replace(',','.')))
        P_40.append(float(row[12].replace(',','.')))
        P_45.append(float(row[14].replace(',','.')))
        P_50.append(float(row[16].replace(',','.')))

With this code I got the error below. I understand it to mean that row 2 is a string, but if so, why did the error not occur for row 1? Is there any other way to import float numbers without using the csv module? I copied and pasted the data from Excel into a .txt file.

returned error:

  File "C:/Users/Kevin/Documents/Python Scripts/P-I curves.py", line 29, in <module>
    P_15.append(float(row[2].replace(',','.')))

ValueError: could not convert string to float:

I then tried the following code:

import pandas as pd

df=pd.read_csv('C:/Users/Kevin/Documents/4e Jaar/fotonica/Metingen/P-I curve.txt', decimal=',', sep='\t',header=0,names=['current','15','20','25','30','35','40','45','50']  )

#curre=df['current']
print(current)

The txt file has a header and looks like this:

1.8   1.9  0.4     1.9  0.4     1.9  0.4     1.9       0.4
3.8   1.9  1.3     1.9  1.3     1.9  1.3     1.9       1.2
5.8   2.0  2.5     2.0  2.4     2.0  2.3     2.0       2.2
7.8   2.0  3.7     2.0  3.6     2.0  3.5     2.0       3.4
9.8   2.1  5.2     2.0  5.1     2.0  4.9     2.0       4.7
11.8  2.1  6.9     2.1  6.7     2.1  6.4     2.1       6.1
13.8  2.1  9.0     2.0  8.6     2.1  8.2     2.1       7.8
15.8  2.1  11.5    2.1  10.8    2.1  10.2    2.1       9.7
17.8  2.2  14.7    2.2  13.7    2.2  12.7    2.2      11.8
19.8  2.2  19.5    2.2  17.5    2.2  15.9    2.2      14.5
21.8  2.2  28.9    2.2  23.6    2.2  20.3    2.2      17.9
23.8  2.3  125.8   2.2  38.4    2.2  27.8    2.2      22.8
25.8  2.3  1669.0  2.3  634.0   2.3  51.7    2.3      31.4
27.8  2.3  3142.0  2.3  2154.0  2.3  982.0   2.3      62.2
29.8  2.3  4560.0  2.3  3594.0  2.3  2460.0  2.3    1075.0
31.8  2.3  5950.0  2.3  5010.0  2.3  3872.0  2.3    2540.0
33.8  2.4  7320.0  2.4  6360.0  2.4  5230.0  2.3    3880.0
35.8  2.4  8670.0  2.4  7700.0  2.4  6550.0  2.4    5210.0
37.8  NaN  NaN     NaN  NaN     2.4  7850.0  2.4    6480.0
39.8  NaN  NaN     NaN  NaN     NaN  NaN     NaN       NaN
41.8  NaN  NaN     NaN  NaN     NaN  NaN     NaN       NaN
Name: current, dtype: float64

Python seems to return everything instead of just the line headed current, which is the only one I want to print. I only want this line so that I can save it as an array. How do I extract specifically the line with the header current from the data?

I am not sure why it returned everything, but I think there is something wrong with the encoding because I copied and pasted the data from Excel.

Please look at the image of what the .txt file looks like when copied from Excel.

I tried another short piece of code (I also manually deleted the header from the .txt file!), see below:

data=np.loadtxt('C:/Users/Kevin/Documents/4e Jaar/fotonica/Metingen/ttest.txt',delimiter='\t')

data=float(data.replace(',','.'))


print(data[0])

With this code, I get the following error.

ValueError: could not convert string to float: b'1,8'

I find this strange. Is converting to float and replacing the comma not enough here?

@usr2564301 I copied the data from Excel and pasted it into a txt file. This would save time.
– kevin, Commented Jan 11, 2018 at 10:34
A much better approach would be to use pandas: import pandas as pd; df=pd.read_csv(filename, decimal=',', sep='\t')
– Bartłomiej, Commented Jan 11, 2018 at 10:51
@Bartłomiej I tried your suggestion, but it did not give the result I need. Please see the code above.
– kevin, Commented Jan 11, 2018 at 11:59

jezrael · Accepted Answer · 2018-01-12 07:14:39Z

1

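I think you need to omit header=0. When header=0 is passed together with names, pandas treats the first line of the file as an existing header and replaces it with names, so that line is not read as data: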

df=pd.read_csv('C:/Users/Kevin/Documents/4e Jaar/fotonica/Metingen/P-I curve.txt', 
                decimal=',', 
                sep='\t',
                names=['current','15','20','25','30','35','40','45','50'])
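As a minimal sketch of what header=0 changes, the snippet below reads the same text twice. The three-column sample in raw is a hypothetical stand-in for the real file (no header line, decimal commas, tab separators), not the asker's data.

```python
import io
import pandas as pd

# Hypothetical stand-in for the file: no header line, decimal commas, tabs.
raw = "1,8\t0,4\t0,4\n3,8\t1,3\t1,3\n5,8\t2,5\t2,4\n"
names = ['current', '15', '20']

# header=0 consumes the first line as a header and replaces it with names.
with_header = pd.read_csv(io.StringIO(raw), decimal=',', sep='\t',
                          header=0, names=names)

# Without header=0, every line is read as data.
without_header = pd.read_csv(io.StringIO(raw), decimal=',', sep='\t',
                             names=names)

print(len(with_header), len(without_header))
print(without_header['current'].tolist())
```

If the file still contains its header line, keeping header=0 is the right choice; if the header was deleted by hand, omitting it avoids losing the first data row.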

EDIT:

df=pd.read_csv('ttest.txt', 
                decimal=',', 
                sep='\t',
                names=['current','15','20','25','30','35','40','45','50'])
print (df)
    current      15      20      25      30      35      40      45     50
0       1.8     0.4     0.4     0.4     0.4     0.4     0.4     0.3    0.3
1       3.8     1.3     1.3     1.3     1.2     1.2     1.1     1.1    1.1
2       5.8     2.5     2.4     2.3     2.2     2.2     2.1     2.0    1.9
3       7.8     3.7     3.6     3.5     3.4     3.3     3.1     3.0    2.9
4       9.8     5.2     5.1     4.9     4.7     4.5     4.3     4.1    4.0
5      11.8     6.9     6.7     6.4     6.1     5.9     5.6     5.3    5.1
6      13.8     9.0     8.6     8.2     7.8     7.4     7.0     6.6    6.3
7      15.8    11.5    10.8    10.2     9.7     9.1     8.6     8.0    7.6
8      17.8    14.7    13.7    12.7    11.8    11.0    10.3     9.6    9.0
9      19.8    19.5    17.5    15.9    14.5    13.3    12.2    11.3   10.5
10     21.8    28.9    23.6    20.3    17.9    16.0    14.5    13.2   12.2
11     23.8   125.8    38.4    27.8    22.8    19.6    17.2    15.4   14.1
12     25.8  1669.0   634.0    51.7    31.4    24.5    20.6    17.9   16.2
13     27.8  3142.0  2154.0   982.0    62.2    33.1    25.3    21.0   18.5
14     29.8  4560.0  3594.0  2460.0  1075.0    60.0    32.6    25.0   21.3
15     31.8  5950.0  5010.0  3872.0  2540.0   903.0    49.9    30.8   24.6
16     33.8  7320.0  6360.0  5230.0  3880.0  2294.0   387.0    40.9   28.8
17     35.8  8670.0  7700.0  6550.0  5210.0  3621.0  1733.0    71.0   34.8
18     37.8     NaN     NaN  7850.0  6480.0  4880.0  3026.0   751.0   44.6
19     39.8     NaN     NaN     NaN     NaN  6100.0  4240.0  1998.0   70.2
20     41.8     NaN     NaN     NaN     NaN     NaN     NaN  3161.0  650.0

#list from column 15 with all values, including NaNs
L1 = df['15'].tolist()
print (L1)
[0.4, 1.3, 2.5, 3.7, 5.2, 6.9, 9.0, 11.5, 14.7, 19.5, 28.9, 125.8, 1669.0, 
 3142.0, 4560.0, 5950.0, 7320.0, 8670.0, nan, nan, nan]

#list from column 15 with NaNs removed
L2 = df['15'].dropna().tolist()
print (L2)
[0.4, 1.3, 2.5, 3.7, 5.2, 6.9, 9.0, 11.5, 14.7, 19.5, 28.9, 125.8, 1669.0, 
 3142.0, 4560.0, 5950.0, 7320.0, 8670.0]

#convert all NaNs in all columns to 0
df = df.fillna(0)
print (df)
    current      15      20      25      30      35      40      45     50
0       1.8     0.4     0.4     0.4     0.4     0.4     0.4     0.3    0.3
1       3.8     1.3     1.3     1.3     1.2     1.2     1.1     1.1    1.1
2       5.8     2.5     2.4     2.3     2.2     2.2     2.1     2.0    1.9
3       7.8     3.7     3.6     3.5     3.4     3.3     3.1     3.0    2.9
4       9.8     5.2     5.1     4.9     4.7     4.5     4.3     4.1    4.0
5      11.8     6.9     6.7     6.4     6.1     5.9     5.6     5.3    5.1
6      13.8     9.0     8.6     8.2     7.8     7.4     7.0     6.6    6.3
7      15.8    11.5    10.8    10.2     9.7     9.1     8.6     8.0    7.6
8      17.8    14.7    13.7    12.7    11.8    11.0    10.3     9.6    9.0
9      19.8    19.5    17.5    15.9    14.5    13.3    12.2    11.3   10.5
10     21.8    28.9    23.6    20.3    17.9    16.0    14.5    13.2   12.2
11     23.8   125.8    38.4    27.8    22.8    19.6    17.2    15.4   14.1
12     25.8  1669.0   634.0    51.7    31.4    24.5    20.6    17.9   16.2
13     27.8  3142.0  2154.0   982.0    62.2    33.1    25.3    21.0   18.5
14     29.8  4560.0  3594.0  2460.0  1075.0    60.0    32.6    25.0   21.3
15     31.8  5950.0  5010.0  3872.0  2540.0   903.0    49.9    30.8   24.6
16     33.8  7320.0  6360.0  5230.0  3880.0  2294.0   387.0    40.9   28.8
17     35.8  8670.0  7700.0  6550.0  5210.0  3621.0  1733.0    71.0   34.8
18     37.8     0.0     0.0  7850.0  6480.0  4880.0  3026.0   751.0   44.6
19     39.8     0.0     0.0     0.0     0.0  6100.0  4240.0  1998.0   70.2
20     41.8     0.0     0.0     0.0     0.0     0.0     0.0  3161.0  650.0

#list from column 15
L3 = df['15'].tolist()
print (L3)
[0.4, 1.3, 2.5, 3.7, 5.2, 6.9, 9.0, 11.5, 14.7, 19.5, 28.9, 125.8, 1669.0, 
 3142.0, 4560.0, 5950.0, 7320.0, 8670.0, 0.0, 0.0, 0.0]
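Since the question asks for a single line of the data as an array, here is a small sketch of pulling out one column and one row without overwriting df. The sample in raw is hypothetical and much shorter than the real file; the empty cell in it is there only to show how a missing value behaves.

```python
import io
import pandas as pd

# Hypothetical sample: the last line has an empty cell in column '15'.
raw = "1,8\t0,4\t0,4\n3,8\t1,3\t1,3\n5,8\t\t2,4\n"
df = pd.read_csv(io.StringIO(raw), decimal=',', sep='\t',
                 names=['current', '15', '20'])

current = df['current'].to_numpy()    # one column as a NumPy array
p15 = df['15'].dropna().to_numpy()    # same, with missing values removed
row = df.loc[2]                       # one row as a Series; df is unchanged

print(current)
print(p15)
print(row['20'])
```

Assigning the result to a new name (row, current) rather than back to df keeps the full table available for further selections.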

edited Jan 12, 2018 at 7:14

answered Jan 11, 2018 at 12:00

jezrael


10 Comments

kevin Over a year ago

Python must have picked up something else when I copied and pasted this data.

jezrael Over a year ago

If you need empty cells like in the Excel screenshot, that is not possible in pandas; missing values are always represented as NaN. Check the missing data section in the docs.

kevin Over a year ago

Thank you for your reply. I understand that NaN is returned where no data is present. That is correct, as NaN is present in the .txt file where no data is given. My goal, however, is to return any single row from the data imported by pandas. The problem is that when I try to print any desired row, it returns all rows.

jezrael Over a year ago

Did you try loc? For example, for the row with index=4: df = df.loc[4]

kevin Over a year ago

I still did not get line 4 when using df = df.loc[4]. It prints exactly what is shown in the question above.

kevin · Accepted Answer · 2018-01-11 20:08:06Z

0

When importing data from a .txt file with csv.reader, the missing data has to be filled in first. So in this case I manually added 0 to the .txt file wherever data was missing and retried this code:

import csv

with open('C:/Users/Kevin/Documents/4e Jaar/fotonica/Metingen/P-I curve.txt') as csvfile:
    data = csv.reader(csvfile, delimiter='\t')
    current = []

    P_15 = []
    P_20 = []
    P_25 = []
    P_30 = []
    P_35 = []
    P_40 = []
    P_45 = []
    P_50 = []

    for row in data:
        # replace the decimal comma so float() can parse the value
        current.append(float(row[0].replace(',', '.')))
        P_15.append(float(row[2].replace(',', '.')))

print(P_15)

With this change, any of the lists can be printed.
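As an alternative to editing the file by hand, the empty cells can be mapped to NaN while reading. This is only a sketch: to_float is a hypothetical helper, and the two-line sample in raw stands in for the real file, using the same column positions (0 and 2) as the code above.

```python
import csv
import io
import math

def to_float(text):
    """Parse a decimal-comma string; an empty cell becomes NaN."""
    text = text.strip()
    return float(text.replace(',', '.')) if text else math.nan

# Hypothetical sample: the second line has empty cells after the current.
raw = "1,8\t1,9\t0,4\n37,8\t\t\n"

rows = list(csv.reader(io.StringIO(raw), delimiter='\t'))
current = [to_float(row[0]) for row in rows]
P_15 = [to_float(row[2]) for row in rows]

print(current)
print(P_15)
```

Using NaN rather than 0 keeps missing measurements distinguishable from a measured value of zero, which matters when the lists are plotted or averaged later.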

answered Jan 11, 2018 at 20:08

kevin
