Using Python for adding column in a CSV file

Question

I have a CSV file (inputFile) like the one below:

Temperature,2,3
Temperature,5,6
Pressure,11,14,45
Pressure,13,23,16
Humidity,21,24,25
Humidity,27,28,26

and I want to write it to another file (outputFile), but in the following format:

Temperature,2,3,Pressure,11,14,45,Humidity,21,24,25
Temperature,5,6,Pressure,13,23,16,Humidity,27,28,26

I have tried the following Python code:

import csv
import numpy as np

with open('inputFile.csv', 'r') as csvinput:
    with open('outputFile.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput, delimiter=',')
        for row in csv.reader(csvinput):
            if row[0] == "Pressure" or row[0] == "Humidity":
                type = row[0]
                Value = row[1]
            writer.writerow(row + [np.asarray(type)] + [np.asarray(Value)])

This gives output in the following format:

Temperature,2,3,Humidity,27

Temperature,5,6,Humidity,27

Temperature,8,9,Humidity,27

Pressure,11,14,45,Pressure,11

Pressure,13,23,16,Pressure,13

Humidity,21,24,25,Humidity,21

Humidity,27,28,26,Humidity,27

Please help!

Is the input format always the same? That is, is it safe to assume an equal number of rows for temperature, pressure and humidity, all in order?
– shad0w_wa1k3r, Commented Mar 19, 2017 at 10:28
Yes, the number of rows for temperature, humidity and pressure will always be the same; here it is 2.
– user2398267, Commented Mar 19, 2017 at 10:31

user4252362 · Accepted Answer · 2017-03-19 10:46:13Z

3

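Binary mode is missing. In Python 2 the csv module expects files opened with 'rb' and 'wb'; in Python 3, open them in text mode with newline='' instead.

To improve readability, I suggest separating the read, transform and write steps, since you have to read the whole input file before writing anyway.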

Example (without error handling):

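  import csv

  # Python 2: the csv module expects files opened in binary mode.
  # On Python 3, use open(name, 'r', newline='') and open(name, 'w', newline='').

  # Read: group the rows by the label in the first column.
  data = {}
  with open('inputFile.csv', 'rb') as f:
      for row in csv.reader(f):
          data.setdefault(row[0], []).append(row)

  # Change: join the n-th row of each group into one output row.
  odata = []
  for t, p, h in zip(data["Temperature"], data["Pressure"], data["Humidity"]):
      odata.append(t + p + h)

  # Write: emit all merged rows at once.
  with open('outputFile.csv', 'wb') as g:
      csv.writer(g).writerows(odata)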
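
For Python 3, the same read/transform/write split might look like the sketch below. The function names read_groups, merge_groups and write_rows are illustrative and not part of the answer. It assumes Python 3.7+ (so the dict keeps labels in first-seen order) and that every label has the same number of rows, as the asker confirmed in the comments; zip stops at the shortest group otherwise.

```python
import csv
from itertools import chain


def read_groups(path):
    """Group rows by the label in their first column, in first-seen order."""
    groups = {}
    # newline="" is the setting the csv module expects for text-mode files.
    with open(path, "r", newline="") as f:
        for row in csv.reader(f):
            if row:  # skip completely empty lines
                groups.setdefault(row[0], []).append(row)
    return groups


def merge_groups(groups):
    """Join the n-th row of every group into one wide row."""
    return [list(chain.from_iterable(rows)) for rows in zip(*groups.values())]


def write_rows(path, rows):
    """Write the merged rows to a new CSV file."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
```

Calling write_rows('outputFile.csv', merge_groups(read_groups('inputFile.csv'))) would then produce the two wide rows from the question without hard-coding the three labels.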

James Wilson · Accepted Answer · 2017-03-19 10:39:57Z

0

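Try opening the output file with mode 'wb' rather than 'w' (Python 2). In Python 3, keep text mode and pass newline='' to open() instead.

This may only apply if you are running on Windows. It is an issue with how line separators are translated when writing through the file handle.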
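
To see where the blank lines come from on any platform, here is a small Python 3 sketch. The helper csv_bytes is hypothetical, and passing newline="\r\n" to io.TextIOWrapper is only a stand-in for what Windows text mode does when newline is left unset. csv.writer already ends each row with "\r\n", so a second translation of "\n" doubles the carriage return.

```python
import csv
import io


def csv_bytes(rows, newline):
    """Return the raw bytes csv.writer produces for a given newline setting."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    csv.writer(text).writerows(rows)
    text.flush()  # push buffered text down into the BytesIO
    return raw.getvalue()


rows = [["Temperature", 2, 3]]
# Simulated Windows text mode: "\n" is translated again, giving "\r\r\n".
print(csv_bytes(rows, "\r\n"))  # b'Temperature,2,3\r\r\n'
# newline="" disables translation, so the row ends with a single "\r\n".
print(csv_bytes(rows, ""))      # b'Temperature,2,3\r\n'
```

The extra "\r" is what shows up as an empty line between rows in the asker's output.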

2 Comments

user2398267 Over a year ago

OK, that removes the newline problem, but I still have issues writing the output file, as it is not in the format I want.

James Wilson Over a year ago

One problem solved :) It looks like the `if row[0] == "Pressure" or ...` check only covers two of the labels. Have you tried checking for all three?

shad0w_wa1k3r · Accepted Answer · 2017-03-19 10:45:01Z

0

import csv

with open('inputFile.csv','r') as csvinput:
    with open('outputFile.csv','w') as csvoutput:
        writer = csv.writer(csvoutput, delimiter=',')
        # Output order of the groups; labels are matched case-insensitively.
        types = ('temperature', 'pressure', 'humidity')
        data = {key: [] for key in types}
        # Collect the values of each row (without the label) under its type.
        for row in csv.reader(csvinput):
            data[row[0].lower()].append(row[1:])
        # Build one output row per entry: label plus values for every type.
        for entry_no in range(len(data['temperature'])):
            row = []
            for key in types:
                row.extend([key.title()]+data[key][entry_no])
            writer.writerow(row)

Stian · Accepted Answer · 2017-03-19 11:01:15Z

0

If you could change the way inputFile.csv is written, it would make life much easier for you. Either way, here is a pandas alternative that solves your problem.

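import pandas as pd

# Rows have three or four fields; a missing val3 is read as NaN, which makes
# val3 a float column (so 45 may be written back as 45.0).
df = pd.read_csv('inputFile.csv', names=['type', 'val1', 'val2', 'val3'])
df = df.T  # one column per input row

# Assumes exactly two rows per type: even-numbered input rows form the
# first output row, odd-numbered ones the second.
a = range(0, len(df.columns))
rows = [a[::2], a[1::2]]

dic = {}
for i in range(0, 2):
    dic[i] = [df[df.columns[j]].tolist() for j in rows[i]]
    dic[i] = [j for x in dic[i] for j in x]          # flatten
    dic[i] = [x for x in dic[i] if str(x) != 'nan']  # drop missing val3
df1 = pd.DataFrame(dic)
df1.T.to_csv('outputFile.csv', index=False, header=False)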
