Help with an if else loop in python

Question

Hi, here is my problem. I have a program that calculates the averages of data in columns. Example:

Bob
1
2
3

the output is

Bob
2

Some of the data has 'NA's. So for Joe:

Joe
NA
NA
NA

I want this output to be NA

so I wrote an if else loop

The problem is that it doesn't execute the second part of the loop and just prints out one NA. Any suggestions?

Here is my program:

with open('C://achip.txt', "rtU") as f:
    columns = f.readline().strip().split(" ")
    numRows = 0
    sums = [0] * len(columns)

    numRowsPerColumn = [0] * len(columns) # this figures out the number of columns

    for line in f:
        # Skip empty lines since I was getting that error before
        if not line.strip():
            continue

        values = line.split(" ")
        for i in xrange(len(values)):
            try: # this is the whole strings to math numbers things
                sums[i] += float(values[i])
                numRowsPerColumn[i] += 1
            except ValueError:
                continue 

    with open('c://chipdone.txt', 'w') as ouf:
        for i in xrange(len(columns)):
           if numRowsPerColumn[i] ==0 :
               print 'NA' 
           else:
               print>>ouf, columns[i], sums[i] / numRowsPerColumn[i] # this is the average calculator

The file looks like so:

Joe Bob Sam
1 2 NA
2 4 NA
3 NA NA
1 1  NA

and final output is the names and the averages

Joe Bob Sam 
1.5 1.5 NA

Ok I tried Roger's suggestion and now I have this error:

Traceback (most recent call last):
  File "C:/avy14.py", line 5, in <module>
    for line in f:
ValueError: I/O operation on closed file

Here is this new code:

with open('C://achip.txt', "rtU") as f:
    columns = f.readline().strip().split(" ")
    sums = [0] * len(columns)
    rows = 0
for line in f:
    line = line.strip()
    if not line:
        continue

    rows += 1
    for col, v in enumerate(line.split()):
        if sums[col] is not None:
            if v == "NA":
                sums[col] = None
            else:
                sums[col] += int(v)

with open("c:/chipdone.txt", "w") as out:
    for name, sum in zip(columns, sums):
        print >>out, name,
        if sum is None:
            print >>out, "NA"
        else:
            print >>out, sum / rows

Use "C:\\file" or "c:/file", with the latter usually preferred; using "//" will be interpreted incorrectly in many cases (just not in this exact one).
– Roger Pate, Commented Sep 24, 2010 at 14:59
Could you paste an example of what the source file looks like, and a sample of what the complete output should look like?
– Josh Wright, Commented Sep 24, 2010 at 15:00
...and also, could you include the code of the "second part of the loop"? The code provided only contains two alternative instructions (if/else)...
– mac, Commented Sep 24, 2010 at 15:03

score 1 · Accepted Answer · 2010-09-24 15:58:40Z

with open("c:/achip.txt", "rU") as f:
  columns = f.readline().strip().split()
  sums = [0.0] * len(columns)
  row_counts = [0] * len(columns)

  for line in f:
    line = line.strip()
    if not line:
      continue

    for col, v in enumerate(line.split()):
      if v != "NA":
        sums[col] += int(v)
        row_counts[col] += 1

with open("c:/chipdone.txt", "w") as out:
  for name, sum, rows in zip(columns, sums, row_counts):
    print >>out, name,
    if rows == 0:
      print >>out, "NA"
    else:
      print >>out, sum / rows

I'd also use the no-parameter version of split when getting the column names (it allows you to have multiple space separators).
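
To illustrate the difference, here is a small sketch (written for Python 3, unlike the Python 2 code above) that splits the last data row of the sample file both ways. The variable names are only for illustration.

```python
# Last data row of the sample file, which contains a double space.
line = "1 1  NA\n"

# Splitting on a single space keeps an empty field and the trailing newline.
by_space = line.split(" ")
print(by_space)       # ['1', '1', '', 'NA\n']

# Splitting with no argument collapses runs of whitespace and drops the newline.
by_whitespace = line.split()
print(by_whitespace)  # ['1', '1', 'NA']
```

With the no-argument form, the field positions line up with the column names even when the spacing is irregular.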

Regarding your edit to include input/output sample, I kept your original format and my output would be:

Joe 1.75
Bob 2.33333333333
Sam NA

This format is 3 rows of (ColumnName, Avg) columns, but you can change the output if you want, of course. :)
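
If the two-row layout from the question is wanted instead, one possible approach is sketched below. It is written for Python 3 and reads the sample from an in-memory `io.StringIO` rather than `c:/achip.txt`. The helper name `column_averages` and the two-decimal formatting are assumptions made for illustration, not part of the original code.

```python
import io


def column_averages(lines):
    """Return (names, averages); an average is None when a column has no numbers."""
    rows = iter(lines)
    names = next(rows).split()
    sums = [0.0] * len(names)
    counts = [0] * len(names)
    for line in rows:
        # An empty line splits into an empty list, so it is skipped naturally.
        for col, value in enumerate(line.split()):
            if value != "NA":
                sums[col] += float(value)
                counts[col] += 1
    averages = [s / n if n else None for s, n in zip(sums, counts)]
    return names, averages


# The sample file from the question, held in memory.
sample = io.StringIO("Joe Bob Sam\n1 2 NA\n2 4 NA\n3 NA NA\n1 1  NA\n")
names, averages = column_averages(sample)

# One row of names, then one row of averages.
header = " ".join(names)
values = " ".join("NA" if a is None else format(a, ".2f") for a in averages)
print(header)  # Joe Bob Sam
print(values)  # 1.75 2.33 NA
```

Counting rows per column, rather than once per line, is what lets a column with some NA entries still get an average.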

edited Sep 24, 2010 at 15:58

answered Sep 24, 2010 at 15:06

Roger Pate

4 Comments

Roger Pate Over a year ago

@Robert: The code you included in your edit is misindented with the for loop outside of the with, closing the file before the for loop runs. Updated my code to show what I mean.

Roger Pate Over a year ago

@Robert: I also see that the code I wrote (before you included the example) is wrong, as I misinterpreted you. Fixed.

Robert A. Fettikowski Over a year ago

Still not working, Roger. Now when I have a name like Joe 2 NA 1, the final value should be 1.5 and it outputs as NA.

Roger Pate Over a year ago

@Robert: Using 0.0 instead of 0 for sums (so floating point is used) and I get Joe 1.75, Bob 2.333.., Sam NA for the input sample you gave in the question. These values match what I figure out by hand.
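
The closed-file error described in the first comment can be reproduced in isolation. The following is a minimal Python 3 sketch that uses a hypothetical temporary file in place of achip.txt; the variable names are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file standing in for achip.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Joe Bob Sam\n1 2 NA\n")
    path = tmp.name

with open(path) as f:
    header = f.readline().split()
# The with block has ended here, so f is already closed.

try:
    for line in f:  # same mistake as a loop dedented out of the with block
        pass
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
os.remove(path)
```

Indenting the loop so that it sits inside the `with` block keeps the file open while the rows are read.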

unutbu · Answer · score 0 · 2010-09-24 15:28:54Z

Using numpy:

import numpy as np

with open('achip.txt') as f:
    names=f.readline().split()
    arr=np.genfromtxt(f)

print(arr)
# [[  1.   2.  NaN]
#  [  2.   4.  NaN]
#  [  3.  NaN  NaN]
#  [  1.   1.  NaN]]

print(names)
# ['Joe', 'Bob', 'Sam']

print(np.ma.mean(np.ma.masked_invalid(arr),axis=0))
# [1.75 2.33333333333 --]

answered Sep 24, 2010 at 15:28

unutbu

Chris · Answer · 2010-09-24 16:22:21Z

Using your original code, I would add one loop and edit the print statement:

with open(r'C:\achip.txt', "rtU") as f:
    columns = f.readline().strip().split(" ")
    numRows = 0
    sums = [0] * len(columns)

    numRowsPerColumn = [0] * len(columns) # this figures out the number of columns

    for line in f:
        # Skip empty lines since I was getting that error before
        if not line.strip():
            continue

        values = line.split(" ")

        ### This removes any '' elements caused by having two spaces like
        ### in the last line of your example chip file above.
        ### A new list is built because popping from a list while
        ### iterating over it can skip consecutive '' elements.
        values = [v for v in values if v != '']
        ### (End of Addition)

        for i in xrange(len(values)):
            try: # this is the whole strings to math numbers things
                sums[i] += float(values[i])
                numRowsPerColumn[i] += 1
            except ValueError:
                continue 

    with open('c://chipdone.txt', 'w') as ouf:
        for i in xrange(len(columns)):
           if numRowsPerColumn[i] ==0 :
               print>>ouf, columns[i], 'NA' #Just add the extra parts
           else:
               print>>ouf, columns[i], sums[i] / numRowsPerColumn[i]

This solution also gives the same result in Roger's format, not your intended format.

Edmond Sesay · Answer · 2019-01-02 13:02:58Z

The solution below is cleaner and has fewer lines of code:

import pandas as pd

# read the file into a DataFrame using read_csv
df = pd.read_csv('C://achip.txt', sep=r"\s+")

# compute the average of each column
avg = df.mean()

# save computed average to output file
avg.to_csv("c:/chipdone.txt")

The key to the simplicity of this solution is the way the input text file is read into a DataFrame. Pandas read_csv allows you to use regular expressions for specifying the sep/delimiter argument. In this case, we used the "\s+" regex pattern to take care of having one or more spaces between columns.

Once the data is in a DataFrame, computing the average and saving to a file can all be done with straightforward pandas functions.
