Python File reading problem, Possible infile loop?

Question

The question is as follows: "Write a Python program to read a file with lake and fish data and report the lake identification number, the lake name, and the fish weight in a tabular format (use string zones with formatting). The program should calculate the average fish weight reported."

Lake identification;

1000 Chemo
1100 Greene
1200 Toddy

The file I must read "FishWeights.txt" contains the following data;

My code;

f = open("fishweights.txt")
print(f.read(4), "Chemo", f.readline(4))
print(f.read(5), "Greene", f.read(5))
print(f.read(4), "Toddy", f.read(5))
print(f.read(5), "Chemo", f.read(4))
print(f.read(5), "Chemo", f.read(4))
print(f.read(5), "Greene", f.read(4))
print(f.read(5), "Toddy", f.read(4))

The output I receive is;

1000 Chemo  4.0

1100 Greene  2.0

1200 Toddy  1.5

1000  Chemo 2.0

1000  Chemo 2.2

1100  Greene 1.9

1200  Toddy 2.8

This is correct insofar as it displays each lake's ID number, name, and fish weight. But I also need to calculate the average of all the fish weights at the end. The output SHOULD be formatted neatly and look as follows:

1000     Chemo      4.0
1100     Greene     2.0
1200     Toddy      1.5
1000     Chemo      2.0
1000     Chemo      2.2
1100     Greene     1.9
1200     Toddy      2.8
The average fish weight is: 2.34

Any help is appreciated, Just a beginning coder here seeking help to have a full understanding of the subject. Thank you!

Rather than just printing the data as you read it in, you will need to store the values in variable(s) and act on them to do the math as you read them in.
– G. Anderson, Commented Oct 24, 2018 at 16:43
How are you receiving the lake ID info? Is that in another text file, or are you free to format it as you please?
– vash_the_stampede, Commented Oct 24, 2018 at 17:20

freakish · Accepted Answer · 2018-10-24 16:56:16Z

Yes, you need to loop over lines. This is the construct you are looking for:

with open("fishweights.txt") as fo:
    for line in fo:
        pass

Now, to retrieve each field from a line you can use line.split(). Reading a fixed number of characters (as you did) only works if the IDs and weights always have a fixed length. Are you sure that each ID will always have exactly 4 digits? Something like this might be better:

raw_data = []
with open("fishweights.txt") as fo:
    for line in fo:
        row = line.strip().split()
        if not row:
            continue  # ignore empty lines
        id = int(row[0])
        no = float(row[1])
        raw_data.append((id, no))
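
To illustrate why splitting on whitespace is sturdier than reading a fixed number of characters, here is a small sketch. The sample lines are made up for illustration (the 5-digit ID in particular is hypothetical and not from the question's data), and parse_line is a helper name introduced here, not part of the code above.

```python
def parse_line(line):
    """Split one 'id weight' line into (int, float); return None for blank lines."""
    # split() with no arguments ignores runs of spaces and the trailing newline
    fields = line.split()
    if not fields:
        return None
    return int(fields[0]), float(fields[1])


# Hypothetical sample lines, including extra spaces, a blank line,
# and an ID that is not 4 digits long
samples = ["1000 4.0\n", "1100   2.0\n", "\n", "12000 1.5\n"]
for sample in samples:
    print(parse_line(sample))
```

A fixed-width read such as f.read(4) would cut the 5-digit ID in half, whereas the split-based version does not care how wide each field is.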

Now that you have raw data you need to aggregate it:

total = 0  # named "total" so the built-in sum() is not shadowed
count = 0
for id, no in raw_data:
    total += no
    count += 1
avg = total / count

or one-liner

avg = sum(no for id, no in raw_data) / len(raw_data)

and finally you need a mapping of ids into names for the final print:

id_to_name = {
    1000: 'Chemo',
    1100: 'Greene',
    1200: 'Toddy',
}
for id, no in raw_data:
    print(id, id_to_name[id], no)
print('Average: ', avg)

Of course all three loops can be combined into one loop. I divided it so that you can clearly see each stage of the code. The final (and a bit optimized) result may look like this:

id_to_name = {
    1000: 'Chemo',
    1100: 'Greene',
    1200: 'Toddy',
}
total = 0  # named "total" so the built-in sum() is not shadowed
count = 0
with open("fishweights.txt") as fo:
    for line in fo:
        row = line.strip().split()
        if not row:
            continue  # ignore empty lines
        id = int(row[0])
        no = float(row[1])
        total += no
        count += 1
        print(id, id_to_name[id], no)
print('Average:', total / count)
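
The question also asks for neatly aligned columns and an average shown to two decimals, which a plain print of the three values does not produce. A minimal sketch using format specifications follows. The column widths (9 and 11) are an assumption chosen to approximate the desired output in the question, and format_row and rows are names introduced here for illustration. The sample rows are the values shown in the question's output.

```python
id_to_name = {1000: "Chemo", 1100: "Greene", 1200: "Toddy"}


def format_row(lake_id, weight):
    """Return one table row with the ID and name in fixed-width columns."""
    name = id_to_name[lake_id]
    # '<9' and '<11' left-align and pad to 9 and 11 characters respectively
    return f"{lake_id:<9}{name:<11}{weight:.1f}"


# (lake ID, weight) pairs as they appear in the question's output
rows = [(1000, 4.0), (1100, 2.0), (1200, 1.5), (1000, 2.0),
        (1000, 2.2), (1100, 1.9), (1200, 2.8)]

for lake_id, weight in rows:
    print(format_row(lake_id, weight))

average = sum(weight for _, weight in rows) / len(rows)
print(f"The average fish weight is: {average:.2f}")
```

In a real program the rows would come from the file-reading loop rather than a hard-coded list; only the formatting step is the point here.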

BernardL · Accepted Answer · 2018-10-24 16:54:56Z

You can store the lake names in a dictionary and the fish data in a list. From there, loop through the list (fish in this example) and look up the lake name for each ID. Finally, print the average by summing the weights in the list and dividing by the length of fish.

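# LakeID.txt is expected to hold one "id name" pair per line
with open('LakeID.txt', 'r') as l:
    lake = l.readlines()
    lake = dict([i.rstrip('\n').split() for i in lake])  # {'1000': 'Chemo', ...}

# FishWeights.txt is expected to hold one "id weight" pair per line
with open('FishWeights.txt', 'r') as f:
    fish = f.readlines()
    fish = [i.rstrip('\n').split() for i in fish]  # [['1000', '4.0'], ...]

for i in fish:
    print(i[0], lake[i[0]], i[1])  # IDs stay strings, so they match the dict keys

print('The total average is {}'.format(sum(float(i[1]) for i in fish)/len(fish)))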

It is also recommended to use the with open(..) context manager, which makes sure the file is closed when the block exits.
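
To see what the context manager does, the following sketch checks the file object's closed attribute inside and after the with block. It writes a throwaway temporary file so it can run anywhere; the two lines in it are sample values in an assumed "id weight" layout, not the real FishWeights.txt.

```python
import os
import tempfile

# Create a throwaway file with two sample lines (assumed "id weight" layout)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1000 4.0\n1100 2.0\n")
    path = tmp.name

with open(path) as f:
    lines = f.readlines()
    inside = f.closed   # False: the file is still open inside the block

outside = f.closed      # True: leaving the block closed the file

print(inside, outside, len(lines))
os.remove(path)         # clean up the temporary file
```

The file is closed even if an exception is raised inside the block, which is why this form is preferred over a bare open() without a matching close().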

tomh1012 · Accepted Answer · 2018-10-24 16:57:12Z

You could store the fish weights and lake data in two lists. The following reads each line and splits it into a list of fish weights and a list of lake IDs.

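with open("FishWeights.txt") as f:  # open the data file first
    text = f.readlines()
fishWeights = []
lakeData = []
for item in text:
    parts = item.split()
    if not parts:
        continue  # skip blank lines
    lakeData.append(parts[0])            # lake ID is the first column
    fishWeights.append(float(parts[1]))  # weight is the second column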

From here you can output the information with

for i in range(len(fishWeights)) :
    print(lakeData[i], "Your Text", fishWeights[i])

And you can work out your average with

total = 0
for weight in fishWeights:
    total += float(weight)  # float() also accepts the raw strings read from the file
average = total / len(fishWeights)
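
As an alternative to the manual running total, the standard library's statistics module can compute the mean directly. A short sketch, assuming the weights have already been converted to floats; the list below holds the seven weights from the question's output, and fish_weights is a name introduced here.

```python
from statistics import mean

# The seven weights shown in the question's output, already converted to float
fish_weights = [4.0, 2.0, 1.5, 2.0, 2.2, 1.9, 2.8]

# mean() raises statistics.StatisticsError if the list is empty
average = mean(fish_weights)
print("The average fish weight is: {:.2f}".format(average))
```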

Farhan.K · Accepted Answer · 2018-10-24 17:05:26Z

You don't need to use offsets to read the lines. Also, you can use with to make sure the file is closed when you're done. For the average you can put all the numbers in a list and find the average at the end. Use a dictionary to map lake IDs to names:

lakes = {
    1000: "Chemo",
    1100: "Greene",
    1200: "Toddy"
}
allWeights = []

with open("test.txt", "r") as f:
    for line in f:
        line = line.strip()  # get rid of any whitespace at the end of the line
        line = line.split()

        lake, weight = line
        lake = int(lake)
        weight = float(weight)
        print(lake, lakes[lake], weight, sep="\t")
        allWeights.append(weight)

avg = sum(allWeights) / len(allWeights)
print("The average fish weight is: {0:.2f}".format(avg)) # format to 2 decimal places

Output:

1000    Chemo   4.0
1100    Greene  2.0
1200    Toddy   1.5
1000    Chemo   2.0
1000    Chemo   2.2
1100    Greene  1.9
1200    Toddy   2.8
The average fish weight is: 2.34

There are more efficient ways to do this, but this is probably the simplest for understanding what's going on.

Prince Francis · Accepted Answer · 2018-10-24 17:38:20Z

It can be achieved easily using a pandas DataFrame. Please find the sample code below.

import pandas as pd

# load lake data into a dataframe
lakeDF = pd.read_csv('Lake.txt', sep=" ", header=None)
lakeDF.columns = ["Lake ID", "Lake Name"]
# load fish data into a dataframe
fishWeightDF = pd.read_csv('FishWeights.txt', sep=" ", header=None)
fishWeightDF.columns = ["Lake ID", "Fish Weight"]
# join fish with lake on the common 'Lake ID' field; a left merge keeps
# the fish rows in file order, so no sorting is needed
mergedFrame = pd.merge(
    fishWeightDF, lakeDF,
    on='Lake ID', how='left'
    )
# print the result
print(mergedFrame)
# find the average
average = mergedFrame['Fish Weight'].mean()
print(average)
