find max of a column in a csv file using python

Question

I am trying to find the max of the column below in a CSV file:

list['1154293', '885773', '-448704', '563679', '555394', '631974', '957395', '1104047', '693464', '454932', '727272', '125016', '339251', '78523', '977084', '1158718', '332681', '-341227', '173826', '742611', '1189806', '607363', '-1172384', '587993', '295198', '-300390', '468995', '698452', '967828', '-454873', '375723', '1140526', '83836', '413189', '551363', '1195111', '657081', '66659', '803301', '-953301', '883934']

I ran the code I wrote:

for row in csvReader:
    Revenue.append(row[1])
    max_revenue=max(Revenue)
    print("max revenue"+str(max_revenue))

But somehow it's not fetching the max value; the output I am getting is

        max revenue 977084

Please advise.

Because it's treating them as strings... you need to get the max of the integer values. "9" is greater than "11" if it's a string.
– TemporalWolf, Commented Mar 7, 2018 at 18:50
Apart from the issue with strings, don't put the max() inside the loop.
– Matt Hall, Commented Mar 7, 2018 at 18:57

abarnert · Accepted Answer · 2018-03-07 22:48:05Z

1

The problem here is that you're building a list of the column-1 strings, but then expecting to find the max as a number, not as a string.
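To see the difference concretely, here is a small sketch using a handful of the revenue strings from the question (a subset chosen for brevity; the variable names are just for illustration):

```python
# A few of the revenue strings from the question (a subset, for brevity).
revenue = ['1154293', '977084', '1195111', '-1172384', '883934']

# Strings compare character by character, so anything starting with '9'
# beats anything starting with '1', regardless of length.
max_as_text = max(revenue)

# Converting first makes max compare the values numerically.
max_as_number = max(int(value) for value in revenue)

print(max_as_text)    # 977084
print(max_as_number)  # 1195111
```

That string result is the same 977084 the question reports.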

You could fix that by building a list of the column-1 strings mapped to integers, as other answers show:

for row in csvReader:
    Revenue.append(int(row[1]))
max_revenue=max(Revenue)

But another way is to use a key function for max:

for row in csvReader:
    Revenue.append(row[1])
max_revenue = max(Revenue, key=int)

Even better, you can use the same idea to not need that whole separate Revenue list:

max_revenue_row = max(csvReader, key=lambda row: int(row[1]))

This means you get the whole original row, not just the integer value. So, if, say, column 2 is the username that goes with the revenue in column 1, you can do this:

max_revenue_row = max(csvReader, key=lambda row: int(row[1]))
best_salesman_name = max_revenue_row[2]

This also avoids building a whole extra giant list in memory; it just reads each row into memory one at a time and then discards them, and only remembers the biggest one.
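As a self-contained sketch of that idea, the following uses made-up in-memory data in place of a real file; the column layout (revenue in column 1, name in column 2) and the sample values are assumptions for illustration only:

```python
import csv
import io

# Hypothetical sample data: column 1 is revenue, column 2 is a name.
sample = io.StringIO(
    "2018-01,1154293,alice\n"
    "2018-02,977084,bob\n"
    "2018-03,1195111,carol\n"
)
csvReader = csv.reader(sample)

# max returns the original row; int is only used for the comparison.
max_revenue_row = max(csvReader, key=lambda row: int(row[1]))
best_salesman_name = max_revenue_row[2]

print(max_revenue_row)     # ['2018-03', '1195111', 'carol']
print(best_salesman_name)  # carol
```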

Which is usually great, but it has one potential problem: if you actually need to scan the values two or more times instead of just once, the first scan already consumed all the rows, so the second one won't find any. For example, the second call here raises a ValueError, because min is handed an empty iterator:

max_revenue_row = max(csvReader, key=lambda row: int(row[1]))
min_revenue_row = min(csvReader, key=lambda row: int(row[1]))
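A minimal way to reproduce this, again with hypothetical in-memory data standing in for the real file:

```python
import csv
import io

sample = io.StringIO("a,10\nb,30\nc,20\n")  # hypothetical data
csvReader = csv.reader(sample)

# The first scan consumes every row in the reader.
max_revenue_row = max(csvReader, key=lambda row: int(row[1]))

try:
    min_revenue_row = min(csvReader, key=lambda row: int(row[1]))
except ValueError as error:
    # The reader is already exhausted, so min() sees no rows at all.
    min_revenue_row = None
    print("second scan failed:", error)
```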

The ideal solution is to reorganize your code to only scan the rows once. For example, if you understand how min and max work, you could build your own min_and_max function that does both at the same time, and then use it like this:

min_revenue_row, max_revenue_row = min_and_max(
    csvReader, key=lambda row: int(row[1]))
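For reference, one possible shape for such a helper is sketched below. min_and_max is not a built-in; this version is only an illustration, and it assumes the same key= convention as min and max:

```python
def min_and_max(iterable, key=None):
    """Return (smallest, largest) item in a single pass over iterable."""
    if key is None:
        key = lambda item: item
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("min_and_max() arg is an empty iterable") from None
    smallest = largest = first
    smallest_key = largest_key = key(first)
    for item in iterator:
        item_key = key(item)  # compute the key once per item
        if item_key < smallest_key:
            smallest, smallest_key = item, item_key
        if item_key > largest_key:
            largest, largest_key = item, item_key
    return smallest, largest


# Usage with a one-shot iterator of hypothetical rows.
rows = iter([["a", "10"], ["b", "30"], ["c", "20"]])
min_revenue_row, max_revenue_row = min_and_max(rows, key=lambda row: int(row[1]))
print(min_revenue_row, max_revenue_row)  # ['a', '10'] ['b', '30']
```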

But sometimes that's not possible, or at least not possible in a way you can figure out how to write readably. I'll assume you don't know how to write min_and_max. So, what can you do?

You have two less than ideal, but often still acceptable, options: either read the entire file into memory, or read the file multiple times. Here are both, the in-memory version first and the read-twice version second.

rows = list(csvReader) # now it's in memory, so we can reuse it
max_revenue_row = max(rows, key=lambda row: int(row[1]))
min_revenue_row = min(rows, key=lambda row: int(row[1]))

with open(csvpath) as f:
    csvReader = csv.reader(f)
    max_revenue_row = max(csvReader, key=lambda row: int(row[1]))
with open(csvpath) as f:
    # whole new reader, so it doesn't matter that we used up the first
    csvReader = csv.reader(f)
    min_revenue_row = min(csvReader, key=lambda row: int(row[1]))

In your case, if the CSV file is as small as it seems, it doesn't really matter that much, but I'd probably do the first one.

edited Mar 7, 2018 at 22:48

answered Mar 7, 2018 at 19:15

abarnert


9 Comments

user1592147 Over a year ago

Wow, that made a lot of sense, I got it... thank you so much.

user1592147 Over a year ago

Also, can you recommend the best online tutorials for Python beginners like me?

abarnert Over a year ago

@user1592147 I have no idea what tutorials are good nowadays, but I'll bet the python-list mailing list (either searching the archives, or joining and asking) is a good place to find that information.

user1592147 Over a year ago

I included the above code and executed it, and I am getting the error " '>' not supported between instances of 'function' and '_csv.reader'"... any idea why?

abarnert Over a year ago

@user1592147 Oops, I forgot the key= in one version; fixed.


Abhisek Roy · Accepted Answer · 2018-03-07 18:53:27Z

0

This should work. Since the elements of your list are strings, you need to convert them to int using map(int, a) first.

a=['1154293', '885773', '-448704', '563679', '555394', '631974', '957395', '1104047', '693464', '454932', '727272', '125016', '339251', '78523', '977084', '1158718', '332681', '-341227', '173826', '742611', '1189806', '607363', '-1172384', '587993', '295198', '-300390', '468995', '698452', '967828', '-454873', '375723', '1140526', '83836', '413189', '551363', '1195111', '657081', '66659', '803301', '-953301', '883934']
print(max(map(int, a)))

answered Mar 7, 2018 at 18:53

Abhisek Roy


3 Comments

user1592147 Over a year ago

Thanks. How can I find the name in col 0 that has the max revenue?

Abhisek Roy Over a year ago

Use .index to get the index of the highest element and print the same index for the other column.

abarnert Over a year ago

That's a bad idea. It means re-searching (in an exhaustive linear search) to find the same row you already found, so you're doubling the work, both conceptually and as far as performance.

kibs · Accepted Answer · 2018-03-07 18:53:29Z

0

I think the problem is with the data type. Because your numbers are in quotes, they are interpreted as strings, so max compares them as strings rather than as numbers.

You may want to cast each string to an integer. Like this:

new_list = [int(number) for number in old_list]

Hope this helps.

answered Mar 7, 2018 at 18:53

kibs


7 Comments

Abhisek Roy Over a year ago

Can be done much more sensibly using map. No need to iterate.

pault Over a year ago

@AbhisekRoy what do you think map does?

Abhisek Roy Over a year ago

My bad. I thought map functions are faster. I did some research and found out that they take almost the same time. @pault

abarnert Over a year ago

@AbhisekRoy Performance between map and comprehensions is rarely the important question—except for the question of whether you need a list (that you can iterate over and over) or an iterator (which you can only use once, but doesn't waste time and space building the whole list). If you want the latter, either map or a generator expression is fine. If you want the former, use a list comprehension. We don't know which one the user wants.

user1592147 Over a year ago

Hi, how do I find the name in col[1] corresponding to the max I found in col[2]?


user1592147 · Accepted Answer · 2018-03-07 18:55:11Z

0

Thank you all

I converted to int

Revenue.append(int(row[1]))

Now it works fine.

Thanks again

answered Mar 7, 2018 at 18:55

user1592147


3 Comments

pstatix Over a year ago

I would caution against this; it appears you still don't understand what is happening.

Reck Over a year ago

It's important that you understand what you were doing wrong. I suggest you accept someone's answer that explains what's going wrong, and try not to add your own answer.

user1592147 Over a year ago

Please let me know what I am doing wrong. I am very new to Python; I got the output by making the change above.
