0

I am trying to find the maximum of the column below, which comes from a CSV file:

list['1154293', '885773', '-448704', '563679', '555394', '631974', '957395', '1104047', '693464', '454932', '727272', '125016', '339251', '78523', '977084', '1158718', '332681', '-341227', '173826', '742611', '1189806', '607363', '-1172384', '587993', '295198', '-300390', '468995', '698452', '967828', '-454873', '375723', '1140526', '83836', '413189', '551363', '1195111', '657081', '66659', '803301', '-953301', '883934']

I ran the code I wrote:

for row in csvReader:
    Revenue.append(row[1])
    max_revenue=max(Revenue)
    print("max revenue"+str(max_revenue))

But somehow it's not fetching the max value. The output I am getting is:

        max revenue 977084

Please advise.

  • Because it's treating them as strings... you need to get the max of the integer values. "9" is greater than "11" if it's a string. Commented Mar 7, 2018 at 18:50
  • Your values are strings. Commented Mar 7, 2018 at 18:50
  • Basically a dupe of stackoverflow.com/questions/7368789/… Commented Mar 7, 2018 at 18:53
  • Apart from the issue with strings, don't put the max() inside the loop. Commented Mar 7, 2018 at 18:57
  • Possible duplicate of Convert all strings in a list to int. Commented Mar 7, 2018 at 18:57

4 Answers

1

The problem here is that you're building a list of the column-1 strings, but then expecting to find the max as a number, not as a string.
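As a minimal sketch of the difference, here are a few of the values from the question compared both ways. Strings are compared character by character, so '977084' beats '1195111' simply because '9' sorts after '1'. The variable names are only for illustration.

```python
# A few of the revenue strings from the question
revenue = ['1154293', '977084', '1195111', '-448704']

# Strings are compared character by character, so '9...' sorts above '1...'
max_as_text = max(revenue)

# Converting to int first compares the actual numeric values
max_as_number = max(int(value) for value in revenue)

print(max_as_text)    # 977084
print(max_as_number)  # 1195111
```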

You could fix that by building a list of the column-1 strings mapped to integers, as other answers show:

for row in csvReader:
    Revenue.append(int(row[1]))
max_revenue=max(Revenue)

But another way is to use a key function for max:

for row in csvReader:
    Revenue.append(row[1])
max_revenue = max(Revenue, key=int)

Even better, you can use the same idea to not need that whole separate Revenue list:

max_revenue_row = max(csvReader, key=lambda row: int(row[1]))

This means you get the whole original row, not just the integer value. So, if, say, column 2 is the username that goes with the revenue in column 1, you can do this:

max_revenue_row = max(csvReader, key=lambda row: int(row[1]))
best_salesman_name = max_revenue_row[2]

This also avoids building a whole extra giant list in memory; it just reads each row into memory one at a time and then discards them, and only remembers the biggest one.

Which is usually great, but it has one potential problem: if you actually need to scan the values two or more times instead of just once, the first scan already consumed all the rows, so the second one won't find any. For example, the second call here raises a ValueError, because min is handed a reader with no rows left:

max_revenue_row = max(csvReader, key=lambda row: int(row[1]))
min_revenue_row = min(csvReader, key=lambda row: int(row[1]))
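To see that failure without touching a real file, here is a small sketch that assumes a made-up three-row CSV held in an io.StringIO. The data and the names in it are hypothetical and stand in for the asker's file.

```python
import csv
import io

# Hypothetical two-column data standing in for the real CSV file
data = io.StringIO("alice,100\nbob,-50\ncarol,300\n")
csv_reader = csv.reader(data)

# First pass: consumes every row in the reader
max_row = max(csv_reader, key=lambda row: int(row[1]))

# Second pass: the reader is exhausted, so min() sees no rows at all
try:
    min_row = min(csv_reader, key=lambda row: int(row[1]))
except ValueError as exc:
    min_row = None
    print("second pass failed:", exc)
```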

The ideal solution is to reorganize your code to only scan the rows once. For example, if you understand how min and max work, you could build your own min_and_max function that does both at the same time, and then use it like this:

min_revenue_row, max_revenue_row = min_and_max(
    csvReader, key=lambda row: int(row[1]))

But sometimes that's not possible, or at least not possible in a way you can figure out how to write readably. I'll assume you don't know how to write min_and_max. So, what can you do?

You have two less-than-ideal, but often still acceptable, options: either read the entire file into memory, or read the file multiple times. Here are both, in that order.


rows = list(csvReader) # now it's in memory, so we can reuse it
max_revenue_row = max(rows, key=lambda row: int(row[1]))
min_revenue_row = min(rows, key=lambda row: int(row[1]))

with open(csvpath) as f:
    csvReader = csv.reader(f)
    max_revenue_row = max(csvReader, key=lambda row: int(row[1]))
with open(csvpath) as f:
    # whole new reader, so it doesn't matter that we used up the first
    csvReader = csv.reader(f)
    min_revenue_row = min(csvReader, key=lambda row: int(row[1]))

In your case, if the CSV file is as small as it seems, it doesn't really matter that much, but I'd probably do the first one.
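For reference, a single-pass min_and_max could look roughly like the sketch below. This is not a standard library function. It is one possible way to write the helper named above, and it assumes the same key= convention that min and max use. The sample rows are made up.

```python
def min_and_max(iterable, key=None):
    """Return (smallest, largest) item, scanning the iterable only once."""
    if key is None:
        key = lambda item: item
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("min_and_max() arg is an empty iterable") from None
    smallest = largest = first
    smallest_key = largest_key = key(first)
    for item in iterator:
        item_key = key(item)
        if item_key < smallest_key:
            smallest, smallest_key = item, item_key
        elif item_key > largest_key:
            largest, largest_key = item, item_key
    return smallest, largest


# Hypothetical rows, shaped like what csv.reader would yield
rows = iter([['alice', '100'], ['bob', '-50'], ['carol', '300']])
min_revenue_row, max_revenue_row = min_and_max(
    rows, key=lambda row: int(row[1]))
print(min_revenue_row, max_revenue_row)
```

Because it only ever calls next() on the iterator, it works on a csv.reader directly and never needs the whole file in memory.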


9 Comments

Wow, that made a lot of sense, I got it... thank you so much.
Also, can you recommend the best online tutorials for Python beginners like me?
@user1592147 I have no idea what tutorials are good nowadays, but I'll bet the python-list mailing list (either searching the archives, or joining and asking) is a good place to find that information.
I included the above code and executed it, and I am getting the error " '>' not supported between instances of 'function' and '_csv.reader'"... any idea why?
@user1592147 Oops, I forgot the key= in one version; fixed.
0

This should work. Since the elements of your list are strings, you need to convert them to int using map(int, a) first.

a=['1154293', '885773', '-448704', '563679', '555394', '631974', '957395', '1104047', '693464', '454932', '727272', '125016', '339251', '78523', '977084', '1158718', '332681', '-341227', '173826', '742611', '1189806', '607363', '-1172384', '587993', '295198', '-300390', '468995', '698452', '967828', '-454873', '375723', '1140526', '83836', '413189', '551363', '1195111', '657081', '66659', '803301', '-953301', '883934']
print(max(map(int, a)))

3 Comments

Thanks. How can I find the name in col 0 that has the max revenue?
Use .index to get the index of the highest element and print the same index for the other column.
That's a bad idea. It means re-searching (in an exhaustive linear search) to find the same row you already found, so you're doubling the work, both conceptually and as far as performance.
0

I think the problem is with the data type. Because your numbers are wrapped in quotes (''), they are interpreted as strings, so max compares them as text rather than as numbers.

You may want to cast each string to an integer. Like this:

new_list = [int(number) for number in old_list]

Hope this helps.

7 Comments

Can be done much more sensibly using map. No need to iterate.
@AbhisekRoy what do you think map does?
My bad. I thought map functions are faster. I did some research and found out that they take almost the same time. @pault
@AbhisekRoy Performance between map and comprehensions is rarely the important question—except for the question of whether you need a list (that you can iterate over and over) or an iterator (which you can only use once, but doesn't waste time and space building the whole list). If you want the latter, either map or a generator expression is fine. If you want the former, use a list comprehension. We don't know which one the user wants.
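A small sketch of that list-versus-iterator distinction, using made-up values: the map object can be consumed only once, while the list comprehension can be scanned again.

```python
values = ['3', '10', '-2']

as_iterator = map(int, values)       # lazy: items are produced on demand, once
as_list = [int(v) for v in values]   # stored: can be scanned repeatedly

first_pass = max(as_iterator)
leftover = list(as_iterator)         # nothing left after the first scan

list_max = max(as_list)
list_min = min(as_list)              # second scan of the list still works

print(first_pass, leftover, list_max, list_min)  # 10 [] 10 -2
```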
Hi, how do I find the name in col[1] that corresponds to the max I found in col[2]?
0

Thank you all.

I converted to int:

Revenue.append(int(row[1]))

Now it works fine.

Thanks again.

3 Comments

I would be cautious here; it appears you still don't understand what is happening.
It's important that you understand what you were doing wrong. I suggest you accept the answer that explains what was going wrong, rather than adding your own answer.
Please let me know what I am doing wrong. I am very new to Python; I got the output by making the change shown above.
