How to check user input with csv file and print data from specific column?

Question

I have a CSV file, which contains patch names, their release date and some other info in separate columns. I am trying to write a Python script that will ask the user for a Patch name and once it gets the input, will check if the Patch is in the CSV file and print out the Release Date.

So far, I have written the following piece of code, which I based on this answer.

import csv

patch = raw_input("Please provide your Patchname: ")

with open("CSV_File1.csv") as my_file1:
    reader = csv.DictReader(my_file1)
    for row in reader:
        for k in row:
            if row[k] == patch:
                print "According to the CSV_File1 database: "+row[k]

This way I get the Patch name printed on the screen. I don't know how to traverse the column with the Dates, so that I can print the date that corresponds to the row with the Patch name that I provided as input.

In addition, I would like to check if that patch is the last released one. If it isn't, then print the latest one along with its release date. My problem is that the CSV file contains patch names of different software versions, so I can't just print the last of the list. For example:

PatchXXXYY,...other columns...,Release Date,...     <--- (this is the header row of the CSV file)
Patch10000,...,date
Patch10001,...,date
Patch10002,...,date
Patch10100,...,date
Patch10101,...,date
Patch10102,...,date
Patch10103,...,date
Patch20000,...,date
...

So, if my input is "Patch10000", then I should get its release date and the latest available Patch, which in this case would be Patch10002, and its release date. But NOT Patch20000, as that would be a different software version. A preferable output would look like this:

According to the CSV_File1 database: Patch10100 was released on "date". The latest available patch is "Patch10103", which was released on "date".

That's because the "XXX" digits in the PatchXXXYY above, represent the software version, and the "YY" the patch number. I hope this is clear.

Thanks in advance!

John Morrison · Accepted Answer · 2016-08-19 18:23:45Z

1

The CSV module works fine, but I just wanted to throw Pandas in, as this can be a good use case for it. There may be better ways to handle this, but it's a fun example. This assumes that your columns are labeled (Patch_Name, Release_Date), so you will need to change those names to match your file.

import pandas as pd

my_file1 = pd.read_csv("CSV_File1.csv", error_bad_lines=False)

patch = raw_input("Please provide your Patchname: ")

#Find row that matches patch and store the index as idx
idx = my_file1[my_file1["Patch_Name"] == patch].index.tolist()

#Get the date value from row by index number
date = my_file1.get_value(idx[0], "Release_Date")

print "According to the CSV_File1 database: {} {}".format(patch, date)

There are great ways to filter and compare the data in a CSV with Pandas as well. I would give more descriptive solutions if I had more time. I highly suggest looking into the Pandas documentation.
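As a rough illustration of that kind of filtering, here is a minimal sketch for Python 3 and a recent pandas, where `get_value` and the `error_bad_lines` argument used above are no longer available. It uses `.loc` with boolean masks instead. The column names `Patch_Name` and `Release_Date` are the same assumptions as above, the sample dates are invented, and `release_info` is a hypothetical helper name.

```python
import io

import pandas as pd

# Hypothetical sample data; the dates are invented for illustration.
SAMPLE_CSV = """Patch_Name,Release_Date
Patch10000,2016-01-05
Patch10001,2016-02-10
Patch10002,2016-03-15
Patch10100,2016-04-01
Patch10103,2016-06-20
Patch20000,2016-07-01
"""


def release_info(df, patch):
    """Return (date of `patch`, latest patch name, latest patch date), or None."""
    # Boolean mask: keep only the rows whose name equals the input.
    match = df.loc[df["Patch_Name"] == patch]
    if match.empty:
        return None
    # Same software version means the same first 8 characters, e.g. 'Patch100'.
    same_version = df.loc[df["Patch_Name"].str.startswith(patch[:8], na=False)]
    # Names are fixed-width ('PatchXXXYY'), so sorting the strings
    # puts the highest YY last.
    latest = same_version.sort_values("Patch_Name").iloc[-1]
    return match.iloc[0]["Release_Date"], latest["Patch_Name"], latest["Release_Date"]


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
info = release_info(df, "Patch10000")
if info is None:
    print("Patch not found")
else:
    print("Patch10000 was released on {}. The latest available patch is {}, "
          "which was released on {}.".format(*info))
```

Returning `None` for an unknown patch avoids the `IndexError` that indexing an empty match list would raise.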


1 Comment

italialex7 Over a year ago

Thanks for your feedback, it's an interesting different approach and I will look into it!

Wayne Werner · Accepted Answer · 2016-08-20 13:22:13Z

0

You're almost there, though I'm a wee bit confused - your sample data doesn't have a header row. If it doesn't then you shouldn't be using a DictReader but if it does you can take this approach.

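import csv

patch = raw_input("Please provide your Patchname: ")

# 'Patch' plus the three version digits, e.g. 'Patch100'
version = patch[:8]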
latest_patch = ''
last_patch_data = None
with open("CSV_File1.csv") as my_file1:
    reader = csv.DictReader(my_file1)
    for row in reader:
        # This works because of ASCII ordering. First,
        # we make sure the package starts with the right
        # version - e.g. Patch200
        if row['Package'].startswith(version):
            # Now we grab the next two numbers, so from
            # Patch20042 we're grabbing '42'
            patch_number = row['Package'][8:10]
            # '02' > '' is true, and '42' > '02' is also True
            if patch_number > latest_patch:
                # If we have a greater patch number, we
                # want to store that, along with the row that
                # had that. We could just store the patch & date
                # but it's fine to store the whole row
                latest_patch = patch_number
                last_patch_data = row

        # No need to iterate over the keys, you *know* the
        # column containing the patch. Here it's
        # titled 'Package'
        #for k in row:
        #    if row[k] == patch:
        if row['Package'] == patch:
            # the date header here is 'Registration'
            print("According to the CSV_File1 database: {patch!r}"
                  " was released on {date!r}".format(patch=row['Package'],
                                                     date=row['Registration']))

    # `None` is a singleton, which means that we can use `is`,
    # rather than `==`. If we didn't even *start* with the same
    # version, there was certainly no patch. You may prefer a
    # different message, of course.
    if last_patch_data is None:
        print('No patch found')
    else:
        print('The latest available patch is {patch!r},'
              ' which was released on {date!r}'.format(patch=last_patch_data['Package'],
                                                       date=last_patch_data['Registration']))
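For readers on Python 3, the same idea can be sketched with the version and patch number split out explicitly and the patch number compared as an integer rather than as a string. This is only a sketch under stated assumptions: the column names `Package` and `Registration` come from the comments below, the sample dates are invented, every row is assumed to follow the `PatchXXXYY` pattern, and `split_patch_name` and `lookup` are hypothetical helper names.

```python
import csv
import io

# Hypothetical sample data; the dates are invented for illustration.
SAMPLE_CSV = """Package,Registration
Patch10000,2016-01-05
Patch10001,2016-02-10
Patch10002,2016-03-15
Patch10100,2016-04-01
Patch10103,2016-06-20
Patch20000,2016-07-01
"""


def split_patch_name(name):
    """Split 'PatchXXXYY' into (version, patch_number), e.g. ('100', 2)."""
    digits = name[len("Patch"):]
    return digits[:3], int(digits[3:])


def lookup(csv_file, patch):
    """Return (row for `patch`, latest row of the same version); either may be None."""
    wanted_version, _ = split_patch_name(patch)
    found = latest = None
    for row in csv.DictReader(csv_file):
        version, number = split_patch_name(row["Package"])
        if version != wanted_version:
            continue  # a different software version, ignore it
        if row["Package"] == patch:
            found = row
        if latest is None or number > split_patch_name(latest["Package"])[1]:
            latest = row
    return found, latest


found, latest = lookup(io.StringIO(SAMPLE_CSV), "Patch10000")
if found is None:
    print("Patch not found")
else:
    print("According to the CSV_File1 database: {} was released on {}. "
          "The latest available patch is {}, which was released on {}.".format(
              found["Package"], found["Registration"],
              latest["Package"], latest["Registration"]))
```

To read the real file instead of the in-memory sample, pass an open file object, for example `with open("CSV_File1.csv", newline="") as f: lookup(f, patch)`.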

edited Aug 20, 2016 at 13:22

answered Aug 19, 2016 at 13:58


8 Comments

italialex7 Over a year ago

The sample I gave above was just an example, but the top row is the header, so my CSV file looks like the one above. At first I got a syntax error at the "if" statement, and then I noticed there was a ":" missing. I changed the variable names for the headers to the corresponding ones for my CSV file (patch -> Package, and date -> Registration). As for the last "print": the patches are in order, but as I said, there are different software versions, so printing the last entry every time will not give the correct patch for the correct version. Could you please look at my initial post again?

Wayne Werner Over a year ago

Whoops! Good catch. I've fixed the syntax error. I didn't notice anything in your question that specified a way to tell what patch went with what version. Or does the first integer value after Patch specify that? I can add a modification if that's the case.

italialex7 Over a year ago

Basically, my input is patch and then some digits, like "PatchXXXYY". I edited my initial post at the end to explain why the numbering is important. "So, if my input is "Patch10000", then I should get its release date and the latest available Patch, which in this case would be Patch10002, and its release date. But NOT Patch20000, as that would be a different software version. That's because the "XXX" digits in the PatchXXXYY above, represent the software version, and the "YY" the patch number. I hope this is clear."

Wayne Werner Over a year ago

@italialex7 that should do it for you. If this solves your question, remember to mark my answer as accepted with the green check mark. If you found it super helpful you could also upvote it.

italialex7 Over a year ago

I think I get how you want to check the numbers, but when I run it I still get the last entry of the list as the latest patch. So to use your code's comment as an example, if I tell it to check for Patch20002, it should give me Patch20042 as the latest one, but instead I get Patch23066 (which is a different software version and patch).
