
I have a CSV file, which contains patch names, their release date and some other info in separate columns. I am trying to write a Python script that will ask the user for a Patch name and once it gets the input, will check if the Patch is in the CSV file and print out the Release Date.

So far, I have written the following piece of code, which I based on this answer.

import csv

patch = raw_input("Please provide your Patchname: ")

with open("CSV_File1.csv") as my_file1:
    reader = csv.DictReader(my_file1)
    for row in reader:
        for k in row:
            if row[k] == patch:
                print "According to the CSV_File1 database: "+row[k]

This way I get the Patch name printed on the screen. I don't know how to traverse the column with the Dates, so that I can print the date that corresponds to the row with the Patch name that I provided as input.

In addition, I would like to check if that patch is the last released one. If it isn't, then print the latest one along with its release date. My problem is that the CSV file contains patch names of different software versions, so I can't just print the last of the list. For example:

PatchXXXYY,...other columns...,Release Date,...     <--- (this is the header row of the CSV file)
Patch10000,...,date
Patch10001,...,date
Patch10002,...,date
Patch10100,...,date
Patch10101,...,date
Patch10102,...,date
Patch10103,...,date
Patch20000,...,date
...

So, if my input is "Patch10000", then I should get its release date and the latest available Patch, which in this case would be Patch10002, and its release date. But NOT Patch20000, as that would be a different software version. A preferable output would look like this:

According to the CSV_File1 database: Patch10100 was released on "date". The latest available patch is "Patch10103", which was released on "date".

That's because the "XXX" digits in the PatchXXXYY above represent the software version, and the "YY" digits represent the patch number. I hope this is clear.

Thanks in advance!

2 Answers


The CSV module works fine, but I just wanted to throw Pandas in, as this can be a good use case for it. There may be better ways to handle this, but it's a fun example. This assumes that your columns are labeled Patch_Name and Release_Date, so you will need to adjust them to match your file.

import pandas as pd

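# error_bad_lines=False skips malformed rows
# (pandas 1.3 and later spell this on_bad_lines="skip")
my_file1 = pd.read_csv("CSV_File1.csv", error_bad_lines=False)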

patch = raw_input("Please provide your Patchname: ")

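# Find the rows that match patch and store their index labels as idx
idx = my_file1[my_file1["Patch_Name"] == patch].index.tolist()

if idx:
    # Get the date value from the first matching row
    # (.loc also works on pandas versions that no longer have get_value)
    date = my_file1.loc[idx[0], "Release_Date"]
    print("According to the CSV_File1 database: {} {}".format(patch, date))
else:
    # Without this check, idx[0] raises IndexError for an unknown patch
    print("No patch named {} in CSV_File1".format(patch))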

There are great ways to filter and compare the data in a CSV with Pandas as well. I would give more descriptive solutions if I had more time. I highly suggest looking into the Pandas documentation.
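As one example of that kind of filtering, here is a minimal sketch of finding the newest patch within the same software version. It assumes the PatchXXXYY naming from the question and the same hypothetical column labels (Patch_Name, Release_Date). The helper name `latest_in_version` and the sample rows, including the dates, are made up for illustration.

```python
import io

import pandas as pd

# Made-up sample data standing in for CSV_File1.csv
SAMPLE = u"""Patch_Name,Release_Date
Patch10000,2015-01-10
Patch10001,2015-02-14
Patch10002,2015-03-02
Patch10100,2015-04-20
Patch20000,2015-06-01
"""

df = pd.read_csv(io.StringIO(SAMPLE))


def latest_in_version(frame, patch, name_col="Patch_Name", date_col="Release_Date"):
    """Return (name, date) of the newest patch sharing patch's version, or None."""
    # 'Patch' + the three version digits, e.g. 'Patch100'
    version = patch[:8]
    # na=False treats missing names as non-matching instead of NaN
    same_version = frame[frame[name_col].str.startswith(version, na=False)]
    if same_version.empty:
        return None
    # Fixed-width names sort in patch-number order, so the last row is the newest
    latest = same_version.sort_values(name_col).iloc[-1]
    return latest[name_col], latest[date_col]


result = latest_in_version(df, "Patch10000")
if result is not None:
    print("The latest available patch is {}, released on {}".format(*result))
```

Sorting the strings only works because every name has the same width. If the patch digits could vary in length, converting them to integers before comparing would be safer.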


1 Comment

Thanks for your feedback, it's an interesting different approach and I will look into it!

You're almost there, though I'm a wee bit confused: your sample data doesn't appear to have a header row. If it doesn't, then you shouldn't be using a DictReader, but if it does, you can take this approach:

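# `patch` is the user's input from the question, e.g. 'Patch10000'.
# Its first 8 characters ('Patch' + 'XXX') identify the software version.
version = patch[:8]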
latest_patch = ''
last_patch_data = None
with open("CSV_File1.csv") as my_file1:
    reader = csv.DictReader(my_file1)
    for row in reader:
        # This works because of ASCII ordering. First,
        # we make sure the package starts with the right
        # version - e.g. Patch200
        if row['Package'].startswith(version):
            # Now we grab the next two numbers, so from
            # Patch20042 we're grabbing '42'
            patch_number = row['Package'][8:10]
            # '02' > '' is true, and '42' > '02' is also True
            if patch_number > latest_patch:
                # If we have a greater patch number, we
                # want to store that, along with the row that
                # had that. We could just store the patch & date
                # but it's fine to store the whole row
                latest_patch = patch_number
                last_patch_data = row

        # No need to iterate over the keys, you *know* the
        # column containing the patch. Here it's
        # titled 'Package'
        #for k in row:
        #    if row[k] == patch:
        if row['Package'] == patch:
            # the date header is 'Registration'
            print("According to the CSV_File1 database: {patch!r}"
                  " was released on {date!r}".format(patch=row['Package'],
                                                     date=row['Registration']))

    # `None` is a singleton, which means that we can use `is`,
    # rather than `==`. If we didn't even *start* with the same
    # version, there was certainly no patch. You may prefer a
    # different message, of course.
    if last_patch_data is None:
        print('No patch found')
    else:
        print('The latest available patch is {patch!r},'
              ' which was released on {date!r}'.format(patch=last_patch_data['Package'],
                                                       date=last_patch_data['Registration']))
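The same loop can also be wrapped in a function, which makes it easier to try against a few known rows. This is only a sketch: the function name `find_patch`, the inline sample data and its dates are invented for illustration, and it assumes the 'Package' / 'Registration' headers used above plus the fixed PatchXXXYY layout. It compares the patch digits as integers and strips surrounding whitespace from each name, which is a small deviation from the string comparison above.

```python
import csv
import io

# Made-up sample rows standing in for CSV_File1.csv
SAMPLE = u"""Package,Registration
Patch10000,2015-01-10
Patch10001,2015-02-14
Patch10002,2015-03-02
Patch10100,2015-04-20
Patch20000,2015-06-01
"""


def find_patch(csv_file, patch, name_col="Package"):
    """Return (requested_row, latest_row) for the version that patch belongs to.

    Either element is None when no row matched.
    """
    version = patch[:8]  # 'Patch' + three version digits
    requested = None
    latest = None
    latest_number = -1
    for row in csv.DictReader(csv_file):
        # Tolerate short rows and stray whitespace around the name
        name = (row.get(name_col) or "").strip()
        if name == patch:
            requested = row
        if not name.startswith(version):
            continue
        digits = name[8:10]
        if not digits.isdigit():
            continue  # skip names that don't follow PatchXXXYY
        number = int(digits)
        if number > latest_number:
            latest_number = number
            latest = row
    return requested, latest


requested, latest = find_patch(io.StringIO(SAMPLE), "Patch10000")
if requested is not None:
    print("{} was released on {}".format(requested["Package"],
                                         requested["Registration"]))
if latest is not None:
    print("The latest available patch is {}, released on {}".format(
        latest["Package"], latest["Registration"]))
```

With a real file, the `io.StringIO(SAMPLE)` argument would be replaced by the object returned from `open("CSV_File1.csv")`.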

8 Comments

The sample I gave above was just an example, but the top row is the header, so my CSV file looks like the one above. At first I got a syntax error at the "if" statement, and I noticed there was a ":" missing. I changed the variable names for the headers to the corresponding ones for my CSV file (patch -> Package, and date -> Registration). As for the last "print": the patches are in order, but as I said, there are different software versions, so printing the last entry every time will not give the correct patch for the correct version. Please see my initial post again if you can.
Whoops! Good catch. I've fixed the syntax error. I didn't notice anything in your question that specified a way to tell what patch went with what version. Or does the first integer value after Patch specify that? I can add a modification if that's the case.
Basically, my input is patch and then some digits, like "PatchXXXYY". I edited my initial post at the end to explain why the numbering is important. "So, if my input is "Patch10000", then I should get its release date and the latest available Patch, which in this case would be Patch10002, and its release date. But NOT Patch20000, as that would be a different software version. That's because the "XXX" digits in the PatchXXXYY above, represent the software version, and the "YY" the patch number. I hope this is clear."
@italialex7 that should do it for you. If this solves your question, remember to mark my answer as accepted with the green check mark. If you found it super helpful, you could also upvote it.
I think I get how you want to check the numbers, but when I run it I still get the last entry of the list as the latest patch. So to use your code's comment as an example, if I tell it to check for Patch20002, it should give me Patch20042 as the latest one, but instead I get Patch23066 (which is a different software version and patch).