Here's a CSV file, kidsList.csv. It is comma-delimited and its first row is a header. The values in the first column, Id, are unique, so you can use that column as a primary key.

Id,Name,Age
A1,Alice,6
A2,Becca,5
B1,Cindy,7

Now I want to find the Name where Id is A2; the answer should be "Becca".

In SQL it would be: SELECT Name FROM table WHERE Id = "A2"

How can I do this in Python 3.x? The operation is simple enough that I'd like to use the Standard Library (e.g. csv) rather than a third-party package like pandas.

  • Hey, I've added an answer that will create an OrderedDict from csv.DictReader. Please check it out! If you know SQL, maybe check the "dataset" package to easily go from CSV --> SQLite. dataset.readthedocs.io/en/latest Commented Jul 23, 2019 at 18:04
  • If Id is the key, I suggest you index your data by Id and keep a reference to each row (its values) in the index. You can build such an index for each key column and store it in a map data structure. A search on a non-indexed column will be a full table scan (traversing all rows). To support SQL-like syntax, you would need to write your own interpreter. Commented Jul 23, 2019 at 18:05
  • It is surprising that we can't really do such basic SQL tasks using the csv module alone %-/ Commented Jul 23, 2019 at 18:27

2 Answers

I think the csv.DictReader class can be utilized to create a dictionary mapping that you can index by the value of the Id column:

import csv
from collections import OrderedDict

kids = OrderedDict()
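# Positions within each row's remaining values (the columns after Id)
_name = 0
_age = 1

with open('kidsList.csv', newline='') as csvfile:
    # Only "Id" is named; the other columns are collected into a list under "data_list"
    reader = csv.DictReader(csvfile, fieldnames=("Id",), restkey="data_list")
    next(reader)  # skip the header row, which would otherwise be stored as data
    for row in reader:
        kids[row["Id"]] = row["data_list"]

print(f"ID = A1 has data: name= {kids['A1'][_name]}, age= {kids['A1'][_age]}")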

# Expected Output:
# ID = A1 has data: name= Alice, age= 6
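
If you'd rather look columns up by name than by position, here is a minimal sketch of a variation: let csv.DictReader take its field names from the header row and index the row dicts by Id. The helper name index_by is made up for illustration, and io.StringIO stands in for the file so the snippet runs on its own; with the real file you would pass open('kidsList.csv', newline='') instead. It assumes the key column's values are unique, as the question states.

```python
import csv
import io

# Sample data copied from the question; io.StringIO is only a stand-in
# for open('kidsList.csv', newline='').
SAMPLE = "Id,Name,Age\nA1,Alice,6\nA2,Becca,5\nB1,Cindy,7\n"


def index_by(csvfile, key_column):
    """Hypothetical helper: map each key value to its row dict.

    Assumes key_column values are unique; a later duplicate would
    overwrite an earlier row.
    """
    reader = csv.DictReader(csvfile)  # field names come from the header row
    return {row[key_column]: row for row in reader}


kids_by_id = index_by(io.StringIO(SAMPLE), "Id")
print(kids_by_id["A2"]["Name"])  # Becca
```

Building the index reads the whole file once; after that each lookup by Id is a plain dict access. Note that every value is a string, so Age comes back as "5", not 5.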

2 Comments

Yah, exactly to get the "Name" value. Glad this worked for you! You might try something like _name= 0; _age= 1; print(kids['A2'][_name]);
Cleaned up, thanks for catching that in public facing code!
You can use the csv module to read the CSV file into a 2D list, and then loop through its rows:

import csv

key = 'A2'
category = 'Name'

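with open('kidsList.csv', newline='') as file:
    contents = list(csv.reader(file))

# Column position of the requested category, taken from the header row
index = contents[0].index(category)

# Scan the data rows (skipping the header) for the matching Id
for row in contents[1:]:
    if row[0] == key:
        print(row[index])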
        break
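
The same scan can be written closer to the SQL in the question. This is only a sketch: select_first is a hypothetical helper name, and io.StringIO stands in for the file so the snippet runs on its own (with the real file, pass open('kidsList.csv', newline='')). It stops at the first matching row and returns None when nothing matches.

```python
import csv
import io

# Sample data copied from the question; io.StringIO is only a stand-in
# for open('kidsList.csv', newline='').
SAMPLE = "Id,Name,Age\nA1,Alice,6\nA2,Becca,5\nB1,Cindy,7\n"


def select_first(csvfile, column, where_column, where_value):
    """Hypothetical helper, roughly:
    SELECT column FROM table WHERE where_column = where_value LIMIT 1
    """
    reader = csv.DictReader(csvfile)  # field names come from the header row
    matches = (row[column] for row in reader if row[where_column] == where_value)
    return next(matches, None)  # None if no row matches


print(select_first(io.StringIO(SAMPLE), "Name", "Id", "A2"))  # Becca
print(select_first(io.StringIO(SAMPLE), "Name", "Id", "C1"))  # None
```

This reads rows lazily instead of loading the whole file into a list, which is enough for a one-off lookup. For many lookups against the same file, building a dict keyed by Id once is likely the better fit.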

6 Comments

No that's not what I meant. I want to find the cell using a condition without knowing its row number. In this example, "Becca" is in the second row but imagine that CSV file is longer and you want to do SELECT name FROM table WHERE Id = "C1".
Do you only index into the data by the Id value?
Yes. Please note unlike key-value pairs (e.g. dictionary, JSON), it has 3+ columns.
@Culip sorry didn't quite understand your question, I've edited my answer to better address it
I got "IndexError: string index out of range".