I am currently reading the csv file in "rb" mode and uploading the file to an s3 bucket.
with open(csv_file, 'rb') as DATA:
s3_put_response = requests.put(s3_presigned_url,data=DATA,headers=headers)
All of this is working fine but now I have to validate the headers in the csv file before making the put call.
When I try to run below, I get an error.
with open(csv_file, 'rb') as DATA:
csvreader = csv.reader(file)
columns = next(csvreader)
# run-some-validations
s3_put_response = requests.put(s3_presigned_url,data=DATA,headers=headers)
This throws
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
As a workaround, I have created a new function which opens the file in "r" mode and does validation on the csv headers and this works ok.
def check_csv_headers():
with open(csv_file, 'r') as file:
csvreader = csv.reader(file)
columns = next(csvreader)
I do not want to read the same file twice. Once for header validation and once for uploading to s3. The upload part also doesn't work if I do it in "r" mode.
Is there a way I can achieve this while reading the file only once in "rb" mode ? I have to make this work using the csv module and not the pandas library.
csv.readerobjet with a file opened in binary mode. It requires strings, not bytes, so you must use text mode. Why does it really matter if you read the header twice?bytes=DATA.readline()to read the first line as bytes, convert that to a string withl=bytes.decode()and then put that into a listlist=list[line], parse it withcsvreader=csv.reader(list), and finally go back to the start withDATA.seek(0). You'll have to do the same thing you would withopen(,'r')with a lot more code