Python program to update or append a row in a CSV file depending on whether the value already exists

Question

I am trying to write values to an existing CSV file with Python. If the snapshot ID is already in the file, its row should be replaced; if the data is new, it should be appended to the same file.

What I tried so far:

csv data:

 SnapshotId,promotion level
 1.9.0.0,Tested
 2.0.0.1,Unit tested

Initially I was just appending (this works):

 with codecs.open('sample.csv','a',encoding='utf8') as newFile:
     newFileWriter = csv.writer(newFile)
     newFileWriter.writerow([str(snapshot_id),str(unitTest)])

Later my requirement changed: if the incoming snapshot ID already exists, I should update the promotion level in that row; if the snapshot ID is new, I should append a row with the snapshot ID and promotion level.

I tried this with pandas:

import pandas as pd
import os,sys
snapID = sys.argv[1]
promLevel = sys.argv[2]
df = pd.read_csv('sample.csv')
def update_table(snapID, promLevel):

    if snapID in df['SnapshotId']:
        print("Updating promotion level")
        df.loc[df['SnapshotId'] == snapID, ['promotion_level']] = promLevel
    else:
        print("Adding new snapshot")
        return df.append({'SnapshotId': snapID, 'promotion_level': promLevel}, ignore_index=True)
    return df

dd = update_table(str(snapID),promLevel)
print(dd)

I am able to locate the index, but I am not sure how to check whether the snapshot ID already exists and replace the whole row with the new values in the CSV. Any insights would be a great help.

If I understand your question correctly, you need a database rather than the file system. With a plain file you can append content, but you can't update a value in place. If you want to update, you need to save the data as a new file using to_csv. Many databases provide the functionality you require :-)
– Mohamed Thasin ah, Commented Jan 23, 2019 at 10:26
What you said is absolutely right, as it's easily doable in a database, but there are third-party applications already using this file, so I am restricted from making bigger changes.
– Sathish kumar, Commented Jan 23, 2019 at 10:33
When you want to update a file, read the latest CSV file you have already written, then make the changes in the current df (the one you read from the CSV). Then call to_csv without the append argument. That way you make sure your file has the updated value without losing your data.
– Mohamed Thasin ah, Commented Jan 23, 2019 at 10:40
Exactly, that would be the final move if nothing else works. I hope I won't have any data loss or duplicate values.
– Sathish kumar, Commented Jan 23, 2019 at 10:47
The only way to "update" a file is to rewrite it, just as @MohamedThasinah explained. Read the dataframe, update it and rewrite the CSV. There won't be any data loss or duplicated values if you code it carefully. I can post an answer with the code if you would like me to. This will only work if you make sure those third-party applications also update their buffer. If they have the file cached somehow, you will have to think things through.
– GRoutar, Commented Jan 23, 2019 at 12:31
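
The read, update, rewrite cycle described in the comments above can be sketched with only the standard library. This is a minimal sketch, not the approach from the accepted answer: the function name upsert_row is hypothetical, and it assumes the file has a header row, that the snapshot ID is the first column and the promotion level the second. Writing to a temporary file and swapping it in with os.replace is one way to reduce the chance that another application reads a half-written file; it does not help if that application keeps its own cached copy.

```python
import csv
import os
import shutil
import tempfile

def upsert_row(path, snapshot_id, promotion_level):
    """Update the row whose first column equals snapshot_id, or append one."""
    with open(path, newline='', encoding='utf8') as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]  # assumes a header row is present

    for row in body:
        if row and row[0] == snapshot_id:
            row[1] = promotion_level  # id already present: replace the value
            break
    else:
        body.append([snapshot_id, promotion_level])  # id is new: append a row

    # Write to a temporary file in the same directory, then swap it in.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    with os.fdopen(fd, 'w', newline='', encoding='utf8') as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(body)
    shutil.copymode(path, tmp_path)  # keep the original file's permissions
    os.replace(tmp_path, path)
```

With the sample data from the question, calling upsert_row('sample.csv', '1.9.0.0', 'Rejected') would change the existing row, while an ID that is not in the file yet would be added as a new last row.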

GRoutar · Accepted Answer · 2019-01-23 16:41:39Z

1

The code below reads a CSV file and looks for a SnapshotId. If it exists, the promotion level is replaced with the new value; otherwise a new row is appended:

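import pandas as pd

def update_table(snapID, promLevel):

    # Reading file; dtype=str keeps ids such as '1.9.0.0' as text
    df = pd.read_csv('file.csv', dtype=str)

    # Exact match on the id (str.contains would also match substrings)
    mask = df['SnapshotId'] == snapID
    if mask.any():
        # Updating promotion level
        df.loc[mask, 'Promotion_Level'] = promLevel
        return df
    else:
        # Adding new snapshot (DataFrame.append was removed in pandas 2.0)
        new_row = pd.DataFrame([{'SnapshotId': snapID, 'Promotion_Level': promLevel}])
        return pd.concat([df, new_row], ignore_index=True)

# Updating original file; index=False avoids writing an extra index column
update_table('id01', 'P10').to_csv('file.csv', index=False)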

A working example:

import pandas as pd

data = {
    'SnapshotId': [1,2,3,4,5],
    'Promotion_Level': ['P1','P2','P3','P4','P5']
}
df = pd.DataFrame(data)

def update_table(snapID, promLevel):

    #df = pd.read_csv('file.csv')

    # Work on a copy so the original df is left untouched between calls
    result = df.copy()
    # The ids here are integers, so compare values directly
    mask = result['SnapshotId'] == snapID
    if mask.any():
        # Updating promotion level
        result.loc[mask, 'Promotion_Level'] = promLevel
        return result
    else:
        # Adding new snapshot (DataFrame.append was removed in pandas 2.0)
        new_row = pd.DataFrame([{'SnapshotId': snapID, 'Promotion_Level': promLevel}])
        return pd.concat([result, new_row], ignore_index=True)


# Original df
0           1              P1
1           2              P2
2           3              P3
3           4              P4
4           5              P5

> print(update_table(1, 'a'))
0           1               a
1           2              P2
2           3              P3
3           4              P4
4           5              P5

> print(update_table(10, 'a'))
0           1              P1
1           2              P2
2           3              P3
3           4              P4
4           5              P5
5          10               a

edited Jan 23, 2019 at 16:41

answered Jan 23, 2019 at 13:28


8 Comments

Sathish kumar Over a year ago

Thanks a lot @khabz, here is the output by the way: C:\Users\320047585\Sathish\Python\sample>python panda.py 1.9.0.0 sampling here SnapshotId promotion_level 0 1.9.0.0 Rejected 1 2.0.0.1 Unit tested 2 1.9.0.0 sampling

Sathish kumar Over a year ago

It just got appended rather than replaced.

Sathish kumar Over a year ago

Every time, it enters the else condition.

GRoutar Over a year ago

Your code/output is not readable at all. To insert code in comments check this link: meta.stackexchange.com/questions/74784/…

Sathish kumar Over a year ago

yes i did it like this dd = update_table(str(snapID),promLevel) print(dd)
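
One plausible reason the else branch always runs with the check from the question (`snapID in df['SnapshotId']`) is that `in` applied to a pandas Series tests the index labels, not the stored values. The sketch below illustrates the difference on a small hand-built frame; the data mirrors the sample CSV, and the column name promotion_level is taken from the output pasted above rather than from the file itself.

```python
import pandas as pd

df = pd.DataFrame({
    'SnapshotId': ['1.9.0.0', '2.0.0.1'],
    'promotion_level': ['Tested', 'Unit tested'],
})

# `in` on a Series looks at the index labels (0 and 1 here), not the values
in_index = '1.9.0.0' in df['SnapshotId']

# Comparing element-wise and reducing with any() tests the values themselves
in_values = (df['SnapshotId'] == '1.9.0.0').any()

print(in_index, in_values)  # False True
```

Under that assumption, replacing the membership test with the element-wise comparison is enough for the update branch to be reached when the ID is already present.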
