0

My data looks like this:

['[\'Patient, A\', \'G\', \'P\', \'RNA\']']

Ignoring the brackets, quotes and backslashes, I'd like to split the data on ',' and write it to a CSV file like this:

Patient,A,G,P,RNA

Specifying delimiter=',' did not help. The output file then looks like

['Patient, A','G','P','RNA']

all in a single cell. I want to split them into multiple columns. How can I do that?

Edit - Specifying quotechar='|' split them into different cells, but the output now looks like

|['Patient, A','G','P','RNA']|

Edit-

out_file_handle = csv.writer(out_file, quotechar='|', lineterminator='\n', delimiter = ",")
data = ''.join(mydict.get(word.lower(), word) for word in re.split('(\W+)', transposed))
data = [data,]
out_file_handle.writerow(data)

transposed:

['Patient, A','G','P','RNA']

data:

['[\'Patient, A\', \'G\', \'P\', \'RNA\']']

The data has multiple rows; the above is just one of them.

5
  • 2
    Something that'd get you started: docs.python.org/2/library/csv.html#csv.writer Commented Nov 24, 2014 at 0:34
  • @BorrajaX Thank you. That is how I tried it. I tried multiple ways, specifying dialects, delimiters and lineterminators, but nothing worked. Maybe I'm overlooking something. I'd really appreciate it if you could help me out. Commented Nov 24, 2014 at 0:37
  • 1
    Can you edit your question to show a more specific example of your code? Maybe we'll be able to spot something? Thx Commented Nov 24, 2014 at 0:37
  • 1
    To start off, your list contains a string that looks like a list. 'Patient, A' will be one element (both parts will occupy the same cell in your CSV file). Is that what you want? Commented Nov 24, 2014 at 0:38
  • @Beginner No. They should be in two different cells. Commented Nov 24, 2014 at 0:44

4 Answers 4

1

You first need to read this data into a Python list, by processing the string as a CSV file in memory:

from StringIO import StringIO  # Python 2; on Python 3 use: from io import StringIO
import csv
data = ['[\'Patient, A\', \'G\', \'P\', \'RNA\']']
clean_data = list(csv.reader(StringIO(data[0])))

However, the fields still carry the brackets and quotes, because the string is not well-formed CSV. In that case, the best thing might be to filter out all those junk characters:

import re
clean_data = re.sub(r"[\[\]']", "", data[0])

Now clean_data is 'Patient, A, G, P, RNA', which is clean CSV you can write straight to a file.
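As a rough end-to-end sketch of this approach on Python 3: the helper name `clean_row` and the in-memory buffer are illustrative assumptions, not part of the original answer. It strips the brackets and quotes with the same regex, splits on commas, and hands the resulting list to `csv.writer`.

```python
import csv
import io
import re

# One row as shown in the question: a list holding a single string
# that merely looks like a list.
data = ['[\'Patient, A\', \'G\', \'P\', \'RNA\']']

def clean_row(raw):
    """Hypothetical helper: drop brackets and single quotes, then split on commas."""
    stripped = re.sub(r"[\[\]']", "", raw)
    return [field.strip() for field in stripped.split(",")]

fields = clean_row(data[0])

# An in-memory buffer stands in for a file here; a real file opened with
# open(path, "w", newline="") would be used the same way on Python 3.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(fields)
csv_text = buffer.getvalue()
```

The key point is that `writerow` receives a list with one item per column, rather than a single string.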


6 Comments

Why use re when string.replace() would do just as well? It's faster, and clearer.
It should be much faster to perform a single pass of the re engine than three passes of string.replace() to remove all the different junk characters: [, ] and ' (probably more if there are other examples).
If we can be sure the [] characters are at the start and end, then you are correct and the code could read: clean_data = data[0][1:-1].replace('\'','')
the [ and ] are always at the beginning and end, so string[1:-1] works for that. All that's needed is string.replace("\'","")[1:-1]. Why use regex when it's completely overkill?
No, given that the quote characters are inconsistently distributed there is no reason to expect the brackets will conform to any kind of standard pattern. It looks like the OP is cleaning junk data, and there will probably be more characters to filter, plus a round of whitespace cleaning.
1

If what you're trying to do is write data in the form of ['[\'Patient, A\', \'G\', \'P\', \'RNA\']'], where you have a list of these strings, to a file, then it's really a question in two parts.

The first is how to separate the data into the correct format, and the second is how to write it to a file.

If that is the form of your data, for every row, then something like this should work (to get it into the correct format):

data = ['[\'Patient, A\', \'G\', \'P\', \'RNA\']', ...]
newData = [[field.strip() for field in entry.replace("\'", "")[1:-1].split(",")] for entry in data]

that will give you data in the following form:

[["Patient", "A", "G", "P", "RNA"], ...]

and then you can write it to a file as suggested in the other answers:

import csv

# Python 2 shown; on Python 3 use open('new.csv', 'w', newline='')
with open('new.csv', 'wb') as write_file:
  file_writer = csv.writer(write_file)
  for dataEntry in newData:
    file_writer.writerow(dataEntry)

If you don't actually care about using the data in this round, and just want to clean it up, then you can just apply entry.replace("\'", "")[1:-1] to each entry in data and then write those strings to a file.

The [1:-1] bits are just to remove the leading and trailing square brackets.
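An alternative that is not proposed in this thread: since each entry looks like the repr of a Python list, `ast.literal_eval` from the standard library can parse it instead of stripping characters by hand. The sketch below assumes every entry is a valid list literal of strings; `parse_entry` is a hypothetical helper name.

```python
import ast

data = ['[\'Patient, A\', \'G\', \'P\', \'RNA\']']

def parse_entry(entry):
    """Hypothetical helper: parse the list literal, then split items on commas."""
    items = ast.literal_eval(entry)  # assumes entry is a valid list literal
    fields = []
    for item in items:
        # 'Patient, A' should end up in two cells, so split it further.
        fields.extend(part.strip() for part in item.split(","))
    return fields

rows = [parse_entry(entry) for entry in data]
```

Each element of `rows` can then be passed to `csv.writer.writerow` as in the loop above. If the entries are not reliably well-formed list literals, the string-cleaning approach is the safer choice.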

1

Python has a CSV writer. Start off with

import csv

Then try something like this

# Python 2 shown; on Python 3 use open('new.csv', 'w', newline='')
with open('new.csv', 'wb') as write_file:
    file_writer = csv.writer(write_file)
    for row in data:
        file_writer.writerow(row)

Edit:

You might have to wrangle the data a bit before writing it, since it looks like it's a string and not actually a list. Try playing around with the split() function:

fields = data.split(',')
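To see why the shape of the row matters, here is a small Python 3 sketch (the helper `to_csv_line` is an illustrative assumption). `csv.writer.writerow` writes one cell per item in the sequence it receives, so a list holding a single string produces a single cell, quoted because it contains the delimiter.

```python
import csv
import io

def to_csv_line(row):
    """Hypothetical helper: return the CSV text that writerow produces for one row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(row)
    return buffer.getvalue()

as_string = "['Patient, A', 'G', 'P', 'RNA']"

# A one-element list: the whole string lands in one quoted cell.
one_cell = to_csv_line([as_string])

# A five-element list: one cell per item.
many_cells = to_csv_line(["Patient", "A", "G", "P", "RNA"])
```

Here `one_cell` is the entire string wrapped in double quotes, while `many_cells` is `Patient,A,G,P,RNA`. Changing `quotechar` only changes the wrapping character; it does not split the string.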

1 Comment

Thank you. But it's actually a list.
0
"""
                             SAVING DATA INTO CSV FORMAT
    * CSV is used for many purposes, for example to store datasets for deep learning.
    * A CSV file can be opened in MS Excel or any similar
      application.
"""
# == Imports ===================================================================

import csv
import sys

# == Initialisation Function ===================================================

def initialise_csvlog(filename, fields):
    """
    Call this function before using the insertion function.

    * Checks the file before any data is added, so that rows are only
      appended to a file with a known header
    * If the file does not exist, a new one is created with the given fields
    * If it exists, the header row is read and the user is asked to
      supply a new empty file

    Parameters
    ----------
    filename : String
        Filename, including the directory, of the file to create
    fields : List
        Columns to initialise

    """
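    try:
        with open(filename, 'r') as csvfile:
            csvreader = csv.reader(csvfile)
            # next() works on Python 2 and 3; raises StopIteration on an empty file
            fields = next(csvreader)
    except (IOError, StopIteration):
        # File is missing or empty: create it and write the header row
        with open(filename, 'w') as csvfile:
            csvwriter = csv.writer(csvfile)
            csvwriter.writerow(fields)
    else:
        # Exit outside the try block; a bare except would swallow SystemExit
        print("Data Already Exists")
        sys.exit("Please Create a new empty file")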

# == Data Insertion Function ===================================================

def write_data_csv(filename, row_data):
    """
    Append rows to the CSV file created by initialise_csvlog.
    * row_data must be a list of lists, one inner list per row

    Parameters
    ----------
    filename : String
        Filename, including the directory, of the file to append to
    row_data : List
        List of rows, where each row is a list of column values
    """
    with open(filename,'a') as csvfile:

        csvwriter = csv.writer(csvfile)
        csvwriter.writerows(row_data)

if __name__ == '__main__':
    """
    This block tests the module when it is run on its own.

    NOTE: row_data MUST BE A LIST OF LISTS, AS SHOWN BELOW
    """
    filename = "TestCSV.csv"
    fields = ["sno","Name","Work","Department"]
    #Init
    initialise_csvlog(filename,fields)
    #Add Data
    row_data = [["1","Jhon","Coder","Pythonic"]]
    write_data_csv(filename,row_data)

# == END =======================================================================

Read through the module and you can start writing CSV files and viewing the data in Excel or any similar application (Calc in LibreOffice).

NOTE: Remember that the data must be a list of lists, as shown by row_data in the __main__ block.
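A short Python 3 sketch of why the nesting matters (the helper `rows_to_csv` is an illustrative assumption). `csv.writer.writerows` treats each item of its argument as one row, and each row as a sequence of fields, so a flat list of strings is written as one row per string with one character per cell.

```python
import csv
import io

def rows_to_csv(rows):
    """Hypothetical helper: return the CSV text that writerows produces."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()

# List of lists: one row with four cells.
nested = rows_to_csv([["1", "Jhon", "Coder", "Pythonic"]])

# Flat list of strings: each string becomes its own row,
# and each character becomes its own cell.
flat = rows_to_csv(["1", "Jhon"])
```

With the nested form the output is a single line, `1,Jhon,Coder,Pythonic`; with the flat form the name is broken into `J,h,o,n` on its own row.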
