I am very new to Python. I have been trying to detect missing data in lists built from imported CSV files so that I can plot the series with matplotlib without getting an error.

Here is what I have so far:

import numpy as np
# import matplotlib.pyplot as plt
import csv
from pylab import *

res = csv.reader(open('cvs_file_with_data.csv'), delimiter=',')
res.next() # do not read header

ColOneData = []
ColTwoData = []
ColThreeData = []

for col in res:
    ColOneData.append(col[0])
    ColTwoData.append(col[1])
    ColThreeData.append(col[2])

print ColOneData # I got here the following ['1', '2', '3', '4', '5'] 

print ColTwoData # I got here the following ['1', '2', '', '', '5']

print ColThreeData # I got here the following ['', '', '3', '4', '']

ColTwoData_M = np.ma.masked_where(ColTwoData == '', ColTwoData) # This does not work

I need to mask the empty values (e.g. '') so that I can plot the series without errors. Any suggestions on how to solve this problem?

Regards...

3 Answers

What do you mean by mask? Remove? If so, try the following:

masked_data = [point for point in data if point != '']

Edit:

I'm not familiar with numpy, but maybe this is what you are looking for:

>>> import numpy
>>> data = numpy.array(['0', '', '1', '', '2'])
>>> numpy.ma.masked_where(data == '', data)
masked_array(data = [0 -- 1 -- 2],
             mask = [False True False True False],
       fill_value = N/A)
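The comparison in the question fails because ColTwoData is a plain list, so ColTwoData == '' evaluates to a single False instead of an element-wise mask. Below is a minimal sketch of one way to get a numeric masked array that can be handed to a plotting call. It assumes Python 3 and that every non-empty field parses as a float; the helper name to_masked_floats is hypothetical and not part of numpy.

```python
import numpy as np

# Sample columns as csv.reader would return them: every field is a string,
# and a missing field is an empty string.
col_one = ['1', '2', '3', '4', '5']
col_two = ['1', '2', '', '', '5']

def to_masked_floats(column):
    """Convert strings to floats and mask the entries that were empty.

    Hypothetical helper: empty strings become NaN first, then
    np.ma.masked_invalid masks every NaN entry.
    """
    values = np.array([float(v) if v != '' else np.nan for v in column])
    return np.ma.masked_invalid(values)

x = to_masked_floats(col_one)
y = to_masked_floats(col_two)

print(y.mask.tolist())         # [False, False, True, True, False]
print(y.compressed().tolist()) # [1.0, 2.0, 5.0]
```

Passing x and y to plt.plot should then leave a gap at the masked positions rather than raising an error, since matplotlib accepts masked arrays.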
2 Comments

Hi jena, I don't want to remove the empty or missing data from the list. I need to mask it so that, when plotting with matplotlib, the corresponding point is left empty. For example, if I try to plot ColOneData against ColTwoData as they are now, I get an error: 'plt.plot(ColOneData, ColTwoData)' followed by 'show()' fails.
@Jose: what if the missing data is represented as '0'? Does that plot correctly?
Jose, if you wish to plot column1 against column2 and not have the empty items cause errors, you will have to remove the empty items in column2 along with the corresponding items in column1. A function like the following should do the trick.

def remove_empty(col1, col2):
    # make copies so our modifications don't clobber the original lists
    col1 = list(col1) 
    col2 = list(col2)
    i = 0
    while i < len(col1):
        # if either the item in col1 or col2 is empty remove both of them
        if col1[i] == '' or col2[i] == '':
            del col1[i]
            del col2[i]
        # otherwise, increment the index
        else:
            i += 1
    return col1, col2
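The same pairwise filtering can also be sketched without index bookkeeping by zipping the two columns together. This is an alternative formulation, not the function above; it assumes Python 3, and the name remove_empty_pairs is hypothetical.

```python
def remove_empty_pairs(col1, col2):
    """Return copies of col1 and col2 without the positions where either is ''."""
    # Keep only the (a, b) pairs in which both values are present.
    kept = [(a, b) for a, b in zip(col1, col2) if a != '' and b != '']
    if not kept:
        # zip(*[]) cannot be unpacked into two names, so handle this case explicitly.
        return [], []
    xs, ys = zip(*kept)
    return list(xs), list(ys)

col_one = ['1', '2', '3', '4', '5']
col_two = ['1', '2', '', '', '5']
print(remove_empty_pairs(col_one, col_two))
# (['1', '2', '5'], ['1', '2', '5'])
```

Note that zip stops at the shorter input, so columns of unequal length are silently truncated here, whereas the while loop above would raise an IndexError if col2 were shorter than col1.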

If what you want to do is add a filler value to the empty nodes you could do something like this:

def defaultIfEmpty(a):
    if a == '':
        return '0'

    return a

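x = ['0', '', '2', '3', '']
x = list(map(defaultIfEmpty, x))  # map returns a new sequence; it does not modify x in place

result: x = ['0', '0', '2', '3', '0']

If that's the result you're looking for, you could apply the same call to each column: ColOneData = list(map(defaultIfEmpty, ColOneData)), then ColTwoData, and so on.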
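Since the filled values are still strings, a plot may also need them converted to numbers. The following sketch does the filling and the conversion in one pass. It assumes Python 3 and that every non-empty field parses as a float; to_floats and its default parameter are hypothetical names, not taken from the code above.

```python
def to_floats(column, default=0.0):
    """Convert each string to a float, using `default` where the string is empty."""
    return [float(v) if v != '' else default for v in column]

x = ['0', '', '2', '3', '']
filled = to_floats(x)
print(filled)  # [0.0, 0.0, 2.0, 3.0, 0.0]
```

Keep in mind that a filler of 0 is drawn as a real data point at zero, so this only fits cases where that is an acceptable stand-in for a missing value.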
