I have a pandas Series (it could just as well be a plain list; that is not important) of lists. Each list contains positive and negative numbers (to keep things simple; they could also be letters or words), such as:

0 [12,-13,0,6]  
1 [2,-3,8,233]  
2 [0,6,8,3]  

For each of these, I want to fill a row in a three-column DataFrame with a list of all positive values, a list of all negative values, and a list of all values falling within some interval, such as:

 [[12,6],[-13],[0,6]]   
 [[2,8,233],[-3],[2,8]]   
 [[6,8,3],[],[6,8,3]]   

My first idea was to use a list comprehension to generate a list of triads (lists of three lists), which pd.DataFrame would then convert to the right form. I want to avoid looping over the list of lists three times, applying a different selection rule each time; that feels slow and dull.

The problem is that I can't manage to build the triad [[positive], [negative], [interval]] properly. I was using a syntax like:

[[[positivelist.extend(number)],[negativelist], [intervalist.extend(number)]]\  

for listofnumbers in listoflists for number in listofnumbers\
if number>0 else [positivelist],[negativelist.extend(number)], [intervalist.extend(number)]]

But let's be honest, this is unreadable, and anyway it doesn't do what I want, since extend returns None.
So how can I do this without looping three times? (I could have many millions of elements in the list of lists and in the sublists, and I might want to apply more complex formulas to these numbers later; this is a first approach.)
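To illustrate the point about extend, here is a minimal sketch (the variable names are made up for the example). It shows that `list.extend` mutates the list in place and returns `None`, that it expects an iterable rather than a single number, and that `append` is the call that adds one element:

```python
numbers = [12, 6]

# extend mutates the list in place and returns None,
# so using its return value inside a comprehension yields None entries.
result = numbers.extend([3])
print(result)    # None
print(numbers)   # [12, 6, 3]

# extend expects an iterable; passing a bare int raises TypeError.
try:
    numbers.extend(5)
except TypeError as exc:
    print("TypeError:", exc)

# append is the method that adds a single element.
numbers.append(5)
print(numbers)   # [12, 6, 3, 5]
```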

I thought about using functional programming (map/lambda), but that seems unpythonic. The question is: what in Python would help me do this right?

My guess would be something like:

newlistoflist = []
for sublist in lists:
    positive = []
    negative = []
    interval = []
    for element in sublist:
        # append (not extend) adds a single element to a list
        if element > 0:
            positive.append(element)
        if element < 0:
            negative.append(element)
        if n < element < m:
            interval.append(element)
    triad = [positive, negative, interval]
    newlistoflist.append(triad)

What do you think?

1 Answer

You can do:

import numpy
l = [[12,-13,0,6], [2,-3,8,233], [0,6,8,3]]
l = numpy.array([x for e in l for x in e])
positive = l[l>0]
negative = l[l<0]
n,m = 1,5
interval = l[((l>n) & (l<m))]
print(positive, negative, interval)

Output: [ 12 6 2 8 233 6 8 3] [-13 -3] [2 3]

Edit: Triad version:

import numpy
l = numpy.array([[12,-13,0,6], [2,-3,8,233], [0,6,8,3]])
n,m = 1,5
# dtype=object is needed because the sub-arrays have different lengths
triad = numpy.array([[e[e>0], e[e<0], e[((e>n) & (e<m))]] for e in l], dtype=object)
print(triad)

Output:

[[array([12,  6]) array([-13]) array([], dtype=int64)]
 [array([  2,   8, 233]) array([-3]) array([2])]
 [array([6, 8, 3]) array([], dtype=int64) array([3])]]
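Since the question asks for a three-column DataFrame, the triad idea can be carried over to pandas roughly as follows. This is only a sketch under assumptions: the helper `split_row` and the column names `positive`, `negative`, and `interval` are made up for illustration, and the interval is taken as strictly between `n` and `m`, as in the code above.

```python
import numpy as np
import pandas as pd

def split_row(values, low, high):
    """Return [positives, negatives, values strictly between low and high]."""
    arr = np.asarray(values)
    return [
        arr[arr > 0].tolist(),
        arr[arr < 0].tolist(),
        arr[(arr > low) & (arr < high)].tolist(),
    ]

series = pd.Series([[12, -13, 0, 6], [2, -3, 8, 233], [0, 6, 8, 3]])
n, m = 1, 5

# One pass over the rows; each row becomes one triad, i.e. one DataFrame row.
df = pd.DataFrame(
    [split_row(row, n, m) for row in series],
    columns=["positive", "negative", "interval"],  # hypothetical column names
    index=series.index,
)
print(df)
```

Each cell then holds a plain Python list, so the result stays an object-dtype DataFrame; the masks are still evaluated three times per row, but the outer list is only traversed once.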

5 Comments

That's a start, but it doesn't do what I want. With your example, I'd like a triad for each list: [[12,6],[-13],[0,6]] [[2,8,233],[-3],[2,8]] [[6,8,3],[],[6,8,3]]
Can you give n, m for the interval?
@AndoJurai: If you change n,m values it should be it.
Thanks. What if I want to use the same principle on lists containing words instead? I guess the masking could be reproduced (and eventually transposed to pandas, which would be more practical for me), but I can't be sure.
You can use the same principle, e.g. l = numpy.array([['alpha', 'beta', 'gamma'], ['theta', 'epsilon', 'zeta'], ['delta', 'sigma', 'omega']], dtype=str), then triad = numpy.array([[e[e>'a'], e[e<'h'], e[((e>'a') & (e<'g'))]] for e in l], dtype=object)
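For words, a NumPy-free alternative is also possible. The sketch below is one way to do it in a single pass over each sublist, assuming the selection rules can be written as ordinary Python predicates; the helper name `split_by_predicates` and the three example rules are hypothetical, chosen to mirror the comparisons in the comment above.

```python
def split_by_predicates(items, predicates):
    """Single pass over items: put each item in every bucket whose predicate holds."""
    buckets = [[] for _ in predicates]
    for item in items:
        for bucket, predicate in zip(buckets, predicates):
            if predicate(item):
                bucket.append(item)
    return buckets

words = [
    ["alpha", "beta", "gamma"],
    ["theta", "epsilon", "zeta"],
    ["delta", "sigma", "omega"],
]

# Example rules (assumptions): plain lexicographic string comparisons.
predicates = [
    lambda w: w > "a",
    lambda w: w < "h",
    lambda w: "a" < w < "g",
]

triads = [split_by_predicates(row, predicates) for row in words]
for triad in triads:
    print(triad)
```

The same function works unchanged on numbers (for instance with `lambda x: x > 0`), and more complex rules only require adding another predicate to the list.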