I have a nested list whose sublists have different sizes and types.

import numpy as np
import ROOT

def read(f, tree, objects):
    Event = []
    for o in objects:
        # find the different features (branches) of one class
        temp = [i.GetName() for i in tree.GetListOfBranches() if i.GetName().startswith(o)]
        tempList = []  # contains one class of objects
        for t in temp:
            #print t
            tempList.append(t)
            comp = np.asarray(getattr(tree, t))
            tempList.append(comp)
        # one sublist per class, so event[0] corresponds to objects[0]
        Event.append(tempList)
    return Event



def main():
    path="path/to/file"
    objects= ['TauJet', 'Jet', 'Electron', 'Muon', 'Photon', 'Tracks', 'ETmis', 'CaloTower']

    f=ROOT.TFile(path)
    tree=f.Get("RecoTree")
    tree.GetEntry(100)
    event=read(f,tree,objects)

For example, the result of event[0] is:

['TauJet', array(1), 'TauJet_E', array([ 31.24074173]), 'TauJet_Px', array([-28.27997971]), 'TauJet_Py', array([-13.18042469]), 'TauJet_Pz', array([-1.08304048]), 'TauJet_Eta', array([-0.03470514]), 'TauJet_Phi', array([-2.70545626]), 'TauJet_PT', array([ 31.20065498]), 'TauJet_Charge', array([ 1.]), 'TauJet_NTracks', array([3]), 'TauJet_EHoverEE', array([ 1745.89221191]), 'TauJet_size', array(1)]

How can I convert it into a NumPy array?

NOTE 1: np.asarray(event, "object") is slow, so I am looking for a better way. Also, np.fromiter() is not applicable because I don't have a fixed type.

NOTE 2: I don't know the length of my Events.

NOTE 3: I can also get rid of the names if that makes things easier.

I would suggest taking a look at pandas DataFrames. I remember (hopefully correctly) reading somewhere that there is some support for columns of different lengths. Besides, they support a number of NumPy arithmetic operations. Commented Mar 8, 2013 at 12:26

1 Answer

You could try something like this, though I'm not sure how fast it's going to be. This creates a NumPy record array for the first row.

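data = event[0]
keys = data[0::2]
vals = data[1::2]
# Some entries are zero-rank arrays and others are length-1 arrays, so
# convert each one to a plain Python float (np.float is gone from recent NumPy).
# This only works while every array holds exactly one value.
temp = [np.asarray(v, dtype=float).item() for v in vals]
# you could also just create a NumPy array from the line above with np.array(temp)
# "formats" needs one entry per field; ("f4")*len(vals) would only repeat the string
dtype = {"names": keys, "formats": ["f4"] * len(vals)}
myArr = np.rec.fromarrays(temp, dtype=dtype)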

#test it out
In [53]: myArr["TauJet_Pz"]
Out[53]: array(-1.0830404758453369, dtype=float32)


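# Alternatively, you could try something like this, which just creates a 2-D NumPy array.
# It assumes every row has the same fields and exactly one value per field.
vals = np.array([[np.asarray(v, dtype=float).item() for v in row[1::2]] for row in event])
# Now create a record array from that using the dtype above.
# fromarrays expects one array per field, hence the transpose.
myRecordArray = np.rec.fromarrays(vals.T, dtype=dtype)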

9 Comments

I can't get this to work. As I said, I have lists with different lengths. So when an entry has a different length than the one I showed here, I get: temp2=[np.float(v) for v in vals] TypeError: only length-1 arrays can be converted to Python scalars. And even for this length I get: myArr = np.rec.fromarrays(temp2, dtype=dtype) File "/usr/lib/pymodules/python2.7/numpy/core/records.py", line 537, in fromarrays descr = sb.dtype(dtype) TypeError: data type not understood
It sounds like there are arrays in the rows with more than one element, i.e. something like ["TauJet_E", array([31.56, 45.14])]. That would reproduce the error you see. Why would that be the case?
My code is exactly as I posted in the question! As I said, the length of the vector can vary from one object to another when I do comp=np.asarray(getattr(tree,t)). So, for example, I may have: 'Jet_E', array([ 391.62017822, 31.24074173]). I fixed this using v.astype(float), but I still get TypeError: data type not understood in the last part.
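The "data type not understood" error reported above is consistent with how the "formats" entry is written in the answer: ("f4") is just the string "f4" (parentheses alone do not make a tuple), so multiplying it repeats the string instead of producing one format per field. A minimal sketch of the difference, using made-up field names rather than the real branch names:

```python
import numpy as np

keys = ["E", "Px", "Py"]  # hypothetical field names, for illustration only

# ("f4") is a plain string, so this is string repetition, not a sequence of formats.
bad_formats = ("f4") * len(keys)
print(repr(bad_formats))

try:
    np.dtype({"names": keys, "formats": bad_formats})
    bad_ok = True
except (TypeError, ValueError) as exc:
    # NumPy cannot interpret the repeated string as per-field formats.
    bad_ok = False
    print("rejected:", type(exc).__name__)

# One format string per field is accepted.
good_dtype = np.dtype({"names": keys, "formats": ["f4"] * len(keys)})
print(good_dtype.names)
```

Writing ("f4",) * len(keys), with the trailing comma, would also give one entry per field.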
So what are the separate values when there are multiple values for a particle?
For example, Electron_E with values [23,32,23] means we have three electrons with energies 23, 32, and 23.
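Given that reading (each array holds one value per particle), one possible layout is a record array per object class, with one row per particle. The sketch below assumes that all per-particle branches of a class have the same length and that the zero-rank entries (the class count and the _size branch) can be dropped. The helper name class_to_recarray and the sample values are made up for illustration.

```python
import numpy as np

def class_to_recarray(flat):
    """Turn [name, array, name, array, ...] into a record array, one row per particle."""
    names = flat[0::2]
    arrays = [np.asarray(a) for a in flat[1::2]]
    # Keep only 1-d branches; zero-rank entries (counts) are skipped.
    cols = [(n, a.astype("f4")) for n, a in zip(names, arrays) if a.ndim == 1]
    lengths = {len(a) for _, a in cols}
    if len(lengths) != 1:
        raise ValueError("expected per-particle branches of one common length")
    return np.rec.fromarrays([a for _, a in cols], names=[n for n, _ in cols])

# Made-up data in the same flat layout as event[0] in the question.
electrons = ["Electron", np.array(3),
             "Electron_E", np.array([23.0, 32.0, 23.0]),
             "Electron_Charge", np.array([1.0, -1.0, 1.0]),
             "Electron_size", np.array(3)]

rec = class_to_recarray(electrons)
print(rec.Electron_E)  # energies of all three electrons
print(rec[1])          # all fields of the second electron
```

Since each class can have a different number of particles, the per-class record arrays would have to live in a list or dict rather than in one rectangular array.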