Creating dataframe from a list - TypeError: object of type 'int' has no len()

Question

I am trying to create a DataFrame from a list whose rows have varying lengths.

A sample of the list looks like this (which is the shape I want):

[(dwstweets gop, broadened, base people), 1]
[(bushs campaign video, features, kat), 2]
[3]
[4]
[5]
[(president obama, wants, york), 6]
[(jeb bush, talked, enforcement), (lets, see, plan), 7]

The code I am using to build the list for each row and append it to the DataFrame is:

df2 = pd.DataFrame()
for index, row in df1.iterrows():
  doc = nlp(unicode(row))
  text_ext = textacy.extract.subject_verb_object_triples(doc)
  mylist = list(text_ext) + [index]
  df2 = df2.append(mylist, ignore_index=True)

However I get the error:

TypeError: object of type 'int' has no len()

I saw there are several questions with this error but as far as I can see they are not caused by the same thing.

How would I go about creating a DataFrame with 7 columns that is unique on the index? (I know many rows will be empty in at least 3 of the columns, and some in every column except the index.)

Thanks.

jezrael · Accepted Answer · 2018-04-17 04:47:36Z

2

I suggest building a list of lists first, appending each row's triples without [index], and then calling the DataFrame constructor once:

L = []
for index, row in df1.iterrows():
  doc = nlp(unicode(row))
  text_ext = textacy.extract.subject_verb_object_triples(doc)
  # keep only the triples; do not join the index onto the list
  mylist = list(text_ext)
  # collect one list of triples per row
  L.append(mylist)

# the index is supplied once, when the DataFrame is built
df2 = pd.DataFrame(L, index=df1.index)
print(df2)
                                         0                  1
1  (dwstweets gop, broadened, base people)               None
2    (bushs campaign video, features, kat)               None
3                                     None               None
4                                     None               None
5                                     None               None
6           (president obama, wants, york)               None
7          (jeb bush, talked, enforcement)  (lets, see, plan)
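
Here is a minimal sketch of the same idea that runs without spaCy or textacy. The triples are hypothetical stand-in data, not real extraction output. The original error most likely comes from mixing tuples with a bare int in one list: pandas treats each element as a row and tries to take its length, which an int does not have. Note also that DataFrame.append was removed in pandas 2.0, so collecting rows in a list and building the frame once is the safer pattern in any case.

```python
import pandas as pd

# Hypothetical stand-in for the per-row extraction results:
# each value is the list of triples found for one source row.
triples_per_row = {
    1: [("dwstweets gop", "broadened", "base people")],
    3: [],
    7: [("jeb bush", "talked", "enforcement"), ("lets", "see", "plan")],
}

rows = []
for index, triples in triples_per_row.items():
    # Only the triples go into the row; the index is kept out of it.
    rows.append(list(triples))

# Rows of different lengths are padded with missing values,
# and the index is passed separately to the constructor.
df2 = pd.DataFrame(rows, index=list(triples_per_row))
print(df2)
```

The number of columns follows the longest row, so a row with no triples ends up with missing values in every column.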

3 Comments

C L Over a year ago

Thanks for your answer - I get the error: ValueError: Shape of passed values is (1, 2), indices imply (1, 3214) - where 3214 is the total number of rows in my sample dataset (though in the future it will be much bigger). How would I resolve this? Otherwise this looks very close to working perfectly!

jezrael Over a year ago

It is hard to know what the problem is without the data. One idea: how does mylist = list((text_ext)) work?

C L Over a year ago

I ran it again today and it worked perfectly. No clue what the difference was but thank you so much!

Jigar Patel · Answer · 2018-04-17 03:32:45Z

0

I believe the error could be in the for loop line of your code:

for index, row in df1.iterrows():

DataFrame.iterrows() returns an iterator object which cannot be used for defining a for loop at least in this case.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iterrows.html

1 Comment

C L Over a year ago

Hi - it works fine if I use mylist = list(text_ext) instead so I don't think this is the case.
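
A quick sketch that supports the comment above: iterrows() yields (index, row) pairs and can be used in an ordinary for loop. The frame and the column name "text" are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical two-row frame standing in for df1.
df1 = pd.DataFrame({"text": ["first tweet", "second tweet"]}, index=[1, 2])

seen = []
for index, row in df1.iterrows():
    # row is a Series holding that row's values, keyed by column name.
    seen.append((index, row["text"]))

print(seen)
```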
