
What I want to do is pretty simple in other languages: use a "for" loop to split a data frame into a new data frame every five rows.

The idea is that I have a dataframe that gains a new row every so often, like a form where each answer is added to a specific column (as with Google Forms feeding a spreadsheet).

What I have tried is the following:

import pandas as pd
dp=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
df1=pd.DataFrame(data=dp)
for i in range(0, len(dp)):
   if i%5==0:
      df = df1.iloc[i,:]
      print(df)          
print(df)

I know it isn't much, but it is an attempt. What I can't work out is how to create a new variable holding a new dataframe each time the loop reaches a row where i mod 5 == 0.

3
  • It's not clear what your desired output is. Do you want lots of individual five-row dataframes stored in variables? Or are you trying to convert a flat list into a single dataframe with rows and columns? Maybe you're just trying to print to the screen five rows at a time. Commented Sep 26, 2018 at 2:15
  • I understand the confusion. What I want to do is the first one. I want to generate variables that store a dataframe every fifth row. For example: I want rows 0 through 4 stored in a variable named V1, 5 through 9 stored in V2 etc. Can this be done? Commented Sep 26, 2018 at 7:16
  • Thanks for the clarification, Jason. See my updated answer below. I create a list of dataframes, dfs. That way dfs[0] is the first data frame, dfs[1] is the second, etc... Commented Sep 26, 2018 at 7:35

2 Answers

1

I think you're trying to convert a flat list into rows and columns using a known number of fields.

I'd do something like this:

import numpy as np
import pandas as pd

numFields = 3   # this is five in your case
fieldNames = ['color', 'animal', 'amphibian'] # totally optional 

# this is your 'dp'
inputData = ['brown', 'dog','false','green', 'toad','true']

flatDataArray = np.asarray(inputData)

reshapedData = flatDataArray.reshape(-1, numFields)

df = pd.DataFrame(reshapedData, columns=fieldNames) # you only need 'columns' if you want to name fields

print(df)

which gives:

    color   animal  amphibian
0   brown   dog     false
1   green   toad    true

--UPDATE--

From your comment above, I see that you'd like an arbitrary number of dataframes, one for each five-row group. Why not create a list of dataframes (so that you have dfs[0], dfs[1], and so on)?

# continuing from where the previous code left off...

dfs = []

for group in reshapedData:
    dfs.append(pd.DataFrame(group))

for df in dfs:
    print(df)

which prints:

   0
0  brown
1    dog
2  false

   0
0  green
1   toad
2   true
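
If the goal is to split the question's df1 itself rather than a flat list, a minimal sketch with plain positional slicing could look like the following. It assumes the 16-row, single-column frame from the question; the names chunk_size and chunks are made up for illustration and are not part of the original code.

```python
import pandas as pd

# Same 16-row, single-column frame as in the question.
df1 = pd.DataFrame(data=list(range(16)))

chunk_size = 5  # hypothetical name: number of rows per chunk

# Take rows [start, start + chunk_size) for each start position.
# The last chunk is shorter when len(df1) is not a multiple of chunk_size.
chunks = [
    df1.iloc[start:start + chunk_size]
    for start in range(0, len(df1), chunk_size)
]

for chunk in chunks:
    print(chunk, end="\n\n")
```

Here chunks[0] holds rows 0 through 4, chunks[1] holds rows 5 through 9, and so on, which is usually easier to work with than separately named variables.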

1

numpy.split

lod = np.split(df1, np.arange(1, 16, 5))

print(*lod, sep='\n\n')

   0
0  0

   0
1  1
2  2
3  3
4  4
5  5

     0
6    6
7    7
8    8
9    9
10  10

     0
11  11
12  12
13  13
14  14
15  15

The split points above (1, 6, 11) are off by one for groups that start at row 0. Starting the range at 0 and dropping the first value gives split points 5, 10, 15:

lod = np.split(df1, np.arange(0, 16, 5)[1:])

print(*lod, sep='\n\n')

   0
0  0
1  1
2  2
3  3
4  4

   0
5  5
6  6
7  7
8  8
9  9

     0
10  10
11  11
12  12
13  13
14  14

     0
15  15
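
As an alternative that does not pass a DataFrame to np.split (which, as far as I know, can emit deprecation warnings on recent pandas versions), here is a hedged sketch using groupby on the integer-divided row position. The V1, V2, ... keys follow the naming in the asker's comment; the names group_ids and named are assumptions introduced only for this example.

```python
import numpy as np
import pandas as pd

# Same 16-row, single-column frame as in the question.
df1 = pd.DataFrame(data=list(range(16)))

# Integer-divide each row position by 5: rows 0-4 get 0, rows 5-9 get 1, etc.
group_ids = np.arange(len(df1)) // 5

# Build a dict keyed "V1", "V2", ... with one sub-frame per group.
named = {f"V{gid + 1}": group for gid, group in df1.groupby(group_ids)}

print(named["V2"])
```

A dict (or a list) keeps the number of pieces flexible, whereas creating variables named V1, V2, ... dynamically tends to be awkward in Python.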
