Exploding a dataframe based on column value

Question

I have a Pandas dataframe that looks like this:

Location	Number	Position
New York	111	1 through 12
Dallas	222	1 through 12
San Francisco	333	1 through 4

I would like to basically explode the dataframe based on the Position column so that the output looks like this:

Location	Number	Position
New York	111	1
New York	111	2
New York	111	3
New York	111	4
New York	111	5
New York	111	6
New York	111	7
New York	111	8
New York	111	9
New York	111	10
New York	111	11
New York	111	12
Dallas	222	etc. etc.

I would like to do this for every instance of Location and Number based on what the values in Position are.

Is there a quick and easy way to do this??

Is position a string like you have showed?

DPM
– DPM

2022-06-08 14:38:56 +00:00
Commented Jun 8, 2022 at 14:38 — DPM
– DPM, Commented Jun 8, 2022 at 14:38

mozway · Accepted Answer · 2022-06-08 14:39:25Z

1

One option using explode:

out = (df
 .assign(Position=[range(a, b+1) for x in df['Position']
                   for a,b in [map(int, x.split(' through '))]])
 .explode('Position')
)

Another approach using reindexing:

df2 = df['Position'].str.extract('(\d+) through (\d+)').astype(int)
#    0   1
# 0  1  12
# 1  1  12
# 2  1   4

rep = df2[1].sub(df2[0]).add(1)

out = (df
 .loc[df.index.repeat(rep)]
 .assign(Position=lambda d: d.groupby(level=0).cumcount().add(df2[0]))
)

output:

        Location  Number  Position
0       New York     111         1
0       New York     111         2
0       New York     111         3
0       New York     111         4
0       New York     111         5
0       New York     111         6
0       New York     111         7
0       New York     111         8
0       New York     111         9
0       New York     111        10
0       New York     111        11
0       New York     111        12
1         Dallas     222         1
1         Dallas     222         2
1         Dallas     222         3
1         Dallas     222         4
1         Dallas     222         5
1         Dallas     222         6
1         Dallas     222         7
1         Dallas     222         8
1         Dallas     222         9
1         Dallas     222        10
1         Dallas     222        11
1         Dallas     222        12
2  San Francisco     333         1
2  San Francisco     333         2
2  San Francisco     333         3
2  San Francisco     333         4

answered Jun 8, 2022 at 14:39

mozway

267k13 gold badges56 silver badges106 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

mozway Over a year ago

Try to run [x.split(' through ') for x in df['Position']] ensure this always yields 2 numbers (as string)

mozway Over a year ago

And what about the second approach?

Collectives™ on Stack Overflow

Exploding a dataframe based on column value

1 Answer 1

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related