
I have a column in a dataframe with two data types, like this:

25                3037205
26    2019-09-04 19:54:57
27    2019-09-09 17:55:45
28    2019-09-16 21:40:36
29                3037206
30    2019-09-06 14:49:41
31    2019-09-11 17:17:11
32                3037207
33    2019-09-11 17:19:04

I'm trying to slice it and build a new data frame like this:

26    3037205    2019-09-04 19:54:57
27    3037205    2019-09-09 17:55:45
28    3037205    2019-09-16 21:40:36
29    3037206    2019-09-06 14:49:41
30    3037206    2019-09-11 17:17:11
31    3037207    2019-09-11 17:19:04

I can't find how to slice between the rows that are plain numbers, not datetimes.

Some ideas?

Thx!

  • What does "I can't find how to slice between numbers 'no datetype'" mean? Is that part of an error message? Commented Dec 3, 2019 at 17:37

2 Answers


Another approach:

s = pd.to_numeric(df['col1'], errors='coerce')        # ids become numbers, dates become NaN
df.assign(val=s.ffill().astype(int)).loc[s.isnull()]  # forward-fill each id, keep only the date rows

Output:

                   col1      val
26  2019-09-04 19:54:57  3037205
27  2019-09-09 17:55:45  3037205
28  2019-09-16 21:40:36  3037205
30  2019-09-06 14:49:41  3037206
31  2019-09-11 17:17:11  3037206
33  2019-09-11 17:19:04  3037207

1 Comment

Hi, this returns an error: "ValueError: Cannot convert non-finite values (NA or inf) to integer"
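The error reported in the comment above occurs when the column starts with a datetime: `ffill()` then leaves a leading NaN, which plain `int` cannot hold. A minimal sketch (with a hypothetical `col1` frame illustrating that case) that sidesteps it by using pandas' nullable `Int64` dtype instead:

```python
import pandas as pd

# Hypothetical data where a date precedes the first id
df = pd.DataFrame({'col1': ['2019-09-01 00:00:00', 3037205,
                            '2019-09-04 19:54:57']})

s = pd.to_numeric(df['col1'], errors='coerce')
# Nullable Int64 tolerates the NaN left before the first id,
# where .astype(int) would raise the ValueError above
out = df.assign(val=s.ffill().astype('Int64')).loc[s.isnull()]
print(out)
```

Rows before the first id keep a `<NA>` in `val`; they can be dropped afterwards if unwanted.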

I'm not sure this is the most efficient way of solving the issue, but it seems to get the job done. I've added an option to rename the second column (since its name is not specified) after the `#`:

import pandas as pd
import numpy as np

data = {'dates': [3037205, '2019-09-04 19:54:57', '2019-09-09 17:55:45',
                  '2019-09-16 21:40:36', 3037206, '2019-09-06 14:49:41',
                  '2019-09-11 17:17:11', 3037207, '2019-09-11 17:19:04']}

df = pd.DataFrame(data)

# .str.isnumeric() is False for the date strings and NaN for the non-string
# ids; np.where treats NaN as truthy, so the ids land in 'mask'
df['mask'] = np.where(df['dates'].str.isnumeric(), df['dates'], np.nan)
df['mask_2'] = np.where(df['dates'].str.isnumeric(), np.nan, df['dates'])
df['mask'] = df['mask'].ffill()  # propagate each id down to its dates
df = df.dropna(subset=['mask_2']).drop(columns=['mask_2'])#.rename(columns={'mask':'desired_name'})
print(df)

Output:

                 dates     mask
1  2019-09-04 19:54:57  3037205
2  2019-09-09 17:55:45  3037205
3  2019-09-16 21:40:36  3037205
5  2019-09-06 14:49:41  3037206
6  2019-09-11 17:17:11  3037206
8  2019-09-11 17:19:04  3037207

7 Comments

df.dropna(how='any') is pretty dangerous given that the data only represents one column of the OP's frame.
It is indeed. But the structure of the data doesn't seem to suffer from this issue, since there is an id-like value followed by a certain number of dates, and so on... The major problem would arise if the id value contained strings, which would leave NaN values in the mask column.
I would say the major issue is not the dropna() but the np.where(), since the former is based on the latter. I'd like your opinion on this take, since I'm a beginner and these kinds of discussions are really useful to me.
No, as presented, your solution would work just fine. However, I'm talking about NaN in other columns not mentioned in this post. With df.dropna(how='any') you might drop rows because of those NaN, even if this column holds actual datetimes.
You are correct, I'll add subset to the dropna. I assumed the information OP gave is the full dataframe. Thanks for your feedback. Differences like these between a seasoned coder and a beginner are what I really enjoy learning from!
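The point raised in this comment thread can be seen with a tiny hypothetical frame that has a NaN in an unrelated column:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: an extra column with its own NaN, unrelated to 'dates'
df = pd.DataFrame({'dates': ['2019-09-04 19:54:57', '2019-09-09 17:55:45'],
                   'other': [np.nan, 'x']})

# how='any' drops row 0 because of the NaN in 'other'...
assert len(df.dropna(how='any')) == 1
# ...while subset= only considers the column we actually care about
assert len(df.dropna(subset=['dates'])) == 2
```

Restricting `dropna` with `subset=` keeps the cleanup local to the column being reshaped, which is why the answer above was edited to use it.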
