I am trying to create a DataFrame from a CSV file that has multiple columns and rows. One of the columns contains either 'yes' or 'no', and I only want the DataFrame to include the rows that have 'yes'. Can someone show me how to write this code? Thanks in advance.
- Does this answer your question? conditional row read of csv in pandas (Emi OB, Oct 14, 2021 at 8:27)
- You can also try something like this: df.loc[df['column_name'] == 'yes'] (Egidius, Oct 14, 2021 at 8:48)
- Python and pandas offer several ways to filter rows on a condition. Here is a page I just found for you: towardsdatascience.com/…, and you can find many more on Google that teach other approaches. Don't limit yourself; pandas has a lot of cool stuff to learn. (Egidius, Oct 14, 2021 at 8:53)
2 Answers
Here are some ways that can help you.
Say that your column is named choice and your DataFrame is named df:
df_new = df[df['choice'] == 'yes']
After this, df_new is a DataFrame that contains only the rows where choice is 'yes'.
The code below does the same thing, with the condition stored in a named boolean mask:
# boolean Series: True where choice is 'yes'
mask = df['choice'] == 'yes'
# new dataframe with selected rows (.copy() keeps it independent of df)
df_new = df[mask].copy()
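To tie the filtering above back to the original question, here is a minimal end-to-end sketch that starts from the CSV itself. The CSV content and the name column are made up for illustration, and io.StringIO stands in for a real file; in practice you would pass the path of your own file to pd.read_csv.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file
csv_text = """name,choice
alice,yes
bob,no
carol,yes
"""

# pd.read_csv accepts a file path or a file-like object;
# replace io.StringIO(csv_text) with the path to your own CSV
df = pd.read_csv(io.StringIO(csv_text))

# Keep only the rows where the 'choice' column equals 'yes'
mask = df['choice'] == 'yes'
df_new = df[mask].reset_index(drop=True)  # renumber the kept rows from 0

print(df_new)
```

The reset_index(drop=True) call is optional. Without it, the filtered rows keep their original index labels from df.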
You can also try this:
# build the mask from the column's underlying NumPy array (.values)
mask = df['choice'].values == 'yes'
# new dataframe
df_new = df[mask]
print(df_new)
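If the CSV is too large to load comfortably in one go, one possible variation is to read it in chunks and filter each chunk before combining them, so the 'no' rows are discarded as you go. This is only a sketch: the sample data, the name column, and the chunk size of 2 are assumptions chosen for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice pass your file path to read_csv
csv_text = "name,choice\nalice,yes\nbob,no\ncarol,yes\ndave,no\n"

# With chunksize set, read_csv returns an iterator of DataFrames
chunks = pd.read_csv(io.StringIO(csv_text), chunksize=2)

# Filter each chunk, then stitch the kept rows into one DataFrame
kept = [chunk[chunk['choice'] == 'yes'] for chunk in chunks]
df_yes = pd.concat(kept, ignore_index=True)

print(df_yes)
```

For a file that fits in memory, reading it whole and filtering afterwards, as in the answer above, is simpler.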