I was practicing data wrangling and ended up with this simple dataset. When I started filtering and selecting information from it, though, the filter did not work.

Here is the dataset:

https://drive.google.com/file/d/1d1FMWhh3U1KnfVFYyC5R5USuB2BbcN6S/view?usp=sharing

df.head()

0                            TCS 
1                      Accenture 
2                      Cognizant 
3                     ICICI Bank 
4                      HDFC Bank 
                  ...            
8996              Bitla Software 
8997                Kern Liebers 
8998           ANAAMALAIS TOYOTA 
8999                    Elsevier 
9000    Samsung Heavy Industries 
Name: campany_name, Length: 9001, dtype: object

We can see that Accenture is in the second row, but when I try to select it, the comparison does not match:

df['campany_name'] == 'Accenture'

0       False
1       False
2       False
3       False
4       False
        ...  
8996    False
8997    False
8998    False
8999    False
9000    False

I don't really want a workaround. I just want to understand what is happening under the hood, and what is different about this dataset that stops me from doing what I normally do: df['campany_name'] == 'Accenture' should give me booleans, and with those I should be able to get the row with df[df['campany_name'] == 'Accenture'].

Something must be wrong at the index or format level, but I'm new to Python.

  • Try df[df['campany_name'] == 'Accenture'] Commented Sep 22, 2022 at 3:21
  • only returns the column names Commented Sep 22, 2022 at 3:27
  • TypeError: only list-like objects are allowed to be passed to isin(), you passed a [str] Commented Sep 22, 2022 at 3:34
  • I want to understand what is different about this dataset that stops me from doing what I normally do: df['campany_name'] == 'Accenture' should give me booleans, and with those I should be able to get the row with df[df['campany_name'] == 'Accenture'] Commented Sep 22, 2022 at 3:35
  • If you open the file in any text editor, you'll notice that there is an extra space after every company name, e.g. "Accenture ". Commented Sep 22, 2022 at 4:02
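
The trailing space mentioned in the last comment can be reproduced on a small frame. This is only a sketch: the three-row DataFrame below is a hypothetical stand-in for the real file, and it assumes the extra space is the only difference. It shows that the equality test is exact, so "Accenture " never equals "Accenture" until the whitespace is stripped.

```python
import pandas as pd

# Hypothetical stand-in for the real file: every name ends with a space.
df = pd.DataFrame({"campany_name": ["TCS ", "Accenture ", "Cognizant "]})

# repr() makes the otherwise invisible trailing space visible.
visible_names = [repr(name) for name in df["campany_name"]]

# The comparison is exact, so "Accenture " == "Accenture" is False.
raw_matches = int((df["campany_name"] == "Accenture").sum())

# Strip leading/trailing whitespace, then compare again.
df["campany_name"] = df["campany_name"].str.strip()
clean_matches = int((df["campany_name"] == "Accenture").sum())
accenture_rows = df[df["campany_name"] == "Accenture"]

print(visible_names)
print(raw_matches, clean_matches)
print(accenture_rows)
```

Printing a Series pads the values for alignment, which is why the extra space is easy to miss in the normal output.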

2 Answers

Do

df['campany_name'] = df['campany_name'].astype(str)

and then you can try:

df.query('campany_name == "Accenture"')

or

df[df['campany_name'] == 'Accenture']

and if you know the row label and the column and only want to retrieve a single value, you can do:

df.at[1, 'campany_name']

Also, remember that these expressions only display the result. If you need to keep it, assign it to a variable, e.g.:

acc_row = df.query('campany_name == "Accenture"')
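
As a rough sketch of the query approach: inside a query string, a bare word such as Accenture is read as a column or variable name, so the value has to be quoted or passed in with @. The small frame and the `target` variable below are made up for illustration, and the names are stripped first on the assumption that the data carries the trailing spaces noted in the comments.

```python
import pandas as pd

# Hypothetical sample mirroring the trailing spaces in the real data.
df = pd.DataFrame({"campany_name": ["TCS ", "Accenture ", "Cognizant "]})

# Converting to str alone does not remove whitespace, so strip explicitly.
df["campany_name"] = df["campany_name"].astype(str).str.strip()

# Quote the value so query() treats it as a string literal.
acc_row = df.query('campany_name == "Accenture"')

# Alternatively, reference a local Python variable with @.
target = "Accenture"
acc_row_from_var = df.query("campany_name == @target")

print(acc_row)
```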

Since you are filtering the DataFrame by a string, you can use Series.str.contains, which matches substrings and therefore still finds the row despite the trailing space:

aaa[aaa['campany_name'].str.contains('Accenture')]
 
                       campany_name  ...    jobs interviews
1                        Accenture   ...  4600.0     2500.0
5814    Accenture Federal Services   ...     NaN       20.0

[2 rows x 10 columns]
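
For comparison, here is a minimal sketch of how the substring match differs from an exact match. The four-row frame is invented for illustration. Passing `regex=False` treats the pattern as literal text and `na=False` counts missing names as non-matches; both are optional choices here, not something the answer above requires.

```python
import pandas as pd

# Hypothetical sample: trailing spaces plus one missing name.
df = pd.DataFrame(
    {"campany_name": ["TCS ", "Accenture ", "Accenture Federal Services ", None]}
)

# Substring match: finds every name that contains "Accenture".
mask_contains = df["campany_name"].str.contains("Accenture", regex=False, na=False)

# Exact match after stripping whitespace: finds only "Accenture" itself.
mask_exact = df["campany_name"].str.strip() == "Accenture"

print(df[mask_contains])
print(df[mask_exact])
```

The substring version also returns "Accenture Federal Services", as in the output above, so the stripped exact comparison is the closer fit when only one company is wanted.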
