def clean_doc(df):
    for rownum in range(0, df.shape[0]):
        if "LM_" not in df.iloc[rownum][6]:
            clean_df = df.drop([df.index[rownum]])
    return clean_df

I want to delete every row whose value in that column (position 6) does not start with "LM_".

Also tried:

df.drop([rownum]) 

and many other variations, but each one deletes only a single row of my dataset, when many more should be removed.

1 Answer


You could try:

df[df['<your_column>'].str.startswith('LM_')]

Example:

import pandas as pd

df = pd.DataFrame({'col':['abc', 'LM_abc']})

print(df[df['col'].str.startswith('LM_')])

Output:

      col
1  LM_abc
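
If the column is only known by its position, as in the question (position 6), the same filter can be applied through `iloc`. This is a minimal sketch, not a drop-in replacement: the `keep_lm_rows` helper and the sample frame are made up for illustration, and `na=False` is used on the assumption that missing values should count as "does not start with LM_".

```python
import pandas as pd


def keep_lm_rows(df, col_pos):
    """Return only the rows whose value at column position col_pos starts with 'LM_'."""
    # na=False makes missing values count as non-matching instead of yielding NaN
    mask = df.iloc[:, col_pos].str.startswith("LM_", na=False)
    return df[mask]


# Hypothetical data: the column of interest sits at position 1 here
df = pd.DataFrame({"id": [1, 2, 3], "name": ["abc", "LM_abc", None]})
result = keep_lm_rows(df, 1)
print(result)
```

For the question's data the call would be `keep_lm_rows(df, 6)`.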

Your code deletes only one row because clean_df is overwritten on every iteration: each df.drop call starts again from the original df, so only the last drop survives.
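
If you would rather keep the loop, one way around the overwriting is to collect the index labels first and drop them all in a single call. This is a rough sketch under a few assumptions: the column is at position 6 as in the question, the index labels are unique, and values are converted with `str()` so non-string cells do not raise. The `col_pos` parameter and the `demo` frame are made up for illustration.

```python
import pandas as pd


def clean_doc(df, col_pos=6):
    """Drop every row whose value at column position col_pos does not start with 'LM_'."""
    to_drop = []
    for rownum in range(df.shape[0]):
        value = str(df.iloc[rownum, col_pos])
        if not value.startswith("LM_"):
            # remember the label instead of dropping right away
            to_drop.append(df.index[rownum])
    # one drop call on the original frame removes all collected rows
    return df.drop(to_drop)


# Hypothetical 7-column frame so that position 6 exists
demo = pd.DataFrame([
    [0] * 6 + ["abc"],
    [0] * 6 + ["LM_abc"],
    [0] * 6 + ["xyz"],
])
cleaned = clean_doc(demo)
print(cleaned)
```

This version also returns the frame unchanged when nothing needs dropping, whereas the original raises an UnboundLocalError in that case because clean_df is never assigned. The boolean-mask filter above is still the shorter and faster option.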


2 Comments

I think OP wants df[~df['<your_column>'].str.startswith('LM_')] instead.
@HenryYik I think it's the opposite: OP's code is deleting rows that do not contain LM_.
