I have the following dataframe:

nutsgdp
Out[77]: 
              2010       2011       2012  ...       2016       2017       2018
NUTS_ID                                   ...                                 
AT       295896.60  310128.70  318653.00  ...  357299.70  370295.80  385711.90
AT1      131114.27  136271.77  139149.68  ...  155609.11  159879.39  166443.24
AT11       6698.37    7012.58    7365.43  ...    8353.78    8771.65    9005.49
AT111       738.53     784.29     791.16  ...     923.96     996.55     996.55
AT112      3843.03    4028.02    4313.17  ...    4923.69    5165.46    5165.46
           ...        ...        ...  ...        ...        ...        ...
UKN15      3762.30    3604.13    4228.35  ...    5391.50    5089.14    4203.36
UKN16      2169.86    2162.22    2452.28  ...    2801.88    2801.14    2730.28
UKZ       30761.26   33592.50   32090.74  ...   13343.86   12887.29   20225.66
UKZZ      30761.26   33592.50   32090.74  ...   13343.86   12887.29   20225.66
UKZZZ     30761.26   33592.50   32090.74  ...   13343.86   12887.29   20225.66

[1794 rows x 9 columns]

I would like to drop all rows whose index label is longer than 2 characters and ends in 'Z'. For example, that means dropping 'UKZ', 'UKZZ' and 'UKZZZ', but keeping 'CZ'. What would be the best way to do this? Thanks in advance for your help.

1 Answer

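Use Series.str.contains (available on the index through the same .str accessor), invert the mask with ~, and filter by boolean indexing. The pattern matches any label with at least two characters before a final 'Z'; a non-capturing form is used here because pandas warns when a str.contains pattern has capture groups:

df = df[~df.index.str.contains(r'.{2,}Z$')]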

Or use Series.str.endswith with Series.str.len:

df = df[~df.index.str.endswith('Z') | (df.index.str.len() <= 2)]
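
To check both approaches against the labels mentioned in the question, here is a minimal sketch on a tiny made-up frame. The GDP values and the two-column layout are placeholders, not the asker's data; only the index labels come from the question.

```python
import pandas as pd

# Hypothetical miniature of the question's frame; the numbers are invented.
nutsgdp = pd.DataFrame(
    {
        "2010": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        "2018": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
    },
    index=pd.Index(["AT", "AT1", "CZ", "UKZ", "UKZZ", "UKZZZ"], name="NUTS_ID"),
)

# Rows to drop: label longer than 2 characters AND ending in 'Z'.
drop_mask = nutsgdp.index.str.endswith("Z") & (nutsgdp.index.str.len() > 2)
by_mask = nutsgdp[~drop_mask]

# Regex equivalent: at least two characters followed by a final 'Z'.
by_regex = nutsgdp[~nutsgdp.index.str.contains(r".{2,}Z$")]

print(by_mask.index.tolist())   # ['AT', 'AT1', 'CZ']
print(by_regex.index.tolist())  # ['AT', 'AT1', 'CZ']
```

Building the "drop" mask first and negating it once reads closer to the wording of the question; by De Morgan's law it is the same condition as `~endswith('Z') | (len <= 2)` above.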