How can I filter this list so that only the strings in yyyy-mm-dd format remain?

2021-11-11
2021-10-01
some_folder
some_other_folder

The result should be a list like this:

2021-11-11
2021-10-01

Also, what if the strings in the list have a prefix?

root/2021-11-11
root/2021-10-01
user/some_folder
root/some_other_folder

and we wanted to end up with:

root/2021-11-11
root/2021-10-01

3 Answers

I would let the datetime module handle this with strptime. If the string is not in '%Y-%m-%d' format, strptime raises ValueError:

import datetime

lst = ['2021-11-11', '2021-10-01', 'some_folder', 'some_other_folder',
       'root/2021-11-11', 'root/2021-10-01',
       'user/some_folder', 'root/some_other_folder']


def filter_(s):
    last_part = s.rsplit('/', maxsplit=1)[-1]
    try:
        datetime.datetime.strptime(last_part, '%Y-%m-%d')
        return True
    except ValueError:
        return False


print([i for i in lst if filter_(i)])

Output:

['2021-11-11', '2021-10-01', 'root/2021-11-11', 'root/2021-10-01']
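
One caveat with the strptime approach: '%Y-%m-%d' also accepts values that are not zero-padded, such as '2021-1-1', so it is slightly looser than a strict yyyy-mm-dd check. Below is a minimal sketch of one way to tighten it, assuming the date is always the last '/'-separated part. The names SHAPE and is_iso_date_path are made up for this illustration and are not part of the answer above.

```python
import datetime
import re

# Exactly four, two and two ASCII digits separated by hyphens.
SHAPE = re.compile(r'[0-9]{4}-[0-9]{2}-[0-9]{2}')


def is_iso_date_path(s):
    """Return True if the last '/'-separated part of s is a real yyyy-mm-dd date."""
    last_part = s.rsplit('/', maxsplit=1)[-1]
    # First check the shape, so that '2021-1-1' is rejected.
    if not SHAPE.fullmatch(last_part):
        return False
    # Then let strptime reject impossible dates such as '2021-02-30'.
    try:
        datetime.datetime.strptime(last_part, '%Y-%m-%d')
    except ValueError:
        return False
    return True


lst = ['2021-11-11', 'root/2021-10-01', 'root/2021-1-1',
       'root/2021-02-30', 'user/some_folder']
result = [s for s in lst if is_iso_date_path(s)]
print(result)  # ['2021-11-11', 'root/2021-10-01']
```

The shape check handles the formatting, and strptime handles calendar validity, so each part does one job.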

7 Comments

Do you know how i can change your code to work for the edited version i posted? thanks :)
@caasswa If the dates appear at the end, you can rsplit the lines using slash and maxsplit=1 then take the second.
could you write a quick example? sorry super noob at this...
thank you so much :)
btw, would i need to do [-2] if the strings all have a / at the end?
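
On the trailing-slash question: with maxsplit=1, index [-2] would return everything before the final slash, not just the date. A small sketch of one alternative, assuming it is fine to ignore trailing slashes; last_segment is a hypothetical helper name used only for this example.

```python
def last_segment(path):
    # Drop trailing slashes first, then keep whatever follows the last '/'.
    return path.rstrip('/').rsplit('/', maxsplit=1)[-1]


# With a trailing slash, the plain split ends in an empty string:
print('root/2021-11-11/'.rsplit('/', maxsplit=1))  # ['root/2021-11-11', '']

print(last_segment('root/2021-11-11/'))  # 2021-11-11
print(last_segment('root/2021-11-11'))   # 2021-11-11
print(last_segment('2021-11-11'))        # 2021-11-11
```

The result of last_segment can then be passed to strptime exactly as in the answer.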

You can use the re module for this, something like this:

Edit: Changed my answer because of @SorousHBakhtiary's comment about a problem I had forgotten: removing items from a list while iterating over it makes the loop skip elements.

import re

li = [
'root/2021-11-11',
'root/2021-10-01',
'user/some_folder',
'root/some_other_folder',
]

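# Iterate over a copy so that removing items from li does not skip elements.
new_list = li.copy()

for string in new_list:
    # Raw string, so that \d is passed to the regex engine unchanged.
    if not re.fullmatch(r'.*\d{4}-\d{2}-\d{2}$', string):
        li.remove(string)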

This can also be done in one line using a list comprehension:

li = [
'root/2021-11-11',
'root/2021-10-01',
'user/some_folder',
'root/some_other_folder',
]

li = [string for string in li if re.fullmatch(r'.*\d{4}-\d{2}-\d{2}$', string)]
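
Two things to be aware of with this pattern: in Python 3, \d on a str pattern also matches non-ASCII decimal digits, and the leading .* lets the date follow any character, not only a slash. It also checks only the shape, so something like 2021-99-99 passes. A hedged sketch of a stricter variant, assuming the date should be the entire last path segment; DATE_AT_END is a name invented for this example.

```python
import re

# (?:.*/)? allows an optional prefix that ends in '/', so the date must be
# the whole last segment. re.ASCII restricts \d to the digits 0-9.
DATE_AT_END = re.compile(r'(?:.*/)?\d{4}-\d{2}-\d{2}', re.ASCII)

li = [
    'root/2021-11-11',
    'root/2021-10-01',
    'user/some_folder',
    'root/x2021-10-01',   # date-like text glued to other characters
    'root/২০২২-০৫-১৯',    # Bengali digits
]

filtered = [s for s in li if DATE_AT_END.fullmatch(s)]
print(filtered)  # ['root/2021-11-11', 'root/2021-10-01']
```

Whether Unicode digits should be accepted depends on the data, so dropping the re.ASCII flag is a reasonable choice too.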

6 Comments

Useful for when the date is ২০২২-০৫-১৯.
This is not working. You shouldn't remove something from a list while you are iterating over it. Simple fix is to use for string in list.copy().
@SorousHBakhtiary right, i forgot about that issue
what if it is like the question i edited? where it only matters if it ends with yyyy-mm-dd and not what is at the start?
@caasswa then you can just change the re.fullmatch() call to look like re.fullmatch(r'.*\d{4}-\d{2}-\d{2}$', string). The .* at the start and the $ at the end of the regular expression make it match any string that ends with something in the yyyy-mm-dd format.

>>> import re
>>> 
>>> filter_pattern = re.compile(r'.*\d{4}-\d{2}-\d{2}$')
>>> 
>>> lst = [
... '2021-11-11', '2021-10-01', 'some_folder', 
... 'some_other_folder', 'root/2021-11-11', 'root/2021-10-01',
... 'user/some_folder', 'root/some_other_folder'
... ]
>>> 
>>> lst = [i for i in lst if len(filter_pattern.findall(i)) > 0]
>>> 
>>> lst
['2021-11-11', '2021-10-01', 'root/2021-11-11', 'root/2021-10-01']
