I am using a script from https://towardsdatascience.com/a-keras-pipeline-for-image-segmentation-part-1-6515a421157d to split a data set. I don't understand what this part is doing:

all_frames = os.listdir(FRAME_PATH)
all_masks = os.listdir(MASK_PATH)


all_frames.sort(key=lambda var:[int(x) if x.isdigit() else x 
                                for x in re.findall(r'[^0-9]|[0-9]+', var)])
all_masks.sort(key=lambda var:[int(x) if x.isdigit() else x 
                               for x in re.findall(r'[^0-9]|[0-9]+', var)])

More specifically, I do not understand what everything after var: is doing. My first guess would be a list comprehension, but it does not follow the structure I know:

[ expression for item in list if conditional ] 

Also, what is the purpose of the re.findall(r'[^0-9]|[0-9]+', var) part?

Thank you.

1 Answer

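The int(x) if x.isdigit() else x is a conditional expression, often called a ternary operator ("if condition then this else that"). You're right that it is not the trailing if filter of a list comprehension; it is the expression part, evaluated once for every x. It says "turn x (from within the list comprehension) into an integer if it contains only digits, otherwise keep it as a string".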

So we could write this all out like:

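import re

def convert_integer(x):
    # Digit-only strings become ints so they compare numerically
    if x.isdigit():
        return int(x)
    else:
        return x

def key_function(var):
    # Split var into single non-digit characters and runs of digits
    return [convert_integer(x)
            for x in re.findall(r'[^0-9]|[0-9]+', var)]

all_frames.sort(key=key_function)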
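
To see what this key buys you, here is a minimal sketch comparing a plain sort with a sort that uses the same kind of key. The filenames and the name natural_key are made up for illustration; they are not from the article.

```python
import re

def natural_key(name):
    # Split into single non-digit characters and runs of digits,
    # then turn the digit runs into ints so they compare numerically.
    return [int(tok) if tok.isdigit() else tok
            for tok in re.findall(r'[^0-9]|[0-9]+', name)]

# Hypothetical filenames, for illustration only.
names = ["frame_10.png", "frame_2.png", "frame_1.png"]

plain_sorted = sorted(names)                     # compares character by character
natural_sorted = sorted(names, key=natural_key)  # compares numbers as numbers

print(plain_sorted)    # ['frame_1.png', 'frame_10.png', 'frame_2.png']
print(natural_sorted)  # ['frame_1.png', 'frame_2.png', 'frame_10.png']
```

One caveat: in Python 3 this key only works when the names share the same layout. If one name has a number where another has a letter (say "a1" and "1a"), the sort ends up comparing an int with a str and raises TypeError.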

6 Comments

Yes, as I mentioned, what I don't understand is what the return part is doing.
@LisLou Do you mean the ternary operator int(x) if x.isdigit() else x? I've added a section to my answer describing that as well
@DavidRobinson Regarding x.isdigit(): the given regex will match 'a', 'b' and 'c' in "abc" because of the first part, [^0-9], so not every match is made up of digits.
So [int(x) if x.isdigit() else x for x in re.findall(r'[^0-9]|[0-9]+', var)] is a list comprehension of the form [ expression for item in list if conditional ], where the expression is int(x) if x.isdigit() else x and there is no conditional at the end, only a for loop over the elements of the list?
I am not sure I understand re.findall(r'[^0-9]|[0-9]+', var). I tried var = "blabla0236abs7b8b9b0102b1" followed by print(re.findall(r'[^0-9]|[0-9]+', var)), and the output was ['b', 'l', 'a', 'b', 'l', 'a', '0236', 'a', 'b', 's', '7', 'b', '8', 'b', '9', 'b', '0102', 'b', '1']. From what I understand, it returns a list of strings with every match of either a run of digits, [0-9]+, or the pattern [^0-9]. According to the documentation, ^ matches the start of a string, so does [^0-9] mean a sequence like (any character) followed by a digit from 0 to 9?
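
On the [^0-9] question: in Python's re syntax, a ^ as the first character inside square brackets negates the character class, so [^0-9] matches exactly one character that is not a digit. The "start of string" meaning only applies to ^ outside brackets. A small sketch using the string from the comment above (the names tokens, non_digits, anchored and key are just illustrative):

```python
import re

var = "blabla0236abs7b8b9b0102b1"

# Either one non-digit character, or a run of one or more digits.
tokens = re.findall(r'[^0-9]|[0-9]+', var)
print(tokens)

# Inside brackets, a leading ^ negates the class: any single non-digit.
non_digits = re.findall(r'[^0-9]', "ab12c")
print(non_digits)  # ['a', 'b', 'c']

# Outside brackets, ^ anchors the match to the start of the string.
anchored = re.findall(r'^[0-9]', "12ab")
print(anchored)    # ['1']

# The comprehension has no trailing filter: every token is kept,
# and the conditional expression only decides int versus str.
key = [int(t) if t.isdigit() else t for t in tokens]
print(key)
```

Note that the digit runs lose their leading zeros once converted, so '0236' becomes 236 and '0102' becomes 102 in the sort key.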