1

I'm using Python's split method to manipulate some file paths. I split the file path into a list and then do some slicing on it:

array = "/home/ask/Git/Zeeguu-API/zeeguu_core/user_statistics/main.py"
split = array.split("/")

Which outputs:

['', 'home', 'ask', 'Git', 'Zeeguu-API', 'zeeguu_core', 'user_statistics', 'main.py']

The issue is the empty string at the beginning of the list. It makes sense that it is there, but it is annoying and messes with the slicing I want to do.

How can I split but omit the empty strings? I would rather avoid an extra O(n) operation just to filter out the empty string; I really hope there is some way to avoid it in the call to split().

  • 3
    array[1:].split("/")? or array.split("/")[1:]? Commented May 13, 2021 at 22:01
  • Would it be fine if you just popped the starting string in the list? Commented May 13, 2021 at 22:01
  • 1
    You understand that the leading / in the path conveys information (it distinguishes a relative path from an absolute one), right? What are you trying to do with the split-up path information? Do you also want to omit empty path components in the middle (like in foo//bar)? Why? Have you considered using the built-in standard library support for manipulating file paths (pathlib, or at the very least os.path)? Commented May 13, 2021 at 22:01
  • I would prefer not to. I have to work with file paths both with and without a leading "/". Commented May 13, 2021 at 22:02
  • And I'm going to use the file paths to build a graph that can model Python projects. I therefore need lists that represent directory paths, so the slashes are not important. Commented May 13, 2021 at 22:04
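The pathlib suggestion from the comments could look roughly like the sketch below. It assumes POSIX-style paths, and path_components is a hypothetical helper name, not something from the question. PurePosixPath only parses the string and never touches the file system.

```python
from pathlib import PurePosixPath


def path_components(path_string):
    """Return the named parts of a POSIX-style path, without the root."""
    path = PurePosixPath(path_string)
    # path.anchor is "/" for an absolute path and "" for a relative one;
    # dropping it leaves only the directory and file names.
    return [part for part in path.parts if part != path.anchor]


print(path_components("/home/ask/Git/Zeeguu-API/zeeguu_core/user_statistics/main.py"))
print(path_components("zeeguu_core/user_statistics/main.py"))
print(path_components("foo//bar"))  # repeated slashes are collapsed by pathlib
```

This handles paths with and without a leading slash in the same way, at the cost of constructing a path object for each string.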

4 Answers

2

You can do the following to solve the problem:

array.strip("/").split("/")

The empty string is useful when you want to reverse the operation with "/".join() and get the original string back.
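As a small illustration of this answer (split_path is a hypothetical wrapper name), the sketch below shows the strip-then-split call on paths with and without a leading slash, the join round trip that the plain split allows, and two edge cases worth knowing about.

```python
def split_path(path):
    # Remove leading and trailing slashes, then split on the remaining ones.
    return path.strip("/").split("/")


absolute = "/home/ask/Git/main.py"
relative = "home/ask/Git/main.py"

print(split_path(absolute))  # ['home', 'ask', 'Git', 'main.py']
print(split_path(relative))  # ['home', 'ask', 'Git', 'main.py']

# The plain split keeps the leading '' so that join restores the original.
print("/".join(absolute.split("/")) == absolute)  # True

# Edge cases: an empty path still yields [''], and a doubled slash
# in the middle still produces an empty component.
print(split_path(""))      # ['']
print(split_path("a//b"))  # ['a', '', 'b']
```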


4 Comments

Out of curiosity, isn't this a bad answer? As far as I know, OP wants to avoid an additional O(n) operation, yet that is exactly what strip does (according to this SO post: stackoverflow.com/a/27684660/9559884). In my opinion the time complexity of this code is O(2n), which is equal to O(n), but why run through the same string twice? (If I am mistaken, please let me know, I will gladly learn something new :) )
@StyleZ - it's not really even O(2n), as any one string creation in the split is as expensive as popping the list. And there isn't really such a thing as O(2n)... it's still O(n) in growth rate.
@tdelaney thank you for the clarification. What I meant by O(2n) was that, in my opinion, OP does not want to iterate through the same string twice (I might be mistaken). With strip you create a copy of the string with the leading and trailing characters removed, which is one pass over the string, and then the split is an additional pass. To make things clear: I like this solution, and now it's up to OP whether it is the solution they are looking for.
The reason I mentioned this solution is that OP said they work with paths that both do and don't start with "/". The other solution would be to use if array.startswith("/"): array[1:].split("/"). I'm not sure if that has a lower time complexity; if you know, please let me know so I can add that to my answer.
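The startswith alternative mentioned in the last comment might be written as in the sketch below (split_without_leading_slash is a hypothetical name). Note that in CPython the slice also copies the string, so, like strip, it only changes the constant factor, not the O(n) growth rate.

```python
def split_without_leading_slash(path):
    # Skip the first character only when it is a slash.
    start = 1 if path.startswith("/") else 0
    return path[start:].split("/")


print(split_without_leading_slash("/home/ask/main.py"))  # ['home', 'ask', 'main.py']
print(split_without_leading_slash("home/ask/main.py"))   # ['home', 'ask', 'main.py']

# Unlike strip("/"), a trailing slash is kept as a trailing empty string.
print(split_without_leading_slash("/home/ask/"))         # ['home', 'ask', '']
```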
0

I would rather like to avoid having to do a O(n)

In case this is algorithmic homework, I recommend not reading this solution until you have figured it out on your own. If it's not, then instead of using split() I recommend writing your own split: iterate through the string character by character and build the output list manually.

string = "/home/ask/Git/Zeeguu-API/zeeguu_core/user_statistics/main.py"
output_array = ['']

for character in string:
    if character != '/':
        # Extend the component that is currently being built.
        output_array[-1] += character
    elif output_array[-1] != '':
        # A slash after a non-empty component starts a new component.
        output_array.append('')

print(output_array)

The code above is only a sketch, so adapt it to your needs. One problem with this solution is that it outputs [''] when the path is empty (and leaves a trailing '' when the path ends with a slash), but that is an easy fix :)

4 Comments

But a split written in Python is going to be more expensive than the built-in one implemented in C. Each one of those characters is really a string, with the overhead needed to both create and concatenate them.
That's a good point... in that case, should it run over the string's indexes instead? (Instead of a for loop, there would be while character_pointer < len(string).) Do you think that would solve the problem?
Python doesn't implement a special character object, just a str with a single character in it. Any time you reference a single character in a string, Python has to create a str to represent it. Using an index is more expensive than iterating over the characters because there is also a dereference of the string in there. You could write this comparison in C or perhaps Cython, but as long as you stick with Python, you have its inherent inefficiencies.
Thanks for the clarification. I understand the problem now.
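The point about characters can be checked directly. The short sketch below (variable names are illustrative only) shows that both indexing and iterating a string produce ordinary str objects of length one, which is why a character-by-character loop works with string objects throughout.

```python
path = "/home/ask"

# Indexing a string returns a str of length 1, not a separate char type.
first = path[0]
print(type(first), len(first))

# Iterating yields one-character str objects as well.
print([type(ch).__name__ for ch in path[:3]])

# Building a component character by character is plain str concatenation.
component = ""
for ch in "home":
    component += ch
print(component)
```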
0
array = "/home/ask/Git/Zeeguu-API/zeeguu_core/user_statistics/main.py"
list1 = array.split("/")
list_without_space = []
for element in list1:
    # Keep only non-empty parts; strip() also drops whitespace-only parts.
    if element.strip():
        list_without_space.append(element)

print(list_without_space)

1 Comment

Answers are great. But please add some text around your code to help the OP understand your answer. This encourages them not to just copy/paste the answer. Thank you!
0

There is another possibility using a list comprehension:

array1 = "/home/ask/Git/Zeeguu-API/zeeguu_core/user_statistics/main.py"
split1 = array1.split(sep="/")
noempty_strings = [x for x in split1 if x != '']
print(noempty_strings)

I think this method is the fastest among all the ones mentioned above.

P.S. I actually got the inspiration for this answer from @paxdiablo's answer to this question, so credit goes to paxdiablo.
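The claim about which method is fastest can be measured rather than guessed. The sketch below uses timeit from the standard library to compare three of the approaches from this page on the question's example path; the function names are hypothetical, and the timings will vary by machine and Python version, so no results are shown here.

```python
import timeit

path = "/home/ask/Git/Zeeguu-API/zeeguu_core/user_statistics/main.py"


def strip_then_split(p):
    return p.strip("/").split("/")


def split_then_filter(p):
    return [part for part in p.split("/") if part != ""]


def manual_loop(p):
    parts = [""]
    for ch in p:
        if ch != "/":
            parts[-1] += ch
        elif parts[-1] != "":
            parts.append("")
    return parts


candidates = (strip_then_split, split_then_filter, manual_loop)

# Sanity check: all three agree on this particular input.
assert all(func(path) == strip_then_split(path) for func in candidates)

for func in candidates:
    seconds = timeit.timeit(lambda: func(path), number=20_000)
    print(f"{func.__name__:20s} {seconds:.4f} s")
```

The three functions agree on this input but not on every input (for example, a doubled slash in the middle of a path), so any comparison should be run on paths that resemble the real data.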
