179

I have some simple Python code that searches files for a line such as Path=c:\path, where the c:\path part may vary. The current code is:

def find_path(i_file):
    lines = open(i_file).readlines()
    for line in lines:
        if line.startswith("Path="):
            return # what to do here in order to get line content after "Path=" ?

What is a simple way to get the text after Path=?

1
  • Be aware that you are returning on the first line in the file that starts with "Path=", as do the other answers to this post. If the file is something like a DOS batch file, you may actually want the last such line, depending on whether the batch or command file is full of conditionals. Commented Aug 27, 2016 at 23:29
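A minimal sketch of the last-occurrence variant this comment describes. The function name `find_last_path` and the sample lines are made up for illustration; it takes any iterable of lines, so an open file object would work too.

```python
def find_last_path(lines, prefix="Path="):
    """Return the text after `prefix` on the last matching line, or None."""
    result = None
    for line in lines:
        if line.startswith(prefix):
            # Keep overwriting so the final match wins; drop the line ending.
            result = line[len(prefix):].rstrip("\r\n")
    return result

# Hypothetical batch-file-like contents.
sample = ["Path=c:\\first\n", "echo hi\n", "Path=c:\\second\n"]
print(find_last_path(sample))
```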

22 Answers

212

If the string is fixed you can simply use:

if line.startswith("Path="):
    return line[5:]

which gives you everything from position 5 on in the string (a string is also a sequence so these sequence operators work here, too).

Or you can split the line at the first =:

if "=" in line:
    param, value = line.split("=",1)

Then param is "Path" and value is the rest after the first =.
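To compare the two approaches side by side, here is a small sketch. The sample line is invented for illustration; it deliberately contains a second = to show that the 1 passed to split limits the cut to the first one.

```python
# Hypothetical sample line, not taken from any real file.
line = "Path=c:\\dir\\a=b"

# Slicing: use len(prefix) rather than a hard-coded 5.
prefix = "Path="
assert line.startswith(prefix)
sliced = line[len(prefix):]

# split with maxsplit=1 cuts only at the first "=".
param, value = line.split("=", 1)

print(sliced)
print(param, value)
```

Both approaches leave a trailing newline in place when the line comes straight from a file, so a strip of the line ending may still be needed.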


7 Comments

+1 for the split method, avoids the slight ugliness of the manual slicing on len(prefix).
But also throws if your input isn't all in the form "something=somethingelse".
That's why I put the condition in front so it's only used if a "=" is in the string. Otherwise you can also test for the length of the result of split() and if it's ==2.
As Dan Olson says, split throws an exception if the delimiter is not present. partition is more stable: it also splits a string, but always returns a three-element tuple with pre-, delimiter, and post-content (some of which may be '' if the delimiter was not present). E.g., _, _, value = line.partition('=').
Split doesn't throw an exception if the delimiter is not present; it returns a list containing the whole string (it is unpacking that one-element list into two names that raises ValueError). At least under Python 2.7.
136

Remove prefix from a string

# ...
if line.startswith(prefix):
   return line[len(prefix):]

Split on the first occurrence of the separator via str.partition()

def findvar(filename, varname="Path", sep="=") :
    for line in open(filename):
        if line.startswith(varname + sep):
           head, sep_, tail = line.partition(sep) # instead of `str.split()`
           assert head == varname
           assert sep_ == sep
           return tail

Parse INI-like file with ConfigParser

from configparser import ConfigParser  # Python 3; in Python 2 this was ConfigParser.SafeConfigParser
config = ConfigParser()
config.read(filename) # requires section headers to be present

path = config.get(section, 'path', raw=True) # case-insensitive, no interpolation
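For a self-contained illustration of the INI approach, here is a sketch using the Python 3 module name, configparser. The section name "settings" and the INI text are assumptions made up for this example; read_string is used so no file is needed.

```python
from configparser import ConfigParser

# Made-up INI text; the section name "settings" is an assumption.
ini_text = "[settings]\nPath=c:\\path\n"

config = ConfigParser()
config.read_string(ini_text)

# Option names are lower-cased by default, so "Path" is found as "path".
# raw=True turns off % interpolation for this lookup.
path = config.get("settings", "path", raw=True)
print(path)
```

This only fits files that really are INI-like; a file with bare Path= lines and no section header would be rejected by the parser.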

Other options

1 Comment

One rare reason to indent three spaces instead of four.
111

Starting in Python 3.9, you can use removeprefix:

'Path=helloworld'.removeprefix('Path=')
# 'helloworld'
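Applied to the function from the question, this might look like the sketch below (Python 3.9+). Returning None when nothing matches and stripping the line ending are assumptions about what the caller wants; the temporary file and its contents exist only to exercise the function.

```python
import os
import tempfile

def find_path(i_file, prefix="Path="):
    """Return the text after `prefix` on the first matching line, or None."""
    with open(i_file) as f:
        for line in f:
            if line.startswith(prefix):
                # Strip the line ending that file iteration leaves in place.
                return line.rstrip("\r\n").removeprefix(prefix)
    return None

# Throwaway file with made-up contents.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Name=demo\nPath=c:\\path\n")

result = find_path(tmp.name)
os.remove(tmp.name)
print(result)
```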

Comments

44

Python 3.9+

text.removeprefix(prefix)

Any Python version:

def remove_prefix(text, prefix):
    return text[len(prefix):] if text.startswith(prefix) else text

2 Comments

I like this one because you can replace "else text" with "else False" or "else None" or whatever -type- you want to return to indicate that the line in the file did not start with "Path=". Personally I like to surround my ternary operators with parentheses to stand out visually.
Useful one-liner: remove_prefix = lambda text, prefix: text[len(prefix):] if text.startswith(prefix) else text
21

For slicing (conditional or non-conditional) in general, I prefer what a colleague suggested recently: use replacement with an empty string. The code is easier to read, sometimes shorter, and carries less risk of specifying the wrong number of characters. OK, I do not use Python, but in other languages I do prefer this approach:

rightmost = full_path.replace('Path=', '', 1)

or - to follow up to the first comment to this post - if this should only be done if the line starts with Path:

rightmost = re.compile('^Path=').sub('', full_path)

The main difference from some of what has been suggested above is that there is no "magic number" (5) involved, nor any need to specify both 5 and the string 'Path='. In other words, I prefer this approach from a code-maintenance point of view.

4 Comments

It doesn't work: 'c=Path=a'.replace("Path=", "", 1) -> 'c=a'.
That does not meet the original requirement of the string starting with "Path=".
You can replace the regex code with just rightmost = re.sub('^Path=', '', fullPath). The purpose of the compile() method is to make things faster if you reuse the compiled object, but since you throw it away after you use it, it has no effect here anyway. It's usually not worth worrying about this optimisation anyway.
I would add re.escape to the mix in case the prefix contains special characters, i.e. re.compile('^' + re.escape('Path='))
15

I prefer pop to indexing [-1]:

value = line.split("Path=", 1).pop()

to

value = line.split("Path=", 1)[1]
param, value = line.split("Path=", 1)

2 Comments

Nice alternative without "magic numbers". It's worth noting that this works because startswith has already been tested so split will divide "nothing" before and everything else after. split("Path=", 1) is more precise (in case of the prefix reappearing later in the string) but reintroduces a magic number.
Shorter version of the (very important) previous comment: this works ONLY if you test with startswith() first.
12

Or why not

if line.startswith(prefix):
    return line.replace(prefix, '', 1)

Comments

6

The simplest way I can think of is with slicing:

def find_path(i_file): 
    lines = open(i_file).readlines() 
    for line in lines: 
        if line.startswith("Path=") : 
            return line[5:]

A quick note on slice notation: it uses two indices instead of the usual one. The first index is the first element of the sequence you want to include in the slice, and the last index is the index immediately after the last element you wish to include.
E.g.:

sequence_obj[first_index:last_index]

The slice consists of all the elements between first_index and last_index, including first_index and not last_index. If the first index is omitted, it defaults to the start of the sequence. If the last index is omitted, it includes all elements up to the last element in the sequence. Negative indices are also allowed. Use Google to learn more about the topic.
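A few concrete slices, as a quick sketch of the rules above; the sample string is only an illustration.

```python
line = "Path=c:\\path"

from_five = line[5:]      # from index 5 to the end
first_four = line[:4]     # from the start up to, but not including, index 4
explicit = line[0:4]      # the same slice with the first index written out
last_four = line[-4:]     # negative index: the last four characters
past_end = line[100:]     # an out-of-range slice is empty, not an error

print(from_five, first_four, explicit, last_four, repr(past_end))
```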

Comments

5

Another simple one-liner that hasn't been mentioned here:

value = line.split("Path=", 1)[-1]

This will also work properly for various edge cases:

>>> "prefixfoobar".split("foo", 1)[-1]
'bar'

>>> "foofoobar".split("foo", 1)[-1]
'foobar'

>>> "foobar".split("foo", 1)[-1]
'bar'

>>> "bar".split("foo", 1)[-1]
'bar'

>>> "".split("foo", 1)[-1]
''

Comments

5

How about..

line = r'path=c:\path'
line.partition('path=')

Output:

('', 'path=', 'c:\\path')

This triplet is the head, separator, and tail.
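Because partition finds the separator anywhere in the string, a check on the head and separator is needed to accept it only as a prefix. A minimal sketch; the helper name value_after and the None return value are assumptions made for illustration.

```python
def value_after(line, prefix="path="):
    """Return the text after `prefix` only if the line starts with it, else None."""
    head, sep, tail = line.partition(prefix)
    # sep is empty when the prefix is absent; head is non-empty when the
    # prefix occurs somewhere other than the very start of the line.
    if sep and not head:
        return tail
    return None

print(value_after(r"path=c:\path"))
print(value_after(r"manpath=c:\path"))
print(value_after("no prefix here"))
```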

2 Comments

This doesn't work in all cases the same way. If the separator is present, then the result is the third item. Otherwise, the result is the first item.
You also have the third case where the separator is in the middle: "manpath=c:\path"
4

The removeprefix() and removesuffix() string methods were added in Python 3.9 because of issues with how lstrip and rstrip interpret the parameters passed to them. Read PEP 616 for more details.

# in python 3.9
>>> s = 'python_390a6'

# apply removeprefix()
>>> s.removeprefix('python_')
'390a6'

# apply removesuffix()
>>> s = 'python.exe'
>>> s.removesuffix('.exe')
'python'

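# in python 3.8 or before
# (lstrip/rstrip remove a *set of characters*, not a prefix/suffix;
#  they give the expected result for these two inputs only by coincidence)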
>>> s = 'python_390a6'
>>> s.lstrip('python_')
'390a6'

>>> s = 'python.exe'
>>> s.rstrip('.exe')
'python'
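The lstrip and rstrip calls above happen to give the expected result, but they treat their argument as a set of characters rather than as a prefix or suffix. The sketch below uses a made-up input to show where that goes wrong, next to a slicing fallback that works on versions before 3.9.

```python
s = "Path=hatch"

# lstrip drops any leading characters found in the set {P, a, t, h, =},
# so it keeps eating into the value after the real prefix is gone.
stripped = s.lstrip("Path=")

# Slicing after a startswith check removes exactly the prefix.
sliced = s[len("Path="):] if s.startswith("Path=") else s

print(stripped)
print(sliced)
```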

removesuffix example with a list:

plurals = ['cars', 'phones', 'stars', 'books']
suffix = 's'

for plural in plurals:
    print(plural.removesuffix(suffix))

output:

car
phone
star
book

removeprefix example with a list:

places = ['New York', 'New Zealand', 'New Delhi', 'New Now']

shortened = [place.removeprefix('New ') for place in places]
print(shortened)

output:

['York', 'Zealand', 'Delhi', 'Now']

Comments

4

import re

p = re.compile(r'path=(.*)', re.IGNORECASE)

path = r"path=c:\path"

re.match(p, path).group(1)

Output:

'c:\\path'

Comments

3

line[5:]

gives you characters after the first five.

Comments

2

line[5:] will give the substring you want. Search the introduction of the Python tutorial for 'slice notation'.

Comments

2

Why not use a regex with escape? ^ matches at the start of the string, and with re.MULTILINE it also matches at the start of each line. re.escape ensures that the prefix is matched literally.

>>> import re
>>> print(re.sub('^' + re.escape('path='), repl='', string='path=c:\\path\nd:\\path2', flags=re.MULTILINE))
c:\path
d:\path2

Comments

1

If you know list comprehensions:

lines = [line[5:] for line in file.readlines() if line[:5] == "Path="]

1 Comment

There was an edit suggesting line.startswith(...) is 10X faster. My testing did not confirm this. Happy to change it if evidence supporting that assertion is provided.
0

Try the following code:

if line.startswith("Path="): return line[5:]

1 Comment

What is the difference between your answer and the answer accepted? I see that it is in the first part of the other answer.
0

A solution using split that hasn't been proposed yet, I think:

def remove_prefix(prefix, string):
    prefix, *tail = string.split(prefix, 1)
    if not prefix:
        return tail[0]
    raise ValueError("...")

split(..., 1) returns a list; when the separator is found at the start of the string, that list has two items and the first one is the empty string.

for test in ("10.94", "$10.94", "US$10.94", "10.94$", "10.94$ ONLY"):
    try:
        print(test, remove_prefix("$", test))
    except ValueError:
        print(test, "bad format")

Output:

10.94 bad format
$10.94 10.94
US$10.94 bad format
10.94$ bad format
10.94$ ONLY bad format

Comments

-1

I guess this is exactly what you are looking for:

    def findPath(i_file) :
        lines = open( i_file ).readlines()
        for line in lines :
            if line.startswith( "Path=" ):
                output_line=line[(line.find("Path=")+len("Path=")):]
                return output_line

Comments

-2

Without having to write a function: this splits on any of the alternatives in the pattern, in this case 'Mr.|Dr.|Mrs.', selects everything after the split with [1], then splits again on whitespace and grabs whichever element you want. In the case below, 'Morris' is returned.

re.split('Mr.|Dr.|Mrs.', 'Mr. Morgan Morris')[1].split()[1]

Comments

-2

The below method can be tried.

def remove_prefix(string1, prefix):
    length = len(prefix)

    # Compare the leading characters of string1 against the prefix
    if string1[0:length] == prefix:
        return string1[length:]
    else:
        return string1

prefix = "hello"
string1 = "hello world"

final_string = remove_prefix(string1, prefix)
print(final_string)

Comments

-3

This is very similar in technique to other answers, but with no repeated string operations, ability to tell if the prefix was there or not, and still quite readable:

parts = the_string.split(prefix_to_remove, 1)
if len(parts) == 2:
    # do things with parts[1]
    pass

1 Comment

This will remove stuff if "Path=" appears anywhere in the string, not just at the beginning.
