As it seems, the requirements and solution shall be clarified and tested iteratively, I provide here
proposed solution incl. test suite to be used with pytest.
First, create test_trails.py file:
import pytest
def fix_trails(trails):
"""Clean up list of trails to make sure, longest phrases are processed
with highest priority (are sooner in the list).
This is needed, if some trail phrases contain other ones.
"""
trails.sort(key=len, reverse=True)
return trails
@pytest.fixture
def trails():
phrases = ["Fire trail", "Firetrail", "Fire Trail",
"FT", "firetrail", "Trail", "Fire Trails"]
return fix_trails(phrases)
def remove_trails(line, trails):
for trail in trails:
if trail in line:
res = line.replace(trail, "").strip()
return res.replace(" ", " ")
return line
scenarios = [
["Poverty Point FT", "Poverty Point"],
["Cedar Party Fire Trails", "Cedar Party Fire"],
["Mailbox Trail", "Mailbox"],
["Carpet Snake Creek Firetrail", "Carpet Snake Creek"],
["Pretty Gully firetrail - Roayl NP", "Pretty Gully - Roayl NP"],
]
@pytest.mark.parametrize("scenario", scenarios, ids=lambda itm: itm[0])
def test(scenario, trails):
line, expected = scenario
result = remove_trails(line, trails)
assert result == expected
The file defines the function removing not needed text from processed lines as well as it contains
test case test_trails.
To test it, install pytest:
$ pip install pytest
Then run the test:
$ py.test -sv test_trails.py
========================================= test session starts ==================================
=======
platform linux2 -- Python 2.7.9, pytest-2.8.7, py-1.4.31, pluggy-0.3.1 -- /home/javl/.virtualenvs/stack
/bin/python2
cachedir: .cache
rootdir: /home/javl/sandbox/stack, inifile:
collected 5 items
test_trails.py::test[Poverty Point FT] PASSED
test_trails.py::test[Cedar Party Fire Trails] FAILED
test_trails.py::test[Mailbox Trail] PASSED
test_trails.py::test[Carpet Snake Creek Firetrail] PASSED
test_trails.py::test[Pretty Gully firetrail - Roayl NP] PASSED
================ FAILURES ==================
______ test[Cedar Party Fire Trails] _______
scenario = ['Cedar Party Fire Trails', 'Cedar Party Fire']
trails = ['Fire Trails', 'Fire trail', 'Fire Trail', 'Firetrail', 'firetrail', 'Trail', ...]
@pytest.mark.parametrize("scenario", scenarios, ids=lambda itm: itm[0])
def test(scenario, trails):
line, expected = scenario
result = remove_trails(line, trails)
> assert result == expected
E assert 'Cedar Party' == 'Cedar Party Fire'
E - Cedar Party
E + Cedar Party Fire
E ? +++++
test_trails.py:42: AssertionError
======== 1 failed, 4 passed in 0.01 seconds ============
The py.test command discovers in the file the test case, finds input arguments, uses injection to
put into it the value of trails and parametrization of the test case provides the scenario
parameter.
You may then fine tune the function remove_trails and list of trails untill all passes.
When you are finished, you may move the remove_trails function where you need (probably incl.
trails list).
You may use this approach to test whatever of solutin proposed to your question.
splitmeans splitting a string into parts, creating a list.stripon the other hand removes part of a string, if that is possible. Your question is using wordsplitwhich confused me a bit. You have meanstrip.