0

I think this has a simple answer, but I haven't been able to crack it. Say you have a URL that needs to be broken down into four parts:

comp1 = 'www.base.com/'
comp2 = list1   # a list of letters, say "AAA", "BBB", "CCC"
comp3 = list2   # another list, but this time dates: '2019/10/21', '2019/10/20', '2019/10/19'
comp4 = "/example.html"

I have tried to combine these in several different ways. urllib.parse.urljoin looks like the best option, but it only joins two URL parts at a time (a base and one relative URL):

import urllib.parse

for i in comp2:
    iter1 = urllib.parse.urljoin(comp1, i)
    print(iter1)  # this pairs the first two components nicely

for j in comp3:
    iter2 = urllib.parse.urljoin(j, comp4)
    print(iter2)  # this just returns '/example.html', and nothing is joined

What's the most pythonic way to join these 4 components? I've tried ''.join(), but that only takes one argument. I'm going to have a lot more than just three iterations. In R, I would just slam my components in a paste0() and call it a night.

2
  • Do you want your end result to be www.base.com/AAABBBCCC/2019/10/212019/10/20/example.html? If you can provide exactly what you are looking for, it will help get an answer. :) It looks like you are just trying to combine strings and lists; maybe this will help: stackoverflow.com/questions/12633024/… Commented Oct 22, 2019 at 2:29
  • reduce in combination with urljoin seems like a good option, see here. Commented Oct 22, 2019 at 2:51
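
The reduce suggestion in the comment above could look roughly like the sketch below. The http:// scheme, the single letter code, and the single date are assumptions chosen for illustration. Note that each intermediate part ends with a slash and none starts with one; a part with a leading slash is treated as an absolute path and replaces whatever path came before it, which is why the second loop in the question returns only '/example.html'.

```python
from functools import reduce
from urllib.parse import urljoin

# Assumed sample values: the scheme, letter code, and date are illustrative.
base = "http://www.base.com/"
parts = ["AAA/", "2019/10/21/", "example.html"]

# reduce applies urljoin pairwise, starting from base:
# ((base + "AAA/") + "2019/10/21/") + "example.html"
full_url = reduce(urljoin, parts, base)
print(full_url)

# A leading slash marks an absolute path, so it replaces the base's path.
replaced = urljoin("http://www.base.com/AAA/", "/example.html")
print(replaced)
```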

4 Answers

1

You were right to want to use ''.join(), but you are misunderstanding its argument: it takes a single iterable of strings (a list, a tuple, or similar). You can create that iterable on the fly by putting the items in a list or tuple directly:

result = ''.join([comp1, comp2, comp3, comp4])

Of course, this assumes that each item in the list is itself a single string element. If comp2 and comp3 are already lists, you may need to also do joins to get them where you want them:

result = ''.join([
    comp1,
    '/'.join(comp2),
    '/'.join(comp3),
    comp4,
])
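
As a minimal sketch of what str.join accepts, the example below passes a list, a tuple, and a generator. The single letter code, the single date, and the integer date parts are assumptions made up for illustration.

```python
comp1 = "www.base.com/"
comp4 = "/example.html"

# join accepts any iterable of strings: a list, a tuple, or a generator.
from_list = "".join([comp1, "AAA", "/", "2019/10/21", comp4])
from_tuple = "".join((comp1, "AAA", "/", "2019/10/21", comp4))

# Non-string items must be converted first; otherwise join raises TypeError.
date_parts = [2019, 10, 21]  # hypothetical integer date parts
date = "/".join(str(p) for p in date_parts)

print(from_list)
print(date)
```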

If you're using Python 3.6 or later, you can also format strings very nicely using f-strings:

result = f'www.base.com/{comp2}/{comp3}/example.html'

Note here, too, that you may need to do additional formatting if comp2 and comp3 are lists:

result = f'www.base.com/{"/".join(comp2)}/{"/".join(comp3)}/example.html'

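In that case, make sure the quotes you use on the join strings are different from the quotes on the outer f-string. Before Python 3.12, the same quote type cannot be reused inside the braces, and backslash escapes are not allowed there either.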

1

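Going off of your example, you can do something like this:

for x in comp2:
    for y in comp3:
        # comp1 ends with '/' and comp4 starts with '/', so only x and y need a separator
        combine = comp1 + x + '/' + y + comp4
        print(combine)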
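
The nested loop above can also be written with itertools.product, which yields every (letter, date) pair. This is a minimal sketch using the sample values from the question; the variable names letters and date are illustrative.

```python
from itertools import product

comp1 = "www.base.com/"
comp2 = ["AAA", "BBB", "CCC"]
comp3 = ["2019/10/21", "2019/10/20", "2019/10/19"]
comp4 = "/example.html"

# One URL per (letter, date) combination: 3 x 3 = 9 URLs in total.
urls = [
    f"{comp1}{letters}/{date}{comp4}"
    for letters, date in product(comp2, comp3)
]

for url in urls:
    print(url)
```

Collecting the URLs in a list first keeps the string building separate from whatever request or scraping step comes next.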


1

I suspect this isn't exactly what you want, but if it really is just jamming the string and lists together, this would work:

comp1 = 'www.base.com/'
comp2 = ["AAA", "BBB", "CCC"]
comp3 = ['2019/10/21', '2019/10/20', '2019/10/19']
comp4 = "/example.html"


url = comp1 + "".join(comp2) + "".join(comp3) + comp4

print(url)

results:

www.base.com/AAABBBCCC2019/10/212019/10/202019/10/19/example.html

You can change the "" in the join to insert / or whatever delimiter you want between the items in the list. For example, to get AAA/BBB/CCC you would use "/".join(comp2) instead.

The deal with join only taking 1 argument is that the argument is a single iterable (such as a list), but that list can hold as many strings as you like. Good luck :)

1 Comment

I'm aiming to scrape data, so I need to iterate over AAA, BBB, and so on together with the dates, with each combination in its own link. I'll update the question to reflect that, but something like this: www.base.com/AAA/2010/10/21/example.html

0

My two cents:

def _url(r, *path_components):
    # Append each component to the base r, prefixed with a slash.
    for c in path_components:
        r += "/{}".format(str(c))
    return r
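
One caveat with a helper like this: if the base already ends with a slash, as 'www.base.com/' does in the question, the result contains a double slash. A variant that trims slashes before joining might look like the sketch below; the name join_url is hypothetical and not part of any library.

```python
def join_url(base, *path_components):
    """Join base and components with exactly one slash between each piece."""
    # Trim the trailing slash from the base and any surrounding slashes
    # from each component, so joining never doubles a separator.
    pieces = [base.rstrip("/")]
    pieces += [str(c).strip("/") for c in path_components]
    return "/".join(pieces)


url = join_url("www.base.com/", "AAA", "2019/10/21", "/example.html")
print(url)
```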

