1

I have an array arr = [title, fileurl], so when I print arr the output looks like this:

['name1', 'url1']
['name1', 'url2']
['name1', 'url3']
['name2', 'url1']
['name2', 'url2']
['name3', 'url1']

I would like to group these arrays by their first element, that is, I would like to have:

['name1', 'url1', 'url2', 'url3']
['name2', 'url1', 'url2']
['name3', 'url1']

My code:

for final in posterlink:
    pagesourcec = requests.get(final)
    soupc = BeautifulSoup(pagesourcec.text, "html.parser")
    strc = soupc.findAll("iframe", attrs={"id": "myframe"})
    title = soupb.find("li",{"class": "breadcrumb-item active"}).get_text()
    for embedlink in strc:
        fff = embedlink.get('data-src')
        arr = [title, fff]
        print arr 
3
  • arr = [title, fff]... what is title here? Commented Nov 21, 2018 at 11:22
  • Do you really need an array? Wouldn't a dictionary be better? Then you would end up with {"name1": ["url1", "url2", "url3"], "name2": ["url1", "url2"]} Commented Nov 21, 2018 at 11:22
  • Sorry, it's a variable: title = soupb.find("li",{"class": "breadcrumb-item active"}).get_text() Commented Nov 21, 2018 at 11:24

2 Answers

4

You can do this (assuming arr is the full list of [title, url] pairs, not a single pair):

from collections import defaultdict as ddict

group = ddict(list)

for name, url in arr:
  group[name].append(url)

And if you absolutely want it as a list of lists, you can then follow up with this:

group = [[name, *urls] for name, urls in group.items()]

Edit: Note that the line above needs Python 3.5 or later (it unpacks urls inside a list display), and Python 3 is what you should be using anyway. For completeness, if you're on Python 2.7, use this instead:

group = [[name] + urls for name, urls in group.items()]
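Putting the pieces together, here is a minimal end-to-end sketch of the approach above. The `pairs` list is sample data copied from the question, and the names `pairs`, `groups`, and `grouped` are made up for illustration; in the real script the pairs would be collected in the scraping loop rather than printed.

```python
from collections import defaultdict

# Sample data copied from the question (assumption: the real script
# collects these [title, url] pairs instead of printing them).
pairs = [
    ['name1', 'url1'],
    ['name1', 'url2'],
    ['name1', 'url3'],
    ['name2', 'url1'],
    ['name2', 'url2'],
    ['name3', 'url1'],
]

# Map each name to the list of URLs seen for it.
groups = defaultdict(list)
for name, url in pairs:
    groups[name].append(url)

# Flatten to a list of lists: [name, url, url, ...].
# List concatenation works on both Python 2.7 and Python 3.
grouped = [[name] + urls for name, urls in groups.items()]

for row in grouped:
    print(row)
```

The rows come out in first-seen order of the names only where dictionaries keep insertion order (Python 3.7 and later); on older versions, sort `grouped` if the order matters.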

-2

Try this:

a = [['name1', 'url1'],
 ['name1', 'url2'],
 ['name1', 'url3'],
 ['name2', 'url1'],
 ['name2', 'url2'],
 ['name3', 'url1']]
d = {}
for elem in a:
    if elem[0] not in d:
        d[elem[0]] = []
    d[elem[0]].append(elem[1:])

Output:

{'name1': [['url1'], ['url2'], ['url3']], 'name2': [['url1'], ['url2']], 'name3': [['url1']]}
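The single-element inner lists in that output come from appending the slice `elem[1:]`, which is itself a list. A hedged sketch of one possible variant, appending the URL string directly so that each key maps to a flat list; `dict.setdefault` stands in here for the explicit membership check and is a substitution, not part of the answer's code:

```python
a = [['name1', 'url1'],
     ['name1', 'url2'],
     ['name1', 'url3'],
     ['name2', 'url1'],
     ['name2', 'url2'],
     ['name3', 'url1']]

d = {}
for name, url in a:
    # setdefault returns the existing list for name, or inserts a new
    # empty one first; the URL string (not a slice) is then appended.
    d.setdefault(name, []).append(url)

print(d)
```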

6 Comments

You should not use dict as a variable name
Now @Andersson.
Now the dict name is OK, but it's still not clear what your code should do... What is a? Is it a list of lists? Also, the output doesn't really look like the one the OP is looking for
Now fully updated with a. By the way, a is a list of lists.
IMO, instead of re-formatting the data, it's better to get it in the correct format initially... Also, your key values are not lists of elements but lists of single-element lists, so it still doesn't look like the OP's requirements are met...
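One way to read the last comment is to build the grouped structure while the links are being collected, rather than regrouping pairs afterwards. A rough sketch under that assumption: `group_embed_links` and `pages` are hypothetical names, and `pages` is a made-up stand-in for whatever the requests/BeautifulSoup loop in the question extracts per page.

```python
from collections import defaultdict

def group_embed_links(pages):
    """Group links by title as they are collected.

    pages: iterable of (title, links) pairs; a hypothetical stand-in
    for the per-page results of the scraping loop in the question.
    """
    grouped = defaultdict(list)
    for title, links in pages:
        for link in links:
            # Same spot where the question builds arr = [title, fff].
            grouped[title].append(link)
    return dict(grouped)

# Made-up sample input mirroring the question's data.
pages = [
    ('name1', ['url1', 'url2', 'url3']),
    ('name2', ['url1', 'url2']),
    ('name3', ['url1']),
]
result = group_embed_links(pages)
print(result)
```

A title whose page yields no links never becomes a key in this sketch; whether that is wanted depends on the use case.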