So I have a biiiiig list of lists, looks like:

big_list = [[17465, [22, 33, 1, 7, 83, 54, 84, -5], '123-432-3'], [13254, [42, 64, 4, -5, 75, -2, 1, 6], '1423-1762-4'], [...........................................................................................................], [17264, [22, 75, 54, 2, 87, 12, 23, 86], '14234-453-1']]

I need to cycle over the entire list of lists, and when it detects two or more strings (element [2] of each inner list, e.g. '123-432-3') that are the same, it should amalgamate the lists of ints (element [1]) relating to that string with the list of ints relating to the last same value string detected.

  • Can you clarify what you want the resulting list to look like? Commented Jul 31, 2012 at 10:28
  • Is each list comprised of three elements? [int, list, string]? Is it important to preserve any kind of ordering? Commented Jul 31, 2012 at 10:30
  • Why aren't the top-level list entries classes? Commented Jul 31, 2012 at 10:31
  • "relating to that string with the list of ints relating to the last same value string detected" ??? I don't understand. Commented Jul 31, 2012 at 10:45
  • Are you only amalgamating consecutive lists? Commented Jul 31, 2012 at 11:02

3 Answers

This is my solution if you are looking for the string matches anywhere in big_list:

>>> from collections import OrderedDict
>>> big_list = [[17465, [1, 2, 3], '123-432-3'], [13254, [4, 5, 6], '1423-1762-4'], [17264, [7, 8, 9], '14234-453-1'], [12354, [10, 11, 12], '14234-453-1'], [12358, [13, 14], '14234-453-1'], [99213, [15], '123-999-3'], [27461, [16, 17, 18], '123-432-3']]
>>> def amalgamate(seq):
        d = OrderedDict()
        for num, ints, text in seq:
            d.setdefault(text, [num, [], text])[1].extend(ints)
        return d.values()

>>> amalgamate(big_list)
[[17465, [1, 2, 3, 16, 17, 18], '123-432-3'], [13254, [4, 5, 6], '1423-1762-4'], [17264, [7, 8, 9, 10, 11, 12, 13, 14], '14234-453-1'], [99213, [15], '123-999-3']]

9 Comments

Only looking for string matches, then amalgamating the list of ints relating to each string match so that there will be just one string value with a very large related list of ints.
This looks like it does exactly what I want it to, thank you very much!
@User1532369 Ah yes that's what I meant, so you should probably accept this answer since my other one finds only consecutive matches
I am getting the error: cannot import OrderedDict. Is this not supported by Python 2.5?
@user1532369 yes it's new in 2.7, however you could find a recipe which gives you the same thing in 2.5 by searching 'python 2.5 ordereddict recipe'
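
For interpreters without OrderedDict, the first-seen ordering can also be kept by hand. This is only a sketch, not the recipe mentioned above: amalgamate_plain is a hypothetical name, and it assumes each inner list has exactly the [int, list, string] shape from the question. It pairs a plain dict with a list that records the order in which strings were first seen.

```python
def amalgamate_plain(seq):
    # merged maps each string to its combined [num, ints, text] entry.
    merged = {}
    # order remembers the strings in first-seen order, because a plain
    # dict in older Pythons does not guarantee any ordering.
    order = []
    for num, ints, text in seq:
        if text not in merged:
            # First time this string is seen: keep its verification number.
            merged[text] = [num, [], text]
            order.append(text)
        merged[text][1].extend(ints)
    return [merged[text] for text in order]


sample = [[17465, [1, 2, 3], '123-432-3'],
          [13254, [4, 5, 6], '1423-1762-4'],
          [17264, [7, 8, 9], '14234-453-1'],
          [12354, [10, 11, 12], '14234-453-1'],
          [27461, [16, 17, 18], '123-432-3']]
result = amalgamate_plain(sample)
```

Because the ints are copied into fresh lists, the inner lists of the input are left untouched.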

I think this solves your problem.

big_list = [[17465, [22, 33, 1, 7, 83, 54, 84, -5], '123-432-3'],
            [13254, [42, 64, 4, -5, 75, -2, 1, 6], '1423-1762-4'],
            [17264, [22, 75, 54, 2, 87, 12, 23, 86], '14234-453-1']]


# adding same string element to big_list
big_list.append([22222, [10, 12, 13], '14234-453-1'])
# Now iterate over big_list; when '14234-453-1' is found in two inner lists,
# put the values [10, 12, 13] into the first instance and remove the second.

print "Before:"
for l in big_list:
      print l

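seen_list = {}  # maps each string to the index of its first inner list
del_list = []   # indices of duplicate inner lists to delete afterwards
for inner in xrange(len(big_list)):
      if big_list[inner][2] in seen_list:
            # duplicate string: append its ints to the first occurrence
            for item in big_list[inner][1]:
                  big_list[seen_list[big_list[inner][2]]][1].append(item)
            del_list.append(inner)
      else:
            seen_list[big_list[inner][2]] = inner

# delete from the end so the remaining indices stay valid
for i in reversed(del_list):
      del big_list[i]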

print "after:"

for l in big_list:
      print l

result:

>>> 
Before:
[17465, [22, 33, 1, 7, 83, 54, 84, -5], '123-432-3']
[13254, [42, 64, 4, -5, 75, -2, 1, 6], '1423-1762-4']
[17264, [22, 75, 54, 2, 87, 12, 23, 86], '14234-453-1']
[22222, [10, 12, 13], '14234-453-1']
after:
[17465, [22, 33, 1, 7, 83, 54, 84, -5], '123-432-3']
[13254, [42, 64, 4, -5, 75, -2, 1, 6], '1423-1762-4']
[17264, [22, 75, 54, 2, 87, 12, 23, 86, 10, 12, 13], '14234-453-1']

3 Comments

This is EXACTLY what I needed, thank you very much!! However when I run it on mine absolutely nothing happens, hopefully I will figure out why. Will this work if the string appears more than twice?
From what I understand, your big_list contains smaller lists of this structure: [INT, LIST[INT, INT, ..., INT], STR]. My code goes through these smaller lists and builds a dictionary of each STR. Each time a STR is found to already be in the dictionary, that whole inner list is marked for deletion, and its list of INT is added to the list of INT from the first time the STR was found. I was actually wondering about the first INT in each inner list: are the duplicate ones meant to be lost?
The duplicates do not matter for the int, it is essentially a verification number for the string so I only need it once if that makes any sense to you?
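
On the question of a string appearing more than twice: the dictionary always points at the first occurrence, so every later duplicate is merged into that same inner list. Here is a rough sketch of the same idea wrapped in a function (merge_in_place is a hypothetical name, and the sample rows are made up for illustration), run on data where one string appears three times.

```python
def merge_in_place(rows):
    seen = {}        # string -> index of the first inner list with that string
    to_delete = []   # indices of later duplicates
    for i, (num, ints, text) in enumerate(rows):
        if text in seen:
            # Later occurrence: move its ints into the first occurrence.
            rows[seen[text]][1].extend(ints)
            to_delete.append(i)
        else:
            seen[text] = i
    # Delete from the end so earlier indices are not shifted.
    for i in reversed(to_delete):
        del rows[i]
    return rows


rows = [[1, [10], 'x'],
        [2, [20], 'y'],
        [3, [30], 'x'],
        [4, [40], 'x'],
        [5, [50], 'y']]
merged = merge_in_place(rows)
```

Only the verification number of the first occurrence survives, which matches the requirement that it is needed just once per string.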

In its current form the question is unclear, but this might be what you are looking for (this works for consecutive matches only):

>>> from itertools import groupby
>>> from operator import itemgetter
>>> big_list = [[17465, [1, 2, 3], '123-432-3'], [13254, [4, 5, 6], '1423-1762-4'], [17264, [7, 8, 9], '14234-453-1'], [12354, [10, 11, 12], '14234-453-1'], [12358, [13, 14], '14234-453-1'], [99213, [1], '123-999-3']]
>>> def amalgamate(seq):
        for k, g in groupby(seq, itemgetter(2)):
            num, ints, text = next(g)
            for sublist in g:
                ints.extend(sublist[1])
            yield [num, ints, text]


>>> list(amalgamate(big_list))
[[17465, [1, 2, 3], '123-432-3'], [13254, [4, 5, 6], '1423-1762-4'], [17264, [7, 8, 9, 10, 11, 12, 13, 14], '14234-453-1'], [99213, [1], '123-999-3']]
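
If the matching strings are not next to each other, groupby will not merge them, because it only groups adjacent items. One possible workaround, sketched here under the assumption that the original ordering of big_list does not need to be preserved, is to sort by the string first. amalgamate_sorted is a hypothetical name. Since sorted is stable, the first occurrence of each string still supplies the leading int.

```python
from itertools import groupby
from operator import itemgetter


def amalgamate_sorted(seq):
    key = itemgetter(2)
    result = []
    # Sorting by the string makes equal strings adjacent so groupby sees them
    # as one group; the output is ordered by string, not by original position.
    for text, group in groupby(sorted(seq, key=key), key):
        group = list(group)
        num = group[0][0]  # number from the first occurrence (stable sort)
        ints = [n for _, sub, _ in group for n in sub]
        result.append([num, ints, text])
    return result


sample = [[1, [1, 2], 'b'], [2, [3], 'a'], [3, [4, 5], 'b']]
result = amalgamate_sorted(sample)
```

Unlike the generator above, this version builds new int lists rather than extending the ones inside the input.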
