ListA = [['abc','15','2021-10-02 08:53:29'],
         ['def','10','2021-10-01 07:52:19'],
         ['abc','15','2021-10-02 09:53:29'],
         ['def','10','2021-10-01 06:52:19'],
         ['gfc','10','2021-10-01 07:52:19']]

ListB = ['abc','def']

Since ListB has 'abc' and 'def', the script should

  1. compare sublists in ListA which have "abc":

    1. find the latest date, e.g. '2021-10-02 09:53:29'
    2. remove any sublist which has "abc" but whose date is earlier than the latest date
  2. compare sublists in ListA which have "def":

    1. find the latest date, e.g. '2021-10-01 07:52:19'
    2. remove any sublist which has "def" but whose date is earlier than the latest date

The final output should be

A = [['def','10','2021-10-01 07:52:19'],
     ['abc','15','2021-10-02 09:53:29'],
     ['gfc','10','2021-10-01 07:52:19']]

How to do this in Python?

  • Hi! Welcome to Stack Overflow! You are very welcome to be here, we love having you here. It's just your question that we are having trouble welcoming, because we are not a code-writing service. So you are welcome and we welcome you, please feel that you are welcomed, but your question is going to get closed and deleted if you do not abide by these guidelines. Commented Oct 5, 2021 at 12:03
  • Please provide what you have already tried and which part you are stuck on. Commented Oct 5, 2021 at 12:04
  • Also, I assume the last sublist is not intended to be a nested sublist with a missing closing bracket, so I changed it to be a list of lists. Commented Oct 5, 2021 at 12:07

1 Answer


You could sort the list by key and date, then use groupby: for each key that is in ListB, keep only the item with the latest date; for any other key, keep the whole group. Finally, flatten the result using chain:

from itertools import groupby, chain

list(chain(*([list(g)[-1]] if k in ListB else list(g)
             for k,g in groupby(sorted(ListA, key=lambda x: (x[0], x[-1])), lambda x: x[0])
            )
          ))

output:

[['abc', '15', '2021-10-02 09:53:29'],
 ['def', '10', '2021-10-01 07:52:19'],
 ['gfc', '10', '2021-10-01 07:52:19']]

NB. This does not preserve the original order of the sublists.
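
To see what each part of the one-liner above does, it may help to unpack it into named steps. This is only a sketch of the same approach; the intermediate names (ordered, groups, kept, result) are made up here for illustration, and the data is copied from the question.

```python
from itertools import chain, groupby

ListA = [['abc', '15', '2021-10-02 08:53:29'],
         ['def', '10', '2021-10-01 07:52:19'],
         ['abc', '15', '2021-10-02 09:53:29'],
         ['def', '10', '2021-10-01 06:52:19'],
         ['gfc', '10', '2021-10-01 07:52:19']]
ListB = ['abc', 'def']

# Step 1: sort by key, then by date string. Zero-padded
# 'YYYY-MM-DD HH:MM:SS' strings sort chronologically as plain text.
ordered = sorted(ListA, key=lambda row: (row[0], row[-1]))

# Step 2: group consecutive rows sharing the same key (first item).
# groupby only merges adjacent items, which is why step 1 sorts first.
groups = [(key, list(rows))
          for key, rows in groupby(ordered, key=lambda row: row[0])]

# Step 3: for keys in ListB keep only the last (latest) row,
# for all other keys keep every row of the group.
kept = [rows[-1:] if key in ListB else rows for key, rows in groups]

# Step 4: flatten the list of groups back into a list of rows.
result = list(chain.from_iterable(kept))
print(result)
```

Printing ordered, groups and kept one at a time shows the intermediates.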

keeping order

If order is important, it is possible to save it (here using enumerate) and to reorder after filtering:

from itertools import groupby, chain

[
i[1] for i in
sorted(chain(*([list(g)[-1]] if k in ListB else list(g)
             for k,g in groupby(sorted(enumerate(ListA), key=lambda x: (x[1][0], x[1][-1])), lambda x: x[1][0])
            )
          ))
]

output:

[['def', '10', '2021-10-01 07:52:19'],
 ['abc', '15', '2021-10-02 09:53:29'],
 ['gfc', '10', '2021-10-01 07:52:19']]
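
As an alternative to sorting and groupby (a different technique from the one above, not a rewrite of it), the order can also be kept with two plain passes over the list: one to record the latest date per key, one to filter. This is a minimal sketch; keep_latest is a hypothetical helper name, and it assumes the dates are zero-padded 'YYYY-MM-DD HH:MM:SS' strings so that string comparison matches chronological order. If several rows share the latest date for a key, this version keeps the first of them, which may differ from the groupby version.

```python
def keep_latest(rows, keys):
    """Hypothetical helper: for each key in `keys`, keep only the row
    with the latest date; rows with other keys are kept unchanged."""
    keys = set(keys)

    # Pass 1: record the latest date string seen for each tracked key.
    latest = {}
    for row in rows:
        key, stamp = row[0], row[-1]
        if key in keys and (key not in latest or stamp > latest[key]):
            latest[key] = stamp

    # Pass 2: filter in the original order, keeping one row per tracked key.
    seen = set()
    result = []
    for row in rows:
        key = row[0]
        if key not in keys:
            result.append(row)
        elif row[-1] == latest[key] and key not in seen:
            result.append(row)
            seen.add(key)
    return result


ListA = [['abc', '15', '2021-10-02 08:53:29'],
         ['def', '10', '2021-10-01 07:52:19'],
         ['abc', '15', '2021-10-02 09:53:29'],
         ['def', '10', '2021-10-01 06:52:19'],
         ['gfc', '10', '2021-10-01 07:52:19']]
ListB = ['abc', 'def']

print(keep_latest(ListA, ListB))
```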

3 Comments

@liuxu you can consider marking the answer as accepted if your problem is solved ;)
Thanks @mozway, but can you explain more about the script? I want to understand more about it :)
@liuxu what have you not understood? Group by key (first item); if the key is in ListB, take the last (most recent) element, else the whole group. Finally, flatten the lists back to the original format. Let me know if anything is unclear, but the best might be to try running the code from the inside out to see the intermediates.
