I have a Python list made up of (username, timestamp) tuples. Imagine it looks like the following:

[(username,datetime_obj),(username,datetime_obj),(username,datetime_obj),(username,datetime_obj),(username,datetime_obj),(username,datetime_obj),(username,datetime_obj),(username,datetime_obj),(username,datetime_obj)]

Next, imagine the list above has only 3 unique usernames, but that all datetime objects are unique.

What's the most efficient, Pythonic way to derive a new list from the one above, again made up of tuples with the same usernames in the same order, except that each username is paired with the most recent datetime_obj in the list for that particular username?

E.g. if the starting list was [(sam,1),(sam,7),(sam,8),(jon,4),(mel,9),(mel,2),(mel,10),(jon,3),(jon,6)], I end up with [(sam,1),(sam,1),(sam,1),(jon,3),(mel,2),(mel,2),(mel,2),(jon,3),(jon,3)].

I used ints to depict datetime objects in the example above. This was just for simplicity.

Thanks in advance.

  • Just do the job with normal Python syntax (especially list comprehensions), and don't worry about efficiency or 'pythonic'. If you show code that works, we can suggest improvements. Commented Feb 4, 2016 at 3:09

1 Answer

I think you can't get around iterating over the list twice:

most_recent = {}
for user, date in myList:
    most_recent[user] = max(most_recent.get(user, date), date)

newList = [(user, most_recent[user]) for user, _ in myList]
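As a minimal runnable sketch of the two-pass approach above, the version below wraps it in a function and feeds it real datetime objects. The function name `attach_most_recent` and the sample data are hypothetical, chosen only for illustration.

```python
from datetime import datetime


def attach_most_recent(pairs):
    """Pair every username with that user's latest timestamp, keeping list order."""
    most_recent = {}
    # First pass: remember the latest timestamp seen so far for each user.
    for user, stamp in pairs:
        most_recent[user] = max(most_recent.get(user, stamp), stamp)
    # Second pass: rebuild the list in its original order.
    return [(user, most_recent[user]) for user, _ in pairs]


# Hypothetical sample data: two users, all timestamps unique.
events = [
    ("sam", datetime(2016, 2, 1, 9, 0)),
    ("jon", datetime(2016, 2, 2, 12, 30)),
    ("sam", datetime(2016, 2, 3, 8, 15)),
    ("jon", datetime(2016, 1, 31, 23, 59)),
]

result = attach_most_recent(events)
for user, stamp in result:
    print(user, stamp.isoformat())
```

Each pass touches every element once, so the total work grows linearly with the length of the list.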

You can do something like this, if you consider this more pythonic, but it is slower (quadratic complexity), so don't actually do it:

[(user, max(date for u, date in myList if u == user)) for user, _ in myList]
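Note that the example output in the question keeps the smallest value per user, while the text asks for the most recent one. The sketch below runs the same dictionary-based loop on the question's sample data with the reducing function passed in, so either reading can be checked. The helper name `attach_per_user` and its `pick` parameter are assumptions made for this illustration.

```python
def attach_per_user(pairs, pick=max):
    """Pair every username with pick(...) over that user's values, keeping order."""
    chosen = {}
    # First pass: fold each value into the running choice for its user.
    for user, value in pairs:
        chosen[user] = pick(chosen.get(user, value), value)
    # Second pass: rebuild the list in its original order.
    return [(user, chosen[user]) for user, _ in pairs]


# The question's sample data, with ints standing in for datetime objects.
data = [("sam", 1), ("sam", 7), ("sam", 8), ("jon", 4), ("mel", 9),
        ("mel", 2), ("mel", 10), ("jon", 3), ("jon", 6)]

earliest = attach_per_user(data, pick=min)  # matches the question's example output
latest = attach_per_user(data, pick=max)    # matches "most recent" in the text
print(earliest)
print(latest)
```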

5 Comments

i guess it should be [(user, min(date for u, date in myList if u == user)) for user, _ in myList]
@minitoto If they want the output described in the question, yes. But if they want the most recent date object, no. In any case, this is simple to change.
yeah i understand that, OP simplified the I/O but it's confusing :)
It's not more Pythonic to use a list comp when you'd be nesting comprehensions excessively and turning O(n) algorithms into O(n**2) algorithms. My only changes would be: 1. Use {} to initialize most_recent instead of the constructor (trivial) and 2. Change the if and set to the one-line most_recent[user] = max(most_recent.get(user, date), date), so the code no longer special-cases the "first time seen" case and no longer needs a specific sentinel value (0 in this case, but a suitable sentinel is harder to choose for arbitrary datetimes).
@ShadowRanger To me the list comprehension is slightly more readable, because it is a more direct translation of the semantics, but of course it should not be used. I agree with your other comments, and made changes accordingly.
