
I have a Python script that prints a report on each EC2 instance. One function sorts the instances by their network interface attach time, but whenever I use strftime to format the date, the sorting breaks: the instances are listed in a seemingly random order rather than from the oldest CreationDateTime to the newest. The output I get is as follows:

i-09dc54328002240ff,Aug 05 2021,asg-workxxx
i-048e92c5a4741d2b1,Mar 09 2017,False
i-0d649cebdf54bd2f4,Mar 12 2020,asg-dyyyyy
i-0ff596f1dc01b61d8,Mar 17 2021,asg-base-test
i-06db4eb158ad0b071,May 12 2021,False
i-0f285277529543462,May 18 2018,False
i-0f67cf99fb9f96c3f,Oct 14 2020,asg-elk-test
i-01734dfef0159c5c8,Oct 20 2020,asg-lb-test
i-0539c8dfc839cbfda,Oct 26 2020,asg-stand-base-test 

As you can see, the CreationDateTime values are not in sorted order.

My code is as follows:

    response = ec2_client.describe_instances(
        MaxResults=10
    )

    # Return JSON data from describe_instances and filter what is needed
    instances = len(response['Reservations'])
    header_row = 'InstanceID, CreationDateTime, AutoScalingGroupName' + '\n'

    for x in range(instances):

        # Get InstanceId
        instance_id = response['Reservations'][x]['Instances'][0]['InstanceId']

        # Get the NetworkInterfaces AttachTime
        network_interface_id = (
            response['Reservations'][x]['Instances'][0]['NetworkInterfaces'][0]['NetworkInterfaceId'])

        network_interface_details = ec2_client.describe_network_interfaces(
            NetworkInterfaceIds=[network_interface_id])
        networkinterface_id_attachedtime = network_interface_details[
            'NetworkInterfaces'][0]['Attachment']['AttachTime']

        time_between_insertion = datetime.now(
            timezone.utc) - networkinterface_id_attachedtime

        # Format the attach time for display (this line was missing from the
        # snippet; the format is assumed from the output shown above)
        formatted_date_networkinterface_id = networkinterface_id_attachedtime.strftime('%b %d %Y')

        # Get the Auto Scaling group name
        tags = response['Reservations'][x]['Instances'][0]['Tags']
        autoscaling_group_name = get_tag(tags, 'aws:autoscaling:groupName')

        # Add the instance to the report if it is older than max_age days
        if time_between_insertion.days > max_age:
            line = '{},{},{}'.format(
                instance_id, formatted_date_networkinterface_id, autoscaling_group_name)
            instances_list.append(line)

    sorted_list = sorted(instances_list, key=lambda v: v.split(',')[1])

    for instance in sorted_list:
        print(instance)

  • What is your current output and expected output? Commented Sep 15, 2021 at 12:23
  • You are sorting a list of strings by sub-string sorted(instances_list, key=lambda v: v.split(',')[1]) so there is no "date" per se. Note in your output the "A" month dates appear first... Commented Sep 15, 2021 at 12:46
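The point made in the comment above can be illustrated with a short sketch that uses three rows from the question's output. Sorting on the formatted substring compares text, so "Aug" comes before "Mar" regardless of the year, whereas parsing the field first gives chronological order. The variable names `rows`, `by_text` and `by_date` are illustrative only and do not come from the original script.

```python
from datetime import datetime

rows = [
    'i-09dc54328002240ff,Aug 05 2021,asg-workxxx',
    'i-048e92c5a4741d2b1,Mar 09 2017,False',
    'i-0d649cebdf54bd2f4,Mar 12 2020,asg-dyyyyy',
]

# Sorting on the raw substring compares text, so "Aug" < "Mar" alphabetically.
by_text = sorted(rows, key=lambda v: v.split(',')[1])

# Parsing the substring into a datetime makes the comparison chronological.
by_date = sorted(rows, key=lambda v: datetime.strptime(v.split(',')[1], '%b %d %Y'))

print([r.split(',')[1] for r in by_text])
print([r.split(',')[1] for r in by_date])
```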

1 Answer


To parse dates, you can use datetime.datetime.strptime:

import datetime  # strptime
import operator  # itemgetter

data_unparsed = ['i-09dc54328002240ff,Aug 05 2021,asg-workxxx',
'i-048e92c5a4741d2b1,Mar 09 2017,False',
'i-0d649cebdf54bd2f4,Mar 12 2020,asg-dyyyyy',
'i-0ff596f1dc01b61d8,Mar 17 2021,asg-base-test',
'i-06db4eb158ad0b071,May 12 2021,False',
'i-0f285277529543462,May 18 2018,False',
'i-0f67cf99fb9f96c3f,Oct 14 2020,asg-elk-test',
'i-01734dfef0159c5c8,Oct 20 2020,asg-lb-test',
'i-0539c8dfc839cbfda,Oct 26 2020,asg-stand-base-test']

data = [((row := s.split(','))[0], datetime.datetime.strptime(row[1], '%b %d %Y'), row[2]) for s in data_unparsed]

data_sorted = sorted(data, key=operator.itemgetter(1))
print(data_sorted)
# [('i-048e92c5a4741d2b1', datetime.datetime(2017, 3, 9, 0, 0), 'False'),
#  ('i-0f285277529543462', datetime.datetime(2018, 5, 18, 0, 0), 'False'),
#  ('i-0d649cebdf54bd2f4', datetime.datetime(2020, 3, 12, 0, 0), 'asg-dyyyyy'),
#  ('i-0f67cf99fb9f96c3f', datetime.datetime(2020, 10, 14, 0, 0), 'asg-elk-test'),
#  ('i-01734dfef0159c5c8', datetime.datetime(2020, 10, 20, 0, 0), 'asg-lb-test'),
#  ('i-0539c8dfc839cbfda', datetime.datetime(2020, 10, 26, 0, 0), 'asg-stand-base-test'),
#  ('i-0ff596f1dc01b61d8', datetime.datetime(2021, 3, 17, 0, 0), 'asg-base-test'),
#  ('i-06db4eb158ad0b071', datetime.datetime(2021, 5, 12, 0, 0), 'False'),
#  ('i-09dc54328002240ff', datetime.datetime(2021, 8, 5, 0, 0), 'asg-workxxx')]

Alternatively, as a one-liner:

data_sorted = sorted(data_unparsed, key=lambda s: datetime.datetime.strptime(s.split(',')[1], '%b %d %Y'))

print(data_sorted)
# ['i-048e92c5a4741d2b1,Mar 09 2017,False',
#  'i-0f285277529543462,May 18 2018,False',
#  'i-0d649cebdf54bd2f4,Mar 12 2020,asg-dyyyyy',
#  'i-0f67cf99fb9f96c3f,Oct 14 2020,asg-elk-test',
#  'i-01734dfef0159c5c8,Oct 20 2020,asg-lb-test',
#  'i-0539c8dfc839cbfda,Oct 26 2020,asg-stand-base-test',
#  'i-0ff596f1dc01b61d8,Mar 17 2021,asg-base-test',
#  'i-06db4eb158ad0b071,May 12 2021,False',
#  'i-09dc54328002240ff,Aug 05 2021,asg-workxxx']
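If the original datetime objects are still available, as seems to be the case in the question (AttachTime is subtracted from `datetime.now(timezone.utc)`, so it is already a datetime), another option is to sort first and call strftime only when printing. The following is a minimal sketch under that assumption; the `records` list, its shortened instance IDs and its timestamps are made up for illustration and stand in for the data collected from the EC2 API.

```python
from datetime import datetime, timezone

# Hypothetical (instance_id, attach_time, asg_name) records standing in for API data.
records = [
    ('i-aaa', datetime(2021, 8, 5, 9, 30, tzinfo=timezone.utc), 'asg-a'),
    ('i-bbb', datetime(2017, 3, 9, 14, 0, tzinfo=timezone.utc), 'False'),
    ('i-ccc', datetime(2020, 10, 14, 8, 15, tzinfo=timezone.utc), 'asg-c'),
]

# Sort on the datetime itself, oldest first.
records.sort(key=lambda rec: rec[1])

# Format only at output time, so the display format cannot affect the order.
lines = ['{},{},{}'.format(iid, ts.strftime('%b %d %Y'), asg) for iid, ts, asg in records]
for line in lines:
    print(line)
```

This avoids the round trip through strptime altogether, and the display format can be changed later without touching the sort.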

Relevant documentation:


3 Comments

I have just tried that and I get this error: ValueError: time data '2020-03-12 10:07:07+00:00' does not match format '%b %d %Y' @Stef
@devdude, aye, the format I specified matches strings like "Aug 05 2021". If the strings are different, such as "2020-03-12 10:07:07+00:00", you need a different format. The documentation I linked explains how to write the format (probably "%Y-%m-%d %H:%M:%S%z" in the case of '2020-03-12 10:07:07+00:00'). Or you could avoid writing a format string and let a parser infer it, for example datetime.datetime.fromisoformat for ISO 8601 strings such as this one. The danger with parsers that guess the format of ambiguous strings is that you have to make sure day and month are not confused (i.e. that 07-12 is read as 12 July and not 7 December).
If you have only those two formats, you could use a try / except block: try the first format and, if it fails, fall back to the second.
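The try / except fallback described in the comments could be sketched roughly as follows. The helper name `parse_date` and the `KNOWN_FORMATS` tuple are hypothetical and not part of the original script; the two format strings are the ones mentioned in this thread, and `%z` accepting an offset written as "+00:00" assumes Python 3.7 or later.

```python
from datetime import datetime

# The two formats mentioned in this thread; extend the tuple if others appear.
KNOWN_FORMATS = ('%b %d %Y', '%Y-%m-%d %H:%M:%S%z')


def parse_date(text):
    """Try each known format in turn and return the first successful parse."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError('no known format matches {!r}'.format(text))


print(parse_date('Aug 05 2021'))
print(parse_date('2020-03-12 10:07:07+00:00'))
```

Note that the first format yields a naive datetime and the second a timezone-aware one. Python refuses to compare the two, so a list that mixes both kinds of string would need its values normalized (for example by attaching UTC to the naive ones) before sorting.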
