2

I have two lists in Python and I want to perform some checks on them. My lists are the following:

list_A = [["'EASY'", "'LEVEL_C'", "'4'", '0.714', '\n'], ["'EASY'", "'LEVEL_D'", "'5'", '0.778', '\n'], ["'EASY'", "'LEVEL_D'", "'5'", '0.226', '\n'], ["'EASY'", "'LEVEL_D'", "'5'", '0.222', '\n'], ...]
list_B = [["'EASY'", "'LEVEL_B'", "'2'", '1.000', '\n'], ["'EASY'", "'LEVEL_C'", "'3'", '1.000', '\n'], ["'EASY'", "'LEVEL_D'", "'4'", '1.000', '\n'], ["'EASY'", "'LEVEL_D'", "'4'", '0.290', '\n'], ...]

For the variable "EASY" and the level variable, which takes the values LEVEL_A to LEVEL_F, there is a third variable corresponding to the score (1-6) and a confidence variable (0-1). I want to compare the two lists by the EASY and level variables and find, in every case, which of the two lists (list_A or list_B) has the greater score and with what confidence. How can I do so?

The way I construct my rules is as follows: first I get the rows from an executable, then I filter them into lists. An example row at each stage, ending with the resulting vector, is the following:

Rule: ('EASY', 'LEVEL_E') ==> ('4') , 0.182 
'EASY' 'LEVEL_E' '4'  0.182 
["'EASY'", "'LEVEL_E'", "'4'", '0.182', '\n']

and the code that I am using for creating the vector:

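import re

# my_lines holds the raw "Rule: ..." rows read from the executable's output.
for row in my_lines:
    print(row)
    row = re.sub('[()]', "", row)  # drop the parentheses
    row = row.replace("Rule: ", "")
    row = row.replace(",", "")
    row = row.replace("==>", "")
    print(row)
    split = re.split(r' +', row)  # split on runs of spaces
    print(split)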
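
The same cleanup could also be sketched with a single regular expression that captures the four fields directly and converts the score and confidence to numbers. This is only a sketch: `RULE_PATTERN` and `parse_rule` are hypothetical names, and the pattern assumes every line looks exactly like the example rule above.

```python
import re

# Hypothetical pattern for lines such as:
#   Rule: ('EASY', 'LEVEL_E') ==> ('4') , 0.182
RULE_PATTERN = re.compile(
    r"Rule:\s*\('(?P<difficulty>[^']+)',\s*'(?P<level>[^']+)'\)\s*==>\s*"
    r"\('(?P<score>\d+)'\)\s*,\s*(?P<confidence>\d*\.?\d+)"
)


def parse_rule(line):
    """Return (difficulty, level, score, confidence), or None if the line does not match."""
    match = RULE_PATTERN.search(line)
    if match is None:
        return None
    return (
        match.group("difficulty"),
        match.group("level"),
        int(match.group("score")),
        float(match.group("confidence")),
    )


print(parse_rule("Rule: ('EASY', 'LEVEL_E') ==> ('4') , 0.182 \n"))
```

With numeric scores and confidences, later comparisons no longer depend on string ordering.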

Then, as soon as I have created my lists, I sort them by the second element, which corresponds to the level variable:

list_A.sort(key=lambda x: x[1])
list_B.sort(key=lambda x: x[1])

EDIT: I have sorted the lists by the level variable. Now I want to compare the two lists by score for every level. When a level does not exist, its score is zero, and when the same score exists twice, the system should prefer the one with the highest confidence. How can I compare all the possible values of the level variable?

  • 2
    Why are the sublists not ordered? Sometimes 'LEVEL' is the first item and other times 'EASY'. Commented Apr 13, 2017 at 9:16
  • 1
    Also, how are these lists generated? Is using objects/dictionaries an option? Commented Apr 13, 2017 at 9:22
  • 2
    Clearly, the data structure is not suitable. You should use dictionaries with keys such as 'Level' and 'score'. Then just build a list of dictionaries, which can easily be sorted and handled in the same way. Commented Apr 13, 2017 at 9:22
  • 1
    How did you produce these lists? They look like the result of existing Python code. If that's the case, can you show us that code? Commented Apr 13, 2017 at 9:24
  • 1
    Show us the code that generates these lists. It looks like you should do it in a different way. It would also be good to know why you have these two lists. Please describe your goals and requirements in detail. Commented Apr 13, 2017 at 9:52

2 Answers

2

This is only a partial answer, but it would be a lot more pleasant to have the data in a dict of dicts:

dict_a = {
    'LEVEL_D': {'difficulty': 'EASY', 'score': 1, 'confidence': 0.778},
    'LEVEL_F': {'difficulty': 'EASY', 'score': 6, 'confidence': 0.750},
    'LEVEL_C': {'difficulty': 'EASY', 'score': 7, 'confidence': 0.714},
    }

dict_b = {
    'LEVEL_F': {'difficulty': 'EASY', 'score': 8, 'confidence': 0.800},
    'LEVEL_B': {'difficulty': 'EASY', 'score': 2, 'confidence': 0.900},
    'LEVEL_D': {'difficulty': 'EASY', 'score': 3, 'confidence': 1.000},
    }

Then you could write a simple for loop to get the desired values of the inner dicts:

for level in dict_a:
    if level in dict_b:
        stats_a = dict_a[level]
        stats_b = dict_b[level]
        score_a = stats_a['score']
        score_b = stats_b['score']
        conf_a = stats_a['confidence']
        conf_b = stats_b['confidence']
        print(level, score_a, score_b, conf_a, conf_b)

We need to figure out how to rearrange the data in this way. The list-of-lists approach could actually work too, but it would be less efficient. The main problem is that the data is not ordered correctly.
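
One possible way to do that rearrangement is sketched below. `rows_to_dict` is a hypothetical helper, and it assumes every row has the shape `[difficulty, level, score, confidence, '\n']` shown in the question. When a level appears more than once, it keeps the row with the highest score and, for equal scores, the highest confidence. The sample rows are the visible part of `list_A` from the question.

```python
def rows_to_dict(rows):
    """Map each level to its best row: highest score, then highest confidence."""
    best = {}
    for row in rows:
        # Strip the embedded single quotes and ignore the trailing '\n' element.
        difficulty, level, score, confidence = (field.strip("'") for field in row[:4])
        entry = {
            'difficulty': difficulty,
            'score': int(score),
            'confidence': float(confidence),
        }
        current = best.get(level)
        # Tuples compare element by element: score first, then confidence.
        if current is None or (
            (entry['score'], entry['confidence'])
            > (current['score'], current['confidence'])
        ):
            best[level] = entry
    return best


list_A = [
    ["'EASY'", "'LEVEL_C'", "'4'", '0.714', '\n'],
    ["'EASY'", "'LEVEL_D'", "'5'", '0.778', '\n'],
    ["'EASY'", "'LEVEL_D'", "'5'", '0.226', '\n'],
    ["'EASY'", "'LEVEL_D'", "'5'", '0.222', '\n'],
]
dict_a = rows_to_dict(list_A)
print(dict_a['LEVEL_D'])
```

Keying on the level alone mirrors the dicts above; if several difficulty values can occur, a `(difficulty, level)` tuple would be a safer key.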

Edit: To get the name of the list with the higher score for the specific level you can do this:

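for level in dict_a:
    if level in dict_b:
        stats_a = dict_a[level]
        stats_b = dict_b[level]
        # Compare (score, confidence) tuples so that equal scores fall back to confidence.
        key_a = (stats_a['score'], stats_a['confidence'])
        key_b = (stats_b['score'], stats_b['confidence'])
        container = 'A' if key_a > key_b else 'B'
        print('Container {} has the higher score for level {}.'.format(container, level))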

2 Comments

It is easier to find a solution with the list of lists as it is than to rearrange it into a dictionary. I have managed to sort the lists by level, and now I just need to perform the comparisons for all possible values of the level variable, {LEVEL_A - LEVEL_F}.
There are also cases where the same level appears twice with a different confidence or score.
0

The final solution to my question was to order the lists using a simple string sort and then zip the two lists so that I could perform the comparison. The code used is the following:

list_A.sort(key=lambda x: x[1])
list_B.sort(key=lambda x: x[1])
res = zip(list_A, list_B)

However, it seems that the dictionary solution proposed in the previous answer is more efficient than using the lists.
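
Note that `zip` pairs rows purely by position, so the pairs only line up when both lists contain the same levels the same number of times. A dictionary-based comparison over every level avoids that, and could be sketched as follows. `LEVELS`, `compare_levels`, `best_a` and `best_b` are hypothetical names; the sample values are the best visible rows per level from the question, and a missing level is assumed to count as score 0.

```python
LEVELS = ['LEVEL_A', 'LEVEL_B', 'LEVEL_C', 'LEVEL_D', 'LEVEL_E', 'LEVEL_F']


def compare_levels(best_a, best_b, levels=LEVELS):
    """Return a list of (level, winner, score, confidence), one entry per level.

    best_a and best_b map a level to a (score, confidence) tuple.
    """
    missing = (0, 0.0)  # assumption: an absent level has score 0
    results = []
    for level in levels:
        entry_a = best_a.get(level, missing)
        entry_b = best_b.get(level, missing)
        # Tuple comparison checks the score first, then the confidence.
        if entry_a > entry_b:
            winner, entry = 'A', entry_a
        elif entry_b > entry_a:
            winner, entry = 'B', entry_b
        else:
            winner, entry = 'tie', entry_a
        results.append((level, winner, entry[0], entry[1]))
    return results


best_a = {'LEVEL_C': (4, 0.714), 'LEVEL_D': (5, 0.778)}
best_b = {'LEVEL_B': (2, 1.0), 'LEVEL_C': (3, 1.0), 'LEVEL_D': (4, 1.0)}
for result in compare_levels(best_a, best_b):
    print(result)
```

Levels that are absent from both lists are reported as a tie with score 0.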
