I have a list of dictionaries that looks like this:

example =
[
   {
      "value":"promo",
      "score":0.3333333333333333,
      "slugger":"promoKeyword",
      "type":"normal",
   },
   {
      "value":"unknown",
      "score":1.0,
      "slugger":"promoCategory",
      "type":"normal",
   },
   {
      "value":"theory",
      "score":0.3333333333333333,
      "slugger":"promoCategory",
      "type":"normal",
   },
   {
      "value":"theory",
      "score":0.5,
      "slugger":"promoCart",
      "type":"normal",
   }
]

I want to filter the list so that, for each distinct [slugger] value, only the entry with the highest [score] is kept. A [slugger] value can appear several times; in that case only its highest-scoring entry should remain.

So the example should end up looking like this:

[
   {
      "value":"promo",
      "score":0.3333333333333333,
      "slugger":"promoKeyword",
      "type":"normal",
   },
   {
      "value":"unknown",
      "score":1.0,
      "slugger":"promoCategory",
      "type":"normal",
   },
   {
      "value":"theory",
      "score":0.5,
      "slugger":"promoCart",
      "type":"normal",
   }
]

My attempt so far looks like this, but it fails to satisfy the condition:

score_data = []
for data in example:
    score_data.append(data['score'])
max_score = max(score_data)
example = [x for x in example if x['score'] == max_score and x['score'] > 0]
example = list({ each['slug'] : each for each in example }.values())

Can you help? Thank you in advance.

1
  • I don't have much time, so only general advice. Read about groupby - sort by slugger value, then group by it (groupby only groups adjacent elements, hence the sorting first), and then you can take the max. Commented Nov 27, 2019 at 16:21

4 Answers

1

One solution using itertools:

data = [
   {
      "value":"promo",
      "score":0.3333333333333333,
      "slugger":"promoKeyword",
      "type":"normal",
   },
   {
      "value":"unknown",
      "score":1.0,
      "slugger":"promoCategory",
      "type":"normal",
   },
   {
      "value":"theory",
      "score":0.3333333333333333,
      "slugger":"promoCategory",
      "type":"normal",
   },
   {
      "value":"theory",
      "score":0.5,
      "slugger":"promoCart",
      "type":"normal",
   }
]

from itertools import groupby, islice

rv = []
for _, g in groupby(sorted(data, key=lambda k: (k['slugger'], -k['score'])), lambda k: k['slugger']):
    rv.extend(islice(g, 0, 1))

from pprint import pprint
pprint(rv, width=30)

Prints:

[{'score': 0.5,
  'slugger': 'promoCart',
  'type': 'normal',
  'value': 'theory'},
 {'score': 1.0,
  'slugger': 'promoCategory',
  'type': 'normal',
  'value': 'unknown'},
 {'score': 0.3333333333333333,
  'slugger': 'promoKeyword',
  'type': 'normal',
  'value': 'promo'}]
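
A variant of the same idea, offered only as a sketch and not as part of the original answer: sort by 'slugger' alone and let max() pick the highest-scoring entry of each group. The helper name best_per_slugger and the variable sample are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def best_per_slugger(records):
    """Return one dict per 'slugger' value: the one with the highest 'score'."""
    by_slugger = itemgetter('slugger')
    # groupby only groups adjacent items, so sort by the grouping key first
    ordered = sorted(records, key=by_slugger)
    return [
        max(group, key=itemgetter('score'))
        for _, group in groupby(ordered, key=by_slugger)
    ]


sample = [
    {"value": "promo", "score": 0.3333333333333333, "slugger": "promoKeyword", "type": "normal"},
    {"value": "unknown", "score": 1.0, "slugger": "promoCategory", "type": "normal"},
    {"value": "theory", "score": 0.3333333333333333, "slugger": "promoCategory", "type": "normal"},
    {"value": "theory", "score": 0.5, "slugger": "promoCart", "type": "normal"},
]

for record in best_per_slugger(sample):
    print(record['slugger'], record['score'])
```

As with the answer's code, the result comes back ordered by 'slugger', not in the original list order.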

1 Comment

Thank you! Solved, it's really helpful. I need to explore itertools then. Thanks again.
0

Perhaps convert the list of dictionaries to a dataframe and then extract the stuff you want?

list_values = [
   {
      "value":"promo",
      "score":0.3333333333333333,
      "slugger":"promoKeyword",
      "type":"normal",
   },
   {
      "value":"unknown",
      "score":1.0,
      "slugger":"promoCategory",
      "type":"normal",
   },
   {
      "value":"theory",
      "score":0.3333333333333333,
      "slugger":"promoCategory",
      "type":"normal",
   },
   {
      "value":"theory",
      "score":0.5,
      "slugger":"promoCart",
      "type":"normal",
   }
]

import pandas as pd

df = pd.DataFrame(list_values)

# Get average scores for each slugger:
df.groupby('slugger')['score'].mean()

# Get max score for each slugger:
df.groupby('slugger')['score'].max()

You haven't specified what the example variable is, so I can't really help you with that.


0

You can first build a filter dictionary that maps each slugger to its highest score, and then create a new list based on that dictionary. For your example, the code would look like this.

d = dict()

## this will create a dictionary of categories as keys and highest score as value

for e in example:
    if e['slugger'] in d:
        # keep the larger of the stored score and the current one
        if e['score'] > d[e['slugger']]:
            d[e['slugger']] = e['score']
    else:
        d[e['slugger']] = e['score']

## this will filter the original list by dictionary
result = [e for e in example if d[e['slugger']] == e['score']]
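
Note that a filter of this kind keeps every entry that ties for the highest score within a slugger. If exactly one entry per slugger is wanted, one possible sketch is to store the whole dictionary instead of just the score. The helper name keep_highest and the variable sample are hypothetical, and the result order relies on dicts preserving insertion order (Python 3.7+).

```python
def keep_highest(records):
    """Single pass: remember the best-scoring record seen so far per 'slugger'."""
    best = {}
    for rec in records:
        key = rec['slugger']
        # strict '>' means the first record wins when scores tie
        if key not in best or rec['score'] > best[key]['score']:
            best[key] = rec
    # values come back in the order each slugger was first seen
    return list(best.values())


sample = [
    {"value": "promo", "score": 0.3333333333333333, "slugger": "promoKeyword", "type": "normal"},
    {"value": "unknown", "score": 1.0, "slugger": "promoCategory", "type": "normal"},
    {"value": "theory", "score": 0.3333333333333333, "slugger": "promoCategory", "type": "normal"},
    {"value": "theory", "score": 0.5, "slugger": "promoCart", "type": "normal"},
]

for record in keep_highest(sample):
    print(record['slugger'], record['value'], record['score'])
```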


0

Use list comprehensions

data = [
    {
        "value":"promo",
        "score":0.3333333333333333,
        "slugger":"promoKeyword",
        "type":"normal",
    },
    {
        "value":"unknown",
        "score":1.0,
        "slugger":"promoCategory",
        "type":"normal",
    },
    {
        "value":"theory",
        "score":0.3333333333333333,
        "slugger":"promoCategory",
        "type":"normal",
    },
    {
        "value":"theory",
        "score":0.5,
        "slugger":"promoCart",
        "type":"normal",
    }]
print([
    max([y['score'] for y in data if y['slugger'] == x]) 
        for x in set([z['slugger'] for z in data])
])

set([z['slugger'] for z in data])

This part creates a set of the unique values, in your case the unique 'slugger' values.

[[y['score'] for y in data if y['slugger'] == x] for x in set([z['slugger'] for z in data])]

This part returns the scores as one list per slugger.

And finally we use max to get only the max values of each group.
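
The comprehension above yields only the maximum scores. If the full dictionaries are needed, as in the question's expected output, a possible sketch is to pass a key function to max. The names sluggers and result are assumptions for illustration, and dict.fromkeys is used here in place of set only to keep the first-seen order of the sluggers.

```python
data = [
    {"value": "promo", "score": 0.3333333333333333, "slugger": "promoKeyword", "type": "normal"},
    {"value": "unknown", "score": 1.0, "slugger": "promoCategory", "type": "normal"},
    {"value": "theory", "score": 0.3333333333333333, "slugger": "promoCategory", "type": "normal"},
    {"value": "theory", "score": 0.5, "slugger": "promoCart", "type": "normal"},
]

# unique slugger values, in the order they first appear
sluggers = list(dict.fromkeys(d['slugger'] for d in data))

# for each slugger, pick the whole dict with the highest score
result = [
    max((d for d in data if d['slugger'] == s), key=lambda d: d['score'])
    for s in sluggers
]

for record in result:
    print(record)
```

This rescans the list once per slugger, which is fine for small inputs but grows quadratically in the worst case.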
