21

I have data naively collected from package dependency lists.

Depends: foo bar baz >= 5.2

I end up with

 d = set(['foo','bar','baz','>=','5.2'])

I don't want the numerics or the operators.

In Perl I would

@new = grep {/^[a-z]+$/} @old

but I can't find a way to e.g. pass remove() a lambda, or something.

The closest I've come is ugly:

[item != None for item in [re.search("^[a-zA-Z]+$", atom) for atom in d]]

which gets me a list of booleans marking which values in the set I want... but that is only usable if the iteration order of the set is repeatable. I know that's not the case for Perl hashes.

I know how to iterate. :) I'm trying to do it the pythonesque Right Way

2
  • Take a look at this post (which is kind of your question in reverse): stackoverflow.com/questions/1112444/… Commented Aug 21, 2009 at 21:33
  • OT remark: The idiomatic way to test for None in Python is "is". Use "item is not None" instead of "item != None" Commented Aug 22, 2009 at 17:13

3 Answers

44

No need for regular expressions here. Use str.isalpha. With and without list comprehensions:

my_list = ['foo','bar','baz','>=','5.2']

# With
only_words = [token for token in my_list if token.isalpha()]

# Without (on Python 3, filter() returns a lazy iterator, so wrap it in list())
only_words = list(filter(str.isalpha, my_list))

Personally I don't think you have to use a list comprehension for everything in Python, but I always get frowny-faced when I suggest map or filter answers.
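Since the question starts from a set rather than a list, here is a minimal sketch of the same str.isalpha idea applied to the set d from the question, assuming Python 3 (or 2.7) where set comprehensions are available. Note that str.isalpha accepts any Unicode letter on Python 3, so it is slightly broader than the [a-z] pattern in the Perl example.

```python
# The set from the question.
d = {'foo', 'bar', 'baz', '>=', '5.2'}

# Set comprehension: keep only the purely alphabetic tokens.
only_words = {token for token in d if token.isalpha()}

# sorted() is used only to make the printed order stable.
print(sorted(only_words))
```

The result is a new set; the original d is left untouched.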

2 Comments

Filter without lambda is A-OK
Filter without lambda in this (surprisingly common) case is A-OK as long as you're not mixing str and unicode objects; otherwise all hell breaks loose.
4

What about using the filter function? In this example we keep only the even numbers in list_1.

 list_1 = [1, 2, 3, 4, 5]
 filtered_list = filter(lambda x : x%2 == 0, list_1)
 print(list(filtered_list))

This prints a new list that contains only the even numbers from list_1.

1 Comment

The accepted answer recommends filter(). You should upvote that rather than duplicating it.
2

How about

d = set([item for item in d if re.match("^[a-zA-Z]+$",item)])

That gives you just the values you want, back in d (the order may be different, but that's the price you pay for using sets).
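If the set needs to be changed in place rather than rebound to a new object (which is roughly what passing a predicate to remove() would have done), one possible sketch uses set.difference_update. The alias name below is hypothetical and is there only to show that the same object is modified.

```python
d = {'foo', 'bar', 'baz', '>=', '5.2'}
alias = d  # hypothetical second reference to the same set object

# The unwanted items are collected into a separate set first, then removed
# in one call, so d is never mutated while it is being iterated.
d.difference_update({item for item in d if not item.isalpha()})

print(alias is d, sorted(d))
```

Both names still refer to the same set afterwards, which is not the case when d is reassigned.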

1 Comment

match() makes the "^" useless: re.match("[a...").
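A small sketch of the point in this comment: re.match only ever matches at the start of the string, so the leading "^" changes nothing, while the trailing "$" still matters. The token 'baz5' is a made-up addition to the question's data to show the difference; re.fullmatch requires Python 3.4 or later.

```python
import re

# Tokens from the question, plus the hypothetical 'baz5'.
tokens = ['foo', 'bar', 'baz', '>=', '5.2', 'baz5']

# re.match is anchored at the start, so "^" is redundant here.
with_caret = [t for t in tokens if re.match("^[a-zA-Z]+$", t)]
without_caret = [t for t in tokens if re.match("[a-zA-Z]+$", t)]

# Without "$", a token that merely starts with letters also passes.
no_dollar = [t for t in tokens if re.match("[a-zA-Z]+", t)]

# re.fullmatch anchors both ends without any explicit anchors.
full = [t for t in tokens if re.fullmatch("[a-zA-Z]+", t)]

print(with_caret == without_caret == full)
print(no_dollar)
```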
