0
[Delta-1234, United-1345] Testing different airlines
[Delta-1234] Testing different airlines

I want to get Delta-1234 and United-1345 in the first case and just Delta-1234 in the second. Is it possible using findall?

3
  • I don't see how findall() could do it because you don't want the square brackets in the resulting list. So the square brackets can't be in the pattern. In @CertainPerformance's answer you'd still have to split on commas and remove the square brackets. Commented Jul 31, 2018 at 0:11
  • Oops, you'd just split on the commas for @CertainPerformance's answer. I missed that the square brackets are outside the capture group. That is assuming you want an actual list of flight-like things, e.g. a=[ 'Delta-1234', 'United-1345' ] instead of a list with a single csv-string like b=[ 'Delta-1234, United-1345' ]. Note len(a) == 2 while len(b) == 1. Commented Jul 31, 2018 at 0:31
  • @jgreve That's what I'm after, i.e. len(b) == 2. But I wanted to see if it's possible with just one regex rather than doing a split later. I actually want something like [('Delta', '1234'), ('United', '1345')], which is why I thought findall might be a good option! Commented Jul 31, 2018 at 1:14

4 Answers

1

Do you really need regular expressions? You can just slice out the text between the brackets [ and ]:

x = lambda s: s[s.index('['):s.index("]")+1]

string1 = "[Delta-1234, United-1345] Testing different airlines"
string2 = "[Delta-1234] Testing different airlines"

print(x(string1))
print(x(string2))

outputs

[Delta-1234, United-1345]
[Delta-1234]
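If a list is wanted rather than the bracketed string, the same slicing idea can be followed by a split on commas. This is only a minimal sketch: the helper name bracket_items is made up for illustration, and it assumes the string contains one '[' followed by a ']' with comma-separated items in between.

```python
def bracket_items(s):
    """Return the comma-separated items between the first '[' and the next ']'."""
    start = s.index('[') + 1      # position just after the opening bracket
    end = s.index(']', start)     # first closing bracket after that position
    # Split on commas and trim the surrounding whitespace of each item.
    return [item.strip() for item in s[start:end].split(',')]

print(bracket_items("[Delta-1234, United-1345] Testing different airlines"))
print(bracket_items("[Delta-1234] Testing different airlines"))
```

This should print ['Delta-1234', 'United-1345'] and ['Delta-1234']. Like the slicing above, str.index raises ValueError when a bracket is missing.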

1 Comment

I just wanted a list as output. I'm not sure this lambda gives me a list; it looks like it returns a string.
0

If you want to use a regular expression, just match [, and then (greedily) capture repeated non-]s:

>>> import re
>>> regex = re.compile(r"\[([^\]]+)")
>>> re.findall(regex, "[Delta-1234, United-1345] Testing different airlines")
['Delta-1234, United-1345']
>>> re.findall(regex, "[Delta-1234] Testing different airlines")
['Delta-1234']

Or use a lookbehind:

>>> regex = re.compile(r"(?<=\[)[^\]]+")
>>> re.findall(regex, "[Delta-1234, United-1345] Testing different airlines")
['Delta-1234, United-1345']
>>> re.findall(regex, "[Delta-1234] Testing different airlines")
['Delta-1234']
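If the goal is the list of (airline, number) tuples mentioned in the question comments, one possible single-pattern variant is to match each code and use a lookahead to require a closing bracket further along. This is a sketch under assumptions: codes are taken to look like word characters, a hyphen, then digits, and the name flight_re is made up for illustration.

```python
import re

# Assumption: each code looks like Name-1234 (word characters, hyphen, digits).
# The lookahead requires a ']' later on with no other bracket in between.
# It does not check that a matching '[' came before the code.
flight_re = re.compile(r"(\w+)-(\d+)(?=[^\[\]]*\])")

print(flight_re.findall("[Delta-1234, United-1345] Testing different airlines"))
print(flight_re.findall("[Delta-1234] Testing different airlines"))
```

With two capturing groups, findall returns a list of tuples, so this should print [('Delta', '1234'), ('United', '1345')] and [('Delta', '1234')]. The number of codes inside the brackets does not need to be known in advance.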

4 Comments

So the first out is a list of one item: ['Delta-1234, United-1345']. Can this be split to a list of two items using the regex?
If you want more than one group and want to use findall, the returned value will have to be a list of tuples; there's no way around that. You can use r"\[(\S+)(?:, (\S+))?\]" to capture the first, or the first and second, airline code.
The issue is that the regex wouldn't work if the string becomes [Delta-1234, United-1345, Spirit-8778] Testing different airlines. My point being, the airlines and their code can vary and can be more than 1.
If you want to use findall for this, then you'll need a separate capturing group for each substring. (Repeating a captured group, e.g. r"\[(\S+)(?:, (\S+))*\]", doesn't work because only the last match for the second group would be retained in the result.) While you *could* manually repeat groups like r"\[(\S+)(?:, (\S+))?(?:, (\S+))?\]" (repeat the groups as much as you need), that's pretty messy. Better to keep code DRY and branch out from pure re.findall.
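To make the point about repeated groups concrete, here is a small sketch using the three-airline string from the comment above. It first shows that the repeated group keeps only its last repetition, then one possible way to "branch out": a findall for the bracket contents followed by a second findall inside them. The inner pattern (\w+)-(\d+) is an assumption about what the codes look like.

```python
import re

s = "[Delta-1234, United-1345, Spirit-8778] Testing different airlines"

# A repeated capturing group keeps only its last repetition,
# so 'United-1345' is lost here.
print(re.findall(r"\[(\S+)(?:, (\S+))*\]", s))

# Two steps: grab the bracket contents, then pull (name, number) pairs
# out of each bracketed chunk.
pairs = [
    re.findall(r"(\w+)-(\d+)", inner)
    for inner in re.findall(r"\[([^\]]+)\]", s)
]
print(pairs)
```

The first print should give [('Delta-1234', 'Spirit-8778')], and the second [[('Delta', '1234'), ('United', '1345'), ('Spirit', '8778')]], with one inner list per bracketed group in the input.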
0

Another way to achieve this using regex is:

import re

str1 = "[Delta-1234, United-1345] Testing different airlines"
str2 = "[Delta-1234] Testing different airlines"

regex_pattern = r"[^[]*\[([^]]*)\]"

print(re.match(regex_pattern, str1).group(1))
print(re.match(regex_pattern, str2).group(1))

It will print

Delta-1234, United-1345
Delta-1234

0

Given:

s='''\
[Delta-1234, United-1345] Testing different airlines
[Delta-1234] Testing different airlines'''

You can do:

>>> import re
>>> [e.split(', ') for e in re.findall(r'\[([^]]+)\]', s)]
[['Delta-1234', 'United-1345'], ['Delta-1234']]

3 Comments

If I were to use just [Delta-1234, United-1345] Testing different airlines, then re.findall(r'\[([^]]+)\]', s) only creates one value in the list: ['Delta-1234, United-1345']. However, I am looking for two values in the list. Is that possible?
It is returning a list of lists. The string you state is being split correctly into a two element list inside another list.
You might try changing the e.split(', ') to e.split(',') (i.e., no space after the comma). Or split with a regex.
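A sketch of the "split with a regex" suggestion, in case the spacing after the commas is inconsistent. The input string below is a made-up variant with uneven spacing, and the names result and flat are only for illustration.

```python
import re

# Hypothetical input with inconsistent spacing after the commas.
s = "[Delta-1234,United-1345,  Spirit-8778] Testing different airlines"

# Split each bracketed chunk on a comma followed by any amount of whitespace.
result = [re.split(r",\s*", inner) for inner in re.findall(r"\[([^]]+)\]", s)]
print(result)

# Flatten the list of lists if a single flat list of codes is wanted.
flat = [code for group in result for code in group]
print(flat)
```

This should print [['Delta-1234', 'United-1345', 'Spirit-8778']] and then ['Delta-1234', 'United-1345', 'Spirit-8778']. For a single input line, result[0] gives the same flat list.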
