1

I have this list of strings in Python 2.7:

list_a = ['temp_52_head sensor, uploaded by TS',
          'crack in the left quadrant, uploaded by AB, Left in 2hr sunlight',
          'FSL_pressure, uploaded by RS, no reported vacuum',
          'art 9943_mercury, Uploaded by DY, accelerated, hurst potential too low',
          'uploaded by KKP, Space 55',
          'avogadro reading level, uploaded by HB, started mini counter, pulled lever',
          'no comment yesterday, Uploaded to TFG, level 1 escape but temperature stable, pressure lever north']

In each list item, there is a string

uploaded by SOMEONE

I need to extract SOMEONE.

However, as you can see, SOMEONE:

  1. changes from one item in the list to the next.
  2. can be 2 or 3 characters in length (text only, no numbers).
  3. occurs at different positions in the string.
  4. uploaded also occurs as Uploaded
  5. uploaded sometimes occurs before any comma

Here is what I need to pull out:

someone_names = ['TS','AB','RS','DY','KKP','HB','TFG']

I was thinking of using regular expressions, but points 2 and 3 above are causing me problems.

Is there a way to pull out these characters from the list?

4 Answers

4

You can apply a compiled regular expression to each item inside a list comprehension.

>>> import re
>>> list_a = [
...     'temp_52_head sensor, uploaded by TS',
...     'crack in the left quadrant, uploaded by AB, Left in 2hr sunlight',
...     'FSL_pressure, uploaded by RS, no reported vacuum',
...     'art9943_mercury, Uploaded by DY, accelerated, hurst potential too low',
...     'uploaded by KKP, Space 55',
...     'avogadro reading level, uploaded by HB, started mini counter, pulled lever',
...     'no comment yesterday, Uploaded to TFG, level 1 escape but temperature stable,pressure lever north'
... ]
>>> regex = re.compile(r'(?i)\buploaded\s*(?:by|to)\s*([a-z]{2,3})')
>>> names = [m.group(1) for x in list_a for m in [regex.search(x)] if m]
>>> names
['TS', 'AB', 'RS', 'DY', 'KKP', 'HB', 'TFG']
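
For readers new to re.compile(), here is a minimal sketch of the same pattern written with re.VERBOSE so that each piece can carry a comment, plus the list comprehension unrolled into a plain loop. The short `sample` list is made up for illustration and is not the full data from the question.

```python
import re

# Same pattern as above, spelled out piece by piece. In VERBOSE mode,
# whitespace inside the pattern is ignored and "#" starts a comment.
regex = re.compile(r"""
    \b uploaded      # the word "uploaded", starting at a word boundary
    \s*              # optional whitespace
    (?:by|to)        # "by" or "to" (non-capturing group)
    \s*              # optional whitespace
    ([a-z]{2,3})     # group 1: two or three letters, the uploader
    """, re.IGNORECASE | re.VERBOSE)

# Illustrative sample only (an assumption, not the question's full list).
sample = [
    'temp_52_head sensor, uploaded by TS',
    'no comment yesterday, Uploaded to TFG, level 1 escape',
    'nothing relevant here',
]

# Loop form of the list comprehension: search each string, keep group 1
# only when a match was found.
names = []
for x in sample:
    m = regex.search(x)
    if m:
        names.append(m.group(1))

print(names)
```

Because of re.IGNORECASE (the `(?i)` in the one-line version), `[a-z]` also matches uppercase letters, which is why writing `[a-z]` or `[A-Z]` makes no difference here.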

3 Comments

Hi, this works, but I have minimal experience with re.compile(). Could you please explain those 2 lines, in particular the first one?
I have one last question: The method works even if I use [a-z] instead of [A-Z]. Why did you use uppercase for the alphabets?
Thanks a lot! All my questions have been answered. I found this link to be really good for re.compile(): diveintopython3.net/regular-expressions.html.
1

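A more verbose approach, which builds the pattern from escaped literal strings, could be this:

import re
# x must be an integer index into list_a; this only matches lowercase "uploaded by NAME,"
m = re.search(re.escape("uploaded by ") + "(.*?)" + re.escape(","), list_a[x])
name = m.group(1) if m else None  # None when the item has no trailing comma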

1 Comment

^^^^^ I get this error message TypeError: list indices must be integers, not str.
0

It looks like a regex such as this would fit your requirements, unless I'm missing something:

/[Uu]ploaded (?:by|to) ([A-Z]{2,3})\b/

Alternatively, it appears (from your sample) that you could also split the string over commas and pull the element from the array that has the string "ploaded by" (avoids the possibility of upper/lower "u"), split it over spaces, then take the last element in the resulting array.
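
The split-based idea could be sketched roughly as follows. The helper name `extract_uploader` is hypothetical, and accepting "ploaded to" as well as "ploaded by" is an assumption added so that the last sample item ("Uploaded to TFG") is covered.

```python
list_a = [
    'temp_52_head sensor, uploaded by TS',
    'crack in the left quadrant, uploaded by AB, Left in 2hr sunlight',
    'FSL_pressure, uploaded by RS, no reported vacuum',
    'art9943_mercury, Uploaded by DY, accelerated, hurst potential too low',
    'uploaded by KKP, Space 55',
    'avogadro reading level, uploaded by HB, started mini counter, pulled lever',
    'no comment yesterday, Uploaded to TFG, level 1 escape but temperature stable,pressure lever north',
]

def extract_uploader(item):
    """Return the uploader initials from one item, or None if absent.

    Hypothetical helper: splits on commas, finds the piece containing
    "ploaded by"/"ploaded to", and returns its last word.
    """
    for part in item.split(','):
        part = part.strip()
        # Matching on "ploaded" avoids caring about upper/lower-case "u".
        if 'ploaded by ' in part or 'ploaded to ' in part:
            return part.split()[-1]
    return None

someone_names = [extract_uploader(item) for item in list_a]
print(someone_names)
```

This relies on the uploader always being the last word of its comma-separated piece, which holds for the sample data but is not guaranteed in general.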

0

This regex matches every item in your list, and it keeps working if the number of letters in the uploader's initials changes. It matches whether the initials are followed by a comma or by the end of the string, and it captures the data you are looking for:

import re

m = re.compile(r'uploaded ((by)|(to)) ([a-z]+)', flags=re.IGNORECASE)

You can then call search() on the compiled pattern object m for each string in the list. It returns the first match in that string, and the 4th capture group of that match, group(4), is the data you are looking for.
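
As a minimal sketch of that usage, assuming the list from the question and hypothetical names `pattern` and `someone_names`:

```python
import re

# Groups: 1 = "by" or "to", 2 = "by", 3 = "to", 4 = the uploader initials.
pattern = re.compile(r'uploaded ((by)|(to)) ([a-z]+)', flags=re.IGNORECASE)

list_a = [
    'temp_52_head sensor, uploaded by TS',
    'crack in the left quadrant, uploaded by AB, Left in 2hr sunlight',
    'FSL_pressure, uploaded by RS, no reported vacuum',
    'art9943_mercury, Uploaded by DY, accelerated, hurst potential too low',
    'uploaded by KKP, Space 55',
    'avogadro reading level, uploaded by HB, started mini counter, pulled lever',
    'no comment yesterday, Uploaded to TFG, level 1 escape but temperature stable,pressure lever north',
]

someone_names = []
for item in list_a:
    match = pattern.search(item)   # first match in this string, or None
    if match:
        someone_names.append(match.group(4))

print(someone_names)
```

Since search() stops at the first match, it is called once per string here rather than once on a joined string.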

3 Comments

Hi, this seems like most simple answer, but re.IGNORECASE says module object has no attribute IGNORECASE. Does that not work for Python 2.7?
Ah, it needed the flags=. fixed it.
I barely know Python, but I wonder if you could just join your array of strings into one long string, call the search( ) function on it, and use the .group function. I believe it would be in group(4)
