String parsing up to a specific character (Python)

Question

Vogue (@voguemagazine) • Instagram photos and videos

Fashionista (@fashionista_com) • Instagram photos and videos

The Business of Fashion (@bof) • Instagram photos and videos

I parsed the strings above from the <title> tag of Instagram pages.

I need to extract the screen name, which is everything before (@....) in each string above.

For my examples above, it will be Vogue, Fashionista, and The Business of Fashion respectively.

I tried something like

string.split(' ')[0].replace('\n', '')

but this returns only the first space-separated token, so it fails for a multi-word name such as The Business of Fashion.
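
For illustration, here is a minimal sketch of what that attempt returns for the three titles quoted above. The variable names `titles` and `first_tokens` are assumptions added for this example and are not from the original post.

```python
# The three <title> strings quoted in the question.
titles = [
    "Vogue (@voguemagazine) • Instagram photos and videos",
    "Fashionista (@fashionista_com) • Instagram photos and videos",
    "The Business of Fashion (@bof) • Instagram photos and videos",
]

# The attempted approach: keep only the first space-separated token.
first_tokens = [t.split(' ')[0].replace('\n', '') for t in titles]
print(first_tokens)  # ['Vogue', 'Fashionista', 'The']
```

The single-word names come out right, but the third title is cut down to "The".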

artona · Accepted Answer · 2018-11-19 06:18:41Z


The "re" module will help. The pattern below makes this possible:

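import re

# Raw string, so the backslashes reach the regex engine unchanged.
pattern = re.compile(r"(.+?) \(@.*?\)")

string = "Vogue (@voguemagazine) • Instagram photos and videos"
# With one capture group, findall returns only the captured text of each match.
word = pattern.findall(string)[0]  # "Vogue"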

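In the pattern "(.+?) \(@.*?\)":

(.+?) - lazily captures all characters before the space (" ") and the opening parenthesis;
\(@.*?\) - matches the part in parentheses (i.e. between the literal "\(" and "\)"): "@" followed by any other characters (".*?"). This part is matched but not captured, so it is not returned.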
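
The pattern can be applied to all three titles from the question. The following is a minimal sketch rather than part of the original answer: the helper name `screen_name` is hypothetical, and it uses `re.search` in place of `findall(...)[0]` so that a title with no "(@...)" part yields None instead of raising an IndexError.

```python
import re

# Same pattern as above, written as a raw string.
PATTERN = re.compile(r"(.+?) \(@.*?\)")


def screen_name(title):
    """Hypothetical helper: return the text before " (@handle)", or None."""
    match = PATTERN.search(title)
    return match.group(1) if match else None


titles = [
    "Vogue (@voguemagazine) • Instagram photos and videos",
    "Fashionista (@fashionista_com) • Instagram photos and videos",
    "The Business of Fashion (@bof) • Instagram photos and videos",
]

names = [screen_name(t) for t in titles]
print(names)  # ['Vogue', 'Fashionista', 'The Business of Fashion']
```

Because "(.+?)" is lazy, it stops at the first " (@" it can reach, which is why spaces inside the name itself are kept.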

