I have some CSS and I'm looking to create a list of all the class names and identifiers. This is what I have:

var TheList = new List<string>();
var Test2 = Regex.Matches(TheCSS, ".-?[_a-zA-Z]+[_a-zA-Z0-9-]*(?=[^}]*\\{)");

foreach(Match m in Test2)
{
    TheList.Add(m.Value);
}

The problem is that there are some unwanted elements:

body
:hover
select
input
label
[for
input
[type
'radio

I've tried several regular expressions that I found online; this one is the closest, but it's not perfect yet. Basically, it needs to include only names that begin with # or ., so that things like body and [type are excluded, and it must not include pseudo-selectors like :hover.

What do I need to change in the regex to make it work?

1 Answer

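As a practical approximation of the CSS grammar, a class or ID name matches [_A-Za-z0-9\-]+ (the full grammar also allows escapes and non-ASCII characters, and forbids a leading digit). A class or ID selector is therefore that name prefixed directly by either a dot (.) for a class or a # for an ID.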

Having determined that, all you need to do is ensure that the match is followed by a { before any } occurs. That confirms you are in a selector and not inside a declaration block.

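The resulting regexp would then be: ([\.#][_A-Za-z0-9\-]+)[^}]*{

The class or ID itself is in capture group 1; the full match also includes everything up to the opening {.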

Your sample case. Same regexp applied to Facebook CSS.
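
For illustration, here is a minimal C# sketch of how the pattern above could be applied. The class name CssNameExtractor and the sample CSS are hypothetical, not from the original question, and the trailing { is escaped only for readability. It reads capture group 1 rather than m.Value, because the full match runs on to the opening brace.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper name, used only for this sketch.
public static class CssNameExtractor
{
    // The pattern from this answer; the class or ID is capture group 1.
    private static readonly Regex SelectorName =
        new Regex(@"([\.#][_A-Za-z0-9\-]+)[^}]*\{");

    public static List<string> Extract(string css)
    {
        var names = new List<string>();
        foreach (Match m in SelectorName.Matches(css))
        {
            // m.Value would also contain the text up to '{', so use group 1.
            names.Add(m.Groups[1].Value);
        }
        return names;
    }
}
```

Called on the three rules `body { margin: 0; }`, `.menu:hover { color: #fff; }` and `#header .nav { padding: 4px; }`, this should return .menu and #header. The colour value #fff is skipped because a } follows it before any {. Note that .nav is not returned, since the match for #header has already consumed the rest of that selector.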

3 Comments

Ok, this looks great. I changed it to ([\\.#][_A-Za-z0-9\\-]+)[^}][*{] to make sure to include * as an ending possibility.
It doesn't work if a selector contains multiple classes or ids. It will match only the first id or class from a selector. So this way you cannot extract all of them.
Based on that, a better solution would be to use a positive lookahead: ([\.#][_A-Za-z0-9\-]+)(?=[^}]+{)
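
A rough C# sketch of the lookahead idea follows; the class name CssLookaheadExtractor and the sample input are hypothetical. One deliberate change from the pattern in the comment: the lookahead uses [^}]* instead of [^}]+. With +, a name written directly against the brace (minified CSS such as .nav{) would backtrack and be captured as .na. Because a lookahead consumes nothing, every class and ID in a compound selector gets its own match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper name, used only for this sketch.
public static class CssLookaheadExtractor
{
    // Lookahead variant: the name must be followed by '{' before any '}'.
    // Assumption: '+' from the comment is relaxed to '*' so ".nav{" works.
    private static readonly Regex SelectorName =
        new Regex(@"[\.#][_A-Za-z0-9\-]+(?=[^}]*\{)");

    public static List<string> Extract(string css)
    {
        var names = new List<string>();
        foreach (Match m in SelectorName.Matches(css))
        {
            // The lookahead is not part of the match, so m.Value is the name.
            names.Add(m.Value);
        }
        return names;
    }
}
```

For the input `.menu:hover { color: #fff; }` followed by `#header .nav{padding:4px}`, this should return .menu, #header and .nav. If duplicates are unwanted, the result could be passed through Distinct() from System.Linq.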
