As a pet project, I'm writing a Python CSS-selector library that lets one write selectors as Python expressions like the one below. The goal of the library is to represent selectors in a flat, intuitive and interesting way; all valid syntax defined by the Selectors Level 4 draft must be supported, one way or another.
# lorem|foo.bar[baz^="qux"]:has(:invalid)::first-line
selector = (
    (Namespace('lorem') | Tag('foo'))
    .bar
    # Can also be written as [Attribute('baz').starts_with('qux')]
    [Attribute('baz', '^=', 'qux')]
    # '>>' is used instead of ' '.
    [:'has', (Selector.SELF >> PseudoClass('invalid'),)]
    [::'first-line']
)
Here's what the hierarchy looks like (/ signifies an alias, () mixin superclasses):
Selector(ABC) # Enum too?
├── PseudoElement
├── ComplexSelector(Sequence[CompoundSelector | Combinator])
├── CompoundSelector(Sequence[SimpleSelector])
├── SimpleSelector
│ ├── TypeSelector / Tag
│ ├── UniversalSelector
│ ├── AttributeSelector / Attribute
│ ├── ClassSelector / Class
│ ├── IDSelector / ID
│ └── PseudoClass
├── SELF / PseudoClass('scope')
└── ALL / UniversalSelector()
Combinator
├── ChildCombinator: '__gt__' / '>'
├── DescendantCombinator: '__rshift__' / '>>'
├── NamespaceSeparator: '__or__' / '|'
├── NextSiblingCombinator: '__add__' / '+'
├── SubsequentSiblingsCombinator: '__sub__' / '-'
└── ColumnCombinator: '__floordiv__' / '//'
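To make the operator mapping above concrete, here is a minimal sketch of how it could be wired up. The `Sel` class and its `css` attribute are hypothetical stand-ins that only build a string; they are not the library's actual classes.

```python
class Sel:
    """Hypothetical stand-in for a selector; it only stores CSS text."""

    def __init__(self, css):
        self.css = css

    def _join(self, combinator, other):
        return Sel(f"{self.css}{combinator}{other.css}")

    def __gt__(self, other):        # child combinator: a > b
        return self._join(" > ", other)

    def __rshift__(self, other):    # descendant combinator: a b
        return self._join(" ", other)

    def __or__(self, other):        # namespace separator: ns|a
        return self._join("|", other)

    def __add__(self, other):       # next-sibling combinator: a + b
        return self._join(" + ", other)

    def __sub__(self, other):       # subsequent-sibling combinator: a ~ b
        return self._join(" ~ ", other)

    def __floordiv__(self, other):  # column combinator: a || b
        return self._join(" || ", other)


a, b, c = Sel("a"), Sel("b"), Sel("c")
print((a >> b).css)       # a b
print((a - b).css)        # a ~ b
print((a // b).css)       # a || b
print(((a > b) > c).css)  # a > b > c
print((a > b > c).css)    # b > c  (chained comparison, see note)
```

One thing the sketch makes visible: Python treats `a > b > c` as the chained comparison `(a > b) and (b > c)`, so without explicit parentheses the first operand is silently dropped. The other operators do not chain this way, but they do follow Python's normal precedence (for example, `>>` binds tighter than `>`).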
This design has some disadvantages:
The replacements of combinators:
- Descendant combinator (' ') → right shift (>>)
- Column combinator (||) → floor division (//)
- Subsequent-siblings combinator (~) → minus/subtract (-)

>> and // are currently not valid combinators, but may be in the future. The last is much safer, since - is already considered a valid character for <ident-token>s.
Functional pseudo-classes need a comma between their name (a string/non-callable) and their arguments (a tuple):
[:'where', (Class('foo'), Class('bar'))]
These disadvantages may need to be kept in mind when reworking the design around the following limitations:
- HTML classes with hyphens cannot be added with Python dotted attribute syntax (.foo-bar); not to mention, this also means that any classes that implement this syntax using __getattr__/__getattribute__ won't be able to have any methods.
- Currently there is no way to add an ID in the middle of a compound selector. Since Python doesn't have a # operator I'm at a loss. I have thought about overloading __call__ but Tag('foo').bar('baz') or Tag('foo')[Attribute('qux')]('baz') would look too much like a normal method call.
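To illustrate the first limitation, here is a small sketch of the dotted class syntax built on `__getattr__`. The `Tag` class, its `classes` tuple, and the `render` method are hypothetical and only serve the demonstration.

```python
class Tag:
    """Hypothetical stand-in: a tag name plus a tuple of class names."""

    def __init__(self, name, classes=()):
        self.name = name
        self.classes = tuple(classes)

    def __getattr__(self, class_name):
        # Only called when normal attribute lookup fails, so real
        # attributes and methods always win over CSS class names.
        if class_name.startswith("__"):
            raise AttributeError(class_name)
        return Tag(self.name, self.classes + (class_name,))

    def render(self):
        return self.name + "".join(f".{c}" for c in self.classes)


print(Tag("foo").bar.render())                  # foo.bar
# A hyphenated name is not a valid identifier, so it needs getattr().
print(getattr(Tag("foo"), "foo-bar").render())  # foo.foo-bar
# 'render' resolves to the method, so a CSS class named "render"
# cannot be reached through the dotted syntax.
print(callable(Tag("foo").render))              # True
```

In this sketch every real attribute (`name`, `classes`, `render`) shadows a CSS class of the same name, which is the method conflict described above; with `__getattribute__` the conflict runs the other way unless the method names are special-cased.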
How should I go about working around these limitations?
:'has' instead of ':has' etc. selector[:'has'] calls __getitem__() with a slice(None, 'has', None) (and [::'first-line'] with slice(None, None, 'first-line')).
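The slice behaviour can be checked with a tiny probe object. `Probe` and `classify` below are hypothetical helpers written for this illustration, a sketch of how the keys might be told apart rather than the library's real dispatch.

```python
class Probe:
    """Hypothetical helper that just returns what __getitem__ receives."""

    def __getitem__(self, key):
        return key


probe = Probe()

print(probe[:'has'])          # slice(None, 'has', None)
print(probe[::'first-line'])  # slice(None, None, 'first-line')
print(probe[:'where', ('foo', 'bar')])
# (slice(None, 'where', None), ('foo', 'bar'))


def classify(key):
    """Map a subscript key to a (kind, name, args) triple."""
    args = ()
    if isinstance(key, tuple):   # functional form: [:'name', (args...)]
        key, args = key
    if not isinstance(key, slice):
        raise TypeError(f"unsupported key: {key!r}")
    if key.step is not None:     # [::'name'] -> pseudo-element
        return ("pseudo-element", key.step, args)
    return ("pseudo-class", key.stop, args)  # [:'name'] -> pseudo-class


print(classify(probe[:'where', ('foo', 'bar')]))
# ('pseudo-class', 'where', ('foo', 'bar'))
```

Note that the functional form arrives as a 2-tuple of a slice and the argument tuple, which is why the comma between the name and the arguments is required.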