Natural language for human and machine.
NLCST discloses the parts of natural language as a concrete syntax tree. Concrete means all information is stored in this tree and an exact replica of the original document can be re-created.
NLCST is a subset of Unist, and implemented by retext.
This document describes version 1.0.0 of NLCST. Changelog »
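The "exact replica" property can be illustrated with a minimal sketch. The shapes below are hypothetical TypeScript stand-ins, not part of NLCST or retext: only the `type` strings come from this document, while the `children` and `value` fields are assumed from Unist's Parent and Text interfaces. The tree is one plausible representation of the input "Hello, world."

```typescript
// Hypothetical TypeScript shapes, for illustration only. The `type` strings
// come from this document; `children` and `value` are assumed from Unist.
interface LeafSketch {
  type: "TextNode" | "SymbolNode" | "PunctuationNode" | "WhiteSpaceNode" | "SourceNode";
  value: string;
}

interface ParentSketch {
  type: "RootNode" | "ParagraphNode" | "SentenceNode" | "WordNode";
  children: NodeSketch[];
}

type NodeSketch = LeafSketch | ParentSketch;

// Concatenate every leaf value in document order.
function stringify(node: NodeSketch): string {
  return "children" in node ? node.children.map(stringify).join("") : node.value;
}

// One possible tree for the input "Hello, world."
const tree: NodeSketch = {
  type: "RootNode",
  children: [{
    type: "ParagraphNode",
    children: [{
      type: "SentenceNode",
      children: [
        { type: "WordNode", children: [{ type: "TextNode", value: "Hello" }] },
        { type: "PunctuationNode", value: "," },
        { type: "WhiteSpaceNode", value: " " },
        { type: "WordNode", children: [{ type: "TextNode", value: "world" }] },
        { type: "PunctuationNode", value: "." },
      ],
    }],
  }],
};

// Because punctuation and white space are kept as nodes, nothing is lost.
const roundTrip: string = stringify(tree);
```

Since every character of the input lives in some leaf, joining the leaves in order gives back the original string.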
Root (Parent) houses all nodes.
interface Root <: Parent {
type: "RootNode";
}

Paragraph (Parent) represents a self-contained unit of
discourse in writing dealing with a particular point or idea.
interface Paragraph <: Parent {
type: "ParagraphNode";
}

Sentence (Parent) represents a grouping of grammatically
linked words that, in principle, expresses a complete thought, although it
may make little sense taken in isolation out of context.
interface Sentence <: Parent {
type: "SentenceNode";
}

Word (Parent) represents the smallest element that may
be uttered in isolation with semantic or pragmatic content.
interface Word <: Parent {
type: "WordNode";
}

Symbol (Text) represents typographical devices like
white space, punctuation, signs, and more, different from characters
which represent sounds (like letters and numerals).
interface Symbol <: Text {
type: "SymbolNode";
}

Punctuation (Symbol) represents typographical devices
which aid understanding and correct reading of other grammatical
units.
interface Punctuation <: Symbol {
type: "PunctuationNode";
}

WhiteSpace (Symbol) represents typographical devices
devoid of content, separating other grammatical units.
interface WhiteSpace <: Symbol {
type: "WhiteSpaceNode";
}

Source (Text) represents an external (ungrammatical) value
embedded into a grammatical unit: a hyperlink, a line, and such.
interface Source <: Text {
type: "SourceNode";
}

TextNode (Text) represents actual content in an NLCST
document: one or more characters. Note that its type property
is TextNode, but it is different from the abstract Text
interface.
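The difference between the abstract Text interface and the concrete TextNode type can be sketched with a few predicates. The helper names (`TypedNode`, `isSymbol`, `isText`, `isTextNode`) are hypothetical and do not belong to any NLCST utility; the groupings follow the parent interface named in each heading of this document, with Source treated as a direct descendant of Text.

```typescript
// Hypothetical helpers; only the `type` strings come from this document.
interface TypedNode {
  type: string;
}

// Types whose interface extends Symbol (Symbol itself included).
const SYMBOL_TYPES: string[] = ["SymbolNode", "PunctuationNode", "WhiteSpaceNode"];

// Types whose interface extends the abstract Text interface.
// Assumption: Source extends Text directly, as its heading states.
const TEXT_TYPES: string[] = SYMBOL_TYPES.concat(["SourceNode", "TextNode"]);

// True for Symbol and everything derived from it.
function isSymbol(node: TypedNode): boolean {
  return SYMBOL_TYPES.indexOf(node.type) !== -1;
}

// True for any node that implements the abstract Text interface.
function isText(node: TypedNode): boolean {
  return TEXT_TYPES.indexOf(node.type) !== -1;
}

// True only for the concrete node whose type property is "TextNode".
function isTextNode(node: TypedNode): boolean {
  return node.type === "TextNode";
}

const comma: TypedNode = { type: "PunctuationNode" };
const letters: TypedNode = { type: "TextNode" };
```

Under these assumptions, `comma` satisfies `isText` and `isSymbol` but not `isTextNode`, while `letters` satisfies `isText` and `isTextNode` but not `isSymbol`.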
interface TextNode <: Text {
type: "TextNode";
}

wooorm/nlcst-is-literal: Check whether a node is meant literally;
wooorm/nlcst-normalize: Normalize a word for easier comparison;
wooorm/nlcst-search: Search for patterns in an NLCST tree;
wooorm/nlcst-to-string: Stringify a node;
wooorm/nlcst-test: Validate an NLCST node.
In addition, see Unist for other utilities which work with retext nodes.
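To give a rough idea of what such tree utilities do, here is a minimal sketch of a type-based visitor. It is not the API of any package listed above: `WalkNode`, `visit`, and `textOf` are hypothetical names, and the `children` and `value` fields are assumed from Unist.

```typescript
// Hypothetical node shape; `children` and `value` are assumed from Unist.
interface WalkNode {
  type: string;
  value?: string;
  children?: WalkNode[];
}

// Call `callback` for every node of the given type, in document order.
function visit(node: WalkNode, type: string, callback: (match: WalkNode) => void): void {
  if (node.type === type) {
    callback(node);
  }
  for (const child of node.children || []) {
    visit(child, type, callback);
  }
}

// Join the values of all leaves below (or at) a node.
function textOf(node: WalkNode): string {
  if (node.value !== undefined) {
    return node.value;
  }
  return (node.children || []).map(textOf).join("");
}

// One possible tree for the sentence "A cat."
const sentence: WalkNode = {
  type: "SentenceNode",
  children: [
    { type: "WordNode", children: [{ type: "TextNode", value: "A" }] },
    { type: "WhiteSpaceNode", value: " " },
    { type: "WordNode", children: [{ type: "TextNode", value: "cat" }] },
    { type: "PunctuationNode", value: "." },
  ],
};

// Collect the text of every word in the sentence.
const words: string[] = [];
visit(sentence, "WordNode", (word) => {
  words.push(textOf(word));
});
```

Matching on the `type` string alone keeps the walker generic, so the same function could collect sentences, punctuation, or source nodes.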
MIT © Titus Wormer