How it works
Prompt generator
The prompt generator is more of a utility than an essential component of SaxaMLL. Power users can forgo the prompt generator completely since they likely have customized prompts that work much better for their use cases.
However, for simple applications, the prompt generator will get you up and running.
Events
Broadly, there are two classes of events:
🔵 events that change "scope"
🔴 events that don't change "scope"
By "scope", we mean how deep you are in the AST. Some examples to illustrate the two classes of events:
🔵 A non-self-closing tag is opened, e.g. <tweet>
🔵 A non-self-closing tag is closed, e.g. </tweet>
🔴 A self-closing tag is placed, e.g. <tweet />
🔴 Any text is added, e.g. hello my name is alex
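To make the two classes concrete, here is a minimal sketch that classifies already-split tokens and tracks how deep we are after each one. It does not use SaxaMLL itself: `classify` and `depthAfter` are hypothetical helpers written for this illustration, and the input is assumed to be pre-tokenized.

```typescript
// Hypothetical helpers for illustration only; not part of SaxaMLL.
type EventClass = "scope-changing" | "non-scope-changing";

// An opening or closing tag changes scope; a self-closing tag or text does not.
function classify(token: string): EventClass {
  const isTag = token.startsWith("<") && token.endsWith(">");
  const isSelfClosing = isTag && token.endsWith("/>");
  return isTag && !isSelfClosing ? "scope-changing" : "non-scope-changing";
}

// Returns the depth in the tree after each token has been consumed.
function depthAfter(tokens: string[]): number[] {
  let depth = 0;
  return tokens.map((token) => {
    if (classify(token) === "scope-changing") {
      depth += token.startsWith("</") ? -1 : 1;
    }
    return depth;
  });
}

const tokens = ["<tweet>", "hello my name is alex", "<tweet />", "</tweet>"];
console.log(tokens.map(classify));
console.log(depthAfter(tokens)); // [1, 1, 1, 0]
```

Only the first and last tokens move the depth, which is exactly what separates 🔵 events from 🔴 events.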
Now, there are three types of events, each of which can be placed within the two classes above:
🔵 tagOpen = when a tag is opened.
🔵 tagClose = when a tag is closed.
🔴 update = when a child to the current node has been added.
❓ The update event is really only useful when you need results immediately. Otherwise, the tagOpen and tagClose events are sufficient.
🔵 Events of the "scope-changing" type accept callbacks of the form (node: XMLNode) => {}. In the tagOpen case, the node parameter is the node whose scope we have just entered. In the tagClose case, the node parameter is the node whose scope we have just left.
🔴 Events of the "non-scope-changing" type accept callbacks of the form ([parent: XMLNode, child: XMLNode, isCommitted: boolean]) => {}. The parent parameter is the node that we're currently adding children to. The child parameter is the node that we're currently constructing.
The isCommitted flag is true when the update event actually adds the child to the parent. For example, text nodes will always have isCommitted as true.
On the other hand, the update event might fire when the child node hasn't been fully constructed yet (see Example with self-closing tag). In this case, isCommitted will be false, but you can still inspect what the partial child node looked like when the update was requested.
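The two callback shapes can be written down as TypeScript types. The sketch below is an illustration under assumptions: the XMLNode interface here is a minimal stand-in (the fields tag, content and children are assumed, and the real type may differ), and the callbacks are driven by hand rather than by the parser.

```typescript
// Assumed minimal node shape for this sketch; the real XMLNode may differ.
interface XMLNode {
  tag?: string;
  content?: string;
  children: XMLNode[];
}

// Callback shapes as described above.
type ScopeCallback = (node: XMLNode) => void;
type UpdateCallback = (
  args: [parent: XMLNode, child: XMLNode, isCommitted: boolean]
) => void;

const log: string[] = [];

const onTagOpen: ScopeCallback = (node) => {
  log.push(`open:${node.tag}`);
};

const onUpdate: UpdateCallback = ([parent, child, isCommitted]) => {
  if (!isCommitted) return; // ignore children that are still being built
  log.push(`update:${parent.tag}:${child.content ?? child.tag}`);
};

// Drive the callbacks by hand to mimic what a parser would do.
const tweet: XMLNode = { tag: "tweet", children: [] };
onTagOpen(tweet);

const text: XMLNode = { content: "hello", children: [] };
tweet.children.push(text);
onUpdate([tweet, text, true]); // text nodes are always committed

const media: XMLNode = { tag: "media", children: [] };
onUpdate([tweet, media, false]); // partial self-closing tag, skipped
tweet.children.push(media);
onUpdate([tweet, media, true]); // committed, recorded

console.log(log);
```

Checking isCommitted first is a simple way to react only to finished children while still allowing other handlers to peek at partial ones.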
Example with text updates at the root level
Suppose our parser is chomping through this text:
Let's define a simple callback:
If you want to be explicit about which level you want to fire on, you can also define:
Remember, for update events, the for(...) refers to whichever parent you're adding children to.
We parse the incoming stream, token-by-token:
The events are fired at these points:
Example with text updates at an inner level
Suppose our parser is chomping through this text:
Let's define a simple callback:
Remember, for update events, the for(...) refers to whichever parent you're adding children to.
We parse the incoming stream, token-by-token:
The events are fired at these points:
Example with text updates at nested levels
Suppose our parser is chomping through this text:
Let's define a simple callback:
Remember, for update events, the for(...) refers to whichever parent you're adding children to.
We parse the incoming stream, token-by-token:
The events are fired at the following points: 🟡 for keyword updates, and 🟣 for root updates.
Example with scope changes
Suppose our parser is chomping through this text:
Let's define these callbacks:
❓ At the time of the tagOpen callback, you will have access to all of the attributes.
❓ At the time of the tagClose callback, you will have access to all of the attributes, plus all the children of the imageSearch node.
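The difference between what the two callbacks can see is sketched below. This is a hand-driven illustration, not SaxaMLL code: SketchNode is an assumed node shape, the query attribute is hypothetical, and children are reduced to plain strings to keep the example small.

```typescript
// Assumed node shape for illustration only; SaxaMLL's XMLNode may differ.
interface SketchNode {
  tag: string;
  attributes: Record<string, string>;
  children: string[]; // text children only, to keep the sketch small
}

const seen: string[] = [];

function describe(node: SketchNode): string {
  const attrs = Object.keys(node.attributes).join(",");
  return `attrs=${attrs} children=${node.children.length}`;
}

// At tagOpen the attributes are known, but no children exist yet.
function onOpen(node: SketchNode): void {
  seen.push(`open ${describe(node)}`);
}

// At tagClose the same attributes are available, plus every child.
function onClose(node: SketchNode): void {
  seen.push(`close ${describe(node)}`);
}

// Mimic the parser: open the tag, add a child, then close the tag.
const imageSearch: SketchNode = {
  tag: "imageSearch",
  attributes: { query: "cats" }, // hypothetical attribute
  children: [],
};
onOpen(imageSearch);
imageSearch.children.push("a text child");
onClose(imageSearch);

console.log(seen);
```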
We parse the incoming stream, token-by-token:
Then the events are fired at these points:
Example with self-closing tag
Suppose our parser is chomping through this text:
Let's define a callback on the self-closing tag:
Remember, for update events, the for(...) refers to whichever parent you're adding children to.
We parse the incoming stream, token-by-token:
Then the events are fired here:
However, suppose we parsed the stream by calling the .update() method at every update:
Then, all these events are fired, but since we only fire on element children that have been committed to the parent, we only do anything on the last update fire (the blue one):
Errors
There are two types of errors:
UNEXPECTED_TOKEN
BAD_CLOSE_TAG
UNEXPECTED_TOKEN errors occur on malformed XML, e.g. <twee<t>. On UNEXPECTED_TOKEN, an error node is created, and the rest of the input is collected into the content field of the error node.
BAD_CLOSE_TAG errors occur on mismatched opening and closing tags, e.g. <tweet></question>. In this case, an error node is created, but the rest of the input is parsed as if the </question> was never encountered. In other words, the opening tag is always stronger than the closing tag.
When either error is encountered, an error event is emitted.
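The "opening tag is stronger" rule for BAD_CLOSE_TAG can be sketched with a small stack-based checker. This is not SaxaMLL's implementation: parseTag, checkCloseTags and the SketchError shape (including its offendingToken field) are hypothetical names for this illustration, and the input is assumed to be pre-tokenized.

```typescript
// Hypothetical error record for this sketch; not SaxaMLL's error node.
interface SketchError {
  errorType: "UNEXPECTED_TOKEN" | "BAD_CLOSE_TAG";
  offendingToken: string;
}

// Recognizes opening and closing tags; self-closing tags and text yield null.
function parseTag(
  token: string
): { kind: "open" | "close"; name: string } | null {
  const isTag = token.startsWith("<") && token.endsWith(">");
  if (!isTag || token.endsWith("/>")) return null;
  const isClose = token.startsWith("</");
  const name = token.slice(isClose ? 2 : 1, -1).trim();
  return { kind: isClose ? "close" : "open", name };
}

// A mismatched closing tag is recorded as an error and otherwise ignored,
// so the currently open tag stays open until its own closing tag arrives.
function checkCloseTags(tokens: string[]): {
  errors: SketchError[];
  stillOpen: string[];
} {
  const stack: string[] = [];
  const errors: SketchError[] = [];
  for (const token of tokens) {
    const tag = parseTag(token);
    if (tag === null) continue;
    if (tag.kind === "open") {
      stack.push(tag.name);
    } else if (stack[stack.length - 1] === tag.name) {
      stack.pop();
    } else {
      errors.push({ errorType: "BAD_CLOSE_TAG", offendingToken: token });
    }
  }
  return { errors, stillOpen: stack };
}

const result = checkCloseTags(["<tweet>", "</question>", "hello", "</tweet>"]);
console.log(result.errors, result.stillOpen);
```

Here </question> produces one BAD_CLOSE_TAG record, and the later </tweet> still closes the tweet tag normally.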