Getting started

Suppose you want to classify text based on what's placed inside the <sentence></sentence> tags.

SaxaMLL has two components: a prompt generator and a SAX parser.

The prompt generator produces a description of your XML schema that you append to your system prompt (or to a regular prompt).

The SAX parser is an event-based parser that parses XML on the fly and emits events whenever the tags you specify are opened or closed.

TL;DR

import { getText, XMLNodeDescription, SaxaMLLParser } from "saxamll";

/*
First, define an XML description.
*/
const classificationTag = new XMLNodeDescription({
    tag: "classification",
    description: "Put 'positive' if the text inside the <sentence></sentence> tags is positive. Put 'negative' if the text is negative."
})

classificationTag.setExamples([
    {
        input: "<sentence>I'm eating lobsters and I'm so happy.</sentence>",
        output: "<classification>positive</classification>"
    }
])

/*
This generates a description of the <classification> tags.
*/
const classificationDescription = classificationTag.getPrompt();

const saxParser = new SaxaMLLParser();

/*
When </classification> is encountered, save the text between
the <classification> tags in `response`.
*/
let response: string | undefined;
saxParser.executor.upon('tagClose').for(classificationTag).do((node) => {
    response = getText(node);
});


/* 
Parse the input all at once
*/
saxParser.parse("<classification>positive</classification>");
console.log(response); // "positive"

/*
Or parse incrementally, chunk by chunk, as the response streams in
*/
const streamExample = [
    "<class",
    "ification",
    ">",
    "positive",
    "</",
    "classification",
    ">"
]
    
for (const delta of streamExample) {
    saxParser.parse(delta);
}
console.log(response); // "positive"

For those who aren't too lazy to read:

Say we want to classify text based on what's placed inside the <sentence> tags. We first make an XML node description:

const classificationTag = new XMLNodeDescription({
    tag: "classification",
    description: "Put 'positive' if the text inside the <sentence></sentence> tags is positive. Put 'negative' if the text is negative."
})

Maybe you want to provide examples to the model as well:

classificationTag.setExamples([
    {
        input: "<sentence>I'm eating lobsters and I'm so happy.</sentence>",
        output: "<classification>positive</classification>"
    }
])

Once you're happy with your descriptions and examples, call getPrompt() and append the string it returns to your prompt.

const classificationDescription = classificationTag.getPrompt();
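
How you attach the description to your prompt is up to you. Here is a minimal sketch of one way to do it. The helpers `appendTagDescriptions` and `wrapSentence` are hypothetical names, not part of SaxaMLL, and the placeholder string stands in for whatever `getPrompt()` actually returns.

```typescript
// Hypothetical helper (not part of SaxaMLL): joins a base prompt and
// any number of tag descriptions, separated by blank lines.
function appendTagDescriptions(basePrompt: string, descriptions: string[]): string {
  return [basePrompt, ...descriptions]
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join("\n\n");
}

// Hypothetical helper: wraps the text to classify in the tags that the
// tag description refers to.
function wrapSentence(text: string): string {
  return `<sentence>${text}</sentence>`;
}

// Placeholder for the string returned by classificationTag.getPrompt().
const descriptionPlaceholder = "[output of classificationTag.getPrompt()]";

const systemPrompt = appendTagDescriptions("You are a sentiment classifier.", [
  descriptionPlaceholder,
]);
const userMessage = wrapSentence("I'm eating lobsters and I'm so happy.");

console.log(systemPrompt);
console.log(userMessage);
```

The system prompt carries the tag description, and the user message carries the text wrapped in <sentence> tags, matching the input format used in the example above.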

Now, onto the parser side. First, we define a parser:

const saxParser = new SaxaMLLParser();

We want to be able to trigger on specific "events" as the response is streamed back to us. There are three types of events: tagOpen, tagClose, and update. For more details, check out How it works.

  1. tagOpen = when a tag is opened.

  2. tagClose = when a tag is closed.

  3. update = when any update to the AST has been made.
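
To make the three event types concrete, here is a toy streaming parser. It is only a sketch and not SaxaMLL's implementation: `ToyStreamingParser` and `ToyNode` are made-up names, there is no real AST (just a stack of open tags), and attributes, nesting rules, and malformed input are ignored. It shows how events can still fire correctly when a tag is split across several chunks.

```typescript
// Toy illustration only: this is NOT SaxaMLL's implementation or API.
type EventType = "tagOpen" | "tagClose" | "update";

interface ToyNode {
  tag: string;
  text: string;
}

type Listener = (type: EventType, node: ToyNode) => void;

class ToyStreamingParser {
  private buffer = "";           // unconsumed input; may end in the middle of a tag
  private stack: ToyNode[] = []; // currently open tags, innermost last
  private listener: Listener;

  constructor(listener: Listener) {
    this.listener = listener;
  }

  parse(chunk: string): void {
    this.buffer += chunk;
    while (this.buffer.length > 0) {
      const lt = this.buffer.indexOf("<");
      if (lt === -1) {
        // No tag start in the buffer: everything is text.
        this.appendText(this.buffer);
        this.buffer = "";
        return;
      }
      if (lt > 0) {
        // Flush the text that precedes the next tag.
        this.appendText(this.buffer.slice(0, lt));
        this.buffer = this.buffer.slice(lt);
      }
      const gt = this.buffer.indexOf(">");
      if (gt === -1) return; // incomplete tag: wait for the next chunk
      const raw = this.buffer.slice(1, gt);
      this.buffer = this.buffer.slice(gt + 1);
      if (raw.startsWith("/")) {
        const node = this.stack.pop();
        if (node) {
          this.listener("tagClose", node);
          this.listener("update", node);
        }
      } else {
        const node: ToyNode = { tag: raw, text: "" };
        this.stack.push(node);
        this.listener("tagOpen", node);
        this.listener("update", node);
      }
    }
  }

  private appendText(text: string): void {
    const top = this.stack[this.stack.length - 1];
    if (!top) return; // ignore text outside any tag
    top.text += text;
    this.listener("update", top);
  }
}

// Record tagOpen and tagClose events; skip the more frequent update events.
const log: string[] = [];
const toyParser = new ToyStreamingParser((type, node) => {
  if (type !== "update") log.push(`${type}:${node.tag}:${node.text}`);
});

for (const delta of ["<class", "ification", ">", "posi", "tive", "</", "classification", ">"]) {
  toyParser.parse(delta);
}

console.log(log); // ["tagOpen:classification:", "tagClose:classification:positive"]
```

Nothing fires until a chunk completes a tag, which is why the partial chunks "<class" and "ification" produce no events on their own.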

The general syntax for assigning callbacks on events is

executor.upon(<EVENT-TYPE>).for(<TAG-NAME>).do((node) => {<YOUR-CALLBACK>});

In this example, we want the text in between the classification tags, so we should trigger as soon as the tag is closed:

let response: string | undefined;
saxParser.executor.upon("tagClose").for("classification").do((node) => {
    response = getText(node);
});
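
If you're curious how a chained upon/for/do call can end up running your callback, here is a toy registry. It is a sketch under assumptions, not SaxaMLL's actual code: `ToyExecutor`, `ToyNode`, and the `emit` method are hypothetical, and the real executor may store and dispatch callbacks differently.

```typescript
// Toy sketch of a fluent callback registry; NOT SaxaMLL's actual code.
type EventType = "tagOpen" | "tagClose" | "update";

interface ToyNode {
  tag: string;
  text: string;
}

type Callback = (node: ToyNode) => void;

class ToyExecutor {
  // Callbacks keyed by "<event>:<tag>", e.g. "tagClose:classification".
  private callbacks = new Map<string, Callback[]>();

  upon(event: EventType) {
    return {
      for: (tag: string) => ({
        do: (callback: Callback): void => {
          const key = `${event}:${tag}`;
          const list = this.callbacks.get(key) ?? [];
          list.push(callback);
          this.callbacks.set(key, list);
        },
      }),
    };
  }

  // A parser would call this each time it detects an event (hypothetical hook).
  emit(event: EventType, node: ToyNode): void {
    for (const callback of this.callbacks.get(`${event}:${node.tag}`) ?? []) {
      callback(node);
    }
  }
}

const executor = new ToyExecutor();
const captured: string[] = [];

executor.upon("tagClose").for("classification").do((node) => {
  captured.push(node.text);
});

executor.emit("tagOpen", { tag: "classification", text: "" });           // no callback for tagOpen
executor.emit("tagClose", { tag: "other", text: "ignored" });            // different tag, no match
executor.emit("tagClose", { tag: "classification", text: "positive" }); // runs the callback

console.log(captured); // ["positive"]
```

Only the event and tag pair you registered for triggers the callback; everything else passes through silently.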

Our parser is now ready to go! You can parse the model's response all at once:

saxParser.parse("<classification>positive</classification>");
console.log(response); // "positive"

Or, if you want streaming, you can do this:

const streamExample = [
    "<class",
    "ification",
    ">",
    "positive",
    "</",
    "classification",
    ">"
]
    
for (const delta of streamExample) {
    saxParser.parse(delta);
}

console.log(response); // "positive"
