Extractive Question Answering

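This walkthrough documents the Python and JavaScript snippets that perform end-to-end extractive question answering: indexing context passages, retrieving the passage most relevant to a question, and extracting the answer from it.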

Imports and Setup

Python

from pgml import Collection, Model, Splitter, Pipeline, Builtins
from datasets import load_dataset
from dotenv import load_dotenv

# Read environment variables from a .env file, mirroring the JavaScript setup
load_dotenv()

JavaScript

const pgml = require("pgml");
require("dotenv").config();

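Both versions import the pgml SDK and dotenv, which reads environment variables from a .env file. The Python version also imports load_dataset from the datasets library to fetch SQuAD. Builtins, used later to run the question-answering transform, is imported by name in Python and created with pgml.newBuiltins() in JavaScript.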
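Since both snippets rely on dotenv for configuration, it can help to fail early when the connection string is missing. The following is a minimal sketch, assuming the SDK reads its connection string from an environment variable named DATABASE_URL; that name and the helper `require_database_url` are assumptions for illustration and are not stated in this document.

```python
import os


def require_database_url(env=os.environ):
    """Return the connection string, or fail with a clear message."""
    # DATABASE_URL is an assumed variable name, not confirmed by this document.
    url = env.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL is not set; add it to your .env file")
    return url


# Usage with an explicit mapping, so the sketch runs without a real .env file.
example_env = {"DATABASE_URL": "postgres://user:password@localhost:5432/pgml"}
print(require_database_url(example_env))
```

Calling the helper right after load_dotenv() would surface a missing variable before any collection is created.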

Initialize Collection

Python

collection = Collection("squad_collection") 

JavaScript

const collection = pgml.newCollection("my_javascript_eqa_collection");

A collection is created to hold context passages.

Create Pipeline

Python

model = Model()
splitter = Splitter()
pipeline = Pipeline("squadv1", model, splitter)
await collection.add_pipeline(pipeline)

JavaScript

const pipeline = pgml.newPipeline(
  "my_javascript_eqa_pipeline",
  pgml.newModel(),
  pgml.newSplitter(),
);

await collection.add_pipeline(pipeline);

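A pipeline is created from a default model and a default splitter, then added to the collection. The splitter breaks each document into chunks and the model embeds those chunks for vector search.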

Upsert Documents

Python

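# Load a single split; load_dataset("squad") alone returns a dict of splits
data = load_dataset("squad", split="train")

# Each document needs an id and the passage text (values elided here)
documents = [
  {"id": ..., "text": ...}
  for r in data
]

await collection.upsert_documents(documents)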

JavaScript

const documents = [
  {
    id: "...",
    text: "...",
  }
];

await collection.upsert_documents(documents);

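In Python, context passages from SQuAD are upserted into the collection. The JavaScript version upserts a literal array of documents with the same id and text shape.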
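The Python snippet elides how each record becomes a document. As a rough sketch, assuming each record carries the standard SQuAD fields `id` and `context`, the mapping might look like the following. The helper `build_documents`, the sample records, and the choice to keep one document per unique context are illustrative assumptions, not the original author's code.

```python
def build_documents(records):
    """Map SQuAD-style records to {"id", "text"} documents, one per unique context."""
    seen = set()
    documents = []
    for r in records:
        # SQuAD pairs many questions with the same passage, so skip repeats.
        if r["context"] in seen:
            continue
        seen.add(r["context"])
        documents.append({"id": r["id"], "text": r["context"]})
    return documents


# Hypothetical records that mimic the shape of SQuAD rows.
sample_records = [
    {"id": "a1", "context": "Paris is the capital of France.", "question": "What is the capital of France?"},
    {"id": "a2", "context": "Paris is the capital of France.", "question": "Which country has Paris as its capital?"},
    {"id": "b1", "context": "The Nile flows through Egypt.", "question": "Where does the Nile flow?"},
]

documents = build_documents(sample_records)
print(len(documents), [d["id"] for d in documents])
```

Deduplicating by passage keeps the collection from storing and embedding the same text several times.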

Query for Context

Python

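# query holds the question text; parentheses let the chained calls span lines
results = (
  await collection.query()
  .vector_recall(query, pipeline)
  .fetch_all()
)

# Take the text of the top result and collapse runs of whitespace
context = " ".join(results[0][1].strip().split())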

JavaScript

const queryResults = await collection
  .query()
  .vector_recall(query, pipeline)
  .fetch_all();

const context = queryResults
  .map(result => result[1])
  .join("\n");

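A vector search retrieves the passages most similar to the query. The passage text is read from index 1 of each result. The Python version keeps only the top result, while the JavaScript version joins all results with newlines.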
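The step from search results to a single context string can be sketched without a database. The example below assumes each result is a sequence whose element at index 1 is the passage text, as both snippets imply; the `(score, text, metadata)` layout, the helper names, and the sample data are assumptions for illustration.

```python
def normalize(text):
    """Collapse runs of whitespace, including newlines, into single spaces."""
    return " ".join(text.strip().split())


def build_context(results, top_k=1):
    """Join the text of the first top_k results into one context string."""
    # Index 1 is assumed to hold the passage text.
    passages = [normalize(r[1]) for r in results[:top_k]]
    return "\n".join(passages)


# Hypothetical results, ordered from most to least similar.
sample_results = [
    (0.91, "  Paris is the\ncapital of   France. ", {}),
    (0.42, "The Nile flows through Egypt.", {}),
]

print(build_context(sample_results))
print(build_context(sample_results, top_k=2))
```

With `top_k=1` this mirrors the Python snippet, and a larger `top_k` mirrors the JavaScript version, which joins every result.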

Query for Answer

Python

builtins = Builtins()

answer = await builtins.transform(
  "question-answering", 
  [{"question": query, "context": context}]
)

JavaScript

const builtins = pgml.newBuiltins();

const answer = await builtins.transform("question-answering", [
  // The question text lives in `query`, the variable used for retrieval
  JSON.stringify({ question: query, context })
]);

The question and the retrieved context are passed to a question-answering model, which extracts the answer as a span of the context.
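Because the answer is extractive, it should appear verbatim in the context. The sketch below builds the JSON input used in the JavaScript snippet and checks a result against the context. It assumes the result follows the Hugging Face question-answering output shape with `answer`, `start`, and `end` keys; whether `transform` returns exactly that shape is an assumption, and both helper names are hypothetical.

```python
import json


def build_qa_input(question, context):
    """Serialize one question/context pair as a JSON string."""
    return json.dumps({"question": question, "context": context})


def span_matches(result, context):
    """Check that the reported character span really contains the answer text."""
    return context[result["start"]:result["end"]] == result["answer"]


context = "The Eiffel Tower is located in Paris."
question = "Where is the Eiffel Tower located?"

# Hypothetical model output; the offsets are computed from the context itself.
start = context.index("Paris")
result = {"answer": "Paris", "start": start, "end": start + len("Paris"), "score": 0.98}

print(build_qa_input(question, context))
print(span_matches(result, context))
```

A failed span check usually means the context string was altered between retrieval and answering, for example by whitespace normalization applied at a different step.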