| description | Example for Semantic Search |
|---|---|
This tutorial demonstrates using the pgml SDK to create a collection, add documents, build a pipeline for vector search, make a sample query, and archive the collection when finished. It loads sample data, indexes questions, times a semantic search query, and prints formatted results.
Python
from pgml import Collection, Model, Splitter, Pipeline
from datasets import load_dataset
from dotenv import load_dotenv
import asyncio

JavaScript
const pgml = require("pgml");
require("dotenv").config();

The SDK is imported and environment variables are loaded.
Python
async def main():
    load_dotenv()
    collection = Collection("my_collection")

JavaScript
const main = async () => {
  const collection = pgml.newCollection("my_javascript_collection");
}

A collection object is created to represent the search collection.
Python
    model = Model()
    splitter = Splitter()
    pipeline = Pipeline("my_pipeline", model, splitter)
    await collection.add_pipeline(pipeline)

JavaScript
  const model = pgml.newModel();
  const splitter = pgml.newSplitter();
  const pipeline = pgml.newPipeline("my_javascript_pipeline", model, splitter);
  await collection.add_pipeline(pipeline);

A pipeline that combines a model and a splitter is created and added to the collection.
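
To make the splitter's role more concrete, here is a minimal sketch of what a text splitter might do before the model embeds each piece. It is not the pgml `Splitter` implementation; `split_text`, `chunk_size`, and `overlap` are hypothetical names chosen for illustration, and fixed-size character chunks are an assumed strategy.

```python
def split_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper for illustration only; the real splitter
    may use a different strategy (for example, splitting on sentences).
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "x" * 100
chunks = split_text(sample, chunk_size=40, overlap=10)
print(len(chunks), [len(c) for c in chunks])
```

Overlapping chunks are a common choice because a sentence cut at a chunk boundary still appears whole in one of the neighboring chunks.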
Python
    documents = [
        {"id": "doc1", "text": "..."},
        {"id": "doc2", "text": "..."}
    ]
    await collection.upsert_documents(documents)

JavaScript
  const documents = [
    {
      id: "Document One",
      text: "...",
    },
    {
      id: "Document Two",
      text: "...",
    },
  ];
  await collection.upsert_documents(documents);

Documents are upserted into the collection and indexed by the pipeline.
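
The term "upsert" means "insert if new, update if it already exists". The sketch below illustrates that behavior with a plain dictionary keyed by `id`. It is a conceptual model only: `upsert` and `store` are hypothetical names, and the sketch assumes an existing document is replaced in full, which this tutorial does not confirm for the SDK.

```python
def upsert(store: dict, documents: list[dict]) -> dict:
    """Insert new documents and overwrite existing ones, keyed by "id".

    Hypothetical in-memory stand-in, not the SDK's storage layer.
    """
    for doc in documents:
        store[doc["id"]] = doc
    return store


store = {}
upsert(store, [{"id": "doc1", "text": "first"}, {"id": "doc2", "text": "second"}])
# Upserting "doc1" again updates it instead of creating a duplicate.
upsert(store, [{"id": "doc1", "text": "first, revised"}])
print(sorted(store))
```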
Python
    results = await (
        collection.query()
        .vector_recall("query", pipeline)
        .fetch_all()
    )

JavaScript
  const queryResults = await collection
    .query()
    .vector_recall(
      "query",
      pipeline,
    )
    .fetch_all();

A vector similarity search query is run against the collection.
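
For readers new to vector search, the following sketch shows the general idea behind a similarity query: compare the query's embedding with each stored embedding and return the closest matches. It is a toy illustration, not the SDK's implementation. The three-dimensional vectors, the `rank_by_similarity` helper, and the choice of cosine similarity are all assumptions made for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, index, limit=2):
    """Score every stored vector against the query and keep the top matches."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:limit]


# Made-up embeddings; a real model produces vectors with hundreds of dimensions.
toy_index = {
    "doc1": [1.0, 0.0, 0.0],
    "doc2": [0.0, 1.0, 0.0],
    "doc3": [0.9, 0.1, 0.0],
}
top_matches = rank_by_similarity([1.0, 0.0, 0.0], toy_index)
print([doc_id for _, doc_id in top_matches])
```

A real system typically avoids scoring every vector by using an approximate nearest-neighbor index, but the ranking principle is the same.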
Python
    await collection.archive()

JavaScript
  await collection.archive();

The collection is archived when finished.
Python
if __name__ == "__main__":
    asyncio.run(main())

JavaScript
main().then((results) => {
  console.log("Vector search Results: \n", results);
});

Boilerplate to call the async main() function.
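
The introduction mentions timing the query and printing formatted results. One possible way to do that is sketched below. Because a real query needs a database connection, `fake_search` is a hypothetical stand-in coroutine, and the `(score, text)` result shape is an assumption rather than the SDK's documented return type. In real code, the awaited query chain would take its place.

```python
import asyncio
import time


async def fake_search(query: str):
    """Hypothetical stand-in for the real query; returns (score, text) pairs."""
    await asyncio.sleep(0.01)  # simulate network and database latency
    return [(0.92, "example passage")]


async def timed(awaitable):
    """Await something and return its result along with the elapsed seconds."""
    start = time.perf_counter()
    result = await awaitable
    return result, time.perf_counter() - start


async def main():
    results, elapsed = await timed(fake_search("query"))
    print(f"Query took {elapsed:.3f} seconds")
    for score, text in results:
        print(f"{score:.2f}  {text}")
    return results, elapsed


if __name__ == "__main__":
    asyncio.run(main())
```

`time.perf_counter()` is used rather than `time.time()` because it is a monotonic, high-resolution clock intended for measuring short durations.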