API Reference
This document describes the RAG-lite TS API. It covers the simple constructors intended for everyday use and the optional factory functions for advanced initialization.
Table of Contents
- Quick Start
- Main Classes
- Factory Pattern
- Core Architecture
- Configuration
- Type Definitions
- Error Handling
- Utility Functions
Quick Start
For most users - get started in minutes
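The fastest way to get started with RAG-lite TS is through the simple constructors. Note the argument order: IngestionPipeline takes the database path first, while SearchEngine takes the index path first: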
import { SearchEngine, IngestionPipeline } from 'rag-lite-ts';
// Initialize and ingest documents
const ingestion = new IngestionPipeline('./db.sqlite', './index.bin');
await ingestion.ingestDirectory('./docs');
// Search your documents
const search = new SearchEngine('./index.bin', './db.sqlite');
const results = await search.search('your query');
Configuration Options
import { SearchEngine, IngestionPipeline } from 'rag-lite-ts';
// Custom model configuration
const search = new SearchEngine('./index.bin', './db.sqlite', {
embeddingModel: 'Xenova/all-mpnet-base-v2',
enableReranking: true
});
// Ingestion with custom settings
const ingestion = new IngestionPipeline('./db.sqlite', './index.bin', {
embeddingModel: 'Xenova/all-mpnet-base-v2',
chunkSize: 400,
chunkOverlap: 80
});
Main Classes
For application developers - the primary API
These classes provide the main interface for RAG-lite TS functionality with simple, direct constructors that handle initialization automatically.
SearchEngine
Performs semantic search over indexed documents with automatic initialization.
class SearchEngine {
constructor(indexPath: string, dbPath: string, options?: SearchEngineOptions);
async search(query: string, options?: SearchOptions): Promise<SearchResult[]>;
async getStats(): Promise<SearchStats>;
async cleanup(): Promise<void>;
}
Constructor Parameters
- indexPath (string): Path to the vector index file (must exist)
- dbPath (string): Path to the SQLite database file (must exist)
- options (SearchEngineOptions, optional): Configuration options
SearchEngineOptions
interface SearchEngineOptions {
embeddingModel?: string; // Model name (auto-detected from database)
batchSize?: number; // Embedding batch size
enableReranking?: boolean; // Enable reranking (default: false)
rerankingModel?: string; // Reranking model name
// Advanced: Custom functions for dependency injection
embedFn?: EmbedFunction; // Custom embedding function
rerankFn?: RerankFunction; // Custom reranking function
}
Methods
search(query, options?)
Performs semantic search over indexed documents.
Parameters:
- query (string): Search query text
- options (SearchOptions, optional): Search configuration
Returns: Promise<SearchResult[]> - Array of search results ordered by relevance
Example:
import { SearchEngine } from 'rag-lite-ts';
const search = new SearchEngine('./index.bin', './db.sqlite');
const results = await search.search('machine learning', {
top_k: 5,
rerank: true
});
console.log(results);
// [
// {
// content: "Machine learning is a subset of artificial intelligence...",
// score: 0.85,
// document: {
// id: 1,
// source: "ml-guide.md",
// title: "ML Guide"
// }
// }
// ]
getStats()
Gets statistics about the search engine and indexed content.
Returns: Promise<SearchStats> - Statistics object
Example:
const stats = await search.getStats();
console.log(`Indexed ${stats.totalChunks} chunks from ${stats.totalDocuments} documents`);
console.log(`Reranking: ${stats.rerankingEnabled ? 'enabled' : 'disabled'}`);
cleanup()
Cleans up resources and closes connections.
Returns: Promise<void>
Example:
// Always clean up when done
await search.cleanup();
Complete Example
import { SearchEngine } from 'rag-lite-ts';
// Basic usage
const search = new SearchEngine('./index.bin', './db.sqlite');
try {
const results = await search.search('artificial intelligence');
console.log(`Found ${results.length} results`);
for (const result of results) {
console.log(`${result.document.title}: ${result.score.toFixed(2)}`);
console.log(result.content.substring(0, 100) + '...');
}
} finally {
await search.cleanup();
}
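The try/finally pattern above can be wrapped in a small helper so that cleanup is never forgotten. This is a minimal sketch, not part of rag-lite-ts: withCleanup, Cleanable, and FakeEngine are hypothetical names, and FakeEngine only stands in for any object that exposes an async cleanup() method, as SearchEngine and IngestionPipeline are documented to do.

```typescript
// Anything that exposes an async cleanup() method.
interface Cleanable {
  cleanup(): Promise<void>;
}

// Run `work` with the resource and always call cleanup(), even if `work` throws.
async function withCleanup<R extends Cleanable, T>(
  resource: R,
  work: (resource: R) => Promise<T>
): Promise<T> {
  try {
    return await work(resource);
  } finally {
    await resource.cleanup();
  }
}

// Stand-in resource used only for this demonstration.
class FakeEngine implements Cleanable {
  cleaned = false;

  async search(query: string): Promise<string[]> {
    return [`result for ${query}`];
  }

  async cleanup(): Promise<void> {
    this.cleaned = true;
  }
}

async function demo(): Promise<{ results: string[]; cleaned: boolean }> {
  const engine = new FakeEngine();
  const results = await withCleanup(engine, (e) => e.search('artificial intelligence'));
  return { results, cleaned: engine.cleaned };
}

demo().then((outcome) => console.log(outcome));
```

Because the helper only depends on the cleanup() method, the same function would work for a search engine or an ingestion pipeline.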
IngestionPipeline
Handles document ingestion, processing, and indexing with automatic initialization.
class IngestionPipeline {
constructor(dbPath: string, indexPath: string, options?: IngestionPipelineOptions);
async ingestDirectory(path: string, options?: IngestionOptions): Promise<IngestionResult>;
async ingestFile(filePath: string, options?: IngestionOptions): Promise<IngestionResult>;
async cleanup(): Promise<void>;
}
Constructor Parameters
- dbPath (string): Path to the SQLite database file (created if it doesn't exist)
- indexPath (string): Path to the vector index file (created if it doesn't exist)
- options (IngestionPipelineOptions, optional): Configuration options
IngestionPipelineOptions
interface IngestionPipelineOptions {
embeddingModel?: string; // Model name (default: 'sentence-transformers/all-MiniLM-L6-v2')
batchSize?: number; // Embedding batch size
chunkSize?: number; // Chunk size in tokens (default: 250)
chunkOverlap?: number; // Overlap between chunks (default: 50)
forceRebuild?: boolean; // Force index rebuild (default: false)
}
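The interaction between chunkSize and chunkOverlap is easier to see with a concrete window calculation. The sketch below assumes a plain sliding window over an already-tokenized array. chunkTokens is a hypothetical helper, not part of rag-lite-ts, and the library's actual chunker may split on sentence or paragraph boundaries instead.

```typescript
// Hypothetical sliding-window chunker; not the library's implementation.
function chunkTokens(tokens: string[], chunkSize: number, chunkOverlap: number): string[][] {
  if (chunkSize <= 0) {
    throw new RangeError('chunkSize must be positive');
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError('chunkOverlap must be in the range [0, chunkSize)');
  }
  const step = chunkSize - chunkOverlap; // how far the window advances each time
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize));
    if (start + chunkSize >= tokens.length) {
      break; // this window already reached the end of the input
    }
  }
  return chunks;
}

// 600 placeholder tokens chunked with the documented defaults (250 / 50).
const sampleTokens = Array.from({ length: 600 }, (_, i) => `t${i}`);
const sampleChunks = chunkTokens(sampleTokens, 250, 50);
console.log(sampleChunks.map((c) => c.length)); // [ 250, 250, 200 ]
```

With these settings the window advances 200 tokens at a time, so each chunk repeats the last 50 tokens of the previous one. A larger overlap preserves more context across chunk boundaries at the cost of more chunks to embed and store.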
Methods
ingestDirectory(path, options?)
Ingests all supported documents in a directory.
Parameters:
- path (string): Directory path to ingest
- options (IngestionOptions, optional): Ingestion configuration
Returns: Promise<IngestionResult> - Ingestion statistics and results
ingestFile(filePath, options?)
Ingests a single document.
Parameters:
- filePath (string): Path to the document to ingest
- options (IngestionOptions, optional): Ingestion configuration
Returns: Promise<IngestionResult> - Ingestion statistics and results
Example:
import { IngestionPipeline } from 'rag-lite-ts';
// Basic usage
const pipeline = new IngestionPipeline('./db.sqlite', './index.bin');
try {
const result = await pipeline.ingestDirectory('./docs/');
console.log(`Processed ${result.documentsProcessed} documents`);
console.log(`Generated ${result.chunksCreated} chunks`);
} finally {
await pipeline.cleanup();
}
cleanup()
Cleans up resources and closes connections.
Returns: Promise<void>
Complete Example
import { IngestionPipeline } from 'rag-lite-ts';
// With configuration options
const pipeline = new IngestionPipeline('./db.sqlite', './index.bin', {
embeddingModel: 'Xenova/all-mpnet-base-v2',
chunkSize: 400,
chunkOverlap: 80
});
try {
// Ingest documents from directory
const result = await pipeline.ingestDirectory('./documents');
console.log('Ingestion Summary:');
console.log(`- Documents processed: ${result.documentsProcessed}`);
console.log(`- Chunks created: ${result.chunksCreated}`);
console.log(`- Processing time: ${result.processingTimeMs}ms`);
if (result.documentErrors > 0) {
console.warn(`- Document errors: ${result.documentErrors}`);
}
} finally {
await pipeline.cleanup();
}
Factory Pattern
For advanced users and library authors
Factory functions provide advanced initialization with automatic setup, smart defaults, and comprehensive error handling. Use them when:
- You are building libraries or frameworks that use rag-lite-ts internally
- You need automatic resource management and cleanup on process exit
- You require extensive validation with detailed error messages
- You are working with complex deployment scenarios where initialization might fail
When to Use:
- Use constructors for direct application usage (the common case)
- Use factories for library development, complex error handling, or automatic resource management
SearchFactory
Creates and initializes search engines with automatic model loading and validation.
class SearchFactory {
static async create(
indexPath: string,
dbPath: string,
options?: TextSearchOptions
): Promise<SearchEngine>;
static async createWithDefaults(
options?: TextSearchOptions
): Promise<SearchEngine>;
}
TextSearchOptions
interface TextSearchOptions {
embeddingModel?: string; // Model name override
batchSize?: number; // Embedding batch size override
rerankingModel?: string; // Reranking model name override
enableReranking?: boolean; // Enable/disable reranking (default: false)
}
Example
import { SearchFactory } from 'rag-lite-ts';
// Basic usage with comprehensive error handling
const search = await SearchFactory.create('./index.bin', './db.sqlite');
const results = await search.search('machine learning');
// Advanced configuration (separate variable names so the snippet compiles as one file)
const rerankingSearch = await SearchFactory.create('./index.bin', './db.sqlite', {
  embeddingModel: 'Xenova/all-mpnet-base-v2',
  enableReranking: true
});
// Use default paths from configuration
const defaultSearch = await SearchFactory.createWithDefaults({
  enableReranking: false
});
IngestionFactory
Creates and initializes ingestion pipelines with automatic model loading and directory setup.
class IngestionFactory {
static async create(
dbPath: string,
indexPath: string,
options?: TextIngestionOptions
): Promise<IngestionPipeline>;
static async createWithDefaults(
options?: TextIngestionOptions
): Promise<IngestionPipeline>;
}
TextIngestionOptions
interface TextIngestionOptions {
embeddingModel?: string; // Model name override
batchSize?: number; // Embedding batch size override
chunkSize?: number; // Chunk size override
chunkOverlap?: number; // Chunk overlap override
forceRebuild?: boolean; // Force rebuild of existing index
}
Example
import { IngestionFactory } from 'rag-lite-ts';
// Basic usage with automatic directory creation
const ingestion = await IngestionFactory.create('./db.sqlite', './index.bin');
await ingestion.ingestDirectory('./documents');
// Advanced configuration with force rebuild (separate variable name to avoid redeclaration)
const rebuildIngestion = await IngestionFactory.create('./db.sqlite', './index.bin', {
  embeddingModel: 'Xenova/all-mpnet-base-v2',
  chunkSize: 400,
  chunkOverlap: 80,
  forceRebuild: true
});
RAGFactory
Creates complete RAG systems with both search and ingestion capabilities.
class RAGFactory {
static async createBoth(
indexPath: string,
dbPath: string,
searchOptions?: TextSearchOptions,
ingestionOptions?: TextIngestionOptions
): Promise<{
searchEngine: SearchEngine;
ingestionPipeline: IngestionPipeline;
}>;
static async createBothWithDefaults(
searchOptions?: TextSearchOptions,
ingestionOptions?: TextIngestionOptions
): Promise<{
searchEngine: SearchEngine;
ingestionPipeline: IngestionPipeline;
}>;
}
Example
import { RAGFactory } from 'rag-lite-ts';
// Create complete RAG system
const { searchEngine, ingestionPipeline } = await RAGFactory.createBoth(
'./index.bin',
'./db.sqlite'
);
// First, ingest some documents
await ingestionPipeline.ingestDirectory('./knowledge-base');
// Then search the ingested content
const results = await searchEngine.search('What is the main topic?');
// Clean up both instances
await Promise.all([
searchEngine.cleanup(),
ingestionPipeline.cleanup()
]);
Core Architecture
For library authors and custom implementations
The core architecture provides low-level access to internal components for advanced use cases, custom implementations, and library development.
Core Classes
CoreSearchEngine
Low-level search engine with explicit dependency injection.
import { SearchEngine as CoreSearchEngine } from 'rag-lite-ts';
const coreSearch = new CoreSearchEngine(embedFn, indexManager, db, rerankFn);
CoreIngestionPipeline
Low-level ingestion pipeline with explicit dependency injection.
import { IngestionPipeline as CoreIngestionPipeline } from 'rag-lite-ts';
const corePipeline = new CoreIngestionPipeline(embedFn, indexManager, db, chunkConfig);
Text Implementations
Embedding Functions
import { createTextEmbedFunction, createTextEmbedder } from 'rag-lite-ts';
// Create embedding function
const embedFn = createTextEmbedFunction('Xenova/all-mpnet-base-v2', 16);
// Create embedding engine directly
const embedder = createTextEmbedder('Xenova/all-mpnet-base-v2');
await embedder.loadModel();
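The second argument to createTextEmbedFunction above is presumably the embedding batch size, although the source does not say so explicitly. As a rough illustration of what batching means, the sketch below splits a list of texts into fixed-size groups and embeds them one group at a time. toBatches and embedInBatches are hypothetical helpers, and embedBatch stands in for a real model call.

```typescript
// Hypothetical helper: split items into consecutive groups of at most batchSize.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Embed texts batch by batch; embedBatch is a placeholder for a real model call.
async function embedInBatches(
  texts: string[],
  batchSize: number,
  embedBatch: (batch: string[]) => Promise<Float32Array[]>
): Promise<Float32Array[]> {
  const vectors: Float32Array[] = [];
  for (const batch of toBatches(texts, batchSize)) {
    // Batches run one after another, so only one batch is in flight at a time.
    vectors.push(...(await embedBatch(batch)));
  }
  return vectors;
}

const placeholderTexts = Array.from({ length: 40 }, (_, i) => `chunk ${i}`);
const batchSizes = toBatches(placeholderTexts, 16).map((b) => b.length);
console.log(batchSizes); // [ 16, 16, 8 ]
```

Smaller batches lower peak memory use, which is consistent with the Model Defaults table pairing the larger 768-dimension model with a smaller batch size.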
Reranking Functions
import { createTextRerankFunction, createTextReranker } from 'rag-lite-ts';
// Create reranking function
const rerankFn = createTextRerankFunction('cross-encoder/ms-marco-MiniLM-L-6-v2');
// Create reranker directly
const reranker = createTextReranker('cross-encoder/ms-marco-MiniLM-L-6-v2');
await reranker.loadModel();
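Reranking is a second pass over the candidates returned by vector search: a cross-encoder scores each (query, passage) pair jointly, and the candidates are re-sorted by that score. The sketch below shows only the re-sorting step, under the assumption that a pair scorer is available. rerank, Candidate, PairScorer, and the toy wordOverlap scorer are hypothetical and do not reflect the signature of the library's RerankFunction.

```typescript
interface Candidate {
  content: string;
  score: number; // score from the first-stage vector search
}

// Scores a (query, passage) pair; a real cross-encoder model would go here.
type PairScorer = (query: string, passage: string) => number;

// Replace each candidate's score with the pair score and sort best-first.
function rerank(query: string, candidates: Candidate[], scorePair: PairScorer): Candidate[] {
  return candidates
    .map((c) => ({ content: c.content, score: scorePair(query, c.content) }))
    .sort((a, b) => b.score - a.score);
}

// Toy scorer for demonstration: fraction of query words that appear in the passage.
const wordOverlap: PairScorer = (query, passage) => {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  const text = passage.toLowerCase();
  const hits = words.filter((w) => text.includes(w)).length;
  return words.length === 0 ? 0 : hits / words.length;
};

const firstStage: Candidate[] = [
  { content: 'Cooking with machine tools', score: 0.9 },
  { content: 'Machine learning basics', score: 0.8 }
];
const reranked = rerank('machine learning', firstStage, wordOverlap);
console.log(reranked.map((c) => `${c.content}: ${c.score}`));
```

In this toy run the second candidate moves to the top because it matches both query words, even though its first-stage score was lower.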
Custom Implementation Example
import {
SearchEngine as CoreSearchEngine,
createTextEmbedFunction,
IndexManager,
openDatabase
} from 'rag-lite-ts';
// Custom embedding function
const customEmbedFn = async (query: string) => {
// Your custom embedding logic
return {
embedding_id: 'custom-' + Date.now(),
vector: new Float32Array([/* your embeddings */])
};
};
// Create core components
const db = await openDatabase('./db.sqlite');
const indexManager = new IndexManager('./index.bin', './db.sqlite', 384, 'custom-model');
await indexManager.initialize();
// Create search engine with custom embedding
const search = new CoreSearchEngine(customEmbedFn, indexManager, db);
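The custom embedding function above leaves the vector contents open. For testing the wiring without loading a model, a deterministic stand-in can be useful. The sketch below is an assumption-laden toy: toyEmbed is a hypothetical function that hashes characters into buckets and has no semantic meaning. It only illustrates the EmbeddingResult shape and the requirement, implied by the 384 passed to IndexManager, that the vector length match the index dimension.

```typescript
interface EmbeddingResult {
  embedding_id: string;
  vector: Float32Array;
}

let nextEmbeddingId = 0; // a counter avoids the collisions a timestamp-based id can produce

// Toy embedder: counts characters into buckets, then L2-normalizes the vector.
function toyEmbed(text: string, dimensions: number = 384): EmbeddingResult {
  const vector = new Float32Array(dimensions);
  for (let i = 0; i < text.length; i++) {
    vector[text.charCodeAt(i) % dimensions] += 1;
  }
  let sumOfSquares = 0;
  for (let i = 0; i < dimensions; i++) {
    sumOfSquares += vector[i] * vector[i];
  }
  const norm = Math.sqrt(sumOfSquares);
  if (norm > 0) {
    for (let i = 0; i < dimensions; i++) {
      vector[i] /= norm;
    }
  }
  return { embedding_id: `toy-${nextEmbeddingId++}`, vector };
}

// Async wrapper matching the shape of the custom embedding function shown above.
const toyEmbedFn = async (query: string): Promise<EmbeddingResult> => toyEmbed(query);

const sampleEmbedding = toyEmbed('hello');
console.log(sampleEmbedding.embedding_id, sampleEmbedding.vector.length);
```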
Configuration
Global Configuration
import { config, validateConfig, getModelDefaults } from 'rag-lite-ts';
// Get current config
console.log(config.embedding_model); // Current model
// Get model-specific defaults
const defaults = getModelDefaults('Xenova/all-mpnet-base-v2');
console.log(defaults.chunk_size); // 400 (optimized for this model)
// Validate custom config
const customConfig = validateConfig({
embedding_model: 'Xenova/all-mpnet-base-v2',
chunk_size: 300
});
Model Defaults
// Available models with optimized defaults
const models = {
'sentence-transformers/all-MiniLM-L6-v2': {
dimensions: 384,
chunk_size: 250,
chunk_overlap: 50,
batch_size: 16
},
'Xenova/all-mpnet-base-v2': {
dimensions: 768,
chunk_size: 400,
chunk_overlap: 80,
batch_size: 8
}
};
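One way to use a defaults table like this is to resolve a model name to its settings and then layer explicit overrides on top. The sketch below is a hypothetical illustration, not the behavior of getModelDefaults: resolveSettings and ModelDefaults are invented names, the values are copied from the table above, and falling back to the MiniLM entry when no model is given is an assumption based on the default noted in IngestionPipelineOptions.

```typescript
interface ModelDefaults {
  dimensions: number;
  chunk_size: number;
  chunk_overlap: number;
  batch_size: number;
}

const MINI_LM: ModelDefaults = { dimensions: 384, chunk_size: 250, chunk_overlap: 50, batch_size: 16 };
const MPNET: ModelDefaults = { dimensions: 768, chunk_size: 400, chunk_overlap: 80, batch_size: 8 };

const KNOWN_MODELS = new Map<string, ModelDefaults>([
  ['sentence-transformers/all-MiniLM-L6-v2', MINI_LM],
  ['Xenova/all-mpnet-base-v2', MPNET]
]);

// Resolve settings for a model, then apply caller overrides on top.
function resolveSettings(model?: string, overrides: Partial<ModelDefaults> = {}): ModelDefaults {
  const base = model === undefined ? MINI_LM : KNOWN_MODELS.get(model);
  if (base === undefined) {
    // Guessing dimensions for an unknown model would corrupt the index, so fail loudly.
    throw new Error(`Unknown model: ${model}`);
  }
  return { ...base, ...overrides };
}

console.log(resolveSettings('Xenova/all-mpnet-base-v2', { chunk_size: 300 }));
// { dimensions: 768, chunk_size: 300, chunk_overlap: 80, batch_size: 8 }
```

Throwing on an unknown model, rather than silently falling back, is a deliberate choice in this sketch because the vector dimension has to match whatever the index was built with.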
Type Definitions
Core Types
SearchResult
interface SearchResult {
content: string; // Chunk text content
score: number; // Similarity score (0-1)
document: {
id: number; // Document ID
source: string; // Document source path
title: string; // Document title
};
}
SearchOptions
interface SearchOptions {
top_k?: number; // Number of results to return (default: 10)
rerank?: boolean; // Enable result reranking (default: false)
}
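To make the score and top_k fields concrete, the sketch below ranks a handful of vectors by cosine similarity and keeps the best k. This is an illustration under assumptions: the source does not state which similarity metric the library uses or how it maps scores into the 0-1 range, and a real vector index avoids scanning every chunk. cosineSimilarity, topK, and the tiny two-dimensional corpus are hypothetical.

```typescript
interface IndexedChunk {
  content: string;
  vector: Float32Array;
}

interface ScoredChunk {
  content: string;
  score: number;
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Brute-force ranking: score every chunk, sort best-first, keep the first k.
function topK(query: Float32Array, chunks: IndexedChunk[], k: number = 10): ScoredChunk[] {
  return chunks
    .map((c) => ({ content: c.content, score: cosineSimilarity(query, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const corpus: IndexedChunk[] = [
  { content: 'A', vector: new Float32Array([1, 0]) },
  { content: 'B', vector: new Float32Array([0, 1]) },
  { content: 'C', vector: new Float32Array([1, 1]) }
];
const ranked = topK(new Float32Array([1, 0]), corpus, 2);
console.log(ranked.map((r) => r.content)); // [ 'A', 'C' ]
```

Note that raw cosine similarity can be negative for vectors pointing in opposite directions, so a 0-1 score range implies either non-negative embeddings or some rescaling.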
IngestionResult
interface IngestionResult {
documentsProcessed: number; // Number of documents processed
chunksCreated: number; // Number of chunks created
embeddingsGenerated: number; // Number of embeddings generated
documentErrors: number; // Number of document processing errors
embeddingErrors: number; // Number of embedding errors
processingTimeMs: number; // Total processing time in milliseconds
}
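The counters in IngestionResult can be combined into a quick health check after a run. The sketch below is a hypothetical helper, summarizeIngestion, that derives a few figures from the documented fields. Treating the difference between chunksCreated and embeddingsGenerated as missing embeddings assumes one embedding per chunk, which the source does not state.

```typescript
interface IngestionResult {
  documentsProcessed: number;
  chunksCreated: number;
  embeddingsGenerated: number;
  documentErrors: number;
  embeddingErrors: number;
  processingTimeMs: number;
}

interface IngestionSummary {
  ok: boolean;               // true when no errors were reported
  chunksPerDocument: number; // average chunks per processed document
  documentsPerSecond: number;
  missingEmbeddings: number; // assumes one embedding per chunk
}

function summarizeIngestion(result: IngestionResult): IngestionSummary {
  const seconds = result.processingTimeMs / 1000;
  return {
    ok: result.documentErrors === 0 && result.embeddingErrors === 0,
    chunksPerDocument:
      result.documentsProcessed > 0 ? result.chunksCreated / result.documentsProcessed : 0,
    documentsPerSecond: seconds > 0 ? result.documentsProcessed / seconds : 0,
    missingEmbeddings: Math.max(0, result.chunksCreated - result.embeddingsGenerated)
  };
}

// Made-up numbers purely for illustration.
const exampleSummary = summarizeIngestion({
  documentsProcessed: 10,
  chunksCreated: 120,
  embeddingsGenerated: 118,
  documentErrors: 0,
  embeddingErrors: 2,
  processingTimeMs: 4000
});
console.log(exampleSummary);
```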
Document
interface Document {
source: string; // File path or identifier
title: string; // Document title
content: string; // Full document content
}
EmbeddingResult
interface EmbeddingResult {
embedding_id: string; // Unique identifier
vector: Float32Array; // Embedding vector
}
Configuration Types
CoreConfig
interface CoreConfig {
// Model settings
embedding_model: string;
// Processing settings
chunk_size: number;
chunk_overlap: number;
batch_size: number;
top_k: number;
// File paths
db_file: string;
index_file: string;
model_cache_path: string;
// Features
rerank_enabled: boolean;
}
Error Handling
Error Types
import {
APIError,
IngestionError,
SearchError,
ResourceError,
ModelCompatibilityError
} from 'rag-lite-ts';
Error Handling Patterns
import { SearchEngine, SearchError, ModelCompatibilityError } from 'rag-lite-ts';
try {
  const search = new SearchEngine('./index.bin', './db.sqlite');
  const results = await search.search('query');
} catch (error) {
  if (error instanceof ModelCompatibilityError) {
    console.error('Model mismatch detected:', error.message);
    // Handle model compatibility issues
  } else if (error instanceof SearchError) {
    console.error('Search failed:', error.message);
    // Handle search-specific errors
  } else {
    // Under strict TypeScript the caught value is `unknown`, so narrow before reading .message
    console.error('Unexpected error:', error instanceof Error ? error.message : error);
  }
}
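The instanceof checks above rely on the error classes keeping their prototype chain intact. The sketch below shows the general pattern with local stand-in classes. They reuse the names SearchError and ModelCompatibilityError purely for illustration and are not the library's classes, whose hierarchy and fields are not described in this document. describeError is a hypothetical helper.

```typescript
// Local stand-ins for illustration only; the real classes come from rag-lite-ts.
class SearchError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'SearchError';
    // Keeps instanceof working when compiling to ES5 targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class ModelCompatibilityError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'ModelCompatibilityError';
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Map any thrown value to a log-friendly message, narrowing from `unknown`.
function describeError(error: unknown): string {
  if (error instanceof ModelCompatibilityError) {
    return `Model mismatch: ${error.message}`;
  }
  if (error instanceof SearchError) {
    return `Search failed: ${error.message}`;
  }
  if (error instanceof Error) {
    return `Unexpected error: ${error.message}`;
  }
  return `Unexpected non-Error value: ${String(error)}`;
}

console.log(describeError(new SearchError('index missing')));
// Search failed: index missing
```

Centralizing the narrowing in one function keeps catch blocks short and handles the case where something other than an Error is thrown.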
Utility Functions
Database Operations
import {
openDatabase,
initializeSchema,
insertDocument,
insertChunk,
getChunksByEmbeddingIds
} from 'rag-lite-ts';
const db = await openDatabase('./my-docs.sqlite');
await initializeSchema(db);
const docId = await insertDocument(db, 'README.md', 'My Document', 'Content...');
File Processing
import {
discoverFiles,
processFiles,
discoverAndProcessFiles
} from 'rag-lite-ts';
const result = await discoverFiles('./docs/', {
recursive: true,
extensions: ['.md', '.txt', '.mdx']
});
console.log(`Found ${result.files.length} files`);
Tokenization
import { countTokens } from 'rag-lite-ts';
const count = await countTokens('Hello world, this is a test.');
console.log(`Token count: ${count}`); // Token count: 8
Path Management
import { DocumentPathManager } from 'rag-lite-ts';
const pathManager = new DocumentPathManager('relative', '/project');
const storagePath = pathManager.toStoragePath('/project/docs/readme.md');
console.log(storagePath); // "docs/readme.md"
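The idea behind relative path storage is that the database stays valid when the project directory moves. The sketch below illustrates the conversion with plain string handling. toStoragePath and toAbsolutePath are hypothetical free functions, not the DocumentPathManager implementation, and leaving paths outside the base directory unchanged is an assumption.

```typescript
// Normalize separators to forward slashes and drop trailing slashes from the base.
function normalizeBase(basePath: string): string {
  return basePath.replace(/\\/g, '/').replace(/\/+$/, '');
}

// Convert an absolute path to one relative to basePath, if it lies inside it.
function toStoragePath(basePath: string, absolutePath: string): string {
  const base = normalizeBase(basePath);
  const target = absolutePath.replace(/\\/g, '/');
  if (target === base) {
    return '';
  }
  if (!target.startsWith(base + '/')) {
    return target; // outside the base directory: keep the path as given
  }
  return target.slice(base.length + 1);
}

// Inverse operation: resolve a stored relative path against a (possibly new) base.
function toAbsolutePath(basePath: string, storagePath: string): string {
  if (storagePath.startsWith('/')) {
    return storagePath; // already absolute
  }
  return `${normalizeBase(basePath)}/${storagePath}`;
}

const stored = toStoragePath('/project', '/project/docs/readme.md');
console.log(stored); // docs/readme.md
console.log(toAbsolutePath('/moved/project', stored)); // /moved/project/docs/readme.md
```

Checking for the base followed by a slash, rather than the bare prefix, prevents a sibling directory such as /projectile from being mistaken for a path inside /project.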
This API reference covers all aspects of RAG-lite TS from simple usage to advanced customization. Start with the Quick Start section and Main Classes for typical usage, then explore the Factory Pattern and Core Architecture sections for advanced use cases.