Rohit Ghumare c3f43d8b61 Expand toolkit to 135 agents, 120 plugins, 796 total files

- Add 60 new agents across all 10 categories (75 -> 135)
- Add 95 new plugins with command files (25 -> 120)
- Update all agents to use model: opus
- Update README with complete plugin/agent tables
- Update marketplace.json with all 120 plugins

2026-02-04 21:08:28 +00:00

/index-docs - Index Documents for RAG

Index documents into a vector store for retrieval-augmented generation.

Steps

Ask the user for the document source: directory, URLs, database, or API
Detect document types: PDF, markdown, HTML, text, code, DOCX
Load documents using appropriate parsers for each file type
Split documents into chunks using semantic-aware chunking:
- Respect paragraph and section boundaries
- Target chunk size: 500 to 1000 tokens, with a 100-token overlap between consecutive chunks
Clean and preprocess chunks: remove boilerplate, normalize whitespace
Generate embeddings for each chunk using the configured embedding model
Store embeddings in the vector database: Pinecone, Weaviate, Chroma, or pgvector
Create metadata for each chunk: source file, page number, section title, date
Build an index mapping for fast retrieval and source citation
Validate the index by running sample queries and checking relevance
Report: documents indexed, total chunks, vector dimensions, storage size
Save the indexing configuration for incremental updates
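
The chunking step above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: it assumes whitespace-separated words are an acceptable stand-in for tokens (a real pipeline would count tokens with the embedding model's tokenizer), it treats blank lines as paragraph boundaries, and the function name `chunk_text` and its parameters are hypothetical.

```python
import re


def chunk_text(text, max_tokens=1000, overlap_tokens=100):
    """Pack whole paragraphs into chunks of at most max_tokens words.

    Each chunk after the first starts with the last overlap_tokens words
    of the previous chunk. Words approximate tokens (an assumption).
    """
    if not 0 <= overlap_tokens < max_tokens:
        raise ValueError("overlap_tokens must be >= 0 and < max_tokens")

    budget = max_tokens - overlap_tokens  # room for new words per chunk

    # Paragraphs are separated by blank lines; a paragraph larger than
    # the budget is hard-split so that every unit fits in one chunk.
    units = []
    for paragraph in re.split(r"\n\s*\n", text):
        words = paragraph.split()
        for i in range(0, len(words), budget):
            units.append(words[i:i + budget])

    chunks = []
    tail = []     # overlap carried over from the previous chunk
    current = []  # new words collected for the chunk being built
    for unit in units:
        if current and len(current) + len(unit) > budget:
            chunks.append(" ".join(tail + current))
            tail = (tail + current)[-overlap_tokens:] if overlap_tokens else []
            current = []
        current.extend(unit)
    if current:
        chunks.append(" ".join(tail + current))
    return chunks


if __name__ == "__main__":
    sample = "a1 a2 a3 a4\n\nb1 b2 b3 b4\n\nc1 c2 c3 c4"
    for chunk in chunk_text(sample, max_tokens=6, overlap_tokens=2):
        print(chunk)
```

Chunk boundaries fall on paragraph ends wherever a paragraph fits within the budget; only oversized paragraphs are split mid-text. Section titles and page numbers would still need to be attached as metadata by the document loader.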

Rules

Use semantic chunking that respects document structure over fixed-size splitting
Include sufficient overlap between chunks to preserve context at boundaries
Store source metadata with each chunk for citation and provenance
Handle duplicate documents by comparing content hashes before indexing
Support incremental indexing: add new documents without re-indexing everything
Use the same embedding model for indexing and querying
Monitor embedding costs and set budget alerts for large document sets
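
The duplicate-handling and incremental-indexing rules can be combined in one small sketch. It assumes that documents arrive as a mapping from source name to text, that whitespace is normalized before hashing so trivially reformatted copies count as duplicates, and that the set of already-indexed hashes is persisted somewhere between runs. The names `content_hash` and `select_new_documents` are hypothetical.

```python
import hashlib


def content_hash(text):
    """SHA-256 digest of the text after collapsing runs of whitespace."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def select_new_documents(documents, seen_hashes):
    """Return (source, digest) pairs for documents not indexed yet.

    documents: dict mapping source name to document text.
    seen_hashes: set of digests already indexed; updated in place so the
    caller can save it and reuse it on the next incremental run.
    """
    to_index = []
    for source, text in documents.items():
        digest = content_hash(text)
        if digest in seen_hashes:
            continue  # same content already indexed, skip it
        seen_hashes.add(digest)
        to_index.append((source, digest))
    return to_index


if __name__ == "__main__":
    seen = set()
    first_run = {"a.md": "hello  world", "b.md": "hello world", "c.md": "other"}
    print([src for src, _ in select_new_documents(first_run, seen)])
    second_run = {"a.md": "hello world", "d.md": "new"}
    print([src for src, _ in select_new_documents(second_run, seen)])
```

Only the documents returned by `select_new_documents` need to be chunked and embedded, which also keeps embedding costs proportional to new content rather than to the whole corpus.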
