Chroma Core: Concepts and APIs¶
This section is the fastest way to understand Chroma's core data model, client choices, and day-to-day APIs.
If you are new to Chroma, use the path below in order. If you are already building, jump to the API map.
Recommended Learning Path¶
- Concepts: Understand tenants, databases, collections, documents, metadata, embeddings, and query flow.
- Installation: Install Chroma for local development, server usage, or cloud-connected workflows.
- Clients: Pick the right client (PersistentClient, HttpClient, AsyncHttpClient, CloudClient) for your architecture.
- Collections: Learn collection lifecycle and core CRUD/query operations.
- Filters: Add precise metadata and document filtering to retrieval.
- Configuration: Tune index, runtime, and environment options.
- Resources: Estimate memory/CPU/disk needs before scaling.
- Advanced Search: Learn query-stage semantics, ranking, and execution tradeoffs.
API Map¶
Client-Level APIs¶
- Connect to Chroma: PersistentClient, HttpClient, AsyncHttpClient, CloudClient
- Scope data isolation: tenant + database selection
- Perform admin/list operations (for example collection listing and lifecycle actions)
Start in Clients, then use Tenants and Databases for multi-tenant setups.
Collection-Level APIs¶
- Create/get collections: create_collection, get_or_create_collection, get_collection
- Write data: add, upsert, update, delete
- Read data: get, query, count
- Manage collection settings: metadata, index configuration, and cloning/forking patterns
Start in Collections, then pair with Filters and Document IDs.
HTTP/OpenAPI Surface¶
- Explore server endpoints and OpenAPI: API
- Generate custom clients when needed
Minimal End-to-End Flow (Python)¶
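import chromadb

# Embedded client that persists data under ./chroma
client = chromadb.PersistentClient(path="./chroma")
collection = client.get_or_create_collection("quickstart")

# upsert inserts new ids and overwrites existing ones, so reruns are safe
collection.upsert(
    ids=["doc-1", "doc-2"],
    documents=[
        "Chroma stores vectors and metadata.",
        "Filters narrow candidate sets before ranking.",
    ],
    metadatas=[{"topic": "basics"}, {"topic": "search"}],
)

# Restrict candidates by metadata, then rank them against the query text
result = collection.query(
    query_texts=["How do filters affect retrieval?"],
    where={"topic": "search"},
    n_results=2,
)

# result["ids"] holds one list of ids per query text
print(result["ids"])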
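The value returned by `query` groups its fields per query text, so `result["ids"]` is a list of lists. The sketch below shows one way to turn that shape into one row per hit. The `rows_for_query` helper is hypothetical and not part of Chroma, and `sample_result` is a hand-written stand-in whose distance value is illustrative rather than real output.

```python
from typing import Any

# Hand-written stand-in shaped like a query() result for ONE query text.
# The distance value is illustrative, not measured output.
sample_result: dict[str, Any] = {
    "ids": [["doc-2"]],
    "documents": [["Filters narrow candidate sets before ranking."]],
    "metadatas": [[{"topic": "search"}]],
    "distances": [[0.42]],
}


def rows_for_query(result: dict[str, Any], query_index: int = 0) -> list[dict[str, Any]]:
    """Zip the parallel per-query lists into one dict per hit."""
    fields = ("documents", "metadatas", "distances")
    rows = []
    for position, doc_id in enumerate(result["ids"][query_index]):
        row: dict[str, Any] = {"id": doc_id}
        for field in fields:
            column = result.get(field)
            # A field can be missing or None when it was not requested.
            row[field] = column[query_index][position] if column else None
        rows.append(row)
    return rows


for row in rows_for_query(sample_result):
    print(row["id"], row["metadatas"], row["distances"])
```

Passing the real `result` from the flow above in place of `sample_result` should work the same way, since both follow the one-inner-list-per-query layout.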
Where to Go Next¶
- Operating local persistent deployments: Storage Layout, WAL, WAL Pruning
- Production guardrails and limits: System Constraints, Resources