Filters¶
Chroma provides two types of filters:
- Metadata - filter documents based on metadata using the where clause in either Collection.query() or Collection.get().
- Document - filter documents based on document content using where_document in Collection.query() or Collection.get().
Those familiar with MongoDB queries will find Chroma's filters very similar.
Metadata Filters¶
Equality¶
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"metadata_field": "is_equal_to_this"}
)
Alternative syntax:
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"metadata_field": {"$eq": "is_equal_to_this"}}
)
Inequality¶
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"metadata_field": {"$ne": "is_not_equal_to_this"}}
)
Greater Than¶
The $gt operator is only supported for numerical values (int or float).
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"metadata_field": {"$gt": 5}}
)
Greater Than or Equal¶
The $gte operator is only supported for numerical values (int or float).
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"metadata_field": {"$gte": 5.1}}
)
Less Than¶
The $lt operator is only supported for numerical values (int or float).
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"metadata_field": {"$lt": 5}}
)
Less Than or Equal¶
The $lte operator is only supported for numerical values (int or float).
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"metadata_field": {"$lte": 5.1}}
)
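The comparison operators above can be combined with $and to express a numeric range. The following is a minimal sketch, not part of Chroma: range_filter is a hypothetical helper that only builds the where dictionary, and the half-open range (low inclusive, high exclusive) is an assumption chosen for illustration.

```python
from typing import Any, Dict, Union

Number = Union[int, float]


def range_filter(field: str, low: Number, high: Number) -> Dict[str, Any]:
    """Build a where clause matching low <= field < high (hypothetical helper)."""
    for bound in (low, high):
        # bool is a subclass of int in Python, so exclude it explicitly:
        # the comparison operators are documented for int and float only.
        if isinstance(bound, bool) or not isinstance(bound, (int, float)):
            raise TypeError("range bounds must be int or float")
    return {"$and": [{field: {"$gte": low}}, {field: {"$lt": high}}]}


where = range_filter("metadata_field", 5, 10)
print(where)
```

The resulting dictionary can be passed as the where argument, for example collection.query(query_texts=["This is a query document"], n_results=2, where=where).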
In¶
$in works on all data types: string, int, float, and bool.
The $in operator expects a list of values that all share the same type.
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"metadata_field": {"$in": ["value1", "value2"]}}
)
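Because $in expects a list whose values share one type, it can help to check the list before sending the query. The sketch below is an illustration only: in_filter is a hypothetical helper, not a Chroma function, and rejecting an empty list is a defensive assumption of this sketch rather than documented behavior.

```python
from typing import Any, Dict, Sequence

# Value types the text above lists as supported by $in.
ALLOWED_TYPES = (str, int, float, bool)


def in_filter(field: str, values: Sequence[Any]) -> Dict[str, Any]:
    """Build an $in where clause after checking the values (hypothetical helper)."""
    if not values:
        # Assumption: an empty list is treated as a caller error.
        raise ValueError("values must not be empty")
    kinds = {type(v) for v in values}
    if len(kinds) != 1:
        raise ValueError("all values must have the same type")
    if next(iter(kinds)) not in ALLOWED_TYPES:
        raise TypeError("values must be str, int, float, or bool")
    return {field: {"$in": list(values)}}


where = in_filter("metadata_field", ["value1", "value2"])
print(where)
```

Comparing exact types (type(v)) rather than using isinstance keeps a list such as [1, True] from slipping through, since bool is a subclass of int in Python.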
Not In¶
$nin works on all data types: string, int, float, and bool.
The $nin operator expects a list of values that all share the same type.
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"metadata_field": {"$nin": ["value1", "value2"]}}
)
Logical Operator: And¶
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"$and": [{"metadata_field1": "value1"}, {"metadata_field2": "value2"}]}
)
Logical Operators can be nested.
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"$and": [{"metadata_field1": "value1"}, {"$or": [{"metadata_field2": "value2"}, {"metadata_field3": "value3"}]}]}
)
Logical Operator: Or¶
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where={"$or": [{"metadata_field1": "value1"}, {"metadata_field2": "value2"}]}
)
Document Filters¶
Contains¶
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where_document={"$contains": "search_string"}
)
Not Contains¶
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where_document={"$not_contains": "search_string"}
)
Logical Operator: And¶
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where_document={"$and": [{"$contains": "search_string1"}, {"$contains": "search_string2"}]}
)
Logical Operators can be nested.
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where_document={"$and": [{"$contains": "search_string1"}, {"$or": [{"$not_contains": "search_string2"}, {"$not_contains": "search_string3"}]}]}
)
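When the search strings come from user input, the where_document clause can be assembled programmatically. This is a minimal sketch: contains_all is a hypothetical helper, not part of Chroma. It returns a plain $contains clause when only one term is given, on the assumption that some Chroma versions reject an $and list with fewer than two clauses.

```python
from typing import Any, Dict, Sequence


def contains_all(terms: Sequence[str]) -> Dict[str, Any]:
    """Build a where_document clause requiring every term (hypothetical helper)."""
    if not terms:
        raise ValueError("at least one search term is required")
    clauses = [{"$contains": term} for term in terms]
    if len(clauses) == 1:
        # Assumption: avoid emitting an $and that wraps a single clause.
        return clauses[0]
    return {"$and": clauses}


where_document = contains_all(["search_string1", "search_string2"])
print(where_document)
```

The result can be passed as where_document to Collection.query() or Collection.get().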
Logical Operator: Or¶
results = collection.query(
query_texts=["This is a query document"],
n_results=2,
where_document={"$or": [{"$not_contains": "search_string1"}, {"$not_contains": "search_string2"}]}
)
Pagination¶
Collection.get() accepts limit and offset arguments, which let users page through results.
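Paging through a collection with limit and offset can be sketched as a small generator. iter_pages is a hypothetical helper, not part of Chroma. It assumes only that the object passed in has a get(limit=..., offset=...) method returning a dictionary with an "ids" list, as Collection.get() does, and that results come back in a stable order between calls.

```python
from typing import Any, Dict, Iterator


def iter_pages(collection: Any, page_size: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield successive get() results until the collection is exhausted."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = collection.get(limit=page_size, offset=offset)
        count = len(page["ids"])
        if count == 0:
            return  # nothing left to read
        yield page
        if count < page_size:
            return  # a short page means this was the last one
        offset += count
```

A typical use would be a loop such as for page in iter_pages(collection, page_size=50), reading page["ids"] and the other fields of each result.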