Time-based Queries
Filtering Documents By Timestamps
In the example below, we create a collection of 100 documents, each tagged with a random timestamp from the last two weeks, stored in the date metadata field as integer epoch seconds. We then query the collection for documents whose timestamp falls within the last week.
The example demonstrates how Chroma metadata can be leveraged to filter documents based on how recently they were added or updated.
import datetime
import random
import uuid

import chromadb

# Generate 100 timestamps, each a random whole number of days (0 to 14) after two weeks ago.
now = datetime.datetime.now()
two_weeks_ago = now - datetime.timedelta(days=14)
dates = [
    two_weeks_ago + datetime.timedelta(days=random.randint(0, 14))
    for _ in range(100)
]
# Store timestamps as integer epoch seconds so they can be compared numerically.
dates = [int(date.timestamp()) for date in dates]


# Convert epoch seconds to ISO 8601 format.
def iso_date(epoch_seconds):
    return datetime.datetime.fromtimestamp(epoch_seconds).isoformat()


client = chromadb.EphemeralClient()
col = client.get_or_create_collection("test")
col.add(
    ids=[str(uuid.uuid4()) for _ in range(100)],
    documents=[f"document {i}" for i in range(100)],
    metadatas=[{"date": date} for date in dates],
)
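# Keep only documents dated within the last week. The cutoff is cast to int
# so that it has the same type as the stored metadata values.
one_week_ago = int((now - datetime.timedelta(days=7)).timestamp())
res = col.get(where={"date": {"$gt": one_week_ago}})
for metadata in res["metadatas"]:
    print(iso_date(metadata["date"]))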
Ref: https://gist.github.com/tazarov/3c9301d22ab863dca0b6fb1e5e3511b1
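The same kind of metadata filter can be narrowed to a bounded window, for example documents dated between two weeks ago and one week ago. The sketch below is an illustration only and is not part of the referenced gist: time_range_filter is a hypothetical helper that builds a where clause from Chroma's $and, $gte and $lt operators, assuming timestamps are stored as integer epoch seconds, as in the example above.

```python
import datetime


def time_range_filter(field, start, end):
    """Build a `where` filter matching start <= field < end.

    Hypothetical helper: assumes `field` holds integer epoch seconds.
    """
    return {
        "$and": [
            {field: {"$gte": int(start.timestamp())}},
            {field: {"$lt": int(end.timestamp())}},
        ]
    }


now = datetime.datetime.now()
where = time_range_filter(
    "date",
    now - datetime.timedelta(days=14),
    now - datetime.timedelta(days=7),
)
print(where)
```

The resulting dictionary can be passed to the collection from the example above as col.get(where=where). Building the filter in a separate function keeps the integer conversion in one place, so the filter bounds and the stored values always share a type.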