Simple and Powerful Semantic Computation
Palimpzest (PZ) enables developers to write simple, powerful programs which use semantic operators (i.e. LLMs) to perform computation.
The following code snippet sets up PZ and downloads a small dataset of emails:
# setup in the terminal
$ pip install palimpzest
$ export OPENAI_API_KEY="<your-api-key>"
$ wget https://palimpzest-workloads.s3.us-east-1.amazonaws.com/emails.zip
$ unzip emails.zip
The PZ program below will:
- compute the subject and date of each email
- filter for emails about vacations which are sent in July
import palimpzest as pz
emails = pz.Dataset("emails/")
emails = emails.sem_add_columns([
    {"name": "subject", "type": str, "desc": "the subject of the email"},
    {"name": "date", "type": str, "desc": "the date the email was sent"},
])
emails = emails.sem_filter("The email is about vacation")
emails = emails.sem_filter("The email was sent in July")
output = emails.run(max_quality=True)
print(output.to_df(cols=["filename", "date", "subject"]))
The program prints the matching emails:
   filename    date         subject
0  email4.txt  6 Jul 2001   Vacation plans
1  email5.txt  26 Jul 2001  Vacation Days in August
Key Features of PZ
There are a few features of this program which are worth highlighting:
- The programmer creates a pz.Dataset from the directory of emails and defines a series of semantic computations on that dataset:
  - sem_add_columns() specifies a set of fields which PZ must compute
  - sem_filter() selects for emails which satisfy the natural language filter
- The user does not specify how the computation should be performed -- they simply declare what they want PZ to compute
  - This is what makes PZ declarative
- Under the hood, PZ's optimizer determines the best way to execute each semantic operator
  - In this example, PZ optimizes for output quality because the user sets max_quality=True
- The output is not generated until the call to emails.run()
  - i.e., PZ uses lazy evaluation
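To make the lazy-evaluation point concrete, here is a minimal toy sketch in plain Python. It is not PZ's implementation: the LazyDataset class, its methods, and the sample records are all hypothetical and exist only to show the pattern of recording operations first and executing them when run() is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not PZ code)."""
    records: tuple
    steps: tuple = ()  # pending operations: recorded, not yet executed

    def filter(self, predicate):
        # Append the filter to the plan and return a new dataset; no work happens here.
        return LazyDataset(self.records, self.steps + (("filter", predicate),))

    def add_column(self, name, fn):
        # Append a derived-column step to the plan; again, nothing is computed yet.
        return LazyDataset(self.records, self.steps + (("add_column", (name, fn)),))

    def run(self):
        # Only now are the recorded steps applied, in order.
        rows = [dict(r) for r in self.records]
        for kind, arg in self.steps:
            if kind == "filter":
                rows = [r for r in rows if arg(r)]
            else:
                name, fn = arg
                rows = [{**r, name: fn(r)} for r in rows]
        return rows


calls = []  # records each time the predicate actually executes


def mentions_vacation(record):
    calls.append(record["filename"])
    return "vacation" in record["text"].lower()


# Hypothetical sample records, made up for this sketch.
emails = LazyDataset((
    {"filename": "a.txt", "text": "Vacation plans for July"},
    {"filename": "b.txt", "text": "Quarterly budget review"},
))

plan = emails.filter(mentions_vacation).add_column("length", lambda r: len(r["text"]))
calls_before_run = len(calls)  # the predicate has not been called yet
result = plan.run()            # the predicate executes here, once per record
```

Because the whole plan is known before anything executes, a system built this way has the chance to inspect and reorder the steps before running them, which is what gives an optimizer room to work.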
Declarative Optimization for AI
The core philosophy behind PZ is that programmers should simply specify the high-level logic of their AI programs while offloading much of the performance tuning to a powerful optimizer. Of course, users still have the ability to fully control their program, and can override and assist the optimizer (if needed) to get the best possible performance.
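As a rough illustration of what "offloading performance tuning to an optimizer" can mean, the toy sketch below picks one plan from a set of candidates according to a user-declared goal. This is an assumption-laden sketch, not PZ's optimizer: PlanEstimate, choose_plan, the min_cost flag, the plan names, and every number are hypothetical. Only max_quality appears in the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEstimate:
    """Hypothetical estimate for one way of executing a program."""
    name: str
    quality: float   # estimated output quality in [0, 1] (made-up scale)
    cost_usd: float  # estimated dollar cost (made-up numbers)


def choose_plan(candidates, max_quality=False, min_cost=False):
    """Pick a plan for the declared goal; the user never names a plan directly."""
    if max_quality == min_cost:
        raise ValueError("set exactly one of max_quality or min_cost")
    if max_quality:
        return max(candidates, key=lambda p: p.quality)
    return min(candidates, key=lambda p: p.cost_usd)


# Hypothetical candidates with invented estimates.
candidates = [
    PlanEstimate("small-model plan", quality=0.70, cost_usd=0.02),
    PlanEstimate("large-model plan", quality=0.90, cost_usd=0.40),
]

best = choose_plan(candidates, max_quality=True)
cheapest = choose_plan(candidates, min_cost=True)
print(best.name, "|", cheapest.name)
```

The design point is that the program text stays the same while the declared goal changes which plan is executed.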
This email processing example only showcases a small set of the semantic operators implemented in PZ. Other operators include:
- retrieve() which takes a vector database and a search string as input and retrieves the most relevant entries from the database
- add_columns() and filter() which are the non-semantic equivalents of sem_add_columns() and sem_filter()
- groupby(), count(), average(), limit(), and project() which mirror their implementations in frameworks like Pandas and Spark.
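To show the difference between a semantic and a non-semantic filter, the sketch below writes the "sent in July" condition as an ordinary Python predicate rather than a natural language instruction. How such a function is passed to PZ's filter() is not shown in this section, so the sketch applies it with a plain list comprehension instead. The first two records reuse the values from the output above; the third record and the sent_in_july helper are hypothetical.

```python
from datetime import datetime


def sent_in_july(record):
    """Plain-Python predicate: True if the record's date falls in July.

    Assumes dates look like "6 Jul 2001"; "%b" month names are locale-dependent
    and match English abbreviations in the default C locale.
    """
    sent = datetime.strptime(record["date"], "%d %b %Y")
    return sent.month == 7


records = [
    {"filename": "email4.txt", "date": "6 Jul 2001", "subject": "Vacation plans"},
    {"filename": "email5.txt", "date": "26 Jul 2001", "subject": "Vacation Days in August"},
    # Hypothetical record, made up to give the predicate something to reject.
    {"filename": "example.txt", "date": "3 Aug 2001", "subject": "Hypothetical record"},
]

july = [r for r in records if sent_in_july(r)]
print([r["filename"] for r in july])
```

A deterministic predicate like this needs no LLM call, but it only works once the date has been extracted into a structured field, which is the job sem_add_columns() performs in the email example.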
Join our community
We strongly encourage you to join our Discord server where we are happy to help you get started with PZ.
What's Next?
The rest of our Getting Started section will:
- Help you install PZ
- Explore more of PZ's features in our Quick Start Tutorial
- Give you an overview of our User Guides which discuss features of PZ in more depth