Braintrust Weekly Update

Ankur Goyal

09 October 2023

It’s been a busy week for us at Braintrust. Here’s some of the new features we shipped this week:

All experiment loading HTTP requests are 100-200ms faster
We released a new tutorial: finetune GPT3.5 to write SQL queries

You can easily finetune GPT3.5 to generate SQL queries using OpenAI and then evaluate how the fine tuned model compares to the base model using Braintrust. Check out the Jupyter Notebook example here to get started.

We evaluated the Alpaca evals leaderboard in Braintrust

The Alpaca evals use Claude and GPT4 to rank how different LLMs perform on a variety of tasks. You can see the aggregated rankings and also dig into individual models and better understand their strengths and weaknesses. Check out the Alpaca Evals braintrust project on Braintrust to dig in further—no login required.

We improved Datasets. See when they were last edited and the version number from the UI.

Easily see when a dataset was last changed from the UI by hovering over the ID. We also provide example code so you can quickly use the current dataset version in your project. Learn more on our datasets guide.

Release notes

All experiment loading HTTP requests are 100-200ms faster
The prompt playground now supports autocomplete
Dataset versions are now displayed on the datasets page
Projects in the summary page are now sorted alphabetically
Long text fields in logged data can be expanded into scrollable blocks

Braintrust is the enterprise-grade stack for building AI products. From evaluations, to prompt playground, to data management, we take uncertainty and tedium out of incorporating AI into your business.

Braintrust Weekly Update

We evaluated the Alpaca evals leaderboard in Braintrust

We improved Datasets. See when they were last edited and the version number from the UI.

Release notes

Ship AI with confidence