how we actually evaluate agents (health)

Over the last couple of months we’ve been working on a health agent. My role specifically was to deal with the messy answer to a tough question: “is it good? worthwhile? valuable?”


the entertainment is instagram reels (and tiktoks)

“The Entertainment is real, and it’s called Instagram Reels.”

how will we know the model did a good job?

A foundation model I’ve been working on was recently published in Nature.¹ I’ve wanted to write this for a while. Now that the paper is finally out, I have to do it in a timely manner, and I also have to start investing more thought in upcoming projects (some of which are similar). So what is this? I think the honest answer is something between a post-mortem of a successful project and an exploration of the future: an exploration of the question that lived in my mind while I was working on this project, “how do we know it’s actually worthwhile?” I think it comes up often in this kind of research work.

  1. A foundation model for continuous glucose monitoring data. I’m second author; Guy Lutsker led. For full text: rdcu.be/eY5fH 


data activation thoughts

The landscape has been shifting in recent years — it’s a cliche to start texts like this these days, but the fact that it’s a cliche doesn’t make it any less true.¹ In 2019, the folks at Andreessen Horowitz wrote this about data (in a piece titled The Empty Promise of Data Moats): “Instead of getting stronger, the defensible moat erodes as the data corpus grows and the competition races to catch up.” (Trying to prove that some data has value is something I’ve experienced firsthand.)

  1. Speaking of cliches — I’m aware this piece is full of em-dashes, which have become a telltale sign of AI-assisted writing. But as Nabeel Qureshi pointed out, David Foster Wallace was doing this decades ago. The italics for emphasis, the informality, the casual speech tone. I’ll keep my em-dashes. 


What's Sparse Thoughts?

For a while now I’ve been collecting too many things to read and think about, mostly in Twitter ‘saved links’. This is a place for me to quietly collect things I enjoyed reading, along with other content I’ve enjoyed, and sometimes jot down some thoughts about them. It’s built so that I won’t feel too committed: kind of under the radar, low friction, minimal effort.