Measuring AI in Ruby: Tracing, Evals, and the Cost of Hype

From Model Benchmarks to Business Tradeoffs — Where Ruby Fits in the AI Reality Check

Sep 23, 2025

What does it really mean to measure AI in production? In Episode 7 of The Ruby AI Podcast, we dive into tracing, evaluations, and the economic side of building with LLMs — and why Ruby might be the best place to separate hype from reality.

🎧 Measuring AI in Ruby: Tracing, Evals, and the Cost of Hype

Available now wherever you get your podcasts — including Apple Podcasts and Spotify.

This episode unpacks the messy but essential work of evaluating models, tracing requests, and managing cost tradeoffs in real AI systems. We compare big-name LLMs against cheaper options, explore why prompt strategies feel like compiled code, and share what it takes to keep Phoenix testing Rails applications at scale.

But the conversation goes wider. We get into why SaaS economics make AI different from past waves, why Ruby’s flexibility makes it ideal for experimentation, and how the community blends serious engineering with playful tinkering — from Ruby LLM to “quantum Ruby” experiments. Along the way, we shout out Scott Werner’s latest gems, VSM and AirBee, pushing the boundaries of self-building AI systems.

Show Notes

00:00 Introduction and Host Banter

00:40 Marketing and Positioning in AI

01:09 Top AI Researchers and Industry Insights

02:33 AI Contracts and Market Dynamics

04:51 Evaluating AI Models and Benchmarks

07:44 Challenges in AI Model Evaluation

18:45 Cost-Benefit Analysis in AI

26:01 Constrain Tool Outputs with Grammars

27:03 Introducing Preambles for Tool Calls

28:16 Reasoning Tokens and Multi-Turn Responses

28:55 Comparing AI Models: OpenAI vs. Anthropic

30:08 Ruby LLM Integration in AI Products

31:50 Evaluating Ruby LLM and Other Tools

35:13 The Importance of Modularity in AI Development

40:07 Ruby's Role in AI and Software Engineering

42:11 Innovative and Playful Ruby Projects

45:11 Scott Werner's AI Innovations

46:45 Concluding Thoughts on AI and Ruby

What's Next?

Episode 8: Go on… take a guess.

The next guest is one of the Ruby community’s sharpest voices. Think you know who it is? Drop your guess below. 💎

Who Else Should We Talk To?

Know someone doing innovative or inspiring work with Ruby and AI? Let us know, and we might feature them in an upcoming episode. Contact us at news@therubyaipodcast.com.

Let's continue to redefine what's possible with Ruby and AI!

💎 Valentino & Joe 💎
