Building reliable and proactive agentic systems at scale: how Shopify’s reflexive AI culture was instrumental in their development of Sidekick w/ Andrew McNamara #258
At Shopify, 95% of employees now use AI daily, creating a 'reflexive' culture where AI is embedded in every stage of product development—from prototyping to deployment. This cultural shift, powered by a commitment to hiring early talent and building at the lowest possible level, enabled the creation of Sidekick, an AI co-founder that acts as a business mentor for millions of merchants. The key to its success lies not in complex multi-agent architectures, but in a simple orchestration model with specialized sub-agents, like a fine-tuned GraphQL tool that outperforms Opus 4.7. Central to this system is a constantly evolving 'ground truth set' of human-rated conversations, which serves as the foundation for AI judges that enable auto-research, prompt optimization, and scalable evaluation. Sidekick Pulse, the proactive variant, uses background research and high-reasoning loops to deliver actionable insights—like fixing US shipping misconfigurations—without user input. Reliability is ensured through model-agnostic, cloud-agnostic redundancy and a strict 'show intelligence within a second' principle that balances deep thinking with responsiveness. The entire platform is built on one core mantra: keep building.
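The orchestration model described above, one orchestrator delegating to specialized sub-agents that hand back data rather than prose, might look roughly like the sketch below. It is an illustration under stated assumptions, not Shopify's implementation: `Orchestrator`, `graphql_subagent`, the keyword routing, and the canned query are all hypothetical names and behavior chosen so the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical sub-agent signature: takes a natural-language request and
# returns structured data (never prose) for the orchestrator to use.
SubAgent = Callable[[str], dict]


def graphql_subagent(request: str) -> dict:
    """Stand-in for a fine-tuned natural-language-to-GraphQL model (assumption).

    A real sub-agent would generate the query with a model; a canned query is
    returned here so the sketch is self-contained.
    """
    query = "{ orders(first: 5) { edges { node { id } } } }"
    return {"kind": "graphql", "query": query, "source_request": request}


@dataclass
class Orchestrator:
    """Single orchestrator that routes each request to at most one sub-agent."""

    subagents: Dict[str, SubAgent]

    def route(self, request: str) -> str:
        # Toy keyword routing; a production system would let a model decide.
        if "order" in request.lower():
            return "graphql"
        return "fallback"

    def handle(self, request: str) -> dict:
        name = self.route(request)
        agent = self.subagents.get(name)
        if agent is None:
            # No sub-agent registered for this route: report it as data too.
            return {"kind": "unhandled", "source_request": request}
        return agent(request)


orchestrator = Orchestrator(subagents={"graphql": graphql_subagent})
result = orchestrator.handle("Show my last five orders")
print(result["kind"], result["query"])
```

Because the sub-agent returns a dictionary instead of a sentence, the orchestrator stays the only component that writes natural language, which matches the summary's point about keeping sub-agents from hallucinating prose.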
Build tools at the lowest level possible—like a natural language to GraphQL converter—to unlock infinite use cases without adding hundreds of specialized tools.
A constantly evolving ground truth set with human-rated conversations enables AI judges that power auto-research, prompt optimization, and scalable evaluation.
Sidekick Pulse uses 40–60 minute background research loops to deliver proactive, high-reasoning insights—like fixing shipping misconfigurations—without user input.
Reliability for business-critical functions is achieved through model- and cloud-agnostic redundancy, automatic failover, and a 'show intelligence within a second' response principle.
Sub-agents like the GraphQL tool are fine-tuned for extreme specialization, outperforming general models like Opus 4.7, but only return data—not natural language—to prevent hallucination.
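The model- and cloud-agnostic failover mentioned in the takeaways can be pictured as an ordered list of interchangeable providers. The sketch below is a minimal illustration, assuming each provider is simply a callable that takes a prompt and returns text; `complete_with_failover`, `primary`, and `secondary` are hypothetical names, and the simulated outage stands in for a real provider error.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical provider type: a callable that takes a prompt and returns text.
Provider = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider raised an error."""


def complete_with_failover(
    prompt: str, providers: Sequence[Tuple[str, Provider]]
) -> Tuple[str, str]:
    """Try providers in order and return (provider_name, response)."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def primary(prompt: str) -> str:
    # Simulated outage of the preferred provider.
    raise TimeoutError("simulated outage")


def secondary(prompt: str) -> str:
    # Simulated healthy backup provider.
    return f"echo: {prompt}"


used, answer = complete_with_failover(
    "hello", [("primary", primary), ("secondary", secondary)]
)
print(used, answer)
```

Keeping the provider interface this narrow is what makes the caller indifferent to which model or cloud actually served the request.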
Sponsor: Unblocked’s Context Engine for Agentic Coding
Unblocked’s Dennis Pilarinos explains how a context engine gives AI agents organizational knowledge—like code conventions, decisions, and team norms—so they don’t waste tokens or time on inefficient, context-free reasoning.
The Reflexive AI Culture at Shopify
“Everyone at Shopify is high agency. The interns are coming in and they're also high agency doing incredible things.”
From Vision to Sidekick: The AI Co-Founder
“You have the comfort to ask it anything without worrying about judgment, without worrying about whether it's an embarrassing question or not.”
Simpler Architecture Wins: The Orchestration Model
“The key is actually that less is more. Like simple architecture has gone a long way.”
Evals Are the Foundation: The Ground Truth Set
“Evals is the most important thing for any system. And our goal is to build a system that on that ground truth set is indistinguishable from a human.”
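One way to check whether an AI judge tracks a human-rated ground truth set is to measure how often the two agree. The episode summary does not say which metric Shopify uses, so the sketch below swaps in two standard choices, raw agreement and Cohen's kappa (agreement corrected for chance). The labels and function names are hypothetical.

```python
from collections import Counter
from typing import Sequence


def agreement_rate(human: Sequence[str], judge: Sequence[str]) -> float:
    """Fraction of conversations where the judge's label matches the human's."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and the same length")
    return sum(h == j for h, j in zip(human, judge)) / len(human)


def cohens_kappa(human: Sequence[str], judge: Sequence[str]) -> float:
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(human)
    observed = agreement_rate(human, judge)
    human_counts = Counter(human)
    judge_counts = Counter(judge)
    labels = set(human_counts) | set(judge_counts)
    # Chance agreement: probability both raters pick the same label independently.
    expected = sum(human_counts[l] * judge_counts[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings for four conversations.
human_labels = ["good", "good", "bad", "bad"]
judge_labels = ["good", "bad", "bad", "bad"]

print(agreement_rate(human_labels, judge_labels))  # 0.75
print(cohens_kappa(human_labels, judge_labels))    # 0.5
```

A judge with high kappa against the human ratings is a plausible candidate for scaling evaluation beyond what people can rate by hand; a low kappa suggests the judge prompt or the rating guidelines need work first.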
“our GraphQL subagent is actually at the task of turning natural language into GraphQL. It is outperforming Opus 4.7. That's”
“Within a second, a second and a half. If that intelligence is being shown... like what's happening. I mean,”
Shopify (organization)
Sidekick (product)
Andrew McNamara (person)
Sidekick Pulse (product)
GraphQL (other)
Unblocked (organization)
Toby (person)
Dennis Pilarinos (person)
OpenClaw (product)
Farhan (person)
