D2DO301: Actually Implementing AI
Independent consultant Enrico Teodi shares 14 months of hands-on AI implementation experience at a software company, detailing how agentic workflows combining codebase access, production database replicas, and analytics tools dramatically accelerated debugging and product insights. He argues that curiosity and product understanding—not raw coding speed—determine who thrives in the AI era, and warns against giving AI excessive permissions or deploying code without proper testing and acceptance criteria.
Summary
Enrico Teodi, an independent consultant with 25 years of software experience spanning engineering and product management, joined the Day 2 DevOps podcast to share concrete, real-world AI implementation stories from a 13-14 month engagement at a software company. He was originally brought in to address software quality issues, including poor test coverage and brittle unit tests that gave false confidence without genuine integration testing.
His first major AI discovery was connecting tools like Windsurf to the full application codebase rather than just the database schema. This allowed the AI agent to reason about business logic encoded in enums and application code, not just table definitions—dramatically expanding the quality of questions it could answer. He then escalated this by connecting the agent to a read-only production database replica via Kubernetes port forwarding, enabling root cause analysis on real production data rather than local test data.
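The replica setup described above can be sketched in a few lines. This is a minimal illustration, not Enrico's actual configuration: the namespace, service name, ports, role, and database name are all assumptions. The key idea is that the agent's database tool connects only to localhost, so the tunnel—not the agent—decides what it can reach.

```python
# All names below (namespace, service, ports, role, db) are illustrative
# assumptions, not details from the episode.
NAMESPACE = "prod"
REPLICA_SVC = "svc/postgres-read-replica"
LOCAL_PORT, REMOTE_PORT = 5433, 5432

def port_forward_cmd(namespace: str, service: str,
                     local_port: int, remote_port: int) -> list[str]:
    """Build the kubectl command that tunnels a local port to the
    read-only replica inside the cluster."""
    return ["kubectl", "port-forward", "-n", namespace,
            service, f"{local_port}:{remote_port}"]

def replica_dsn(local_port: int, user: str = "readonly",
                db: str = "app") -> str:
    """Connection string for the agent's DB tool: it points at localhost,
    so it can only ever reach the forwarded replica, never the primary."""
    return f"postgresql://{user}@localhost:{local_port}/{db}"

# Open the tunnel in a separate terminal (long-running process):
#   kubectl port-forward -n prod svc/postgres-read-replica 5433:5432
print(" ".join(port_forward_cmd(NAMESPACE, REPLICA_SVC, LOCAL_PORT, REMOTE_PORT)))
print(replica_dsn(LOCAL_PORT))
```

With this in place, the agent is handed `replica_dsn(5433)` as its database endpoint and never sees the primary's address at all.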
Enrico described several concrete use cases. In one, the agent was debugging a record-ordering bug and initially concluded the ordering was correct, until it was redirected to find a missing ORDER BY clause in a polymorphic table. In another, he diagnosed a slow page load by combining MCP-connected tools—Sentry for performance monitoring and PostHog for analytics—only to discover that the slow feature had been clicked by just three people in four months, two of whom were likely developers testing it. This multi-tool agentic approach compressed what would have been hours of manual investigation into roughly five minutes.
He raised strong concerns about AI permissions and trust, explicitly stating he does not trust AI with write access to production systems. He advocated for read-only database access, isolated VMs for agentic tools like Claude Code, and tightly scoped permissions that expand incrementally. He warned that system prompts are ineffective guardrails and that AI makes poor decisions when given admin-level permissions.
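The read-only access he describes can be enforced at the database rather than in a system prompt. Below is a sketch of the kind of SELECT-only Postgres role that implements this; the role, database, and schema names are assumptions, not from the episode.

```python
# Generate the Postgres statements for a SELECT-only agent role.
# Role/database/schema names are illustrative assumptions.
def readonly_role_sql(role: str = "ai_agent_ro", db: str = "app",
                      schema: str = "public") -> list[str]:
    """SELECT-only grants: the agent can inspect data but never write."""
    return [
        f"CREATE ROLE {role} LOGIN PASSWORD 'change-me';",
        f"GRANT CONNECT ON DATABASE {db} TO {role};",
        f"GRANT USAGE ON SCHEMA {schema} TO {role};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {role};",
        # Cover tables created after the grant as well:
        f"ALTER DEFAULT PRIVILEGES IN SCHEMA {schema} "
        f"GRANT SELECT ON TABLES TO {role};",
    ]

for stmt in readonly_role_sql():
    print(stmt)
```

Unlike a system-prompt instruction ("never modify data"), these grants hold even if the model ignores its instructions—the database simply rejects any write.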
On the human side, Enrico argued that the developers who succeeded were not the fastest coders but those who understood the product, asked precise questions, and recognized when the AI was wrong. He expressed concern about 'AI slop'—code generated without clear acceptance criteria, proper testing, or human oversight. He recounted an analyst who burned roughly $800 in a few hours by bypassing the company's AI gateway with a direct API key, and another who ran Claude Code against a disconnected local environment for hours, using CI failures as its only feedback, and ultimately failed to deploy a working feature.
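The $800 incident illustrates his point that cost control needs to be architectural, not advisory. One hypothetical shape such a control could take—the gateway host name and environment-variable convention here are assumptions for illustration—is a client wrapper that refuses to run unless traffic is routed through the metered gateway:

```python
import os

# Hypothetical guard: block direct API keys and require traffic to flow
# through the company gateway, where spend is metered and capped.
# The host name and env-var convention are assumptions.
GATEWAY_HOST = "ai-gateway.internal.example.com"

def resolve_base_url() -> str:
    """Return the API base URL, failing fast if it bypasses the gateway."""
    url = os.environ.get("OPENAI_BASE_URL", "")
    if GATEWAY_HOST not in url:
        raise RuntimeError(
            "Direct API access blocked: point OPENAI_BASE_URL at "
            f"https://{GATEWAY_HOST}/v1 so budget caps apply."
        )
    return url
```

A check like this catches the bypass at startup instead of on next month's invoice; pairing it with per-user spend caps at the gateway closes the loop.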
He framed the current moment as requiring tighter product-engineering collaboration, stronger definitions of 'done,' and test-driven development as a non-negotiable foundation. He compared the current AI shift to the rollout of Microsoft Office—a moment that democratized powerful tools but required new literacy to prevent catastrophic misuse. He also expressed optimism about local models eventually reducing dependence on expensive third-party frontier model services, viewing AI as simply the next abstraction layer above previous programming paradigms.
Key Insights
- Enrico found that connecting an AI agent to the full application codebase—not just the database schema—allowed it to reason about business logic encoded in enums and application code, which he described as the first moment that genuinely astonished him about AI's power.
- Enrico argues that combining a read-only production database replica with codebase access and analytics MCPs allowed him to compress multi-hour debugging investigations into approximately five minutes, citing a specific case where a slow page feature turned out to have been used by only three people in four months.
- Enrico explicitly states he does not trust AI with write permissions, advocating for incrementally scoped read-only access and warning that system prompts are 'a wet paper bag of a guardrail' that cannot reliably prevent AI from making destructive decisions.
- Enrico observed that the development team's existing unit tests were so brittle and low-level that they provided false confidence—tests passed in CI but features broke in production—and he argues that meaningful integration tests are now more critical than ever given the speed at which AI generates code.
- Enrico claims that the developers who succeeded during his engagement were not the fastest coders but those who understood the product, asked precise questions, and knew when the AI was wrong—a conclusion he drew directly from observing team performance over 14 months.
- Enrico recounted an analyst who burned approximately $800 in a few hours by bypassing the company's AI gateway and using a direct OpenAI API key for local testing, illustrating that AI cost management requires proactive architectural controls, not just awareness.
- Enrico argues that the increased speed of AI-assisted development makes tighter product-engineering feedback cycles more necessary, not less—comparing the ideal to Pivotal Labs' practice of having domain experts pair-programming on-site daily to shorten iteration loops.
- Enrico contends that people who will be displaced by AI are not necessarily those in technical roles, but those who lack curiosity and resist adapting—drawing an analogy to gas station attendants and supermarket cashiers, roles eliminated not by malice but by indifference to evolving context.