Architecture Decision Records: Mom Test Customer Validation
ADR-001: Adopt Behavior-First Questioning as Default Interview Protocol
Status
Accepted
Context
Product teams need a reliable method to distinguish genuine customer demand from polite agreement. Traditional interview approaches ("What do you think of my idea?", "Would you use this?") produce misleading signal because humans are socially conditioned to be supportive. This false positive data is more dangerous than no data, as it creates confidence in the wrong direction.
Decision
Adopt the Mom Test's three rules as the mandatory protocol for all customer discovery conversations:
- Discuss the customer's life and existing behavior, not our idea
- Ask about specific past events, not hypothetical futures
- Maintain a listen-to-talk ratio of at least 70/30
All team members conducting customer conversations must demonstrate proficiency in these rules before conducting interviews independently.
Consequences
- Positive: Dramatically improved signal quality; reduced risk of building features nobody wants; team decisions become evidence-based.
- Negative: Initial learning curve; founders must suppress the instinct to pitch; some team members may resist the discipline.
- Neutral: Does not replace quantitative validation — must be paired with surveys, analytics, and market sizing.
Alternatives Considered
- Survey-first approach: Rejected because surveys at early stage lack the depth to uncover real behavior patterns and are easily gamed by social desirability bias.
- Pitch-and-observe approach: Rejected because pitching during discovery contaminates responses with social pressure.
- Jobs-to-Be-Done interviews only: Considered but determined to be complementary rather than a replacement. JTBD provides the strategic framework; Mom Test provides the tactical conversation discipline.
ADR-002: Classify All Interview Data into Three Reliability Tiers
Status
Accepted
Context
Customer conversations produce a mix of reliable and unreliable data. Teams frequently treat all data equally — giving the same weight to an enthusiastic "I love it!" (unreliable) as to "I spent $500 last month trying to solve this" (highly reliable). Without a classification system, dangerous noise gets treated as signal.
Decision
Implement a three-tier data classification system applied to all interview notes:
- Tier 1 — Behavioral Facts: Past actions, workflows, money spent, tools tried, time invested. High reliability. Base decisions on this.
- Tier 2 — Contextual Signals: Emotional reactions, frustration indicators, engagement level. Medium reliability. Use as supporting evidence.
- Tier 3 — Noise: Compliments, hypothetical statements ("I would..."), feature wishlists, generic claims ("I always..."). Low reliability. Acknowledge and discard from decision-making.
Notetakers must tag all captured data with its tier during or immediately after conversations.
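The tiers and the tagging rule can be sketched as a minimal data structure. This is an illustrative sketch, not a prescribed implementation: the type names, the manual tagging, and the filter function are all assumptions about how a team might encode the policy.

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    """Reliability tiers from ADR-002."""
    BEHAVIORAL_FACT = 1    # past actions, money spent, tools tried
    CONTEXTUAL_SIGNAL = 2  # emotional reactions, frustration, engagement
    NOISE = 3              # compliments, hypotheticals, wishlists

@dataclass
class Note:
    """One captured data point, tagged by the notetaker."""
    quote: str
    tier: Tier

def decision_inputs(notes: list[Note]) -> list[Note]:
    """Only Tier 1 behavioral facts feed product decisions;
    Tier 2 is kept as supporting evidence, Tier 3 is discarded."""
    return [n for n in notes if n.tier is Tier.BEHAVIORAL_FACT]

notes = [
    Note("I spent $500 last month trying to solve this", Tier.BEHAVIORAL_FACT),
    Note("I love it!", Tier.NOISE),
    Note("Visibly frustrated describing the export step", Tier.CONTEXTUAL_SIGNAL),
]
facts = decision_inputs(notes)
```

Tagging happens at capture time (per the decision above); the filter is applied when the team synthesizes findings.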
Consequences
- Positive: Prevents dangerous noise from influencing product decisions; creates a shared vocabulary for data quality across the team.
- Negative: Requires discipline to discard flattering feedback; may feel counterintuitive to dismiss positive signals.
- Neutral: Tier 2 data remains subjective and requires judgment in application.
Alternatives Considered
- No classification (treat all data equally): Rejected because this is the primary failure mode the framework is designed to prevent.
- Binary classification (signal/noise): Rejected because it loses the nuance of Tier 2 contextual signals that have legitimate supporting value.
ADR-003: Require Commitment Escalation in Every Conversation
Status
Accepted
Context
A common failure pattern in customer discovery is accumulating "zombie leads" — prospects who express enthusiasm in conversations but never take any concrete action. Without a mechanism to test commitment, teams mistake politeness for demand. The Mom Test emphasizes that "actions speak louder than words," but teams need a structured way to measure this.
Decision
Every customer conversation must end with a commitment ask — a request for a concrete next step that costs the participant something (time, reputation, or money). The commitment ladder is:
- Time: Follow-up meeting scheduled
- Reputation: Introduction to colleague or decision-maker
- Effort: Agreement to test a prototype or provide detailed feedback
- Money: Pre-order, deposit, or paid pilot
- Contract: Letter of intent or formal agreement
If a prospect fails to advance past the first rung (Time) after 2–3 touchpoints, classify them as a zombie lead and deprioritize.
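The ladder and the zombie-lead rule can be encoded as an ordered enum. A minimal sketch, assuming a three-touchpoint cutoff (the decision allows 2–3; the exact threshold and all names here are illustrative):

```python
from enum import IntEnum

class Commitment(IntEnum):
    """ADR-003 commitment ladder, ordered by cost to the prospect."""
    NONE = 0
    TIME = 1        # follow-up meeting scheduled
    REPUTATION = 2  # introduction to colleague or decision-maker
    EFFORT = 3      # prototype test or detailed feedback
    MONEY = 4       # pre-order, deposit, or paid pilot
    CONTRACT = 5    # letter of intent or formal agreement

def is_zombie_lead(max_commitment: Commitment, touchpoints: int) -> bool:
    """A prospect stuck at or below the Time rung after several
    touchpoints is a zombie lead and should be deprioritized.
    The threshold of 3 is an assumption within the 2-3 range."""
    return touchpoints >= 3 and max_commitment <= Commitment.TIME
```

Using `IntEnum` makes the ordering explicit, so "advancing the ladder" is a simple comparison.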
Consequences
- Positive: Provides an objective measure of genuine interest; prevents accumulation of false pipeline; accelerates learning about real vs. perceived demand.
- Negative: Some team members may feel uncomfortable "asking for things" during research conversations; requires coaching on how to ask naturally.
- Neutral: Some valuable conversations may not lend themselves to commitment asks (e.g., purely exploratory industry research). Use judgment.
Alternatives Considered
- NPS-style scoring of conversations: Rejected because NPS measures satisfaction, not commitment. A satisfied non-buyer is still a non-buyer.
- No structured commitment tracking: Rejected because this is the pattern that produces zombie leads.
ADR-004: Mandate Full Team Participation in Customer Learning
Status
Accepted
Context
A known anti-pattern in customer discovery is the "learning bottleneck" — one person (typically the CEO, product manager, or "business person") conducts all customer conversations and then tells the rest of the team what to build. This creates an information asymmetry that can be wielded as political leverage ("the customer told me...") rather than shared understanding. It also means customer insights are filtered through one person's interpretation and biases.
Decision
All core team members (founders, product, engineering leads, design leads) must:
- Participate in at least 2 customer conversations per month (as interviewer or notetaker)
- Attend 100% of post-batch team review sessions
- Have access to the full conversation notes repository
No single person may be the exclusive interpreter of customer data for the team.
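The participation floor above can be expressed as a simple monthly compliance check. A sketch under stated assumptions: the function name, inputs, and the idea of checking per calendar month are all hypothetical, not part of the decision.

```python
def meets_participation_policy(conversations_this_month: int,
                               reviews_attended: int,
                               reviews_held: int) -> bool:
    """ADR-004 floor for core team members: at least 2 customer
    conversations per month (as interviewer or notetaker) and
    attendance at every post-batch review session held."""
    return (conversations_this_month >= 2
            and reviews_attended == reviews_held)
```

Repository access is a standing grant rather than a monthly metric, so it is not modeled here.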
Consequences
- Positive: Eliminates the learning bottleneck; creates empathy across the entire team; engineers who hear customer pain directly build with more context; reduces "telephone game" distortion of customer insights.
- Negative: Time investment for non-customer-facing team members; scheduling complexity.
- Neutral: Different team members will interpret the same conversation differently — this is a feature, not a bug, as it surfaces blind spots.
Alternatives Considered
- Dedicated research team only: Rejected for early-stage teams where shared context is critical. Acceptable for enterprise organizations if paired with mandatory cross-team insight sessions.
- Written summaries instead of participation: Rejected because summaries lose nuance, emotional signals, and the educational effect of hearing customers directly.
ADR-005: Phase AI Interview Tools After Manual Foundation
Status
Accepted
Context
AI-powered interview platforms (Prelaunch, Marvin, Cusmos) emerged in 2024–2025, offering the ability to conduct Mom Test-aligned conversations at scale. The temptation for time-pressured teams is to skip manual conversations entirely and rely on AI from day one. However, practitioner consensus and platform guidance both indicate that AI scales execution but doesn't teach the mindset. Teams that skip manual conversations tend to design poor research goals, ask shallow questions, and misinterpret AI-generated insights.
Decision
Adopt a three-phase approach:
- Phase 1 — Manual (Conversations 1–15): All conversations conducted by humans, in person or via video. Team learns the methodology, develops intuition, and internalizes the three rules.
- Phase 2 — AI-Assisted (Conversations 16–50): Use AI tools for transcription, note analysis, and question refinement. Continue conducting conversations manually.
- Phase 3 — AI-Scaled (Conversations 50+): Deploy AI interview platforms for parallel segment validation. Human team focuses on high-stakes conversations, commitment escalation, and relationship management.
No team may deploy AI interview platforms without completing Phase 1.
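The phase gating reduces to a lookup on cumulative conversation count. A minimal sketch (the function name and string labels are illustrative; per the consequences below, the boundaries are guidelines, not rigid gates):

```python
def research_phase(conversations_completed: int) -> str:
    """Map cumulative conversation count to the ADR-005 phase.
    Boundaries at 15 and 50 follow the decision text; treat them
    as guidelines rather than hard gates."""
    if conversations_completed < 15:
        return "Phase 1 - Manual"
    if conversations_completed < 50:
        return "Phase 2 - AI-Assisted"
    return "Phase 3 - AI-Scaled"
```

A team dashboard could surface this alongside the conversation log to make the "no AI platforms before Phase 1 is complete" rule visible.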
Consequences
- Positive: Ensures foundational methodology is internalized before scaling; prevents a garbage-in/garbage-out problem with AI tools.
- Negative: Slower initial throughput compared to immediate AI deployment; may frustrate teams under time pressure.
- Neutral: Phase boundaries are guidelines, not rigid gates; experienced teams with prior Mom Test practice may compress Phase 1.
Alternatives Considered
- AI-first from day one: Rejected because AI tools require well-formed research goals that teams can only develop through manual practice.
- Manual-only (no AI ever): Rejected because AI tools offer genuine scaling value once foundations are solid, and the 2025–2026 tooling has reached sufficient maturity.
- Parallel manual + AI from the start: Rejected because managing two conversation streams simultaneously while learning the methodology creates cognitive overload.
These ADRs document the foundational decisions for adopting and operationalizing the Mom Test methodology within a product organization. Each decision is reversible — revisit as the team's maturity and tooling landscape evolve.