Streaming Health Diagnostic, v2 Methodology
Created: 2026-05-11
Authors: Nicolás Borja, with Sergio Uzaheta
Working title: FAST + CTV Health Diagnostic
Status: Draft for Stephen Brooks review
Supersedes: v0.1 prototype at diagnostic.html (5 questions, 4 pillars, gut-feel mapping)
0. What changed from v0.1 and why
v0.1 was a five-question demo built to win a meeting. It scored gut-feel against unsourced benchmarks (a widely repeated "LatAm 10% / US 60%" framing for FAST monetization), produced one of four tier labels, and pushed a CTA. It was useful as a credibility prop. It is not yet a diagnostic instrument that an operator would pay to take, and it is not yet a research study that the industry would cite.
Note on the 10% / 60% claim (added 2026-05-12, post-research): the specific numeric pairing has now been pressure-tested against primary sources (S&P Global, Dataxis, Ampere, Comscore, FreeWheel, eMarketer) and is unsubstantiated. See Section 7 defect 3 below and research/2026-05-12-fast-market-sizing.md Section 3 for the sourced refutation. The directional gap (LatAm under-monetizes vs US) holds, but is grounded in programmatic-buying share asymmetry (LatAm approximately 32% programmatic share vs US 90%+), not in a single fill-rate pairing.
v2 reframes the work as two coupled products:
- The operator-facing diagnostic. An 18 to 22 item instrument that produces a score, a radar across five pillars, a segment-relative percentile, and a 90-day action plan. Free entry tier, paid full-report tier.
- The industry benchmark study. The aggregated, anonymized response dataset becomes a quarterly research output on the FAST and CTV LatAm operator base. This is the asset that justifies investment, because once the cohort is recruited and fielded, the marginal cost of each subsequent operator scored falls toward zero while the value of the benchmark dataset compounds.
The instrument and the study share infrastructure. The instrument generates the data. The study sells the cohort. Both reinforce Stephen's positioning as the operator who has the only longitudinal view of LatAm FAST and CTV health.
1. Pillar architecture
Five pillars, mapped to Stephen's operator pillar language and to the categories an operator can actually move in a 90-day plan.
1.1 Distribution
Where the content lives and how reachable the audience is. Sub-dimensions:
- Platform footprint (number and quality of FAST aggregators, AVOD apps, SVOD partners, OEM placements, social syndication).
- Geographic reach (markets active, multilingual readiness, regional rights granted vs held back).
- Discovery surface (EPG presence, recommendation placement, search visibility, partner-merchandised positions).
1.2 Monetization
What the inventory pays back per impression and per viewer. Sub-dimensions:
- Fill rate (overall, by daypart, by daypart-and-region).
- Yield (CPM bands by programmatic vs direct, by region, by content vertical).
- Ad-stack maturity (SSAI vs CSAI, header bidding, SSP coverage, deal IDs, direct-sold premium positions).
- Monetization model mix (AVOD pure, FAST, hybrid, transactional, sponsorship overlay).
1.3 Marketing
Stephen's pillar. Decomposes into his three sub-elements. Sub-dimensions:
- Getting Known (channel-level CAC, brand search volume, share of voice in category, awareness measurement).
- Partner Leverage (cross-promotion volume, co-marketing deals active, partner-driven acquisition share, syndication-as-marketing).
- LTV (granularity, channel-level cohort LTV, churn or attrition signal, LTV vs CAC ratio per channel).
1.4 Curation
How the catalog is built, scheduled, and refreshed. Sub-dimensions:
- Catalog depth and breadth (hours by genre, format mix across FAST, AVOD, SVOD, live).
- Scheduling and programming sophistication (block design, daypart strategy, refresh cadence, ad-pod placement quality).
- Rights and licensing health (windowed vs perpetual, multi-region clearance, music sync clearance for FAST).
- Content production cadence (originals vs licensed vs UGC vs AI-assisted).
1.5 Tech
The operational stack that the other four pillars sit on. Sub-dimensions:
- Delivery stack (CDN, encoding pipeline, packaging, DRM if applicable).
- Ad stack technical health (SCTE-35 marker accuracy, SSAI integration quality, VAST tag latency, error rate).
- Data stack (analytics coverage, attribution, content engagement signal, A/B testing capability).
- AI and automation readiness (pipeline tooling, multilingual workflow, content factory maturity).
2. Segmentation: operator type as the first question
The single biggest defect in v0.1 is that it scored a FAST channel owner against the same benchmarks as a full streaming platform. They are different businesses. v2 starts with a typing question that determines which subset of questions is asked and which benchmark cohort the operator is graded against.
The five operator archetypes:
| Archetype | Description | Examples |
|---|---|---|
| A. FAST channel owner | Operates one or more single-genre or single-brand FAST channels distributed via aggregators. Owns the channel brand, licenses or produces content. | ViX Channels, music-genre FAST, retro-sports FAST |
| B. FAST aggregator / virtual MVPD | Owns the surface where multiple FAST channels are aggregated. Runs the EPG and the ad stack at the aggregator level. | Pluto, Samsung TV Plus, ViX, regional FAST aggregators |
| C. AVOD or SVOD streaming platform | Owns the content app, the viewer relationship, the catalog, and the subscription or ad business. | ViX, Tubi, ViX Premium, Pluto on-demand, regional AVOD apps |
| D. CTV network operator or smart-TV OEM | Owns the device, the home screen, and the ad surface. Aggregates content for the device audience. | Samsung TV Plus operator side, LG Channels, Roku Channel operator side |
| E. Content licensor or independent producer | Owns the content rights and licenses into FAST, AVOD, or SVOD distribution. | Music labels, sports leagues, IP holders, indie producers |
Operator type is a hard segmentation. A score of 80 means different things in archetype A than in archetype C, and the operator needs the benchmark to be peer-relative to take the diagnostic seriously.
A secondary typing question captures scale (annual ad revenue band) so the percentile is also size-adjusted within archetype.
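A minimal sketch of the segmentation data model, in TypeScript since the surface ships as a Next.js page (Section 10). Type names and the revenue bands are illustrative assumptions, not a locked schema:

```typescript
// Hard segmentation captured before any scored item is shown. Every later
// lookup (anchors, pillar weights, playbooks) is keyed on this pair.

type Archetype =
  | "A" // FAST channel owner
  | "B" // FAST aggregator / virtual MVPD
  | "C" // AVOD or SVOD streaming platform
  | "D" // CTV network operator or smart-TV OEM
  | "E"; // content licensor or independent producer

// Hypothetical revenue bands; the actual bands are not yet specified.
type RevenueBand = "under-1m" | "1m-10m" | "10m-50m" | "over-50m";

interface OperatorSegment {
  archetype: Archetype;
  revenueBand: RevenueBand; // size-adjusts the percentile within archetype
}
```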
3. Item structure: 18 to 22 questions
v0.1 had five questions. That gave a thin signal that operators could game in 30 seconds. v2 uses 18 to 22 items, distributed across the five pillars and weighted by pillar contribution to outcome. The instrument should take 12 to 18 minutes to complete, which is long enough to deter casual gaming and short enough to finish in a single sitting.
Each item is one of four types:
- Objective numeric (slider or input with a benchmarked anchor). Example: ad inventory fill rate percentage.
- Categorical (single-select with ordered options mapped to a score). Example: SSAI vs CSAI vs both vs unsure.
- Multi-select (chips, scored by combination quality not raw count). Example: which distribution surfaces are active.
- Self-report confidence (Likert 1 to 5 on a stated operational practice). Example: "We measure LTV by acquisition channel with granularity."
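As a sketch of how the four item types could be represented in code (field names are illustrative, not the production schema):

```typescript
// Discriminated union over the four item types. Scoring logic can switch on
// `kind` without special-casing individual items.

type Item =
  | { kind: "numeric"; id: string; min: number; max: number; logScale?: boolean }
  | { kind: "categorical"; id: string; options: { label: string; score: number }[] }
  | { kind: "multiSelect"; id: string; chips: string[]; scoreSelection: (selected: string[]) => number }
  | { kind: "likert"; id: string; statement: string }; // self-report, 1 to 5
```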
The instrument deliberately includes two "honesty traps" embedded in the marketing pillar: questions where the wrong answer reveals an instrumentation gap the operator may not realize they have. This is borrowed from the Sound Check question bank pattern (see project_soundcheck_question_bank.md in MEMORY). It is what makes the score more than a self-flattery exercise.
3.1 Item bank (draft, by pillar)
Distribution (4 items)
- D1, Platform footprint. How many distinct distribution surfaces is your content on right now? (Slider 0 to 20, with sub-prompt to enumerate types: FAST aggregator, AVOD app, SVOD partner, OEM placement, social syndication, transactional.)
- D2, Footprint quality. Of those platforms, how many drive measurable watch-time, defined as more than 1% of your monthly total hours streamed? (Slider, validates D1 against vanity coverage.)
- D3, Geographic reach. Which of these regions does your operation actively distribute in today? (Multi-select: US Hispanic, Mexico, CAM, Andean, Southern Cone, Brazil, Iberia, US general market, Canada, ROW.)
- D4, Discovery posture. When a viewer searches your brand or genre on Pluto, Samsung TV Plus, or a smart-TV home screen, how reliably does your content appear in the top three results? (Likert 1 to 5, with honesty trap: "We don't measure this" is a valid option that scores low on instrumentation.)
Monetization (5 items)
- M1, Fill rate. What is your overall ad fill rate across all distribution surfaces, last 30 days? (Slider 0 to 100%, anchored with cohort benchmark.)
- M2, Fill rate variance. How does your fill rate look in your weakest daypart compared to your strongest daypart? (Categorical: less than 10 point gap, 10 to 25 point gap, 25 to 50 point gap, more than 50 point gap, we do not measure by daypart.)
- M3, Ad stack maturity. Which of these is true of your current ad stack? (Multi-select: SSAI active, CSAI active, header bidding implemented, three or more SSPs integrated, direct-sold premium inventory exists, deal IDs active.)
- M4, Revenue mix. What percentage of your inventory revenue comes from programmatic vs direct-sold? (Slider showing 0 to 100% programmatic; operator places the marker.)
- M5, Yield optimization cadence. How often does your team review SSP performance and rebalance the waterfall? (Categorical: weekly, monthly, quarterly, ad hoc, never.)
Marketing (5 items, decomposed into Stephen's three sub-pillars)
- MK1, Getting Known: brand awareness measurement. Do you measure brand search volume or unaided brand awareness in your priority markets? (Categorical: yes, both signals, monthly; yes, one signal, monthly; yes, ad hoc; no.)
- MK2, Getting Known: channel mix. What drove most of your new viewers in the last 90 days? (Single select with five options as in v0.1, plus honesty trap "I'm not sure".)
- MK3, Partner Leverage: cross-promotion activity. How many active cross-promotion or co-marketing partnerships do you operate today? (Slider 0 to 20.)
- MK4, LTV granularity. Do you measure viewer LTV by acquisition channel? (Single select: yes granular per channel, yes aggregate, on roadmap, no, what is LTV.)
- MK5, LTV vs CAC ratio. Of your active acquisition channels, what share of them have a known LTV-to-CAC ratio? (Slider 0 to 100%; honesty trap, scored against MK4.)
Curation (4 items)
- C1, Format mix. Which formats are active in your catalog right now? (Multi-select: FAST 24/7 linear, AVOD on-demand, SVOD on-demand, live events, short-form clip library, long-form licensed.)
- C2, Catalog depth. How many hours of content do you have programmable across all surfaces? (Slider 0 to 10,000 hours, log scale.)
- C3, Programming sophistication. Who designs your FAST programming blocks? (Categorical: dedicated programmer or programming team, generalist operator with rotation rules, vendor or third party, the aggregator runs it for us, we run shuffle-play, we do not run FAST.)
- C4, Rights and licensing health. What share of your catalog has multi-region distribution rights cleared right now? (Slider 0 to 100%, with secondary question on music sync rights for FAST if applicable.)
Tech (3 to 4 items)
- T1, Ad-stack technical health. When did you last audit your SCTE-35 marker accuracy or your SSAI integration error rate? (Categorical: this quarter, this year, more than a year ago, we don't audit, we don't know what SCTE-35 is.)
- T2, Analytics coverage. Can you answer this in under 5 minutes: "Which content drove the most watch-time on platform X yesterday by region?" (Categorical: yes for any platform; yes for major platforms only; no but the data exists in our stack; no.)
- T3, AI and automation readiness. Which of these AI-enabled workflows do you operate today? (Multi-select: automated EPG metadata generation; AI thumbnail variants; AI translation or dubbing; AI content moderation; AI ad-break optimization; none of the above.)
- T4, Tech debt signal. What share of your engineering team's time in the last quarter went to incident response or technical debt rather than feature work? (Slider 0 to 100%, with anchor.)
Item count totals: D=4, M=5, MK=5, C=4, T=3 to 4. Total 21 to 22 items. Final item count locked after Stephen review. The cross-item checks named above (D2 against D1, MK5 against MK4) are sketched below.
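A minimal sketch of those two checks, assuming responses are keyed by item ID (shapes are assumptions):

```typescript
// D2 (surfaces with real watch-time) can never exceed D1 (total surfaces),
// and MK5 (share of channels with a known LTV-to-CAC ratio) is suspect when
// MK4 says LTV is not measured per channel.

interface CrossCheckedResponses {
  D1: number; // distribution surfaces active, 0 to 20
  D2: number; // surfaces driving >1% of monthly hours streamed
  MK4: "granular" | "aggregate" | "roadmap" | "no" | "unknown";
  MK5: number; // % of channels with a known LTV-to-CAC ratio
}

function consistencyFlags(r: CrossCheckedResponses): string[] {
  const flags: string[] = [];
  if (r.D2 > r.D1) {
    flags.push("D2 exceeds D1: footprint answers are inconsistent");
  }
  if (r.MK4 !== "granular" && r.MK5 > 0) {
    flags.push("MK5 claims LTV-to-CAC coverage without granular LTV in MK4");
  }
  return flags;
}
```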
4. Scoring methodology
4.1 Item-level scoring
Each item is scored 0 to 100 against a benchmark anchor. For sliders, the anchor is the cohort median for the operator's archetype. For categoricals, options map to specific scores derived from category-level outcome correlation in the benchmark cohort.
Three scoring modes exist depending on data availability:
- Mode A, anchored (full study cohort live): Each item is scored against the actual percentile in the cohort of operators of the same archetype.
- Mode B, expert-anchored (pre-launch state): Each item is scored against expert-set anchors, calibrated by Stephen, Sergio, and three reference operators we interview to set the bands. This is the launch state.
- Mode C, hybrid (partial cohort): Items where we have more than 20 cohort responses use Mode A, items below 20 use Mode B. This is the operating state through Q3.
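A minimal sketch of the Mode C routing rule under the 20-response threshold, assuming per-item cohort values are available for the operator's archetype (`expertScore` stands in for the Mode B anchor bands):

```typescript
// Mode C: anchor against the live cohort when an item has at least 20
// responses from the same archetype; fall back to expert anchors otherwise.

const MODE_A_THRESHOLD = 20;

function scoreItem(
  value: number,
  cohortValues: number[],             // same-archetype responses for this item
  expertScore: (v: number) => number, // Mode B anchor bands, returns 0 to 100
): number {
  if (cohortValues.length >= MODE_A_THRESHOLD) {
    // Mode A: percentile of the operator's value within the cohort.
    const below = cohortValues.filter((v) => v < value).length;
    return Math.round((below / cohortValues.length) * 100);
  }
  return expertScore(value); // Mode B
}
```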
4.2 Pillar-level scoring
Each pillar score is a weighted average of its items, with item weights tuned per archetype. The weighting is derived from two sources: expert judgment on which items matter most for each archetype, and (once the cohort is live) regression of items against outcome variables (revenue per viewer-hour, retention proxies, fill-rate uplift).
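As a sketch, the roll-up is a plain weighted mean with the weight vector swapped per archetype (weights are not assumed pre-normalized):

```typescript
// Pillar score as a weighted average of its 0-100 item scores.

function pillarScore(itemScores: number[], itemWeights: number[]): number {
  const totalWeight = itemWeights.reduce((sum, w) => sum + w, 0);
  if (totalWeight === 0) return 0;
  const weighted = itemScores.reduce((sum, s, i) => sum + s * itemWeights[i], 0);
  return weighted / totalWeight;
}
```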
4.3 Total-score weighting
Pillar weights vary by archetype. The v0.1 weighting (25 distribution, 30 monetization, 30 marketing, 15 curation) overweights monetization for content licensors who do not run the ad stack and underweights distribution for FAST channel owners. v2 publishes per-archetype weights:
| Archetype | Distribution | Monetization | Marketing | Curation | Tech |
|---|---|---|---|---|---|
| A. FAST channel owner | 25 | 25 | 25 | 15 | 10 |
| B. FAST aggregator | 15 | 35 | 20 | 15 | 15 |
| C. AVOD or SVOD platform | 20 | 25 | 25 | 15 | 15 |
| D. CTV / OEM operator | 20 | 30 | 15 | 15 | 20 |
| E. Content licensor | 30 | 10 | 25 | 25 | 10 |
Weights are draft. Stephen review will sharpen these before launch.
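The draft table translates directly into a lookup keyed on archetype; a sketch (letter keys match the table rows):

```typescript
type Archetype = "A" | "B" | "C" | "D" | "E";
type Pillar = "distribution" | "monetization" | "marketing" | "curation" | "tech";

// Draft Section 4.3 weights. Each row sums to 100, so the total stays 0-100.
const PILLAR_WEIGHTS: Record<Archetype, Record<Pillar, number>> = {
  A: { distribution: 25, monetization: 25, marketing: 25, curation: 15, tech: 10 },
  B: { distribution: 15, monetization: 35, marketing: 20, curation: 15, tech: 15 },
  C: { distribution: 20, monetization: 25, marketing: 25, curation: 15, tech: 15 },
  D: { distribution: 20, monetization: 30, marketing: 15, curation: 15, tech: 20 },
  E: { distribution: 30, monetization: 10, marketing: 25, curation: 25, tech: 10 },
};

function totalScore(archetype: Archetype, pillars: Record<Pillar, number>): number {
  const weights = PILLAR_WEIGHTS[archetype];
  return (Object.keys(weights) as Pillar[]).reduce(
    (sum, p) => sum + pillars[p] * (weights[p] / 100),
    0,
  );
}
```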
4.4 Tier system
Four tiers, identical labels to v0.1 (the labels are good; the bands now vary by archetype):
- Broadcasting (top 25% of cohort within archetype). Sharpening work, not survival work.
- Tuned (50th to 75th percentile). System works, scale lever question.
- Warming up (25th to 50th percentile). Fundamentals present, engine not yet compounding.
- Muted (bottom 25%). Structural gap to market reward.
Tier boundaries are recalculated quarterly as the cohort grows.
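Since the tier bands are quartiles of the within-archetype percentile, the mapping itself is a four-way cut; a minimal sketch, with the percentile assumed computed upstream:

```typescript
type Tier = "Broadcasting" | "Tuned" | "Warming up" | "Muted";

// Percentile is within-archetype, 0 to 100, recalculated quarterly.
function tierFor(percentile: number): Tier {
  if (percentile >= 75) return "Broadcasting"; // top 25%
  if (percentile >= 50) return "Tuned";
  if (percentile >= 25) return "Warming up";
  return "Muted"; // bottom 25%
}
```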
4.5 Output
The operator receives:
- A total score and tier, percentile against archetype peers.
- A five-axis radar with pillar scores.
- The three weakest sub-dimensions across all pillars, each tied to a benchmark (selection sketched after this list).
- Three named moves for the next 90 days, mapped to the operator's weakest sub-dimensions.
- (Paid tier only) A 12 to 18 page diagnostic report with: cohort context, scoring detail per item with benchmark, operator-specific implementation roadmap, suggested vendor and tool stack, sample SOW for engaging the partnership to execute the roadmap.
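Selecting the weakest sub-dimensions is a straight sort across pillars; a sketch with an assumed shape:

```typescript
interface SubDimensionScore {
  pillar: string;
  subDimension: string;
  score: number;     // 0 to 100
  benchmark: number; // archetype cohort median for the same sub-dimension
}

// Lowest three sub-dimension scores across all five pillars. Each carries
// its benchmark so the report can show the gap, and each keys into the
// archetype-specific playbook for the three named 90-day moves.
function weakestThree(scores: SubDimensionScore[]): SubDimensionScore[] {
  return [...scores].sort((a, b) => a.score - b.score).slice(0, 3);
}
```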
5. Benchmark cohort design
This is the methodological core of v2 and the part that converts the instrument into a research asset.
5.1 Cohort size targets
- Pre-launch interview cohort (Mode B anchoring): 8 to 12 operators, interviewed in semi-structured 60-minute calls. Used to set expert anchors and validate item phrasing.
- Launch cohort (first 60 days post-launch): 40 to 60 operators across the five archetypes. Achievable with Stephen's network and modest paid outreach.
- Steady-state quarterly cohort: 80 to 120 operators per quarter, refreshed via a mix of new operators and returning operators (returning operators provide the longitudinal signal).
5.2 Recruitment
Three channels in priority order:
- Stephen's direct network. Highest-quality cohort, fastest to recruit. Stephen runs the warm intro, we run the diagnostic and produce the operator's free report. Operator participation buys them the report; participation also opts them into the anonymized aggregate.
- Industry partnership. A trade body or industry publication (NAB Latin America, NATPE Latin America, FAST and Curious newsletter, the Streaming TV Insider) co-publishes the study in exchange for promotion. Cohort grows via the partner's reach.
- Paid outreach. Targeted LinkedIn and email outreach to named role types (Head of Content, VP Monetization, Head of FAST) at known operators. Lowest yield, but predictable.
5.3 Control logic
The instrument is not a controlled experiment. The "control" in this study is the benchmark cohort itself: the operator's score is meaningful only relative to peers in the same archetype. We are explicit in the methodology document that this is a benchmark study, not an RCT. The integrity of the benchmark depends on:
- Cohort diversity within archetype (size band, geography, vertical).
- Response quality (we drop responses with completion times under 4 minutes as low-quality signal; see the filter sketch after this list).
- Consistency of phrasing and scoring across waves.
- Quarterly review of item-to-outcome correlation, with items rotated out if they show no signal.
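The completion-time filter is the only mechanical gate in the list above; a minimal sketch, with the response shape assumed:

```typescript
interface DiagnosticResponse {
  operatorId: string;
  completionSeconds: number;
}

const MIN_COMPLETION_SECONDS = 4 * 60; // under 4 minutes = low-quality signal

// Applied before anchors or tier boundaries are recalculated for a wave.
function usableResponses(responses: DiagnosticResponse[]): DiagnosticResponse[] {
  return responses.filter((r) => r.completionSeconds >= MIN_COMPLETION_SECONDS);
}
```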
5.4 Longitudinal signal
The instrument is designed to be retaken every six months. Operators who retake receive a longitudinal report showing movement on each pillar, which becomes the basis for the paid Tier 3 retainer in economic-models.md (continuous practice). This is the engine that converts a one-time diagnostic into a recurring revenue product.
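The longitudinal report reduces to per-pillar deltas between an operator's two most recent waves; a sketch, assuming pillar keys are shared across waves:

```typescript
type PillarScores = Record<string, number>;

// Movement on each pillar between two waves, six months apart. Positive
// means the operator improved against their own prior wave, not the cohort.
function pillarDeltas(previous: PillarScores, current: PillarScores): Record<string, number> {
  const deltas: Record<string, number> = {};
  for (const pillar of Object.keys(current)) {
    if (pillar in previous) {
      deltas[pillar] = current[pillar] - previous[pillar];
    }
  }
  return deltas;
}
```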
6. What this becomes when combined with Stephen's positioning
Stephen sells alignment, goal-setting, and the strategic relationship. He is already the closest thing in LatAm to a named authority on streaming and FAST health. v2 hands him the only longitudinal, segmented, peer-relative dataset on the LatAm operator base.
Stephen does not have to build the dataset; we run the instrument and the study under his strategic editorial direction. He gets:
- A research asset to publish under his name. Quarterly LatAm Streaming Health Report, named for him or co-branded with the partnership.
- A qualified-lead engine. Every operator who completes the diagnostic is segmented, scored, and (with permission) eligible for paid follow-up.
- A productized service line. The diagnostic becomes the front door to the 90-day implementation (Tier 2) and the continuous practice (Tier 3) in the economic models document.
The diagnostic is not the product. The dataset and the longitudinal practice are. The diagnostic is the recruitment mechanism for the dataset and the conversion mechanism for the practice.
7. Defects in v0.1 this document corrects
For internal alignment and for Stephen to validate that we hear the feedback:
- Only 4 pillars. v0.1 dropped Tech. Operators who answered honestly on monetization could not flag the SCTE-35 or SSAI issues that often cause the monetization gap. v2 adds Tech as a fifth pillar.
- Same benchmark for all operators. A FAST channel owner was scored against the same monetization bar as a CTV aggregator. v2 introduces five archetypes with per-archetype pillar weights and per-archetype anchors.
- Unsourced benchmark anchors. v0.1 cited "LatAm avg 10% / US avg 60%" without an attributed source. Post-research (2026-05-12), this exact numeric pairing is confirmed unsubstantiated by primary sources (see research/2026-05-12-fast-market-sizing.md Section 3). The pairing appears to be an industry artifact, possibly conflated from retail supply chain and EU trade reports. v2 attributes every anchor to either a published industry source (with citation) or to the expert-anchor panel. A comprehensive pressure-test of v2 anchors and item bank against the research outputs is queued as Prompt 5 in deep-research-prompts.md; v2.1 revisions will land after that re-run.
- Five items is too few. v2 uses 21 to 22 items, with deliberate honesty traps and segment routing.
- Scoring was gut-feel mapping. v2 introduces three scoring modes (anchored, expert-anchored, hybrid) and ties final scores to percentiles within archetype.
- No longitudinal layer. v0.1 was one-shot. v2 is built to be retaken at six-month intervals, with longitudinal reporting as a paid feature.
- No path to research asset. v0.1 produced a CTA, not a dataset. v2 is designed dual-purpose: operator-facing diagnostic plus aggregable industry benchmark study.
- No segmentation in the recommendations. v0.1 recommendations were keyed off pillar weakness only. v2 routes recommendations through archetype-specific playbooks (a FAST channel owner with a Distribution weakness gets different advice than a CTV aggregator with the same score).
8. Open questions for Stephen
These are the items we need his input on before locking the v2 instrument.
- Archetype labels and definitions. Are the five archetypes the right cut for the LatAm market he sees, or does he split CTV operators from FAST aggregators differently?
- Pillar weights per archetype. The weights in Section 4.3 are our draft. Stephen has more reps; his calibration will sharpen these materially.
- Item-bank gaps. The Curation and Tech pillars are the lightest. Stephen has named Curation as his consulting territory in prior conversation. Is there anything he wants in or out of the operator-facing instrument so it does not compete with his strategy work?
- Anchors per item. For Mode B (expert-anchored) launch, we need cohort medians for each item. Stephen names three to five reference operators per archetype whose responses set the anchors.
- Players, vendors, and competitors to name. Which FAST platforms, CTV networks, agencies, and diagnostic vendors does he want named in the benchmark anchors and the recommendation playbooks, and which held back?
- Geographic scope for the first wave. Do we field Mexico plus US Hispanic only, full LatAm, or LatAm plus Iberia? The answer affects sample size requirements.
- Publishing cadence and venue. Quarterly or twice a year? Industry-publication partnership candidate, or self-published under his masthead?
- Naming. "Streaming Health Diagnostic" is descriptive but generic. v2 deserves a name that travels. Stephen's call.
9. What we are not changing yet
v2 is a methodology document. The interface stays close to v0.1 visually so Stephen can see the upgrade without redesigning the surface. After Stephen approves the methodology, we ship the UI rebuild as a second sprint. Order matters: methodology then surface, not the reverse.
10. Next steps
- Stephen review of this document. One to two weeks. Async written feedback plus one 30-minute call to walk the methodology.
- Instrument lock. Item bank, archetype weights, anchors set. Two weeks after Stephen sign-off.
- Mode B anchor panel. Three to five reference operators per archetype interviewed for anchor-setting. Two to three weeks.
- UI rebuild. Next.js page with segmentation, branching, full item bank, paid-tier gating. Three to four weeks.
- Soft launch cohort. First 20 operators recruited through Stephen's warm network. Four weeks.
- First publication. Q3 2026 LatAm FAST and CTV Health Report v0.1, published under Stephen's name with ESS as research partner.
End-to-end from this document to the first publication: roughly 12 weeks if Stephen's review lands within two weeks. That puts publication on track for late August or early September 2026. See q3-2026-deployment-plan-en.md and the Spanish equivalent for the operational plan and the costs.