# SaaStock: HeadshotOps Playbook

The single reference for how SaaStock finds, verifies, and ships a high-quality professional headshot for every speaker on saastock.com: speaker pages, event rosters, printed badges, mobile app, on-site signage.

This playbook extends [SearchOps](/agents/searchops/) (which governs the whole site's SEO/AEO/GEO stack) and sits alongside [MediaOps](/agents/mediaops/) (event gallery photos and videos). HeadshotOps governs **one image per person**: the canonical face used everywhere that person appears.

Designed for **500+ speaker scale**. Accuracy matters more than speed. **A wrong headshot is a critical failure**, worse than no headshot at all.

---

## 1. Output convention

One file per speaker, dropped at:

```
apps/web/public/headshots/.webp
```

Where `` matches the speaker slug in `apps/web/app/speakers/registry.ts` (e.g. `jason-lemkin.webp`).

Rules:

- **Format:** `.webp`, quality 82.
- **Crop:** square, face centered (top of head ~10% from top edge).
- **Size:** 800×800 minimum, 1200×1200 preferred. Reject anything under 400px on the short edge.
- **One file per person.** If the same person speaks at multiple events, the same `.webp` is reused. No alt SVGs, no per-event copies, no `-v2` files.

---

## 2. Input record

The agent operates on one speaker at a time. Input is a JSON record derived from the speaker registry plus any enrichment we have:

```json
{
  "slug": "jason-lemkin",
  "full_name": "Jason Lemkin",
  "job_title": "Founder & CEO",
  "company_name": "SaaStr",
  "company_domain": "saastr.com",
  "linkedin_url": "https://www.linkedin.com/in/jasonmlemkin",
  "twitter_url": "",
  "personal_website": ""
}
```

Only `slug` and `full_name` are required. Every other field improves match confidence.

---

## 3. Search priority: three tiers

Search sources in this order. Stop as soon as a candidate scores ≥80.

### Tier 1: Highest trust

1. Speaker's LinkedIn profile photo
2. Speaker's official company bio / team page
3. Speaker's personal website
4. Speaker's official X/Twitter profile photo
5. Previous conference speaker page (SaaStock archives, SaaStr Annual, etc.)

### Tier 2: Medium trust

6. Crunchbase profile
7. Podcast guest pages (where the speaker was the named guest)
8. Press interviews on reputable outlets
9. YouTube channel "About" page
10. Medium author page
11. Substack author profile

### Tier 3: Lowest trust (fallback)

12. Google image search
13. Bing image search
14. News articles
15. Generic web search

LinkedIn is the highest-trust source but the most aggressively rate-limited. Expect a meaningful fraction of speakers to fall through to the remaining Tier 1 sources or to Tier 2. That's fine: a company-bio shot or a Tier 2 profile photo is still better than a Tier 3 image-search guess.

---

## 4. Hard rejection rules

Never accept an image that is any of:

- group photo
- logo
- cartoon / illustration / avatar
- AI-generated art
- screenshot of a video or slide
- low resolution (<400px short edge)
- blurry, pixelated, or stretched
- meme image
- event stage photo where the speaker is small in frame
- multiple visible faces (any second face means reject)
- heavily watermarked
- old low-quality press-kit photo when a better one exists

---

## 5. Identity match rules

Before accepting, verify all three:

### Name match

The name on the source page must strongly match the speaker's full name.

```text
GOOD: "Jason Lemkin"
BAD: "J. Lemkin", "Jason", "Lemkin Ventures Team"
```

### Company match

The source page must reference the speaker's known company or a publicly documented prior company.

```text
GOOD: Jason Lemkin + SaaStr
BAD: Jason Lemkin + unknown company
```

### Face confidence

- exactly one dominant face in the frame
- face occupies >20% of image area
- professional appearance (suitable for a conference website)

---

## 6. Image quality rules

**Preferred**

- 800px+ width
- portrait or square orientation
- neutral background
- face centered
- clean crop
- modern photo (not a 10-year-old press kit if a recent one exists)
- high sharpness

**Acceptable minimum**

- 500px+ short edge
- clear face
- minimal compression artifacts

**Reject**

- tiny thumbnails (under 400px)
- pixelated JPGs
- stretched / distorted aspect ratio

---

## 7. Search strategy: step by step

For each speaker, run the steps in order. Stop at the first step that produces a score ≥80.

### Step 1: LinkedIn (if URL exists)

Visit the profile, locate the profile photo, score confidence.

### Step 2: Company bio (if domain exists)

```text
site:companydomain.com "speaker full name"
```

Visit the team/leadership/about page, extract candidate.

### Step 3: Personal website

```text
site:personalwebsite.com
```

### Step 4: Web search

```text
"full name" "company"
"full name" "company" headshot
"full name" speaker
```

### Step 5: Image search fallback

```text
full name company
full name headshot
full name speaker
```

---

## 8. Scoring model

Score every candidate. Only candidates with **score ≥80** are auto-accepted.

```text
+40   exact name match
+25   exact company match
+20   source is LinkedIn or official company bio
+15   exactly one face detected
+15   high resolution (≥800px short edge)
+10   portrait or square orientation
+10   professional photographic quality
-50   group photo
-50   ambiguous identity
-30   low resolution
-30   visible watermark
-100  uncertain identity (different person, namesake risk)
```

Anything **<80** is escalated to human review with the top three candidates attached.

---

## 9. Ambiguous identities: never guess

If two or more candidates could plausibly be the speaker (common name, namesake, two people at related companies), return:

```json
{
  "status": "needs_review",
  "speaker_slug": "",
  "speaker_name": "",
  "reason": "multiple possible identities",
  "top_candidates": [ ... ]
}
```

Do not pick one. Wrong is worse than nothing.

---

## 10. Output format

Return JSON only, one object per speaker.

**Confident match**

```json
{
  "status": "success",
  "speaker_slug": "jason-lemkin",
  "speaker_name": "Jason Lemkin",
  "selected_image_url": "https://...",
  "source_page_url": "https://...",
  "source_type": "linkedin",
  "confidence_score": 96,
  "width": 1200,
  "height": 1200,
  "crop_suggestion": { "type": "square", "focus": "face_center" },
  "backup_candidates": [
    { "image_url": "...", "source_url": "...", "confidence": 88 }
  ]
}
```

**Needs review**

```json
{
  "status": "needs_review",
  "speaker_slug": "jane-doe",
  "speaker_name": "Jane Doe",
  "reason": "two LinkedIn profiles with same name + similar title",
  "top_candidates": [ ... ]
}
```

**Failed**

```json
{
  "status": "failed",
  "speaker_slug": "...",
  "reason": "no candidate scored ≥50"
}
```

---

## 11. Batch rules

For bulk runs (full event roster):

- Process **one speaker at a time**. Never reuse the previous speaker's image.
- **Reset identity assumptions** every run: a new search, a new candidate set.
- Log every failure with the reason.
- Continue on error. One bad lookup does not stop the batch.
- Output a single `headshots.jsonl` log so a human reviewer can scan the `needs_review` rows in one pass.

---

## 12. Human review

Anything that doesn't auto-accept (score <80, ambiguous, failed) goes into a review grid. The reviewer:

1. Sees the top 3 candidates side-by-side with source URLs.
2. Picks one, or marks "none of these" and uploads a manual file.
3. Saves the accepted file as `public/headshots/.webp` (converted to 1200×1200 square WebP).

The review grid is the only place a human touches a headshot. The agent does everything else.

---

## 13. Final rule

> **Wrong headshot is worse than no headshot.**
>
> If uncertain → `needs_review`. Never guess.
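
---

## 14. Appendix: scoring sketch

The scoring model in section 8 and the status rules in sections 9 and 10 can be sketched in TypeScript as below. This is a minimal illustration, not the agent's actual implementation. The `CandidateSignals` fields, the `scoreCandidate` and `decideStatus` helpers, and the `sourceType` strings are all hypothetical names. Two thresholds are assumptions: the low-resolution penalty is applied below 500px on the short edge (the "acceptable minimum" in section 6), and the 50-point floor for `failed` is taken from the example reason string in section 10. The playbook does not say how the raw sum maps to `confidence_score`, so the sketch returns it unclamped.

```typescript
// Hypothetical signal set for one candidate image. The playbook does not
// define these field names; they exist only for this sketch.
interface CandidateSignals {
  exactNameMatch: boolean;
  exactCompanyMatch: boolean;
  sourceType: string; // assumed labels, e.g. "linkedin", "company_bio"
  faceCount: number;
  shortEdgePx: number;
  portraitOrSquare: boolean;
  professionalQuality: boolean;
  groupPhoto: boolean;
  ambiguousIdentity: boolean;
  visibleWatermark: boolean;
  uncertainIdentity: boolean;
}

type ReviewStatus = "success" | "needs_review" | "failed";

const AUTO_ACCEPT_SCORE = 80; // section 8
const FAILED_BELOW_SCORE = 50; // assumption, from the "Failed" example
const LOW_RES_BELOW_PX = 500; // assumption, from section 6

// Sum the section 8 weights. The raw sum is not clamped, so it can
// exceed 100 or go negative.
function scoreCandidate(c: CandidateSignals): number {
  let score = 0;
  if (c.exactNameMatch) score += 40;
  if (c.exactCompanyMatch) score += 25;
  if (c.sourceType === "linkedin" || c.sourceType === "company_bio") score += 20;
  if (c.faceCount === 1) score += 15;
  if (c.shortEdgePx >= 800) score += 15;
  if (c.portraitOrSquare) score += 10;
  if (c.professionalQuality) score += 10;
  if (c.groupPhoto) score -= 50;
  if (c.ambiguousIdentity) score -= 50;
  if (c.shortEdgePx < LOW_RES_BELOW_PX) score -= 30;
  if (c.visibleWatermark) score -= 30;
  if (c.uncertainIdentity) score -= 100;
  return score;
}

// Map one speaker's candidate set to an output status.
function decideStatus(candidates: CandidateSignals[]): ReviewStatus {
  if (candidates.length === 0) return "failed";
  // Section 9: never pick a candidate when identity is ambiguous.
  if (candidates.some((c) => c.ambiguousIdentity)) return "needs_review";
  const best = Math.max(...candidates.map(scoreCandidate));
  if (best >= AUTO_ACCEPT_SCORE) return "success";
  return best >= FAILED_BELOW_SCORE ? "needs_review" : "failed";
}

// Usage: a clean LinkedIn photo versus a small image-search result.
const linkedinShot: CandidateSignals = {
  exactNameMatch: true,
  exactCompanyMatch: true,
  sourceType: "linkedin",
  faceCount: 1,
  shortEdgePx: 1200,
  portraitOrSquare: true,
  professionalQuality: true,
  groupPhoto: false,
  ambiguousIdentity: false,
  visibleWatermark: false,
  uncertainIdentity: false,
};
const imageSearchGuess: CandidateSignals = {
  ...linkedinShot,
  exactCompanyMatch: false,
  sourceType: "image_search",
  shortEdgePx: 450,
  professionalQuality: false,
};

console.log(scoreCandidate(linkedinShot)); // 135
console.log(decideStatus([linkedinShot])); // "success"
console.log(decideStatus([imageSearchGuess])); // "failed" (raw score 35)
```

Checking the ambiguity flag before comparing scores keeps the "never guess" rule independent of the weights: a namesake candidate that still clears 80 after the -50 penalty is routed to review rather than auto-accepted.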