Consulting

AI proposal generator reduced scoping time from days to minutes

A 12-person management consultancy was haemorrhaging time on proposal writing. 3 to 5 days per document, assembled by hand from fragmented internal systems. We built a multi-step agentic AI system that interviews the sales lead, retrieves relevant case studies, and drafts a tailored, on-brand proposal in under 10 minutes.

Key Results

95% reduction in proposal preparation time
3x increase in proposals sent per month
Consistent brand voice enforced automatically

Stack

GPT-4o, Notion, HubSpot, Zapier

The Situation

Proposals were the bottleneck nobody wanted to admit to.

This consultancy was winning discovery calls. The problem was the conversion rate between "interested prospect" and "signed contract," and the culprit was time. A lot of it.

The existing process looked like this: a consultant would finish a discovery call and then spend the rest of that day and most of the next searching their Notion workspace for relevant past case studies, pulling methodology content from a library of templates, and assembling everything into a Word document. Partners would review it that evening. Edits would go back and forth. A polished proposal might be ready in three days, more often five.

Meanwhile, the prospect had moved on to the next meeting on their calendar.

The firm had tried building Notion templates. They'd tried Google Docs checklist systems. None of it worked because the bottleneck wasn't the formatting. It was the intellectual labour of selecting the right content for this particular client's context and writing it up compellingly.

That's where AI belongs.

Agent Architecture

This system is built as a sequential agent pipeline. Each stage takes the output of the previous one as its input. Zapier orchestrates the whole chain and calls the OpenAI API at each model-driven stage.

flowchart TD
    A([📋 HubSpot Intake Form\nSubmitted by Sales Lead]) --> B[Zapier Trigger]

    B --> C[Stage 1: Context Extraction\nGPT-4o parses intake\ninto structured JSON]

    C --> D[Stage 2: Notion RAG Lookup\nQuery case study library\n& methodology database]

    D --> E{Relevant matches\nfound?}
    E -->|3+ matches| F[Stage 3: Section Generator]
    E -->|< 3 matches| G[Fallback: Broader\nthematic search]
    G --> F

    F --> F1[3a: Executive Summary]
    F --> F2[3b: Problem Framing]
    F --> F3[3c: Proposed Approach]
    F --> F4[3d: Relevant Case Studies\nwith metrics]
    F --> F5[3e: Timeline & Investment]

    F1 & F2 & F3 & F4 & F5 --> H[Stage 4: Assembly\n& Tone Pass]

    H --> I[Stage 5: Quality Gate\nChecks for hallucinations,\nblank sections, client name errors]

    I -->|Pass| J[Write Draft to Notion\nTag to HubSpot Deal]
    I -->|Fail| K[Flag to Slack\nfor Manual Review]

    J --> L[Slack Notification\nto Relevant Partner]
    L --> M([✅ Draft Ready\nfor 30-min Review])

The five-stage chain means each part of the proposal is generated with focused, specific context, rather than trying to prompt a single call to do everything at once, which consistently produces generic output.
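
The chaining idea can be sketched in a few lines of Python. This is an illustration only: the production chain runs as Zapier steps, and every function name and stub body below is hypothetical, standing in for a GPT-4o or Notion call.

```python
from typing import Any, Callable

State = dict[str, Any]
Stage = Callable[[State], State]

SECTION_NAMES = [
    "Executive Summary",
    "Problem Framing",
    "Proposed Approach",
    "Relevant Case Studies",
    "Timeline & Investment",
]

def extract_context(state: State) -> State:
    # Stage 1 stand-in: the real stage asks GPT-4o to parse the intake.
    intake = state["intake"]
    return {"context": {
        "company": intake["prospect"]["company"],
        "tone": intake["engagement"]["tone_preference"],
    }}

def retrieve_case_studies(state: State) -> State:
    # Stage 2 stand-in: the real stage queries the Notion library.
    return {"case_studies": ["(stub case study)"] * 3}

def generate_sections(state: State) -> State:
    # Stage 3 stand-in: one focused generation per section.
    company = state["context"]["company"]
    return {"sections": {n: f"[{n} draft for {company}]" for n in SECTION_NAMES}}

def assemble(state: State) -> State:
    # Stage 4 stand-in: join sections in order; the tone pass is omitted.
    body = "\n\n".join(f"{n}\n{text}" for n, text in state["sections"].items())
    return {"draft": body}

def quality_gate(state: State) -> State:
    # Stage 5 stand-in: only checks that no section is empty.
    ok = all(state["sections"].values())
    return {"recommendation": "PASS" if ok else "REVIEW"}

STAGES: list[tuple[str, Stage]] = [
    ("extract_context", extract_context),
    ("retrieve_case_studies", retrieve_case_studies),
    ("generate_sections", generate_sections),
    ("assemble", assemble),
    ("quality_gate", quality_gate),
]

def run_pipeline(intake: State, stages: list[tuple[str, Stage]]) -> State:
    """Run stages in order; each one sees everything produced before it."""
    state: State = {"intake": intake, "trace": []}
    for name, stage in stages:
        state = {**state, **stage(state)}
        state["trace"] = state["trace"] + [name]
    return state

example_intake = {
    "prospect": {"company": "Meridian Capital Partners"},
    "engagement": {"tone_preference": "technical"},
}
result = run_pipeline(example_intake, STAGES)
print(result["trace"])
```

Keeping each stage as a small function over shared state is what lets each section prompt receive only the context it needs.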

The Intake Form

The process starts with a structured intake form in HubSpot. The sales lead fills this in immediately after the discovery call while the conversation is still fresh.

// HubSpot intake form payload — triggers the Zapier workflow
{
  "deal_id": "HS-29847",
  "prospect": {
    "company": "Meridian Capital Partners",
    "industry": "Private Equity",
    "company_size": "50-200",
    "hq_location": "London"
  },
  "engagement": {
    "primary_pain": "Portfolio company reporting is fragmented across 11 entities — no consolidated view",
    "desired_outcome": "Single dashboard + automated monthly pack to LP committee",
    "timeline_pressure": "Board meeting in 6 weeks",
    "budget_signal": "£40k–£80k",
    "decision_makers": ["CFO", "Head of Portfolio Operations"],
    "competitors_mentioned": ["McKinsey", "internal team"],
    "tone_preference": "technical"   // "strategic" | "technical" | "executive"
  },
  "sales_lead": "james.whitfield@consultancy.com",
  "submitted_at": "2025-02-20T14:33:00Z"
}

The tone_preference field turns out to be one of the most impactful inputs. A CFO at a PE firm wants rigour and specificity. A founder at a startup wants momentum and clarity. The same underlying proposal reads entirely differently depending on this single field.
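
One way a single field can steer every section prompt is sketched below. The three tone values come from the intake form; the guidance wording and the `build_system_prompt` helper are assumptions made for illustration, not the prompts used in the engagement.

```python
# Hypothetical guidance strings; the real prompt wording is not published here.
TONE_GUIDANCE = {
    "strategic": "Lead with business outcomes and positioning; keep detail light.",
    "technical": "Be rigorous and specific; name methods, data sources and metrics.",
    "executive": "Be brief and decisive; emphasise momentum and clarity.",
}

def build_system_prompt(tone_preference: str, section: str) -> str:
    """Compose a system prompt for one proposal section in the requested tone."""
    if tone_preference not in TONE_GUIDANCE:
        allowed = ", ".join(sorted(TONE_GUIDANCE))
        raise ValueError(f"Unknown tone_preference {tone_preference!r}; expected one of: {allowed}")
    return (
        f"You are drafting the '{section}' section of a consulting proposal. "
        f"Tone: {tone_preference}. {TONE_GUIDANCE[tone_preference]}"
    )

prompt = build_system_prompt("technical", "Proposed Approach")
print(prompt)
```

Rejecting unknown values early means a typo in the form surfaces as an error rather than as a proposal written in the wrong voice.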

Retrieval-Augmented Generation (RAG)

Stage 2 queries the firm's Notion case study library using the Notion API. We restructured the library during the engagement so each case study carries a machine-readable set of page properties, and this is what the retrieval searches against.

// Notion case study page properties — structured for reliable retrieval
{
  "title": "Portfolio Reporting Consolidation — Global Infrastructure Fund",
  "industry_tags": ["Private Equity", "Finance", "Real Assets"],
  "pain_category": "Reporting & Analytics",
  "solution_type": "Data Consolidation + Dashboard",
  "engagement_size_gbp": 65000,
  "duration_weeks": 8,
  "tools_used": ["Power BI", "SQL", "SharePoint"],
  "headline_metric": "From 5 days to 4 hours for monthly LP reporting pack",
  "relevance_score": null   // Computed at retrieval time from embedding cosine similarity
}
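
Structured properties like these still have to become text before they can be embedded. The sketch below shows one plausible flattening; which fields are embedded, and in what order, is an assumption, and `case_study_to_text` is a hypothetical helper.

```python
def case_study_to_text(props: dict) -> str:
    """Flatten case study properties into a short text block for embedding."""
    parts = [
        props["title"],
        "Industries: " + ", ".join(props["industry_tags"]),
        "Pain category: " + props["pain_category"],
        "Solution: " + props["solution_type"],
        "Tools: " + ", ".join(props["tools_used"]),
        "Result: " + props["headline_metric"],
    ]
    # relevance_score is deliberately left out: it is computed per query.
    return "\n".join(parts)

example_props = {
    "title": "Portfolio Reporting Consolidation",  # title shortened for the example
    "industry_tags": ["Private Equity", "Finance", "Real Assets"],
    "pain_category": "Reporting & Analytics",
    "solution_type": "Data Consolidation + Dashboard",
    "tools_used": ["Power BI", "SQL", "SharePoint"],
    "headline_metric": "From 5 days to 4 hours for monthly LP reporting pack",
    "relevance_score": None,
}
text = case_study_to_text(example_props)
print(text)
```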

We generate an embedding for the incoming engagement context and compare it against pre-computed embeddings for each case study. The top 3 by cosine similarity are passed to Stage 3 as examples.

flowchart LR
    A[Intake JSON\n→ Embedding] --> B[(Notion\nCase Study Library)]
    B --> C[Pre-computed\nEmbeddings]
    C --> D{Cosine\nSimilarity Ranking}
    A --> D

    D --> E[Top 3 Matches\nwith Scores]
    E --> F[Case Study Full Text\n→ Context Window]

    style A fill:#EFF6FF,stroke:#BFDBFE
    style E fill:#F0FDF4,stroke:#A7F3D0
    style F fill:#F0FDF4,stroke:#A7F3D0
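
The ranking step can be sketched in plain Python. The vectors below are toy three-dimensional stand-ins for real embeddings, and `min_score` is an assumed cut-off for what counts as a relevant match; the threshold used in the engagement is not stated.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_matches(query, library, k=3, min_score=0.5):
    """Return up to k (title, score) pairs above min_score, best first."""
    scored = [(title, cosine_similarity(query, vec)) for title, vec in library.items()]
    relevant = [pair for pair in scored if pair[1] >= min_score]
    relevant.sort(key=lambda pair: pair[1], reverse=True)
    return relevant[:k]

def needs_fallback(matches, required=3):
    """Mirror the flowchart branch: fewer than 3 matches triggers a broader search."""
    return len(matches) < required

# Toy pre-computed embeddings keyed by case study title (hypothetical data).
library = {
    "Case A": [1.0, 0.0, 0.0],
    "Case B": [0.9, 0.1, 0.0],
    "Case C": [0.5, 0.5, 0.0],
    "Case D": [0.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]

matches = top_matches(query, library)
strict = top_matches(query, library, min_score=0.8)
print([title for title, _ in matches], needs_fallback(matches))
print([title for title, _ in strict], needs_fallback(strict))
```

With the stricter cut-off only two case studies qualify, which is the situation where the pipeline would fall back to the broader thematic search.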

The Quality Gate

Stage 5 is the most important step most people skip. Before the draft lands in Notion, a separate GPT-4o call reviews it against a checklist:

// Quality gate prompt output — structured validation result
{
  "checks": {
    "client_name_correct": true,          // Did the model use "Meridian" not a previous client?
    "no_placeholder_text": true,          // Free of "[INSERT X]" style blanks?
    "all_sections_populated": true,       // Every section at least 80 words?
    "metrics_grounded": true,             // Claims reference actual case study data?
    "tone_consistent": true,              // Does it match "technical" preference?
    "no_competitor_praise": true          // Did the model avoid complimenting McKinsey?
  },
  "confidence_score": 0.94,
  "flags": [],
  "recommendation": "PASS"               // "PASS" | "REVIEW" | "REGENERATE"
}

If any check fails or confidence drops below 0.85, the draft is flagged in Slack for manual review instead of being delivered to the partner. In practice, this fires on roughly 1 in 12 proposals, typically when the engagement is highly novel and the RAG lookup returns low-confidence matches.
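
The routing rule itself is simple enough to express directly. The sketch below assumes the validation result has the shape shown above; the `route_draft` helper and the route names are hypothetical, and in production this branch lives in the Zapier workflow.

```python
CONFIDENCE_THRESHOLD = 0.85  # from the rule described above

def route_draft(result: dict) -> tuple[str, list[str]]:
    """Decide where a draft goes, returning the route and the reasons for it."""
    reasons = [name for name, passed in result["checks"].items() if not passed]
    if result["confidence_score"] < CONFIDENCE_THRESHOLD:
        reasons.append("low_confidence")
    route = "flag_to_slack" if reasons else "write_to_notion"
    return route, reasons

passing = {
    "checks": {
        "client_name_correct": True,
        "no_placeholder_text": True,
        "all_sections_populated": True,
        "metrics_grounded": True,
        "tone_consistent": True,
        "no_competitor_praise": True,
    },
    "confidence_score": 0.94,
}
# Same draft, but with the wrong client name and a lower confidence score.
failing = {
    "checks": {**passing["checks"], "client_name_correct": False},
    "confidence_score": 0.80,
}

print(route_draft(passing))
print(route_draft(failing))
```

Returning the reasons alongside the route gives the Slack message something concrete to show the reviewer.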

The Results

The impact compounded across the business in ways beyond the obvious time saving:

Proposal prep time dropped by 95%. From 3-5 days to under an hour including partner review and personalisation.
Proposals sent per month tripled. Not because the sales pipeline got faster, but because the firm started bidding on work it had previously passed on for "not having the bandwidth."
Brand voice became consistent across the whole team. Something partners had been trying to enforce through edits for years happened automatically as a property of the prompt design.

One partner put it plainly: "We're now sending proposals the same day as the discovery call. That alone has changed how serious prospects think we are."

What We Learned

Prompt quality is directly proportional to data quality. The retrieval system is only as good as the case study library it queries. We spent more time restructuring their Notion workspace than we spent on the AI pipeline itself. If the inputs are vague, the outputs are vague, regardless of how capable the model is.

"Consistent brand voice" is harder to specify than it sounds. We went through three rounds of prompt iteration comparing AI-generated output against real approved proposals before the partner said "that's us." The process of articulating what "us" means is genuinely valuable for a firm because it surfaces assumptions about voice that had never been written down.

The quality gate earns more trust than any demo. Partners were nervous about AI composing client-facing documents. Showing them the validation layer, and watching it correctly flag a draft where the model had confused two clients' industries, was the moment the trust shifted. Guardrails are features, not caveats.
