AEO

How to Get Your Startup Found by AI Tools Like ChatGPT and Perplexity

April 13, 2026 · 8 min read · By Nomi Barda

A few months ago I audited how AI tools describe businesses. I picked companies at random, searched for them in ChatGPT and Perplexity, and wrote down what came back. The results were bad. Wrong pricing. Phantom features. One SaaS product described as a meal kit subscription. These weren't obscure companies. They had real sites, real products, real customers. The AI tools just didn't have an accurate picture of them.

If you're launching something, this is worth paying attention to. More and more people are using AI tools to research products before buying. And if AI can't find you, or gets you wrong, that's a problem with no obvious fix.

This article explains what actually determines whether AI tools cite your site, and what you can do about it from day one.


Why most new sites are invisible to AI

AI tools don't work like Google. Google crawls your site, indexes your pages, and ranks you based on backlinks and content quality. AI tools pull from a mix of sources: indexed web content, structured data, third-party directories, and in some cases direct API crawls.

The overlap between what Google ranks and what AI cites is only partial. A 16-month BrightEdge study found that AI Overview citations overlapped with organic Google rankings for only 54% of results. [1] A separate Ahrefs analysis found that 80% of LLM citations don't rank in Google's top 100 for the original query. [2]

This cuts both ways. You don't need to dominate Google to get cited by AI. But you do need to give AI systems something to work with.


What AI actually uses to decide what to cite

Research from Princeton University published at KDD 2024 studied 10,000 queries across 9 source types and identified factors that affect whether AI systems cite a page. [3] The most actionable findings:

Statistics and quotations matter. Pages that include data-backed statistics are cited by AI at measurably higher rates than pages without them. Original quotations increase citation probability further. [3]

Brand search volume is the strongest predictor. A 2025 analysis of 680 million citations found that brand search volume correlated with LLM citations more strongly than traditional backlinks. [3] If people are searching for your brand name, AI is more likely to know about you.

Being on multiple platforms multiplies your chances. Brands mentioned on 4 or more platforms are 2.8 times more likely to appear in ChatGPT responses. [3] Product directories, startup listings, GitHub, LinkedIn, and Medium posts all count.

Different AI tools cite different sources. Only 11% of domains are cited by both ChatGPT and Perplexity. [3] What works for one doesn't automatically work for the other.


Structured data: the part most founders skip

In March 2025, a Principal Product Manager at Microsoft stated directly at SMX Munich that "schema markup helps Microsoft's LLMs understand your content." [4] This followed years of Google recommending structured data for its search features. Both major search companies are now on record in favor of it.

Structured data is code you add to your HTML that tells machines what your content means. The usual format is JSON-LD, which carries schema.org vocabulary inside a script tag. Google recommends it, and the AI ecosystem has aligned around it.
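
To make that concrete, here is a minimal Python sketch of how a JSON-LD block ends up in a page. The company name and URL are placeholders, and the helper name jsonld_script_tag is my own, not part of any library.

```python
import json

def jsonld_script_tag(data):
    """Serialize a dict as JSON-LD and wrap it in the script tag that carries it."""
    # Escape "</" so a string value can never close the script tag early.
    # "\/" is a valid JSON escape for "/", so parsers read the same value.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'

# Placeholder values, for illustration only
snippet = jsonld_script_tag({
    "@context": "https://schema.org",
    "@type": "WebSite",
    "name": "Example Startup",
    "url": "https://example.com",
})
print(snippet)
```

The printed tag goes inside the page's head or body. It renders nothing for visitors; it exists only for machines that parse the page.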

An AccuraCast study analyzed over 2,000 prompts across ChatGPT, Google AI Overviews, and Perplexity and found that 81% of pages that received AI citations included schema markup. [5] BrightEdge analysis found pages with structured data are up to 40% more likely to appear in AI summary positions. [5]

The schema types that matter most for a new startup:

Organization schema tells AI who you are. Your business name, what you do, and links to your profiles on LinkedIn, GitHub, Crunchbase. This builds entity recognition across platforms.

FAQPage schema is the single highest-ROI schema type for AI citations. FAQPage-tagged content is 3.2 times more likely to appear in Google AI Overviews. [4] AI tools are designed to extract Q&A content, and structured FAQ markup makes that trivial.

Article schema on every blog post. Include author, publish date, and headline at minimum.
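
The three types above can be sketched as plain Python dictionaries. This is a minimal version that assumes a hypothetical company, profile URLs, and FAQ answers; the property names (sameAs, mainEntity, acceptedAnswer, datePublished) come from the schema.org vocabulary, while the function names are my own.

```python
import json
from datetime import date

SCHEMA_CONTEXT = "https://schema.org"

def organization(name, url, description, profiles):
    """Organization: who you are, plus links to the same entity elsewhere."""
    return {
        "@context": SCHEMA_CONTEXT,
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        "sameAs": list(profiles),  # LinkedIn, GitHub, Crunchbase, and so on
    }

def faq_page(pairs):
    """FAQPage: one Question entry per (question, answer) pair."""
    return {
        "@context": SCHEMA_CONTEXT,
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

def article(headline, author_name, published):
    """Article: headline, author, and publish date at minimum."""
    return {
        "@context": SCHEMA_CONTEXT,
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": published.isoformat(),
    }

# Hypothetical company details, for illustration only
org = organization(
    name="Example Startup",
    url="https://example.com",
    description="Invoicing software for freelance designers.",
    profiles=[
        "https://www.linkedin.com/company/example-startup",
        "https://github.com/example-startup",
    ],
)
faq = faq_page([
    ("How much does Example Startup cost?", "Plans start at $49/month."),
    ("Is there a free trial?", "Yes, 14 days with no credit card required."),
])
post = article(
    headline="How to Get Your Startup Found by AI Tools Like ChatGPT and Perplexity",
    author_name="Nomi Barda",
    published=date(2026, 4, 13),
)

for block in (org, faq, post):
    print(json.dumps(block, indent=2))
```

Each printed object goes in its own script tag of type application/ld+json on the relevant page: Organization on the homepage, FAQPage wherever the questions appear, Article on each post.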

You can validate your schema with Google's free Rich Results Test at search.google.com/test/rich-results.
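
Before you paste a URL into that tool, a rough local check can catch the most common mistakes: a block that isn't valid JSON, or one that is missing @context or @type. The sketch below uses only Python's standard library. It is a sanity check under those two assumptions, not a substitute for Google's validator, and the sample HTML is made up.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))

def check_jsonld(html):
    """Return one (status, detail) tuple per JSON-LD block found in the HTML."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    results = []
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            results.append(("invalid JSON", str(exc)))
            continue
        if isinstance(data, dict) and "@context" in data and "@type" in data:
            results.append(("ok", data["@type"]))
        else:
            results.append(("missing @context or @type", None))
    return results

# Made-up page: one good block, one incomplete block
sample = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Startup"}
</script>
<script type="application/ld+json">{"name": "no type"}</script>
</head></html>"""

for status, detail in check_jsonld(sample):
    print(status, detail)
```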


llms.txt: worth doing, with honest expectations

llms.txt is a plain text file you place at the root of your domain that describes your site to AI tools, similar to how robots.txt works for search crawlers. It was proposed by Jeremy Howard of Answer.AI in September 2024. [6]
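
The proposal describes a markdown-style layout: a title line, a one-line summary in a blockquote, then sections that list links with short notes. Here is a minimal sketch of generating such a file, assuming a hypothetical company and URLs. The function name build_llms_txt is my own.

```python
def build_llms_txt(name, summary, sections):
    """Render an llms.txt body: title, blockquote summary, then link sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # End with exactly one trailing newline
    return "\n".join(lines).rstrip() + "\n"

# Hypothetical site details, for illustration only
llms_txt = build_llms_txt(
    name="Example Startup",
    summary="Invoicing software for freelance designers.",
    sections={
        "Product": [
            ("Pricing", "https://example.com/pricing", "Current plans and prices"),
            ("Features", "https://example.com/features", "What the product does"),
        ],
        "Company": [
            ("About", "https://example.com/about", "Who builds it"),
        ],
    },
)
print(llms_txt)
```

Save the output as llms.txt at the root of your domain, so it is served from /llms.txt.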

Here's what you need to know before implementing it: as of early 2026, no major AI platform has officially confirmed that it reads llms.txt files. One server log analysis found that GPTBot, ClaudeBot, PerplexityBot, and Google's AI crawler made zero visits to the llms.txt file across a three-month observation period in 2025. [7]
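
You can run the same kind of check on your own server. The sketch below assumes access logs in the common "combined" format and matches on the three crawler names mentioned above. The sample log lines and user-agent strings are invented for illustration, and real crawler user agents may differ, so treat the matching as approximate.

```python
import re
from collections import Counter

# Crawler names taken from the text above; matched as substrings of the user agent
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Combined log format: ... "METHOD /path PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_llms_txt_fetches(lines):
    """Count requests for /llms.txt per AI crawler."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        # Ignore any query string when comparing the path
        if match.group("path").split("?")[0] != "/llms.txt":
            continue
        for bot in AI_CRAWLERS:
            if bot in match.group("agent"):
                hits[bot] += 1
    return hits

# Invented log lines, for illustration only
sample_log = [
    '203.0.113.5 - - [10/Mar/2025:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '203.0.113.6 - - [10/Mar/2025:10:01:00 +0000] "GET /pricing HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.7 - - [10/Mar/2025:10:02:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Macintosh)"',
]

hits = count_llms_txt_fetches(sample_log)
for bot in AI_CRAWLERS:
    print(bot, hits[bot])
```

To use it on a real log, pass an open file object instead of the sample list, since the function accepts any iterable of lines.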

That said, major companies including Anthropic, Cloudflare, and Stripe have implemented it. Google included it in its Agent2Agent (A2A) protocol. [6]

The honest case for implementing it: it takes under an hour, it carries no downside, and it pays off if platforms eventually adopt the standard. It also signals to anyone checking your site that you're thinking about AI readiness. If you have limited time, structured data and third-party presence matter more right now.


What actually moves the needle: third-party presence

The Surfer AI Tracker analyzed 36 million AI Overviews between March and August 2025 and found a consistent pattern: AI trusts institutional authority and community content. [8] Wikipedia, Reddit, YouTube, LinkedIn, and niche directories dominate citations across categories.

For a new startup, you can't get a Wikipedia page on day one. But you can:

Get listed on startup directories. Product Hunt, Indie Hackers, BetaList, AlternativeTo, There's An AI For That, G2. Each listing is a named reference to your brand from a domain that AI systems already cite regularly.

Publish on platforms AI already cites. Medium and Substack both appear in AI citation data. [9] Posts there link back to your domain and create a consistent trail of information about your product.

Answer questions in your niche on Reddit. Reddit is among the most cited sources across ChatGPT, Perplexity, and Google AI Overviews. [8] A genuine, helpful answer that mentions what you're building creates a citation-worthy reference.

Keep your information consistent across platforms. If your pricing is $49/month on your site, it should be $49/month everywhere. Inconsistency is one of the main reasons AI tools get product descriptions wrong.
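
Consistency is easy to check by hand when you have three listings and tedious when you have fifteen. As a rough sketch, assuming you keep a copy of each listing's text, a few lines of Python can flag pricing that disagrees. The listing names and text below are hypothetical, and the pattern only understands prices written like $49/month or $49/mo, so a product with several tiers will need something smarter.

```python
import re
from collections import defaultdict

# Matches prices such as "$49/month", "$49/mo", "$490 / year"
PRICE = re.compile(r"\$(\d+(?:\.\d{2})?)\s*/\s*(month|mo|year|yr)\b", re.IGNORECASE)
PERIODS = {"mo": "month", "month": "month", "yr": "year", "year": "year"}

def find_price_conflicts(listings):
    """Map each distinct price to the listings that state it.

    Returns an empty dict when every listing agrees (or no price is found).
    """
    seen = defaultdict(set)
    for source, text in listings.items():
        for amount, period in PRICE.findall(text):
            normalized = f"${float(amount):g}/{PERIODS[period.lower()]}"
            seen[normalized].add(source)
    if len(seen) <= 1:
        return {}
    return {price: sorted(sources) for price, sources in seen.items()}

# Hypothetical listing text, for illustration only
listings = {
    "website": "Pro plan: $49/month, billed monthly.",
    "Product Hunt": "Starts at $49/mo.",
    "old directory listing": "Only $39/month!",
}

conflicts = find_price_conflicts(listings)
for price, sources in conflicts.items():
    print(price, "->", ", ".join(sources))
```

Here the stale $39/month listing is the one to go fix.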


The practical checklist

If you're launching something and want AI tools to find you accurately, here's what to do from day one:

  1. Add Organization JSON-LD schema to your homepage with your business name, description, and links to your profiles
  2. Add FAQPage schema answering the most common questions about your product
  3. Submit to at least 5 startup directories in your first week
  4. Publish at least one article on Medium or Substack describing what your product does and who it serves
  5. Create a LinkedIn company page with a complete description
  6. Add llms.txt if you have time, but treat it as optional for now

None of these are technically hard. Most founders skip them because they're focused on building. The result is a product that exists but that AI doesn't know about.


Common questions

Does Google ranking affect whether AI cites me?
For Google AI Overviews, ranking helps. For ChatGPT and Perplexity, much less so. Ahrefs found that about 90% of the pages ChatGPT cites rank at position 21 or lower. [2] You don't need to be on page one.

How long does it take to show up in AI responses?
It varies. Perplexity crawls fresh content aggressively. ChatGPT tends to cite older, more established content. Google AI Overviews correlate more with traditional ranking signals, which take months to build.

Does llms.txt actually work right now?
No major AI platform has confirmed using it, and the server log analysis cited earlier found that AI crawlers weren't fetching the file. [7] It's a reasonable future-proofing step but not a current traffic driver.

Does schema markup affect my Google search ranking?
No. Google's John Mueller confirmed in 2025 that structured data doesn't directly influence rankings. [10] It affects rich snippet display and AI citation probability.

If you'd rather have all of this set up automatically against your existing site, Bonai generates your AEO (answer engine optimization) infrastructure from your URL: Organization schema, FAQPage schema, llms.txt, and your initial directory listing content. It then runs a weekly audit across ChatGPT, Perplexity, and Gemini to catch what AI gets wrong and correct it with calibrated content.

Sources

  1. BrightEdge. AI Overview Citation and Organic Rank Overlap, 16-month study.
  2. Position Digital. 90+ AI SEO Statistics for 2025.
  3. The Digital Bloom. 2025 AI Visibility Report: How LLMs Choose What Sources to Mention.
  4. GEORaiser. Schema Markup for AI: Why JSON-LD Is the New SEO.
  5. Genrank. JSON-LD Schema: The Secret Language AI Engines Understand.
  6. GetPublii. The Complete Guide to llms.txt.
  7. Semrush. What Is llms.txt & Should You Use It?
  8. Surfer SEO. AI Citation Report 2025: Which Sources AI Overviews Trust Most.
  9. Semrush. The Most-Cited Domains in AI: A 3-Month Study.
  10. Digidop. Structured Data: SEO and GEO Optimization for AI.