This repository contains the complete dataset and analysis from our comprehensive study of AI search citation patterns in Perplexity. We analyzed 23,936 citations across 6,606 domains to understand how AI search engines determine authority and select sources.
- YouTube dominates with 3.3x more citations than any other domain
- Citation positions are evenly distributed (20% each), unlike rankings in traditional SEO
- 25 domains achieved perfect cross-vertical presence
- $0.0054 average cost per query with 100% success rate
The Day We Discovered AI Search Doesn't Care About Your #1 Rankings →
├── /data/ # Raw and processed datasets
- Download the complete dataset: data/processed/
- Read the full analysis: authority-mapping.md
- top_domains_authority.csv: Top 50 domains by authority score
- citation_patterns.csv: Citation patterns by vertical and query type
- cross_vertical_analysis.csv: Cross-vertical authority analysis
- cost_efficiency_metrics.csv: Query cost and efficiency data
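As a quick sanity check after downloading, the four files listed above can be inspected in one pass. The sketch below is a minimal example, not part of the repository: it assumes all four files sit in `data/processed/` and are UTF-8 encoded, and the function name `summarize_processed` is made up for illustration. It uses only the standard library, so it runs without pandas.

```python
import csv
from pathlib import Path

# File names as listed in this README; their location in one shared
# directory is an assumption.
PROCESSED_FILES = [
    "top_domains_authority.csv",
    "citation_patterns.csv",
    "cross_vertical_analysis.csv",
    "cost_efficiency_metrics.csv",
]


def summarize_processed(data_dir="data/processed"):
    """Return {file name: (column names, row count)} for each file that exists."""
    summary = {}
    for name in PROCESSED_FILES:
        path = Path(data_dir) / name
        if not path.exists():
            # Skip rather than fail, so a partial download is still usable.
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])  # first row holds the column names
            row_count = sum(1 for _ in reader)  # remaining rows are data
        summary[name] = (header, row_count)
    return summary


if __name__ == "__main__":
    for name, (columns, rows) in summarize_processed().items():
        print(f"{name}: {rows} rows, columns = {columns}")
```

Printing the column names first is a cheap way to confirm the actual schema before writing any analysis against it.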
| Metric | Value |
|---|---|
| Total Domains Analyzed | 6,606 |
| Total Citations | 23,936 |
| Total Queries | 4,835 |
| Verticals Covered | 23 |
| Query Success Rate | 100% |
| Total Cost | $26.09 |
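The per-query figures can be derived from the totals in the table. The check below uses only the values reported above; the variable names are illustrative.

```python
# Totals copied from the summary table in this README.
total_cost_usd = 26.09
total_queries = 4835
total_citations = 23936
total_domains = 6606

cost_per_query = total_cost_usd / total_queries
citations_per_query = total_citations / total_queries
citations_per_domain = total_citations / total_domains

print(f"Cost per query:       ${cost_per_query:.4f}")
print(f"Citations per query:  {citations_per_query:.2f}")
print(f"Citations per domain: {citations_per_domain:.2f}")
```

The first value rounds to $0.0054, which matches the average cost per query quoted in this README.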
- Cross-vertical presence is associated with higher authority (correlation: 0.381)
- First-position citations are the strongest predictor of authority
- Query complexity has minimal impact on cost ($0.0054 avg)
- Direct quotes represent only 5% of citations
import pandas as pd
# Load the authority scores
authority_df = pd.read_csv('data/processed/top_domains_authority.csv')
# View top 10 domains
print(authority_df.head(10))

This project is licensed under the MIT License - see LICENSE.md for details.
- Email: hello@hueston.co
- LinkedIn: Jeremy Meindl
Lahaina Strong!
Keywords: AI search, Perplexity, SEO research, citation analysis, authority scoring