Ever wonder how your GitHub work across orgs maps to the industry? RepoMapper analyzes README content from all your repos and subdirectories, categorizes them by domain, and visualizes your technical footprint with AI-powered insights. Auto-updates weekly.

haKC-ai/RepoMapper

                 ██████████
                █▓       ░██
                █▒        ██ T H E   P I N A C L E  O F  H A K C I N G   Q U A L I T Y
    █████████████░        █████████████████ ████████████ ████████████      ████████████
   ██         ███░        ███▓▒▒▒▒▒▒▒▒▒▒▒██ █▒▒▒▒▒▒▒▒▓████        █████████▓          ▒█
   ██         ███         ███▒▒▒▒▒▒▒▒▒▒▒▒▓██████████████▓        ███▓▒      ▒▓░       ▒█
   ██         ███        ░██▓▒▒▒▒▒▒▒▒▒▒▒▒▒▓██▓▒▒▒▒▒▒▒▒█▓        ███░       ░██░       ▒█
   ██         ███        ▒██▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒██▓▒▒▒▒▒▒▒▓▒        ██  ▓        ██░       ▓█
   ██         ██▓        ███▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒█▓▒▒▒▒▒▒▒▓▒       ██   █        ██░       ▓
   ██         ██▒        ██▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓▓▒▒▒▒▒▒▒▓▒      ██    █        ▓█████████
   ██                    ██▒▒▒▒▒▒▒▒█▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓▒   ▒███████ █░       ░▓        █
   ██         ░░         ██▒▒▒▒▒▒▒▒██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓█ ▓        ░█ ▓       ░▒       ░█
   ██         ██░       ░█▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓█ █░        ▒ █                ░█
   ██         ██        ▓█▒▒▒▒▒▒▒▒▒██▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓█ █░        ▒ █░               ▒█
    ██████████  ███████████▓██▓▓█▓█  █▓▒▒▒▒▒▒▒▒▒▓██▓██   █▓▓▓▓▓▓▓█    █▓▓▓▓▓▓▓▓▓▓▓▓▓▓██
  .:/====================█▓██▓██=========████▓█▓█ ███======> [ P R E S E N T S ] ====\:.
                           ██▓██           █▓▓▓██ ██  ▄▄▄▄ ▄▄ ▄▄▄▄▄▄ ██▄  ▄██ ▄████▄ █████▄
                             █▓█             ██▓██   ██ ▄▄ ██   ██   ██ ▀▀ ██ ██▄▄██ ██▄▄█▀
                              ██               ███   ▀███▀ ██   ██   ██    ██ ██  ██ ██

RepoMapper Banner

GitHub Repository Mapper

Automatically discover, analyze, and categorize all your GitHub repositories by cybersecurity domain and purpose. Generates an evergreen mind map that updates weekly.

Overview

Features

  • Automatic Discovery: Fetches all repositories from your personal account and organizations
  • AI-Powered Categorization: Uses Claude AI to analyze README files and categorize repos
  • Cybersecurity Focus: 14 specialized cybersecurity categories plus general development categories
  • Multiple Output Formats: XMind JSON, Markdown tree view, and statistical reports
  • Intelligent Caching: Only re-analyzes repos when needed (configurable cache period)
  • Weekly Automation: GitHub Actions workflow for automatic weekly updates
  • Evergreen: Continuously maintains up-to-date repository mappings

Categories

The system intelligently categorizes your repositories into specialized domains:

---
config:
  theme: base
  themeVariables:
    primaryColor: '#fff'
    primaryTextColor: '#1e40af'
    primaryBorderColor: '#cbd5e1'
    lineColor: '#94a3b8'
    fontSize: 10px
    fontFamily: Lato
  layout: elk
  look: neo
  elk:
    mergeEdges: true
    nodePlacementStrategy: SIMPLE
  flowchart:
    nodeSpacing: 10
    rankSpacing: 15
    curve: linear
    htmlLabels: true
    padding: 8
---
flowchart TB
    subgraph core["Core Security Domains"]
        direction LR
        subgraph row1[" "]
            direction LR
            subgraph c1["Risk Mgmt"]
                direction TB
                C1A(["Lines of Defense"])
                C1B(["Risk Treatment"])
                C1C(["BCP/DR"])
                C1D(["Cyber Insurance"])
            end
            subgraph c2["Governance"]
                direction TB
                C2A(["Laws & Regs"])
                C2B(["Policies"])
                C2C(["Standards"])
                C2D(["Audit"])
            end
            subgraph c3["Architecture"]
                direction TB
                C3A(["Network"])
                C3B(["Crypto"])
                C3C(["Access Control"])
                C3D(["Cloud"])
            end
        end
        subgraph row2[" "]
            direction LR
            subgraph c4["Operations"]
                direction TB
                C4A(["SOC"])
                C4B(["IR"])
                C4C(["SIEM/SOAR"])
                C4D(["Threat Hunt"])
            end
            subgraph c5["AppSec"]
                direction TB
                C5A(["S-SDLC"])
                C5B(["SAST/DAST"])
                C5C(["API"])
                C5D(["Code Scan"])
            end
            subgraph c6["Assessment"]
                direction TB
                C6A(["Pen Test"])
                C6B(["Vuln Scan"])
                C6C(["3rd Party"])
                C6D(["Assets"])
            end
        end
    end

    subgraph extended[" "]
        direction LR
        subgraph intel["Threat Intel"]
            direction TB
            I1(["IOCs"])
            I2(["OSINT"])
            I3(["Malware"])
        end
        subgraph hack["Hacking"]
            direction TB
            H1(["Exploits"])
            H2(["Red Team"])
            H3(["Social Eng"])
        end
        subgraph ai["AI & ML"]
            direction TB
            A1(["LLMs"])
            A2(["AI Security"])
            A3(["ML Models"])
        end
        subgraph research["Research"]
            direction TB
            R1(["Papers"])
            R2(["PoC Code"])
            R3(["Datasets"])
        end
    end

    subgraph growth[" "]
        direction LR
        subgraph edu["Education"]
            direction TB
            E1(["Awareness"])
            E2(["CTF"])
            E3(["Certs"])
        end
        subgraph dev["Development"]
            direction TB
            D1(["Web Dev"])
            D2(["DevOps"])
            D3(["Automation"])
        end
        subgraph proj["Projects"]
            direction TB
            P1(["Personal"])
            P2(["Hardware"])
            P3(["IoT"])
        end
    end

    C4D -.-> I1
    C4B -.-> I3
    C6A -.-> H1
    I3 -.-> A2
    H1 -.-> R2
    E2 -.-> H1
    D2 -.-> C5A
    P1 -.-> R2

    style row1 fill:none,stroke:none
    style row2 fill:none,stroke:none
    style extended fill:none,stroke:none
    style growth fill:none,stroke:none

    style c1 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8
    style c2 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8
    style c3 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8
    style c4 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8
    style c5 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8
    style c6 fill:#f0f9ff,stroke:#0284c7,rx:8,ry:8

    style C1A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C1B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C1C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C1D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C2A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C2B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C2C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C2D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C3A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C3B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C3C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C3D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C4A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C4B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C4C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C4D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C5A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C5B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C5C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C5D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C6A fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C6B fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C6C fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5
    style C6D fill:#e0f2fe,stroke:#0284c7,rx:5,ry:5

    style intel fill:#faf5ff,stroke:#7c3aed,rx:8,ry:8
    style hack fill:#fdf2f8,stroke:#db2777,rx:8,ry:8
    style ai fill:#ecfdf5,stroke:#059669,rx:8,ry:8
    style research fill:#e0f7fa,stroke:#0891b2,rx:8,ry:8
    style edu fill:#fffbeb,stroke:#d97706,rx:8,ry:8
    style dev fill:#fff7ed,stroke:#ea580c,rx:8,ry:8
    style proj fill:#f5f5f4,stroke:#78716c,rx:8,ry:8

    style I1 fill:#f3e8ff,stroke:#7c3aed,rx:5,ry:5
    style I2 fill:#f3e8ff,stroke:#7c3aed,rx:5,ry:5
    style I3 fill:#f3e8ff,stroke:#7c3aed,rx:5,ry:5
    style H1 fill:#fce8f4,stroke:#db2777,rx:5,ry:5
    style H2 fill:#fce8f4,stroke:#db2777,rx:5,ry:5
    style H3 fill:#fce8f4,stroke:#db2777,rx:5,ry:5
    style A1 fill:#d1fae5,stroke:#059669,rx:5,ry:5
    style A2 fill:#d1fae5,stroke:#059669,rx:5,ry:5
    style A3 fill:#d1fae5,stroke:#059669,rx:5,ry:5
    style R1 fill:#cffafe,stroke:#0891b2,rx:5,ry:5
    style R2 fill:#cffafe,stroke:#0891b2,rx:5,ry:5
    style R3 fill:#cffafe,stroke:#0891b2,rx:5,ry:5
    style E1 fill:#fef3c7,stroke:#d97706,rx:5,ry:5
    style E2 fill:#fef3c7,stroke:#d97706,rx:5,ry:5
    style E3 fill:#fef3c7,stroke:#d97706,rx:5,ry:5
    style D1 fill:#ffedd5,stroke:#ea580c,rx:5,ry:5
    style D2 fill:#ffedd5,stroke:#ea580c,rx:5,ry:5
    style D3 fill:#ffedd5,stroke:#ea580c,rx:5,ry:5
    style P1 fill:#e7e5e4,stroke:#78716c,rx:5,ry:5
    style P2 fill:#e7e5e4,stroke:#78716c,rx:5,ry:5
    style P3 fill:#e7e5e4,stroke:#78716c,rx:5,ry:5

    style core fill:#f8fafc,stroke:#0284c7,stroke-width:2px,rx:10,ry:10

    linkStyle 0 stroke:#0284c7,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 1 stroke:#0284c7,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 2 stroke:#0284c7,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 3 stroke:#7c3aed,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 4 stroke:#db2777,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 5 stroke:#d97706,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 6 stroke:#ea580c,stroke-width:1px,stroke-dasharray:4,fill:none
    linkStyle 7 stroke:#78716c,stroke-width:1px,stroke-dasharray:4,fill:none

Cybersecurity Categories (14)

  • Offensive Security - Penetration Testing, Exploit Development, Red Teaming
  • Defensive Security - Blue Teaming, Incident Response, Security Monitoring
  • Vulnerability Management - Scanning, Assessment, Bug Bounty
  • Application Security - SAST/DAST, Code Review, API Security
  • Network Security - Firewall, IDS/IPS, Network Monitoring
  • Cloud Security - AWS, Azure, GCP, Container Security
  • Identity & Access Management - Authentication, Authorization, SSO, PAM
  • Cryptography - Encryption, PKI, Hashing, Key Management
  • Malware Analysis - Reverse Engineering, Forensics, Sandboxing
  • Security Operations - SIEM, SOAR, Log Management
  • Compliance & Governance - GRC, Policy, Audit, Risk Management
  • Data Security - DLP, Privacy, Database Security
  • Security Tools & Utilities - Automation, Frameworks, Libraries
  • Education & Training - Tutorials, CTF, Documentation

General Categories (4)

  • General Development - Web, Backend, Frontend, DevOps
  • Data Science & Analytics - ML, Data Analysis, Visualization
  • Infrastructure & Systems - SysAdmin, Monitoring, Deployment
  • Other/Miscellaneous - Personal Projects, Experiments

Installation

Prerequisites

  • Python 3
  • GitHub CLI (gh)
  • An Anthropic API key

Quick Setup

sequenceDiagram
    participant User
    participant Installer
    participant Venv
    participant Dependencies

    User->>Installer: ./installer.sh
    Installer->>Venv: Create virtual environment
    Venv-->>Installer: ✓ venv created
    Installer->>Dependencies: Install anthropic + hakcer
    Dependencies-->>Installer: ✓ packages installed
    Installer->>User: ✓ Setup complete!
    Note over User: export ANTHROPIC_API_KEY
    User->>User: ./run.sh

Setup Steps

  1. Clone the repository
git clone https://github.com/haKC-ai/RepoMapper.git
cd RepoMapper
  2. Run the installer
./installer.sh

This will:

  • Create a Python virtual environment
  • Install all dependencies (anthropic, hakcer)
  • Set up directory structure
  • Make scripts executable
  3. Authenticate with GitHub CLI (if not already done)
gh auth login
  4. Set your Anthropic API key
export ANTHROPIC_API_KEY='your-api-key-here'

Or add it permanently to your shell profile:

echo 'export ANTHROPIC_API_KEY="your-api-key-here"' >> ~/.bashrc
source ~/.bashrc

Usage

Organization Selection (Optional)

Before running the mapper, you can interactively select which organizations to include:

source venv/bin/activate
python3 select_orgs.py

This launches an interactive checkbox interface where:

  • All organizations are INCLUDED by default
  • Enter a number to toggle an organization on/off
  • Enter 'all' to select everything
  • Enter 'none' to deselect everything
  • Enter 'done' when finished

Your selections are saved to config.yaml and will be respected by future runs.
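
The toggle behaviour described above can be sketched as a small pure function. This is only an illustration, not the code in select_orgs.py: the name `apply_command`, the set-based selection, the 1-based menu numbering, and the organization names are all assumptions.

```python
def apply_command(orgs, selected, command):
    """Return (new_selection, finished) after one line of user input.

    orgs:     ordered list of organization names shown in the menu
    selected: set of names currently included
    command:  raw text typed by the user
    """
    command = command.strip().lower()
    if command == "done":
        return set(selected), True
    if command == "all":
        return set(orgs), False
    if command == "none":
        return set(), False
    if command.isdecimal() and 1 <= int(command) <= len(orgs):
        name = orgs[int(command) - 1]      # assumed: menu entries are numbered from 1
        # Symmetric difference flips membership of exactly this one org.
        return set(selected) ^ {name}, False
    return set(selected), False            # unrecognized input: selection unchanged


orgs = ["acme", "haKC-ai", "widgets"]      # hypothetical organization names
selection = set(orgs)                      # everything is included by default
for line in ["2", "none", "3", "done"]:
    selection, finished = apply_command(orgs, selection, line)
```

Keeping the command handling free of input and file I/O makes it easy to test separately from the prompt loop and from whatever writes config.yaml.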

Quick Start

Run all three steps in sequence:

# Optional: Select which organizations to include
python3 select_orgs.py

# Step 1: Fetch all repositories and READMEs
chmod +x repo_mapper.sh
./repo_mapper.sh

# Step 2: Analyze and categorize repositories
python3 analyze_repos.py

# Step 3: Generate visualizations
python3 generate_xmind.py

Viewing Results

After running the scripts, you'll find the following in the output/ directory:

  1. repo_map.xmind.json - Import this into XMind for interactive mind mapping
  2. repo_map.md - Markdown tree view (readable in any text editor)
  3. repo_statistics.json - Statistical breakdown of your repositories

Importing into XMind

  1. Open XMind (8 or later)
  2. Go to File → Import
  3. Select output/repo_map.xmind.json
  4. Explore your repository map!

Alternative mind mapping tools that support JSON import:

  • MindMeister
  • Coggle
  • MindNode (with conversion)

Automation

GitHub Actions (Recommended)

For evergreen weekly updates:

  1. Push this repository to GitHub
git init
git add .
git commit -m "Initial commit: Repository mapper"
gh repo create my-repo-map --private --source=. --push
  2. Add the Anthropic API key as a repository secret
gh secret set ANTHROPIC_API_KEY
# Paste your API key when prompted
  3. Enable GitHub Actions

The workflow is already configured in .github/workflows/weekly-update.yml and will:

  • Run every Monday at 9 AM UTC
  • Fetch latest repository data
  • Analyze new/updated repositories
  • Generate fresh visualizations
  • Commit updates automatically
  • Upload artifacts for download
  4. Manual Trigger

You can also trigger the workflow manually:

gh workflow run "Weekly Repository Map Update"

Local Cron Job

Alternatively, set up a local cron job:

# Edit crontab
crontab -e

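# Add this line to run every Monday at 9 AM (system local time).
# cron does not load your shell profile, so ANTHROPIC_API_KEY must be set in the crontab or a wrapper script.
# The venv interpreter is called directly because cron does not activate the virtual environment.
0 9 * * 1 cd /path/to/repo_map && ./repo_mapper.sh && venv/bin/python3 analyze_repos.py && venv/bin/python3 generate_xmind.py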

Configuration

Edit config.yaml to customize:

  • Cache duration: How long before re-analyzing repos
  • Model selection: Choose between Claude models
  • Repository filters: Include/exclude archived, forked, or private repos
  • Custom categories: Add your own categorization schemes
  • Output formats: Enable/disable different output types
  • Notifications: Slack, Discord, or email alerts
  • Schedule: Change the automation frequency

Data Storage

repo_map/
├── data/                      # Repository metadata
│   ├── orgs.json             # Your organizations
│   ├── all_repos.json        # All discovered repositories
│   └── cache/                # README files (cached)
│       └── owner_repo_README.md
├── output/                    # Generated outputs
│   ├── repo_map.xmind.json   # XMind mind map
│   ├── repo_map.md           # Markdown tree
│   ├── repo_statistics.json  # Statistics
│   └── repo_analysis.json    # Full analysis data
└── .github/workflows/         # Automation

How It Works

graph LR
    subgraph "Phase 1: Discovery"
        A[GitHub API] -->|gh CLI| B[Fetch Orgs]
        B --> C[Fetch Repos]
        C --> D[Download READMEs]
        D --> E[Cache Files]
    end

    subgraph "Phase 2: Analysis"
        E --> F[Load Metadata]
        F --> G[Claude AI]
        G -->|Categorize| H[Assign Category]
        H -->|Tag| I[Add Metadata]
        I --> J[Cache Analysis]
    end

    subgraph "Phase 3: Visualization"
        J --> K[Build Hierarchy]
        K --> L[XMind JSON]
        K --> M[Markdown Tree]
        K --> N[Statistics]
    end

    style G fill:#ff9900,stroke:#ff6600,stroke-width:2px
    style L fill:#00cc88,stroke:#009966,stroke-width:2px
    style M fill:#00cc88,stroke:#009966,stroke-width:2px
    style N fill:#00cc88,stroke:#009966,stroke-width:2px

Phase Breakdown

1. Discovery Phase (repo_mapper.sh)

  • Fetches all your GitHub organizations using gh api user/orgs
  • Lists all repositories from each org and your personal account
  • Downloads README files for each repository
  • Caches READMEs to avoid redundant fetches (7-day cache)
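
For readers who prefer Python to shell, the first discovery step might look roughly like the sketch below. It is not a port of repo_mapper.sh: the helper names `gh_json` and `list_org_logins` are made up for illustration, and the injectable `run` argument exists only so the parsing can be exercised without calling the real CLI. The `gh api user/orgs` call comes from the description above, and each organization object in the GitHub REST response carries a `login` field.

```python
import json
import subprocess


def gh_json(args, run=subprocess.run):
    """Run a gh command and decode its JSON output."""
    result = run(["gh", *args], capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def list_org_logins(run=subprocess.run):
    """Return the login name of every organization on the first result page."""
    orgs = gh_json(["api", "user/orgs"], run=run)
    return [org["login"] for org in orgs]
```

Calling `list_org_logins()` requires an authenticated gh session. The REST API returns 30 items per page by default, so accounts with more organizations than that would need pagination, which this sketch leaves out.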

2. Analysis Phase (analyze_repos.py)

  • Loads repository metadata and README content
  • Sends each repo to Claude AI for intelligent categorization
  • Claude analyzes the purpose, technology stack, and domain
  • Assigns primary category, subcategory, confidence level, and tags
  • Caches analysis results (re-analyzes after 7 days)
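
The 7-day re-analysis rule could be expressed as below. This is a sketch under assumptions: the `RepoAnalysis` record, its field names, and the string confidence value are hypothetical stand-ins for whatever analyze_repos.py actually stores.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class RepoAnalysis:
    """One cached categorization result (field names are assumptions)."""
    category: str
    subcategory: str
    confidence: str
    tags: list = field(default_factory=list)
    analyzed_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def needs_analysis(record: Optional[RepoAnalysis], now=None, max_age_days=7):
    """True when there is no cached result or it is older than max_age_days."""
    if record is None:
        return True
    now = now or datetime.now(timezone.utc)
    return now - record.analyzed_at > timedelta(days=max_age_days)


now = datetime(2025, 12, 15, tzinfo=timezone.utc)
recent = RepoAnalysis("Cryptography", "PKI", "high", ["tls"],
                      datetime(2025, 12, 10, tzinfo=timezone.utc))   # 5 days old
old = RepoAnalysis("Cryptography", "PKI", "high", ["tls"],
                   datetime(2025, 12, 1, tzinfo=timezone.utc))       # 14 days old
fresh = needs_analysis(recent, now)
stale = needs_analysis(old, now)
```

Only repositories for which `needs_analysis` returns True would be sent to Claude again, which is what keeps weekly runs cheap.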

3. Visualization Phase (generate_xmind.py)

  • Organizes repositories into hierarchical structure
  • Generates XMind-compatible JSON format
  • Creates Markdown tree view for text-based browsing
  • Compiles statistics (by category, language, relevance)
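
A minimal sketch of the grouping and Markdown rendering steps follows. The input shape (dicts with `repo`, `category`, and `subcategory` keys), the function names, and the sample repositories are assumptions for illustration, not the structures generate_xmind.py necessarily uses.

```python
from collections import Counter, defaultdict


def build_hierarchy(analyses):
    """Group repo names as {category: {subcategory: [repo, ...]}}."""
    tree = defaultdict(lambda: defaultdict(list))
    for item in analyses:
        tree[item["category"]][item["subcategory"]].append(item["repo"])
    return tree


def render_markdown(tree):
    """Render the hierarchy as a nested Markdown bullet list, sorted by name."""
    lines = []
    for category in sorted(tree):
        lines.append(f"- {category}")
        for sub in sorted(tree[category]):
            lines.append(f"  - {sub}")
            for repo in sorted(tree[category][sub]):
                lines.append(f"    - {repo}")
    return "\n".join(lines)


analyses = [  # hypothetical analysis results
    {"repo": "acme/fuzzer", "category": "Offensive Security", "subcategory": "Exploit Development"},
    {"repo": "acme/siem-rules", "category": "Security Operations", "subcategory": "SIEM"},
    {"repo": "acme/c2-lab", "category": "Offensive Security", "subcategory": "Red Teaming"},
]
markdown = render_markdown(build_hierarchy(analyses))
per_category = Counter(item["category"] for item in analyses)
```

The same nested dictionary can feed both the tree view and the per-category counts, so the analysis data only needs to be walked once per output.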

Advanced Usage

Filter Repositories

Edit config.yaml to exclude certain repos:

filters:
  exclude_patterns:
    - "^test-.*"        # Exclude repos starting with "test-"
    - ".*-backup$"      # Exclude backup repos
    - "^archive-.*"     # Exclude archived projects
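
To see how such patterns behave, here is a small sketch that applies them with Python's `re` module. It assumes the patterns are matched against the bare repository name (without the owner prefix) using `re.search`; the function name `filter_repos` and the sample names are hypothetical, and the real tool may match differently.

```python
import re


def filter_repos(names, exclude_patterns):
    """Drop every repository name that matches at least one exclude pattern."""
    compiled = [re.compile(pattern) for pattern in exclude_patterns]
    return [name for name in names
            if not any(rx.search(name) for rx in compiled)]


patterns = ["^test-.*", ".*-backup$", "^archive-.*"]
names = ["test-harness", "scanner", "db-backup", "archive-2019", "backup-tool"]
kept = filter_repos(names, patterns)
```

With these inputs only `scanner` and `backup-tool` remain: the `$` anchor means `-backup` must end the name, so `backup-tool` is not excluded.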

Custom Categories

Add domain-specific categories in config.yaml:

categories:
  custom_categories:
    - key: "blockchain_security"
      name: "Blockchain & Web3 Security"
      subcategories:
        - "Smart Contract Auditing"
        - "DeFi Security"
        - "Wallet Security"
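
One plausible way to combine such entries with the built-in categories is a keyed merge, sketched below. The built-in dictionary shape, the `cryptography` key, and the function name `merge_categories` are assumptions; only the `key`, `name`, and `subcategories` fields come from the config example above.

```python
def merge_categories(builtin, custom_categories):
    """Return a new mapping of category key -> definition, custom entries last."""
    merged = {key: dict(value) for key, value in builtin.items()}
    for entry in custom_categories:
        merged[entry["key"]] = {
            "name": entry["name"],
            "subcategories": list(entry.get("subcategories", [])),
        }
    return merged


builtin = {  # hypothetical built-in definition
    "cryptography": {
        "name": "Cryptography",
        "subcategories": ["Encryption", "PKI", "Hashing", "Key Management"],
    },
}
custom = [{
    "key": "blockchain_security",
    "name": "Blockchain & Web3 Security",
    "subcategories": ["Smart Contract Auditing", "DeFi Security", "Wallet Security"],
}]
merged = merge_categories(builtin, custom)
```

In this sketch a custom entry that reuses a built-in key would replace the built-in definition; whether the real tool overrides or rejects duplicates is not stated in this README.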

Change AI Model

Use a more powerful model for better analysis:

analysis:
  model: "claude-opus-4-5-20251101"  # More accurate but slower/costlier

Schedule Frequency

Modify the cron expression in .github/workflows/weekly-update.yml:

schedule:
  - cron: '0 9 * * *'  # Daily at 9 AM UTC instead of weekly

Cost Estimates

Using Claude Sonnet 4.5:

  • Per repository: ~1,000 tokens input + 500 tokens output
  • 100 repositories: ~$0.40 USD per full analysis
  • Weekly updates: Minimal cost due to caching (only new/updated repos)

Troubleshooting

"ANTHROPIC_API_KEY not set"

export ANTHROPIC_API_KEY='your-key-here'

"gh: command not found"

Install GitHub CLI: https://cli.github.com/

Rate Limiting

Increase the delay in config.yaml:

analysis:
  rate_limit_delay: 1.0  # seconds between requests
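
The effect of this setting can be pictured as a fixed pause between consecutive requests. The generator below is a sketch of that idea, not the tool's implementation; the name `throttled` and the injectable `sleep` argument are assumptions, the latter added so the pauses can be observed without actually waiting.

```python
import time


def throttled(items, delay, sleep=time.sleep):
    """Yield items one at a time, pausing `delay` seconds between them."""
    for index, item in enumerate(items):
        if index > 0:          # no pause before the first request
            sleep(delay)
        yield item


pauses = []                    # records each requested pause instead of sleeping
processed = [repo for repo in throttled(["a", "b", "c"], 1.0, sleep=pauses.append)]
```

Three items produce two pauses of 1.0 seconds each, so raising `rate_limit_delay` adds roughly one delay per repository to a full run.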

No repositories found

Make sure you're authenticated:

gh auth status
gh auth login

XMind import fails

Try the Markdown output instead: output/repo_map.md

Privacy & Security

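  • All analysis runs locally or in your own GitHub Actions runs
  • README content (and the repository metadata used for categorization) is sent to the Anthropic Claude API; no other third party receives data
  • README content is cached locally
  • Private repositories are included by default (configurable), so their READMEs are also sent to Claude unless you exclude them
  • For automated runs, the API key is stored as an encrypted GitHub Actions secret; for local runs it lives in your shell environment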

Contributing

Suggestions for improvements:

  • Additional cybersecurity categories
  • Alternative visualization formats
  • Integration with other tools
  • Enhanced analysis prompts

License

MIT License - Feel free to modify and distribute

Support

For issues or questions:

  1. Check the troubleshooting section
  2. Review GitHub Actions logs for automated runs
  3. Examine output/repo_analysis.json for detailed results

Built with:

  • GitHub CLI (gh)
  • Claude AI by Anthropic
  • Python 3
  • GitHub Actions
