∞GPT: Training Large Language Models For Any-To-Any Generation

This is the course project for 11-785: Introduction to Deep Learning (Fall 2025) at Carnegie Mellon University. The project report (InfGPT.pdf) documents the details. The project focuses on training large language models for any-to-any generation tasks, including multimodal tasks involving images and speech.

Directory Structure

Main

external/
Directory for external dependencies or third-party scripts/tools.
logs/
Contains logs generated during evaluations.
MMMUResults/
Stores evaluation results for MMMU tasks.
MMMUTokenized/
Contains pre-tokenized data for MMMU tasks.
SpeechResults/
Stores evaluation results for speech tasks.
SpeechTokenized/
Contains pre-tokenized data for speech tasks.
SpeechTokenizer/
Repository or module for speech-specific tokenization logic.
SpeechGenResults/
Stores generation-based evaluation results for speech tasks.

Scripts

Evaluation Scripts

eval_mmmu.py
Evaluates MMMU tasks in a constrained setting.
Supports token-based evaluation of instruction-response tasks.
eval_mmmu_gen.py
Performs generation-based evaluation for MMMU tasks.
Focuses on free-form responses.
eval_speech.py
Evaluates speech tasks with pre-tokenized audio data in a constrained manner.
Uses prompts tailored for speech-to-text evaluation.
eval_speech_gen.py
Performs free-form generation-based evaluation for speech tasks.
Handles tasks dynamically with multiple datasets.
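
The difference between the constrained and generation-based settings can be illustrated with a minimal sketch. It is not taken from the evaluation scripts themselves. It assumes that "constrained" means choosing among a fixed set of answer options by model score, and that "generation-based" means parsing an answer out of free text. The function names `pick_constrained` and `match_free_form` and the toy scores are hypothetical.

```python
import re
from typing import Callable, Dict, Optional


def pick_constrained(score: Callable[[str], float], options: Dict[str, str]) -> str:
    """Constrained setting (assumed): return the label whose option text scores highest."""
    return max(options, key=lambda label: score(options[label]))


def match_free_form(response: str, options: Dict[str, str]) -> Optional[str]:
    """Generation-based setting (assumed): recover an option label from free text."""
    # Prefer an explicit statement such as "answer is (B)" or "Answer: B".
    m = re.search(r"(?i:answer\s*(?:is|:))\s*\(?([A-Z])\)?(?![A-Za-z])", response)
    if m and m.group(1) in options:
        return m.group(1)
    # Otherwise accept the response only if exactly one option text is mentioned.
    lowered = response.lower()
    hits = [label for label, text in options.items() if text.lower() in lowered]
    return hits[0] if len(hits) == 1 else None


# Toy data standing in for real model log-likelihoods (hypothetical values).
options = {"A": "Rome", "B": "Paris", "C": "Berlin"}
toy_scores = {"Rome": -1.5, "Paris": -0.2, "Berlin": -2.0}

print(pick_constrained(lambda text: toy_scores[text], options))
print(match_free_form("I think the answer is (B).", options))
```

In the constrained case the model can never produce an answer outside the option set, whereas the free-form matcher may return `None` when the response cannot be mapped to any option.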

Tokenization Scripts

speech_tokenization.py
Tokenizes audio files for speech tasks.
Outputs tokenized representations for use in evaluations.
image_tokenizer.py
Tokenizes image data for image-based tasks.
Supports multimodal evaluations.
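
As a rough illustration of what a "tokenized representation" can look like once it reaches a language model, the sketch below renders a list of discrete codes as text-like tokens and parses them back. The tag format (`<speech_start>`, `<speech_17>`, and so on) and the function names are assumptions made for this example, not the format used by the tokenization scripts.

```python
import re
from typing import List


def codes_to_text(codes: List[int], modality: str) -> str:
    """Wrap integer codes in modality-specific tags (hypothetical tag format)."""
    body = "".join(f"<{modality}_{code}>" for code in codes)
    return f"<{modality}_start>{body}<{modality}_end>"


def text_to_codes(text: str, modality: str) -> List[int]:
    """Recover the integer codes from a tagged string produced by codes_to_text."""
    pattern = rf"<{re.escape(modality)}_(\d+)>"
    return [int(match) for match in re.findall(pattern, text)]


encoded = codes_to_text([5, 17, 5], "speech")
print(encoded)
print(text_to_codes(encoded, "speech"))
```

Keeping the mapping reversible means the same convention can serve both directions: feeding pre-tokenized inputs to the model and decoding generated tokens back into codes.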

Shell Scripts

eval_mmmu.sh / eval_mmmu_gen.sh
Shell scripts to run MMMU evaluations.
eval_speech.sh / eval_speech_gen.sh
Shell scripts to run speech evaluations.
tokenize_image_audio.sh
Script for tokenizing both image and audio data.

Other Scripts

inference.py
General inference script for running models on various tasks.
anygpt_install.sh
Script to install dependencies and set up the environment.

Key Files

speech_tasks.json
JSON file containing configurations for speech datasets.
README.md
This file, providing an overview of the project.
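
The schema of speech_tasks.json is not documented here, so the following is only a sketch of how a per-dataset configuration file of this kind could be loaded and checked. The keys `task`, `split`, and `prompt`, the example entry, and the `load_tasks` helper are all hypothetical and should be replaced with whatever the actual file contains.

```python
import json
from typing import Dict

# Hypothetical example content; the real speech_tasks.json may look different.
EXAMPLE_CONFIG = """
{
  "example_asr": {"task": "asr", "split": "test", "prompt": "Transcribe the audio."}
}
"""

# Assumed set of required keys per dataset entry.
REQUIRED_KEYS = {"task", "split", "prompt"}


def load_tasks(raw: str) -> Dict[str, dict]:
    """Parse a JSON task config and fail early if an entry lacks a required key."""
    tasks = json.loads(raw)
    for name, cfg in tasks.items():
        missing = REQUIRED_KEYS - cfg.keys()
        if missing:
            raise ValueError(f"{name}: missing keys {sorted(missing)}")
    return tasks


for name, cfg in load_tasks(EXAMPLE_CONFIG).items():
    print(name, cfg["task"], cfg["split"])
```

Validating the file up front surfaces a malformed entry before a long evaluation run starts rather than partway through it.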

Setup

Prerequisites

Python 3.8 or higher
Required Python packages (install using the provided installation script):
```
bash anygpt_install.sh
```
