git-rust — A Git CLI clone Written in Rust 🦀

git-rust is a simplified Git implementation built from scratch in Rust. It mimics core Git functionality of basic commands, listed below. It serves as a learning project.

This project explores how Git works under the hood — from creating repositories and hashing objects, to writing Git-compliant object files and reading them back from the .git_rust/objects directory.

Current Features and flags implemented

cargo run init          
                        — Initialize a new Git-like repository. Does not re-initialize a repo.

cargo run add <file>    
                        - Add files to staging area.
                        - Index file compatible with git

cargo run hash-object -w <hash>
                        - Hash and write blob objects to .git/objects
                        - flag -w (optional) - works as expected in git

cargo run cat-file -p <hash>      
                        - Decode and print stored objects
                        - flag -p (optional) - different from git
                            Blobs:      Always prints raw bytes to stdout
                            Trees:      Without pretty print, will send raw bytes to stdout
                            Commits:    Always pretty print

cargo run ls-files
                        - Show the files in the index

cargo run ls-tree <hash>
                        - List the contents of a tree object.

cargo run write-tree
                        - Reads the index and creates tree objects.

cargo run commit-tree <hash> -p <hash> -m <message>
                        - Creates a new commit object
                        - flag -p can be used multiple times for parrent commits
                        - flag -m can be used only once

cargo run commit -a -m <message>
                        - Record changes to the repository
                        - flag -a not yet implemented. Changes need to be staged separately.
                        - flag -m can be used only once.

cargo run fetch <url> <branch> <directory> (work in progress)
                        - Download objects and refs from a repository

cargo run clone <url> <directory> (work in progress)
                        - Clone a repository.

Formatting helper

Object files (blobs, tree and commits)

All blob, tree and commit files are tested to be compatible with git. Below is meant to be as documentation. Should be matching git to the best of my research.

All git object files are preceded by a header. They are all compressed with zlib. A header contains the name of the object, followed by the size of the content (number of bytes) and null terminated.
Example: "blob [size]\0
Example: "tree [size]\0
Example: "commit [size]\0

A Blob file

The content is simply the contents of the original file. Contains no metadata.

A Tree file

The content is list of all the objects under that tree.
Each entry contains the mode, file name and the hash of that object.
The mode is terminated with a space and the filename is null terminated.
The hash is always 20 bytes.

example object 1    -> mode+b' '+file name+b'\0'+hash [u8; 20]
example object 1    -> 100644+b' '+test.txt+b'\0'+63aa9936a393155f43c2b03d42d79b1c83290f41
example output 1    -> 100644 blob 63aa9936a393155f43c2b03d42d79b1c83290f41 file.txt

Mode is specific to each object

Object	Mode
blob	100644
tree	40000
commit	160000

A Commit file

The information in the commit file is separated by new line characters and all but the message start with a specific word.
Between the committer and message, there is an extra line.
Lack of a parent SHA indicates it is the initial commit.
Multiple SHA's indicate it is a merge commit

tree <40-character SHA>\n
parent <40-character SHA>\n      (optional, can appear multiple times)
parent <40-character SHA>\n      (optional, for merge commits)
author <name> <<email>> <timestamp> <timezone>\n
committer <name> <<email>> <timestamp> <timezone>\n
\n
<commit message>

Logic of cargo run commit:

1.  run write-tree                  -> get root tree and the hash of the root tree
    if the index does not exist     -> STOP
2.  read the HEAD                   -> get the branch
    if no branch file               -> initial commit. Create the branch file.
3.  read the branch file            -> get the parent commit
4.  read the tree hash from the last commit
5.  evaluate if new tree hash is the same as tree in the last commit
    if the same                     -> STOP
6.  write the tree to file
7.  use commit-tree to create the commit
8.  write commit to file
9.  update the branch file
10. update reflog
11. print summary

The INDEX file (staging area)

It uses a binary layour (raw bytes) and always big-endian format. The index has a header that is 12 bytes, a record of entries (files/blobs) added to the staging area and a checksum (SHA-1) of all the content (header + entries). The entries are not a fixed length, but a multiple of 8 bytes:

62 bytes of fixed-size metadata (ctime, mtime, mode, sha1, etc.).
A null terminated, variable-length path (the filename or relative path), which follows the metadata.
1–8 bytes of padding to ensure the total size of the entry is a multiple of 8 bytes.

Example of the Header:

Field	Size (Bytes)	Description	Example (Hex)
`dircache DIRC`	4	Always set as "DIRC"	`44 49 52 43`
`Version`	4	Version 2, 3 or 4. 2 is most common	`00 00 00 02`
`Entries`	4	Number of entries	`00 00 00 20`

Example of an entry:

Field	Size (Bytes)	Description	Example (Hex)
`ctime`	4	Created time (seconds)	`5E 2D 5A 80`
`ctime_nanos`	4	Created time (nanoseconds)	`00 00 00 00`
`mtime`	4	Modified time (seconds)	`5E 2D 5A 90`
`mtime_nanos`	4	Modified time (nanoseconds)	`00 00 00 00`
`dev`	4	Device ID (from `stat(2)`)	`00 00 00 15`
`ino`	4	Inode number	`00 00 00 01`
`mode`	4	File mode (includes type and permissions)	`00 00 81 A4`
`uid`	4	User ID	`00 00 03 E8`
`gid`	4	Group ID	`00 00 03 E8`
`file_size`	4	File size in bytes	`00 00 00 04`
`sha1`	20	SHA-1 hash of the file contents	`12 34 ... EF`
`flags`	2	Bitfield with name length, stage, and flags	`00 0A`
`path`	N (variable)	File path (UTF-8 bytes, not null-terminated)	`"main.rs"` = `6D 61 69 6E 2E 72 73`
`padding`	0–7	Null bytes to align total entry size to multiple of 8	`00 00 00` (example)

Name		Name	Last commit message	Last commit date
Latest commit History 92 Commits
src		src
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

git-rust — A Git CLI clone Written in Rust 🦀

Current Features and flags implemented

Formatting helper

Object files (blobs, tree and commits)

A Blob file

A Tree file

A Commit file

Logic of cargo run commit:

The INDEX file (staging area)

About

Uh oh!

Releases

Packages

Languages

License

someotherself/git_rust

Folders and files

Latest commit

History

Repository files navigation

git-rust — A Git CLI clone Written in Rust 🦀

Current Features and flags implemented

Formatting helper

Object files (blobs, tree and commits)

A Blob file

A Tree file

A Commit file

Logic of cargo run commit:

The INDEX file (staging area)

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages