Git History & Key Concepts
Section Objectives
- Understand Git's origin and why it was created
- Master the fundamental concepts: repository, commit, snapshot
- Understand the Git object model
- Distinguish the 3 areas of Git (working tree, staging, repository)
The Origin of Git
Linus Torvalds and the Linux Kernel
In 2005, Linus Torvalds (creator of Linux) found himself in a crisis:
- The Linux kernel had thousands of contributors worldwide
- The free license for the tool they used (BitKeeper) was withdrawn
- They needed a new tool immediately — and it had to be better
Torvalds wrote the first working version of Git in about 10 days. His requirements:
| Requirement | Explanation |
|---|---|
| Speed | Must be fast even on large projects |
| Distributed | No central server needed |
| Non-linear | Strong support for branching/merging |
| Integrity | Data corruption must be detectable |
| Free and open | Never depend on proprietary tools |
Torvalds chose the name "git", British slang for an unpleasant or stupid person. He joked: "I'm an egotistical bastard, and I name all my projects after myself. First 'Linux', now 'git'."
Git's Timeline
The 3 Areas of Git
This is the most important concept to understand in Git. Every change you make passes through these 3 areas on its way into history:
| Area | Also Called | Description |
|---|---|---|
| Working Tree | Working Directory | Your files on disk as you edit them |
| Staging Area | Index, Cache | Files marked "ready to commit" |
| Repository | .git directory | Permanent history of all commits |
Practical Analogy
Think of it like preparing a package to ship:
- Working Tree = Your desk where you're working on items
- Staging Area = The box where you're organizing what to ship
- Repository = The warehouse where all shipped packages are stored
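
The flow between the three areas can be sketched with a toy in-memory model. This is only an illustration of the idea, not how Git is implemented: the names `working_tree`, `index`, `repository`, `add` and `commit` are hypothetical stand-ins chosen for this example.

```python
# Toy in-memory model of Git's three areas (illustration only).
working_tree = {"README.md": "v1", "main.py": "print('hi')"}  # files on disk
index = {}        # staging area: path -> content chosen for the next commit
repository = []   # history: list of committed snapshots


def add(path):
    """Copy the current version of one file from the working tree to the index."""
    index[path] = working_tree[path]


def commit(message):
    """Record everything in the index as a new snapshot in the repository."""
    snapshot = {"message": message, "files": dict(index)}
    repository.append(snapshot)
    return snapshot


add("README.md")                    # stage only README.md
working_tree["README.md"] = "v2"    # keep editing after staging
first = commit("Add README")

print(first["files"])   # {'README.md': 'v1'}
```

The commit contains the version that was staged, not the later edit and not the unstaged `main.py`. That is the practical reason the staging area exists: you decide exactly what goes into each snapshot.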
Key Concepts
What is a Repository?
A repository (or "repo") is a folder that Git tracks. It contains:
- All your project files
- A hidden .git folder with all version history
- Configuration, branches, tags, etc.
my-project/
├── .git/ ← Git's "brain" (never touch manually!)
│ ├── HEAD ← Points to current branch
│ ├── config ← Repository configuration
│ ├── objects/ ← All Git objects (blobs, trees, commits)
│ └── refs/ ← Branch and tag references
├── src/
│ └── main.py
├── README.md
└── .gitignore
What is a Commit?
A commit is a snapshot of your project at a specific moment. Each commit contains:
| Field | Description | Example |
|---|---|---|
| Hash (SHA-1) | Unique identifier | a3f4b2c |
| Author | Who made the commit | Alice <alice@example.com> |
| Date | When it was committed | 2026-03-18 14:30 |
| Message | Description of changes | Add user login feature |
| Parent | Reference to previous commit | f1d3e8a |
| Tree | Snapshot of files | (all files at that moment) |
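
To see how the hash relates to the other fields, here is a minimal sketch that lays out a commit as text and hashes it the way Git names its objects (SHA-1 over a short header plus the content). The helper names `build_commit_body` and `commit_hash` are made up for this example, and the author, timestamp and message are hypothetical values.

```python
import hashlib
from typing import Optional


def build_commit_body(tree: str, parent: Optional[str], author: str,
                      when: str, message: str) -> bytes:
    """Lay out the commit fields as text, in the order Git stores them."""
    lines = [f"tree {tree}"]
    if parent is not None:          # the very first commit has no parent
        lines.append(f"parent {parent}")
    lines.append(f"author {author} {when}")
    lines.append(f"committer {author} {when}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


def commit_hash(body: bytes) -> str:
    """SHA-1 over a 'commit <size>' header, a NUL byte, then the body."""
    header = f"commit {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical values for illustration only.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # id of a tree with no files
body = build_commit_body(
    tree=EMPTY_TREE,
    parent=None,
    author="Alice <alice@example.com>",
    when="1773844200 +0000",   # seconds since the Unix epoch, then UTC offset
    message="Add user login feature",
)
full_hash = commit_hash(body)
print(full_hash[:7])   # abbreviated form, like the example in the table
```

Because the hash is computed from the tree, parent, author, date and message together, changing any one of them produces a different commit hash.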
Git's Snapshot Model
Unlike older VCS (like SVN) that store differences (deltas), Git records every commit as a complete snapshot of the project.
That does not mean Git stores the full content of every file every time. If a file hasn't changed, the new snapshot just stores a reference to the identical file from the previous commit. This makes Git both accurate and storage-efficient.
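
The reuse of unchanged files can be sketched with a small content-addressed store. This is a simplified model under stated assumptions: `object_store`, `store_blob` and `snapshot` are hypothetical names, and a plain dictionary stands in for Git's object database.

```python
import hashlib

object_store = {}  # object id -> file content (stands in for .git/objects)


def store_blob(content: bytes) -> str:
    """Save content under its hash; identical content is stored only once."""
    oid = hashlib.sha1(b"blob %d\x00" % len(content) + content).hexdigest()
    object_store[oid] = content
    return oid


def snapshot(files: dict) -> dict:
    """Map every path to the id of its content: a complete picture of the project."""
    return {path: store_blob(data) for path, data in files.items()}


v1 = snapshot({"README.md": b"# Demo\n", "src/main.py": b"print('v1')\n"})
v2 = snapshot({"README.md": b"# Demo\n", "src/main.py": b"print('v2')\n"})

print(v1["README.md"] == v2["README.md"])  # True: unchanged file, same blob
print(len(object_store))                   # 3: four file entries, three stored blobs
```

Both snapshots list every file, so each one is complete on its own, yet the unchanged `README.md` occupies storage only once.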
Git's Object Types
Git stores everything as 4 types of objects (all identified by SHA-1 hash):
| Object | Description | Example |
|---|---|---|
| Blob | Content of a file | The bytes of main.py |
| Tree | A directory listing | Links to blobs and sub-trees |
| Commit | A snapshot with metadata | Tree, author, message, parent |
| Tag | An annotated tag: a named reference to another object (usually a commit) with its own message | v1.0.0 → commit a3f4b2c |
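
The way these objects link together can be sketched in a few lines. The example below computes a blob id and then builds a one-entry tree that points at it. The helper names `object_id` and `tree_entry` and the file name `hello.txt` are assumptions for this illustration, and the sketch assumes Git's default SHA-1 object format.

```python
import hashlib


def object_id(obj_type: str, body: bytes) -> str:
    """Hash a '<type> <size>' header, a NUL byte, then the object's content."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_entry(mode: str, name: str, oid: str) -> bytes:
    """One tree row: '<mode> <name>', a NUL byte, then the 20 raw bytes of the id."""
    return f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(oid)


# A blob holds only file content; the file name lives in the tree.
blob_id = object_id("blob", b"hello world\n")

# A tree lists names and modes and points at blobs (or other trees) by id.
tree_id = object_id("tree", tree_entry("100644", "hello.txt", blob_id))

print(blob_id)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(tree_id)
```

Since a tree's id depends on the ids of everything it lists, and a commit's id depends on its tree, any change to a file's content changes the ids all the way up. This is what makes corruption detectable.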
The HEAD Pointer
HEAD is a special pointer that always indicates where you are in the repository:
- In normal state: HEAD points to the current branch
- In "detached HEAD" state: HEAD points directly to a commit
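
The two states can be told apart from the text stored in the .git/HEAD file. The sketch below takes that text as a string, so it runs without a repository. The function name `describe_head` is hypothetical, and the sketch assumes 40-character SHA-1 hashes.

```python
import re


def describe_head(head_text: str) -> str:
    """Classify the contents of a HEAD file as 'on a branch' or 'detached'."""
    text = head_text.strip()
    prefix = "ref: refs/heads/"
    if text.startswith(prefix):
        # Normal state: HEAD names a branch, and the branch names a commit.
        return f"on branch {text[len(prefix):]}"
    if re.fullmatch(r"[0-9a-f]{40}", text):
        # Detached state: HEAD holds a commit hash directly.
        return f"detached at {text[:7]}"
    raise ValueError(f"unrecognized HEAD contents: {text!r}")


print(describe_head("ref: refs/heads/main\n"))
# on branch main
print(describe_head("3b18e512dba79e4c8300dd08aeb37f8e728b8dad\n"))
# detached at 3b18e51
```

In the normal state a new commit moves the branch forward and HEAD follows it. In the detached state there is no branch to move, which is why commits made there are easy to lose track of.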
Summary
| Concept | Definition | Analogy |
|---|---|---|
| Repository | Git-tracked folder | Photo album |
| Commit | Snapshot + metadata | One photo in the album |
| Working Tree | Your files as they are now | Paper on your desk |
| Staging Area | Files ready to commit | Box ready to ship |
| HEAD | Where you currently are | Bookmark in the album |
| Hash (SHA-1) | Unique commit identifier | Photo serial number |