Version control with Git and GitHub

You now have Python on your machine and a real program you can run. There's one tool sitting between "I wrote a program" and "I have a project I can save, undo, and show the world" — and almost every guide above this one quietly assumes you already use it. That tool is Git, and its most popular home is GitHub. This lesson teaches both from zero. It's a how-to with traced examples rather than a coding exercise, so there's no in-browser challenge — Git is a terminal tool, so we'll show you the exact commands and the exact output they print, and you'll follow along on your own machine. By the end you'll take the capstone program from the last lesson and turn it into a real, saved, online GitHub repository.

The problem Git solves (in everyday words)

Imagine you're writing an essay. You save essay.docx. You make changes, but you're scared to lose the old version, so you save essay_v2.docx. Then essay_final.docx. Then essay_final_FINAL.docx. Then essay_final_FINAL_v2_actually_done.docx. Now you have eight files, no idea which is newest, and no way to see what changed between any two of them. If you delete the wrong one, it's gone.

That's exactly the mess programmers were in before version control. Version control is software that takes care of all of this for you: it keeps a complete history of your project, lets you save labelled snapshots whenever you want, see precisely what changed between any two of them, and rewind to any earlier point — all inside one folder, with no _FINAL_v2 filenames ever again. Git is the version-control tool nearly the whole industry uses. (There are others, but learning one is learning the idea, and Git is the one jobs ask for.)

So Git answers three questions you'll ask constantly:

What did my project look like last Tuesday? (history)
What exactly did I change since then? (a precise diff, line by line)
Can I undo this and go back? (yes, to any saved snapshot)

The mental model: snapshots, commits, and the repo

The single most important idea in Git is the commit. A commit is a saved snapshot of your entire project at one moment, with a short message you write describing what changed (e.g. "Add letter-grade function"). Think of it as a labelled photograph of every file, frozen forever. Your project's history is just a long chain of these photographs, newest at the end.

The folder Git is tracking — your project folder, now with a hidden history attached — is called a repository, or repo for short. When you turn a normal folder into a repo, Git creates a hidden sub-folder named .git inside it. That .git folder is the history: every commit, every old version, all stored there. You almost never touch it directly; Git commands read and write it for you. Delete .git and you delete the history (the current files stay, but the time machine is gone).

The three areas: working directory → staging → commit

Here's the one piece of Git that confuses every beginner, explained slowly. When you change a file, it doesn't go straight into a commit. It passes through three areas:

The working directory — your actual files on disk, the ones you edit. When you save grades.py in your editor, the change lives here. Git sees it but hasn't recorded anything yet.
The staging area (also called the index) — a holding pen for "changes I want in my next commit." You explicitly move changes here with git add. This lets you choose exactly what goes into a snapshot — maybe you changed three files but only want to commit two of them.
The repository (committed history) — when you run git commit, everything currently in the staging area becomes a new permanent snapshot in the .git history.

Read it as a one-way pipeline: you edit (working directory) → you stage the changes you want (git add) → you snapshot them (git commit).

  working directory          staging area              repository
  (your files on disk)       ("next commit")           (.git history)
  ─────────────────          ─────────────             ───────────────
  edit grades.py   ──git add──▶  grades.py   ──git commit──▶  📸 snapshot #3
                                                               📸 snapshot #2
                                                               📸 snapshot #1

Why two steps (stage, then commit) instead of one? Because the staging area lets you craft a clean snapshot — group related changes into one meaningful commit and leave unrelated half-finished work for later. It feels like extra typing at first; it becomes second nature fast.
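
A minimal sketch of that selective staging, run in a throwaway folder. The file names (report.py, notes.txt, draft.py) and the identity values are made up for illustration, and the commands themselves are explained step by step in the next section; for now, just watch which files end up in the commit.

```shell
# Throwaway repo so nothing here touches a real project.
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"           # repo-local identity, illustration only
git config user.email "demo@example.com"

# Three new files in the working directory.
echo "print('report')" > report.py
echo "remember to add tests" > notes.txt
echo "print('half-finished')" > draft.py

# Stage only the two that belong together, then snapshot them.
git add report.py notes.txt
git commit -q -m "Add report script and notes"

# draft.py was never staged, so it is still untracked.
git status --short    # prints: ?? draft.py
```

The commit contains report.py and notes.txt only; draft.py stays in the working directory until you decide to stage it.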

The core loop: init, status, add, commit, log

Let's actually do it. We'll make a tiny project and walk every command, showing the real output. First, a one-time setup so Git knows who you are (it stamps your name and email on every commit). Run these once on your machine, ever:

git config --global user.name "Your Name"
git config --global user.email "you@example.com"
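
To check what Git has stored, you can read a setting back by naming it without a value. The sketch below uses repo-local settings (no --global) in a throwaway folder, so running it cannot overwrite your real identity; the name and email are placeholders, and git init is explained just below.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q

# Without --global, a setting applies to this one repo only.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Read the values back; each command prints the stored value.
git config user.name     # prints: Demo User
git config user.email    # prints: demo@example.com
```

On your own machine, git config --global user.name (with nothing after it) prints the global name you set above.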

Now create a project folder and turn it into a repo. git init is the command that creates the .git history folder, turning a plain folder into a Git repository:

mkdir hello-git
cd hello-git
git init

Git replies:

Initialized empty Git repository in /Users/you/hello-git/.git/
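
If you want to see the hidden folder for yourself, list the directory including dotfiles. A small sketch in a throwaway folder (the path comes from mktemp, so it will differ from the one shown above):

```shell
demo="$(mktemp -d)"
cd "$demo"

ls -a            # before: only . and ..
git init -q      # -q suppresses the "Initialized..." message
ls -a            # after: a .git entry has appeared

# Git can also report where the history lives, relative to the current folder.
git rev-parse --git-dir    # prints: .git
```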

That's the repo, born. Now create a file (use your editor, or this quick terminal trick), then ask Git what it sees with git status — the command you'll run more than any other. It reports the state of your three areas:

echo "print('hello git')" > hello.py
git status

On branch main

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        hello.py

nothing added to commit but untracked files present (use "git add" to track)

Read that carefully — Git is being genuinely helpful. Untracked means "this file exists in your working directory but Git isn't watching it yet." Git even tells you the next command. Let's stage the file with git add (move it into the staging area), then check status again:

git add hello.py
git status

On branch main

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   hello.py

Now hello.py is staged — sitting in the "next commit" holding pen. Take the snapshot with git commit. The -m flag (short for message) attaches your description:

git commit -m "Add hello.py"

[main (root-commit) a1b2c3d] Add hello.py
 1 file changed, 1 insertion(+)
 create mode 100644 hello.py

That a1b2c3d is the start of the commit's unique ID (a long fingerprint Git shortens for display). You just saved your first snapshot. See the history with git log:

git log

commit a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0 (HEAD -> main)
Author: Your Name <you@example.com>
Date:   Wed Jun 24 10:15:02 2026 -0700

    Add hello.py

(Press q to exit the log view if it fills the screen.) Now do it again — the everyday loop. Edit the file, and watch how status changes for a file Git is already tracking:

echo "print('goodbye git')" >> hello.py
git status

On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   hello.py

no changes added to commit (use "git add" and/or "git commit -a")

This time it says modified (not untracked) — Git knows this file and sees it changed. The loop is identical: stage, then commit.

git add hello.py
git commit -m "Add a goodbye line"

[main f9e8d7c] Add a goodbye line
 1 file changed, 1 insertion(+)

That's the entire core loop, and you'll repeat it for the rest of your life as a programmer: edit → git add → git commit -m "...", checking git status whenever you're unsure where things stand. Five commands — init, status, add, commit, log — cover the vast majority of daily Git.
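
With two commits saved, the three questions from the start of the lesson each map onto a command. The sketch below rebuilds the same two-commit history in a throwaway folder and then asks them. The identity values are placeholders, and git restore assumes Git 2.23 or newer.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Rebuild the two commits from this section.
echo "print('hello git')" > hello.py
git add hello.py
git commit -q -m "Add hello.py"
echo "print('goodbye git')" >> hello.py
git add hello.py
git commit -q -m "Add a goodbye line"

# History: one line per commit, newest first.
git log --oneline

# Diff: what changed between the previous commit (HEAD~1) and the latest (HEAD).
git diff HEAD~1 HEAD

# Undo: make an unwanted edit, then discard it and return to the last commit.
echo "print('oops')" >> hello.py
git restore hello.py
cat hello.py     # back to the two committed lines
```

In the diff, a line starting with + was added and a line starting with - was removed; here the only change is the added goodbye line.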

:::tip Write commit messages for future-you A good message says what changed and why, in the imperative ("Add empty-list guard to average", not "stuff" or "fixed it"). Six months from now, git log is the only record of why you did something. Be kind to future-you. :::
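
For changes that need more than one line of explanation, git commit accepts -m more than once: the first becomes the subject line and each later one becomes a separate paragraph of the body. A sketch, with a made-up file and message:

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "def average(xs): return sum(xs) / len(xs) if xs else 0" > stats.py
git add stats.py

# Subject says what changed; the body says why.
git commit -q \
  -m "Add empty-list guard to average" \
  -m "An empty list used to raise ZeroDivisionError. Return 0 instead."

# Show the full message of the latest commit (subject, blank line, body).
git log -1 --format=%B
```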

Connecting to GitHub: remote, push, and what GitHub adds

Everything so far lived only on your computer. Git is happy entirely offline — the whole history is in your local .git folder. So what's GitHub? GitHub is a website that hosts Git repositories online. Git is the tool; GitHub is a place to put a copy of your repo on the internet. (GitLab and Bitbucket are similar sites — same Git underneath.)

GitHub adds things a local repo can't:

A backup — your code survives even if your laptop dies.
A shareable link — send someone a URL instead of zipping files.
A portfolio — employers look at GitHub profiles; a repo is proof you built something.
Collaboration — others can copy, suggest changes, and contribute.

The copy of your repo that lives on GitHub is called a remote — a version of the repo hosted somewhere other than your machine. The default remote is conventionally named origin. Sending your local commits up to the remote is called a push; pulling new commits down is a pull.

Here's the full path from local repo to GitHub, traced. First, on the GitHub website, create a new empty repository (click New, give it a name like hello-git, and — important — do not let it add a README or .gitignore for you yet, so it stays empty). GitHub then shows you a URL like https://github.com/yourname/hello-git.git. Back in your terminal, in your project folder:

git remote add origin https://github.com/yourname/hello-git.git
git branch -M main
git push -u origin main

Line by line:

git remote add origin <url> tells your local repo "there's a remote, call it origin, it lives at this URL." Nothing is sent yet — you've just saved the address.
git branch -M main renames your current branch to main. A branch is a line of development, and you've been on the default one all along. Older Git versions call that default branch master, so this command makes sure the name is main either way. For now, just know that main is the standard name for the primary branch.
git push -u origin main uploads your commits to origin. The -u (short for --set-upstream) links your local main to origin/main, so future pushes and pulls on this branch are just git push and git pull.

The push prints something like:

Enumerating objects: 6, done.
Counting objects: 100% (6/6), done.
Writing objects: 100% (6/6), 512 bytes | 512.00 KiB/s, done.
Total 6 (delta 0), reused 0 (delta 0)
To https://github.com/yourname/hello-git.git
 * [new branch]      main -> main
branch 'main' set up to track 'origin/main'.

Refresh the GitHub page and your files are there, with your commit history. From now on the loop gains one final step: edit → git add → git commit -m "..." → git push. Commit often locally; push when you want the online copy updated.
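
To rehearse the push without a GitHub account, a bare repository on your own disk can stand in for the remote. This is a sketch under that assumption: the remote.git path is a local folder, not a GitHub URL, and the identity values are placeholders.

```shell
work="$(mktemp -d)"

# A bare repo holds history but no working files: the usual form for a
# repo that only receives pushes.
git init -q --bare "$work/remote.git"

# An ordinary local repo with one commit.
git init -q "$work/hello-git"
cd "$work/hello-git"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "print('hello git')" > hello.py
git add hello.py
git commit -q -m "Add hello.py"

# The same three commands as above, pointed at the local stand-in.
git remote add origin "$work/remote.git"
git branch -M main
git push -q -u origin main

git remote -v                   # lists origin with its fetch and push addresses
git status --short --branch     # prints: ## main...origin/main
```

The last line confirms that local main tracks origin/main and that the two are in sync. With a real GitHub URL in place of the local path, the commands are the same.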

:::note GitHub will ask you to prove it's you The first push asks you to authenticate. GitHub no longer accepts your account password for Git operations over HTTPS, so typing it at the prompt will fail. The simplest beginner path is a personal access token (PAT): on GitHub, go to Settings → Developer settings → Personal access tokens, generate one, and paste it when the terminal asks for a password. Or install the official GitHub CLI (gh) and run gh auth login, which handles it for you. This is a one-time setup per machine. :::

.gitignore: keep junk and secrets out of your repo

Some files should never go into a commit. Two big categories:

Generated / bulky junk — your .venv folder (hundreds of installed-package files you can recreate anytime with pip install -r requirements.txt), Python's __pycache__ cache folders, build output. Committing these bloats the repo with stuff anyone can regenerate.
Secrets — API keys, passwords, tokens, a .env file holding credentials. This is the serious one. Anything you commit and push to a public GitHub repo is visible to the entire internet — and even if you delete it later, it stays in the history forever. Pushing a secret key is one of the most common and most damaging beginner mistakes.

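The fix is a file named .gitignore: a plain text file, one pattern per line, listing things Git should pretend it can't see. Untracked files that match a pattern drop out of git status and are skipped by git add (including git add .), so you can't stage them by accident. Two limits to know: git add -f overrides the rules, and .gitignore has no effect on a file Git is already tracking. A solid starter .gitignore for a Python project: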

# Virtual environment — recreate it with pip, never commit it
.venv/

# Python bytecode cache
__pycache__/
*.pyc

# Secrets — API keys, passwords, tokens. NEVER commit these.
.env
secrets.json

Put this file in your project folder before your first git add, and Git will skip those paths automatically. You can confirm it's working: with a .venv present and ignored, git status simply won't list it as untracked. (Remember the secrets idea from earlier guides — .gitignore is the mechanism that actually enforces "don't commit your keys.")
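
Two commands can confirm the ignore rules: git check-ignore -v reports which pattern matches a path, and git status --short shows what is left. A sketch in a throwaway folder, with stand-in files for the venv and the secrets (the file contents are placeholders):

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q

# A trimmed version of the starter .gitignore above.
printf '%s\n' '.venv/' '__pycache__/' '*.pyc' '.env' 'secrets.json' > .gitignore

# Stand-ins for real project content.
mkdir .venv
echo "placeholder" > .venv/pyvenv.cfg
echo "API_KEY=not-a-real-key" > .env
echo "print('hi')" > app.py

# Which rule ignores .env? Prints the source file, line number, pattern, and path.
git check-ignore -v .env

# Only the files Git is willing to track show up as untracked.
git status --short
```

Here git status --short lists .gitignore and app.py; .venv and .env do not appear.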

:::warning If you've already committed a secret, rotate it Removing a leaked key from your files and committing the removal is not enough — it's still sitting in the history, readable by anyone with the repo. The only safe response is to revoke/rotate the key at the provider (generate a new one, disable the old) and treat the old one as compromised. Prevention via .gitignore is far easier than cleanup. :::
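
To see why deleting the file is not enough, the sketch below commits a fake key, removes it in a second commit, and then reads it straight back out of the earlier snapshot. The key is a made-up string and the identity values are placeholders.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Mistake: commit a file holding a (fake) key.
echo '{"api_key": "not-a-real-key-123"}' > secrets.json
git add secrets.json
git commit -q -m "Add config"

# Attempted cleanup: delete the file and commit the removal.
git rm -q secrets.json
git commit -q -m "Remove secrets.json"

ls                               # the file is gone from the working directory
git show HEAD~1:secrets.json     # but the previous commit still has it
```

Anyone who has a copy of the repo can run that last command, which is why the key itself has to be revoked.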

Walkthrough: turn your capstone into a GitHub repo

Let's do the real thing end to end with the grade-report tool from the capstone lesson. Assume you've made a folder with grades.py and grades.json and it runs. Here's every command, traced.

1. Move into the project and initialize Git:

cd grade-report          # the folder holding grades.py and grades.json
git init

Initialized empty Git repository in /Users/you/grade-report/.git/

2. Add a .gitignore first (so the venv and any secrets are excluded from the very first commit). Create .gitignore with the Python contents shown above.

3. See what Git will track:

git status

On branch main

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        .gitignore
        grades.json
        grades.py

nothing added to commit but untracked files present (use "git add" to track)

Notice .venv/ is not in that list — .gitignore is doing its job. Exactly the files you want, nothing you don't.

4. Stage everything and commit:

git add .
git commit -m "Add grade-report CLI tool"

[main (root-commit) 7c3f1a9] Add grade-report CLI tool
 3 files changed, 41 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 grades.json
 create mode 100644 grades.py

git add . stages everything in the current folder (the . means "here") that isn't ignored — a common shorthand once you trust your .gitignore.
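
Until you do trust it, git add has a --dry-run flag (short form -n) that lists what would be staged without staging anything. A sketch with stand-in files; the names mirror the walkthrough, but the contents are placeholders.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
printf '%s\n' '.venv/' '.env' > .gitignore
mkdir .venv
echo "placeholder" > .venv/pyvenv.cfg
echo "print('grades')" > grades.py
echo "{}" > grades.json

# Preview only: prints one "add '<path>'" line per file and changes nothing.
git add --dry-run .

# Prints nothing: the staging area is still empty afterwards.
git ls-files
```

If the preview lists something that should not be committed, add a pattern for it to .gitignore and run the preview again.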

5. Create an empty repo on GitHub named grade-report (no auto README), copy its URL, then connect and push:

git remote add origin https://github.com/yourname/grade-report.git
git branch -M main
git push -u origin main

...
To https://github.com/yourname/grade-report.git
 * [new branch]      main -> main
branch 'main' set up to track 'origin/main'.

Your capstone is now a live GitHub repository with a clean history and a shareable URL: a real portfolio piece.

6. From here, every improvement is the loop: edit grades.py, then git add grades.py, git commit -m "Sort report by average", git push. That's the rhythm of building software for real.

Why it matters

Version control is not optional in professional software — it's the floor. Every team uses Git; every job posting assumes it; every project in the guides above this one starts with a repo and ends with a push. Beyond jobs, it's the thing that lets you experiment fearlessly: try a wild change, and if it breaks, rewind to the last good commit. And GitHub turns your work into something you can show — the difference between "I learned to code" and "here are three working projects, look at the commit history." When the final lesson says put your projects on GitHub, this is the skill it means. You can do it now.

Common pitfalls

Calmly, the mistakes nearly every beginner hits — and exactly what to do:

Committing a secret. You git add . with an API key in secrets.json and push it. Fix: add it to .gitignore before the first commit; if it's already pushed, rotate the key at the provider (see the warning above) — deleting the file isn't enough.
Forgetting to git add. You commit, but a changed file isn't in the snapshot because you never staged it. git commit only captures what's in the staging area. Habit: run git status right before committing and check that the "Changes to be committed" list is what you expect.
"detached HEAD" panic. Running git checkout <some-commit-id> (jumping to an old snapshot) puts you in a detached HEAD state, and Git prints a scary paragraph. Don't panic: it just means "you're looking at an old snapshot, not a branch." Nothing is broken, and none of your saved commits are lost. To get back to normal, type git switch main (or git checkout main). One caution: commits you make while detached belong to no branch, so if you want to keep them, create a branch first with git switch -c <new-name>.
Merge conflicts. When two changes touch the same lines of a file (e.g. you edited a line on your laptop and the GitHub copy has a different edit on that line), a git pull can't decide which to keep, so it pauses with a merge conflict. Git marks the spot in the file like this:
```
<<<<<<< HEAD
your version of the line
=======
the other version of the line
>>>>>>> origin/main
```
This is not an error you broke — it's Git politely asking you to choose. Open the file, delete the <<<<<<<, =======, >>>>>>> marker lines, edit the text to the version you actually want, save, then git add the file and git commit. As a solo beginner you'll rarely hit these; when you do, they're a small editing task, not a catastrophe.
Forgetting to push. Commits live only on your machine until you push. If your "backup" matters, end your session with git push.
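
The merge conflict described above can be reproduced safely on one machine. This sketch assumes a local bare repository standing in for GitHub and two working copies, here called laptop and desktop, standing in for two places where the same line was edited. All names and file contents are made up. The --no-rebase flag asks git pull to merge, which is the behavior described above.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"

# First copy ("laptop"): create the file and push it.
git init -q "$work/laptop"
cd "$work/laptop"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "greeting = 'hello'" > app.py
git add app.py
git commit -q -m "Add greeting"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# Second copy ("desktop"): change the same line and push.
git clone -q -b main "$work/remote.git" "$work/desktop"
cd "$work/desktop"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "greeting = 'hi there'" > app.py
git commit -q -am "Use a friendlier greeting"
git push -q origin main

# Back on the laptop: a different edit to that line, then pull.
cd "$work/laptop"
echo "greeting = 'good morning'" > app.py
git commit -q -am "Use a time-of-day greeting"
git pull --no-rebase || echo "pull stopped: conflict in app.py"

cat app.py    # shows the <<<<<<<, =======, >>>>>>> markers

# Resolve: write the version you want, stage it, and commit.
echo "greeting = 'good morning, hi there'" > app.py
git add app.py
git commit -q -m "Merge remote greeting change"
git status --short    # prints nothing: the conflict is resolved
```

Overwriting the file with echo stands in for opening it in your editor and removing the marker lines by hand.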
