Getting Started with Git

Git is a version control system that was created by Linus Torvalds in 2005 after the Linux kernel team had a falling out with the makers of [BitKeeper](http://www.bitkeeper.com).

Git was designed to be:

  1. Fast
  2. Distributed
  3. Supportive of non-linear workflows
  4. Compatible with existing protocols
  5. Free

Unlike centralized version control systems, e.g. Subversion or Team Foundation Server, every Git working directory is a full repository with the complete history and full version tracking capabilities, independent of network access or a central repository server.

The true magic of Git is in the way it stores its data. Instead of tracking the contents of the repository as a file system, Git maintains its repository in two data structures: a mutable index that caches information about the current working directory, and an immutable, append-only object database.

The object database contains four types of objects.

  1. Blob (binary large object; or the contents of a file)
  2. Tree (a directory)
  3. Commit object (links trees together)
  4. Tag object (a container reference that points to other objects)

Each object is identified by the SHA-1 hash of its contents (prefixed with a short header), not by a file name. Git calculates the hash and uses this value as the object's name. The object is then placed into a directory named after the first two characters of its hash, and the remainder of the hash is used as the file name of the object.
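The naming scheme described above can be sketched in a few lines of Python. This is only an illustration, assuming the default SHA-1 object format and a loose (unpacked) object; the function names git_blob_id and loose_object_path are my own and are not part of Git.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object name Git would assign to a blob with this content."""
    # Git hashes a short header ("blob <size>" plus a NUL byte) followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Return where a loose object with this name lives, relative to the repo root."""
    # The first two hex characters name the directory, the rest name the file.
    return ".git/objects/" + object_id[:2] + "/" + object_id[2:]


if __name__ == "__main__":
    oid = git_blob_id(b"hello world\n")
    print(oid)
    print(loose_object_path(oid))
```

To cross-check the result against a real installation, pipe the same bytes into git hash-object --stdin and compare the two values.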

While there are many graphical user interfaces for Git, its most powerful features are found in its command line interface.

Let's get started with a quick Git primer.

If you do not already have Git installed, install it from your Linux terminal (the command below is for Debian or Ubuntu).

sudo apt-get install git  

You can also clone the Git source code repository and build Git from source.

git clone https://github.com/git/git.git

Setting Up Git

There are a few configuration commands we need to use in order to identify ourselves to the Git system. We need to let Git know who we are and what our email address will be when checking in files and making commits.

$ git config --global user.name "Craig Derington"
$ git config --global user.email "craig@craigderington.me"

Next, I like to set up a few shortcuts so I don't have to re-type commit or branch every time.

$ git config --global alias.st status
$ git config --global alias.br branch
$ git config --global alias.co checkout
$ git config --global alias.cm commit

To see all the items in your Git config:

$ git config --list
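If you want to inspect that listing from a script, the key=value lines are easy to parse. The sketch below is a rough illustration in Python; SAMPLE_OUTPUT is an assumed listing reconstructed from the commands above (a real listing will contain more entries), and the helper names parse_config_list and aliases are hypothetical.

```python
# Assumed output of `git config --list`, based only on the settings made above.
SAMPLE_OUTPUT = """\
user.name=Craig Derington
user.email=craig@craigderington.me
alias.st=status
alias.br=branch
alias.co=checkout
alias.cm=commit
"""


def parse_config_list(output: str) -> dict:
    """Turn key=value lines into a dict; if a key repeats, the last value wins."""
    settings = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # Split on the first "=" only, since values may contain "=" themselves.
        key, _, value = line.partition("=")
        settings[key] = value
    return settings


def aliases(settings: dict) -> dict:
    """Return only the alias entries, keyed by the short name typed after `git`."""
    prefix = "alias."
    return {key[len(prefix):]: value
            for key, value in settings.items()
            if key.startswith(prefix)}


if __name__ == "__main__":
    for short, full in sorted(aliases(parse_config_list(SAMPLE_OUTPUT)).items()):
        print("git " + short + " -> git " + full)
```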

Now that we have a working Git config, we are ready to start using Git and creating our repos.

Git Workflow

There are several Git workflow types that are in general use by the community of Git users.

  1. Centralized Workflow
  2. Feature Branch Workflow
  3. Gitflow Workflow

Since I mostly work on small development teams, my workflow is a modified version of the types above.

Below, I document my workflow when using Git.

On GitHub.com, I create a new repository. I call my repository beacon. The repository lives at:

https://github.com/craigderington/beacon.git

I initialized my repo with a README.md, a .gitignore and a GNU LICENSE.

On my local system, I open a terminal window in my ~/sites directory and run the following command.

~/sites$ git clone https://github.com/craigderington/beacon.git

Next, I change directory into the new beacon directory...

~/sites$ cd beacon
~/sites/beacon$ 

I like to check to make sure my remotes are correct.

~/sites/beacon$ git remote -v
origin https://github.com/craigderington/beacon.git (fetch)  
origin https://github.com/craigderington/beacon.git (push)  

My remote is named origin (the default name used by Git), so I am ready to begin scaffolding out my project and creating my virtual environment. First, I run one more command to make sure my local master branch tracks origin/master. A fresh clone normally configures this already, so the command is harmless but often redundant.

~/sites/beacon$ git branch --set-upstream-to=origin/master

This allows me to pull from and push to my remote Git repository using only git pull and git push, instead of the more verbose git pull origin master and git push origin master.

OK, I'm ready to begin creating my project directory structure and adding files.

~/sites/beacon$ virtualenv .env --python=python2.7
Running virtualenv with interpreter /usr/bin/python2.7  
New python executable in .env/bin/python  
Also creating executable in .env/bin/python  
Installing setuptools, pip, wheel..done.  
~/sites/beacon$ . .env/bin/activate
(.env)~/sites/beacon$ pip install django mysql-python django-bootstrap3 django-suit django-crispy-forms

I now have the necessary Python modules installed in my activated virtual environment. Next, I use django-admin to start a new project and a new app.

(.env)~/sites/beacon$ django-admin startproject beacon
(.env)~/sites/beacon$ django-admin startapp radios

If I were to look at my directory tree, I would see...

beacon  
| - beacon
|   - __init__.py
|   - settings.py
|   - wsgi.py
|   - urls.py
|   - views.py
|
| - radios
|   - migrations
|     - __init__.py
|   - models.py
|   - views.py
|   - admin.py
|   - forms.py
|   - tests.py
|   - apps.py

This is looking very good. I have my new project created and the directory tree is ready to be committed into my repository. But first, I need to see what objects Git is currently tracking. Remember, we created an alias for status and use git st instead of git status.

(.env)~/sites/beacon$ git st
On branch master  
Your branch is up-to-date with 'origin/master'

Untracked files:  
(use 'git add <file>...' to include in what will be committed)

     .env/
     beacon/
     radios/

I want to include my project files but not my virtual environment, which I will add to my .gitignore file in just a minute. But first, let's add the beacon/ and radios/ directories and their file contents.

~/sites/beacon$ git add beacon/
~/sites/beacon$ git add radios/

Now, if I run a git status, I will see all of the file objects that Git is now tracking for my project.

Let's take a minute and add our .env virtualenv to the .gitignore file.

~/sites/beacon$ nano .gitignore

Near the top of the file, add an entry for the virtual environment directory to be excluded from the Git repository.

# Ignore the .env folder
.env/

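Save the file and exit. The change to .gitignore is not staged yet, so it will not be included in the commit below unless you also run git add .gitignore.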
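To get a feel for what an entry like .env/ does, here is a deliberately simplified Python sketch. It only handles patterns of that one shape (a single directory name followed by a slash) and ignores wildcards, negation and anchoring, so treat it as an illustration of the idea rather than a model of Git's real matching rules. The function names are hypothetical.

```python
def matches_directory_pattern(path: str, pattern: str) -> bool:
    """Check a repo-relative path against a simple pattern such as '.env/'."""
    name = pattern.rstrip("/")
    if not pattern.endswith("/") or "/" in name or not name:
        raise ValueError("this sketch only handles patterns like 'name/'")
    # Every component except the last one is a directory leading to the entry.
    directories = path.split("/")[:-1]
    return name in directories


def is_ignored(path: str, gitignore_lines: list) -> bool:
    """Return True if any directory pattern in the given lines covers the path."""
    for line in gitignore_lines:
        entry = line.strip()
        # Blank lines and lines starting with '#' carry no pattern.
        if not entry or entry.startswith("#"):
            continue
        if matches_directory_pattern(path, entry):
            return True
    return False


if __name__ == "__main__":
    lines = ["# Ignore the .env folder", ".env/"]
    for candidate in [".env/bin/activate", "beacon/settings.py", "radios/models.py"]:
        print(candidate, is_ignored(candidate, lines))
```

On a real repository, git check-ignore -v followed by a path reports which pattern, if any, is responsible for ignoring it.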

We are now ready for our first commit and push.

~/sites/beacon$ git cm -m "Initial project dump"
[master 50e526d] Initial project dump
12 files changed, 189 insertions(+)  
~/sites/beacon$ git push
Counting objects 17, done.  
Delta compression using up to 4 threads.  
Compressing objects: 100% (14/14), done.  
Writing objects:  100% (16/16) 3.25KiB | 0 bytes/s, done.  
Total 16 (delta 0), reused 0 (delta 0)  
To https://github.com/craigderington/beacon.git  
    679388a..50e526d master -> master

I never work in my master branch after the initial project scaffold. Instead, I make feature branching my default workflow.

(.env)~/sites/beacon$ git co -b development

This command creates a new branch and checks out the new branch development in one step. This is essentially the same as entering two different commands, like this...

(.env)~/sites/beacon$ git branch development
(.env)~/sites/beacon$ git checkout development
Switched to branch development  
(.env)~/sites/beacon$ git br
* development
  master

Once I have completed my work on the development branch, I am ready to merge these changes back into the master branch. I can simply use:

(.env)~/sites/beacon$ git checkout master
(.env)~/sites/beacon$ git merge development --no-ff

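The --no-ff argument tells Git to always create a merge commit, even when the merge could be resolved as a simple fast-forward. Without it, Git would just move the master pointer forward to the tip of development when master has no new commits of its own. With it, the history keeps a record that the work was done on a separate branch and merged in.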
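The difference is easier to see on a toy commit graph. The following Python sketch is not how Git is implemented; it is a minimal model, assuming commits are just string ids with parent links and branches are names pointing at a commit. The Repo class and the "merge-" id scheme are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Repo:
    parents: Dict[str, Tuple[str, ...]] = field(default_factory=dict)
    branches: Dict[str, str] = field(default_factory=dict)

    def commit(self, branch: str, commit_id: str) -> None:
        """Add a commit on top of the branch tip and advance the branch."""
        tip = self.branches.get(branch)
        self.parents[commit_id] = (tip,) if tip else ()
        self.branches[branch] = commit_id

    def branch(self, name: str, start: str) -> None:
        """Create a branch pointing at the same commit as an existing branch."""
        self.branches[name] = self.branches[start]

    def is_ancestor(self, ancestor: str, descendant: str) -> bool:
        """Walk parent links from descendant and look for ancestor."""
        stack, seen = [descendant], set()
        while stack:
            current = stack.pop()
            if current == ancestor:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents[current])
        return False

    def merge(self, target: str, source: str, no_ff: bool = False) -> str:
        target_tip, source_tip = self.branches[target], self.branches[source]
        if self.is_ancestor(source_tip, target_tip):
            return "already up to date"
        if not no_ff and self.is_ancestor(target_tip, source_tip):
            # Fast-forward: no new commit, the branch pointer simply moves.
            self.branches[target] = source_tip
            return "fast-forward"
        # Otherwise record a merge commit with two parents (hypothetical id).
        merge_id = "merge-" + source
        self.parents[merge_id] = (target_tip, source_tip)
        self.branches[target] = merge_id
        return "merge commit"


def example_repo() -> Repo:
    """One commit on master, then two more on development."""
    repo = Repo()
    repo.commit("master", "A")
    repo.branch("development", "master")
    repo.commit("development", "B")
    repo.commit("development", "C")
    return repo


if __name__ == "__main__":
    for flag in (False, True):
        repo = example_repo()
        result = repo.merge("master", "development", no_ff=flag)
        print(flag, result, repo.branches["master"], repo.parents[repo.branches["master"]])
```

In the fast-forward case master ends up pointing at the last development commit. With no_ff=True, master points at a new commit whose two parents are the old master tip and the development tip.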

And that is getting started with Git.

Craig Derington

Veteran full stack web dev focused on deploying high-performance, responsive, modern web applications using Python, NodeJS, Django, Flask, MongoDB and MySQL.