Git 101

Omkar Ubale

Jul 7, 2021 · 8 min read

Updated: Jul 8, 2021

If you've worked with software development teams, you're bound to come across the words Git, GitHub, commit, branch, and abbreviations of these words, but have you wondered what they mean? Today we are going to be exploring the version management tools that are used to build software you use on a day-to-day basis. To know why version systems like Git are important, check out my previous article here.

What is Git?

Git workflow diagram from Atlassian

Git is a version management tool used to record changes to files, and it works best with text-based files. Software development teams use such tools to manage their codebases, but anyone who wants to track the changes they make to files over time can use them. Git is especially powerful in teams because it was built with collaboration in mind. There are other tools like Git, such as Mercurial and Subversion, but the principle of version management remains the same: keep a history of how each file changed instead of saving a new copy by hand every time.

Wait, then what is GitHub?

GitHub is a cloud-based hosting provider for the version control system Git. In other words, it is a collection of servers where you and your team can store the versions of your code that you have been managing with Git. Other providers, such as Bitbucket and SourceForge, offer similar features. One of the biggest advantages of using such a platform is that your code lives in more than one place, so you don't lose it if you lose access to your local systems. Almost all the platforms mentioned here are free to begin with, but we will stick with Git and GitHub for the scope of this article.

So how does this Git thingy work?

Glad you asked! In a real-world scenario, a software project can last anywhere from a couple of years to as long as software has existed (although Git was launched in 2005, so you can imagine what a mess codebases were before that). If you had to track every change made to the codebase, keeping a hand-made copy of the code at every checkpoint would quickly become unmanageable. A better idea is to record what changed in each iteration. Every word added or deleted, every new line and space, even a change of line endings is tracked, and this is what Git gives you. (Under the hood, Git stores compressed snapshots of your files and works out the differences between them on demand, but the effect for you is the same.) This level of detail ensures developers have a record of every change ever made to the code, so finding out what changed three years ago is not as hard as it sounds.
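To make the idea of "recording what changed" concrete, here is a minimal sketch in Python. It uses the standard `difflib` module to compare two versions of a file line by line. The file contents and the `changes_between` helper are made up for illustration, and this is not how Git works internally; it only shows the kind of information a change record holds.

```python
import difflib

# Two hypothetical versions of the same text file, one string per line.
version_1 = ["Hello World", "This is my first file."]
version_2 = ["Hello World", "This is my first tracked file.", "A new line."]


def changes_between(old_lines, new_lines):
    """Return (removed, added) lines when going from old_lines to new_lines."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines)
    removed, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(old_lines[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(new_lines[j1:j2])
    return removed, added


removed, added = changes_between(version_1, version_2)
for line in removed:
    print("- " + line)
for line in added:
    print("+ " + line)
```

The unchanged "Hello World" line does not appear in the result at all: only the difference between the two versions is reported.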

How do I work with Git?

We discussed what GitHub is and what its purpose is, but to understand how the code gets there, you have to first understand the workflow for using Git. In this article, I will be explaining the workflow using TortoiseGit as it helps visualize the changes you and your team make over time. You can get started with Git here and TortoiseGit here.

Repository

You have decided to track changes in a particular folder. To start tracking with Git, you need to initialize Git in this folder. Once you initialize Git in a folder, Git converts the folder into a repository. If you already had an existing repository on GitHub, you would clone the remotely hosted repository. For the sake of this example, we will create an empty repository (empty folder).

Once you have created a repository, you will notice that a ‘.git’ folder was added. If you don’t see this folder, enable ‘Hidden items’ so you can see hidden folders in Windows Explorer. This ‘.git’ folder contains the information about your repository that Git requires to function properly, namely the commits, branches, configuration information, etc.
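The ‘.git’ folder is also how a folder is recognized as a repository. The sketch below is a simplified illustration of that idea, assuming a plain ‘.git’ directory; real Git has more rules (for example, it also honors the `GIT_DIR` environment variable). The `find_repository_root` helper and the folder names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_repository_root(start: Path) -> Optional[Path]:
    """Walk upwards from `start` until a folder containing '.git' is found."""
    current = start.resolve()
    for folder in [current, *current.parents]:
        if (folder / ".git").is_dir():
            return folder
    return None


# Demo on a throwaway folder: fake a repository by creating '.git' by hand.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp).resolve() / "project"
    (project / ".git").mkdir(parents=True)
    nested = project / "src" / "notes"
    nested.mkdir(parents=True)
    found_root = find_repository_root(nested)
    found_from_nested = found_root == project
    print("repository root found from a nested folder:", found_from_nested)
```

This is why Git operations work from any subfolder of your project: the repository is the nearest parent folder that holds ‘.git’.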

Branch and Working Tree

There are three states your code can be in at any given time:

Remote Branch: The code is in the cloud and accessible to everyone on your team.
Local Branch: The code is on your computer, but Git knows about it and has tracked it.
Working Tree: The code is on your computer, but Git doesn't know about it yet. This is where your changes are while you are making them.

The flow of code is usually Working Tree → Local Branch → Remote Branch.

Commit

Once you have created your repository, you can start adding files. Say you decide to add a .txt file with "Hello World" as its contents. Once you have completed your changes, you want to mark the file creation and addition of contents as a change. To do this, you create a commit, or commit your changes. You have to describe each commit so it is easier to remember what the change was when you come back to it later. By creating a commit, you have created a checkpoint in your work with the help of Git.

When you commit, you are telling Git to record the changes you made. This also means that the change you made just moved from the Working Tree to your Local Branch.

To commit your changes, you need to stage your changes for the commit. TortoiseGit does this behind the scenes when you click on the commit option, but you should know about this to work with Git via other platforms such as Visual Studio or VS Code. Once your changes are included in the commit (check the boxes for which changes you want to include in this commit), you can click the commit button to commit the changes. Make sure your commit message is descriptive enough so that it is informative to your team and future self.
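For readers who end up outside TortoiseGit, the two steps it performs look roughly like this on the command line. The sketch builds the `git add` and `git commit` commands as lists and only prints them; the `run_all` helper would execute them, assuming Git is installed and the folder is already a repository. The function names, file name, and commit message are made up for illustration.

```python
import subprocess


def commit_commands(paths, message):
    """Build the Git commands that stage `paths` and then commit them."""
    return [
        ["git", "add", "--", *paths],      # stage the chosen files
        ["git", "commit", "-m", message],  # record the staged changes
    ]


def run_all(commands, repository_folder):
    """Run each command inside the repository, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=repository_folder, check=True)


commands = commit_commands(["hello.txt"], "Add hello.txt with a greeting")
for command in commands:
    print(" ".join(command))
```

Checking a box in the TortoiseGit commit dialog corresponds to the staging step, and the commit button corresponds to the second command.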

Push

Once you have made a commit and are ready to put your code in the cloud (or any remote server you choose), you can go ahead and push your changes. This adds the commits which are in your Local Branch to the Remote Branch. Please note that the changes in your Working Tree do not go into the remote branch.
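The flow from Working Tree to Local Branch to Remote Branch can be pictured with a small toy model. This is only a sketch of the idea, not how Git stores anything; the `ToyRepository` class and the change descriptions are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepository:
    working_tree: list = field(default_factory=list)   # changes not committed yet
    local_branch: list = field(default_factory=list)   # commits on your computer
    remote_branch: list = field(default_factory=list)  # commits on the server

    def edit(self, change):
        """Make a change that Git has not recorded yet."""
        self.working_tree.append(change)

    def commit(self, message):
        """Move everything in the working tree into one commit on the local branch."""
        self.local_branch.append((message, tuple(self.working_tree)))
        self.working_tree.clear()

    def push(self):
        """Copy the commits the remote does not have yet; the working tree is untouched."""
        self.remote_branch.extend(self.local_branch[len(self.remote_branch):])


repo = ToyRepository()
repo.edit("create hello.txt")
repo.commit("Add hello.txt")
repo.edit("fix a typo")  # never committed
repo.push()

print("remote branch:", repo.remote_branch)
print("still only in the working tree:", repo.working_tree)
```

The uncommitted typo fix stays on your machine after the push, which is the point made above: only commits travel to the remote branch.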

To push your changes, you can either right-click in the folder and click on “Push…”, or click the “Git Show Log” option to show the complete log of your repository. I recommend the second option: with the log screen open, you can verify each operation immediately by refreshing it with F5.

Once you click on Push, you will be greeted with the popup shown below. Here, you can decide which Local branch you want to push into which Remote branch. You can also specify which remotely hosted repository you want to push your changes to (very handy if you have multiple repositories for the same codebase), or just provide a URL to it. For the first time around, you will need to connect your remote repository to the repository you have on your computer. You can start by creating a repository like this one:

Once you have created this repository, you will get an introductory screen on GitHub with an HTTP URL like the following:

You can copy this string as you will need this to connect your local repository with the one you just created on GitHub. Next, click on the manage button in the “Push” window right beside the remote dropdown. You will see a new popup which is the settings of your repository. You can paste the HTTP URL in the appropriate field and click on “Add New/Save” to add this repository to the list of remote repositories linked to this local repository.

To securely communicate with GitHub, you will also need a PuTTY key pair. You can read more about how you can create and configure yours here. Once you have selected your PuTTY key, you can save and close this window, and proceed to push your changes. Once you have successfully pushed your changes, you will see that your newly created remote branch (cream in color) is visible right beside your local branch (red in color).
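Outside TortoiseGit, connecting a remote and pushing for the first time comes down to two commands. The sketch below only builds and prints them. The URL is a placeholder, not a real repository, and the remote name `origin` and branch name `master` are assumptions that match common defaults; the `first_push_commands` helper is hypothetical.

```python
def first_push_commands(remote_url, branch="master", remote_name="origin"):
    """Build the commands that link a remote repository and push a branch to it."""
    return [
        # Register the remote repository under a short name.
        ["git", "remote", "add", remote_name, remote_url],
        # Push the local branch and remember the remote branch it maps to.
        ["git", "push", "--set-upstream", remote_name, branch],
    ]


# Placeholder URL: substitute the one shown on your own repository page.
placeholder_url = "https://github.com/<your-user>/<your-repository>.git"
for command in first_push_commands(placeholder_url):
    print(" ".join(command))
```

After the first push, later pushes on the same branch no longer need the remote and branch spelled out.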

Clone Repository

You have been making a lot of changes in your code, but you want some help with your project. You decide to enlist a friend for help. Your friend will have to Clone the Repository to get the code for your project. When your friend clones your repository, he will get the latest copy of the remote branches in the repository. He can then start making changes in the project and commit and push his changes as well.

To clone the repository, the PuTTY key pair needs to be prepared beforehand. TortoiseGit allows you to configure a key for your entire system as well, so if you plan on working from a single account on your PC, it’d be a good idea to spend some time setting it up once and for all. Once you are ready to clone the repository, click on the clone option to open the popup.

The URL is the HTTP URL of the repository, which is available when you click on the Clone option on the repository main page. You can manually load a PuTTY key by checking “Load Putty Key” if you want to use a different key for this repository, or leave it unchecked to use the system default key configured in TortoiseGit. Once you click “OK”, you should have a copy of the remote repository on your machine.

Pull and Merge

Now that two people are working on the same project, you will both be making changes and building something together. The good part about Git is that it only accepts changes made on top of the latest copy of the branch, commonly known as the head of the branch. So if you want to add changes to the remote branch but your friend has already added some of his own, you cannot overwrite his changes (at least not without forcing your changes and getting rid of his permanently; more on this later). You will have to pull the latest remote branch and merge the changes your friend made with yours.

The merging algorithm is powerful and looks at where each change was made. If you and your friend have been making changes in different places, both sets of changes are taken. On the other hand, if you have both changed the same place in different ways, Git will give you a Conflict on the lines where it cannot decide which change to take. Whenever you get conflicts, they have to be Resolved manually by the person who is performing the Merge.
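The rule described above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not Git's actual merge algorithm: it assumes both people only edited existing lines, so all three versions have the same number of lines. The `merge_lines` function and the sample text are made up.

```python
def merge_lines(base, mine, theirs):
    """Merge two edited copies of `base` line by line.

    Returns (merged_lines, conflict_line_numbers). A conflicting line is
    left as None so that a person can resolve it by hand.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")

    merged, conflicts = [], []
    lines = zip(base, mine, theirs)
    for number, (original, my_line, their_line) in enumerate(lines, start=1):
        if my_line == their_line:
            merged.append(my_line)      # untouched, or both made the same change
        elif my_line == original:
            merged.append(their_line)   # only my friend changed this line
        elif their_line == original:
            merged.append(my_line)      # only I changed this line
        else:
            merged.append(None)         # both changed it differently
            conflicts.append(number)
    return merged, conflicts


base = ["title", "intro", "body"]
mine = ["Title", "intro", "body v2"]
theirs = ["title", "Intro", "body v3"]

merged, conflicts = merge_lines(base, mine, theirs)
print("merged:", merged)
print("conflicts on lines:", conflicts)
```

Lines 1 and 2 were changed by one person each, so both changes are kept. Line 3 was changed differently by both, so it is reported as a conflict.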

Here I have initialized two copies of the same repository to simulate two contributors to the repository. I have added 3 changes from the “DemoFolder” repository, and 2 changes from the “demoRepository” as shown below.

In this scenario, I have pushed the “demoFile2.txt” and “demoFile3.txt” changes to master already, but my second contributor was a bit late and wants to push as well. Unfortunately, this is what the second contributor will see:

The second contributor doesn’t have the latest copy of the branch he is trying to push. For him to push to the branch, he will have to pull from it first.
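Why the push is refused can be pictured with a small check. In this sketch each branch is a simple list of commit descriptions, oldest first, and a push is accepted only when the local list already starts with everything on the remote. The list model is a simplification (real history after a merge is a graph), and the commit descriptions for the second contributor are invented.

```python
def push_is_accepted(local_history, remote_history):
    """Accept a push only if the local branch already contains the remote commits."""
    return local_history[:len(remote_history)] == remote_history


remote = ["initial commit", "change demoFile2.txt", "change demoFile3.txt"]

# The second contributor started from the initial commit and never pulled.
second_contributor = ["initial commit", "local change A", "local change B"]
print("push before pulling:", push_is_accepted(second_contributor, remote))

# After pulling, the local history includes everything on the remote.
caught_up = remote + ["local change A", "local change B"]
print("push after pulling:", push_is_accepted(caught_up, remote))
```

The first check fails because the second contributor's branch is missing the two commits that were pushed earlier.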

In this case, you are pulling from a common branch, but you could also pull from another branch in case your teammates have been working on something in parallel.

Once you click OK, Git will pull the latest copy of the branch from the remote server and merge it with the version in your active local branch. The merge is itself a change in your repository and is recorded as a special commit called a Merge commit. In it, you can see the changes Git took from each of the two versions it merged; changes that were already present in both versions are not repeated.
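What makes a Merge commit special is that it has two parents, one from each version that was merged. The sketch below models history as a dictionary from each commit to its parents and walks backwards from a commit to collect everything it contains. The single-letter commit names and the `history` helper are hypothetical.

```python
commits = {
    "A": [],          # shared starting point
    "B": ["A"],       # pushed earlier by the first contributor
    "C": ["A"],       # made locally by the second contributor
    "M": ["C", "B"],  # merge commit created by the pull: two parents
}


def history(commit_id, graph):
    """Return every commit reachable from `commit_id` by following parents."""
    seen = set()
    pending = [commit_id]
    while pending:
        current = pending.pop()
        if current not in seen:
            seen.add(current)
            pending.extend(graph[current])
    return seen


print("before the merge:", sorted(history("C", commits)))
print("after the merge:", sorted(history("M", commits)))
```

Before the merge, the second contributor's branch knows nothing about commit B. After the merge, both contributors' work is part of the same history, which is why the push now succeeds.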

Working with Git in a real team

Once you have understood the basics of Git and how it works, it is very simple to use. This makes it a powerful tool and a crucial skill to have when working with software teams. Real-world projects that use Git can see anywhere from one to hundreds of commits per day, and Git handles this with ease thanks to its lightweight design and robust change tracking. If you are interested in learning more about Git, stay tuned for the next article!