Git is a version control system. It keeps track of changes within files and allows for complex collaborative work. While it is mostly used by programmers for storing and sharing code, it can theoretically work with any type of file (text, images, etc.). GitHub is the most-used hosting service for Git. On Github, users store and make their Git repositories available to others.
The Git workflow is highly valued in the open science community for a few reasons. It is fast, secure, and well-suited for coordinating large collaborative projects. The characteristic feature of Git is its branch system which allows users to work on different lines of development in parallel. Branches can be easily merged or deleted with minimal risk of losing valuable material.
Inspired by other fellows and mentors of the Fellowship Freies Wissen, I started to use Git and GitHub about two months ago. However, I quickly faced difficulties.
The problem was that I took the wrong approach with Git. When learning a new computer skill (like R), I usually start experimenting early in the process and learn by solving the inevitable problems that come along the way. This “hands-on” approach proved to be more complicated with Git. While I was trying to keep track of the changes on my PhD project, I got rapidly confused by the concept of “staging area” and struggled moving from one branch to the other.
I realized that to start using Git efficiently I would need a more solid theoretical understanding of the system.
I stopped using Git with my project for a while and went back to the basics. I started reading Pro Git, 2nd edition, written by Scott Chacon and Ben Straub (Apress, 2018) and followed through the explanations with example repositories containing simple .txt files. I performed all the operations in Bash, the command-line system, instead of using a GUI.
Pro Git is a great resource: it is distributed under Creative Commons license and was translated in many languages. You can find the book in html, pdf, and other formats here.
If you are also starting to learn Git I would recommend going through the first three chapters (“Getting Started”, “Git Basics”, “Git Branching”) and the first sections of the chapter on GitHub. That’s about 120 pages.
Taking the time to learn Git properly proved very useful. After a few hours of reading, I was able to take advantage of most of the basic functions of Git. More and more, I am discovering the advantages of Git and wished I had learned it earlier.
You can now follow my work on GitHub here. I will continue to post vignettes in R and will distribute replication material for my papers.