Branching and merging
Branching as a local fork
We have learned how to use forks and Pull Requests on GitHub in order to collaborate with colleagues. The model of forking a personal copy of a repo, making an individual change and then folding that change back into the original (upstream) source is a very useful workflow.
The fork and PR model is a version of git
's own branch-and-merge
functionality, in which commits can follow separate paths (called branches) and
then be brought back together.
How commits work
What exactly is a commit? We've talked about them up to this point as snapshots
of a file or files at a particular moment. That's an accurate description, but
it's also worth examining how commits are related to one another. This will help
develop intuition about why git
handles certain situations the way that it
does.
When we learned about navigating the UNIX filesystem, we explained the file structure as an inverted tree, where each folder can see all of the files and folders below it, but can't see anything above it.
Commits have the exact opposite structure; the tree isn't inverted, it's right-side-up. If we think of the initial commit as the roots of the tree, each commit can trace back to the initial commit via all of its ancestors, but has no knowledge of any commits that come after it (its descendents).
Slides
When to branch?
A branch is a good idea if you want to try adding a big feature (for one example). Making one small change might not require a whole branch (although you can still use a branch), but if you are going out on a limb and trying something new out, a branch is an easy way to make a bunch of changes. If you decide later that you don't want them, it's easy to get rid of them. If you want to keep them, it's easy to merge them into your master branch.
$ cd git/wordcount
Now check the status to make sure everything is clean.
$ git status
On branch master
nothing to commit, working tree clean
git branch
git status
tells us we're on the branch master
, what other branches are
there? We can check using the git branch
command:
$ git branch
* master
There's only one branch at this point, called master
.
Let's create another branch. The easiest way to do this is using the checkout
command with the -b
flag. If we use the -b
flag, then we will create a new
branch and also switch to that branch. Let's try it:
$ git checkout -b make_function
Switched to a new branch 'make_function'
Ok, we switched to a new branch! Let's check git branch
again:
$ git branch
* make_function
master
Now we have two branches and the *
denotes which branch is active.
A quick git status
shows us that the repo is clean and no changes have been
made.
$ git status
On branch make_function
nothing to commit, working tree clean
If we look at the contents of the folder, they also haven't changed.
$ ls
README.md word_count.py
All we have done is change the name of the pointer tracking changes to the repository. Let's make some changes.
Now let's check the diff
$ git diff
diff --git a/word_count.py b/word_count.py
index 66135a8..5a05630 100644
--- a/word_count.py
+++ b/word_count.py
@@ -1,10 +1,17 @@
-happy = input("Enter a statement to word count: ")
+def wordcount(happy):
+ """
+ Splits the input string and performs a frequency word check
+ """
-words = happy.split()
+ words = happy.split()
-counts = {}
-for word in words:
- counts[word] = counts.get(word, 0) + 1
+ counts = {}
+ for word in words:
+ counts[word] = counts.get(word, 0) + 1
-print("The word frequency of your statement is: ")
-print(counts)
+ print("The word frequency of your statement is: ")
+ print(counts)
+
+def main():
+ happy = input("Enter a statement to word count: ")
+ wordcount(happy)
That's a lot of changes, but I think it looks good. Let's commit it.
$ git add word_count.py
$ git commit
[make_function 3196a4e] convert wordcount to function and add main function
1 file changed, 14 insertions(+), 7 deletions(-)
After the changes, word_count.py
should look like this:
$ cat word_count.py
def wordcount(happy):
"""
Splits the input string and performs a frequency word check
"""
words = happy.split()
counts = {}
for word in words:
counts[word] = counts.get(word, 0) + 1
print("The word frequency of your statement is: ")
print(counts)
def main():
happy = input("Enter a statement to word count: ")
wordcount(happy)
Let's switch back to the master
branch now.
$ git checkout master
Switched to branch 'master'
What's the status of master
?
$ git status
On branch master
nothing to commit, working tree clean
All clean! What does word_count.py
look like here?
$ cat word_count.py
happy = input("Enter a statement to word count: ")
words = happy.split()
1 add call to main
counts = {}
for word in words:
counts[word] = counts.get(word, 0) + 1
print("The word frequency of your statement is: ")
print(counts)
It looks the same as it did before! We have added a commit on the
make_function
branch, but master
is separate from that so nothing has
changed here.
Ok, let's go back to the make_function
branch. Since the branch already
exists, we don't need to use the -b
flag.
$ git checkout make_function
Switched to branch 'make_function'
And a quick git branch
check to make sure we are where we think we are:
$ git branch
* make_function
master
Ok. We should have checked before to make sure that the new version works. Let's do that now.
$ python word_count.py
Oops. We defined main
and wordcount
but we forgot to call them!
Let's fix that and check that it works.
$ python word_count.py
Enter a statement to word count: oh now it is working that is a good thing
The word frequency of your statement is:
{'thing': 1, 'a': 1, 'now': 1, 'working': 1, 'good': 1, 'oh': 1, 'it': 1, 'is': 2, 'that': 1}
Better.
$ git diff
diff --git a/word_count.py b/word_count.py
index 5a05630..e66dafd 100644
--- a/word_count.py
+++ b/word_count.py
@@ -15,3 +15,5 @@ def wordcount(happy):
def main():
happy = input("Enter a statement to word count: ")
wordcount(happy)
+
+main()
We just had to add the main()
at the bottom of the script. Now let's commit
this change:
$ git add word_count.py
$ git commit
[make_function 1071b15] add call to main
1 file changed, 2 insertions(+)
Now what does the log look like?
$ git log --oneline
1071b15 add call to main
3196a4e convert wordcount to function and add main function
3f62d8f Create README.md
de8fbc3 Add user-friendly print statement
09633c8 add helper text to input function
97fba8d allow user input of statement to word count
47f748f Add initial version of word count script
Ok, we added two extra commits on the make_function
branch.
What does the log look like on master
? Let's check it out.
$ git checkout master
Switched to branch 'master'
$ git log --oneline
3f62d8f Create README.md
de8fbc3 Add user-friendly print statement
09633c8 add helper text to input function
97fba8d allow user input of statement to word count
47f748f Add initial version of word count script
Right, the master
branch has the same log as before and doesn't have the two
commits that we added to the make_function
branch.
diff
across branches
Up to now, we have used git diff
to compare new changes to the most recent
commit, but you can also use diff
to compare the state of two different
branches. Let's diff
between master
and make_function
.
$ git diff master make_function
diff --git a/word_count.py b/word_count.py
index 66135a8..e66dafd 100644
--- a/word_count.py
+++ b/word_count.py
@@ -1,10 +1,19 @@
-happy = input("Enter a statement to word count: ")
+def wordcount(happy):
+ """
+ Splits the input string and performs a frequency word check
+ """
-words = happy.split()
+ words = happy.split()
-counts = {}
-for word in words:
- counts[word] = counts.get(word, 0) + 1
+ counts = {}
+ for word in words:
+ counts[word] = counts.get(word, 0) + 1
-print("The word frequency of your statement is: ")
-print(counts)
+ print("The word frequency of your statement is: ")
+ print(counts)
+
+def main():
+ happy = input("Enter a statement to word count: ")
+ wordcount(happy)
+
+main()
Cool!
git merge
We have two branches and make_function
has the new feature, which makes
wordcount
a function. That's a better way to write scripts. Since the
make_function
branch contains the feature that we want and we think that it's
ready to go, we can bring the contents of the make_function
branch into the
master
branch by using git merge
.
To do a merge (locally), git checkout
the branch you want to merge INTO.
Then type git merge <branch>
where <branch>
is the branch you want to merge
FROM.
We are on the master
branch and want to merge in make_function
so we do:
$ git merge make_function
Updating 3f62d8f..1071b15
Fast-forward
word_count.py | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)
Because the history of master
and the history of make_function
share common
ancestors and don't have any divergence descendents, we get a "fast-forward"
merge. That means that all of the changes we made in make_function
look as if
they were made directly in master
.
Sometimes, you might want to force git
to create a merge commit, so that you
know where a branch was merged in. In that case, you can prevent git
from
doing a "fast-forward" by merging with the -no-ff
flag.
(note that this command won't work now that we have already merged in
make_function
)
$ git merge --no-ff make_function
Deleting your branch
Once you have merged your branch into master
it doesn't really make sense to
keep it around. All of its information and history is now contained in master
.
Confirm first that you are on the master
branch:
$ git branch
make_function
* master
Confirmed. Now we can delete the make_function
branch:
$ git branch -d make_function
Deleted branch make_function (was 1071b15).
Note that if you use git branch -d
it will only delete a branch if ALL of the
commits in that branch are also in master
, so you won't accidentally lose any
of your commits.
Pushing a branch to GitHub
You can also open Pull Requests between separate branches on GitHub. This often
presents a good way for collaborating with people who have access to the same
repository. You don't want to all be pushing to the master
branch all the
time. Instead, each person can create their own branch, work separately, and
then open a pull request to merge that branch into master.
If you create a local branch in your repo, you can push it to GitHub as follows:
First, make sure that you are on the branch that you want to push:
$ git branch
* make_function
master
Then run
$ git push -u origin <branch_name>
in this case
$ git push -u origin make_function
to push the branch to GitHub.
How do I know if I've already pushed a branch to GitHub?
If you have already done a push -u
, then git status
will contain an extra
line of output. For example:
$ git status
On branch make_function
Your branch is up-to-date with 'origin/make_function'.
nothing to commit, working tree clean
Note the additional line that "Your branch is up-to-date with
'origin/make_function'". If there is any mention of origin/make_function
(or
whatever your branch name) then you have set up the branch on GitHub.
Of course, you can also just browse to your copy of the repo on GitHub and see if the branch is there, but do whichever thing seems easiest to you.