@derrickstolee recently discussed several different git clone options, but how do those options actually affect your Git performance? Which option is fastest for your client experience? Which option is fastest for your build machines? How can these options impact server performance? If you are a GitHub Enterprise Server administrator, it’s important that you understand how the server responds to these options under the load of multiple simultaneous requests.
Here at GitHub, we use a data-driven approach to answer these questions. We ran an experiment to compare these different clone options and measured the client and server behavior. It is not enough to just compare git clone times, because that is only the start of your interaction with a Git repository. In particular, we wanted to determine how these clone options change the behavior of future Git operations such as git fetch.
In this experiment, we aimed to answer the following questions:
- How fast are the various git clone options?
- Once we have cloned a repository, what kind of impact do future git fetch commands have on the server and client?
- What impact do full, shallow, and partial clones have on a Git server? This is mostly important for our GitHub Enterprise Server administrators.
- Will the repository shape and size make any difference in the overall performance?
It is worth emphasizing that these results come from simulations that we performed in our controlled environments and do not simulate complex workflows that might be used by many Git users. Depending on your workflows and repository characteristics, these results may change. Perhaps this experiment provides a framework that you could follow to measure how your workflows are affected by these options. If you would like help analyzing your workflows, feel free to engage with GitHub’s Professional Services team.
For a summary of our findings, feel free to jump to our conclusions and recommendations.
To maximize the repeatability of our experiment, we use open source repositories for our sample data. This way, you can compare your repository shape to the tested repositories to see which is most applicable to your scenario.
These repositories were mirrored to a GitHub Enterprise Server running version 2.22 on an 8-core cloud machine. We use an internal load testing tool based on Gatling to generate git requests against the test instance. We ran each test with a specific number of users across 5 different load generators for 30 minutes. All of our load generators use git version 2.28.0, which uses protocol version 1 by default. Note that protocol version 2 only improves ref advertisement, and therefore we don’t expect it to make a difference in our tests.
Once a test is complete, we use a combination of Gatling results, ghe-governor, and server health metrics to analyze the test.
Test repository characteristics
The git-sizer tool measures the size of Git repositories along many dimensions. In particular, we care about the total size on disk along with the count of each object type. The table below contains this information for our three test repositories.
The jquery/jquery repository is a fairly small repository with only 40MB of disk space. The apple/swift repository is medium-sized, with around 130 thousand commits and 750MB on disk. The torvalds/linux repository is typically the gold standard for Git performance tests on open source repositories. It uses 4 gigabytes of disk space and has close to a million commits.
We care about the following clone options:
- Full clones.
- Shallow clones.
- Treeless clones.
- Blobless clones.
In addition to these options at clone time, we can also choose to fetch in a shallow way using --depth=1. Since treeless and blobless clones have their own way to reduce the objects downloaded during git fetch, we only test shallow fetches on full and shallow clones.
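As a rough sketch, the four clone types and the shallow fetch can be written out as Git command lines. The flags `--depth=1`, `--filter=tree:0`, and `--filter=blob:none` are standard Git options, but the exact commands used in the experiment are not shown here, so treat the mapping as an assumption. The helper names `clone_command` and `fetch_command`, the remote name, and the destination directories are hypothetical; the helpers only build argument lists and do not run Git.

```python
"""Build (but do not run) git commands for the clone types discussed above."""

# Extra flags per clone type; a full clone needs none.
# Assumption: these standard Git flags match the options under test.
CLONE_FLAGS = {
    "full": [],
    "shallow": ["--depth=1"],
    "treeless": ["--filter=tree:0"],
    "blobless": ["--filter=blob:none"],
}


def clone_command(kind, url, dest):
    """Return the argument list for `git clone` of the given kind."""
    if kind not in CLONE_FLAGS:
        raise ValueError(f"unknown clone kind: {kind!r}")
    return ["git", "clone", *CLONE_FLAGS[kind], url, dest]


def fetch_command(shallow=False, remote="origin"):
    """Return the argument list for a full or shallow `git fetch`."""
    cmd = ["git", "fetch"]
    if shallow:
        cmd.append("--depth=1")
    cmd.append(remote)
    return cmd


if __name__ == "__main__":
    url = "https://github.com/jquery/jquery.git"
    for kind in CLONE_FLAGS:
        # Hypothetical destination directory names, one per clone type.
        print(" ".join(clone_command(kind, url, f"jquery-{kind}")))
    print(" ".join(fetch_command(shallow=True)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` if you want to execute it against your own repository.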
We organized our test scenarios into the following ten categories, labeled T1 through T10. T1 to T4 simulate four different git clone types. T5 to T10 simulate various git fetch operations into these cloned repositories.
|Test#||Description|
|T1||Full clone|
|T2||Shallow clone|
|T3||Treeless partial clone|
|T4||Blobless partial clone|
|T5||Full fetch in a fully cloned repository|
|T6||Shallow fetch in a fully cloned repository|
|T7||Full fetch in a shallow cloned repository|
|T8||Shallow fetch in a shallow cloned repository|
|T9||Full fetch in a treeless partially cloned repository|
|T10||Full fetch in a blobless partially cloned repository|
In partial clones, the new blobs at the new ref tip are not downloaded until we navigate to that position and populate our working directory with those blob contents. To make a fair comparison with the full and shallow clone cases, we also have our simulation run git reset --hard origin/$branch in all T5 to T10 tests. In T5 to T8 this extra step will not have a huge impact, but in T9 and T10 it will ensure the blob downloads are included in the cost.
In all the scenarios above, a single user was also set to repeatedly change 3 random files in the repository and push them to the same branch that the other users were cloning and fetching. This simulates repository growth so the git fetch commands actually have new data to download.
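The fetch scenarios can be summarized as a small lookup table. The sketch below lists, for T5 through T10, the kind of clone being fetched into and whether the fetch is shallow, then builds the two commands each simulated user runs. The function name `measured_commands` and the default branch name `main` are hypothetical; the text only refers to the branch as `$branch`.

```python
"""Map fetch scenarios T5-T10 to the commands each simulated user runs."""

# test id -> (clone type fetched into, whether the fetch uses --depth=1)
FETCH_SCENARIOS = {
    "T5": ("full", False),
    "T6": ("full", True),
    "T7": ("shallow", False),
    "T8": ("shallow", True),
    "T9": ("treeless", False),
    "T10": ("blobless", False),
}


def measured_commands(test_id, branch="main"):
    """Return the fetch and reset argument lists for one scenario.

    `branch` stands in for $branch in the text; "main" is only a placeholder.
    """
    _clone_kind, shallow = FETCH_SCENARIOS[test_id]
    fetch = ["git", "fetch"]
    if shallow:
        fetch.append("--depth=1")
    fetch.append("origin")
    # The reset forces partial clones (T9, T10) to download missing objects.
    reset = ["git", "reset", "--hard", f"origin/{branch}"]
    return [fetch, reset]


if __name__ == "__main__":
    for test_id, (kind, _shallow) in FETCH_SCENARIOS.items():
        steps = " && ".join(" ".join(c) for c in measured_commands(test_id))
        print(f"{test_id} ({kind} clone): {steps}")
```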
Let’s dig into the numbers to see what our experiment says.
git clone performance
The full numbers are provided in the tables below.
Unsurprisingly, a shallow clone is the fastest clone for the client, followed by treeless and then blobless partial clones, and finally full clones. This ordering follows the amount of data required to satisfy the clone request. Recall that full clones need all reachable objects, blobless clones need all reachable commits and trees, and treeless clones need all reachable commits. A shallow clone is the only clone type that does not grow at all along with the history of your repository.
The performance impact of these clone types grows in proportion to the repository size, especially the number of commits. For example, a shallow clone of torvalds/linux is four times faster than a full clone, while a treeless clone is only twice as fast and a blobless clone is only 1.5 times as fast. It is worth noting that the development culture of the Linux project promotes very small blobs that compress extremely well. We expect the performance difference to be greater for most other projects with a higher blob count or size.
As for server performance, we see that the Git CPU time per clone is higher for the blobless partial clone (T4). Looking a bit closer with ghe-governor, we observe that the higher Git CPU is mainly due to the higher number of pack-objects operations in the partial clone scenarios (T3 and T4). In the torvalds/linux repository, the Git CPU time spent on pack-objects is four times higher in a treeless partial clone (T3) than in a full clone (T1). In contrast, in the smaller jquery/jquery repository, a full clone consumes more CPU per clone than the partial clones (T3 and T4). Shallow clones of all three repositories consume the lowest total CPU and Git CPU per clone.
If the server sends more data in a full clone, then why does it spend more CPU on a partial clone? When Git sends all reachable objects to the client, it mostly transfers the data it has on disk without decompressing or converting the data. However, partial and shallow clones need to extract that subset of data and repackage it to send to the client. We are investigating ways to reduce this CPU cost in partial clones.
The real fun starts after the repository is cloned, when users start developing and pushing code back up to the server. In the next section we analyze scenarios T5 through T10, which focus on the git fetch and git reset --hard origin/$branch commands.
jquery/jquery clone performance
|Test#||description||clone avgRT (milliseconds)||git CPU spent per clone|
|T1||full clone||2,000ms (Slowest)||450ms (Highest)|
|T2||shallow clone||300ms (6x faster than T1)||15ms (30x less than T1)|
|T3||treeless partial clone||900ms (2.5x faster than T1)||270ms (1.7x less than T1)|
|T4||blobless partial clone||900ms (2.5x faster than T1)||300ms (1.5x less than T1)|
apple/swift clone performance
|Test#||description||clone avgRT (seconds)||git CPU spent per clone|
|T1||full clone||50s (Slowest)||8s (2x less than T4)|
|T2||shallow clone||8s (6x faster than T1)||3s (6x less than T4)|
|T3||treeless partial clone||16s (3x faster than T1)||13s (Similar to T4)|
|T4||blobless partial clone||22s (2x faster than T1)||15s (Highest)|
torvalds/linux clone performance
|Test#||description||clone avgRT (minutes)||git CPU spent per clone|
|T1||full clone||5m (Slowest)||60s (2x less than T4)|
|T2||shallow clone||1.2m (4x faster than T1)||40s (3.5x less than T4)|
|T3||treeless partial clone||2.4m (2x faster than T1)||120s (Similar to T4)|
|T4||blobless partial clone||3m (1.5x faster than T1)||130s (Highest)|
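The "Nx faster than T1" labels in the tables above can be checked with simple division. The sketch below converts the reported average response times to seconds and computes each clone type's speedup relative to a full clone; the unrounded ratios show that the table labels are approximations. The function name `speedup_vs_full` is hypothetical.

```python
"""Recompute clone speedups relative to a full clone (T1) from the tables."""

# Average clone response time in seconds, copied from the tables above
# (milliseconds and minutes converted to seconds).
CLONE_AVG_RT_SECONDS = {
    "jquery/jquery": {"T1": 2.0, "T2": 0.3, "T3": 0.9, "T4": 0.9},
    "apple/swift": {"T1": 50.0, "T2": 8.0, "T3": 16.0, "T4": 22.0},
    "torvalds/linux": {"T1": 300.0, "T2": 72.0, "T3": 144.0, "T4": 180.0},
}


def speedup_vs_full(times):
    """Return how many times faster each test is than the full clone (T1)."""
    full = times["T1"]
    return {test: full / rt for test, rt in times.items() if test != "T1"}


if __name__ == "__main__":
    for repo, times in CLONE_AVG_RT_SECONDS.items():
        ratios = speedup_vs_full(times)
        summary = ", ".join(f"{t}: {r:.2f}x" for t, r in ratios.items())
        print(f"{repo}: {summary}")
```

For torvalds/linux this gives ratios slightly above 4, slightly above 2, and about 1.7 for T2, T3, and T4, in line with the rounded labels in the table.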
git fetch performance
The full fetch performance numbers are provided in the tables below, but let’s first summarize our findings.
The biggest finding is that shallow fetches are the worst possible option, in particular from full clones. The technical reason is that the existence of a “shallow boundary” disables an important performance optimization on the server. This causes the server to walk commits and trees to find what’s reachable from the client’s perspective. This is more expensive in the full clone case because there are more commits and trees on the client that the server must check to avoid sending duplicates. Also, as more shallow commits are accumulated, the client needs to send more data to the server to describe those shallow boundaries.
The two partial clone options have drastically different behavior in the git fetch and git reset --hard origin/$branch sequence. Fetching from blobless partial clones increases the time of the reset command by a small but measurable amount, not enough to make a huge difference from the user perspective. In contrast, fetching from a treeless partial clone takes significantly more time because the server needs to send all trees and blobs reachable from a commit’s root tree in order to satisfy the git reset --hard origin/$branch command.
Due to these extra costs as a repository grows, we strongly recommend against shallow fetches and fetching from treeless partial clones. The only recommended scenario for a treeless partial clone is for quickly cloning on a build machine that needs access to the commit history, but will delete the repository at the end of the build.
Blobless partial clones do increase the Git CPU costs on the server somewhat, but the network data transfer is much less than a full clone or a full fetch from a shallow clone. The extra CPU cost is likely to become less important if your repository has larger blobs than our test repositories. In addition, you have access to the full commit history, which might be valuable to real users interacting with these repositories.
It is also worth noting a surprising result from our testing. During the T9 and T10 tests for the Linux repository, our load generators encountered memory issues because these scenarios, under the heavy load that we were running, appeared to trigger more automatic garbage collections (GC). GC in the Linux repository is expensive and involves a full repack of all Git data. Since we were testing on a Linux client, the GC processes were launched in the background to avoid blocking our foreground commands. However, as we kept fetching, we ended up with several concurrent background processes; this is not a realistic scenario but an artifact of our synthetic load testing. We ran git config gc.auto false to prevent this from affecting our test results.
Note that blobless partial clones might trigger automatic garbage collection more often than a full clone. This is a natural byproduct of splitting the data into a larger number of small requests. We have work in progress to make Git’s repository maintenance more flexible, especially for large repositories where a full repack is too time-consuming. Look forward to more updates about that feature here on the GitHub blog.
jquery/jquery fetch performance
|Test#||scenario||git fetch avgRT||git reset --hard||git CPU spent per fetch|
|T5||full fetch in a fully cloned repository||200ms||5ms||4ms (Lowest)|
|T6||shallow fetch in a fully cloned repository||300ms||5ms||18ms (4x more than T5)|
|T7||full fetch in a shallow cloned repository||200ms||5ms||4ms (Similar to T5)|
|T8||shallow fetch in a shallow cloned repository||250ms||5ms||14ms (3x more than T5)|
|T9||full fetch in a treeless partially cloned repository||200ms||85ms||9ms (2x more than T5)|
|T10||full fetch in a blobless partially cloned repository||200ms||40ms||6ms (1.5x more than T5)|
apple/swift fetch performance
|Test#||scenario||git fetch avgRT||git reset --hard||git CPU spent per fetch|
|T5||full fetch in a fully cloned repository||350ms||80ms||20ms (Lowest)|
|T6||shallow fetch in a fully cloned repository||1,500ms||80ms||300ms (13x more than T5)|
|T7||full fetch in a shallow cloned repository||300ms||80ms||20ms (Similar to T5)|
|T8||shallow fetch in a shallow cloned repository||350ms||80ms||45ms (2x more than T5)|
|T9||full fetch in a treeless partially cloned repository||300ms||300ms||70ms (3x more than T5)|
|T10||full fetch in a blobless partially cloned repository||300ms||150ms||35ms (1.5x more than T5)|
torvalds/linux fetch performance
|Test#||scenario||git fetch avgRT||git reset --hard||git CPU spent per fetch|
|T5||full fetch in a fully cloned repository||350ms||250ms||40ms (Lowest)|
|T6||shallow fetch in a fully cloned repository||6,000ms||250ms||1,000ms (25x more than T5)|
|T7||full fetch in a shallow cloned repository||300ms||250ms||50ms (1.3x more than T5)|
|T8||shallow fetch in a shallow cloned repository||400ms||250ms||80ms (2x more than T5)|
|T9||full fetch in a treeless partially cloned repository||350ms||1,250ms||400ms (10x more than T5)|
|T10||full fetch in a blobless partially cloned repository||300ms||500ms||140ms (3.5x more than T5)|
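Because each simulated user runs a fetch followed by a reset, one way to read the table above from the client's perspective is to add the two columns. The sketch below does this for the torvalds/linux numbers and ranks the scenarios. Summing two averages is only an approximation of the combined latency, and the names `LINUX_FETCH_RESET_MS` and `client_totals` are hypothetical.

```python
"""Add fetch and reset times for torvalds/linux to compare client cost."""

# test id -> (git fetch avgRT, git reset --hard time), both in milliseconds,
# copied from the torvalds/linux fetch performance table above.
LINUX_FETCH_RESET_MS = {
    "T5": (350, 250),
    "T6": (6000, 250),
    "T7": (300, 250),
    "T8": (400, 250),
    "T9": (350, 1250),
    "T10": (300, 500),
}


def client_totals(timings):
    """Return the approximate fetch + reset time per scenario."""
    return {test: fetch + reset for test, (fetch, reset) in timings.items()}


def rank_by_total(timings):
    """Return scenario ids ordered from fastest to slowest combined time."""
    totals = client_totals(timings)
    return sorted(totals, key=totals.get)


if __name__ == "__main__":
    totals = client_totals(LINUX_FETCH_RESET_MS)
    for test in rank_by_total(LINUX_FETCH_RESET_MS):
        print(f"{test}: {totals[test]}ms")
```

On these numbers, the shallow fetch into a full clone (T6) and the fetch into a treeless clone (T9) have the largest combined times, which matches the recommendations in the text.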
What does this mean for you?
Our experiment demonstrated some performance changes between these different clone and fetch options. Your mileage may vary! Our experimental load was synthetic, and your repository shape can differ greatly from these repositories.
Here are some common themes we identified that could help you choose the right scenario for your own usage:
If you are a developer focused on a single repository, the best approach is to do a full clone and then always perform a full fetch into that clone. You might deviate to a blobless partial clone if that repository is very large due to many large blobs, as that clone will help you get started more quickly. The trade-off is that some commands such as git checkout or git blame will require downloading new blob data when necessary.
In general, calculating a shallow fetch is computationally more expensive than calculating a full fetch. Always use a full fetch instead of a shallow fetch, in both fully cloned and shallow cloned repositories.
In workflows such as CI builds, where there is a need to do a single clone and delete the repository immediately, shallow clones are a good option. Shallow clones are the fastest way to get a copy of the working directory at the tip commit. If you need the commit history for your build, then a treeless partial clone might work better for you than a full clone. Bear in mind that in larger repositories such as the torvalds/linux repository, it will save time on the client but it’s a bit heavier on your Git server when compared to a full clone.
Blobless partial clones are particularly effective if you are using Git’s sparse-checkout feature to reduce the size of your working directory. The combination greatly reduces the number of blobs you need to do your work. Sparse-checkout does not reduce the required data transfer for shallow clones.
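A minimal sketch of combining a blobless partial clone with sparse-checkout is shown below. It uses the standard `git clone --filter=blob:none --sparse` and `git sparse-checkout` subcommands, but this combination was not part of the measured scenarios, so the exact sequence is an assumption. The function name `sparse_blobless_commands`, the destination directory, and the chosen directories are hypothetical; the function only builds argument lists.

```python
"""Build commands for a blobless partial clone with a sparse working tree."""


def sparse_blobless_commands(url, dest, directories):
    """Return the git argument lists for a blobless, sparse clone.

    `directories` are the paths to keep in the working directory.
    """
    if not directories:
        raise ValueError("at least one directory is required")
    return [
        # Download commits and trees only; start with a minimal working tree.
        ["git", "clone", "--filter=blob:none", "--sparse", url, dest],
        # Enable cone mode so whole directories can be selected.
        ["git", "-C", dest, "sparse-checkout", "init", "--cone"],
        # Populate only the listed directories, fetching their blobs on demand.
        ["git", "-C", dest, "sparse-checkout", "set", *directories],
    ]


if __name__ == "__main__":
    commands = sparse_blobless_commands(
        "https://github.com/torvalds/linux.git",
        "linux-sparse",  # hypothetical destination directory
        ["Documentation", "kernel"],  # hypothetical directories of interest
    )
    for cmd in commands:
        print(" ".join(cmd))
```

Each list can be run with `subprocess.run(cmd, check=True)`. Only the blobs under the selected directories are downloaded when the working directory is populated.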
Notably, we did not test repositories that are significantly larger than the torvalds/linux repository. Such repositories are not really available in the open, but are becoming increasingly common for private repositories. If you feel your repository is not represented by our test repositories, then we recommend trying to replicate our experiments yourself.
As can be observed, our test process does not simulate a real-life situation where users have different workflows and work on different branches. Also, the set of Git commands analyzed in this study is small and is not representative of a user’s daily Git usage. We are continuing to study these options to get a holistic view of how they change the user experience.