[git] How can I debug git/git-shell related problems?

Git 2.22 (Q2 2019) introduces trace2 with commit ee4512e by Jeff Hostetler:

trace2: create new combined trace facility

Create a new unified tracing facility for git.
The eventual intent is to replace the current trace_printf* and trace_performance* routines with a unified set of git_trace2* routines.

In addition to the usual printf-style API, trace2 provides higer-level event verbs with fixed-fields allowing structured data to be written.
This makes post-processing and analysis easier for external tools.

Trace2 defines 3 output targets.
These are set using the environment variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT".
These may be set to "1" or to an absolute pathname (just like the current GIT_TRACE).

Note: regarding environment variable name, always use GIT_TRACExxx, not GIT_TRxxx.
So actually GIT_TRACE2, GIT_TRACE2_PERF or GIT_TRACE2_EVENT.
See the Git 2.22 rename mentioned later below.

What follows is the initial work on this new tracing feature, with the old environment variable names:

  • GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command summary data.

  • GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE.
    It extends the output with columns for the command process, thread, repo, absolute and relative elapsed times.
    It reports events for child process start/stop, thread start/stop, and per-thread function nesting.

  • GIT_TR2_EVENT is a new structured format. It writes event data as a series of JSON records.

Calls to trace2 functions log to any of the 3 output targets enabled without the need to call different trace_printf* or trace_performance* routines.

See commit a4d3a28 (21 Mar 2019) by Josh Steadmon (steadmon).
(Merged by Junio C Hamano -- gitster -- in commit 1b40314, 08 May 2019)

trace2: write to directory targets

When the value of a trace2 environment variable is an absolute path referring to an existing directory, write output to files (one per process) underneath the given directory.
Files will be named according to the final component of the trace2 SID, followed by a counter to avoid potential collisions.

This makes it more convenient to collect traces for every git invocation by unconditionally setting the relevant trace2 envvar to a constant directory name.


See also commit f672dee (29 Apr 2019), and commit 81567ca, commit 08881b9, commit bad229a, commit 26c6f25, commit bce9db6, commit 800a7f9, commit a7bc01e, commit 39f4317, commit a089724, commit 1703751 (15 Apr 2019) by Jeff Hostetler (jeffhostetler).
(Merged by Junio C Hamano -- gitster -- in commit 5b2d1c0, 13 May 2019)

The new documentation now includes config settings which are only read from the system and global config files (meaning repository local and worktree config files and -c command line arguments are not respected.)

Example:

$ git config --global trace2.normalTarget ~/log.normal
$ git version
git version 2.20.1.155.g426c96fcdb

yields

$ cat ~/log.normal
12:28:42.620009 common-main.c:38                  version 2.20.1.155.g426c96fcdb
12:28:42.620989 common-main.c:39                  start git version
12:28:42.621101 git.c:432                         cmd_name version (version)
12:28:42.621215 git.c:662                         exit elapsed:0.001227 code:0
12:28:42.621250 trace2/tr2_tgt_normal.c:124 atexit elapsed:0.001265 code:0

And for performance measure:

$ git config --global trace2.perfTarget ~/log.perf
$ git version
git version 2.20.1.155.g426c96fcdb

yields

$ cat ~/log.perf
12:28:42.620675 common-main.c:38                  | d0 | main                     | version      |     |           |           |            | 2.20.1.155.g426c96fcdb
12:28:42.621001 common-main.c:39                  | d0 | main                     | start        |     |  0.001173 |           |            | git version
12:28:42.621111 git.c:432                         | d0 | main                     | cmd_name     |     |           |           |            | version (version)
12:28:42.621225 git.c:662                         | d0 | main                     | exit         |     |  0.001227 |           |            | code:0
12:28:42.621259 trace2/tr2_tgt_perf.c:211         | d0 | main                     | atexit       |     |  0.001265 |           |            | code:0

As documented in Git 2.23 (Q3 2019), the environment variable to use is GIT_TRACE2.

See commit 6114a40 (26 Jun 2019) by Carlo Marcelo Arenas Belón (carenas).
See commit 3efa1c6 (12 Jun 2019) by Ævar Arnfjörð Bjarmason (avar).
(Merged by Junio C Hamano -- gitster -- in commit e9eaaa4, 09 Jul 2019)

That follows the work done in Git 2.22: commit 4e0d3aa, commit e4b75d6 (19 May 2019) by SZEDER Gábor (szeder).
(Merged by Junio C Hamano -- gitster -- in commit 463dca6, 30 May 2019)

trace2: rename environment variables to GIT_TRACE2*

For an environment variable that is supposed to be set by users, the GIT_TR2* env vars are just too unclear, inconsistent, and ugly.

Most of the established GIT_* environment variables don't use abbreviations, and in case of the few that do (GIT_DIR, GIT_COMMON_DIR, GIT_DIFF_OPTS) it's quite obvious what the abbreviations (DIR and OPTS) stand for.
But what does TR stand for? Track, traditional, trailer, transaction, transfer, transformation, transition, translation, transplant, transport, traversal, tree, trigger, truncate, trust, or ...?!

The trace2 facility, as the '2' suffix in its name suggests, is supposed to eventually supercede Git's original trace facility.
It's reasonable to expect that the corresponding environment variables follow suit, and after the original GIT_TRACE variables they are called GIT_TRACE2; there is no such thing is 'GIT_TR'.

All trace2-specific config variables are, very sensibly, in the 'trace2' section, not in 'tr2'.

OTOH, we don't gain anything at all by omitting the last three characters of "trace" from the names of these environment variables.

So let's rename all GIT_TR2* environment variables to GIT_TRACE2*, before they make their way into a stable release.


Git 2.24 (Q3 2019) improves the Git repository initialization.

See commit 22932d9, commit 5732f2b, commit 58ebccb (06 Aug 2019) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit b4a1eec, 09 Sep 2019)

common-main: delay trace2 initialization

We initialize the trace2 system in the common main() function so that all programs (even ones that aren't builtins) will enable tracing.

But trace2 startup is relatively heavy-weight, as we have to actually read on-disk config to decide whether to trace.
This can cause unexpected interactions with other common-main initialization. For instance, we'll end up in the config code before calling initialize_the_repository(), and the usual invariant that the_repository is never NULL will not hold.

Let's push the trace2 initialization further down in common-main, to just before we execute cmd_main().


Git 2.24 (Q4 2019) makes also sure that output from trace2 subsystem is formatted more prettily now.

See commit 742ed63, commit e344305, commit c2b890a (09 Aug 2019), commit ad43e37, commit 04f10d3, commit da4589c (08 Aug 2019), and commit 371df1b (31 Jul 2019) by Jeff Hostetler (jeffhostetler).
(Merged by Junio C Hamano -- gitster -- in commit 93fc876, 30 Sep 2019)

And, still Git 2.24

See commit 87db61a, commit 83e57b0 (04 Oct 2019), and commit 2254101, commit 3d4548e (03 Oct 2019) by Josh Steadmon (steadmon).
(Merged by Junio C Hamano -- gitster -- in commit d0ce4d9, 15 Oct 2019)

trace2: discard new traces if target directory has too many files

Signed-off-by: Josh Steadmon

trace2 can write files into a target directory.
With heavy usage, this directory can fill up with files, causing difficulty for trace-processing systems.

This patch adds a config option (trace2.maxFiles) to set a maximum number of files that trace2 will write to a target directory.

The following behavior is enabled when the maxFiles is set to a positive integer:

  • When trace2 would write a file to a target directory, first check whether or not the traces should be discarded.
    Traces should be discarded if:
    • there is a sentinel file declaring that there are too many files
    • OR, the number of files exceeds trace2.maxFiles.
      In the latter case, we create a sentinel file named git-trace2-discard to speed up future checks.

The assumption is that a separate trace-processing system is dealing with the generated traces; once it processes and removes the sentinel file, it should be safe to generate new trace files again.

The default value for trace2.maxFiles is zero, which disables the file count check.

The config can also be overridden with a new environment variable: GIT_TRACE2_MAX_FILES.


And Git 2.24 (Q4 2019) teach trace2 about git push stages.

See commit 25e4b80, commit 5fc3118 (02 Oct 2019) by Josh Steadmon (steadmon).
(Merged by Junio C Hamano -- gitster -- in commit 3b9ec27, 15 Oct 2019)

push: add trace2 instrumentation

Signed-off-by: Josh Steadmon

Add trace2 regions in transport.c and builtin/push.c to better track time spent in various phases of pushing:

  • Listing refs
  • Checking submodules
  • Pushing submodules
  • Pushing refs

With Git 2.25 (Q1 2020), some of the Documentation/technical is moved to header *.h files.

See commit 6c51cb5, commit d95a77d, commit bbcfa30, commit f1ecbe0, commit 4c4066d, commit 7db0305, commit f3b9055, commit 971b1f2, commit 13aa9c8, commit c0be43f, commit 19ef3dd, commit 301d595, commit 3a1b341, commit 126c1cc, commit d27eb35, commit 405c6b1, commit d3d7172, commit 3f1480b, commit 266f03e, commit 13c4d7e (17 Nov 2019) by Heba Waly (HebaWaly).
(Merged by Junio C Hamano -- gitster -- in commit 26c816a, 16 Dec 2019)

trace2: move doc to trace2.h

Signed-off-by: Heba Waly

Move the functions documentation from Documentation/technical/api-trace2.txt to trace2.h as it's easier for the developers to find the usage information beside the code instead of looking for it in another doc file.

Only the functions documentation section is removed from Documentation/technical/api-trace2.txt as the file is full of details that seemed more appropriate to be in a separate doc file as it is, with a link to the doc file added in the trace2.h. Also the functions doc is removed to avoid having redundandt info which will be hard to keep syncronized with the documentation in the header file.

(although that reorganization had a side effect on another command, explained and fixed with Git 2.25.2 (March 2020) in commit cc4f2eb (14 Feb 2020) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 1235384, 17 Feb 2020))


With Git 2.27 (Q2 2020): Trace2 enhancement to allow logging of the environment variables.

See commit 3d3adaa (20 Mar 2020) by Josh Steadmon (steadmon).
(Merged by Junio C Hamano -- gitster -- in commit 810dc64, 22 Apr 2020)

trace2: teach Git to log environment variables

Signed-off-by: Josh Steadmon
Acked-by: Jeff Hostetler

Via trace2, Git can already log interesting config parameters (see the trace2_cmd_list_config() function). However, this can grant an incomplete picture because many config parameters also allow overrides via environment variables.

To allow for more complete logs, we add a new trace2_cmd_list_env_vars() function and supporting implementation, modeled after the pre-existing config param logging implementation.


With Git 2.27 (Q2 2020), teach codepaths that show progress meter to also use the start_progress() and the stop_progress() calls as a "region" to be traced.

See commit 98a1364 (12 May 2020) by Emily Shaffer (nasamuffin).
(Merged by Junio C Hamano -- gitster -- in commit d98abce, 14 May 2020)

trace2: log progress time and throughput

Signed-off-by: Emily Shaffer

Rather than teaching only one operation, like 'git fetch', how to write down throughput to traces, we can learn about a wide range of user operations that may seem slow by adding tooling to the progress library itself.

Operations which display progress are likely to be slow-running and the kind of thing we want to monitor for performance anyways.

By showing object counts and data transfer size, we should be able to make some derived measurements to ensure operations are scaling the way we expect.

And:

With Git 2.27 (Q2 2020), last-minute fix for our recent change to allow use of progress API as a traceable region.

See commit 3af029c (15 May 2020) by Derrick Stolee (derrickstolee).
(Merged by Junio C Hamano -- gitster -- in commit 85d6e28, 20 May 2020)

progress: call trace2_region_leave() only after calling _enter()

Signed-off-by: Derrick Stolee

A user of progress API calls start_progress() conditionally and depends on the display_progress() and stop_progress() functions to become no-op when start_progress() hasn't been called.

As we added a call to trace2_region_enter() to start_progress(), the calls to other trace2 API calls from the progress API functions must make sure that these trace2 calls are skipped when start_progress() hasn't been called on the progress struct.

Specifically, do not call trace2_region_leave() from stop_progress() when we haven't called start_progress(), which would have called the matching trace2_region_enter().


That last part is more robust with Git 2.29 (Q4 2020):

See commit ac900fd (10 Aug 2020) by Martin Ågren (none).
(Merged by Junio C Hamano -- gitster -- in commit e6ec620, 17 Aug 2020)

progress: don't dereference before checking for NULL

Signed-off-by: Martin Ågren

In stop_progress(), we're careful to check that p_progress is non-NULL before we dereference it, but by then we have already dereferenced it when calling finish_if_sparse(*p_progress).
And, for what it's worth, we'll go on to blindly dereference it again inside stop_progress_msg().

We could return early if we get a NULL-pointer, but let's go one step further and BUG instead.
The progress API handles NULL just fine, but that's the NULL-ness of *p_progress, e.g., when running with --no-progress.
If p_progress is NULL, chances are that's a mistake.
For symmetry, let's do the same check in stop_progress_msg(), too.


With Git 2.29 (Q4 2020), there is even more trace, this time in a Git development environment.

See commit 4441f42 (09 Sep 2020) by Han-Wen Nienhuys (hanwen).
(Merged by Junio C Hamano -- gitster -- in commit c9a04f0, 22 Sep 2020)

refs: add GIT_TRACE_REFS debugging mechanism

Signed-off-by: Han-Wen Nienhuys

When set in the environment, GIT_TRACE_REFS makes git print operations and results as they flow through the ref storage backend. This helps debug discrepancies between different ref backends.

Example:

$ GIT_TRACE_REFS="1" ./git branch
15:42:09.769631 refs/debug.c:26         ref_store for .git
15:42:09.769681 refs/debug.c:249        read_raw_ref: HEAD: 0000000000000000000000000000000000000000 (=> refs/heads/ref-debug) type 1: 0
15:42:09.769695 refs/debug.c:249        read_raw_ref: refs/heads/ref-debug: 3a238e539bcdfe3f9eb5010fd218640c1b499f7a (=> refs/heads/ref-debug) type 0: 0
15:42:09.770282 refs/debug.c:233        ref_iterator_begin: refs/heads/ (0x1)
15:42:09.770290 refs/debug.c:189        iterator_advance: refs/heads/b4 (0)
15:42:09.770295 refs/debug.c:189        iterator_advance: refs/heads/branch3 (0)

git now includes in its man page:

GIT_TRACE_REFS

Enables trace messages for operations on the ref database. See GIT_TRACE for available trace output options.


With Git 2.30 (Q1 2021), like die() and error(), a call to warning() will also trigger a trace2 event.

See commit 0ee10fd (23 Nov 2020) by Jonathan Tan (jhowtan).
(Merged by Junio C Hamano -- gitster -- in commit 2aeafbc, 08 Dec 2020)

Examples related to git

Does the target directory for a git clone have to match the repo name? Git fatal: protocol 'https' is not supported Git is not working after macOS Update (xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools) git clone: Authentication failed for <URL> destination path already exists and is not an empty directory SSL_connect: SSL_ERROR_SYSCALL in connection to github.com:443 GitLab remote: HTTP Basic: Access denied and fatal Authentication How can I switch to another branch in git? VS 2017 Git Local Commit DB.lock error on every commit How to remove an unpushed outgoing commit in Visual Studio?

Examples related to debugging

How do I enable logging for Spring Security? How to run or debug php on Visual Studio Code (VSCode) How do you debug React Native? How do I debug "Error: spawn ENOENT" on node.js? How can I inspect the file system of a failed `docker build`? Swift: print() vs println() vs NSLog() JavaScript console.log causes error: "Synchronous XMLHttpRequest on the main thread is deprecated..." How to debug Spring Boot application with Eclipse? Unfortunately MyApp has stopped. How can I solve this? 500 internal server error, how to debug

Examples related to trace

When tracing out variables in the console, How to create a new line? How can I debug git/git-shell related problems? How to echo shell commands as they are executed Logging best practices How can I add (simple) tracing in C#?