Wednesday, June 21, 2023

git add


I am curious about what Git does for us behind the scenes when we run the 'git add' command. Below are the logs from my experiment.


Reference: Git Book


Preparation


Create a brand new folder and run the 'git init' command.


Step 1: 

Dump the .git tree structure first.


$ tree .git/
.git/
├── HEAD
├── branches
├── config
├── description
├── hooks
│   ├── applypatch-msg.sample
│   ├── commit-msg.sample
│   ├── fsmonitor-watchman.sample
│   ├── post-update.sample
│   ├── pre-applypatch.sample
│   ├── pre-commit.sample
│   ├── pre-merge-commit.sample
│   ├── pre-push.sample
│   ├── pre-rebase.sample
│   ├── pre-receive.sample
│   ├── prepare-commit-msg.sample
│   ├── push-to-checkout.sample
│   └── update.sample
├── info
│   └── exclude
├── objects
│   ├── info
│   └── pack
└── refs
    ├── heads
    └── tags

9 directories, 17 files


Step 2:

Create a new file and use the 'git add' command to add it to the 'staging area' (Changes to be committed).


    $ echo 'test' > test.txt
    $ git add .


Check the .git tree structure.


    $ tree .git/
    .git/
    ├── HEAD
    ├── branches
    ├── config
    ├── description
    ├── hooks
    │   ├── applypatch-msg.sample
    │   ├── commit-msg.sample
    │   ├── fsmonitor-watchman.sample
    │   ├── post-update.sample
    │   ├── pre-applypatch.sample
    │   ├── pre-commit.sample
    │   ├── pre-merge-commit.sample
    │   ├── pre-push.sample
    │   ├── pre-rebase.sample
    │   ├── pre-receive.sample
    │   ├── prepare-commit-msg.sample
    │   ├── push-to-checkout.sample
    │   └── update.sample
    ├── index (a new file)
    ├── info
    │   └── exclude
    ├── objects
    │   ├── 9d (a new directory)
    │   │   └── aeafb9864cf43055ae93beb0afd6c7d144bfa4 (a new file for git blob object)
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags

    10 directories, 19 files

After running the 'git add' command, one directory and two files have been created.

    1 directory:

        9d

    2 files:

        aeafb9864cf43055ae93beb0afd6c7d144bfa4

        index
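The new directory and the new file under 'objects' are two halves of one 40-character object name: the first two characters name the directory and the remaining 38 name the file. A minimal sketch of that layout follows; 'loose_object_path' is a hypothetical helper written for this post, not a Git API.

```python
from pathlib import PurePosixPath

def loose_object_path(object_id: str, git_dir: str = ".git") -> PurePosixPath:
    """Build the path where a loose object with this name would be stored.

    Hypothetical helper for illustration; it only splits the name.
    """
    prefix, rest = object_id[:2], object_id[2:]
    return PurePosixPath(git_dir) / "objects" / prefix / rest

print(loose_object_path("9daeafb9864cf43055ae93beb0afd6c7d144bfa4"))
# .git/objects/9d/aeafb9864cf43055ae93beb0afd6c7d144bfa4
```

Splitting on the first two characters keeps any single directory from collecting every object in the repository.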


9daeafb9864cf43055ae93beb0afd6c7d144bfa4 is a SHA-1 hash. Git calculates it from a header made of the object type ('blob') and the content length, followed by the file content. The hash is also the name of the Git object: its first two characters (9d) become the directory name, and the remaining 38 characters become the filename of the blob object that stores the content. (Refer to the 'Object Storage' topic in the Git Book.)
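The hash can be reproduced outside Git. The sketch below follows the header format described in the Git Book's 'Object Storage' chapter (type, a space, the size in bytes, a NUL byte, then the content). It assumes a repository using the default SHA-1 object format; 'blob_id' is a name made up for this example.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Return the SHA-1 name Git would give a blob holding this content."""
    # Header: object type, a space, the content length in bytes, a NUL byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# echo 'test' writes the four letters plus a trailing newline (5 bytes).
print(blob_id(b"test\n"))
# 9daeafb9864cf43055ae93beb0afd6c7d144bfa4
```

Note that the filename never enters the calculation, only the type, the size, and the bytes of the file.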


We can use the 'git cat-file' command to show the file content via the SHA-1 key:


$ git cat-file -p 9daeaf
test


Check its git object type:


$ git cat-file -t 9daeaf
blob


Check its git object size in bytes (the four characters of 'test' plus the newline that echo appends):


$ git cat-file -s 9daeaf
5
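The three 'git cat-file' calls above read different parts of the same stored object. According to the Git Book, a loose object is the zlib-compressed header followed by the content. The sketch below parses such a byte string; to stay self-contained it builds the compressed bytes in memory instead of reading the real file under .git/objects, and 'parse_loose_object' is a hypothetical helper.

```python
import zlib

def parse_loose_object(raw: bytes):
    """Split zlib-compressed loose-object bytes into (type, size, content)."""
    data = zlib.decompress(raw)
    # The header ends at the first NUL byte; everything after it is content.
    header, _, content = data.partition(b"\0")
    obj_type, size = header.split(b" ")
    return obj_type.decode("ascii"), int(size), content

# Stand-in for the bytes of the 9daeaf... object file (assumption: built here).
stored = zlib.compress(b"blob 5\0test\n")

obj_type, size, content = parse_loose_object(stored)
print(obj_type)                   # like: git cat-file -t
print(size)                       # like: git cat-file -s
print(content.decode(), end="")   # like: git cat-file -p
```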


The index (staging area) sits between the working directory and the Git directory (repository). It records the path of each staged file together with its mode and the SHA-1 hash of the file snapshot.


Use 'git ls-files -s' to show the contents of the index file.


$ git ls-files -s
100644 9daeafb9864cf43055ae93beb0afd6c7d144bfa4 0       test.txt

Together, the index file and the blob objects tell us everything about a staged file: the index supplies the filename and the hash, and the blob object with that hash supplies the content.
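Each line of 'git ls-files -s' has four fields: the file mode, the object name, the stage number, and the path, with a tab before the path. The sketch below splits one such line. 'IndexEntry' and 'parse_ls_files_stage' are names invented for this example, and it assumes a plain path that Git did not need to quote.

```python
from typing import NamedTuple

class IndexEntry(NamedTuple):
    mode: str        # e.g. 100644 for a regular, non-executable file
    object_id: str   # SHA-1 name of the blob holding the content
    stage: int       # 0 when the path is not part of a merge conflict
    path: str        # path relative to the repository root

def parse_ls_files_stage(line: str) -> IndexEntry:
    """Parse one '<mode> <object> <stage><TAB><path>' line."""
    meta, path = line.split("\t", 1)
    mode, object_id, stage = meta.split(" ")
    return IndexEntry(mode, object_id, int(stage), path)

entry = parse_ls_files_stage(
    "100644 9daeafb9864cf43055ae93beb0afd6c7d144bfa4 0\ttest.txt"
)
print(entry.path, "->", entry.object_id)
# test.txt -> 9daeafb9864cf43055ae93beb0afd6c7d144bfa4
```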


Step 3:

Add a new file with the same content.


    $ echo 'test' > test2.txt
    $ git add .


Check the .git tree structure.


    $ tree .git/
    .git/
    ├── HEAD
    ├── branches
    ├── config
    ├── description
    ├── hooks
    │   ├── applypatch-msg.sample
    │   ├── commit-msg.sample
    │   ├── fsmonitor-watchman.sample
    │   ├── post-update.sample
    │   ├── pre-applypatch.sample
    │   ├── pre-commit.sample
    │   ├── pre-merge-commit.sample
    │   ├── pre-push.sample
    │   ├── pre-rebase.sample
    │   ├── pre-receive.sample
    │   ├── prepare-commit-msg.sample
    │   ├── push-to-checkout.sample
    │   └── update.sample
    ├── index
    ├── info
    │   └── exclude
    ├── objects
    │   ├── 9d
    │   │   └── aeafb9864cf43055ae93beb0afd6c7d144bfa4
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags

    10 directories, 19 files

Because the two files (test.txt and test2.txt) have the same content, Git does not need to create an extra blob object for the new file. (The blob object does not record the filename.)


Check the index file.

We notice that those two files refer to the same SHA-1 hash key because they have the same content.


$ git ls-files -s
100644 9daeafb9864cf43055ae93beb0afd6c7d144bfa4 0 test.txt
100644 9daeafb9864cf43055ae93beb0afd6c7d144bfa4 0 test2.txt
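This sharing falls out of naming objects by their content. Here is a toy model, assuming two plain dictionaries as stand-ins for .git/objects and .git/index; 'stage' is a hypothetical function, not how Git is implemented.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """SHA-1 over the 'blob <size>' header, a NUL byte, and the content."""
    return hashlib.sha1(b"blob %d\0" % len(content) + content).hexdigest()

objects = {}   # object name -> content (stand-in for .git/objects)
index = {}     # path -> object name   (stand-in for .git/index)

def stage(path: str, content: bytes) -> None:
    oid = blob_id(content)
    objects.setdefault(oid, content)   # identical content maps to the same key
    index[path] = oid

stage("test.txt", b"test\n")
stage("test2.txt", b"test\n")

print(len(index), "index entries,", len(objects), "object")
# 2 index entries, 1 object
```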


Step 4:

Remove 'test2.txt' and replace the 'test.txt' file with different content.


    $ rm test2.txt
    $ echo 'hello' > test.txt
    $ git add .


Check the .git tree structure.


    $ tree .git/
    .git/
    ├── HEAD
    ├── branches
    ├── config
    ├── description
    ├── hooks
    │   ├── applypatch-msg.sample
    │   ├── commit-msg.sample
    │   ├── fsmonitor-watchman.sample
    │   ├── post-update.sample
    │   ├── pre-applypatch.sample
    │   ├── pre-commit.sample
    │   ├── pre-merge-commit.sample
    │   ├── pre-push.sample
    │   ├── pre-rebase.sample
    │   ├── pre-receive.sample
    │   ├── prepare-commit-msg.sample
    │   ├── push-to-checkout.sample
    │   └── update.sample
    ├── index
    ├── info
    │   └── exclude
    ├── objects
    │   ├── 9d
    │   │   └── aeafb9864cf43055ae93beb0afd6c7d144bfa4
    │   ├── ce (a new directory)
    │   │   └── 013625030ba8dba906f756967f9e9ca394464a (a new file for git blob object)
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags

    11 directories, 20 files

Git creates a new blob object to store the new content of 'test.txt'.

Also, 'git add' does not delete existing Git objects, even when the files they came from are changed or deleted in the working directory: the blob 9daeafb is still there, although the index no longer refers to it.


Check the index file. 

The filename 'test.txt' refers to a new SHA-1 hash key.


$ git ls-files -s
100644 ce013625030ba8dba906f756967f9e9ca394464a 0 test.txt
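The behaviour seen in this step can be imitated with a toy model in which staging only ever adds objects, while the index is brought in line with the working directory. 'add_all' is a hypothetical stand-in for 'git add .', and the dictionaries are assumptions made for this sketch rather than Git's real data structures.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """SHA-1 over the 'blob <size>' header, a NUL byte, and the content."""
    return hashlib.sha1(b"blob %d\0" % len(content) + content).hexdigest()

def add_all(worktree: dict, index: dict, objects: dict) -> None:
    """Toy stand-in for `git add .` (hypothetical, not Git's code)."""
    for path, content in worktree.items():
        oid = blob_id(content)
        objects.setdefault(oid, content)   # objects are only ever added
        index[path] = oid                  # the path now points at this snapshot
    for path in list(index):
        if path not in worktree:
            del index[path]                # deleted files leave the index

index, objects = {}, {}

# State after Step 3: two files with the same content.
add_all({"test.txt": b"test\n", "test2.txt": b"test\n"}, index, objects)

# Step 4: test2.txt is removed and test.txt is rewritten.
add_all({"test.txt": b"hello\n"}, index, objects)

print(index)            # only test.txt, pointing at ce0136...
print(sorted(objects))  # both 9daeaf... and ce0136... are still stored
```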


Step 5:

Create a new directory and create a new file under it.


    $ mkdir sub
    $ cd sub
    $ echo 'hello from subfolder' > test2.txt
    $ cd ..
    $ git add .

Check the .git tree structure.


    $ tree .git/
    .git/
    ├── HEAD
    ├── branches
    ├── config
    ├── description
    ├── hooks
    │   ├── applypatch-msg.sample
    │   ├── commit-msg.sample
    │   ├── fsmonitor-watchman.sample
    │   ├── post-update.sample
    │   ├── pre-applypatch.sample
    │   ├── pre-commit.sample
    │   ├── pre-merge-commit.sample
    │   ├── pre-push.sample
    │   ├── pre-rebase.sample
    │   ├── pre-receive.sample
    │   ├── prepare-commit-msg.sample
    │   ├── push-to-checkout.sample
    │   └── update.sample
    ├── index
    ├── info
    │   └── exclude
    ├── objects
    │   ├── 9d
    │   │   └── aeafb9864cf43055ae93beb0afd6c7d144bfa4
    │   ├── ce
    │   │   └── 013625030ba8dba906f756967f9e9ca394464a
    │   ├── ed (a new directory)
    │   │   └── 91b9af9b0a3e32ef72b93a26677d112583504f (a new file for git blob object)
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags

    12 directories, 21 files


Check its git object content:


$ git cat-file -p ed91b9
hello from subfolder


Check its git object type:


$ git cat-file -t ed91b9
blob

The git blob object alone tells us nothing about the directory the file lives in.


Check the index file. 

The directory information is stored here, as part of the file path.


$ git ls-files -s
100644 ed91b9af9b0a3e32ef72b93a26677d112583504f 0 sub/test2.txt
100644 ce013625030ba8dba906f756967f9e9ca394464a 0 test.txt
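To see where the directory name ends up, the sketch below walks a scratch directory and records each file under its path relative to the root, while the object name is computed from the content alone. It uses a temporary directory instead of a real repository, and 'build_index' is a hypothetical helper that imitates only the path-to-hash mapping of the index.

```python
import hashlib
import tempfile
from pathlib import Path

def blob_id(content: bytes) -> str:
    """SHA-1 over the 'blob <size>' header, a NUL byte, and the content."""
    return hashlib.sha1(b"blob %d\0" % len(content) + content).hexdigest()

def build_index(root: Path) -> dict:
    """Map each file's '/'-separated path relative to root to its blob name."""
    found = {}
    for file in root.rglob("*"):
        if file.is_file():
            # The directory survives only here, in the key; the hash ignores it.
            found[file.relative_to(root).as_posix()] = blob_id(file.read_bytes())
    return dict(sorted(found.items()))

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "sub" / "test2.txt").write_bytes(b"hello from subfolder\n")
    (root / "test.txt").write_bytes(b"hello\n")
    entries = build_index(root)

for path, oid in entries.items():
    print(oid, path)
```

The entries come out sorted by path, which is also why 'sub/test2.txt' is listed before 'test.txt' in the output above.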

