From a77b316fa358b0b13c0363d0d456c4b04bfc5cda Mon Sep 17 00:00:00 2001 From: Marcel Kapfer Date: Mon, 18 Nov 2019 15:56:38 +0100 Subject: [PATCH] Import Debian changes 0.0~git20180821.e1733b1-1 git-fat (0.0~git20180821.e1733b1-1) unstable; urgency=low * Initial release. --- debian/changelog | 5 + debian/compat | 1 + debian/control | 21 +++ debian/copyright | 60 +++++++ debian/install | 1 + debian/manpage.1 | 352 ++++++++++++++++++++++++++++++++++++++++++ debian/patches/series | 1 + debian/rules | 4 + debian/source/format | 1 + debian/watch | 1 + 10 files changed, 447 insertions(+) create mode 100644 debian/changelog create mode 100644 debian/compat create mode 100644 debian/control create mode 100644 debian/copyright create mode 100644 debian/install create mode 100644 debian/manpage.1 create mode 100644 debian/patches/series create mode 100755 debian/rules create mode 100644 debian/source/format create mode 100644 debian/watch diff --git a/debian/changelog b/debian/changelog new file mode 100644 index 0000000..c385bb2 --- /dev/null +++ b/debian/changelog @@ -0,0 +1,5 @@ +git-fat (0.0~git20180821.e1733b1-1) unstable; urgency=low + + * Initial release. 
+ + -- Marcel Kapfer Mon, 18 Nov 2019 15:56:38 +0100 diff --git a/debian/compat b/debian/compat new file mode 100644 index 0000000..b4de394 --- /dev/null +++ b/debian/compat @@ -0,0 +1 @@ +11 diff --git a/debian/control b/debian/control new file mode 100644 index 0000000..c0dc761 --- /dev/null +++ b/debian/control @@ -0,0 +1,21 @@ +Source: git-fat +Section: vcs +Priority: optional +Maintainer: Marcel Kapfer +Build-Depends: debhelper (>=11~) +Standards-Version: 4.1.4 +Homepage: https://github.com/jedbrown/git-fat + +Package: git-fat +Architecture: any +Multi-Arch: foreign +Depends: ${misc:Depends}, ${shlibs:Depends}, rsync, python2 +Description: Simple way to handle fat files without committing them to git + Checking large binary files into a source repository (Git or otherwise) is a + bad idea because repository size quickly becomes unreasonable. Even if the + instantaneous working tree stays manageable, preserving repository integrity + requires all binary files in the entire project history, which given the + typically poor compression of binary diffs, implies that the repository size + will become impractically large. Some people recommend checking binaries into + different repositories or even not versioning them at all, but these are not + satisfying solutions for most workflows. diff --git a/debian/copyright b/debian/copyright new file mode 100644 index 0000000..e00bdff --- /dev/null +++ b/debian/copyright @@ -0,0 +1,60 @@ +Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ +Upstream-Name: git-fat +Source: https://github.com/jedbrown/git-fat/ +# +# Please double check copyright with the licensecheck(1) command. + +Files: * +Copyright: 2012 Jed Brown +License: BSD-2-clause + +Files: debian/* +Copyright: 2019 Marcel Kapfer +License: Expat + +License: BSD-2-clause + Copyright (c) 2012, Jed Brown + All rights reserved. + . 
+ Redistribution and use in source and binary forms, with or without modification, + are permitted provided that the following conditions are met: + . + * Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright notice, this + list of conditions and the following disclaimer in the documentation and/or + other materials provided with the distribution. + . + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND + ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR + ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON + ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +License: Expat + Copyright (c) 2019 Marcel Kapfer + . + Permission is hereby granted, free of charge, to any person obtaining + a copy of this software and associated documentation files (the + "Software"), to deal in the Software without restriction, including + without limitation the rights to use, copy, modify, merge, publish, + distribute, sublicense, and/or sell copies of the Software, and to + permit persons to whom the Software is furnished to do so, subject to + the following conditions: + . + The above copyright notice and this permission notice shall be included + in all copies or substantial portions of the Software. + . 
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. + IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY + CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + diff --git a/debian/install b/debian/install new file mode 100644 index 0000000..d6ce6c2 --- /dev/null +++ b/debian/install @@ -0,0 +1 @@ +git-fat usr/bin/ diff --git a/debian/manpage.1 b/debian/manpage.1 new file mode 100644 index 0000000..bec9639 --- /dev/null +++ b/debian/manpage.1 @@ -0,0 +1,352 @@ +.SH Introduction +.PP +Checking large binary files into a source repository (Git or otherwise) +is a bad idea because repository size quickly becomes unreasonable. +Even if the instantaneous working tree stays manageable, preserving +repository integrity requires all binary files in the entire project +history, which given the typically poor compression of binary diffs, +implies that the repository size will become impractically large. +Some people recommend checking binaries into different repositories or +even not versioning them at all, but these are not satisfying solutions +for most workflows. 
+.SS Features of \f[C]git\-fat\f[R] +.IP \[bu] 2 +clones of the source repository are small and fast because no binaries +are transferred, yet fully functional with complete metadata and +incremental retrieval (\f[C]git clone \-\-depth\f[R] has limited +granularity and couples metadata to content) +.IP \[bu] 2 +\f[C]git\-fat\f[R] supports the same workflow for large binaries and +traditionally versioned files, but internally manages the \[lq]fat\[rq] +files separately +.IP \[bu] 2 +\f[C]git\-bisect\f[R] works properly even when versions of the binary +files change over time +.IP \[bu] 2 +selective control of which large files to pull into the local store +.IP \[bu] 2 +local fat object stores can be shared between multiple clones, even by +different users +.IP \[bu] 2 +can easily support fat object stores distributed across multiple hosts +.IP \[bu] 2 +depends only on stock Python and rsync +.SS Related projects +.IP \[bu] 2 +git\-annex (http://git-annex.branchable.com) is a far more comprehensive +solution, but with less transparent workflow and with more dependencies. +.IP \[bu] 2 +git\-media (https://github.com/schacon/git-media) adopts a similar +approach to \f[C]git\-fat\f[R], but with a different synchronization +philosophy and with many Ruby dependencies. +.SH Installation and configuration +.PP +Place \f[C]git\-fat\f[R] in your \f[C]PATH\f[R]. +.PP +Edit (or create) \f[C].gitattributes\f[R] to regard any desired +extensions as fat files. +.IP +.nf +\f[C] +$ cd path\-to\-your\-repository +$ cat >> .gitattributes +*.png filter=fat \-crlf +*.jpg filter=fat \-crlf +*.gz filter=fat \-crlf +\[ha]D +\f[R] +.fi +.PP +Run \f[C]git fat init\f[R] to activate the extension. +Now add and commit as usual. +Matched files will be transparently stored externally, but will appear +complete in the working tree. +.PP +Set a remote store for the fat objects by editing \f[C].gitfat\f[R]. 
+.IP +.nf +\f[C] +[rsync] +remote = your.remote\-host.org:/share/fat\-store +\f[R] +.fi +.PP +This file should typically be committed to the repository so that others +will automatically have their remote set. +This remote address can use any protocol supported by rsync. +.PP +Most users will configure it to use remote ssh in a directory with +shared access. +To do this, set the \f[C]sshuser\f[R] and \f[C]sshport\f[R] variables in +the \f[C].gitfat\f[R] configuration file. +For example, to use rsync with ssh, with the default port (22) and +authenticate with the user \[lq]\f[I]fat\f[R]\[rq], your configuration +would look like this: +.IP +.nf +\f[C] +[rsync] +remote = your.remote\-host.org:/share/fat\-store +sshuser = fat +\f[R] +.fi +.SH A worked example +.PP +Before we start, let\[cq]s turn on verbose reporting so we can see +what\[cq]s happening. +Without this environment variable, all the output lines starting with +\f[C]git\-fat\f[R] will not be shown. +.IP +.nf +\f[C] +$ export GIT_FAT_VERBOSE=1 +\f[R] +.fi +.PP +First, we create a repository and configure it for use with +\f[C]git\-fat\f[R]. +.IP +.nf +\f[C] +$ git init repo +Initialized empty Git repository in /tmp/repo/.git/ +$ cd repo +$ git fat init +$ cat > .gitfat +[rsync] +remote = localhost:/tmp/fat\-store +$ mkdir \-p /tmp/fat\-store # make sure the remote directory exists +$ echo \[aq]*.gz filter=fat \-crlf\[aq] > .gitattributes +$ git add .gitfat .gitattributes +$ git commit \-m\[aq]Initial repository\[aq] +[master (root\-commit) eb7facb] Initial repository + 2 files changed, 3 insertions(+) + create mode 100644 .gitattributes + create mode 100644 .gitfat +\f[R] +.fi +.PP +Now we add a binary file whose name matches the pattern we set in +\f[C].gitattributes\f[R]. 
+.IP +.nf +\f[C] +$ curl https://nodeload.github.com/jedbrown/git\-fat/tar.gz/master \-o master.tar.gz + % Total % Received % Xferd Average Speed Time Time Time Current + Dload Upload Total Spent Left Speed +100 6449 100 6449 0 0 7741 0 \-\-:\-\-:\-\- \-\-:\-\-:\-\- \-\-:\-\-:\-\- 9786 +$ git add master.tar.gz +git\-fat filter\-clean: caching to /tmp/repo/.git/fat/objects/b3489819f81603b4c04e8ed134b80bace0810324 +$ git commit \-m\[aq]Added master.tar.gz\[aq] +[master b85a96f] Added master.tar.gz +git\-fat filter\-clean: caching to /tmp/repo/.git/fat/objects/b3489819f81603b4c04e8ed134b80bace0810324 + 1 file changed, 1 insertion(+) + create mode 100644 master.tar.gz +\f[R] +.fi +.PP +The patch itself is very simple and does not include the binary. +.IP +.nf +\f[C] +$ git show \-\-pretty=oneline HEAD +918063043a6156172c2ad66478c6edd5c7df0217 Add master.tar.gz +diff \-\-git a/master.tar.gz b/master.tar.gz +new file mode 100644 +index 0000000..12f7d52 +\-\-\- /dev/null ++++ b/master.tar.gz +\[at]\[at] \-0,0 +1 \[at]\[at] ++#$# git\-fat 1f218834a137f7b185b498924e7a030008aee2ae +\f[R] +.fi +.SS Pushing fat files +.PP +Now let\[cq]s push our fat files using the rsync configuration that we +set up earlier. +.IP +.nf +\f[C] +$ git fat push +Pushing to localhost:/tmp/fat\-store +building file list ... +1 file to consider + +sent 61 bytes received 12 bytes 48.67 bytes/sec +total size is 6449 speedup is 88.34 +\f[R] +.fi +.PP +We might normally set a remote now and push the git repository. +.SS Cloning and pulling +.PP +Now let\[cq]s look at what happens when we clone. +.IP +.nf +\f[C] +$ cd .. +$ git clone repo repo2 +Cloning into \[aq]repo2\[aq]... +done. 
+$ cd repo2 +$ git fat init # don\[aq]t forget +$ ls \-l # file is just a placeholder +total 4 +\-rw\-r\-\-r\-\- 1 jed users 53 Nov 25 22:42 master.tar.gz +$ cat master.tar.gz # holds the SHA1 of the file +#$# git\-fat 1f218834a137f7b185b498924e7a030008aee2ae +\f[R] +.fi +.PP +We can always get a summary of what fat objects are missing in our local +cache. +.IP +.nf +\f[C] +Orphan objects: +1f218834a137f7b185b498924e7a030008aee2ae +\f[R] +.fi +.PP +Now get any objects referenced by our current \f[C]HEAD\f[R]. +This command also accepts the \f[C]\-\-all\f[R] option to pull full +history, or a revision to pull selected history. +.IP +.nf +\f[C] +$ git fat pull +receiving file list ... +1 file to consider +1f218834a137f7b185b498924e7a030008aee2ae + 6449 100% 6.15MB/s 0:00:00 (xfer#1, to\-check=0/1) + +sent 30 bytes received 6558 bytes 4392.00 bytes/sec +total size is 6449 speedup is 0.98 +Restoring 1f218834a137f7b185b498924e7a030008aee2ae \-> master.tar.gz +git\-fat filter\-smudge: restoring from /tmp/repo2/.git/fat/objects/1f218834a137f7b185b498924e7a030008aee2ae +\f[R] +.fi +.PP +Everything is in place +.IP +.nf +\f[C] +$ git status +git\-fat filter\-clean: caching to /tmp/repo2/.git/fat/objects/1f218834a137f7b185b498924e7a030008aee2ae +# On branch master +nothing to commit, working directory clean +$ ls \-l # recovered the full file +total 8 +\-rw\-r\-\-r\-\- 1 jed users 6449 Nov 25 17:10 master.tar.gz +\f[R] +.fi +.SS Summary +.IP \[bu] 2 +Set the \[lq]fat\[rq] file types in \f[C].gitattributes\f[R]. +.IP \[bu] 2 +Use normal git commands to interact with the repository without thinking +about what files are fat and non\-fat. +The fat files will be treated specially. +.IP \[bu] 2 +Synchronize fat files with \f[C]git fat push\f[R] and +\f[C]git fat pull\f[R]. +.SS Retroactive import using \f[C]git filter\-branch\f[R] [Experimental] +.PP +Sometimes large objects were added to a repository by accident or for +lack of a better place to put them. 
+\f[I]If\f[R] you are willing to rewrite history, forcing everyone to +reclone, you can retroactively manage those files with +\f[C]git fat\f[R]. +Be sure that you understand the consequences of +\f[C]git filter\-branch\f[R] before attempting this. +This feature is experimental and irreversible, so be doubly careful with +backups. +.SS Step 1: Locate the fat files +.PP +Run \f[C]git fat find THRESH_BYTES > fat\-files\f[R] and inspect +\f[C]fat\-files\f[R] in an editor. +Lines will be sorted by the maximum object size that has been at each +path, and look like +.IP +.nf +\f[C] +something.big filter=fat \-text # 8154677 1 +\f[R] +.fi +.PP +where the first number after the \f[C]#\f[R] is the number of bytes and +the second number is the number of modifications that path has seen. +You will normally filter out some of these paths using grep and/or an +editor. +When satisfied, remove the ends of the lines (including the \f[C]#\f[R]) +and append to \f[C].gitattributes\f[R]. +It\[cq]s best to \f[C]git add .gitattributes\f[R] and commit at this +time (likely enrolling some extant files into \f[C]git fat\f[R]). +.SS Step 2: \f[C]filter\-branch\f[R] +.PP +Copy \f[C].gitattributes\f[R] to \f[C]/tmp/fat\-filter\-files\f[R] and +edit to remove everything after the file name (e.g., +\f[C]sed s/ \[rs]+filter=fat.*$//\f[R]). +Currently, this may only contain exact paths relative to the root of the +repository. +Finally, run +.IP +.nf +\f[C] +git filter\-branch \-\-index\-filter \[rs] + \[aq]git fat index\-filter /tmp/fat\-filter\-files \-\-manage\-gitattributes\[aq] \[rs] + \-\-tag\-name\-filter cat \-\- \-\-all +\f[R] +.fi +.PP +(You can remove the \f[C]\-\-manage\-gitattributes\f[R] option if you +don\[cq]t want to append all the files being enrolled in +\f[C]git fat\f[R] to \f[C].gitattributes\f[R]; however, future users +would need to use \f[C].git/info/attributes\f[R] to have the +\f[C]git fat\f[R] filters run.) 
When this finishes, inspect to see if +everything is in order and follow the Checklist for Shrinking a +Repository (http://www.kernel.org/pub/software/scm/git/docs/git-filter-branch.html#_checklist_for_shrinking_a_repository) +in the \f[C]git filter\-branch\f[R] man page, typically +\f[C]git clone file:///path/to/repo\f[R]. +Be sure to \f[C]git fat push\f[R] from the original repository. +.PP +See the script \f[C]test\-retroactive.sh\f[R] for an example of +cleaning. +.SS Implementation notes +.PP +The actual binary files are stored in \f[C].git/fat/objects\f[R], +leaving \f[C].git/objects\f[R] nice and small. +.IP +.nf +\f[C] +$ du \-bs .git/objects +2212 .git/objects/ +$ ls \-l .git/fat/objects # This is where the file actually goes, but that\[aq]s not important +total 8 +\-rw\-\-\-\-\-\-\- 1 jed users 6449 Nov 25 17:01 1f218834a137f7b185b498924e7a030008aee2ae +\f[R] +.fi +.PP +If you have multiple clones that access the same filesystem, you can +make \f[C].git/fat/objects\f[R] a symlink to a common location, in which +case all content will be available in all repositories without extra +copies. +You still need to \f[C]git fat push\f[R] to make it available to others. +.SH Some refinements +.IP \[bu] 2 +Allow pulling and pushing only select files +.IP \[bu] 2 +Relate orphan objects to file system +.IP \[bu] 2 +Put some more useful message in smudged (working tree) version of +missing files. +.IP \[bu] 2 +More friendly configuration for multiple fat remotes +.IP \[bu] 2 +Make commands safer in presence of a dirty tree. +.IP \[bu] 2 +Private setting of a different remote. +.IP \[bu] 2 +Gracefully handle unmanaged files when the filter is called (either +legacy files or files matching the pattern that should for some reason not +be treated as fat). diff --git a/debian/patches/series b/debian/patches/series new file mode 100644 index 0000000..4a97dfa --- /dev/null +++ b/debian/patches/series @@ -0,0 +1 @@ +# You must remove unused comment lines for the released package. 
diff --git a/debian/rules b/debian/rules new file mode 100755 index 0000000..2d33f6a --- /dev/null +++ b/debian/rules @@ -0,0 +1,4 @@ +#!/usr/bin/make -f + +%: + dh $@ diff --git a/debian/source/format b/debian/source/format new file mode 100644 index 0000000..163aaf8 --- /dev/null +++ b/debian/source/format @@ -0,0 +1 @@ +3.0 (quilt) diff --git a/debian/watch b/debian/watch new file mode 100644 index 0000000..9e7c0da --- /dev/null +++ b/debian/watch @@ -0,0 +1 @@ +version=3