Import Debian changes 0.0~git20180821.e1733b1-1

git-fat (0.0~git20180821.e1733b1-1) unstable; urgency=low

  * Initial release.
Marcel Kapfer 2019-11-18 15:56:38 +01:00
parent eb2d41f476
commit a77b316fa3
10 changed files with 447 additions and 0 deletions

5
debian/changelog vendored Normal file

@@ -0,0 +1,5 @@
git-fat (0.0~git20180821.e1733b1-1) unstable; urgency=low

  * Initial release.

 -- Marcel Kapfer <opensource@mmk2410.org>  Mon, 18 Nov 2019 15:56:38 +0100
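The version string in this changelog entry, 0.0~git20180821.e1733b1-1, can be split mechanically. Debian versions separate the upstream part from the Debian revision at the last hyphen. The reading of the `~git<date>.<commit>` suffix as a snapshot date and an abbreviated commit hash is an assumption based on a common packaging convention, not something this commit states. The sketch below is illustrative only: `split_version` and `SNAPSHOT_RE` are hypothetical names, and epochs are not handled.

```python
import re

# Assumed snapshot convention: <base>~git<YYYYMMDD>.<abbreviated commit hash>
SNAPSHOT_RE = re.compile(r"(?P<base>.+)~git(?P<date>\d{8})\.(?P<commit>[0-9a-f]+)")


def split_version(version: str) -> dict:
    """Split a Debian version (without epoch) into its visible parts."""
    # The Debian revision is everything after the last hyphen.
    upstream, sep, revision = version.rpartition("-")
    if not sep:  # no hyphen at all: there is no Debian revision
        upstream, revision = version, ""
    info = {"upstream": upstream, "revision": revision}
    # Add snapshot details only when the upstream part fits the assumed pattern.
    match = SNAPSHOT_RE.fullmatch(upstream)
    if match:
        info.update(match.groupdict())
    return info


if __name__ == "__main__":
    for key, value in split_version("0.0~git20180821.e1733b1-1").items():
        print(f"{key}: {value}")
```

A version that does not follow the snapshot pattern simply yields the `upstream` and `revision` keys.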

1
debian/compat vendored Normal file

@@ -0,0 +1 @@
11

21
debian/control vendored Normal file

@@ -0,0 +1,21 @@
Source: git-fat
Section: vcs
Priority: optional
Maintainer: Marcel Kapfer <opensource@mmk2410.org>
Build-Depends: debhelper (>=11~)
Standards-Version: 4.1.4
Homepage: https://github.com/jedbrown/git-fat

Package: git-fat
Architecture: any
Multi-Arch: foreign
Depends: ${misc:Depends}, ${shlibs:Depends}, rsync, python2
Description: Simple way to handle fat files without committing them to git
 Checking large binary files into a source repository (Git or otherwise) is a
 bad idea because repository size quickly becomes unreasonable. Even if the
 instantaneous working tree stays manageable, preserving repository integrity
 requires all binary files in the entire project history, which given the
 typically poor compression of binary diffs, implies that the repository size
 will become impractically large. Some people recommend checking binaries into
 different repositories or even not versioning them at all, but these are not
 satisfying solutions for most workflows.

60
debian/copyright vendored Normal file

@@ -0,0 +1,60 @@
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: git-fat
Source: https://github.com/jedbrown/git-fat/
#
# Please double check copyright with the licensecheck(1) command.

Files: *
Copyright: 2012 Jed Brown <jed@jedbrown.org>
License: BSD-2-clause

Files: debian/*
Copyright: 2019 Marcel Kapfer <opensource@mmk2410.org>
License: Expat

License: BSD-2-clause
 Copyright (c) 2012, Jed Brown
 All rights reserved.
 .
 Redistribution and use in source and binary forms, with or without modification,
 are permitted provided that the following conditions are met:
 .
 * Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.
 * Redistributions in binary form must reproduce the above copyright notice, this
   list of conditions and the following disclaimer in the documentation and/or
   other materials provided with the distribution.
 .
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
 ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
 ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
 ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

License: Expat
 Copyright (c) 2019 Marcel Kapfer <opensource@mmk2410.org>
 .
 Permission is hereby granted, free of charge, to any person obtaining
 a copy of this software and associated documentation files (the
 "Software"), to deal in the Software without restriction, including
 without limitation the rights to use, copy, modify, merge, publish,
 distribute, sublicense, and/or sell copies of the Software, and to
 permit persons to whom the Software is furnished to do so, subject to
 the following conditions:
 .
 The above copyright notice and this permission notice shall be included
 in all copies or substantial portions of the Software.
 .
 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
 CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
 TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

1
debian/install vendored Normal file

@@ -0,0 +1 @@
git-fat usr/bin/

352
debian/manpage.1 vendored Normal file

@@ -0,0 +1,352 @@
.SH Introduction
.PP
Checking large binary files into a source repository (Git or otherwise)
is a bad idea because repository size quickly becomes unreasonable.
Even if the instantaneous working tree stays manageable, preserving
repository integrity requires all binary files in the entire project
history, which given the typically poor compression of binary diffs,
implies that the repository size will become impractically large.
Some people recommend checking binaries into different repositories or
even not versioning them at all, but these are not satisfying solutions
for most workflows.
.SS Features of \f[C]git\-fat\f[R]
.IP \[bu] 2
clones of the source repository are small and fast because no binaries
are transferred, yet fully functional with complete metadata and
incremental retrieval (\f[C]git clone \-\-depth\f[R] has limited
granularity and couples metadata to content)
.IP \[bu] 2
\f[C]git\-fat\f[R] supports the same workflow for large binaries and
traditionally versioned files, but internally manages the \[lq]fat\[rq]
files separately
.IP \[bu] 2
\f[C]git\-bisect\f[R] works properly even when versions of the binary
files change over time
.IP \[bu] 2
selective control of which large files to pull into the local store
.IP \[bu] 2
local fat object stores can be shared between multiple clones, even by
different users
.IP \[bu] 2
can easily support fat object stores distributed across multiple hosts
.IP \[bu] 2
depends only on stock Python and rsync
.SS Related projects
.IP \[bu] 2
git\-annex (http://git-annex.branchable.com) is a far more comprehensive
solution, but with less transparent workflow and with more dependencies.
.IP \[bu] 2
git\-media (https://github.com/schacon/git-media) adopts a similar
approach to \f[C]git\-fat\f[R], but with a different synchronization
philosophy and with many Ruby dependencies.
.SH Installation and configuration
.PP
Place \f[C]git\-fat\f[R] in your \f[C]PATH\f[R].
.PP
Edit (or create) \f[C].gitattributes\f[R] to regard any desired
extensions as fat files.
.IP
.nf
\f[C]
$ cd path\-to\-your\-repository
$ cat >> .gitattributes
*.png filter=fat \-crlf
*.jpg filter=fat \-crlf
*.gz filter=fat \-crlf
\[ha]D
\f[R]
.fi
.PP
Run \f[C]git fat init\f[R] to activate the extension.
Now add and commit as usual.
Matched files will be transparently stored externally, but will appear
complete in the working tree.
.PP
Set a remote store for the fat objects by editing \f[C].gitfat\f[R].
.IP
.nf
\f[C]
[rsync]
remote = your.remote\-host.org:/share/fat\-store
\f[R]
.fi
.PP
This file should typically be committed to the repository so that others
will automatically have their remote set.
This remote address can use any protocol supported by rsync.
.PP
Most users will configure it to use remote ssh in a directory with
shared access.
To do this, set the \f[C]sshuser\f[R] and \f[C]sshport\f[R] variables in
the \f[C].gitfat\f[R] configuration file.
For example, to use rsync over ssh on the default port (22) and
authenticate as the user \[lq]\f[I]fat\f[R]\[rq], your configuration
would look like this:
.IP
.nf
\f[C]
[rsync]
remote = your.remote\-host.org:/share/fat\-store
sshuser = fat
\f[R]
.fi
.SH A worked example
.PP
Before we start, let\[cq]s turn on verbose reporting so we can see
what\[cq]s happening.
Without this environment variable, all the output lines starting with
\f[C]git\-fat\f[R] will not be shown.
.IP
.nf
\f[C]
$ export GIT_FAT_VERBOSE=1
\f[R]
.fi
.PP
First, we create a repository and configure it for use with
\f[C]git\-fat\f[R].
.IP
.nf
\f[C]
$ git init repo
Initialized empty Git repository in /tmp/repo/.git/
$ cd repo
$ git fat init
$ cat > .gitfat
[rsync]
remote = localhost:/tmp/fat\-store
$ mkdir \-p /tmp/fat\-store # make sure the remote directory exists
$ echo \[aq]*.gz filter=fat \-crlf\[aq] > .gitattributes
$ git add .gitfat .gitattributes
$ git commit \-m\[aq]Initial repository\[aq]
[master (root\-commit) eb7facb] Initial repository
2 files changed, 3 insertions(+)
create mode 100644 .gitattributes
create mode 100644 .gitfat
\f[R]
.fi
.PP
Now we add a binary file whose name matches the pattern we set in
\f[C].gitattributes\f[R].
.IP
.nf
\f[C]
$ curl https://nodeload.github.com/jedbrown/git\-fat/tar.gz/master \-o master.tar.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 6449 100 6449 0 0 7741 0 \-\-:\-\-:\-\- \-\-:\-\-:\-\- \-\-:\-\-:\-\- 9786
$ git add master.tar.gz
git\-fat filter\-clean: caching to /tmp/repo/.git/fat/objects/b3489819f81603b4c04e8ed134b80bace0810324
$ git commit \-m\[aq]Added master.tar.gz\[aq]
[master b85a96f] Added master.tar.gz
git\-fat filter\-clean: caching to /tmp/repo/.git/fat/objects/b3489819f81603b4c04e8ed134b80bace0810324
1 file changed, 1 insertion(+)
create mode 100644 master.tar.gz
\f[R]
.fi
.PP
The patch itself is very simple and does not include the binary.
.IP
.nf
\f[C]
$ git show \-\-pretty=oneline HEAD
918063043a6156172c2ad66478c6edd5c7df0217 Add master.tar.gz
diff \-\-git a/master.tar.gz b/master.tar.gz
new file mode 100644
index 0000000..12f7d52
\-\-\- /dev/null
+++ b/master.tar.gz
\[at]\[at] \-0,0 +1 \[at]\[at]
+#$# git\-fat 1f218834a137f7b185b498924e7a030008aee2ae
\f[R]
.fi
.SS Pushing fat files
.PP
Now let\[cq]s push our fat files using the rsync configuration that we
set up earlier.
.IP
.nf
\f[C]
$ git fat push
Pushing to localhost:/tmp/fat\-store
building file list ...
1 file to consider
sent 61 bytes received 12 bytes 48.67 bytes/sec
total size is 6449 speedup is 88.34
\f[R]
.fi
.PP
We might normally set a remote now and push the git repository.
.SS Cloning and pulling
.PP
Now let\[cq]s look at what happens when we clone.
.IP
.nf
\f[C]
$ cd ..
$ git clone repo repo2
Cloning into \[aq]repo2\[aq]...
done.
$ cd repo2
$ git fat init # don\[aq]t forget
$ ls \-l # file is just a placeholder
total 4
\-rw\-r\-\-r\-\- 1 jed users 53 Nov 25 22:42 master.tar.gz
$ cat master.tar.gz # holds the SHA1 of the file
#$# git\-fat 1f218834a137f7b185b498924e7a030008aee2ae
\f[R]
.fi
.PP
We can always get a summary of what fat objects are missing in our local
cache.
.IP
.nf
\f[C]
$ git fat status
Orphan objects:
1f218834a137f7b185b498924e7a030008aee2ae
\f[R]
.fi
.PP
Now get any objects referenced by our current \f[C]HEAD\f[R].
This command also accepts the \f[C]\-\-all\f[R] option to pull full
history, or a revision to pull selected history.
.IP
.nf
\f[C]
$ git fat pull
receiving file list ...
1 file to consider
1f218834a137f7b185b498924e7a030008aee2ae
6449 100% 6.15MB/s 0:00:00 (xfer#1, to\-check=0/1)
sent 30 bytes received 6558 bytes 4392.00 bytes/sec
total size is 6449 speedup is 0.98
Restoring 1f218834a137f7b185b498924e7a030008aee2ae \-> master.tar.gz
git\-fat filter\-smudge: restoring from /tmp/repo2/.git/fat/objects/1f218834a137f7b185b498924e7a030008aee2ae
\f[R]
.fi
.PP
Everything is in place.
.IP
.nf
\f[C]
$ git status
git\-fat filter\-clean: caching to /tmp/repo2/.git/fat/objects/1f218834a137f7b185b498924e7a030008aee2ae
# On branch master
nothing to commit, working directory clean
$ ls \-l # recovered the full file
total 8
\-rw\-r\-\-r\-\- 1 jed users 6449 Nov 25 17:10 master.tar.gz
\f[R]
.fi
.SS Summary
.IP \[bu] 2
Set the \[lq]fat\[rq] file types in \f[C].gitattributes\f[R].
.IP \[bu] 2
Use normal git commands to interact with the repository without thinking
about what files are fat and non\-fat.
The fat files will be treated specially.
.IP \[bu] 2
Synchronize fat files with \f[C]git fat push\f[R] and
\f[C]git fat pull\f[R].
.SS Retroactive import using \f[C]git filter\-branch\f[R] [Experimental]
.PP
Sometimes large objects were added to a repository by accident or for
lack of a better place to put them.
\f[I]If\f[R] you are willing to rewrite history, forcing everyone to
reclone, you can retroactively manage those files with
\f[C]git fat\f[R].
Be sure that you understand the consequences of
\f[C]git filter\-branch\f[R] before attempting this.
This feature is experimental and irreversible, so be doubly careful with
backups.
.SS Step 1: Locate the fat files
.PP
Run \f[C]git fat find THRESH_BYTES > fat\-files\f[R] and inspect
\f[C]fat\-files\f[R] in an editor.
Lines will be sorted by the maximum object size that has been at each
path, and look like
.IP
.nf
\f[C]
something.big filter=fat \-text # 8154677 1
\f[R]
.fi
.PP
where the first number after the \f[C]#\f[R] is the number of bytes and
the second number is the number of modifications that path has seen.
You will normally filter out some of these paths using grep and/or an
editor.
When satisfied, remove the ends of the lines (including the \f[C]#\f[R])
and append to \f[C].gitattributes\f[R].
It\[cq]s best to \f[C]git add .gitattributes\f[R] and commit at this
time (likely enrolling some extant files into \f[C]git fat\f[R]).
.SS Step 2: \f[C]filter\-branch\f[R]
.PP
Copy \f[C].gitattributes\f[R] to \f[C]/tmp/fat\-filter\-files\f[R] and
edit it to remove everything after the file name (e.g.,
\f[C]sed \[aq]s/ \[rs]+filter=fat.*$//\[aq]\f[R]).
Currently, this may only contain exact paths relative to the root of the
repository.
Finally, run
.IP
.nf
\f[C]
git filter\-branch \-\-index\-filter \[rs]
\[aq]git fat index\-filter /tmp/fat\-filter\-files \-\-manage\-gitattributes\[aq] \[rs]
\-\-tag\-name\-filter cat \-\- \-\-all
\f[R]
.fi
.PP
(You can remove the \f[C]\-\-manage\-gitattributes\f[R] option if you
don\[cq]t want to append all the files being enrolled in
\f[C]git fat\f[R] to \f[C].gitattributes\f[R]; however, future users
would then need to use \f[C].git/info/attributes\f[R] to have the
\f[C]git fat\f[R] filters run.) When this finishes, inspect to see if
everything is in order and follow the Checklist for Shrinking a
Repository (http://www.kernel.org/pub/software/scm/git/docs/git-filter-branch.html#_checklist_for_shrinking_a_repository)
in the \f[C]git filter\-branch\f[R] man page, typically
\f[C]git clone file:///path/to/repo\f[R].
Be sure to \f[C]git fat push\f[R] from the original repository.
.PP
See the script \f[C]test\-retroactive.sh\f[R] for an example of
cleaning.
.SS Implementation notes
.PP
The actual binary files are stored in \f[C].git/fat/objects\f[R],
leaving \f[C].git/objects\f[R] nice and small.
.IP
.nf
\f[C]
$ du \-bs .git/objects
2212 .git/objects/
$ ls \-l .git/fat/objects # This is where the file actually goes, but that\[aq]s not important
total 8
\-rw\-\-\-\-\-\-\- 1 jed users 6449 Nov 25 17:01 1f218834a137f7b185b498924e7a030008aee2ae
\f[R]
.fi
.PP
If you have multiple clones that access the same filesystem, you can
make \f[C].git/fat/objects\f[R] a symlink to a common location, in which
case all content will be available in all repositories without extra
copies.
You still need to \f[C]git fat push\f[R] to make it available to others.
.SH Some refinements
.IP \[bu] 2
Allow pulling and pushing only select files
.IP \[bu] 2
Relate orphan objects to file system
.IP \[bu] 2
Put some more useful message in smudged (working tree) version of
missing files.
.IP \[bu] 2
More friendly configuration for multiple fat remotes
.IP \[bu] 2
Make commands safer in presence of a dirty tree.
.IP \[bu] 2
Private setting of a different remote.
.IP \[bu] 2
Gracefully handle unmanaged files when the filter is called (either
legacy files or files matching the pattern that should for some reason
not be treated as fat).
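The worked example in the manual page above shows that a fat file is stored in the repository as a short placeholder of the form `#$# git-fat <sha1>`, which holds the SHA1 of the file. The following is a minimal sketch of that encoding, assuming the digest is the SHA-1 of the raw file content. The names `encode_placeholder` and `decode_placeholder` are hypothetical, and the packaged script may use a different or extended placeholder layout.

```python
import hashlib
import re

# Layout as shown in the manual page: "#$# git-fat " + 40 hex digits + newline.
PLACEHOLDER_RE = re.compile(rb"#\$# git-fat ([0-9a-f]{40})\n")


def encode_placeholder(content: bytes) -> bytes:
    """Return the placeholder that would stand in for `content`."""
    digest = hashlib.sha1(content).hexdigest()
    return b"#$# git-fat " + digest.encode("ascii") + b"\n"


def decode_placeholder(data: bytes):
    """Return the hex digest if `data` is exactly one placeholder, else None."""
    match = PLACEHOLDER_RE.fullmatch(data)
    return match.group(1).decode("ascii") if match else None


if __name__ == "__main__":
    stub = encode_placeholder(b"example binary payload")
    print(len(stub))                 # size of the placeholder in bytes
    print(decode_placeholder(stub))  # digest recovered from the placeholder
```

A placeholder built this way is 53 bytes long (12 bytes of prefix, 40 hex digits, one newline), which matches the size that `ls -l` reports for `master.tar.gz` in the fresh clone.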

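Steps 1 and 2 of the retroactive import described in the manual page turn `git fat find` output lines such as `something.big filter=fat -text # 8154677 1` first into `.gitattributes` entries and then into a list of bare paths. The sketch below shows that text processing under stated assumptions: `FatCandidate` and `parse_find_line` are hypothetical names, and paths are assumed to contain neither whitespace nor `#`.

```python
from typing import NamedTuple, Optional


class FatCandidate(NamedTuple):
    path: str           # path relative to the repository root
    attributes: str     # attribute text, e.g. "filter=fat -text"
    max_bytes: int      # first number after '#': size in bytes
    modifications: int  # second number after '#': modification count


def parse_find_line(line: str) -> Optional[FatCandidate]:
    """Parse one 'PATH ATTRS # BYTES COUNT' line; return None if malformed."""
    entry, sep, stats = line.partition("#")
    path_and_attrs = entry.split(None, 1)
    numbers = stats.split()
    if not sep or len(path_and_attrs) != 2 or len(numbers) != 2:
        return None
    if not all(n.isdigit() for n in numbers):
        return None
    path, attributes = path_and_attrs
    return FatCandidate(path, attributes.strip(), int(numbers[0]), int(numbers[1]))


if __name__ == "__main__":
    candidate = parse_find_line("something.big filter=fat -text # 8154677 1")
    if candidate is not None:
        # Step 1: the line without its trailing comment goes into .gitattributes.
        print(f"{candidate.path} {candidate.attributes}")
        # Step 2: only the bare path goes into the filter file list.
        print(candidate.path)
```

Keeping the size and modification count as integers makes it easy to drop paths below a chosen threshold before writing either file.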
1
debian/patches/series vendored Normal file

@@ -0,0 +1 @@
# You must remove unused comment lines for the released package.

4
debian/rules vendored Executable file

@@ -0,0 +1,4 @@
#!/usr/bin/make -f

%:
	dh $@

1
debian/source/format vendored Normal file

@@ -0,0 +1 @@
3.0 (quilt)

1
debian/watch vendored Normal file

@@ -0,0 +1 @@
version=3