Git Large File Storage (LFS) allows for versioning files larger than 100MB on GitHub. While this seems like a no-brainer at first glance, it also comes with some drawbacks. Thus, here’s a rough guide on how to set up Git-LFS and how to remove it from a project again.
GitHub limits the size of files allowed in repositories to 100MB. For
working with larger files like data sets or binaries such as MARC record
sets, we have to find a way around that limit. In comes Git Large File
Storage (LFS).
Git-LFS is an open source Git extension for versioning files above 100MB
by replacing them with text pointers inside Git, while storing the file
contents outside of the normal Git project on a remote server like
GitHub.com.
See more info here: https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-large-files-on-github
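To make the idea of a text pointer more concrete, here is a minimal sketch of what such a pointer contains, assuming the layout described in the Git-LFS specification (a version line, the SHA-256 of the file contents, and the size in bytes). The function name lfs_pointer_sketch is made up for illustration; the real tool for this job is git lfs pointer --file=[path to file].

```shell
# Hypothetical helper: print a Git-LFS-style pointer for a file.
# Assumes sha256sum (Linux, Git Bash) or shasum (macOS) is on the PATH.
lfs_pointer_sketch() {
  file="$1"
  if command -v sha256sum >/dev/null 2>&1; then
    oid=$(sha256sum "$file" | cut -d ' ' -f 1)
  else
    oid=$(shasum -a 256 "$file" | cut -d ' ' -f 1)
  fi
  # wc -c counts bytes; tr strips the padding some platforms add
  size=$(wc -c < "$file" | tr -d ' ')
  printf 'version https://git-lfs.github.com/spec/v1\n'
  printf 'oid sha256:%s\n' "$oid"
  printf 'size %s\n' "$size"
}

# Usage: lfs_pointer_sketch [path to file]
```

Whatever the size of the original file, the pointer stays three short lines, which is why the Git history itself remains small.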
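In Git Bash, initialize Git-LFS:
$ git lfs install
> Git LFS initialized.
cd into the directory of the repository we’d like to use with Git-LFS:
$ cd [path to repository]
Associate a file type with Git-LFS, e.g. all .zip files:
$ git lfs track "*.zip"
> Adding path *.zip
Or track a single file by its literal name:
$ git lfs track --filename [path to file]
> Tracking "[path to file]"
Then add, commit, and push as usual. git lfs track writes its rules to .gitattributes, so that file should be committed as well:
$ git add .gitattributes [path to file]
$ git commit -m "update MARC"
$ git push origin main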
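git lfs track records each pattern as a line in the repository’s .gitattributes file, typically of the form `*.zip filter=lfs diff=lfs merge=lfs -text`. As a rough sketch, the following lists the patterns that are currently routed through Git-LFS. lfs_patterns is a made-up helper name; running git lfs track without arguments gives a similar overview.

```shell
# Hypothetical helper: list the patterns in a .gitattributes file
# that carry the filter=lfs attribute.
lfs_patterns() {
  attributes_file="${1:-.gitattributes}"
  awk '
    /^[ \t]*#/ { next }   # skip comment lines
    {
      # the first field is the pattern, the rest are attributes
      for (i = 2; i <= NF; i++) {
        if ($i == "filter=lfs") { print $1; break }
      }
    }
  ' "$attributes_file"
}

# Usage: lfs_patterns [path to repository]/.gitattributes
```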
List all the (large) files managed by Git-LFS:
$ cd [path to repository]
$ git lfs ls-files
Push all local Git-LFS objects to the remote:
$ git lfs push --all origin
So while on paper we get the benefit of being able to handle 100MB+ files, Git-LFS also adds limits on your repository’s total LFS storage and on its bandwidth, i.e. 1GB each. Once a limit is exceeded, pushing fails with the following error message:
Uploading LFS objects: 0% (0/1), 0 B | 0 B/s, done. batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.
This can happen really fast, especially if you have lots of medium-sized
files and don’t work in a very organized or efficient way in your
repository … as a librarian maybe…
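Since many medium-sized files add up quickly, it can help to estimate how much storage a pattern would consume before tracking it. The sketch below sums the sizes of all matching files below a directory. total_bytes is a hypothetical helper, and the 1GB figure is simply the quota mentioned above, hard-coded here as an assumption.

```shell
# Hypothetical helper: total size in bytes of all files below a directory
# whose names match a pattern (the .git directory is skipped).
total_bytes() {
  dir="$1"
  pattern="$2"
  find "$dir" -path '*/.git' -prune -o -type f -name "$pattern" -exec wc -c {} \; |
    awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Assumption: a 1GB quota, as described above.
quota_bytes=$((1024 * 1024 * 1024))

# Usage: compare the estimate against the quota.
# used=$(total_bytes . "*.zip")
# echo "$used of $quota_bytes bytes"
```

Note that this only measures the current working tree. Every new version of a tracked file is stored as a separate object, so the real usage grows with each change.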
Therefore, it is also good to know how to uninstall Git-LFS and start
over in a more organized way.
The other solution would be a paid subscription.
Simply removing the files from the project does not work, as the Git-LFS objects still exist on the remote storage and will continue to count toward the Git-LFS storage quota. To remove Git-LFS objects from a repository, delete and recreate the repository.
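One way is to re-add every tracked file as a regular Git object:
$ git lfs uninstall
# save the paths of all LFS files; sed strips the first 13 characters
# (the 10-character object ID plus the marker and spaces) from each line
$ git lfs ls-files | sed -r 's/^.{13}//' > lfs_files.txt
# remove the pointers from the index, keeping the files on disk
while IFS= read -r line; do
  git rm --cached "$line"
done < lfs_files.txt
# add the same files again, now as regular files
while IFS= read -r line; do
  git add "$line"
done < lfs_files.txt
$ git add .gitattributes
$ git commit -m "de-lfs"
$ git push origin
# check the result, then remove the local LFS cache and the temporary list
$ git lfs ls-files
$ rm -rf .git/lfs lfs_files.txt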
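Alternatively, untrack the patterns and let Git renormalize the files:
$ git lfs uninstall
# then remove the LFS filters from .gitattributes, either by hand or with:
$ git lfs untrack "*.zip"
# re-stage all tracked files (note the trailing dot, the path is required)
$ git add --renormalize .
$ git commit -m "de-lfs"
$ git push origin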
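Afterwards it is worth checking that the committed files really contain their data and are no longer pointers. A minimal sketch, assuming that a pointer always starts with the version line defined in the Git-LFS specification; is_lfs_pointer is a hypothetical helper that reads the content from standard input.

```shell
# Hypothetical helper: succeed if the content on stdin starts with the
# version line of a Git-LFS pointer file.
is_lfs_pointer() {
  head -n 1 | grep -qx 'version https://git-lfs\.github\.com/spec/v1'
}

# Usage inside a repository (the path is a placeholder):
# git show "HEAD:[path to file]" | is_lfs_pointer && echo "still a pointer"
```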
For attribution, please cite this work as
Schmalfuss (2022, Feb. 28). OS DataMercs: rough guide to Git Large File Storage (LFS). Retrieved from https://www.datamercs.net/posts/2022-02-28-rough-guide-to-git-large-file-storage-lfs/
BibTeX citation
@misc{schmalfuss2022rough,
  author = {Schmalfuss, Olaf},
  title = {OS DataMercs: rough guide to Git Large File Storage (LFS)},
  url = {https://www.datamercs.net/posts/2022-02-28-rough-guide-to-git-large-file-storage-lfs/},
  year = {2022}
}