Bigfiles Extension

This extension is not distributed with Mercurial.

Author: Andrei Vermel

Repository: http://bitbucket.org/avermel/bigfiles/

Overview

Supports versioning of big files, with the file contents stored outside the hg repo.

This is useful for several reasons; for one, hg itself warns that files over 10MB may cause memory and performance problems.

Implementation

Big files are not stored in the hg repo. Instead, they are listed in a file called '.bigfiles', which also serves as an ignore file similar to .hgignore, so they do not clutter the output of hg commands. The file also stores the checksums of the big files in the form of comments. '.bigfiles' is itself versioned by hg, so each changeset knows from the names and checksums which versions of the big files it uses. The file can also be diffed and merged like any other tracked file.
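To make the file layout concrete, here is a minimal Python sketch that reads the text of a '.bigfiles' file into a name-to-checksum mapping. The format it assumes (an optional 'syntax:' line, blank lines, and 'name#checksum' entries) is inferred from the sample session on this page, not from the extension's source, and parse_bigfiles is a hypothetical helper name.

```python
def parse_bigfiles(text):
    """Return {filename: checksum} from the text of a '.bigfiles' file.

    Assumed format (inferred from the sample session, not from the
    extension's source): an optional 'syntax: ...' line, blank lines,
    and entries of the form 'name#checksum'.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("syntax:"):
            continue
        # Split on the last '#' so that names containing '#' still parse.
        name, sep, checksum = line.rpartition("#")
        if sep:  # lines without a '#' carry no checksum and are skipped
            entries[name] = checksum
    return entries


sample = """syntax: glob

windows-essential-20071007.zip#fc8fb93abeb53fe301594fe6463c0ac2436c59f8
"""
entries = parse_bigfiles(sample)
```

Splitting on the last '#' is a design choice of this sketch; how the extension itself treats names that contain '#' is not stated here.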

The versions of the big files are stored in a versions directory, with the checksum appended to each filename.
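The storage scheme can be sketched in Python as follows. This is an illustration under assumptions, not the extension's code: the 40-character checksums in the sample session suggest SHA-1, but that is a guess, and file_checksum and store_version are hypothetical helper names. The '<name>.<checksum>' naming follows the directory listing in the sample session.

```python
import hashlib
import os
import shutil


def file_checksum(path, chunk_size=1 << 20):
    """Hex SHA-1 of a file, read in chunks so big files need not fit in memory."""
    digest = hashlib.sha1()  # assumption: the extension's checksum is SHA-1
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_version(path, versions_dir):
    """Copy path into versions_dir as '<name>.<checksum>'; return the new path."""
    checksum = file_checksum(path)
    target = os.path.join(versions_dir, os.path.basename(path) + "." + checksum)
    if not os.path.exists(target):  # identical content is stored only once
        shutil.copyfile(path, target)
    return target
```

Because the stored name includes the checksum, two versions of the same file can sit side by side in the versions directory, as the sample session shows after the second commit.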

Usage

Configuration

Configure your .hgrc to enable the extension by adding the following lines:

[extensions]
bigfiles = path/to/bigfiles.py

[bigfiles]
repo = path/to/versions/dir 
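For scripts that need to find the versions directory outside of hg, a rough Python sketch using the standard configparser module is shown below. It only handles plain INI-style files; Mercurial's own configuration reader supports more (for example %include lines), so treat this as an approximation. bigfiles_repo is a hypothetical helper name.

```python
import configparser


def bigfiles_repo(hgrc_text):
    """Return the 'repo' value of the [bigfiles] section, or None if unset."""
    # interpolation=None keeps '%' characters in paths from being interpreted
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(hgrc_text)
    return parser.get("bigfiles", "repo", fallback=None)


sample_hgrc = """[extensions]
bigfiles = path/to/bigfiles.py

[bigfiles]
repo = path/to/versions/dir
"""
repo_path = bigfiles_repo(sample_hgrc)
```

A None result corresponds to the case where hg bstat aborts with "bigfiles.repo path not configured".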

Sample Usage Session

# Checking in a big file to a new repo:
C:\tmp>ls -sh
total 9.8M
9.8M windows-essential-20071007.zip

C:\tmp>hg init

C:\tmp>hg stat
? windows-essential-20071007.zip

C:\tmp>hg add
adding windows-essential-20071007.zip
windows-essential-20071007.zip: files over 10MB may cause memory and performance problems
(use 'hg revert windows-essential-20071007.zip' to unadd the file)
# The warning from hg means that the file can be controlled by 'bigfiles'

# Get a reminder that we need to specify which dir to use to store the
# big files.
C:\tmp>hg bstat
abort: bigfiles.repo path not configured

# Let's create a repo for the big files in a new dir somewhere.
C:\tmp>mkdir ..\tmp_bigrepo

C:\tmp>echo [bigfiles] > .hg\hgrc
C:\tmp>echo repo =c:/tmp_bigrepo >> .hg\hgrc
C:\tmp>cat .hg\hgrc
[bigfiles]
repo =c:/tmp_bigrepo

# Bstat shows that a big file is about to be added.
C:\tmp>hg bstat
A windows-essential-20071007.zip

# Put it under control of 'bigfiles'
C:\tmp>hg bref
forgetting windows-essential-20071007.zip

# Now a file '.bigfiles' has been created
C:\tmp>ls .bigfiles
.bigfiles

C:\tmp>cat .bigfiles
syntax: glob

windows-essential-20071007.zip#fc8fb93abeb53fe301594fe6463c0ac2436c59f8

# We want to keep '.bigfiles' revisioned by hg
C:\tmp>hg add
adding .bigfiles

# Complete checking in of the big file - hg only stores '.bigfiles', not
# the windows-essential-20071007.zip
C:\tmp>hg ci -m "Added a big file"

# The big file is actually versioned in the big files repo
C:\tmp>ls ../tmp_bigrepo
windows-essential-20071007.zip.fc8fb93abeb53fe301594fe6463c0ac2436c59f8

# The big file gets some modification
C:\tmp>echo "qqq" >> windows-essential-20071007.zip

C:\tmp>hg bstat
M windows-essential-20071007.zip

# Put modified file under control of 'bigfiles'
C:\tmp>hg bref

C:\tmp>hg stat
M .bigfiles

C:\tmp>hg diff
diff --git a/.bigfiles b/.bigfiles
--- a/.bigfiles
+++ b/.bigfiles
@@ -1,3 +1,3 @@
 syntax: glob

-windows-essential-20071007.zip#fc8fb93abeb53fe301594fe6463c0ac2436c59f8
+windows-essential-20071007.zip#4187f81bc4fe5fe6c9586a8481cec4179ac63aa0

C:\tmp>hg ci -m "Modified a big file"

C:\tmp>ls ../tmp_bigrepo
windows-essential-20071007.zip.4187f81bc4fe5fe6c9586a8481cec4179ac63aa0
windows-essential-20071007.zip.fc8fb93abeb53fe301594fe6463c0ac2436c59f8

C:\tmp>hg log
changeset:   1:4e312e37f18c
tag:         tip
user:        Andrei Vermel <avermel@mail.ru>
date:        Thu Sep 24 23:49:23 2009 +0400
summary:     Modified a big file

changeset:   0:f6eafba99057
user:        Andrei Vermel <avermel@mail.ru>
date:        Thu Sep 24 23:47:37 2009 +0400
summary:     Added a big file

# Check out the previous revision - the big file gets fetched from the repo
C:\tmp>hg co -r 0
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
fetching windows-essential-20071007.zip.fc8fb93abeb53fe301594fe6463c0ac2436c59f8
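The modification check and the fetch on checkout seen in the session above can be approximated with the Python sketch below. It is not the extension's implementation: it assumes SHA-1 checksums, a flat versions directory with '<name>.<checksum>' entries, and a dict of recorded names and checksums taken from '.bigfiles'. The names sha1_of, modified_files and fetch_versions are hypothetical.

```python
import hashlib
import os
import shutil


def sha1_of(path):
    """Hex SHA-1 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha1()  # assumption: checksums are SHA-1
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def modified_files(entries, workdir):
    """Names whose working copy exists but no longer matches the recorded checksum."""
    result = []
    for name, checksum in sorted(entries.items()):
        path = os.path.join(workdir, name)
        if os.path.exists(path) and sha1_of(path) != checksum:
            result.append(name)
    return result


def fetch_versions(entries, workdir, versions_dir):
    """Copy the recorded version of each big file into workdir; return names fetched."""
    fetched = []
    for name, checksum in sorted(entries.items()):
        path = os.path.join(workdir, name)
        if os.path.exists(path) and sha1_of(path) == checksum:
            continue  # working copy is already the recorded version
        stored = os.path.join(versions_dir, name + "." + checksum)
        shutil.copyfile(stored, path)
        fetched.append(name)
    return fetched
```

Here modified_files loosely mirrors the 'M' lines of hg bstat, and fetch_versions mirrors the "fetching" step printed by hg co; the 'A' state depends on hg's own tracking and is not modelled.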

You may also be interested in LargefilesExtension and SnapExtension.



BigfilesExtension (last edited 2012-02-15 19:21:08 by ks3095497)