[PATCH 1 of 3] largefiles: update design.txt to describe current design

Greg Ward greg-hg at gerg.ca
Sun Oct 23 18:49:24 UTC 2011


# HG changeset patch
# User Greg Ward <greg at gerg.ca>
# Date 1319394091 14400
# Branch stable
# Node ID 29ba04fdcd0dbd03aeedc1623e85b49ff522cbd5
# Parent  c1f912707a0b66c5145a7649b8b7bab20df866d8
largefiles: update design.txt to describe current design

This is my best attempt at describing the design, warts and all. Let's
try to keep this document consistent with the code, at least until
the code is so amazingly clear that the document is unnecessary.

diff --git a/hgext/largefiles/design.txt b/hgext/largefiles/design.txt
--- a/hgext/largefiles/design.txt
+++ b/hgext/largefiles/design.txt
@@ -1,17 +1,18 @@
-= largefiles - manage large binary files =
-This extension is based off of Greg Ward's bfiles extension which can be found
-at http://mercurial.selenic.com/wiki/BfilesExtension.
+= Design of largefiles extension =
+
+See __init__.py for rationale and usage information. This file describes
+how largefiles works under the hood; it's for developers who need to
+understand largefiles rather than for users who need to use it.
 
 == The largefile store ==
 
-largefile stores are, in the typical use case, centralized servers that have
-every past revision of a given binary file.  Each largefile is identified by
-its sha1 hash, and all interactions with the store take one of the following
-forms.
+A largefile store, typically on a centralized server, has every past revision
+of every largefile. Each largefile revision is identified by its SHA-1 hash,
+and all interactions with the store take one of the following forms.
 
--Download a bfile with this hash
--Upload a bfile with this hash
--Check if the store has a bfile with this hash
+-Download a particular largefile revision (by hash)
+-Upload a particular largefile revision (by hash)
+-Check if the store has a largefile with this hash
 
 largefiles stores can take one of two forms:
 
@@ -21,9 +22,8 @@
 == The Local Repository ==
 
 The local repository has a largefile store in .hg/largefiles which holds a
-subset of the largefiles needed. On a clone only the largefiles at tip are
-downloaded. When largefiles are downloaded from the central store, a copy is
-saved in this store.
+subset of the largefiles needed. When largefiles are downloaded from the
+central store, a copy is saved in this store.
 
 == The User Cache ==
 
@@ -33,10 +33,11 @@
 
 == Implementation Details ==
 
-Each largefile has a standin which is in .hglf. The standin is tracked by
-Mercurial.  The standin contains the SHA1 hash of the largefile. When a
-largefile is added/removed/copied/renamed/etc the same operation is applied to
-the standin. Thus the history of the standin is the history of the largefile.
+Each largefile has a standin file in .hglf/. The standin is tracked by
+Mercurial. The standin contains the SHA-1 hash of the largefile contents. When
+a largefile is added/removed/copied/renamed/etc the same operation is applied
+to the standin. Thus the history of the standin is the history of the
+largefile.
 
 For performance reasons, the contents of a standin are only updated before a
 commit.  Standins are added/removed/copied/renamed from add/remove/copy/rename
@@ -47,4 +48,6 @@
 
 A Mercurial dirstate object tracks the state of the largefiles. The dirstate
 uses the last modified time and current size to detect if a file has changed
-(without reading the entire contents of the file).
+(without reading the entire contents of the file). (Unfortunately, the use of
+dirstate limits largefiles to 2 GB. This will hopefully be fixed after
+Mercurial 2.0.)
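
For readers of the patch, the standin and dirstate paragraphs above can be
illustrated with a short sketch. This is not code from the largefiles
extension: the function names (sha1_of_file, write_standin, probably_changed)
are hypothetical, and the trailing newline in the standin file is an
assumption. It only shows the idea that a standin under .hglf/ holds the SHA-1
of the largefile contents, and that size plus modification time can flag a
likely change without reading the whole file.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=128 * 1024):
    """Return the hex SHA-1 of a file, reading in chunks to bound memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_standin(repo_root, largefile):
    """Hash repo_root/largefile and record the hash in .hglf/largefile.

    Hypothetical helper: the real extension has its own routines for this.
    """
    standin = os.path.join(repo_root, ".hglf", largefile)
    os.makedirs(os.path.dirname(standin), exist_ok=True)
    hexhash = sha1_of_file(os.path.join(repo_root, largefile))
    with open(standin, "w") as fp:
        # Assumption: one hex hash followed by a newline.
        fp.write(hexhash + "\n")
    return hexhash


def probably_changed(path, recorded_size, recorded_mtime):
    """Cheap change check in the spirit of dirstate: compare the current
    size and whole-second mtime against recorded values, without reading
    the file contents."""
    st = os.stat(path)
    return st.st_size != recorded_size or int(st.st_mtime) != recorded_mtime
```

In this sketch the expensive step (hashing the full contents) happens only in
write_standin, which mirrors the design note that standin contents are only
updated before a commit, while probably_changed stays cheap enough to run on
every status check.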

