Encrypted Repositories?

Ryan Michael kerinin at gmail.com
Sat Sep 8 14:21:13 CDT 2007


> > Your criticism of my threat-model ambiguity is correct.  My real
> > interest is being able to push and pull changes to a backup server and
> > have the backup server be secure.  The basic threat would be someone
> > gaining unauthorized access to the server - either a hacker or
> > malicious sysadmin.
>
> This is now a third and completely different requirement. This one can
> be accomplished in a reasonably satisfactory way with a change hook that
> makes a tarball of the repo, encrypts it, and ships it.

After some thought, I think this is a pretty good description of what
I'd like to accomplish.  I got a little carried away thinking about
various options, but the core of the problem is secure public storage
that I can push to and pull from in different contexts (home, work,
laptop, etc) to keep my working copies synchronized.  I use source
control for two basic reasons: being able to back out mistakes and
being able to work on multiple machines without needing to keep a
thumbdrive on me at all times.  I usually work alone, so collaboration
isn't a big priority for me.

That being said, a solution to my problem would be a lot more useful
if it addressed collaborative development.  This seems to require
thinking about your earlier comments regarding the need for a system
of group key management, key distribution and key revocation.

I agree with you that encrypting local working copies is pointless -
all that is really needed is a remote repository storing encrypted
changesets (I don't see any reason for it to have a working copy).  As
you said, this makes the problem one of handling encrypted changesets.

I found some interesting papers regarding cryptographic access control
as an alternative to typical ACLs [1].  The basic idea is that for
each file for which access is being restricted there would exist three
security classes: users with read/write access, users with read-only
access, and a server capable of verifying that changes come from a user
with write access but not able to read the data.  Essentially there
are two components: a public/private keypair (which in this context
can be considered a write permission / permission verification
keypair) and a symmetric encryption key.  Data to be protected is
encrypted using the symmetric key and the encrypted file is signed
using the write permission key.  When the encrypted/signed file is
uploaded to the server, the server uses the permission verification
key to verify that the file was created by an authorized user, then
stores the file.  Users with read access request the file from the
server, then use the symmetric key to decrypt it.  In this system,
read and write access are conferred by possession of the appropriate
key.
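
To make the flow above concrete, here is a rough sketch using only the
Python standard library.  Everything in it is an assumption for
illustration: the names (keystream_xor, lamport_keygen, Server) are
made up, the cipher is a toy SHA-256 keystream rather than a real
cipher, and a one-time Lamport signature stands in for the PGP
signatures a real setup would use.  It only shows that the server can
check who wrote a file without being able to read it.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || counter) blocks.

    Illustration only (no nonce, no authentication). Applying it twice
    with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = _h(key + (offset // 32).to_bytes(8, "big"))
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def lamport_keygen():
    """Return (write_key, verify_key) for a one-time Lamport signature."""
    write_key = [(secrets.token_bytes(32), secrets.token_bytes(32))
                 for _ in range(256)]
    verify_key = [(_h(a), _h(b)) for a, b in write_key]
    return write_key, verify_key


def _bits(message: bytes):
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(write_key, message: bytes):
    # Reveal one secret per digest bit; safe for a single message only.
    return [write_key[i][bit] for i, bit in enumerate(_bits(message))]


def verify(verify_key, message: bytes, signature) -> bool:
    return len(signature) == 256 and all(
        _h(sig) == verify_key[i][bit]
        for (i, bit), sig in zip(enumerate(_bits(message)), signature)
    )


class Server:
    """Holds only the verification key and ciphertext, never the read key."""

    def __init__(self, verify_key):
        self.verify_key = verify_key
        self.files = {}

    def upload(self, name: str, ciphertext: bytes, signature) -> bool:
        if not verify(self.verify_key, ciphertext, signature):
            return False  # not signed with the write permission key
        self.files[name] = ciphertext
        return True

    def download(self, name: str) -> bytes:
        return self.files[name]


read_key = secrets.token_bytes(32)           # symmetric key (read access)
write_key, verify_key = lamport_keygen()     # write / verification keypair
server = Server(verify_key)

plaintext = b"changeset 42: fix typo"
ciphertext = keystream_xor(read_key, plaintext)
accepted = server.upload("cs42", ciphertext, sign(write_key, ciphertext))

# Someone without the write key cannot get a file stored.
forged_key, _ = lamport_keygen()
rejected = server.upload("cs43", ciphertext, sign(forged_key, ciphertext))

# A reader fetches the ciphertext and decrypts it locally.
recovered = keystream_xor(read_key, server.download("cs42"))
```

A Lamport key can safely sign only one message and the toy cipher must
never reuse a key, so this is nowhere near usable as written; the
point is just the split between the read key, the write key and the
verification key.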

Another idea which was presented in the paper you originally suggested
I read [2] could be very useful: time bound encryption keys.  The
paper discusses this in the context of a hierarchical encryption
scheme, but it would be useful in other contexts as well.  The basic
idea is that the keys used to encrypt data are derived in part using a
time value.
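
A minimal sketch of what "derived in part using a time value" could
look like, assuming a shared master secret and one key per UTC day.
This is not the hierarchical scheme from [2]; the function names and
the choice of HMAC-SHA256 are my own assumptions.

```python
import hashlib
import hmac
from datetime import datetime, timezone


def period_index(moment: datetime, period_seconds: int = 86400) -> int:
    """Number of whole periods since the Unix epoch (default: UTC days)."""
    return int(moment.timestamp()) // period_seconds


def time_bound_key(master_secret: bytes, moment: datetime,
                   period_seconds: int = 86400) -> bytes:
    """Derive the key for the period that contains `moment`."""
    label = str(period_index(moment, period_seconds)).encode("ascii")
    return hmac.new(master_secret, label, hashlib.sha256).digest()


master = b"hypothetical master secret"
morning = datetime(2007, 9, 8, 1, 0, tzinfo=timezone.utc)
evening = datetime(2007, 9, 8, 23, 0, tzinfo=timezone.utc)
next_day = datetime(2007, 9, 9, 1, 0, tzinfo=timezone.utc)

same_day_match = time_bound_key(master, morning) == time_bound_key(master, evening)
next_day_match = time_bound_key(master, morning) == time_bound_key(master, next_day)
```

Handing someone a single period's key lets them read only data
encrypted during that period, while the master secret covers all of
them.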

I think these systems would be ideal for an encrypted repository.  The
server would keep a list of permission verification keys (the public
PGP keys of developers with write access).  Encryption keys for each
changeset would be generated by appending the revision number to some
random string and hashing it.  This would allow read access to be
granted either indefinitely (by giving out the random base string) or
for a subset of revisions (by giving out the hashed encryption keys
for each revision).  The server would only store encrypted changesets,
and would only accept changesets which were signed by one of the keys
in its permission list.  Key distribution could be handled by the
server as well, by encrypting the read keys with each reader's public
key.
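
The per-revision key idea could look roughly like this.  It is only a
sketch: revision_key and grant_revisions are hypothetical names, and
plain SHA-256 over "base string + revision number" is taken straight
from the description above rather than from any vetted key-derivation
scheme.

```python
import hashlib
import secrets


def revision_key(base: bytes, revision: int) -> bytes:
    """Key for one changeset: hash of the base string plus revision number."""
    return hashlib.sha256(base + str(revision).encode("ascii")).digest()


def grant_revisions(base: bytes, revisions) -> dict:
    """Keys to hand to a reader who may only see the given revisions."""
    return {rev: revision_key(base, rev) for rev in revisions}


# Fixed-length random base string; whoever holds it can derive every key.
base = secrets.token_bytes(32)

# Limited grant: the reader receives keys for revisions 3..5 only.
limited = grant_revisions(base, range(3, 6))

# Indefinite grant: the reader receives `base` and derives keys as needed.
key_for_100 = revision_key(base, 100)
```

Because the hash is one-way, a reader holding only the keys in
`limited` cannot work back to `base` or derive the key for any other
revision.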

Implementing this would require additional functions for the client
(clone-encrypted for example) and probably some work to implement
the server.  I'm not sure how many commands would choke on encrypted
data, but it seems like only a small subset would really be needed
(commands such as commit and export would obviously not be relevant).

My opinion on data security is that operating systems should be
trusted to protect private data only as a last resort.

> I'm really not trying to be obnoxious. It's just that we're not going to
> understand the solution until we understand the question.

No apology needed - I wrote in hopes of useful criticism and yours
has been quite useful.  Please feel free to offer more...

-Ryan

[1] https://www.cs.tcd.ie/publications/tech-reports/reports.03/TCD-CS-2003-28.pdf
[2] http://eprint.iacr.org/2006/225


More information about the Mercurial-devel mailing list