Grokking the Mercurial

Essien Ita Essien me at essienitaessien.com
Fri Sep 7 08:18:04 CDT 2007


Hi all,

[long mail warning]

I'm slowly trying to build a working mental model of Mercurial's innards, 
and after some pointers on IRC and (re(re-)-)reading the wiki pages, 
there are still a couple of holes left, but I can see some 
not-so-blackness(tm) at the end of the tunnel.

I want to describe my thoughts here; could someone please point my brain 
in the right direction wherever I make a mistake?

If I have a repo:

/path/to/repo
	+ file1
	+ file2


CASE 1
=======
Suppose I modify file1 by applying file1_mod1.
	- file1_mod1 is referred to as a CHANGE.
	- the state file1 is currently in is called a FILECONTEXT 
(filectx) for file1.
	- if I commit this current state of affairs, a new context, file1_ctx1, 
will be created for file1, and it will have a unique FILE_NODEID.
	- file1_ctx1 has a log (the FILELOG), which is eventually stored in the 
REVLOG for file1. The FILE_NODEID is part of the log information.
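
To make the picture above concrete, here is a toy model of what I think a 
filelog is. As far as I understand, a nodeid is a SHA-1 hash over the 
parents' nodeids plus the file content; everything else here (the names 
ToyFilelog, node_id, NULLID, and storing full texts rather than deltas) 
is my own invention for illustration, not Mercurial's API.

```python
import hashlib

NULLID = b"\0" * 20  # assumed stand-in for "no parent"


def node_id(text, p1=NULLID, p2=NULLID):
    """Hash the two parent ids (sorted) followed by the content."""
    a, b = sorted([p1, p2])
    return hashlib.sha1(a + b + text).digest()


class ToyFilelog:
    """Toy append-only history of ONE file (hypothetical class)."""

    def __init__(self):
        # Each entry is (nodeid, parent nodeid, full text).
        self.revisions = []

    def add(self, text):
        # The new revision's parent is the previous revision, if any.
        parent = self.revisions[-1][0] if self.revisions else NULLID
        nid = node_id(text, parent)
        self.revisions.append((nid, parent, text))
        return nid

    def read(self, nid):
        # Look a revision up by its nodeid.
        for n, _parent, text in self.revisions:
            if n == nid:
                return text
        raise KeyError(nid)


file1_log = ToyFilelog()
n0 = file1_log.add(b"original file1\n")
n1 = file1_log.add(b"file1 with file1_mod1 applied\n")
```

In this toy, n1 would be the FILE_NODEID of file1_ctx1, and n0 is 
recorded as its parent.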

[True or False?]

CASE 2
========
Assuming that in CASE 1 above I also applied file2_mod1 to file2 at the 
same time, and committed both of them, I would end up having

	- 2 changes, which are grouped together into a CHANGECONTEXT (changectx).
	
	- The change context contains information on all filecontexts that make 
up that changecontext.

[this should be true. else... I'm pretty damn thick in the cranium :( ]



CASE 3
=======

Any time a commit is made on /path/to/repo, Mercurial checkpoints the 
current changecontext and creates a REVISIONID for it. This REVISIONID 
is a unique HASH that shows how the CURRENT changecontext relates to 
previous changecontexts (its parents). With this REVISIONID, one can at 
any time recall the changecontext that constitutes that particular time 
t on Mercurial's timeline.

In fact, the changecontext that makes up a REVISION is a LIST 
of FILELOGS stored together in a single revlog. This revlog is called 
the MANIFEST and is really a list of FILENAME/NODEID pairs.

At any revision, the repo is simply made up of each file listed in the 
manifest, in the state (filecontext) specified by the listed NODEID.
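
Here is how I picture that lookup, as a minimal sketch. The short string 
ids, the nested dicts, and the checkout function are all made up for 
illustration; real nodeids are hashes and real filelogs are revlogs.

```python
def checkout(manifest, filelogs):
    """Rebuild the files of one revision from its manifest.

    manifest: {filename: nodeid}
    filelogs: {filename: {nodeid: content}}
    """
    return {name: filelogs[name][nid] for name, nid in manifest.items()}


# Hypothetical per-file histories, keyed by made-up nodeids.
filelogs = {
    "file1": {"f1-n0": "original file1", "f1-n1": "file1 + file1_mod1"},
    "file2": {"f2-n0": "original file2", "f2-n1": "file2 + file2_mod1"},
}

# One manifest per revision: a plain FILENAME -> NODEID mapping.
manifest_rev0 = {"file1": "f1-n0", "file2": "f2-n0"}
manifest_rev1 = {"file1": "f1-n1", "file2": "f2-n1"}

tree = checkout(manifest_rev1, filelogs)
```

So the manifest itself holds no file content, only pointers into each 
file's own history.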

At commit time, this revisionid, the current manifest, a commit message, 
username, date, etc. are all combined to form a single CHANGESET.

[this is the part that confuses me the most]


CASE 4
=======
The changesets that document the timeline and progressive state of the 
repo are all stored in yet another revlog, called the CHANGELOG.
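
Putting CASE 3 and CASE 4 together, my mental model of the changelog is 
roughly the sketch below. The Changeset fields and the commit helper are 
my own guesses at what gets recorded, not the real on-disk layout, and 
the user and date values are placeholders.

```python
from collections import namedtuple

# Hypothetical record: one entry per commit.
Changeset = namedtuple(
    "Changeset", "parent manifest user date message files"
)

changelog = []  # position in the list = local revision number


def commit(manifest, user, date, message, files):
    """Append a changeset whose parent is the previous tip (-1 if none)."""
    parent = len(changelog) - 1
    changelog.append(Changeset(parent, manifest, user, date, message, files))
    return len(changelog) - 1


rev0 = commit({"file1": "f1-n0", "file2": "f2-n0"},
              "someone", "day 1", "initial import", ["file1", "file2"])
rev1 = commit({"file1": "f1-n1", "file2": "f2-n1"},
              "someone", "day 2", "apply mod1 to both files",
              ["file1", "file2"])

cset = changelog[rev1]
```

Reading cset.files, cset.message and cset.manifest here is the toy 
equivalent of what I try to do in the pseudocode further down.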



Fallouts Of Cases 1 to 4
=========================

If the above holds true, something like the following pseudocode 
should be possible:

[it's PSEUDOCODE, not real Mercurial API functions]

chglog = repo.changelog
revid = 1

chgset = chglog.getchangeset(revid)

commsg = chgset.commitmsg
comuser = chgset.user
manifest = chgset.manifest

# print the list of all files that were affected at revision 1
# should print file1 and file2

print manifest.files


chgcontext = manifest.changecontext
file1_context1 = chgcontext.getfilecontext('file1')

# if I want to wipe revision 10 out of the project history,
# I would do

chglog.removechangeset(10)
chglog.save()

# if I want to remove the history of file2 throughout the repo
#
# (+ 1 so that the highest revision id is included in the loop)
for rid in xrange(0, chglog.getmaxrevid() + 1):
	cset = chglog.getchangeset(rid)
	if 'file2' in cset.manifest.files:
		cctx = cset.manifest.changecontext
		cctx.removefilecontext('file2')
		cset.manifest.save()
chglog.save()



[phew... that was a mouthful... but hopefully I'm on the right track?]







More information about the Mercurial-devel mailing list