Bug 1025 - Should be able to retrieve logs from remote http repository
Summary: Should be able to retrieve logs from remote http repository
Status: RESOLVED FIXED
Alias: None
Product: Mercurial
Classification: Unclassified
Component: Mercurial
Version: unspecified
Hardware: All All
Importance: normal feature
Assignee: Bugzilla
Reported: 2008-03-09 03:02 UTC by Chris Turner
Modified: 2012-05-13 05:00 UTC

Description Chris Turner 2008-03-09 03:02 UTC
Trying to run

hg log -R http://localhost/hg/myrepo

gives the following error:

abort: repository 'http://localhost/hg/myrepo' is not local.

Any way to get this information without having to clone the repo?
Comment 1 Adrian Buehlmann 2008-03-09 04:24 UTC
Use "hg serve" and a webbrowser as a workaround.
Comment 2 Chris Turner 2008-03-09 04:43 UTC
I'm actually serving the repo with lighttpd, so browsing it manually is
available. The problem is that the only way to do this programmatically (as far
as I can tell) is to screen scrape... Obviously not ideal.

I'm currently working around this by just doing a pull and log on a local
instance, but due to size constraints this is not a long term solution...
Comment 3 Adrian Buehlmann 2008-03-09 08:33 UTC
BestFriendChris: You could use "hg incoming -f" (see "hg help incoming") and
compare the remote repo with an empty local repo:

> mkdir emptyrepo
> cd emptyrepo
> hg init
> hg incoming -f http://www.selenic.com/repo/hello
comparing with http://www.selenic.com/repo/hello
changeset:   0:0a04b987be5a
user:        mpm@selenic.com
date:        Fri Aug 26 01:20:50 2005 -0700
summary:     Create a standard "hello, world" program

changeset:   1:82e55d328c8c
tag:         tip
user:        mpm@selenic.com
date:        Fri Aug 26 01:21:28 2005 -0700
summary:     Create a makefile

I know, not exactly what you want, but at least that's currently available and
"hg incoming" provides "--style" and "--template" options like "hg log".

What's missing is, for example, "-l" ("--limit").

Maybe a new command "hg rlog" (remotelog) could be implemented?
Comment 4 Adrian Buehlmann 2008-03-09 10:22 UTC
Actually, option -f is unneeded here, since the local repo is empty anyway.
-n lists the csets in reverse order (same order as hg log):

> mkdir emptyrepo
> cd emptyrepo
> hg init
> hg incoming -n http://www.selenic.com/repo/hello
comparing with http://www.selenic.com/repo/hello
changeset:   1:82e55d328c8c
tag:         tip
user:        mpm@selenic.com
date:        Fri Aug 26 01:21:28 2005 -0700
summary:     Create a makefile

changeset:   0:0a04b987be5a
user:        mpm@selenic.com
date:        Fri Aug 26 01:20:50 2005 -0700
summary:     Create a standard "hello, world" program
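
For anyone scripting this workaround, the default-style output shown above can be turned into records with a few lines of Python. This is only a sketch: the function name parse_changesets and the SAMPLE constant are made up for illustration, it assumes the default (non-verbose) output style, and repeated fields such as multiple tags simply overwrite each other.

```python
def parse_changesets(text):
    """Parse default-style `hg incoming` / `hg log` output into a list of dicts."""
    changesets = []
    current = None
    for line in text.splitlines():
        # Skip blank separators and the "comparing with <URL>" banner.
        if not line.strip() or line.startswith("comparing with "):
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # status lines without a "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "changeset":
            # "1:82e55d328c8c" -> local revision number and short node hash
            rev, _, node = value.partition(":")
            current = {"rev": int(rev), "node": node}
            changesets.append(current)
        elif current is not None:
            current[key] = value
    return changesets


# Sample copied from the transcript in this comment.
SAMPLE = """\
comparing with http://www.selenic.com/repo/hello
changeset:   1:82e55d328c8c
tag:         tip
user:        mpm@selenic.com
date:        Fri Aug 26 01:21:28 2005 -0700
summary:     Create a makefile

changeset:   0:0a04b987be5a
user:        mpm@selenic.com
date:        Fri Aug 26 01:20:50 2005 -0700
summary:     Create a standard "hello, world" program
"""

for cset in parse_changesets(SAMPLE):
    print(cset["rev"], cset["node"], cset["summary"])
```

Using "--template" instead of the default style is likely to be more robust for real automation, since the field layout is then under the caller's control.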
Comment 5 Matt Mackall 2008-03-09 15:57 UTC
As it happens, hg incoming is pulling the very same data stream that hg pull is.
That's how it's able to implement -p. It's potentially quite expensive.

Log is a much more complex and powerful command and so we can either:

a) provide a heavily crippled remote log command and frustrate users
b) provide a remote log that downloads an entire repo (and then deletes it)
c) convince people that having local mirrors is the smart thing to do

I chose option (c) about two years ago. Marking resolved.
Comment 6 Adrian Buehlmann 2008-03-09 16:49 UTC
Interesting. So "hg incoming" comparing to an empty repo transfers the same
data as a full "hg clone". Sounds like a bad idea not to save that on disk.

Probably another stupid idea: What about doing an incomplete "hg clone", which
pulls nothing but the changelog? ("hg clone --changelog").

All commands that would need more info than what's in the changelog (e.g. diffs)
would then abort asking the user to complement the cloning first.

Next step to heaven (or hell) might then be lazy cloning of file revlogs.
Comment 7 Matt Mackall 2008-03-09 17:11 UTC
It is a bad idea not to save that on disk. That's why there's a --bundle option.
But it's also pretty silly to hg incoming with an empty repo in the first place.

Your second idea is identical to option (a): provide a crippled log command. I'm
quite sure we'd get 10 times as many questions/complaints about that as we do
about the current situation.

And your last idea will be hopelessly slow (a server request and round trip per
file).

There's a fundamental tension between making a tool that's designed to be fast
but has some limits to its flexibility, and a tool that's designed to be flexible,
and if you happen to use it in particular non-obvious ways, might also be fast.

You cannot have both. If you aim for the former, you have to supply people with
workarounds. If you aim for the latter, you'll have a never-ending chorus of
"this thing is too slow" and you'll end up writing and maintaining special
performance hacks for each corner case which will probably never be comparable
in performance to the workarounds!
Comment 8 Adrian Buehlmann 2008-03-09 17:24 UTC
Sorry for my silly ideas :-). And thanks for the explanations.
Comment 9 Chris Turner 2008-03-09 19:58 UTC
The thing I don't understand is, if I'm speaking to hgweb.cgi on a remote 
machine, it should be able to do a 'hg log' for me and stream back the response 
(as binary, xml, whatever). Why would this be bad/inefficient?

I agree that using 'hg incoming' for this purpose is a horribly inefficient 
hack, but that's not what I want to do anyway... Just have the remote repository 
do the hg log for me (with all the same arguments hg log supports)
Comment 10 Matt Mackall 2008-03-10 00:09 UTC
Imagine I send a log -p command to http://www.kernel.org/hg/linux-2.6, which is
approaching 100k changesets. At one diff per second (lots of seeking), this will
take about 28 hours of CPU/disk time on the server, never mind metric tons of
bandwidth. It would be faster and simpler for everyone just to clone the repo
and do the log locally.
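
For scale, the arithmetic behind this kind of estimate can be sketched as follows. The inputs are round-number assumptions taken from the paragraph above (100,000 changesets, a flat one second per diff), not measurements, and the function name server_hours is made up for illustration.

```python
def server_hours(changesets, seconds_per_diff=1.0):
    """Server-side time, in hours, to generate one diff per changeset."""
    return changesets * seconds_per_diff / 3600.0


# Round numbers from the comment: ~100k changesets at one diff per second.
print(f"{server_hours(100_000):.1f} hours of CPU/disk time")
```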

In a distributed system, you do all the expensive work on the clients. Otherwise
you don't scale.

The deeper question is: why are you trying to run log remotely? For most use
cases, you'll generally want to be able to check out, which means having a local
copy. If you're doing some web automation thing, you might investigate the raw
links that give you fairly efficient access to individual cset info:

http://selenic.com/hg/raw-rev/d43c94414ba1
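
A minimal sketch of how a script might consume such a raw link. It assumes the response uses the usual "# HG changeset patch" header layout (User, Date, Node ID, Parent lines, then the description, then the diff), which may vary between hgweb versions and templates. The function names are made up for illustration, and the sample text in the usage lines is hand-written, not fetched from a server.

```python
from urllib.request import urlopen


def raw_rev_url(base_url, node):
    """Build the raw changeset URL for a node under an hgweb base URL."""
    return base_url.rstrip("/") + "/raw-rev/" + node


def fetch_raw_rev(base_url, node):
    """Download the raw changeset text (performs a network request)."""
    with urlopen(raw_rev_url(base_url, node)) as response:
        return response.read().decode("utf-8", "replace")


def parse_patch_header(text):
    """Extract user, date, node, parents and description from a raw changeset."""
    meta = {"parents": []}
    description = []
    in_header = True
    for line in text.splitlines():
        if in_header and line.startswith("#"):
            field = line[1:].strip()
            if field.startswith("User "):
                meta["user"] = field[len("User "):]
            elif field.startswith("Date "):
                meta["date"] = field[len("Date "):]
            elif field.startswith("Node ID "):
                meta["node"] = field[len("Node ID "):]
            elif field.startswith("Parent "):
                meta["parents"].append(field[len("Parent "):].strip())
            continue
        in_header = False
        if line.startswith("diff "):
            break  # the description ends where the first diff begins
        description.append(line)
    meta["description"] = "\n".join(description).strip()
    return meta


# Hand-written sample in the assumed layout; the Date value is a placeholder.
EXAMPLE = """\
# HG changeset patch
# User mpm@selenic.com
# Date 0 0
# Node ID 82e55d328c8c
# Parent  0a04b987be5a
Create a makefile

diff -r 0a04b987be5a -r 82e55d328c8c Makefile
"""

print(raw_rev_url("http://selenic.com/hg", "d43c94414ba1"))
print(parse_patch_header(EXAMPLE))
```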
Comment 11 Chris Turner 2008-03-10 02:49 UTC
That's fair... I've never worked in Mercurial with a repository that size, so I 
wasn't aware of how long a changeset could take to be generated.

My use case is for working with a continuous integration server. The CI server 
needs to know when there are new changes in the repository, but since the builds 
will not be run on that machine (and it's heavily disk space constrained) I was 
hoping to just ask our shared repo if there are any changes between two dates 
(using 'hg log --date "previous_built_date to now" ...').
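
As a rough illustration of that query, the date range can be built from two timestamps and passed to "hg log --date". This is a sketch only: the helper names and the repository path are hypothetical, the "START to END" range syntax is the one described in "hg help dates", and the command has to run against a local repository because hg log rejects remote URLs.

```python
from datetime import datetime


def date_range_spec(start, end):
    """Format two datetimes as an `hg log --date` range: "START to END"."""
    fmt = "%Y-%m-%d %H:%M:%S"
    return f"{start.strftime(fmt)} to {end.strftime(fmt)}"


def log_since_command(repo_path, last_build, now):
    """Argument list for listing changesets committed between two builds."""
    return ["hg", "log", "-R", repo_path, "--date", date_range_spec(last_build, now)]


# Hypothetical path and timestamps, for illustration only.
cmd = log_since_command(
    "/srv/mirrors/myrepo",
    datetime(2008, 3, 9, 3, 2, 0),
    datetime(2008, 3, 10, 2, 49, 0),
)
print(cmd)
```

The list can be handed to subprocess.run as-is, which avoids shell quoting problems with the spaces in the date range.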

I'll probably just hack together some sort of remote hg command runner that 
streams the output back to the CI server to take care of my problem.
Comment 12 Adrian Buehlmann 2008-03-10 06:36 UTC
I do have the impression that my "hg clone --changelog" idea (see above)
would be equivalent to
http://www.selenic.com/mercurial/wiki/index.cgi/PartialClone
for the boundary case where all files are ignored.

Just for reference.
Comment 14 Matt Mackall 2008-03-10 11:05 UTC
abuehl: What would that show? One changeset? Ten? 100?

Chris: hg id -r tip <URL> will tell you what the tipmost changeset is very
efficiently. When it changes, you have new changesets.
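
A polling loop built on that command might look like the sketch below. The function names are made up for illustration, the "run" parameter exists only so the command runner can be swapped out (for example in tests), and the code assumes the first whitespace-separated token of the output is the changeset hash.

```python
import subprocess


def remote_tip(url, run=subprocess.check_output):
    """Return the short hash printed by `hg id -r tip <URL>`."""
    out = run(["hg", "id", "-r", "tip", url])
    if isinstance(out, bytes):
        out = out.decode("ascii", "replace")
    tokens = out.split()
    return tokens[0] if tokens else ""


def has_new_changesets(last_seen, url, run=subprocess.check_output):
    """Compare the remote tip with the last one seen; return (changed, tip)."""
    tip = remote_tip(url, run)
    return tip != last_seen, tip


# Usage with a stand-in runner, so no network or hg binary is needed here.
def fake_run(argv):
    return b"82e55d328c8c\n"


print(has_new_changesets("0a04b987be5a", "http://localhost/hg/myrepo", fake_run))
```

A CI job could store the returned tip after each build and only pull when the first element of the result is true.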
Comment 15 Chris Turner 2008-03-10 20:22 UTC
Thanks for the tip. I'll probably use that to help with performance, but the 
constraints of the CI server are that it needs to know what the modifications 
are (including file adds, removes, etc). --template works like a dream for this, 
so that's what I'm using.
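
One possible shape for such a template, sketched under assumptions: the tab-separated layout and the helper names are made up for illustration, while {node|short}, {file_adds}, {file_dels} and {file_mods} are standard template keywords. Mercurial joins file names with spaces, so this simple parser would misread paths that contain spaces.

```python
# Raw string: hg itself expands the \t and \n escapes inside the template.
TEMPLATE = r"{node|short}\t{file_adds}\t{file_dels}\t{file_mods}\n"


def log_command(repo_path, template=TEMPLATE):
    """Argument list for `hg log` on a local repository with the template."""
    return ["hg", "log", "-R", repo_path, "--template", template]


def parse_file_changes(output):
    """Turn the templated log output into one dict per changeset."""
    changes = []
    for line in output.splitlines():
        if not line:
            continue
        node, added, removed, modified = line.split("\t")
        changes.append({
            "node": node,
            "added": added.split(),
            "removed": removed.split(),
            "modified": modified.split(),
        })
    return changes


# Hand-written sample output, for illustration only.
sample_output = "82e55d328c8c\tMakefile\t\t\n0a04b987be5a\thello.c\t\t\n"
for change in parse_file_changes(sample_output):
    print(change)
```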

I'll admit that I'm probably in the minority for this use case, so I'll just 
assume that this won't happen for the reasons you specified. No worries.
Comment 16 Adrian Buehlmann 2008-03-25 18:17 UTC
I made a patch for the raw templates described here:
http://selenic.com/pipermail/mercurial-devel/2008-March/005373.html
(in case somebody is willing to patch his Mercurial installation for this)
Comment 17 Jari Aalto 2011-04-24 07:21 UTC
FAQ 4.23. How can I do a "hg log" of a remote repository?

".. You can't. ... This is a very deliberate explicit design decision made 
by project leader Matt Mackall (mpm)"

- - -

Please reconsider. Many of the other VCSs offer this ability.

The log command is one of those few commands that is useful to run 
remotely, e.g. to see what has changed since X, or during X-Y. It's 
impractical to need to download the whole repository before getting access to 
this information, not to mention the quota, amount of downloaded data, 
and network bandwidth consumed when the reader is not a developer and does 
not have any other use for it on his disk.
Comment 18 Jari Aalto 2011-04-24 07:25 UTC
[restored: resolved]
Please excuse, wrong issue number.
Comment 19 Bugzilla 2012-05-12 08:48 UTC

--- Bug imported by bugzilla@serpentine.com 2012-05-12 08:48 EDT  ---

This bug was previously known as _bug_ 1025 at http://mercurial.selenic.com/bts/issue1025