[PATCH 4 of 4 full-series] obsolete: exchange obsolete markers using dedicated wireprotocol commands

Wed Jul 18 19:22:56 CDT 2012

On 19 Jul 2012, at 01:42, Matt Mackall wrote:

> On Wed, 2012-07-18 at 05:27 +0200, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david at logilab.fr>
>> # Date 1342580030 -7200
>> # Node ID f9b94324c38a82477702e8b83292aa3bc2e3c565
>> # Parent  227fb128b2be11ef030856b14390a069c9c4f94b
>> obsolete: exchange obsolete markers using dedicated wireprotocol commands
> 
> I've taken the first three, thanks. But this needs significantly more
> thought, I'm afraid I'm going to have to defer it.
> 
>> This changeset drops pushkey-based exchange in favor of dedicated wireprotocol
>> commands. Pushkey is not designed to exchange large amounts of data. For example,
>> the http implementation of pushkey uses http headers and just can't handle
> 
> handle what...?

It can't handle the exchange of obsolete markers, except in very small numbers.

> Yes, pushkey is indeed not designed to send large amounts of data. But
> it's also obviously not at all designed to send _gigantic values_
> attached to single keys. If you're running into the ~1k or ~100k http
> header limits, it sounds like you're abusing pushkey.

I fully admit that we are abusing pushkey.

> Pushkey will, however, happily process a practically unlimited number of
> small key/value pairs. With our wireproto batching support, this can
> even be moderately efficient.

We could use the precursor as the key and its markers as the value (the markers for that precursor only), but this would be very inefficient:
- We don't have batching now
- This means O(markers) transactions opened and closed at each push
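
To make that cost concrete, here is a minimal sketch of the per-precursor layout. The marker shape (a precursor node plus a tuple of successor nodes) and the function name are assumptions for illustration only, not Mercurial's actual marker format or API:

```python
from collections import defaultdict

def markers_by_precursor(markers):
    """Group hypothetical (precursor, successors) markers by precursor."""
    grouped = defaultdict(list)
    for precursor, successors in markers:
        grouped[precursor].append(successors)
    return dict(grouped)

# Hypothetical markers: (precursor node, tuple of successor nodes).
markers = [("a1", ("b1",)), ("a1", ("b2",)), ("c1", ("d1",))]
pairs = markers_by_precursor(markers)

# Without batching, each key is one pushkey call, hence one transaction.
print(len(pairs))
```

The number of keys grows with the number of distinct precursors, so the number of transactions per push grows with the number of markers.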

We could artificially split the markers into blocks small enough for http:
- dump1: 0-300th 
- dump2: 300-600th
- dump…
Is that your suggestion? This would work, but it would be even worse in terms of abusing pushkey.
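
For illustration, a minimal sketch of such a split. The block size of 300 follows the example above; the function name and the markers-as-a-list representation are assumptions, not existing Mercurial code:

```python
def split_markers(markers, block_size=300):
    """Split a list of markers into numbered blocks of at most block_size."""
    blocks = {}
    for start in range(0, len(markers), block_size):
        # Keys are numbered from 1: dump1, dump2, ...
        key = "dump%d" % (start // block_size + 1)
        blocks[key] = markers[start:start + block_size]
    return blocks

# 700 placeholder markers end up in three blocks.
blocks = split_markers(["marker%d" % i for i in range(700)])
print(sorted(blocks))
```

Each block would still be sent as its own key, so this caps the size of each value but not the number of separate pushes.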

At any rate, pushing obsolete markers as multiple keys means pushing obsolete markers in multiple transactions.
This sounds like a very bad idea given the headache we already have:
- unbundle is done in its own transaction
- phase pushes are each done in their own transaction
- obsolete markers are pushed in another transaction.

Trying to hook on the right event to do the right thing on incoming changes is already pretty complicated. Having even more of those independent transactions seems like a bad idea.

-- 
Pierre-Yves