encode/decode filter hooks (was: Fun stuff in tip)

Thu Sep 15 15:34:53 CDT 2005

On Thu, Sep 15, 2005 at 08:55:20AM -0400, Kevin Smith wrote:
> Matt Mackall wrote:
> >I've also finished up the file filtering code. This allows you to
> >specify arbitrary file filtering for checkin/checkout in hgrc, e.g.:
> >
> >[encode]
> >*.gz = gunzip
> >
> >[decode]
> >*.gz = gzip
> >
> >This can also be used to handle line ending issues via
> >dos2unix/unix2dos and expansion of variables à la CVS.
> 
> Sounds cool, but will need some good documentation.
> 
> If I understand your example above, any .gz files in the project would 
> be decompressed before being handled by hg, which would improve the 
> chances of delta storage saving space.
> 
> Fortunately, gzip is smart enough not to gzip or ungzip something twice. 
> Apparently dos2unix/unix2dos is as well. Not all tools will be, and 
> those tools pose a danger, especially when a new rule is added to an 
> existing repo. As long as the in-repo files are already in the encoded 
> format, it's ok. But if they are currently stored in the decoded format, 
> and the decoder can't detect that, they will end up getting double-decoded.
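
One way to guard a filter against the double-decoding problem described above is to have it inspect the data before acting. What follows is only a minimal sketch in Python, not anything hg provides: the helper names (is_gzipped, safe_gunzip, safe_gzip) are hypothetical, and it assumes the filter sees the whole file contents as bytes. It relies on the two-byte gzip magic number, so uncompressed data that happens to start with those bytes would be misdetected.

```python
import gzip

# Every gzip stream starts with these two bytes (the gzip magic number).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(data: bytes) -> bool:
    """Heuristic check: does the data start with the gzip magic number?"""
    return data[:2] == GZIP_MAGIC


def safe_gunzip(data: bytes) -> bytes:
    """Decompress only if the data looks gzipped; otherwise pass it through."""
    return gzip.decompress(data) if is_gzipped(data) else data


def safe_gzip(data: bytes) -> bytes:
    """Compress only if the data does not already look gzipped."""
    return data if is_gzipped(data) else gzip.compress(data)
```

With a guard like this, applying the filter to a file that is already in the target format leaves it unchanged, so adding a rule to an existing repo would not corrupt files stored in either form.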

> Is there any mechanism to do a one-time conversion to apply a new 
> [encode] rule to all the files in an existing repo?

Not as such, no.

> Also, is it true that these rules can only be applied via filename 
> pattern matching?

Yes.

> (As opposed to being detected by the 'type' or 'file' 
> command.) If so, then things like README will have to be filtered by their 
> specific names, which is ok. I assume these patterns can contain 
> directory names, which will help.

They can't contain ' ' or '='.
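
The restriction above is easy to check before a rule goes into hgrc. This is a small sketch in Python with a hypothetical helper name (is_valid_filter_pattern); it tests only the two characters mentioned here and assumes nothing else about how hg parses or matches patterns.

```python
# Characters that a filter pattern may not contain, per the restriction above.
FORBIDDEN_CHARS = (" ", "=")


def is_valid_filter_pattern(pattern: str) -> bool:
    """Return True if the pattern contains neither a space nor an '='."""
    return not any(ch in pattern for ch in FORBIDDEN_CHARS)


if __name__ == "__main__":
    for candidate in ("*.gz", "README", "my file.gz", "a=b.txt"):
        verdict = "ok" if is_valid_filter_pattern(candidate) else "rejected"
        print(candidate, "->", verdict)
```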

-- 
Mathematics is the supreme nostalgia of our time.