"hg archive" creates archive files with timestamps in GMT. "tar" archives seem to be extracted by the "tar" command with the local timezone offset of the environment where the archive files are EXTRACTED. But "zip" archives seem to be extracted without a local timezone offset: at least, "unzip" on Linux and Explorer on Windows 7 extract in that manner. For example, a zip file archiving the changeset committed at "2012-08-27 19:44 +0900" creates files with a "2012-08-27 10:44" timestamp. In one of the most common use cases, committing changesets, archiving files from one of them and extracting the archived files are all done in the same timezone. So files with GMT timestamps look strange in that case. Of course, each action (committing/archiving/extracting) may be done in a different timezone. But in that case, many users do not seem to mind the timestamps of the extracted files. So "zip" archives should be created not with GMT but with the local time of the environment where the archive files are CREATED. Or should this be selectable via the archive type?
- "zip" for localtime, and "gmtzip" for GMT, or
- "zip" for GMT, and "ltzip" for localtime
A slightly different perspective on this story: The zip format does not have any awareness of timezones in its timestamps. The timezone must thus somehow be handled 'manually' when creating or extracting zip files. The traditional wisdom has been to use the local timezone in both cases. The timezone setting on modern systems is however only a matter of how timestamps should be displayed locally. Timers and timestamps "always" use UTC. Using the local timezone setting when creating archives is not really an option - especially not in a VCS where we try hard to track data correctly and consistently and make it reproducible. The right way to handle zip archives is thus to use the UTC timezone when creating or extracting zips. Use something like "TZ=UTC unzip foo.zip" on unix. ... BUT it is a bit unfortunate that we have to claim that Mercurial is the only tool that uses the zip format correctly :-(
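To make the point about timezone-unaware zip timestamps concrete, here is a minimal Python sketch. It uses the commit time from the original report (2012-08-27 19:44 +0900) and shows the two naive field tuples an archiver could store for the same instant. The helper names are hypothetical, and the fixed +9 offset only stands in for "local time"; this is an illustration, not Mercurial's code.

```python
import calendar
import time
from datetime import datetime, timedelta, timezone

# Commit time from the report: 2012-08-27 19:44 +0900, i.e. 10:44 UTC.
epoch = calendar.timegm((2012, 8, 27, 10, 44, 0))


def zip_date_time_utc(ts):
    """Naive (Y, M, D, h, m, s) tuple in UTC (hypothetical helper)."""
    return tuple(time.gmtime(ts)[:6])


def zip_date_time_at_offset(ts, hours):
    """Naive tuple for a fixed UTC offset, standing in for 'local time'."""
    tz = timezone(timedelta(hours=hours))
    return tuple(datetime.fromtimestamp(ts, tz).timetuple()[:6])


utc_fields = zip_date_time_utc(epoch)           # what a UTC-writing tool stores
jst_fields = zip_date_time_at_offset(epoch, 9)  # what a local-time tool stores
```

Both tuples describe the same instant, but once written into a zip entry nothing records which convention was used, so an extractor can only guess.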
(In reply to comment #1) Thank you for your comments, kiilerix. Please let me confirm the cause of this problem:
- "tar" extracting implementations ignore TIMEZONE information, so extracted files have timestamps in GMT, and "ls -l" shows them with the local timezone offset, so users see the appropriate date and time
- some (or many?) "zip" extracting implementations handle the local timezone offset INCORRECTLY, so extracted files have timestamps that are not in GMT (and maybe not in localtime, either): this causes wrong dates and times on extracted files.
So "zip" extracting implementations seem to be responsible for this timestamp problem. But to many users extracting "zip" archives created by "hg archive", Mercurial appears to be responsible for it, even though the "zip" extracting implementations are in fact at fault. In addition, even if users understood that "zip" archives should be extracted with "TZ=UTC", there is no easy way to specify "TZ=UTC" when extracting from Explorer on Windows, is there? So, what about adding a new "lt.zip" archive type to create "zip" archives with timestamps not in GMT but in localtime? This seems suitable for one of the most common use cases: committing/archiving/extracting in the same timezone.
(In reply to comment #2)
> - "tar" extracting implementations ignore TIMEZONE information,
> so extracted files have timestamps in GMT, and "ls -l" shows them
> with the local timezone offset, so users see the appropriate
> date and time
I think it is important to get this right: tar, like most filesystems and systems, stores timestamps in UTC, not "ignoring" the timezone as a kind of error but acknowledging that timestamps must be stored in a globally unambiguous way and that the local timezone doesn't matter for this purpose. (btw: Mercurial commits will in addition to this timestamp also keep a record of which timezone the user was using.) The timestamp situation might be more complex on Windows. And yes, it is a fact that zip just is different and doesn't have the means to do the right thing. Mercurial might have to adapt to that somehow.
I'm going to set this to WONTFIX. As it happens, unzip(1) on Unix systems does the right thing today. So any change we make here to fix Windows by default will break Unix. That would be a regression, which would make this change a net step backwards. Also note that there is almost certainly some set of Windows archive utilities that also gets this right. Judging from web searches, WinZip seems to be one of them; 7-Zip seems not to be. Which leaves us with adding some form of command line option that will probably create more confusion than it solves ('this option will help only if you are creating zip files that are only going to be extracted on Windows only in the same time zone you're currently in only with broken extractors').
My testing showed that unzip(1) didn't do the right thing. Further testing shows that we are both right.
For hg archives the timestamp does depend on the timezone:
  $ hg archive hg-x.zip
  $ rm -rf x; TZ=UTC unzip -q hg-x.zip ; ls -l x/foo
  -rw-r--r--. 1 mk mk 4 Aug 30 00:44 x/foo
  $ rm -rf x; TZ=UTC-1 unzip -q hg-x.zip ; ls -l x/foo
  -rw-r--r--. 1 mk mk 4 Aug 29 23:44 x/foo
For zip(1) archives the timezone doesn't matter:
  $ zip -rq zip-x.zip x
  $ rm -rf x; TZ=UTC unzip -q zip-x.zip ; ls -l x/foo
  -rw-r--r--. 1 mk mk 4 Aug 29 23:44 x/foo
  $ rm -rf x; TZ=UTC-1 unzip -q zip-x.zip ; ls -l x/foo
  -rw-r--r--. 1 mk mk 4 Aug 29 23:44 x/foo
But it does depend on the timezone if we don't use extra file attributes:
  $ zip -rqX zip-x.zip x
  $ rm -rf x; TZ=UTC unzip -q zip-x.zip ; ls -l x/foo
  -rw-r--r--. 1 mk mk 4 Aug 30 2012 x/foo
  $ rm -rf x; TZ=UTC-1 unzip -q zip-x.zip ; ls -l x/foo
  -rw-r--r--. 1 mk mk 4 Aug 30 00:44 x/foo
  $ rm -rf x; TZ=UTC-3 unzip -q zip-x.zip ; ls -l x/foo
  -rw-r--r--. 1 mk mk 4 Aug 29 22:44 x/foo
It seems like the situation could be improved by somehow using extra file attributes when Mercurial creates zips.
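One way to check whether a given archive carries such an extra attribute is to walk the raw extra field of each entry. The sketch below assumes the attribute in question is the Info-ZIP "extended timestamp" block (header ID 0x5455), which is how it is identified later in this thread; the function name is hypothetical.

```python
import struct

EXTENDED_TIMESTAMP_ID = 0x5455  # assumed: Info-ZIP "UT" extended timestamp block


def find_extended_mtime(extra):
    """Return the UTC mtime stored in a zip extra field, or None if absent."""
    pos = 0
    while pos + 4 <= len(extra):
        # Each block starts with a 2-byte ID and a 2-byte data size.
        header_id, size = struct.unpack_from('<HH', extra, pos)
        data = extra[pos + 4:pos + 4 + size]
        if header_id == EXTENDED_TIMESTAMP_ID and len(data) >= 5:
            flags, mtime = struct.unpack_from('<Bl', data, 0)
            if flags & 1:  # bit 0: modification time is present
                return mtime
        pos += 4 + size
    return None
```

It can be applied to every `info.extra` from `zipfile.ZipFile(path).infolist()`. Entries for which it returns None would be expected to behave like the `zip -X` case above, where the extracted time shifts with TZ.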
I can also confirm that a zip archive file with the extended attribute (created by zip on Unix) can be extracted with the expected timestamp by unzip on Unix and by Explorer on Windows. So adding the extended attribute seems to resolve this problem. But from a quick look at the source of the Python zipfile module (Python 2.6), there seems to be no way to add or record an extended file attribute in a zip archive file with it. Am I overlooking something?
(In reply to comment #6)
> But according to my quick looking at Python zipfile module
> source (of Python 2.6), there is no way to add/record extended
> file attribute to zip archive file in it.
I think you are right. It should probably be investigated/reported/discussed upstream. There might be a workaround or monkey patch that could make it work and make it feasible to fix this issue.
Reopening.
Created attachment 1693: Adding extended timestamp extra field to solve
(In reply to comment #6)
@Jun: could you post your patch on the mercurial-devel mailing list using the patchbomb extension? Patches cannot be reviewed on the bug tracker. See http://mercurial.selenic.com/wiki/ContributingChanges
(In reply to comment #6)
> I can also confirm that zip archive file with extended attribute
> (created by zip on Unix) can be extracted with expected timestamp
> by unzip on Unix and Explorer on Windows.
We can use `ZipInfo.extra` to create the attribute. http://docs.python.org/library/zipfile.html#zipfile.ZipInfo.extra
I think that the attribute is the "Extended Timestamp Extra Field". http://www.opensource.apple.com/source/zip/zip-6/unzip/unzip/proginfo/extra.fld
I confirmed that the attached patch, http://bz.selenic.com/attachment.cgi?id=1693, works well for me with UnZip 5.52 on CentOS 5 and 7-zip 9.20 on Windows XP.
BTW,
> Add an attachment (do not attach patches, please!)
Sorry about attaching the patch....
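The `ZipInfo.extra` approach can be sketched roughly as follows. This is a minimal illustration written against the field layout in extra.fld (2-byte ID 0x5455, 2-byte size, 1 flag byte, 4-byte modification time), not the attached patch itself; `add_utc_entry` and the sample entry name and contents are hypothetical.

```python
import calendar
import io
import struct
import time
import zipfile


def add_utc_entry(zf, name, data, mtime):
    """Write one entry whose extra field records mtime as UTC epoch seconds."""
    # The classic DOS-style fields still get a naive UTC tuple.
    info = zipfile.ZipInfo(name, time.gmtime(mtime)[:6])
    info.compress_type = zipfile.ZIP_DEFLATED
    info.extra += struct.pack('<HHBl',
                              0x5455,      # block ID: extended timestamp
                              1 + 4,       # data size: flags + mtime
                              1,           # flags: modification time present
                              int(mtime))  # modification time (UTC seconds)
    zf.writestr(info, data)


mtime = calendar.timegm((2012, 8, 27, 10, 44, 0))
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    add_utc_entry(zf, 'x/foo', b'foo\n', mtime)

# Read the entry back to confirm the extra block survived the round trip.
with zipfile.ZipFile(buf) as zf:
    stored = zf.getinfo('x/foo')
```

Extractors that understand the block can then take the modification time from it rather than interpreting the naive fields in the local timezone.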
Fixed by http://selenic.com/repo/hg/rev/133d13e44544
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
archival: add "extended-timestamp" extra block for zip archives (issue3600)
    Before this patch, zip archives created by "hg archive" are extracted with unexpected timestamp, if TZ is not configured as GMT.
    This patch adds "extended-timestamp" extra block to zip archives, and unzip will extract such archives with timestamp specified in added extra block, even though TZ is not configured as GMT.
    Please see documents below for detail about specification of zip file format and "extended-timestamp" extra block:
    http://www.pkware.com/documents/casestudies/APPNOTE.TXT
    http://www.opensource.apple.com/source/zip/zip-6/unzip/unzip/proginfo/extra.fld
    Original implementation of this patch was suggested by "Jun Omae <jun66j5@gmail.com>".
(please test the fix)