[Gsoc - 2016] Allow largefiles to be at a different location

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Sun Mar 6 10:51:34 EST 2016


At Fri, 04 Mar 2016 22:08:38 -0500,
Matt Harbison wrote:
> 
> On Fri, 04 Mar 2016 06:10:28 -0500, Piotr Listkiewicz  
> <piotr.listkiewicz at gmail.com> wrote:
> 
> >>
> >> I'm not sure if there's a lot of teaching value in this, but maybe
> >> consider replacing some of the os.path.* code in largefiles with the vfs
> >> layer.  There might be enough there that you can see how it looks up
> >> files in the store, and so forth.  If nothing else, it will give you
> >> something to do when looking at the code.
> >>     https://www.mercurial-scm.org/wiki/WindowsUTF8Plan
> >
> >
> > I would like to do it, but I need guidance.
> >
> > The function lfutil.storepath() (
> > https://selenic.com/hg/file/e00e57d83653/hgext/largefiles/lfutil.py#l175)
> > returns the path to the largefile directory in .hg, and it is used nearly
> > everywhere else as a base path.
> 
> I'm not sure I follow.  A quick reading of the code looks like the callers  
> use this as-is: an absolute path.

AFAIK, this is just for historical reasons.

When largefiles was first released as a bundled extension, the "vfs"
mechanism had not yet been introduced, and "opener" was used only to
open files for reading and writing.

Therefore, many code paths compose an absolute path themselves in
order to use file APIs other than read/write.

BTW, "lfutil.storepath()" is used in many code paths, and refactoring
it requires knowledge of largefiles, IMHO. If you are trying the
WindowsUTF8Plan work in order to understand largefiles, a narrowly
scoped change is easier at first. For example:

  - replace invocations of os.path.* on repo.wjoin()-ed paths with vfs methods
  - replace lfutil.hashfile() invocations with lfutil.hashrepofile() where possible
  - make getexecutable() use vfs for its file API
    (and make its callers pass it a path that is not repo.wjoin()-ed)
  - make updatestandin() work entirely with repo-relative paths
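
To illustrate the first item, here is a minimal sketch of the
before/after shape of such a change. It does not import Mercurial:
"MiniVfs" is a hypothetical stand-in that only mimics the join(),
exists() and read() methods of a vfs object, and the helper names
standin_exists_old()/standin_exists_new() are made up for this example.

```python
import os
import tempfile


class MiniVfs(object):
    """Hypothetical stand-in for a vfs object; NOT Mercurial's class."""

    def __init__(self, base):
        self.base = base

    def join(self, *paths):
        # Prefix the repo-relative components with the base directory.
        return os.path.join(self.base, *paths)

    def exists(self, path):
        return os.path.exists(self.join(path))

    def read(self, path):
        with open(self.join(path), 'rb') as fp:
            return fp.read()


def standin_exists_old(root, relpath):
    # Old style: compose an absolute path (as repo.wjoin() would),
    # then call os.path.* on it directly.
    return os.path.exists(os.path.join(root, relpath))


def standin_exists_new(wvfs, relpath):
    # New style: pass the repo-relative path to the vfs object and
    # let it resolve the location itself.
    return wvfs.exists(relpath)


def demo():
    # Build a throwaway "working directory" containing one standin.
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, '.hglf'))
    with open(os.path.join(root, '.hglf', 'foo'), 'wb') as fp:
        fp.write(b'0123\n')
    wvfs = MiniVfs(root)
    return (standin_exists_old(root, '.hglf/foo'),
            standin_exists_new(wvfs, '.hglf/foo'),
            wvfs.read('.hglf/foo'))
```

The point of the new style is that callers keep working with
repo-relative paths only, so path handling stays in one place.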


> > This method could be refactored to return a vfs object, which would be
> > used instead of invoking, for example, util.makedirs, but I have no idea
> > how I should do it properly.
> 
> I'm a bit unclear on the plan for this.  The wiki says the plan is to get  
> rid of util.* on repo _relative paths_.  Somewhere in the last few years,  
> I got the impression from the mailing list that util.* methods were fine  
> (I can't find a citation for this).  If this future unicode layer is to be  
> used unconditionally on Windows, I'm not sure why the appropriate util.*  
> methods can't be replaced, similar to how util is assigned methods from  
> posix.py or windows.py.
> 
> I don't recall exactly why I didn't use a vfs object here.  But it may  
> have been because you may need a vfs object relative to this repo's  
> .hg/largefiles, or the share source's cache, or the standalone user cache  
> directory.  I wasn't sure if it was good to have more (rarely used) vfs  
> fields in localrepo for sharing and largefiles, so I guess I punted.
> 
> I think foozy did a bunch of vfs stuff in the last year or so, so I Cc'd  
> him.
> 
> > (I also don't understand why this function
> > returns repo.vfs.reljoin(repo.sharedpath, longname, hash)
> > or repo.join(longname, hash); I just don't get the difference.)
> 
> repo.join() will give you an absolute path, after you give it repo
> relative path components: "/path/to/repo/.hg/$longname/$hash".
> 
> repo.sharedpath is already an absolute path (to the .hg of the other local
> repo that is the source of the sharing).  reljoin() just adds the given
> parts together, without the implicit '/path/to/repo/.hg' prefix.  So you
> end up with "/path/to/shareparent/.hg/$longname/$hash".
> 
> The other thing to pay attention to as you read the vfs code is that  
> repo.vfs is relative to "/path/to/repo/.hg" (repo data), and repo.wvfs is  
> relative to "/path/to/repo" (working directory).
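
The difference between the two joins can be sketched without Mercurial
itself. The functions below are hypothetical stand-ins, not the real
vfs methods: vfsjoin() mimics a join that always prefixes the vfs base
(taken here to be "/path/to/repo/.hg", as for repo.vfs), and reljoin()
mimics a join of the given parts only. The value 'largefiles' for
longname and the hash 'abc' are assumptions for illustration.

```python
import posixpath


def vfsjoin(base, *parts):
    # Stand-in for a vfs-style join: the vfs base is always prefixed.
    return posixpath.join(base, *parts)


def reljoin(*parts):
    # Stand-in for reljoin(): joins the given parts, no implicit prefix.
    return posixpath.join(*parts)


def storepaths(hash):
    hgdir = '/path/to/repo/.hg'              # base of the repo's own vfs
    sharedpath = '/path/to/shareparent/.hg'  # already absolute
    longname = 'largefiles'                  # assumed value of longname
    local = vfsjoin(hgdir, longname, hash)
    shared = reljoin(sharedpath, longname, hash)
    return local, shared
```

With a shared repo, the second form is needed because the first part
is already absolute and must not get this repo's prefix.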
> 
> > I would appreciate any hints or guidance.
> >
> > This is more advanced, but would be nice to have in some form:
> >>     https://bz.mercurial-scm.org/show_bug.cgi?id=4242
> >> (My recollection of this is that it wants to verify the files upstream
> >> on the default path instead of touching anything locally.  I think I end
> >> up using '--config paths.default=' to force it to verify locally.)
> >
> >
> > I sent a patch for it; can you take a look and do a code review?
> 
> Sorry, I was too busy to get to it earlier.
> 
> Another thing I ran into today testing that patch is that it keeps  
> prompting for a password when I verify against a password protected https  
> server.  Whatever it is doing, it isn't reusing the same connection for  
> each file it fetches.  Usually it isn't an issue for me, but the Windows  
> source install doesn't know how to access the keyring extension shipped  
> with thg.  That sort of connection management might be a good thing to  
> understand for what you want to do.
> 
> >
> >> Is there some specific area you are wondering about?  Maybe look for
> >> commits that start with 'largefiles:'.  I fixed a series of bugs about a
> >> year ago or so, and tried to leave enough in the commit comment to
> >> explain what was wrong or how something works.
> >
> >
> > Now I'm trying to understand how largefiles works in the big picture (and
> > its inner workings), so I have no specific area at the moment.
> >
> > 2016-03-01 4:05 GMT+01:00 Matt Harbison <mharbison72 at gmail.com>:
> >
> >> On Mon, 29 Feb 2016 09:57:21 -0500, Piotr Listkiewicz <
> >> piotr.listkiewicz at gmail.com> wrote:
> >>
> >> Hello,
> >>> I am Piotr Listkiewicz (nicknamed liscju), Computer Science student  
> >>> from
> >>> Cracow in Poland, maybe some of you remembers me from 3.6 Sprint in
> >>> London.
> >>>
> >>> I am interested in working on "Allow largefiles to be at a different
> >>> location" project, but i need guidance.
> >>>
> >>> First of all ,are there any easy bugs that you would recommend for the
> >>> newcomer for largefile extension for familiarizing myself with the  
> >>> source
> >>> code?
> >>>
> >>
> >> I don't think there are any easy bugs left.  The wrapping done by the
> >> extension can make things surprisingly complicated.  There are a few
> >> additional archived (hidden) largefile bugs on bz, but I'm not sure they
> >> are easy either.
> >>
> >> I'm not sure if there's a lot of teaching value in this, but maybe
> >> consider replacing some of the os.path.* code in largefiles with the vfs
> >> layer.  There might be enough there that you can see how it looks up  
> >> files
> >> in the store, and so forth.  If nothing else, it will give you  
> >> something to
> >> do when looking at the code.
> >>
> >>     https://www.mercurial-scm.org/wiki/WindowsUTF8Plan
> >>
> >> This is more advanced, but would be nice to have in some form:
> >>
> >>     https://bz.mercurial-scm.org/show_bug.cgi?id=4242
> >>
> >> (My recollection of this is that it wants to verify the files upstream  
> >> on
> >> the default path instead of touching anything locally.  I think I end up
> >> using '--config paths.default=' to force it to verify locally.)
> >>
> >> Secondly, are there any other documents not mentioned at
> >>> https://www.mercurial-scm.org/wiki/SummerOfCode/Ideas2016 that would be
> >>> helpful for familiarizing myself with largefile and project subject in
> >>> general?
> >>>
> >>
> >> The wiki is pretty good at a high level:
> >>
> >>     https://www.mercurial-scm.org/wiki/LargefilesExtension
> >>
> >> The "magic" of this extension is mostly that it patches up the matcher
> >> object and hands off to core Mercurial to do most of the work, in most
> >> situations.  e.g. if the user does `hg add --large foo`, the largefiles
> >> add() code changes the matcher to contain '.hglf/foo', remove 'foo', and
> >> then passes it into the core add() function.  (All without touching  
> >> normal
> >> file references, if any, of course.)
> >>
> >> Is there some specific area you are wondering about?  Maybe look for
> >> commits that start with 'largefiles:'.  I fixed a series of bugs about a
> >> year ago or so, and tried to leave enough in the commit comment to  
> >> explain
> >> what was wrong or how something works.
> >>
> >> Unfortunately, I don't know anything about the wire protocol to help you
> >> there.
> >>
> >>
> >> I would be interested in any piece of advice what should i do, how to  
> >> start
> >>> working on the project and all relevant information as well.
> >>>
> 

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp

