[PATCH 1 of 5] util: add an LRU cache dict
Greg Ward
greg at gerg.ca
Fri Feb 8 16:38:43 CST 2013
On 08 February 2013, Siddharth Agarwal said:
> # HG changeset patch
> # User Siddharth Agarwal <sid0 at fb.com>
> # Date 1360351263 0
> # Node ID 642bf4f92100a41182de15646e37db70b391bfeb
> # Parent 08e00496e7b3bda8db3fbe7084a013d77e4932d2
> util: add an LRU cache dict
>
> This is the bare minimum dictionary implementation needed for an upcoming
> patch.
Tests? This should be doc-test-able.
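For example, a doctest-able version might look something like this (just a sketch, not the patch as posted -- note that _maxsize is an int here, and the names are my own):

```python
from collections import deque

class lrucachedict(object):
    '''Cache the most recent gets/sets to this dictionary.

    >>> d = lrucachedict(2)
    >>> d['a'] = 'va'
    >>> d['b'] = 'vb'
    >>> d['a']
    'va'
    >>> d['c'] = 'vc'      # cache is full: evicts 'b', the LRU key
    >>> 'b' in d._cache
    False
    >>> d['a']
    'va'
    '''
    def __init__(self, maxsize):
        self._cache = {}
        self._maxsize = maxsize       # an int, not a dict
        self._order = deque()         # keys, least recently used first

    def __getitem__(self, key):
        self._order.remove(key)       # O(n) scan of the deque
        self._order.append(key)
        return self._cache[key]

    def __setitem__(self, key, value):
        if key not in self._cache:
            if len(self._cache) >= self._maxsize:
                # evict the least recently used key
                del self._cache[self._order.popleft()]
        else:
            self._order.remove(key)
        self._cache[key] = value
        self._order.append(key)
```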
> diff --git a/mercurial/util.py b/mercurial/util.py
> --- a/mercurial/util.py
> +++ b/mercurial/util.py
> @@ -211,6 +211,30 @@ except AttributeError:
> del self[i]
> break
>
> +class lrucachedict(object):
> + '''cache most recent get or sets to this dictionary'''
> + def __init__(self, size):
> + self._cache = {}
> + self._size = {}
It's a dict...
> + self._order = deque()
> +
> + def __getitem__(self, key):
> + self._order.remove(key)
> + self._order.append(key)
> + return self._cache[key]
> +
> + def __setitem__(self, key, value):
> + if key not in self._cache:
> + if len(self._cache) > self._size:
...but you're comparing it to an int???
Also, wouldn't _maxsize be a better name?
Is deque the right data structure here? I recently implemented an
LRU cache using heapq to record the "fetch time" of each entry.
__getitem__() incremented the age of the whole cache, counted in
fetches, and set the fetch time of the fetched key to 0. So with every
fetch, every other key becomes a bit older. I *think* that adds O(1)
overhead to __getitem__().
Then __setitem__() can look up the oldest key in O(lg N) time when the
cache is full.
How does that compare to deque? What about memory use?
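Roughly, what I have in mind with heapq looks like this (a sketch only; the names are mine, and instead of literally resetting ages to 0 it records a monotonically increasing fetch time per key and lazily discards stale heap entries, which is equivalent):

```python
import heapq

class heaplrucache(object):
    '''LRU cache sketch backed by heapq with lazy deletion.

    Every get/set stamps the key with a fresh fetch time and pushes
    (time, key) onto a heap; entries whose recorded time no longer
    matches the key's current time are stale and skipped on eviction.
    '''
    def __init__(self, maxsize):
        self._maxsize = maxsize
        self._cache = {}              # key -> value
        self._time = {}               # key -> last fetch time
        self._heap = []               # (fetch time, key); may hold stale entries
        self._clock = 0

    def _touch(self, key):
        self._clock += 1
        self._time[key] = self._clock
        heapq.heappush(self._heap, (self._clock, key))

    def __getitem__(self, key):
        value = self._cache[key]
        self._touch(key)
        return value

    def __setitem__(self, key, value):
        if key not in self._cache and len(self._cache) >= self._maxsize:
            # pop stale heap entries until we hit the true LRU key
            while True:
                t, old = heapq.heappop(self._heap)
                if self._time.get(old) == t:
                    del self._cache[old]
                    del self._time[old]
                    break
        self._cache[key] = value
        self._touch(key)
```

The trade-off: __getitem__() drops from O(n) (deque.remove) to O(lg N)
(heappush), but the heap keeps stale entries around until eviction, so
memory can grow with the number of fetches between evictions.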
Greg
--
Greg Ward http://www.gerg.ca
<greg at gerg.ca> @gergdotca
More information about the Mercurial-devel
mailing list