Keyed Collection in Python?

Is there any equivalent to KeyedCollection in Python, i.e. a set where the elements have (or dynamically generate) their own keys?
The goal here is to avoid storing the key in two places, and therefore dictionaries are less than ideal (hence the question).

You can simulate that very easily:
class KeyedObject(object):
    def get_key(self):
        raise NotImplementedError("You must subclass this before you can use it.")
class KeyedDict(dict):
    def append(self, obj):
        # the object supplies its own key
        self[obj.get_key()] = obj
Now you can use a KeyedDict instead of a dict with subclasses of KeyedObject (where get_key returns a valid key based on some object property).
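As a usage sketch of the two classes above: `User` and its `name`/`email` attributes are hypothetical, invented here only to show a subclass deriving its key from one of its own properties.

```python
class KeyedObject(object):
    def get_key(self):
        raise NotImplementedError("You must subclass this before you can use it.")


class KeyedDict(dict):
    def append(self, obj):
        self[obj.get_key()] = obj


# Hypothetical element type: the key is taken from the `name` attribute.
class User(KeyedObject):
    def __init__(self, name, email):
        self.name = name
        self.email = email

    def get_key(self):
        return self.name


users = KeyedDict()
users.append(User("alice", "alice@example.com"))
users.append(User("bob", "bob@example.com"))

print(users["alice"].email)  # alice@example.com
print(sorted(users))         # ['alice', 'bob']
```

Note that the dict still stores a reference to each key next to the object, and nothing stops a caller from writing `users["wrong"] = obj` directly.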

Given your constraints, everyone trying to implement what you're looking for using a dict is barking up the wrong tree. Instead, you should write a list subclass that overrides __getitem__ to provide the behavior you want. I've written it so it tries to get the desired item by index first, then falls back to searching for the item by the key attribute of the contained objects. (This could be a property if the object needs to determine this dynamically.)
There's no way to avoid a linear search if you don't want to duplicate something somewhere; I am sure the C# implementation does exactly the same thing if you don't allow it to use a dictionary to store the keys.
class KeyedCollection(list):
    def __getitem__(self, key):
        if isinstance(key, (int, slice)):
            # list.__getitem__ is called unbound here, so self must be passed explicitly
            return list.__getitem__(self, key)
        for item in self:
            if getattr(item, "key", 0) == key:
                return item
        raise KeyError('item with key `%s` not found' % key)
You would probably also want to override __contains__ in a similar manner so you could say if "key" in kc.... If you want to make it even more like a dict, you could also implement keys() and so on. They will be equally inefficient, but you will have an API like a dict, that also works like a list.
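A minimal sketch of the `__contains__` and `keys()` additions described above, assuming the contained objects expose a `key` attribute. `Track` and the `_MISSING` sentinel are hypothetical names introduced for the example.

```python
_MISSING = object()  # sentinel: an item without a `key` attribute never matches


class KeyedCollection(list):
    def __getitem__(self, key):
        if isinstance(key, (int, slice)):
            return list.__getitem__(self, key)
        for item in self:
            if getattr(item, "key", _MISSING) == key:
                return item
        raise KeyError("item with key `%s` not found" % key)

    def __contains__(self, key):
        # Same linear scan as __getitem__. This replaces list's usual
        # "is this object in the list" test with a test on keys.
        return any(getattr(item, "key", _MISSING) == key for item in self)

    def keys(self):
        return [item.key for item in self if hasattr(item, "key")]


class Track:  # hypothetical item type exposing a `key` attribute
    def __init__(self, key, title):
        self.key = key
        self.title = title


kc = KeyedCollection([Track("t1", "Intro"), Track("t2", "Outro")])
print(kc[0].title)     # Intro
print(kc["t2"].title)  # Outro
print("t1" in kc)      # True
print(kc.keys())       # ['t1', 't2']
```

Because integers are always treated as positions, this approach only suits collections whose keys are not integers.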

@Mehrdad said:
Because semantically, it doesn't make as much sense. When an object
knows its key, it doesn't make sense to put it in a dictionary -- it's
not a key-value pair. It's more of a semantic issue than anything
else.
With this constraint, there is nothing in Python that does what you want. I suggest you use a dict and not worry about this level of detail on the semantics. @Gabi Purcaru's answer shows how you can create an object with the interface you want. Why get bothered about how it's working internally?
It could be that C#'s KeyedCollection is doing the same thing under the covers: asking the object for its key and then storing the key for fast access. In fact, from the docs:
By default, the KeyedCollection(Of TKey, TItem) includes a lookup
dictionary that you can obtain with the Dictionary property. When an
item is added to the KeyedCollection(Of TKey, TItem), the item's key
is extracted once and saved in the lookup dictionary for faster
searches. This behavior is overridden by specifying a dictionary
creation threshold when you create the KeyedCollection(Of TKey,
TItem). The lookup dictionary is created the first time the number of
elements exceeds that threshold. If you specify –1 as the threshold,
the lookup dictionary is never created.
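The behaviour quoted above (extract the key once on insertion, keep it in a lookup dictionary) can be approximated in Python roughly as follows. This is a sketch, not a port of the .NET class: the `key_func` parameter and the `Employee` type are assumptions made for the example.

```python
from operator import attrgetter


class KeyedCollection:
    """Ordered items plus a lookup dict filled from each item's own key."""

    def __init__(self, key_func, items=()):
        self._key_func = key_func  # how to extract a key from an item
        self._items = []           # preserves insertion order
        self._index = {}           # key -> item, for fast lookup
        for item in items:
            self.add(item)

    def add(self, item):
        key = self._key_func(item)  # extracted once, at insertion time
        if key in self._index:
            raise ValueError("duplicate key: %r" % (key,))
        self._items.append(item)
        self._index[key] = item

    def __getitem__(self, key):
        return self._index[key]

    def __contains__(self, key):
        return key in self._index

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


class Employee:  # hypothetical item type
    def __init__(self, badge, name):
        self.badge = badge
        self.name = name


staff = KeyedCollection(attrgetter("badge"))
staff.add(Employee(17, "Ada"))
staff.add(Employee(42, "Grace"))

print(staff[42].name)           # Grace
print(17 in staff)              # True
print([e.name for e in staff])  # ['Ada', 'Grace']
```

The caller never passes a key, so it is written in only one place in user code. The trade-off is that the index goes stale if an item's key attribute is changed after insertion.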

I'm not much of a C#'er, but I think dictionaries are what you need.
http://docs.python.org/tutorial/datastructures.html#dictionaries
http://docs.python.org/tutorial/datastructures.html
Or maybe lists:
http://docs.python.org/library/functions.html#list

Why not simply use a dict? If the key already exists, a reference to the key will be used in the dict; it won't be senselessly duplicated.
class MyExample(object):
    def __init__(self, key, value):
        self.key = key
        self.value = value
m = MyExample("foo", "bar")
d = {}
d[m.key] = m
first_key = next(iter(d)) # in Python 2 this was d.keys()[0]
first_key is m.key # returns True
If the key doesn't already exist, a copy of it will be saved, but I don't see that as a problem.
def lame_hash(s):
    h = 0
    for ch in s:
        h ^= ord(ch)
    return h
d = {}
d[lame_hash(m.key)] = m
print(d) # key value is 102 which is stored in the dict
lame_hash(m.key) in d # returns True

I'm not sure if this is what you meant, but this dictionary will create its own keys as you add to it...
class KeyedCollection(dict):
    def __init__(self):
        dict.__init__(self)
        self.current_key = 0
    def add(self, item):
        self[self.current_key] = item
        self.current_key += 1 # advance so the next item gets a fresh key
abc = KeyedCollection()
abc.add('bob')
abc.add('jane')
>>> abc
{0: 'bob', 1: 'jane'}

How about a set()? The elements can have their own k

To go into a little more detail than the already correct answer from @Gabi Purcaru, here is a class that does the same as Gabi's but also checks that the key and the value have the correct type (like the TKey and TItem of the .NET KeyedCollection).
import abc
from collections.abc import MutableMapping
class KeyedCollection(MutableMapping):
    """
    Provides the abstract base class for a collection (:class:`MutableMapping`) whose keys are embedded in the values.
    """
    # In Python 3, MutableMapping already has abc.ABCMeta as its metaclass;
    # the Python 2 spelling was `__metaclass__ = abc.ABCMeta`.
    _dict = None # type: dict
    def __init__(self, seq=()):
        self._dict = dict(seq)
    @abc.abstractmethod
    def __is_type_key_correct__(self, key):
        """
        Returns: True if `key` has the type expected for keys in the collection
        """
        pass
    @abc.abstractmethod
    def __is_type_value_correct__(self, value):
        """
        Returns: True if `value` has the type expected for values in the collection
        """
        pass
    @abc.abstractmethod
    def get_key_for_item(self, value):
        """
        When implemented in a derived class, extracts the key from the specified element.
        Args:
            value: the element from which to extract the key (checked by :meth:`__is_type_value_correct__`)
        Returns: The key of the specified element (checked by :meth:`__is_type_key_correct__`)
        """
        pass
    def __assert_type_key(self, key, arg_name='key'):
        if not self.__is_type_key_correct__(key):
            raise ValueError("{} type is not correct".format(arg_name))
    def __assert_type_value(self, value, arg_name='value'):
        if not self.__is_type_value_correct__(value):
            raise ValueError("{} type is not correct".format(arg_name))
    def add(self, value):
        """
        Adds an object to the KeyedCollection.
        Args:
            value: The object to be added to the KeyedCollection (checked by :meth:`__is_type_value_correct__`).
        """
        key = self.get_key_for_item(value)
        self._dict[key] = value
    # Implements abstract method __setitem__ from MutableMapping parent class
    def __setitem__(self, key, value):
        self.__assert_type_key(key)
        self.__assert_type_value(value)
        # use the collection's own key extractor rather than assuming value.get_key() exists
        if self.get_key_for_item(value) != key:
            raise ValueError("provided key does not correspond to the given KeyedObject value")
        self._dict[key] = value
    # Implements abstract method __delitem__ from MutableMapping parent class
    def __delitem__(self, key):
        self.__assert_type_key(key)
        self._dict.pop(key)
    # Implements abstract method __getitem__ from MutableMapping parent class (Mapping base class)
    def __getitem__(self, key):
        self.__assert_type_key(key)
        return self._dict[key]
    # Implements abstract method __len__ from MutableMapping parent class (Sized mixin on Mapping base class)
    def __len__(self):
        return len(self._dict)
    # Implements abstract method __iter__ from MutableMapping parent class (Iterable mixin on Mapping base class)
    def __iter__(self):
        return iter(self._dict)
    # Overrides the __contains__ mixin method from MutableMapping parent class (Container mixin on Mapping base class)
    def __contains__(self, x):
        self.__assert_type_key(x, 'x')
        return x in self._dict

Related

How to implement a secondary custom method for object slicing, other than getitem in Python

I am looking to implement a custom method in my class which helps users slice based on index. The primary slicing will be based on dictionary key. I want to implement it similar to how Pandas does it, using df.iloc[n]
here's my code:
class Vector:
    def __init__(self, map_object: dict):
        self.dictionary = map_object
    def __getitem__(self, key):
        data = self.dictionary[key]
        return data
    def iloc(self, n):
        key = list(self.dictionary)[n]
        return self.dictionary[key]
However, if I then write object.iloc[3] after creating the object, I get an error saying 'method' object is not subscriptable. So how can I implement this?

The [ ] syntax requires a proper object with a __getitem__ method. In order to have a "slice method", use a property that returns a helper which supports slicing.
The helper simply holds a reference to the actual parent object, and defines a __getitem__ with the desired behaviour:
class VectorIloc:
    def __init__(self, parent):
        self.parent = parent
    # custom logic for desired "iloc" behaviour
    def __getitem__(self, item):
        key = list(self.parent.dictionary)[item]
        return self.parent[key]
On the actual class, merely define the desired "method" as a property that returns the helper or as an attribute:
class Vector:
    def __init__(self, map_object: dict):
        self.dictionary = map_object
        # if .iloc is used often
        # self.iloc = VectorIloc(self)
    def __getitem__(self, key):
        return self.dictionary[key]
    # if .iloc is used rarely
    @property
    def iloc(self):
        return VectorIloc(self)
Whether to use a property or an attribute is an optimisation that trades memory for performance: an attribute constructs and stores the helper always, while a property constructs it only on-demand but on each access. A functools.cached_property can be used as a middle-ground, creating the attribute on first access.
The property is advantageous when the helper is used rarely per object, and especially if it often is not used at all.
Now, when calling vector.iloc[3], the vector.iloc part provides the helper and the [3] part invokes the helper's __getitem__.
>>> vector = Vector({0:0, 1: 1, 2: 2, "three": 3})
>>> vector.iloc[3]
3
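The functools.cached_property middle ground mentioned above could look like the following sketch (Python 3.8+). The class names are the ones from this answer; only the decorator differs.

```python
from functools import cached_property  # available since Python 3.8


class VectorIloc:
    def __init__(self, parent):
        self.parent = parent

    def __getitem__(self, item):
        # translate a position into the key at that position
        key = list(self.parent.dictionary)[item]
        return self.parent[key]


class Vector:
    def __init__(self, map_object: dict):
        self.dictionary = map_object

    def __getitem__(self, key):
        return self.dictionary[key]

    @cached_property
    def iloc(self):
        # built on first access, then stored on the instance
        return VectorIloc(self)


vector = Vector({0: 0, 1: 1, 2: 2, "three": 3})
print(vector.iloc[3])              # 3
print(vector.iloc is vector.iloc)  # True: the helper is created only once
```

With a plain property, the second print would show False, because each access builds a new helper.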

I was looking for this implementation, which I'm pretty used to in Pandas. However, after searching a lot, I could not find any suitable answer. So I went looking through the Pandas source code and found that the primary requirements for implementing this are as follows:
Create the method with the @property decorator, so that it accepts the slice object without throwing the above error
Create a second class to slice based on the index, pass self to this class, and return this class from the method
My final code ended up looking something like this:
class TimeSeries:
    def __init__(self, data: dict):
        self.data = data
    def __getitem__(self, key):
        data = self.data[key]
        return data
    @property
    def iloc(self):
        return Slicer(self)
class Slicer:
    def __init__(self, obj):
        self.time_series = obj
    def __getitem__(self, n):
        key = list(self.time_series.data)[n]
        return self.time_series[key]
With the classes defined this way, I could write the following code:
>>> ts = TimeSeries({'a': 1, 'b': 2, 'c': 3, 'd': 4})
>>> print("value of a:", ts['a'])
value of a: 1
>>> print("value at position 0:", ts.iloc[0])
value at position 0: 1

How to subclass a dictionary so it supports generic type hints?

How can a dictionary be subclassed such that the subclass supports generic type hinting? It needs to behave like a dictionary in every way and support type hints of the keys and values. The subclass will add functions that access and manipulate the dictionary data. For example, it will have a valueat(self, idx:int) function that returns the dictionary value at a given index.
It doesn't require OrderedDict as its base class, but the dictionary does need to have a predictable order. Since OrderedDict maintains insertion order and supports type hints, it seems like a reasonable place to start.
Here's what I tried:
from collections import OrderedDict
class ApplicationSpecificDict(OrderedDict[str, int]):
    ...
However, it fails with the error:
TypeError: 'type' object is not subscriptable
Is this not supported in Python 3.7+, or am I missing something?

The typing package provides generic classes that correspond to the non-generic classes in collections.abc and collections. These generic classes may be used as base classes to create user-defined generic classes, such as a custom generic dictionary.
Examples of generic classes corresponding to types in collections.abc:
typing.AbstractSet(Sized, Collection[T_co])
typing.Container(Generic[T_co])
typing.Mapping(Sized, Collection[KT], Generic[VT_co])
typing.MutableMapping(Mapping[KT, VT])
typing.MutableSequence(Sequence[T])
typing.MutableSet(AbstractSet[T])
typing.Sequence(Reversible[T_co], Collection[T_co])
Examples of generic classes corresponding to types in collections:
typing.DefaultDict(collections.defaultdict, MutableMapping[KT, VT])
typing.OrderedDict(collections.OrderedDict, MutableMapping[KT, VT])
typing.ChainMap(collections.ChainMap, MutableMapping[KT, VT])
typing.Counter(collections.Counter, Dict[T, int])
typing.Deque(deque, MutableSequence[T])
Implementing a custom generic dictionary
There are many options for implementing a custom generic dictionary. However, it is important to note that unless the user-defined class explicitly inherits from Mapping or MutableMapping, static type checkers like mypy will not consider the class as a mapping.
Example user-defined generic dictionary
from collections import abc # Used for isinstance check in `update()`.
from typing import Dict, Iterable, Iterator, MutableMapping, TypeVar
KT = TypeVar('KT')
VT = TypeVar('VT')
class MyDict(MutableMapping[KT, VT]):
    def __init__(self, dictionary=None, /, **kwargs) -> None:
        self.data: Dict[KT, VT] = {}
        if dictionary is not None:
            self.update(dictionary)
        if kwargs:
            self.update(kwargs)
    def __contains__(self, key: KT) -> bool:
        return key in self.data
    def __delitem__(self, key: KT) -> None:
        del self.data[key]
    def __getitem__(self, key: KT) -> VT:
        if key in self.data:
            return self.data[key]
        raise KeyError(key)
    def __len__(self) -> int:
        return len(self.data)
    def __iter__(self) -> Iterator[KT]:
        return iter(self.data)
    def __setitem__(self, key: KT, value: VT) -> None:
        self.data[key] = value
    @classmethod
    def fromkeys(cls, iterable: Iterable[KT], value: VT) -> "MyDict":
        """Create a new dictionary with keys from `iterable` and values set
        to `value`.
        Args:
            iterable: A collection of keys.
            value: The default value. All of the values refer to just a single
                instance, so it generally does not make sense for `value` to be a
                mutable object such as an empty list. To get distinct values, use
                a dict comprehension instead.
        Returns:
            A new instance of MyDict.
        """
        d = cls()
        for key in iterable:
            d[key] = value
        return d
    def update(self, other=(), /, **kwds) -> None:
        """Updates the dictionary from an iterable or mapping object."""
        if isinstance(other, abc.Mapping):
            for key in other:
                self.data[key] = other[key]
        elif hasattr(other, "keys"):
            for key in other.keys():
                self.data[key] = other[key]
        else:
            for key, value in other:
                self.data[key] = value
        for key, value in kwds.items():
            self.data[key] = value
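For the valueat(idx) helper mentioned in the question, a shorter route than implementing every MutableMapping method by hand is to subclass typing.Dict[KT, VT], which yields a real dict subclass that also accepts type parameters. The following is a sketch under that assumption; `IndexableDict` is a hypothetical name, and it relies on plain dicts preserving insertion order (guaranteed since Python 3.7).

```python
from typing import Dict, TypeVar

KT = TypeVar("KT")
VT = TypeVar("VT")


class IndexableDict(Dict[KT, VT]):
    """A dict subclass that is generic in its key and value types."""

    def valueat(self, idx: int) -> VT:
        # Positional access; O(n) because the values are copied into a list.
        return list(self.values())[idx]


scores: IndexableDict[str, int] = IndexableDict(alice=3, bob=5)
scores["carol"] = 8

print(scores.valueat(0))         # 3
print(scores.valueat(-1))        # 8
print(isinstance(scores, dict))  # True
```

The trade-off is the usual one for dict subclasses: overriding `__setitem__` or `__getitem__` later would not affect built-in methods such as `update()`, which is where the explicit MutableMapping implementation above is the safer base.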

I posted on this question which yours may be a dupe of, but I will include it here as well because I found both of these questions when I was googling how to do this.
Basically, you need to use the typing Mapping generic
This is the generic annotation that dict uses so you can define other types like MyDict[str, int].
How to:
import typing
from collections import OrderedDict
# these are generic type vars to tell mutable-mapping
# to accept any type vars when creating a sub-type of your generic dict
_KT = typing.TypeVar("_KT") # key type
_VT = typing.TypeVar("_VT") # value type
# `typing.MutableMapping` requires you to implement certain functions like __getitem__
# You can get around this by just subclassing OrderedDict first.
# Note: The generic you're subclassing needs to come BEFORE
# the `typing.MutableMapping` subclass or accessing indices won't work.
class ApplicationSpecificDict(
    OrderedDict,
    typing.MutableMapping[_KT, _VT]
):
    """Your special dict"""
    ...
# Now define the key, value types for sub-types of your dict
RequestDict = ApplicationSpecificDict[str, typing.Tuple[str, str]]
ModelDict = ApplicationSpecificDict[str, typing.Any]
Now use the custom types of your sub-typed dict:
from my_project.custom_typing import ApplicationSpecificDict # Import your custom type
def make_request() -> ApplicationSpecificDict:
    request = ApplicationSpecificDict()
    request["test"] = ("sierra", "117")
    return request
print(make_request())
This prints the dict with its single entry, "test": ("sierra", "117"), shown in the OrderedDict-style repr of the subclass.

Creating default dict where default value = key? [duplicate]

Yep! I know you cannot understand by the title.
Take for example the below code.
class Room(object):
    def __init__(self):
        self.numbers = []
        self.identify = None #for now?
    def getRoom(self):
        #here I need to implement so that,
        # self.identify is the key under which this instance was created!
        return self.identify
room = defaultdict(Room)
print(room['Train'].getRoom())
print(room['Hospital'].getRoom())
Expected output.
#>>Train
#>>Hospital
Any such feature supported in defaultdict, so that I can do that?
When room['something'] creates a new Room, I need code inside the class so that self.identify is 'something', the key that was used!

The default factory of collections.defaultdict (any callable) does not accept arguments.
If default_factory is not None, it is called without arguments to
provide a default value for the given key, this value is inserted in
the dictionary for the key, and returned.
In other words, defaultdict does not pass any information to the default_factory.
Subclass defaultdict to customize the default __missing__ hook to call the default_factory (Room class constructor) with missing key as a parameter:
from collections import defaultdict
class mydefaultdict(defaultdict):
    def __missing__(self, key):
        self[key] = new = self.default_factory(key)
        return new
The constructor of Room will then look like
class Room(object):
    def __init__(self, identity):
        self.numbers = []
        self.identify = identity
You'll need to use mydefaultdict instead of defaultdict from now on. Example:
room = mydefaultdict(Room)
print(room['Train'].getRoom()) # Train
print(room['Hospital'].getRoom()) # Hospital
While this works, I suggest you rethink the way you store and access the data.

Extract (not known beforehand) attributes from objects in a list

I have a class whose attributes are not known beforehand:
class Event():
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)
and another one which is basically a list of objects Event:
class Collection(list):
    def __init__(self):
        self.members = []
    def add(self,new):
        try:
            self.members.extend(new)
        except TypeError:
            self.members.append(new)
Let's say now that I define 3 objects Event:
a = Event(name="a",value=1)
b = Event(name="b",value=2)
c = Event(name="c",other=True)
And I create a Collection from them:
col = Collection()
col.add([a,b,c])
What I want is to be able to print out all the values of the objects in the list for a given attribute (if the attribute does not exist for an object, it should return None or any other pre-defined value). For example:
print(col.name) #should return ["a","b","c"]
print(col.value) #should return [1,2,None]
I have read the following answer: Extract list of attributes from list of objects in python
But that doesn't work here since the names of my attributes are not known in advance, and some might not even be defined. How should I define my class Collection(), or maybe even re-think everything to achieve my goal?

This is a variation of "I want to create dynamic variable names". The solution here is the same: use a dictionary.
class Event(object):
    def __init__(self, **kwargs):
        self.attributes = dict(kwargs)
Your Collection class will need a custom __getattr__ method, so that it can look up values in its Event list instead.
class Collection(object):
    # assume self.events is a list of Event objects
    def __getattr__(self, name):
        return [event.attributes.get(name) for event in self.events]
You could stick with your current implementation of Event, and have Collection look at event.__dict__ instead of event.attributes. I don't recall, though, if __dict__ might contain anything else besides the attributes you explicitly set. I'd err on the side of caution.

You can just override the __getattr__ method of the Collection class, which is called when an attribute is not found through normal lookup. To access a set of attributes that is not known beforehand, you can use event.__dict__. So, a possible solution looks like this:
def __getattr__(self, name):
    return [m.__dict__.get(name) for m in self.members]
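Putting the pieces together, a runnable sketch of the question's classes with a `__getattr__` hook might look like this. It uses `getattr(m, name, None)` instead of reading `__dict__` directly, and the guard clause is an addition not present in the answers: without it, a lookup of `members` before `__init__` has run (for example during copying or unpickling) would recurse forever.

```python
class Event:
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


class Collection:
    def __init__(self):
        self.members = []

    def add(self, new):
        try:
            self.members.extend(new)   # an iterable of events
        except TypeError:
            self.members.append(new)   # a single event

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name == "members" or name.startswith("__"):
            raise AttributeError(name)
        return [getattr(m, name, None) for m in self.members]


a = Event(name="a", value=1)
b = Event(name="b", value=2)
c = Event(name="c", other=True)

col = Collection()
col.add([a, b, c])

print(col.name)   # ['a', 'b', 'c']
print(col.value)  # [1, 2, None]
```

One consequence of this design is that a misspelled attribute name silently returns a list of None values rather than raising AttributeError.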

Define a python dictionary with immutable keys but mutable values

Well, the question is in the title: how do I define a python dictionary with immutable keys but mutable values? I came up with this (in python 2.x):
class FixedDict(dict):
    """
    A dictionary with a fixed set of keys
    """
    def __init__(self, dictionary):
        dict.__init__(self)
        for key in dictionary.keys():
            dict.__setitem__(self, key, dictionary[key])
    def __setitem__(self, key, item):
        if key not in self:
            raise KeyError("The key '" +key+"' is not defined")
        dict.__setitem__(self, key, item)
but it looks to me (unsurprisingly) rather sloppy. In particular, is this safe or is there the risk of actually changing/adding some keys, since I'm inheriting from dict?
Thanks.

Consider proxying dict instead of subclassing it. That means that only the methods that you define will be allowed, instead of falling back to dict's implementations.
class FixedDict(object):
    def __init__(self, dictionary):
        self._dictionary = dictionary
    def __setitem__(self, key, item):
        if key not in self._dictionary:
            raise KeyError("The key {} is not defined.".format(key))
        self._dictionary[key] = item
    def __getitem__(self, key):
        return self._dictionary[key]
Also, you should use string formatting instead of + to generate the error message, since otherwise it will crash for any value that's not a string.

The problem with direct inheritance from dict is that it's quite hard to comply with the full dict's contract (e.g. in your case, update method won't behave in a consistent way).
What you want is to extend collections.abc.MutableMapping (spelled collections.MutableMapping in Python 2; that alias was removed in Python 3.10):
import collections.abc
class FixedDict(collections.abc.MutableMapping):
    def __init__(self, data):
        self.__data = data
    def __len__(self):
        return len(self.__data)
    def __iter__(self):
        return iter(self.__data)
    def __setitem__(self, k, v):
        if k not in self.__data:
            raise KeyError(k)
        self.__data[k] = v
    def __delitem__(self, k):
        raise NotImplementedError
    def __getitem__(self, k):
        return self.__data[k]
    def __contains__(self, k):
        return k in self.__data
Note that the original (wrapped) dict will be modified; if you don't want that to happen, use copy or deepcopy.
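To illustrate why the MutableMapping route closes the update() loophole: its mixin methods are implemented on top of `__setitem__`, so the key check applies to them as well. The sketch below is a variant of the class above that takes a shallow copy in `__init__` (an assumption made here so that the caller's dict is left untouched).

```python
from collections.abc import MutableMapping


class FixedDict(MutableMapping):
    def __init__(self, data):
        self._data = dict(data)  # shallow copy; the wrapped original is not modified

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        return iter(self._data)

    def __getitem__(self, k):
        return self._data[k]

    def __setitem__(self, k, v):
        if k not in self._data:
            raise KeyError(k)
        self._data[k] = v

    def __delitem__(self, k):
        raise NotImplementedError("keys are fixed")


original = {"a": 1, "b": 2}
fd = FixedDict(original)
fd["a"] = 10
fd.update(b=20)          # goes through __setitem__, existing key: allowed

try:
    fd.update(c=3)       # goes through __setitem__, new key: rejected
    rejected = False
except KeyError:
    rejected = True

print(dict(fd))   # {'a': 10, 'b': 20}
print(rejected)   # True
print(original)   # {'a': 1, 'b': 2}
```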

How you prevent someone from adding new keys depends entirely on why someone might try to add new keys. As the comments state, most dictionary methods that modify the keys don't go through __setitem__, so a .update() call will add new keys just fine.
If you only expect someone to use d[new_key] = v, then your __setitem__ is fine. If they might use other ways to add keys, then you have to put in more work. And of course, they can always use this to do it anyway:
dict.__setitem__(d, new_key, v)
You can't make things truly immutable in Python, you can only stop particular changes.
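A short demonstration of the loopholes described above, using a trimmed version of the question's dict subclass. The behaviour shown is that of CPython, where the built-in `update()` and `setdefault()` do not call an overridden `__setitem__`.

```python
class FixedDict(dict):
    def __setitem__(self, key, item):
        if key not in self:
            raise KeyError(key)
        dict.__setitem__(self, key, item)


d = FixedDict({"a": 1})

try:
    d["b"] = 2                 # goes through the override: rejected
    blocked = False
except KeyError:
    blocked = True

d.update(b=2)                  # bypasses __setitem__: key added
d.setdefault("c", 3)           # bypasses __setitem__: key added
dict.__setitem__(d, "e", 5)    # calls the base implementation directly

print(blocked)     # True
print(sorted(d))   # ['a', 'b', 'c', 'e']
```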

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.
