How to delete specific item from lists in a dictionary? - python

I have a dictionary that contains keys and the values are a list of many ints and floats.
For example:
{"key1": [6,4,3.2,0.04...], "key2": [17,0.9,50.79...]}
All the lists have the same length. I want to delete the 2nd item from each list (for example, 4 in key1 and 0.9 in key2).
How can I do that?

Try this in just one line:
d = {"key1": [6, 4, 3.2, 0.04], "key2": [17, 0.9, 50.79]}
result = {k: [j for i, j in enumerate(v) if i != 1] for k, v in d.items()}
The result will be:
{'key1': [6, 3.2, 0.04], 'key2': [17, 50.79]}

from pprint import pp  # pprint.pp is available from Python 3.8
dd = {"k1": [1, 2, 3, 4], "k2": [11, 22, 33, 44]}
# v[0] alone is a single element, not a list, so it is wrapped as [v[0]] before being joined with v[2:]
pp({k: [v[0]] + v[2:] for k, v in dd.items()})
Result:
{'k1': [1, 3, 4], 'k2': [11, 33, 44]}
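If new lists are not needed, the same deletion can also be done in place. This is a minimal sketch, assuming every list is long enough to contain the index being removed; the helper name drop_index is hypothetical and does not come from the answers above.

```python
def drop_index(d, index):
    """Remove the item at position `index` from every list in `d`, in place."""
    for values in d.values():
        del values[index]  # raises IndexError if a list is too short
    return d


d = {"key1": [6, 4, 3.2, 0.04], "key2": [17, 0.9, 50.79, 2]}
drop_index(d, 1)
print(d)
```

Unlike the comprehension-based answers, this mutates the original lists, so any other reference to them sees the change as well.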

Related

Filter a dictionary of lists

I have a dictionary of the form:
{"level": [1, 2, 3],
"conf": [-1, 1, 2],
"text": ["here", "hel", "llo"]}
I want to filter the lists by removing, from every list, the item at each index i where the value at that index in "conf" is not > 0.
So for the above dict, the output should be this:
{"level": [2, 3],
"conf": [1, 2],
"text": ["hel", "llo"]}
As the first value of conf was not > 0.
I have tried something like this:
new_dict = {i: [a for a in j if a >= min_conf] for i, j in my_dict.items()}
But that would work just for one key.
Try this:
from operator import itemgetter
def filter_dictionary(d):
    positive_indices = [i for i, item in enumerate(d['conf']) if item > 0]
    f = itemgetter(*positive_indices)
    return {k: list(f(v)) for k, v in d.items()}
d = {"level": [1, 2, 3], "conf": [-1, 1, 2], "text": ["-1", "hel", "llo"]}
print(filter_dictionary(d))
output:
{'level': [2, 3], 'conf': [1, 2], 'text': ['hel', 'llo']}
I tried to first see which indices of 'conf' are positive, then with itemgetter I picked those indices from values inside the dictionary.
More compact version + without temporary list using generator expression instead:
def filter_dictionary(d):
    f = itemgetter(*(i for i, item in enumerate(d['conf']) if item > 0))
    return {k: list(f(v)) for k, v in d.items()}
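One caveat with itemgetter is worth spelling out: called with a single index it returns the bare item rather than a tuple, and called with no indices it raises TypeError. So the versions above only behave as intended when at least two "conf" values are positive. The sketch below is one possible way to cover those edge cases; the function name filter_by_positive and its key parameter are assumptions made for illustration.

```python
from operator import itemgetter


def filter_by_positive(d, key="conf"):
    """Keep, in every list, only the positions where d[key] is > 0."""
    keep = [i for i, item in enumerate(d[key]) if item > 0]
    if len(keep) >= 2:
        pick = itemgetter(*keep)  # returns a tuple when given 2+ indices
        return {k: list(pick(v)) for k, v in d.items()}
    # 0 or 1 surviving index: index directly, because itemgetter() raises
    # TypeError with no arguments and returns a bare item for one argument.
    return {k: [v[i] for i in keep] for k, v in d.items()}


one_left = filter_by_positive({"level": [1, 2], "conf": [-1, 5], "text": ["a", "b"]})
none_left = filter_by_positive({"level": [1, 2], "conf": [-1, 0], "text": ["a", "b"]})
print(one_left)
print(none_left)
```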
Here's a one-liner:
dct = {k: [x for i, x in enumerate(v) if d['conf'][i] > 0] for k, v in d.items()}
Output:
>>> dct
{'level': [2, 3], 'conf': [1, 2], 'text': ['hel', 'llo']}
With sample data:
my_dict = {"level":[1,2,3], "conf":[-1,1,2], "text":["here","hel","llo"]}
I would keep the indexes of valid elements (those greater than 0) with:
kept_keys = [i for i in range(len(my_dict['conf'])) if my_dict['conf'][i] > 0]
And then you can filter each list checking if the index of a certain element in the list is contained in kept_keys:
{k: list(map(lambda x: x[1], filter(lambda x: x[0] in kept_keys, enumerate(my_dict[k])))) for k in my_dict}
Output:
{'level': [2, 3], 'conf': [1, 2], 'text': ['hel', 'llo']}
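The same "keep only the valid positions" idea can also be expressed with a boolean mask and itertools.compress from the standard library, which avoids the membership test against a list of indexes. This is a sketch of an alternative, not the method used in the answer above; the variable names mask and filtered are illustrative.

```python
from itertools import compress

my_dict = {"level": [1, 2, 3], "conf": [-1, 1, 2], "text": ["here", "hel", "llo"]}

# One boolean per position: True where the "conf" value is positive.
mask = [c > 0 for c in my_dict["conf"]]

# compress() yields only the items whose matching mask entry is true.
filtered = {k: list(compress(v, mask)) for k, v in my_dict.items()}
print(filtered)
```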
The structure of the data you're describing sounds like it might be more naturally modelled as a pandas DataFrame: you are essentially viewing your data as a 2-D grid, and you want to filter out rows of that grid based on the value in one column.
The following snippet will do what you need using a DataFrame as an intermediate representation:
import pandas as pd
data = {"level":[1,2,3], "conf":[-1,1,2], "text":["here","hel","llo"]}
df = pd.DataFrame(data)
df = df.loc[df["conf"] > 0]
result = df.to_dict(orient="list")
Output:
{'level': [2, 3], 'conf': [1, 2], 'text': ['hel', 'llo']}
However, note that if you represent your data as a DataFrame in the first place, and keep it in that form when you're done, this simplifies to:
data = pd.DataFrame({
"level":[1,2,3],
"conf":[-1,1,2],
"text":["here","hel","llo"],
})
result = data.loc[data["conf"] > 0]
Output:
   level  conf text
1      2     1  hel
2      3     2  llo
Which is terser, more expressive, and (on large inputs) more performant than any "pure dict" solution.
If the other operations you want to do on this data are similar (in the sense of really being '2D array' operations), it is likely that they will also be more naturally expressed in terms of DataFrames, and so keeping your data as a DataFrame is likely to be advantageous vs converting back to a dict.
I solved it with this:
from typing import Dict, List, Any, Set
d = {"level":[1,2,3], "conf":[-1,1,2], "text":["-1", "hel", "llo"]}
# First, we create a set that stores the indices which should be kept.
# I chose a set instead of a list because it has a O(1) lookup time.
# We only want to keep the items on indices where the value in d["conf"] is greater than 0
filtered_indexes = {i for i, value in enumerate(d.get('conf', [])) if value > 0}
def filter_dictionary(d: Dict[str, List[Any]], filtered_indexes: Set[int]) -> Dict[str, List[Any]]:
    filtered_dictionary = d.copy()  # We'll return a modified copy of the original dictionary
    for key, list_values in d.items():
        # In the next line the actual filtering for each key/value pair takes place.
        # The original lists get overwritten with the filtered lists.
        filtered_dictionary[key] = [value for i, value in enumerate(list_values) if i in filtered_indexes]
    return filtered_dictionary
print(filter_dictionary(d, filtered_indexes))
Output:
{'level': [2, 3], 'conf': [1, 2], 'text': ['hel', 'llo']}
You can have a function which works out which indexes to keep and reformulate each list with only those indexes:
my_dict = {"level":[1,2,3], "conf":[-1,1,2],'text':["-1","hel","llo"]}
def remove_corresponding_items(d, key):
    keep_indexes = [idx for idx, value in enumerate(d[key]) if value > 0]
    # the loop variable is named k so it does not shadow the key parameter
    for k, lst in d.items():
        d[k] = [lst[idx] for idx in keep_indexes]
remove_corresponding_items(my_dict, 'conf')
print(my_dict)
Output as requested
Here's a numpy way of doing it:
import numpy as np
d = {"level":[1,2,3], "conf":[-1,1,2], "text":["here","hel","llo"]}
# convert every list to an array so it can be indexed with a boolean mask
a = {k: np.array(v) for k, v in d.items()}
dct = {k: v[a['conf'] > 0].tolist() for k, v in a.items()}
Output:
>>> dct
{'level': [2, 3], 'conf': [1, 2], 'text': ['hel', 'llo']}
Lots of good answers. Here's another 2-pass approach:
mydict = {"level": [1, 2, 3], "conf": [-1, 1, 2], 'text': ["-1", "hel", "llo"]}
for i, v in enumerate(mydict['conf']):
    if v <= 0:
        for key in mydict.keys():
            mydict[key][i] = None
for key in mydict.keys():
    mydict[key] = [v for v in mydict[key] if v is not None]
print(mydict)
Output:
{'level': [2, 3], 'conf': [1, 2], 'text': ['hel', 'llo']}
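Note that marking removed slots with None also drops any None that was legitimately part of the data. One way around that, sketched here under the assumption that the "conf" values are numbers, is to mark slots with a private sentinel object instead. The names _REMOVED and filter_two_pass are hypothetical.

```python
_REMOVED = object()  # unique marker that cannot collide with real data


def filter_two_pass(d, key="conf"):
    """Two-pass, in-place filter that keeps genuine None values."""
    # Pass 1: mark every position whose d[key] value is not positive.
    for i, v in enumerate(d[key]):
        if v <= 0:
            for lst in d.values():
                lst[i] = _REMOVED
    # Pass 2: rebuild each list without the marked positions.
    for k in d:
        d[k] = [v for v in d[k] if v is not _REMOVED]
    return d


sample = {"level": [1, 2, 3], "conf": [-1, 1, 2], "text": [None, None, "llo"]}
filter_two_pass(sample)
print(sample)
```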
a = {"level":[1,2,3,4], "conf": [-1,1,2,-1],"text": ["-1","hel","llo","test"]}
# inefficient solution
# for k, v in a.items():
#     if k == "conf":
#         start_search = 0
#         to_delete = []  # stores the index numbers of the conf values you want to delete (conf < 0)
#         for element in v:
#             if element < 0:
#                 to_delete.append(v.index(element, start_search))
#                 start_search = v.index(element) + 1
# more efficient and elegant solution
to_delete = [i for i, element in enumerate(a["conf"]) if element < 0]
# pop from the highest index down so the remaining positions do not shift
for position in reversed(to_delete):
    for k, v in a.items():
        v.pop(position)
and the result will be
>>> a
{'level': [2, 3], 'conf': [1, 2], 'text': ['hel', 'llo']}
Try this, simple and easy to understand, especially for beginners:
a_dict = {"level": [1, 2, 3, 4, 5, 8], "conf": [-1, 1, -1, -2], "text": ["-1", "hel", "llo", "ai", 0, 9]}
# iterate backwards over the list keeping the indexes
for index, item in reversed(list(enumerate(a_dict["conf"]))):
    if item <= 0:
        for lists in a_dict.values():
            del lists[index]
print(a_dict)
Output:
{'level': [2, 5, 8], 'conf': [1], 'text': ['hel', 0, 9]}
I believe this will work:
For each list, we filter out the values at positions where conf is not positive, and after that we filter conf itself.
d = {"level":[1,2,3], "conf":[-1,1,2], "text":["-1","hel","llo"]}
for key in d:
    if key != "conf":
        d[key] = [d[key][i] for i in range(len(d[key])) if d["conf"][i] > 0]
d["conf"] = [i for i in d["conf"] if i > 0]
print(d)
A simpler solution (exactly the same, but using a dict comprehension, so we don't need to handle conf separately from the rest):
d = {"level":[1,2,3], "conf":[-1,1,2], "text":["-1","hel","llo"]}
d = {i: [d[i][j] for j in range(len(d[i])) if d["conf"][j] > 0] for i in d}
Output:
{'level': [2, 3], 'conf': [1, 2], 'text': ['hel', 'llo']}
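Since all the lists have the same length, another way to look at the problem is to treat the dictionary as a table: transpose it into rows with zip, filter the rows, then transpose back. This is a sketch of that idea rather than a summary of any answer above; the names keys, conf_pos and rows are illustrative.

```python
d = {"level": [1, 2, 3], "conf": [-1, 1, 2], "text": ["-1", "hel", "llo"]}

keys = list(d)                 # column order
conf_pos = keys.index("conf")  # where the "conf" column sits in each row

# Each row holds the values found at one index across all lists.
rows = [row for row in zip(*d.values()) if row[conf_pos] > 0]

# Transpose the surviving rows back into one list per key.
result = {k: [row[i] for row in rows] for i, k in enumerate(keys)}
print(result)
```

Building the result from keys rather than from zip(*rows) means every key is still present, with an empty list, when no row survives.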

Extract values from class instances inside a dictionary

I have this code:
class B():
    def __init__(self, valueA, valueB ):
        self.valueA = valueA
        self.valueB = valueB
    def __repr__(self):
        return 'B({0},{1})'.format(self.valueA, self.valueB)
My data is :
thedata = {'a': [B(1, 0.1),B(2, 0.2)],
'b': [B(3, 0.3),B(4, 0.4)]}
What I want to do is extract the valueA and valueB attributes from the above dictionary into 2 new dictionaries, grouped by key.
So, I want :
thedata_a_valueA = {'a':[1, 2]}
thedata_a_valueB = {'a':[0.1, 0.2]}
thedata_b_valueA = {'b':[3, 4]}
thedata_b_valueB = {'b':[0.3, 0.4]}
and finally I want 2 dictionaries:
newdict_valueA = {'a':[1, 2], 'b':[3,4]}
newdict_valueB = {'a':[0.1, 0.2], 'b':[0.3,0.4]}
One solution is to use lists but I must create quite a few lists, loop over them, append etc
Is there any cleaner/faster solution working on thedata ?
Something like this will work:
# Starting from two empty dictionaries
newdict_valueA, newdict_valueB = {}, {}
# Iterate over the key/value pairs of the data
for key, value in thedata.items():
    # assign to the same key of each result dictionary a list of
    # the valueA for each B item in the value of the original dictionary
    newdict_valueA[key] = [item.valueA for item in value]
    newdict_valueB[key] = [item.valueB for item in value]
newdict_valueA:
{'a': [1, 2], 'b': [3, 4]}
newdict_valueB:
{'a': [0.1, 0.2], 'b': [0.3, 0.4]}
If I got your question right, you can use comprehensions:
newdict_valueA = { k : [ b.valueA for b in l ] for k, l in thedata.items() }
newdict_valueB = { k : [ b.valueB for b in l ] for k, l in thedata.items() }
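The two comprehensions differ only in the attribute name, so they can be folded into one helper built on getattr. This is a minimal sketch; the helper name extract_attr is hypothetical, and the class B is repeated from the question only to keep the example self-contained.

```python
class B:
    def __init__(self, valueA, valueB):
        self.valueA = valueA
        self.valueB = valueB


def extract_attr(data, attr):
    """Return {key: [getattr(item, attr) for each item]} for a dict of lists."""
    return {k: [getattr(item, attr) for item in items] for k, items in data.items()}


thedata = {"a": [B(1, 0.1), B(2, 0.2)], "b": [B(3, 0.3), B(4, 0.4)]}
newdict_valueA = extract_attr(thedata, "valueA")
newdict_valueB = extract_attr(thedata, "valueB")
print(newdict_valueA)
print(newdict_valueB)
```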
You can have all of the expected items in a list for each key in your dictionary. You just need to loop over thedata's items and, for each key, create a dictionary containing separate entries for valueA and valueB. The dict.setdefault() method is a nice choice for this task.
In [18]: d = {}
In [19]: for i, j in thedata.items():
   ....:     for instance in j:
   ....:         d.setdefault(i, {}).setdefault('valueA', []).append(instance.valueA)
   ....:         d.setdefault(i, {}).setdefault('valueB', []).append(instance.valueB)
In [20]: d
Out[20]:
{'a': {'valueB': [0.1, 0.2], 'valueA': [1, 2]},
 'b': {'valueB': [0.3, 0.4], 'valueA': [3, 4]}}
Note that, as a cleaner alternative, you can use collections.defaultdict() instead of the dict.setdefault() method:
In [32]: from collections import defaultdict
In [33]: d = defaultdict(lambda: defaultdict(list))
In [34]: for i, j in thedata.items():
   ....:     for instance in j:
   ....:         d[i]['valueA'].append(instance.valueA)
   ....:         d[i]['valueB'].append(instance.valueB)
   ....:
In [35]: d
Out[35]: defaultdict(<function <lambda> at 0x7fb5888786a8>,
{'a': defaultdict(<class 'list'>,
{'valueB': [0.1, 0.2], 'valueA': [1, 2]}),
'b': defaultdict(<class 'list'>,
{'valueB': [0.3, 0.4], 'valueA': [3, 4]})})

Update dictionary items with a for loop

I would like to update a dictionary's items in a for loop. Here is what I have:
>>> d = {}
>>> for i in range(0,5):
...     d.update({"result": i})
...
>>> d
{'result': 4}
But I want d to have following items:
{'result': 0,'result': 1,'result': 2,'result': 3,'result': 4}
As mentioned, the whole idea of dictionaries is that they have unique keys.
What you can do is have 'result' as the key and a list as the value, then keep appending to the list.
>>> d = {}
>>> for i in range(0,5):
...     d.setdefault('result', []).append(i)
...
>>> d
{'result': [0, 1, 2, 3, 4]}
Keys have to be unique in a dictionary, so what you are trying to achieve is not possible. When you assign another item with the same key, you simply overwrite the previous entry, hence the result you see.
Maybe this would be useful to you?
>>> d = {}
>>> for i in range(3):
...     d['result_' + str(i)] = i
...
>>> d
{'result_0': 0, 'result_1': 1, 'result_2': 2}
You can modify this to fit your needs.
In a dictionary, keys can't be repeated, so your example
{'result': 0,'result': 1,'result': 2,'result': 3,'result': 4}
is not possible. You can use a list of multiple dicts instead:
[{},{},{},{}]
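To make the list-of-dicts suggestion concrete, here is a small sketch applied to the loop from the question. The variable name results is an assumption; each dict holds a single 'result' entry, so the repeated key is no longer a problem.

```python
# One small dict per iteration, collected in a list.
results = [{"result": i} for i in range(5)]
print(results)

# The individual values can be read back in order.
values = [item["result"] for item in results]
print(values)
```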
You can't have different values for the same key in your dictionary. One option would be to number the result:
d = {}
for i in range(0,5):
    result = 'result' + str(i)
    d[result] = i
d
>>> {'result0': 0, 'result1': 1, 'result4': 4, 'result2': 2, 'result3': 3}
d = {"key1": [8, 22, 38], "key2": [7, 3, 12], "key3": [5, 6, 71]}
print(d)
for key, value in d.items():
    value_new = [sum(value)]
    d.update({key: value_new})
print(d)
>>> d = {"result": []}
>>> for i in range(0,5):
...     d["result"].append(i)
...
>>> d
{'result': [0, 1, 2, 3, 4]}

Is there any pythonic way to combine two dicts (making a list for common values)?

For example, I have two dicts.
A = {'a':1, 'b':10, 'c':2}
B = {'b':3, 'c':4, 'd':10}
I want a result like this:
{'a':1, 'b': [10, 3], 'c':[2, 4], 'd':10}
If a key appears in both the dicts, I want to list of both the values.
I'd make all values lists:
{k: filter(None, [A.get(k), B.get(k)]) for k in A.viewkeys() | B}
using dictionary view objects.
Demo:
>>> A = {'a':1, 'b':10, 'c':2}
>>> B = {'b':3, 'c':4, 'd':10}
>>> {k: filter(None, [A.get(k), B.get(k)]) for k in A.viewkeys() | B}
{'a': [1], 'c': [2, 4], 'b': [10, 3], 'd': [10]}
This at least keeps your value types consistent.
To produce your output, you need to use the set intersection and symmetric differences between the two dictionaries:
dict({k: [A[k], B[k]] for k in A.viewkeys() & B},
**{k: A.get(k, B.get(k)) for k in A.viewkeys() ^ B})
Demo:
>>> dict({k: [A[k], B[k]] for k in A.viewkeys() & B},
... **{k: A.get(k, B.get(k)) for k in A.viewkeys() ^ B})
{'a': 1, 'c': [2, 4], 'b': [10, 3], 'd': 10}
In Python 3, dict.keys() is a dictionary view, so you can just replace all .viewkeys() calls with .keys() to get the same functionality there.
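For reference, here is how both variants might look in Python 3. This is a sketch under two assumptions: the names all_lists and mixed are illustrative, and the filter(None, ...) call is replaced by a membership test, because filter returns an iterator in Python 3 and filter(None, ...) would also drop falsy values such as 0.

```python
A = {"a": 1, "b": 10, "c": 2}
B = {"b": 3, "c": 4, "d": 10}

# Variant 1: every value becomes a list, so the value types stay consistent.
all_lists = {k: [d[k] for d in (A, B) if k in d] for k in A.keys() | B.keys()}

# Variant 2: lists only for shared keys, plain values for the rest.
mixed = {
    **{k: [A[k], B[k]] for k in A.keys() & B.keys()},
    **{k: A.get(k, B.get(k)) for k in A.keys() ^ B.keys()},
}
print(all_lists)
print(mixed)
```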
I would second Martijn Pieters' point that you probably want to have the same type for all the values in your result dict.
To give a second option:
You could also use a defaultdict to achieve your result quite intuitively.
A defaultdict is like a dict, but it has a default factory that is called if the key doesn't exist yet.
So you would write:
from collections import defaultdict
A = {'a':1, 'b':10, 'c':2}
B = {'b':3, 'c':4, 'd':10}
result = defaultdict(list)
for d in [A, B]:
    for k, v in d.items():
        result[k].append(v)
Then, at a later stage, you can still easily add more values to your result.
You can also switch to
defaultdict(set)
(calling result[k].add(v) instead of append) if you don't want duplicate values.

Filter a dict of dict

I am new to Python and I am not sure whether it is a good idea to use a dict of dicts, but here is my question.
I have a dict of dicts and I want to filter by the key of the inner dict:
a = { 'key1' : {'id1' :[0,1,2] , 'id2' :[0,1,2], 'id3' :[4,5,6]},
      'key2' : {'id3' :[0,1,2] , 'id4' :[0,1,2]},
      'key3' : {'id3' :[0,1,2] , 'id1' :[4,5,6]}
}
For example, I want to filter by 'id1' to get:
result = { 'key1' : {'id1' :[0,1,2] },
           'key3' : {'id1' :[4,5,6]}
}
I have tried the filter method, but I get all the values:
r = [('key1' ,{'id1' :[0,1,2] , 'id2' :[0,1,2], 'id3' :[4,5,6]}),
     ('key3' , {'id3' :[0,1,2] , 'id1' :[4,5,6]})
]
Furthermore the filter method returns a list and I want to keep the format as a dict.
Thanks in advance
Try this:
>>> { k: v['id1'] for k,v in a.items() if 'id1' in v }
{'key3': [4, 5, 6], 'key1': [0, 1, 2]}
For Python 2.x you might prefer to use iteritems() instead of items() and you'll still need a pretty recent python (2.7 I think) for a dictionary comprehension: for older pythons use:
dict((k, v['id1']) for k,v in a.iteritems() if 'id1' in v )
If you want to extract multiple values then I think you are best to just write the loops out in full:
def query(data, wanted):
    result = {}
    for k, v in data.items():
        v2 = { k2:v[k2] for k2 in wanted if k2 in v }
        if v2:
            result[k] = v2
    return result
giving:
>>> query(a, ('id1', 'id2'))
{'key3': {'id1': [4, 5, 6]}, 'key1': {'id2': [0, 1, 2], 'id1': [0, 1, 2]}}
Following the clarification you gave to Duncan, here is another way to filter on a list of keys using a dictionary comprehension:
>>> my_list = ['id1', 'id2']
>>> {k1 : {k2: v2 for (k2, v2) in a[k1].iteritems() if k2 in my_list} for k1 in a}
{'key3': {'id1': [4, 5, 6]}, 'key2': {}, 'key1': {'id2': [0, 1, 2], 'id1': [0, 1, 2]}}
EDIT: you can also remove empty values with another dict comprehension, but that "begins" to be difficult to read... :-)
>>> {k3: v3 for k3, v3 in {k1 : {k2: v2 for (k2, v2) in a[k1].iteritems() if k2 in my_list} for k1 in a}.iteritems() if v3}
{'key3': {'id1': [4, 5, 6]}, 'key1': {'id2': [0, 1, 2], 'id1': [0, 1, 2]}}
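The nested comprehension above uses iteritems(), which exists only in Python 2. A Python 3 sketch of the same filtering, split into two steps for readability, could look like this; the names wanted, narrowed and non_empty are illustrative.

```python
a = {
    "key1": {"id1": [0, 1, 2], "id2": [0, 1, 2], "id3": [4, 5, 6]},
    "key2": {"id3": [0, 1, 2], "id4": [0, 1, 2]},
    "key3": {"id3": [0, 1, 2], "id1": [4, 5, 6]},
}
wanted = {"id1", "id2"}

# Step 1: keep only the wanted inner keys for every outer key.
narrowed = {
    k: {k2: v2 for k2, v2 in inner.items() if k2 in wanted}
    for k, inner in a.items()
}

# Step 2: drop outer keys whose inner dict ended up empty.
non_empty = {k: inner for k, inner in narrowed.items() if inner}
print(non_empty)
```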
You can do with a dictionary comprehension:
def query(data, query):
    return {key : {query : data[key][query]}
            for key in data if query in data[key]}
You have to look at each entry of the dictionary, which can cost a lot of time if you have many entries or do this often. A database with an index can speed this up.
field = 'id1'
dict( (k,{field: d[field]}) for k,d in a.items() if field in d)
