how to remove dictionary element by outlier values Python


Suppose my dictionary contains more than 100 elements, and one or two of them have values that differ from the rest; most values are the same (12 in the example below). How can I remove these few elements?
Diction = {1:12,2:12,3:23,4:12,5:12,6:12,7:12,8:2}
I want a dictionary object:
Diction = {1:12,2:12,4:12,5:12,6:12,7:12}

This may be a bit slow because of the explicit loop (especially as the dictionary gets very large), and it requires NumPy, but it works:
import numpy as np
Diction = {1:12,2:12,3:23,4:12,5:12,6:12,7:12,8:2}
# Collect all values into an array
dict_array = np.array(list(Diction.values()))
# Find the value that occurs most often
unique, counts = np.unique(dict_array, return_counts=True)
most_common = unique[np.argmax(counts)]
# Keep only the entries that hold that value
new_Diction = {}
for x in Diction:
    if Diction[x] == most_common:
        # store the original int rather than the NumPy scalar
        new_Diction[x] = Diction[x]
print(new_Diction)
Output
{1: 12, 2: 12, 4: 12, 5: 12, 6: 12, 7: 12}
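If NumPy is not available, the same idea can be sketched with collections.Counter from the standard library. This is only a minimal sketch: it assumes an "outlier" is any value other than the single most frequent one and that the values are hashable. The function name keep_most_common is made up for illustration.

```python
from collections import Counter

def keep_most_common(d):
    """Return a new dict holding only the entries whose value is the most frequent one.

    Assumption: an "outlier" is any value other than the single most common value.
    """
    if not d:
        return {}
    # most_common(1) returns a list with one (value, count) pair
    most_common_value, _ = Counter(d.values()).most_common(1)[0]
    return {k: v for k, v in d.items() if v == most_common_value}

Diction = {1: 12, 2: 12, 3: 23, 4: 12, 5: 12, 6: 12, 7: 12, 8: 2}
filtered = keep_most_common(Diction)
print(filtered)
```

Counter makes a single pass over the values, so this avoids both the explicit loops and the NumPy dependency.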

d = {1:12,2:12,3:23,4:12,5:12,6:12,7:12,8:2}
new_d = {}
unique_values = []
unique_count = []
most_occurrence = None
highest_count = 0
# Find unique values
for k, v in d.items():
    if v not in unique_values:
        unique_values.append(v)
# Count their occurrences
def count(dictionary, unique_value):
    total = 0
    for k, v in dictionary.items():
        if v == unique_value:
            total += 1
    return total
for value in unique_values:
    occurrences = count(d, value)
    unique_count.append((value, occurrences))
# Find which value has the most occurrences (compare counts, not values)
for value, occurrences in unique_count:
    if occurrences > highest_count:
        highest_count = occurrences
        most_occurrence = value
# Create a new dict with the keys of the most frequent value
for k, v in d.items():
    if v == most_occurrence:
        new_d[k] = v
print(new_d)
Nothing fancy, but straight to the point. There are surely many ways to optimize this.
Output: {1: 12, 2: 12, 4: 12, 5: 12, 6: 12, 7: 12}
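Both answers build a new dictionary. If the goal is to remove the entries from the existing dictionary, as the question title suggests, something along these lines might do. It is a sketch under the same assumption (every value other than the most frequent one counts as an outlier), and drop_outliers_in_place is a hypothetical name.

```python
def drop_outliers_in_place(d):
    """Delete, from d itself, every entry whose value is not the most frequent one."""
    if not d:
        return
    # Count how often each value occurs
    counts = {}
    for v in d.values():
        counts[v] = counts.get(v, 0) + 1
    most_common_value = max(counts, key=counts.get)
    # Collect the keys first: deleting while iterating over the dict raises RuntimeError
    outlier_keys = [k for k, v in d.items() if v != most_common_value]
    for k in outlier_keys:
        del d[k]

d = {1: 12, 2: 12, 3: 23, 4: 12, 5: 12, 6: 12, 7: 12, 8: 2}
drop_outliers_in_place(d)
print(d)
```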

Related

How can I get multiple shared items between two dictionaries in Python?

length_word = {'pen':3, 'bird':4, 'computer':8, 'mail':4}
count_word = {'pen':10, 'bird':50, 'computer':3, 'but':45, 'blackboard': 12, 'mail':12}
intersection = length_word.items() - count_word.items()
common_words = {intersection}
Err: TypeError: unhashable type: 'set'
I wish to get this dictionary:
outcome = {'pen':10, 'bird':50, 'computer':3, 'mail':12}
Thanks.

You should use .keys() instead of .items().
Here is a solution:
length_word = {'pen':3, 'bird':4, 'computer':8, 'mail':4}
count_word = {'pen':10, 'bird':50, 'computer':3, 'but':45, 'blackboard': 12, 'mail':12}
intersection = count_word.keys() & length_word.keys()
common_words = {i : count_word[i] for i in intersection}
#Output:
{'computer': 3, 'pen': 10, 'mail': 12, 'bird': 50}
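To see why .keys() helps here, it may be useful to compare the two kinds of views side by side. The sketch below reuses the dictionaries from the question; the variable names shared_pairs and shared_keys are only illustrative. Note also that the original error came from {intersection}, which tries to put a set inside another set.

```python
length_word = {'pen': 3, 'bird': 4, 'computer': 8, 'mail': 4}
count_word = {'pen': 10, 'bird': 50, 'computer': 3, 'but': 45, 'blackboard': 12, 'mail': 12}

# items() views compare whole (key, value) pairs, so nothing matches here
# because no word has the same number in both dictionaries.
shared_pairs = length_word.items() & count_word.items()

# keys() views compare keys only, which is what the question needs.
shared_keys = length_word.keys() & count_word.keys()

print(shared_pairs)          # an empty set
print(sorted(shared_keys))   # sorted, because set order is not guaranteed
```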

intersection = count_word.keys() & length_word.keys()
outcome = dict((k, count_word[k]) for k in intersection)

Try getting the intersection (the common keys). Once you have the common keys, look them up in count_word.
res = {x: count_word.get(x, 0) for x in set(count_word).intersection(length_word)}
res:
{'bird': 50, 'pen': 10, 'computer': 3, 'mail': 12}

Just another dict comp:
outcome = {k: v for k, v in count_word.items() if k in length_word}

Using a for loop, check whether each key is present in both dictionaries. If so, add that key-value pair to a new dictionary.
length_word = {'pen':3, 'bird':4, 'computer':8, 'mail':4}
count_word = {'pen':10, 'bird':50, 'computer':3, 'but':45, 'blackboard': 12, 'mail':12}
my_dict = {}
for k, v in count_word.items():
    if k in length_word:
        my_dict[k] = v
print(my_dict)

Merge random number of dicts in list

The task is to create a list of a random number of dicts (from 2 to 10). Each dict should have a random number of keys; the keys should be letters and the values should be numbers (0-100).
Example: [{'a': 5, 'b': 7, 'g': 11}, {'a': 3, 'c': 35, 'g': 42}]
Then take the previously generated list of dicts and create one common dict:
if several dicts have the same key, take the max value and rename the key with the number of the dict that holds the max value;
if a key is in only one dict, take it as is.
Example: {'a_1': 5, 'b': 7, 'c': 35, 'g_2': 42}
I've written the following code:
from random import randint, choice
from string import ascii_lowercase
final_dict, indexes_dict = {}, {}
rand_list = [{choice(ascii_lowercase): randint(0, 100) for i in range(len(ascii_lowercase))} for j in range(randint(2, 10))]
for dictionary in rand_list:
    for key, value in dictionary.items():
        if key not in final_dict:
            final_dict.update({key: value})  # add first occurrence
        else:
            if value < final_dict.get(key):
                #TODO indexes_dict.update({:})
                continue
            else:
                final_dict.update({key: value})
                #TODO indexes_dict.update({:})
for key in indexes_dict:
    final_dict[key + '_' + str(indexes_dict[key])] = final_dict.pop(key)
print(final_dict)
I only need to add some logic to keep track of the indexes of the final_dict values (I created a separate dict for that).
I'm wondering whether there is a more Pythonic way to solve such tasks.

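This approach seems completely reasonable.
Personally, however, I would probably go about it this way:
final_dict, tmp_dict = {}, {}
# Transform the list of dicts into a dict of lists of (value, dict number) pairs.
for number, dictionary in enumerate(rand_list, 1):
    for k, v in dictionary.items():
        tmp_dict.setdefault(k, []).append((v, number))
# Now keep only the biggest value for each key.
for k, pairs in tmp_dict.items():
    if len(pairs) > 1:
        # Store the dict number with each value so the suffix is right
        # even when a key is missing from some dicts.
        value, number = max(pairs, key=lambda pair: pair[0])
        final_dict[k + "_" + str(number)] = value
    else:
        final_dict[k] = pairs[0][0]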

You will need some auxiliary data structure to keep track of unrepeated keys. This uses collections.defaultdict and enumerate to aid the task:
from collections import defaultdict
def merge(dicts):
    helper = defaultdict(lambda: [-1, -1, 0])  # key -> max, index_max, count
    for i, d in enumerate(dicts, 1):  # start indexing at 1
        for k, v in d.items():
            helper[k][2] += 1  # always increase count
            if v > helper[k][0]:
                helper[k][:2] = [v, i]  # update max and index_max
    # build result from helper data structure
    result = {}
    for k, (max_, index, count) in helper.items():
        key = k if count == 1 else "{}_{}".format(k, index)
        result[key] = max_
    return result
>>> merge([{'a': 5, 'b': 7, 'g': 11}, {'a': 3, 'c': 35, 'g': 42}])
{'a_1': 5, 'b': 7, 'g_2': 42, 'c': 35}
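The loop from the question can also be completed fairly directly by recording, for each key, the number of the dict that supplied the current maximum. The following is a sketch rather than a definitive solution: merge_keep_index and seen_count are hypothetical names, and it assumes that on a tie the earliest dict should win.

```python
def merge_keep_index(dicts):
    """Merge dicts, keeping the max value per key and the number of the dict it came from."""
    final_dict, indexes_dict, seen_count = {}, {}, {}
    for number, dictionary in enumerate(dicts, 1):  # dict numbers start at 1
        for key, value in dictionary.items():
            seen_count[key] = seen_count.get(key, 0) + 1
            # Strict ">" keeps the earliest dict when values tie
            if key not in final_dict or value > final_dict[key]:
                final_dict[key] = value
                indexes_dict[key] = number
    # Rename only the keys that appeared in more than one dict
    return {
        (f"{key}_{indexes_dict[key]}" if seen_count[key] > 1 else key): value
        for key, value in final_dict.items()
    }

merged = merge_keep_index([{'a': 5, 'b': 7, 'g': 11}, {'a': 3, 'c': 35, 'g': 42}])
print(merged)
```

Building the renamed dict in a separate step avoids changing final_dict while iterating over it.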

Extract x previous key-value pairs given a key from a dictionary

I have a dictionary like this:
dict_test ={1: 111, 2: 2222, 3:333, 4:4444, 5:5555, 6:6666,
7: 777, 8: 8888, 9:9999, 10:100010101}
and would like to create a subset of the dictionary that contains the entry for a given key (8, for instance) together with the four entries before it. So the expected dictionary would look like this:
dict_new ={4:4444, 5:5555, 6:6666, 7: 777, 8: 8888}
I tried to write a more general function, shown below, where I can choose how many previous entries to look back.
def get_x_prev_entries(dictionary: dict, key: str, prev: int):
    if key in dictionary:
        token = object()
        keys = [token]*(prev*-1) + sorted(dictionary) + [token]*diff
        print('keys' + str(keys))
        new_dict = []
        newkeys = []
        new_prev= prev
        # extract all keys that are between 0 and the specified difference
        while new_prev is not 0:
            new_prev -= 1
            if len(newkeys) == 0:
                newkeys= newkeys
            else:
                newkeys = newkeys.append(keys[keys.index(key)-new_diff])
            print(newkeys)
            print(new_diff)
        new_dict = {k:v for k, v in dictionary.items() if k in newkeys}
        return new_dict
    else:
        print('Key not found')
So to create my desired dictionary I would ideally enter
get_x_prev_entries(dict_test, 8, 4)
but at this moment I only get an empty dictionary returned. Any advice would be appreciated. Thanks!

Using an OrderedDict (this assumes consecutive integer keys):
from collections import OrderedDict
dict_test ={1: 111, 2: 2222, 3:333, 4:4444, 5:5555, 6:6666, 7: 777, 8: 8888, 9:9999, 10:100010101}
od_dict = OrderedDict(dict_test)
def get_previous_keys(od_dict, prev=4, given=8):
    if given not in od_dict:
        return None
    k, v = [], []
    # given + 1 so that the given key itself is included, as in the expected result
    for i in range(given - prev, given + 1):
        if i in od_dict:  # skip keys that fall before the start of the dict
            k.append(i)
            v.append(od_dict[i])
    return dict(zip(k, v))
print(get_previous_keys(od_dict))
{4: 4444, 5: 5555, 6: 6666, 7: 777, 8: 8888}
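The answer above relies on the keys being consecutive integers. When that cannot be assumed, one option is to sort the keys and slice around the position of the given key. This is a rough sketch: it reuses the function name from the question, and returning an empty dict for a missing key is an assumption, not something the question specifies.

```python
def get_x_prev_entries(dictionary, key, prev):
    """Return the entry for key plus up to prev entries before it, in sorted key order."""
    if key not in dictionary:
        return {}  # assumption: an empty dict signals "key not found"
    ordered_keys = sorted(dictionary)
    position = ordered_keys.index(key)
    # max(..., 0) stops the slice from wrapping around when fewer than prev keys precede key
    start = max(position - prev, 0)
    return {k: dictionary[k] for k in ordered_keys[start:position + 1]}

dict_test = {1: 111, 2: 2222, 3: 333, 4: 4444, 5: 5555, 6: 6666,
             7: 777, 8: 8888, 9: 9999, 10: 100010101}
print(get_x_prev_entries(dict_test, 8, 4))
```

Because it works on positions in the sorted key list rather than on key arithmetic, it also handles sparse or non-numeric keys, as long as the keys can be compared with each other.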

Function Return List as Dictionary?

My function below is taking a list of values and returning the counts of duplicates. I managed to make it count and print, but my task is to return it as a dictionary. I've been struggling to return in the correct format, any advice?
def counts(values):
    d = {}
    for val in values:
        d.setdefault(val,0)
        d[val] += 1
    for val, count in d.items():
        d = ("{} {}".format(val,count))
    return d
counts([1,1,1,2,3,3,3,3,5]) # Should return → {1: 3, 2: 1, 3: 4, 5: 1}

Just return the created dictionary:
def counts(values):
    d = {}
    for val in values:
        d.setdefault(val,0)
        d[val] += 1
    return d
Yields:
>>> counts([1,1,1,2,3,3,3,3,5])
{1: 3, 2: 1, 3: 4, 5: 1}
Of course, as Moses points out, a Counter is built for this so just use that instead:
from collections import Counter
def counts(values):
    return dict(Counter(values))
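As a side note, Counter is itself a subclass of dict, so the dict(...) conversion is mainly useful when a plain dictionary is explicitly required. A small sketch of how a Counter behaves (the name tally is only for illustration):

```python
from collections import Counter

tally = Counter([1, 1, 1, 2, 3, 3, 3, 3, 5])

print(isinstance(tally, dict))   # Counter subclasses dict
print(tally[3])                  # lookup works like a normal dict
print(tally[42])                 # a missing key gives 0 instead of raising KeyError
print(tally.most_common(1))      # the most frequent (value, count) pair
```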

Check for unique values in a dictionary and return a list

I've been struggling with this exercise for a couple of days now; each approach I find has a new problem. The idea is to find the unique values in a dictionary and return a list with their keys.
For example:
if aDictionary = {1: 1, 3: 2, 6: 0, 7: 0, 8: 4, 10: 0} then the function should return [1, 3, 8], as the values 1, 2 and 4 only appear once.
This is what I've tried so far:
def existsOnce(aDict):
    counting = {}
    tempList = []
    for k in aDict.keys():
        print(k, end=' ')
        print(aDict[k])
    print('values are:')
    for v in aDict.values():
        print(v, end=' ')
        counting[v] = counting.get(v, 0) + 1
        print(counting[v])
        tempNumbers = counting[v]
        tempList.append(tempNumbers)
    print(tempList)
If I go this way, I can find and delete the counts that are bigger than one, but the problem persists: I will still have one zero, and I don't want it, because it was not unique in the original dictionary.
def existsOnce2(aDict):
    # import Counter module in the top with `from collections import Counter`
    c = Counter()
    for letter in 'here is a sample of english text':
        c[letter] += 1
        if c[letter] == 1:
            print(c[letter], ':', letter)
I tried to go this way with integers and check which ones appear for the first time, but I cannot translate it to a dictionary or keep going from here. Also, I'm not sure whether importing modules is allowed in the answer, and surely there has to be a way to do it without external modules.
def existsOnce3(aDict):
    vals = {}
    for i in aDict.values():
        for j in set(str(i)):
            vals[j] = 1 + vals.get(j, 0)
    print(vals)
    '''till here I get a counter of how many times a value appears in the original dictionary, now I should delete those bigger than 1'''
    temp_vals = vals.copy()
    for x in vals:
        if vals[x] > 1:
            print('delete this: ', 'key:', x, 'value:', vals[x])
            temp_vals.pop(x)
        else:
            pass
    print('temporary dictionary values:', temp_vals)
    '''till here I reduced down the values that appear once, 1, 2 and 4, now I would need to go back and check the original dictionary and return the keys
    Original dictionary: {1: 1, 3: 2, 6: 0, 7: 0, 8: 4, 10: 0}
    temp_vals {'1': 1, '2': 1, '4': 1}
    keys on temp_vals (1,2,4) are the values associated to the keys I have to retrieve from the original dictionary (1,3,8)
    '''
    print('---')
    temp_list = []
    for eachTempVal in temp_vals:
        temp_list.append(eachTempVal)
    print('temporary list values:', temp_list)
    ''' till here I got a temporary list with the values I need to search in aDict'''
    print('---')
    for eachListVal in temp_list:
        print('eachListVal:', eachListVal)
        for k, v in aDict.items():
            print('key:', k, 'value:', v)
From here, for whatever reason, I cannot take the values and compare them. I've tried to extract the values with statements like:
if v == eachListVal:
    do something
But I'm doing something wrong and cannot access the values.

You just need to use your vals dict and keep the keys from aDict whose values have a count of 1 in vals. Then call sorted to get a sorted output list:
def existsOnce3(aDict):
    vals = {}
    # create dict to sum all value counts
    for i in aDict.values():
        vals.setdefault(i,0)
        vals[i] += 1
    # use each v/val from aDict as the key to vals
    # keeping each k/key from aDict if the count is 1
    return sorted(k for k, v in aDict.items() if vals[v] == 1)
To do the counting with a collections.Counter dict, just call Counter on your values and apply the same logic: keep each k whose v has a count of 1 in the Counter dict:
from collections import Counter
aDict = {1: 1, 3: 2, 6: 0, 7: 0, 8: 4, 10: 0}
cn = Counter(aDict.values())
print(sorted(k for k,v in aDict.items() if cn[v] == 1))
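A likely reason the comparison in the question never matched is that existsOnce3 counted with set(str(i)), so the keys of vals are strings such as '1', while the values in aDict are integers. The sketch below reproduces that counting step to show the mismatch; the names once, matches_as_is and matches_as_str are made up for the illustration.

```python
aDict = {1: 1, 3: 2, 6: 0, 7: 0, 8: 4, 10: 0}

# Counting as in the question: set(str(i)) turns each value into string characters
vals = {}
for i in aDict.values():
    for j in set(str(i)):
        vals[j] = 1 + vals.get(j, 0)

once = [x for x in vals if vals[x] == 1]   # keys are strings such as '1'
matches_as_is = [k for k, v in aDict.items() if v in once]
matches_as_str = [k for k, v in aDict.items() if str(v) in once]

print(once)
print(matches_as_is)    # empty: the int 1 never equals the string '1'
print(matches_as_str)
```

Converting with str(v) only papers over the problem, since set(str(i)) would also split a multi-digit value into separate digits. Counting the values themselves, as in the answer above, avoids both issues.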

How about this:
from collections import Counter
my_dict = {1: 1, 3: 2, 6: 0, 7: 0, 8: 4, 10: 0}
val_counter = Counter(my_dict.values())
my_list = [k for k, v in my_dict.items() if val_counter[v] == 1]
print(my_list)
Result:
[1, 3, 8]

One liner:
>>> aDictionary = {1: 1, 3: 2, 6: 0, 7: 0, 8: 4, 10: 0}
>>> unique_values = [k for k,v in aDictionary.items() if list(aDictionary.values()).count(v)==1]
>>> unique_values
[1, 3, 8]
