Filter a dictionary with a list of strings

I have a dictionary where every key and every value is unique. I'd like to filter it based on a list of strings. I've seen a lot of examples where the keys are consistent, but none where they are unique, as in the example below.
thisdict = { "brand": "Ford", "model": "Mustang", "year": 1964}
filt = ["rand", "ar"]
Expected result:
{"brand": "Ford","year": 1964}

I assume that a key should be kept when it contains any of the filter strings. Accordingly, my solution looks like this:
thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}
filt = ["rand", "ar"]
def matches(substrings, key):
    # True if any filter string occurs inside the key
    return any(s in key for s in substrings)
def filter_dict(d, substrings):
    # Named filter_dict to avoid shadowing the built-ins filter and dict
    return {k: v for k, v in d.items() if matches(substrings, k)}
print(filter_dict(thisdict, filt))
Output:
{'brand': 'Ford', 'year': 1964}
Or shortened:
thisdict = { "brand": "Ford", "model": "Mustang", "year": 1964}
filt = ["rand", "ar"]
filtered = {k: v for k, v in thisdict.items() if any(x in k for x in filt)}
print(filtered)
Output:
{'brand': 'Ford', 'year': 1964}

Use the any() function to search for partially matching keys.
# use any to search strings in filt among keys in thisdict
{k:v for k,v in thisdict.items() if any(s in k for s in filt)}
# {'brand': 'Ford', 'year': 1964}
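If the matching should also ignore case, the same any() pattern can be adapted. This is a minimal sketch, assuming all keys are strings; the helper name filter_keys and the ignore_case parameter are my own and not part of the question.

```python
def filter_keys(d, substrings, ignore_case=False):
    """Return the entries of d whose key contains any of the substrings."""
    if ignore_case:
        substrings = [s.lower() for s in substrings]
    result = {}
    for key, value in d.items():
        # Compare against a lowered copy of the key, but keep the original key
        haystack = key.lower() if ignore_case else key
        if any(s in haystack for s in substrings):
            result[key] = value
    return result

thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}
print(filter_keys(thisdict, ["rand", "ar"]))
# {'brand': 'Ford', 'year': 1964}
print(filter_keys(thisdict, ["RAND"], ignore_case=True))
# {'brand': 'Ford'}
```

Note that this is substring matching, so a short filter string such as "a" would match every key containing that letter.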

Related

Create a dict without if

I am trying to create a function to generate dictionary but without if.
Example:
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
But sometimes the year is not available for some models. Instead of using if, I want to create a function with the parameters brand, model and year. When year, or any other attribute, is None, it should be left out.
Like:
Year is None
thisdict = {
"brand": "Ford",
"model": "Mustang"
}
What do you think, is it possible to do this without if?

I believe this is what you are after: you can use **kwargs.
def generate_dict(**kwargs):
    # kwargs is already a dict of the keyword arguments that were passed
    return kwargs
generated_dict = generate_dict(name='John', age=30)
# {'name': 'John', 'age': 30}

Iterate over the dictionary keys and remove the keys with value None:
def create_dictionary(brand, model, year=None):
    res = {
        "brand": brand,
        "model": model,
        "year": year,
    }
    # Keep only the entries whose value is not None
    res = {k: v for k, v in res.items() if v is not None}
    return res
print(create_dictionary("Ford", "Mustang"))
Output:
{'brand': 'Ford', 'model': 'Mustang'}
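The two answers can be combined. A minimal sketch, assuming a hypothetical helper compact_dict (my own naming) that accepts arbitrary keyword arguments and drops those whose value is None. The comprehension still contains an if clause, so this only avoids writing a separate if statement per attribute.

```python
def compact_dict(**kwargs):
    # Keep only the keyword arguments that carry a real value
    return {k: v for k, v in kwargs.items() if v is not None}

print(compact_dict(brand="Ford", model="Mustang", year=None))
# {'brand': 'Ford', 'model': 'Mustang'}
print(compact_dict(brand="Ford", model="Mustang", year=1964))
# {'brand': 'Ford', 'model': 'Mustang', 'year': 1964}
```

Checking with "is not None" rather than truthiness keeps legitimate falsy values such as 0 or an empty string.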

Update Nested Dictionary value using Recursion

I want to update the values of the dictionary dict with the values from the dictionary inp, using recursion or a loop.
The structure should not change, meaning the recursion or loop should preserve the existing format.
Please suggest a solution that works for any level of nesting, not just this particular case.
dict={
"name": "john",
"quality":
{
"type1":"honest",
"type2":"clever"
},
"marks":
[
{
"english":34
},
{
"math":90
}
]
}
inp = {
"name" : "jack",
"type1" : "dumb",
"type2" : "liar",
"english" : 28,
"math" : 89
}

Another solution, changing the dict in-place:
dct = {
"name": "john",
"quality": {"type1": "honest", "type2": "clever"},
"marks": [{"english": 34}, {"math": 90}],
}
inp = {
"name": "jack",
"type1": "dumb",
"type2": "liar",
"english": 28,
"math": 89,
}
def change(d, inp):
    if isinstance(d, list):
        for i in d:
            change(i, inp)
    elif isinstance(d, dict):
        for k, v in d.items():
            if not isinstance(v, (list, dict)):
                # Leaf value: replace it if inp has the same key
                d[k] = inp.get(k, v)
            else:
                change(v, inp)
change(dct, inp)
print(dct)
Prints:
{
"name": "jack",
"quality": {"type1": "dumb", "type2": "liar"},
"marks": [{"english": 28}, {"math": 89}],
}

First, rename the first dictionary, say to myDict, since dict is the name of Python's built-in dictionary type and assigning to it shadows the built-in.
The function below does what you are looking for, recursively.
def recursive_swipe(input_var, updates):
    if isinstance(input_var, list):
        output_var = []
        for entry in input_var:
            output_var.append(recursive_swipe(entry, updates))
    elif isinstance(input_var, dict):
        output_var = {}
        for label in input_var:
            if isinstance(input_var[label], (list, dict)):
                output_var[label] = recursive_swipe(input_var[label], updates)
            else:
                # Take the updated value if there is one, otherwise keep the original
                output_var[label] = updates.get(label, input_var[label])
    else:
        output_var = input_var
    return output_var
myDict = recursive_swipe(myDict, inp)
You may look for more optimal solutions if there are some limits to the formatting of the two dictionaries that were not stated in your question.
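One caveat with either approach: the lookup is by key name alone, so a key that appears at several nesting levels is updated everywhere. A small sketch of that behaviour, using a hypothetical non-mutating helper updated and made-up sample data (neither is from the question):

```python
def updated(node, updates):
    """Return a copy of node with leaf values replaced by matching keys in updates."""
    if isinstance(node, list):
        return [updated(item, updates) for item in node]
    if isinstance(node, dict):
        return {
            key: (
                updated(value, updates)
                if isinstance(value, (list, dict))
                else updates.get(key, value)
            )
            for key, value in node.items()
        }
    return node

data = {"name": "john", "friends": [{"name": "mary"}, {"age": 30}]}
result = updated(data, {"name": "jack"})
print(result)
# {'name': 'jack', 'friends': [{'name': 'jack'}, {'age': 30}]}
print(data["name"])
# john  (the original is left untouched)
```

If only one of those occurrences should change, the update dictionary would need to carry path information rather than bare key names.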

reverse nested dicts using python

I have already looked at several related posts.
I have a sample dict as shown below:
t = {'thisdict':{
"brand": "Ford",
"model": "Mustang",
"year": 1964
},
'thatdict':{
"af": "jfsak",
"asjf": "jhas"}}
I am trying to reverse a python dictionary.
My code looks like below
print(type(t))
inv = {v:k for k,v in t.items()} # option 1 - error here
print(inv)
frozenset(t.items()) # option 2 - error here
Instead, I get the error below:
TypeError: unhashable type: 'dict'
Following a suggestion from one of those posts, I tried frozenset, but I still get the same error.
I expect my output to look like this:
t = {'thisdict':{
"Ford":"brand",
"Mustang":"model",
1964:"year"
},
'thatdict':{
"jfsak":"af",
"jhas":"asjf"}}

You can use nested dict comprehension:
t = {'thisdict': {"brand": "Ford", "model": "Mustang", "year": 1964},
'thatdict': {"af": "jfsak", "asjf": "jhas"}}
output = {k_out: {v: k for k, v in d.items()} for k_out, d in t.items()}
print(output)
# {'thisdict': {'Ford': 'brand', 'Mustang': 'model', 1964: 'year'}, 'thatdict': {'jfsak': 'af', 'jhas': 'asjf'}}
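Keep in mind that inverting only works cleanly when the values inside each inner dict are unique and hashable; duplicate values silently overwrite each other. A hedged sketch, with a hypothetical helper invert_grouped and a modified sample dict containing a duplicate value (both are my own additions), that collects the keys in a list instead:

```python
def invert_grouped(d):
    """Map each value of d to the list of keys that had that value."""
    inverted = {}
    for key, value in d.items():
        inverted.setdefault(value, []).append(key)
    return inverted

t = {"thisdict": {"brand": "Ford", "maker": "Ford", "year": 1964}}

# Plain inversion: the later key wins when two keys share a value
plain = {outer: {v: k for k, v in inner.items()} for outer, inner in t.items()}
print(plain)
# {'thisdict': {'Ford': 'maker', 1964: 'year'}}

# Grouped inversion: nothing is lost
grouped = {outer: invert_grouped(inner) for outer, inner in t.items()}
print(grouped)
# {'thisdict': {'Ford': ['brand', 'maker'], 1964: ['year']}}
```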

Robust way to sum all values corresponding to a particular objects property?

I have an array like this:
items = [
{
"title": "title1",
"category": "category1",
"value": 200
},
{
"title": "title2",
"category": "category2",
"value": 450
},
{
"title": "title3",
"category": "category1",
"value": 100
}
]
This array consists of many dictionaries, each with a category and a value property.
What is a robust way of getting an array of category objects with their values summed, like this:
data= [
{
"category": "category1",
"value": 300
},
{
"category": "category2",
"value": 450
}
]
I'm looking for the best algorithm or approach for both small and huge arrays. If there is an existing algorithm, please point me to the source.
What I tried:
data = []
for each item in items:
    if data has a dictionary with dictionary.category == item.category:
        data's dictionary.value = data's dictionary.value + item.value
    else:
        data.push({"category": item.category, "value": item.value})
Note: Any programming language is welcome. Please comment before downvoting.

In JavaScript, you can use reduce to group the array into an object, using the category as the property name. Then use Object.values to convert the object into an array.
var items = [{
"title": "title1",
"category": "category1",
"value": 200
},
{
"title": "title2",
"category": "category2",
"value": 450
},
{
"title": "title3",
"category": "category1",
"value": 100
}
];
var data = Object.values(items.reduce((c, v) => {
c[v.category] = c[v.category] || {category: v.category,value: 0};
c[v.category].value += v.value;
return c;
}, {}));
console.log(data);

What you need is an operation like SQL's GROUP BY. Such group-by operations are usually implemented with hashing. If all your data fits in memory (small to large data structures), you can implement it very quickly.
If your data structure is huge, you will need intermediate storage (such as a hard drive or a database).
An easy Python approach would be:
data_tmp = {}
for item in items:
    if item['category'] not in data_tmp:
        data_tmp[item['category']] = 0
    data_tmp[item['category']] += item['value']
data = []
for k, v in data_tmp.items():
    data.append({
        'category': k,
        'value': v
    })
# done
If you want more pythonic code you can use a defaultdict:
from collections import defaultdict
data_tmp = defaultdict(int)
for item in items:
    data_tmp[item['category']] += item['value']
data = []
for k, v in data_tmp.items():
    data.append({
        'category': k,
        'value': v
    })
# done
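Another standard-library option is collections.Counter, which treats missing keys as 0 in the same way defaultdict(int) does. A minimal sketch, assuming the values are numeric and reusing the items list from the question:

```python
from collections import Counter

items = [
    {"title": "title1", "category": "category1", "value": 200},
    {"title": "title2", "category": "category2", "value": 450},
    {"title": "title3", "category": "category1", "value": 100},
]

totals = Counter()
for item in items:
    # Missing categories start at 0, so no membership check is needed
    totals[item["category"]] += item["value"]

data = [{"category": c, "value": v} for c, v in totals.items()]
print(data)
# [{'category': 'category1', 'value': 300}, {'category': 'category2', 'value': 450}]
```

Like the dictionary versions, this is a single pass over the input with one hash lookup per item.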

In Python, Pandas is likely to be a more convenient and efficient way of doing this.
import pandas as pd
df = pd.DataFrame(items)
sums = df.groupby("category", as_index=False).sum()
data = sums.to_dict("records")
For the final step, it may be more convenient to leave sums as a dataframe and work with it like that instead of converting back to a list of dictionaries.

Using itertools.groupby:
from itertools import groupby
d = []
# groupby only merges adjacent items, so sort by category first
lista = sorted(items, key=lambda x: x['category'])
for k, g in groupby(lista, key=lambda x: x['category']):
    temp = {}
    temp['category'] = k
    temp['value'] = sum(i['value'] for i in g)
    d.append(temp)
print(d)
# [{'category': 'category1', 'value': 300}, {'category': 'category2', 'value': 450}]
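The sort matters because groupby only groups consecutive items with equal keys. A short sketch of what happens without it, using made-up sample rows (not the data from the question):

```python
from itertools import groupby

rows = [
    {"category": "a", "value": 1},
    {"category": "b", "value": 2},
    {"category": "a", "value": 3},
]

def by_category(row):
    return row["category"]

# Without sorting, "a" is split into two separate groups
unsorted_keys = [k for k, _ in groupby(rows, key=by_category)]
print(unsorted_keys)
# ['a', 'b', 'a']

# After sorting, each category forms exactly one group
sorted_totals = {
    k: sum(row["value"] for row in g)
    for k, g in groupby(sorted(rows, key=by_category), key=by_category)
}
print(sorted_totals)
# {'a': 4, 'b': 2}
```

The sort makes this approach O(n log n), whereas the dictionary-based answers stay at O(n).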

Python: Convert multiple columns of CSV file to nested Json

This is my input CSV file with multiple columns. I would like to convert it to a JSON file with department, departmentID, and one nested field called customer that holds the first and last values.
department, departmentID, first, last
fans, 1, Caroline, Smith
fans, 1, Jenny, White
students, 2, Ben, CJ
students, 2, Joan, Carpenter
...
The JSON output I need:
[
{
"department" : "fans",
"departmentID": "1",
"customer" : [
{
"first" : "Caroline",
"last" : "Smith"
},
{
"first" : "Jenny",
"last" : "White"
}
]
},
{
"department" : "students",
"departmentID":2,
"user" :
[
{
"first" : "Ben",
"last" : "CJ"
},
{
"first" : "Joan",
"last" : "Carpenter"
}
]
}
]
My code:
from csv import DictReader
from itertools import groupby
with open('data.csv') as csvfile:
    r = DictReader(csvfile, skipinitialspace=True)
    data = [dict(d) for d in r]
groups = []
uniquekeys = []
for k, g in groupby(data, lambda r: (r['group'], r['groupID'])):
    groups.append({
        "group": k[0],
        "groupID": k[1],
        "user": [{k:v for k, v in d.items() if k != 'group'} for d in list(g)]
    })
    uniquekeys.append(k)
pprint(groups)
My issue is: groupID shows up twice in the data, both inside and outside the nested JSON. What I want is group and groupID as the groupby key.

The issue is that you mixed up the key names, so this line
"user": [{k:v for k, v in d.items() if k != 'group'} for d in list(g)]
did not strip them from the dictionary: there was no such key, so nothing was deleted.
I do not fully understand which keys you want, so the following example assumes that data.csv looks exactly like in your question (with department and departmentID), while the script converts them to group and groupID:
from csv import DictReader
from itertools import groupby
from pprint import pprint
with open('data.csv') as csvfile:
    r = DictReader(csvfile, skipinitialspace=True)
    data = [dict(d) for d in r]
groups = []
uniquekeys = []
for k, g in groupby(data, lambda r: (r['department'], r['departmentID'])):
    groups.append({
        "group": k[0],
        "groupID": k[1],
        # Drop the grouping columns from each nested record
        "user": [{key: val for key, val in d.items() if key not in ['department', 'departmentID']} for d in g]
    })
    uniquekeys.append(k)
pprint(groups)
Output:
[{'group': 'fans',
'groupID': '1',
'user': [{'first': 'Caroline', 'last': 'Smith'},
{'first': 'Jenny', 'last': 'White'}]},
{'group': 'students',
'groupID': '2',
'user': [{'first': 'Ben', 'last': 'CJ'},
{'first': 'Joan', 'last': 'Carpenter'}]}]
I used different keys so that it is obvious which line does what, and so that it is easy to customize for different input or output keys.
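To get the JSON layout from the question (department, departmentID and a nested customer list), the same grouping can be passed to json.dumps. A minimal sketch, assuming the rows for each department are adjacent in the file, and reading from an in-memory string instead of data.csv so that it runs on its own:

```python
import csv
import io
import json
from itertools import groupby

csv_text = """department, departmentID, first, last
fans, 1, Caroline, Smith
fans, 1, Jenny, White
students, 2, Ben, CJ
students, 2, Joan, Carpenter
"""

# skipinitialspace strips the blank after each comma, in the header too
reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
rows = list(reader)

group_keys = ("department", "departmentID")
result = []
for (department, department_id), group in groupby(
    rows, key=lambda row: tuple(row[k] for k in group_keys)
):
    result.append({
        "department": department,
        "departmentID": department_id,
        # Everything that is not part of the grouping key goes into the nested list
        "customer": [
            {k: v for k, v in row.items() if k not in group_keys} for row in group
        ],
    })

print(json.dumps(result, indent=2))
```

The csv module reads every field as a string, so departmentID comes out as "1" rather than 1; convert it with int() if a number is needed. If the rows are not already grouped by department, sort them by the same key before calling groupby.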
