Python - Extracting values from a nested list - python

I have a list as shown below:
[{'id': 'id_123',
'type': 'type_1',
'created_at': '2020-02-12T17:45:00Z'},
{'id': 'id_124',
'type': 'type_2',
'created_at': '2020-02-12T18:15:00Z'},
{'id': 'id_125',
'type': 'type_1',
'created_at': '2020-02-13T19:43:00Z'},
{'id': 'id_126',
'type': 'type_3',
'created_at': '2020-02-13T07:00:00Z'}]
I am trying to find how many times type : type_1 occurs and what is the earliest created_at timestamp in that list for type_1

We can achieve this in several steps.
To find the number of times type_1 occurs we can use the built-in filter in tandem with itemgetter.
from operator import itemgetter
def my_filter(item):
    return item['type'] == 'type_1'
key = itemgetter('created_at')
items = sorted(filter(my_filter, data), key=key)
print(f"Num records is {len(items)}")
print(f"Earliest record is {key(items[0])}")
Num records is 2
Earliest record is 2020-02-12T17:45:00Z
Alternatively, you can use a generator expression and sort its output.
gen = (item for item in data if item['type'] == 'type_1')
items = sorted(gen, key=key)
# rest of the steps are the same...
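If only the count and the earliest timestamp are needed, sorting can be skipped entirely. The following is a minimal sketch, not part of the original answer: count_and_earliest is a hypothetical helper name, and it assumes every created_at string uses the same fixed-width UTC format as in the question, so that plain string comparison orders the timestamps correctly.

```python
data = [
    {'id': 'id_123', 'type': 'type_1', 'created_at': '2020-02-12T17:45:00Z'},
    {'id': 'id_124', 'type': 'type_2', 'created_at': '2020-02-12T18:15:00Z'},
    {'id': 'id_125', 'type': 'type_1', 'created_at': '2020-02-13T19:43:00Z'},
    {'id': 'id_126', 'type': 'type_3', 'created_at': '2020-02-13T07:00:00Z'},
]


def count_and_earliest(records, wanted_type):
    """Return (count, earliest created_at) for records of wanted_type.

    Hypothetical helper; relies on all timestamps sharing one
    fixed-width UTC format so string order equals time order.
    """
    timestamps = [r['created_at'] for r in records if r['type'] == wanted_type]
    # default=None avoids a ValueError when no record matches
    return len(timestamps), min(timestamps, default=None)


count, earliest = count_and_earliest(data, 'type_1')
print(f"Num records is {count}")
print(f"Earliest record is {earliest}")
```

min scans the matching records once, whereas sorted orders all of them just to read the first element.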

You could use list comprehension to get all the sublists you're interested in, then sort by 'created_at'.
l = [{'id': 'id_123',
'type': 'type_1',
'created_at': '2020-02-12T17:45:00Z'},
{'id': 'id_124',
'type': 'type_2',
'created_at': '2020-02-12T18:15:00Z'},
{'id': 'id_125',
'type': 'type_1',
'created_at': '2020-02-13T19:43:00Z'},
{'id': 'id_126',
'type': 'type_3',
'created_at': '2020-02-13T07:00:00Z'}]
ll = [x for x in l if x['type'] == 'type_1']
ll.sort(key=lambda k: k['created_at'])
print(len(ll))
print(ll[0]['created_at'])
Output:
2
2020-02-12T17:45:00Z

This is one approach using filter and min.
Ex:
data = [{'id': 'id_123',
'type': 'type_1',
'created_at': '2020-02-12T17:45:00Z'},
{'id': 'id_124',
'type': 'type_2',
'created_at': '2020-02-12T18:15:00Z'},
{'id': 'id_125',
'type': 'type_1',
'created_at': '2020-02-13T19:43:00Z'},
{'id': 'id_126',
'type': 'type_3',
'created_at': '2020-02-13T07:00:00Z'}]
onlytype_1 = list(filter(lambda x: x['type'] == 'type_1', data))
print(len(onlytype_1))
print(min(onlytype_1, key=lambda x: x['created_at']))
Or:
temp = {}
for i in data:
    temp.setdefault(i['type'], []).append(i)
print(len(temp['type_1']))
print(min(temp['type_1'], key=lambda x: x['created_at']))
Output:
2
{'id': 'id_123', 'type': 'type_1', 'created_at': '2020-02-12T17:45:00Z'}

You can generate a list of the created_at values for all type_1 entries with a list comprehension, and then sort it using datetime.strptime as the key (vals is the list from the question):
from datetime import datetime
# Generate a list with only the type_1s' created_at values
type1s = [val['created_at'] for val in vals if val['type']=="type_1"]
# Sort them based on the timestamps
type1s.sort(key=lambda date: datetime.strptime(date, "%Y-%m-%dT%H:%M:%SZ"))
# Print the lowest value
print(type1s[0])
#'2020-02-12T17:45:00Z'
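The same parsing step can be extended to track the earliest timestamp for every type in a single pass. This is only a sketch under assumptions: parse_utc and earliest_by_type are hypothetical names, and the trailing Z is assumed to always mean UTC, which is why the parsed value is tagged with timezone.utc.

```python
from datetime import datetime, timezone

data = [
    {'id': 'id_123', 'type': 'type_1', 'created_at': '2020-02-12T17:45:00Z'},
    {'id': 'id_124', 'type': 'type_2', 'created_at': '2020-02-12T18:15:00Z'},
    {'id': 'id_125', 'type': 'type_1', 'created_at': '2020-02-13T19:43:00Z'},
    {'id': 'id_126', 'type': 'type_3', 'created_at': '2020-02-13T07:00:00Z'},
]


def parse_utc(stamp):
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' string into an aware UTC datetime."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    # Assumption: the literal 'Z' suffix always denotes UTC
    return parsed.replace(tzinfo=timezone.utc)


earliest_by_type = {}
for record in data:
    when = parse_utc(record['created_at'])
    current = earliest_by_type.get(record['type'])
    # Keep the smaller (earlier) datetime seen so far for this type
    if current is None or when < current:
        earliest_by_type[record['type']] = when

print(earliest_by_type['type_1'].isoformat())
```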

You can use the following function to get the desired output:
from datetime import datetime
def sol(l):
    sum_ = 0
    dict_ = {}
    for x in l:
        if x['type'] == 'type_1':
            sum_ += 1
            dict_[x['id']] = datetime.strptime(x['created_at'], "%Y-%m-%dT%H:%M:%SZ")
    date = sorted(dict_.values())[0]
    for key, value in dict_.items():
        if value == date: id_ = key
    return sum_, date, id_
sol(l)
This function returns, in order, the number of records with type 'type_1', the earliest created_at among them, and the id of that record.
Hope this helps!

Related

Python, sort dict based on external list

I have to sort a dict like:
jobs = {'elem_05': {'id': 'fifth'},
'elem_03': {'id': 'third'},
'elem_01': {'id': 'first'},
'elem_00': {'id': 'zeroth'},
'elem_04': {'id': 'fourth'},
'elem_02': {'id': 'second'}}
based on the "id" elements, whose order can be found in a list:
sorting_list = ['zeroth', 'first', 'second', 'third', 'fourth', 'fifth']
The trivial way to solve the problem is to use:
tmp = {}
for x in sorting_list:
    for k, v in jobs.items():
        if v["id"] == x:
            tmp.update({k: v})
but I was trying to figure out a more efficient and pythonic way.
I've been trying sorted and lambda functions as key, but I'm not familiar with that yet, so I was unsuccessful so far.
I would use a dictionary as key for sorted:
order = {k:i for i,k in enumerate(sorting_list)}
# {'zeroth': 0, 'first': 1, 'second': 2, 'third': 3, 'fourth': 4, 'fifth': 5}
out = dict(sorted(jobs.items(), key=lambda x: order.get(x[1].get('id'))))
output:
{'elem_00': {'id': 'zeroth'},
'elem_01': {'id': 'first'},
'elem_02': {'id': 'second'},
'elem_03': {'id': 'third'},
'elem_04': {'id': 'fourth'},
'elem_05': {'id': 'fifth'}}
There is a way to sort the dict using lambda as a sorting key:
jobs = {'elem_05': {'id': 'fifth'},
'elem_03': {'id': 'third'},
'elem_01': {'id': 'first'},
'elem_00': {'id': 'zeroth'},
'elem_04': {'id': 'fourth'},
'elem_02': {'id': 'second'}}
sorting_list = ['zeroth', 'first', 'second', 'third', 'fourth', 'fifth']
sorted_jobs = dict(sorted(jobs.items(), key=lambda x: sorting_list.index(x[1]['id'])))
print(sorted_jobs)
This outputs
{'elem_00': {'id': 'zeroth'}, 'elem_01': {'id': 'first'}, 'elem_02': {'id': 'second'}, 'elem_03': {'id': 'third'}, 'elem_04': {'id': 'fourth'}, 'elem_05': {'id': 'fifth'}}
I have a feeling the sorted expression could be cleaner but I didn't get it to work any other way.
You can use OrderedDict:
from collections import OrderedDict
sorted_jobs = OrderedDict(sorted(jobs.items(), key=lambda item: sorting_list.index(item[1]['id'])))
This creates an OrderedDict object, which behaves much like a dict and can be converted to one using dict(sorted_jobs).
Similar to what is already posted, but with error checking in case id doesn't appear in sorting_list
sorting_list = ['zeroth', 'first', 'second', 'third', 'fourth', 'fifth']
jobs = {'elem_05': {'id': 'fifth'},
'elem_03': {'id': 'third'},
'elem_01': {'id': 'first'},
'elem_00': {'id': 'zeroth'},
'elem_04': {'id': 'fourth'},
'elem_02': {'id': 'second'}}
def custom_order(item):
    try:
        return sorting_list.index(item[1]["id"])
    except ValueError:
        return len(sorting_list)
jobs_sorted = {k: v for k, v in sorted(jobs.items(), key=custom_order)}
print(jobs_sorted)
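The error-checked version above calls sorting_list.index once per item, which scans the list each time. A possible variation, sketched here under assumptions, precomputes the positions in a dict and falls back to the end for unknown ids. The 'elem_99' entry and the position function are hypothetical additions for illustration only.

```python
sorting_list = ['zeroth', 'first', 'second', 'third', 'fourth', 'fifth']
jobs = {'elem_99': {'id': 'unknown'},  # hypothetical entry, not in sorting_list
        'elem_05': {'id': 'fifth'},
        'elem_03': {'id': 'third'},
        'elem_01': {'id': 'first'},
        'elem_00': {'id': 'zeroth'},
        'elem_04': {'id': 'fourth'},
        'elem_02': {'id': 'second'}}

# Map each id to its position once, so every lookup is a dict access
order = {name: pos for pos, name in enumerate(sorting_list)}


def position(item):
    """Sort key: position of the item's id, or past the end if unknown."""
    _, value = item
    return order.get(value['id'], len(order))


jobs_sorted = dict(sorted(jobs.items(), key=position))
print(list(jobs_sorted))
```

Entries whose id is missing from sorting_list end up after all known ones, in their original relative order, because sorted is stable.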
The sorted function costs O(n log n) in average time complexity. For a linear time complexity you can instead create a reverse mapping that maps each ID to the corresponding dict entry:
mapping = {d['id']: (k, d) for k, d in jobs.items()}
so that you can then construct a new dict by mapping sorting_list with the ID mapping above:
dict(map(mapping.get, sorting_list))
which, with your sample input, returns:
{'elem_00': {'id': 'zeroth'}, 'elem_01': {'id': 'first'}, 'elem_02': {'id': 'second'}, 'elem_03': {'id': 'third'}, 'elem_04': {'id': 'fourth'}, 'elem_05': {'id': 'fifth'}}
Demo: https://replit.com/#blhsing/WorseChartreuseFonts
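The reverse-mapping approach assumes every id in sorting_list exists in jobs; otherwise mapping.get returns None and dict() cannot build an entry from it. A hedged sketch of a variant that skips such ids follows. The extra 'sixth' id is a hypothetical addition to show the behaviour.

```python
jobs = {'elem_05': {'id': 'fifth'},
        'elem_03': {'id': 'third'},
        'elem_01': {'id': 'first'},
        'elem_00': {'id': 'zeroth'},
        'elem_04': {'id': 'fourth'},
        'elem_02': {'id': 'second'}}
# 'sixth' is hypothetical: it has no matching entry in jobs
sorting_list = ['zeroth', 'first', 'second', 'third', 'fourth', 'fifth', 'sixth']

# Reverse mapping: id -> (original key, original value)
mapping = {d['id']: (k, d) for k, d in jobs.items()}

# Walk sorting_list once and keep only the ids that actually exist
ordered = dict(mapping[i] for i in sorting_list if i in mapping)
print(ordered)
```

Note that this variant silently drops any entry of jobs whose id does not appear in sorting_list, which may or may not be what you want.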

From list to nested dictionary

I have the following lists:
data = ['man', 'man1', 'man2']
key = ['name', 'id', 'sal']
man_res = ['Alexandra', 'RST01', '$34,000']
man1_res = ['Santio', 'RST009', '$45,000']
man2_res = ['Rumbalski', 'RST50', '$78,000']
The expected output is a nested dictionary:
{'man': {'name': 'Alexandra', 'id': 'RST01', 'sal': '$34,000'},
'man1': {'name': 'Santio', 'id': 'RST009', 'sal': '$45,000'},
'man2': {'name': 'Rumbalski', 'id': 'RST50', 'sal': '$78,000'}}
Easy way would be using pandas dataframe
import pandas as pd
df = pd.DataFrame([man_res, man1_res, man2_res], index=data, columns=key)
print(df)
df.to_dict(orient='index')
           name      id      sal
man   Alexandra   RST01  $34,000
man1     Santio  RST009  $45,000
man2  Rumbalski   RST50  $78,000
{'man': {'name': 'Alexandra', 'id': 'RST01', 'sal': '$34,000'},
'man1': {'name': 'Santio', 'id': 'RST009', 'sal': '$45,000'},
'man2': {'name': 'Rumbalski', 'id': 'RST50', 'sal': '$78,000'}}
Or you could manually merge them using dict + zip
d = dict(zip(
data,
(dict(zip(key, res)) for res in (man_res, man1_res, man2_res))
))
d
{'man': {'name': 'Alexandra', 'id': 'RST01', 'sal': '$34,000'},
'man1': {'name': 'Santio', 'id': 'RST009', 'sal': '$45,000'},
'man2': {'name': 'Rumbalski', 'id': 'RST50', 'sal': '$78,000'}}
# Save the rows in a 2D list
all_man_res = []
all_man_res.append(man_res)
all_man_res.append(man1_res)
all_man_res.append(man2_res)
print(all_man_res)
# Build the nested dict from the rows
output = {}
for i in range(len(data)):
    person = data[i]
    details = {}
    for j in range(len(key)):
        value = key[j]
        details[value] = all_man_res[i][j]
    output[person] = details
output
The pandas dataframe answer provided by NoThInG makes the most intuitive sense. If you want to use only the built-in Python tools, you can do:
info_list = [dict(zip(key, man)) for man in (man_res, man1_res, man2_res)]
output = dict(zip(data, info_list))
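The dict + zip idea can be wrapped in a small function that also checks the input lengths, since zip silently truncates to the shortest input. This is a sketch only; build_nested is a hypothetical name, and raising ValueError on a mismatch is one possible design choice.

```python
data = ['man', 'man1', 'man2']
key = ['name', 'id', 'sal']
man_res = ['Alexandra', 'RST01', '$34,000']
man1_res = ['Santio', 'RST009', '$45,000']
man2_res = ['Rumbalski', 'RST50', '$78,000']


def build_nested(names, keys, rows):
    """Pair each name with a dict built from keys and the matching row."""
    if len(names) != len(rows):
        raise ValueError("names and rows must have the same length")
    result = {}
    for name, row in zip(names, rows):
        if len(row) != len(keys):
            raise ValueError(f"row for {name!r} does not match the keys")
        result[name] = dict(zip(keys, row))
    return result


output = build_nested(data, key, [man_res, man1_res, man2_res])
print(output)
```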

How can I remove nested keys and create a new dict and link both with an ID?

I have a problem. I have a dict my_Dict, which is partly nested. I would like to 'clean up' my_Dict: I want to separate out all nested dicts and generate a unique ID for each so that I can later find the corresponding object again.
For example, I have detail: {...}. This nested dict should later become an independent dict my_Detail_Dict, and in addition detail should receive a unique ID within my_Dict. Unfortunately, the list that I output is empty. How can I remove my nested keys and give them an ID?
my_Dict = {
'_key': '1',
'group': 'test',
'data': {},
'type': '',
'code': '007',
'conType': '1',
'flag': None,
'createdAt': '2021',
'currency': 'EUR',
'detail': {
'selector': {
'number': '12312',
'isTrue': True,
'requirements': [{
'type': 'customer',
'requirement': '1'}]
}
}
}
def nested_dict(my_Dict):
    my_new_dict_list = []
    for key in my_Dict.keys():
        #print(f"Looking for {key}")
        if isinstance(my_Dict[key], dict):
            print(f"{key} is nested")
            # Add id to nested stuff
            my_Dict[key]["__id"] = 1
            my_nested_Dict = my_Dict[key]
            # Delete all nested from the key
            del my_Dict[key]
            # Add id to key, but not the nested stuff
            my_Dict[key] = 1
            my_new_dict_list.append(my_Dict[key])
    my_new_dict_list.append(my_Dict)
    return my_new_dict_list
nested_dict(my_Dict)
[OUT] []
# What I want
[my_Dict, my_Details_Dict, my_Data_Dict]
What I have
{'_key': '1',
'group': 'test',
'data': {},
'type': '',
'code': '007',
'conType': '1',
'flag': None,
'createdAt': '2021',
'currency': 'EUR',
'detail': {'selector': {'number': '12312',
'isTrue': True,
'requirements': [{'type': 'customer', 'requirement': '1'}]}}}
What I want
my_Dict = {'_key': '1',
'group': 'test',
'data': 18,
'type': '',
'code': '007',
'conType': '1',
'flag': None,
'createdAt': '2021',
'currency': 'EUR',
'detail': 22}
my_Data_Dict = {'__id': 18}
my_Detail_Dict = {'selector': {'number': '12312',
'isTrue': True,
'requirements': [{'type': 'customer', 'requirement': '1'}]}, '__id': 22}
The following code snippet will solve what you are trying to do:
my_Dict = {
'_key': '1',
'group': 'test',
'data': {},
'type': '',
'code': '007',
'conType': '1',
'flag': None,
'createdAt': '2021',
'currency': 'EUR',
'detail': {
'selector': {
'number': '12312',
'isTrue': True,
'requirements': [{
'type': 'customer',
'requirement': '1'}]
}
}
}
def nested_dict(my_Dict):
    # Initializing a dictionary that will store all the nested dictionaries
    my_new_dict = {}
    idx = 0
    for key in my_Dict.keys():
        # Checking which keys are nested i.e are dictionaries
        if isinstance(my_Dict[key], dict):
            # Generating ID
            idx += 1
            # Adding generated ID as another key
            my_Dict[key]["__id"] = idx
            # Adding nested key with the ID to the new dictionary
            my_new_dict[key] = my_Dict[key]
            # Replacing nested key value with the generated ID
            my_Dict[key] = idx
    # Returning new dictionary containing all nested dictionaries with ID
    return my_new_dict
result = nested_dict(my_Dict)
print(my_Dict)
# Iterating through dictionary to get all nested dictionaries
for item in result.items():
    print(item)
If I understand you correctly, you wish to automatically make each nested dictionary its own variable, and remove it from the main dictionary.
Finding the nested dictionaries and removing them from the main dictionary is not so difficult. However, automatically assigning them to variables is not recommended for various reasons. Instead, what I would do is store all these dictionaries in a list, and then assign them manually to variables.
# Prepare a list to store data in
individual_dicts = []
id_index = 1
for key in my_Dict.keys():
    # For each key, we get the current value
    value = my_Dict[key]
    # Determine if the current value is a dictionary. If so, then it's a nested dict
    if isinstance(value, dict):
        print(key + " is a nested dict")
        # Get the nested dictionary, and replace it with the ID
        dict_value = my_Dict[key]
        my_Dict[key] = id_index
        # Add the id to previously nested dictionary
        dict_value['__id'] = id_index
        id_index = id_index + 1 # increase for next nested dict
        individual_dicts.append(dict_value) # store it as a new dictionary
# Manually write out variable names, and assign the nested dictionaries to them.
# 'data' comes before 'detail' in my_Dict, so the list holds them in that order.
[my_Data_Dict, my_Detail_Dict] = individual_dicts
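Both answers above modify my_Dict in place. If the original should stay untouched, a non-mutating variant is possible. The sketch below is one way to do it under assumptions: split_nested is a hypothetical name, IDs are simple counters starting at 1 (not the 18 and 22 from the question), and only the top level is split, as in the answers above.

```python
from itertools import count

my_Dict = {
    '_key': '1',
    'group': 'test',
    'data': {},
    'type': '',
    'flag': None,
    'detail': {'selector': {'number': '12312', 'isTrue': True}},
}


def split_nested(source, start=1):
    """Return (flat, extracted) without modifying source.

    flat has each top-level nested dict replaced by a generated ID;
    extracted maps the original key to a copy carrying that ID.
    """
    ids = count(start)
    flat, extracted = {}, {}
    for key, value in source.items():
        if isinstance(value, dict):
            new_id = next(ids)
            # Shallow copy plus the ID, so the caller's dict is not changed
            extracted[key] = {**value, '__id': new_id}
            flat[key] = new_id
        else:
            flat[key] = value
    return flat, extracted


flat, extracted = split_nested(my_Dict)
print(flat)
print(extracted)
```

Keeping the extracted dicts keyed by their original name avoids relying on list order when assigning them to variables later.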

Build a dictionary with single elements or lists as values

I have a list of dictionaries:
mydict = [
{'name': 'test1', 'value': '1_1'},
{'name': 'test2', 'value': '2_1'},
{'name': 'test1', 'value': '1_2'},
{'name': 'test1', 'value': '1_3'},
{'name': 'test3', 'value': '3_1'},
{'name': 'test4', 'value': '4_1'},
{'name': 'test4', 'value': '4_2'},
]
I would like to use it to create a dictionary where the values are lists or single values, depending on the number of their occurrences in the list above.
Expected output:
outputdict = {
'test1': ['1_1', '1_2', '1_3'],
'test2': '2_1',
'test3': '3_1',
'test4': ['4_1', '4_2'],
}
I tried to do it the way below, but it always returns a list, even when there is just one value element.
outputdict = {}
for entry in mydict:
    outputdict.setdefault(entry.get('name'), []).append(entry.get('value'))
The current output is:
outputdict = {
'test1': ['1_1', '1_2', '1_3'],
'test2': ['2_1'],
'test3': ['3_1'],
'test4': ['4_1', '4_2'],
}
Do what you have already done, and then convert single-element lists afterwards:
outputdict = {
name: (value if len(value) > 1 else value[0])
for name, value in outputdict.items()
}
You can use a couple of built-in tools, mainly itertools.groupby:
from itertools import groupby
from operator import itemgetter
mydict = [
{'name': 'test1', 'value': '1_1'},
{'name': 'test2', 'value': '2_1'},
{'name': 'test1', 'value': '1_2'},
{'name': 'test1', 'value': '1_3'},
{'name': 'test3', 'value': '3_1'},
{'name': 'test4', 'value': '4_1'},
{'name': 'test4', 'value': '4_2'},
]
def keyFunc(x):
    return x['name']
outputdict = {}
# groupby only groups consecutive items with the same key, so sort by name first
# (sorted is stable, so values keep their original order within each name)
for name, groups in groupby(sorted(mydict, key=keyFunc), keyFunc):
    # groups is an iterator over all the items that have the matched name
    values = list(map(itemgetter('value'), groups))
    if len(values) == 1:
        outputdict[name] = values[0]
    else:
        outputdict[name] = values
print(outputdict)
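Grouping does not strictly need groupby. As a sketch of an alternative (not from the answers above), collections.defaultdict can collect the values in one pass without sorting, and a dict comprehension can then unwrap the single-element lists.

```python
from collections import defaultdict

mydict = [
    {'name': 'test1', 'value': '1_1'},
    {'name': 'test2', 'value': '2_1'},
    {'name': 'test1', 'value': '1_2'},
    {'name': 'test1', 'value': '1_3'},
    {'name': 'test3', 'value': '3_1'},
    {'name': 'test4', 'value': '4_1'},
    {'name': 'test4', 'value': '4_2'},
]

# Collect every value under its name; missing names start as an empty list
grouped = defaultdict(list)
for entry in mydict:
    grouped[entry['name']].append(entry['value'])

# Unwrap lists that hold exactly one value, keep the others as lists
outputdict = {
    name: values[0] if len(values) == 1 else values
    for name, values in grouped.items()
}
print(outputdict)
```

Be aware that mixing lists and single values means every consumer of outputdict has to check the type of each value.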

Dynamically assign obtained results to variables in Python

I have an API response for listing out information of all Volumes. I want to loop through the response and get the value of the name and assign each one of them dynamically to each url.
This is my main API endpoint which returns the following:
[{'source': None, 'serial': '23432', 'created': '2018-11-12T04:27:14Z', 'name': 'v001', 'size': 456456},
{'source': None, 'serial': '4364576', 'created': '2018-11-12T04:27:16Z', 'name': 'v002', 'size': 345435},
{'source': None, 'serial': '6445645', 'created': '2018-11-12T04:27:17Z', 'name': 'v003', 'size': 23432},
{'source': None, 'serial': 'we43235', 'created': '2018-11-12T04:27:20Z', 'name': 'v004', 'size': 35435}]
I'm doing this to get the value of 'name'
test_url = 'https://0.0.0.0/api/1.1/volume'
test_data = json.loads(r.get(test_url, headers=headers,
verify=False).content.decode('UTF-8'))
new_data = [{
'name': value['name']
} for value in test_data]
final_data = [val['name'] for val in new_data]
for k in final_data:
    print(k)
k prints out all the values of name, but I'm stuck on how to use them to build the different API endpoints. Currently, k returns
v001
v002
v003
v004
I want to assign each one of them to different endpoints like below:
url_v001 = test_url + v001
url_v002 = test_url + v002
url_v003 = test_url + v003
url_v004 = test_url + v004
I want this to be dynamically done, because there may be more than 4 volume names returned by my main API.
Creating variables dynamically is not a good idea; the better way is to use a dictionary:
d={}
for k in final_data:
    d['url_'+k] = test_url + k
Or, more concisely, with a dictionary comprehension:
d={'url_'+k:test_url + k for k in final_data}
And now:
print(d)
Both produce:
{'url_v001': 'https://0.0.0.0/api/1.1/volumev001', 'url_v002': 'https://0.0.0.0/api/1.1/volumev002', 'url_v003': 'https://0.0.0.0/api/1.1/volumev003', 'url_v004': 'https://0.0.0.0/api/1.1/volumev004'}
To use d:
for k,v in d.items():
    print(k+',',v)
Outputs:
url_v001, https://0.0.0.0/api/1.1/volumev001
url_v002, https://0.0.0.0/api/1.1/volumev002
url_v003, https://0.0.0.0/api/1.1/volumev003
url_v004, https://0.0.0.0/api/1.1/volumev004
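Note that plain concatenation yields URLs such as .../volumev001, with no separator between the base and the name. If the API expects the name as its own path segment, which is an assumption about this API and not stated in the question, a sketch like the following builds the dictionary directly from the parsed response. volume_urls is a hypothetical helper, and test_data here is a trimmed stand-in for the real response.

```python
test_url = 'https://0.0.0.0/api/1.1/volume'

# Stand-in for the parsed API response (only the fields used here)
test_data = [
    {'name': 'v001', 'size': 456456},
    {'name': 'v002', 'size': 345435},
    {'name': 'v003', 'size': 23432},
    {'name': 'v004', 'size': 35435},
]


def volume_urls(base_url, volumes):
    """Map each volume name to base_url/<name>.

    Assumption: the endpoint layout is <base>/<volume name>.
    """
    base = base_url.rstrip('/')
    return {v['name']: f"{base}/{v['name']}" for v in volumes}


urls = volume_urls(test_url, test_data)
for name, url in urls.items():
    print(name, url)
```

Keying the dictionary by the volume name itself lets you look up urls['v001'] without building a 'url_' prefix.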
