Parsing dictionary and grouping output with Python

Let's say I have a
dictionary = {
'host_type' : {'public_ip':['ip_address','ip_address','ip_address'],
'private_dns':['dns_name','dns_name','dns_name']}
}
Let's say there are 3 host types: master, slave, backup.
The dictionary can contain a different number of hosts for each host type. For example, for 2 masters, 6 slaves, 2 backups the dictionary would look like this:
dictionary =
{
'master' : {
'public_ip':['ip_address','ip_address'],
'private_dns': ['dns_name','dns_name']
},
'slave' : {
'public_ip':['ip_address','ip_address', 'ip_address','ip_address','ip_address','ip_address'],
'private_dns': ['dns_name','dns_name','dns_name','dns_name','dns_name','dns_name']
},
'backup' : {
'public_ip':['ip_address','ip_address'],
'private_dns':['dns_name','dns_name']
}
}
Now I want to parse the dictionary and group the hosts in such a way that each group always has 1 master, 1 backup, and 3 slaves. How can I parse such a dictionary to get output like this:
master,public_ip,private_dns
backup,public_ip,private_dns
slave,public_ip,private_dns
slave,public_ip,private_dns
slave,public_ip,private_dns
master,public_ip,private_dns
backup,public_ip,private_dns
slave,public_ip,private_dns
slave,public_ip,private_dns
slave,public_ip,private_dns

d = {
'master' : {
'public_ip':['ip_address0M','ip_address1M'],
'private_dns': ['dns_name','dns_name']
},
'slave' : {
'public_ip':['ip_address0s','ip_address1s', 'ip_address2s','ip_address3s','ip_address4s','ip_address5s'],
'private_dns': ['dns_name','dns_name','dns_name','dns_name','dns_name','dns_name']
},
'backup' : {
'public_ip':['ip_address0b','ip_address1b'],
'private_dns':['dns_name','dns_name']
}
}
masterCount = 0
slaveCount = 0
backupCount = 0
result = []
# Build one group per iteration while enough hosts of every type remain.
while (masterCount + 1 <= len(d['master']['public_ip'])
        and slaveCount + 3 <= len(d['slave']['public_ip'])
        and backupCount + 1 <= len(d['backup']['public_ip'])):
    group = [d['master']['public_ip'][masterCount],
             d['slave']['public_ip'][slaveCount:slaveCount + 3],
             d['backup']['public_ip'][backupCount]]
    result.append(group)
    masterCount += 1
    slaveCount += 3
    backupCount += 1  # advance to the next backup
print(result)
Now result is of the format:
result[index][0] is the master
result[index][1] is the list of three slaves
result[index][2] is the backup
[EDIT]
You can do something similar to add the DNS. I have not added it as you mentioned you only wanted the directions.
Output:
[['ip_address0M', ['ip_address0s', 'ip_address1s', 'ip_address2s'], 'ip_address0b'], ['ip_address1M', ['ip_address3s', 'ip_address4s', 'ip_address5s'], 'ip_address1b']]

m1 = d['master']['public_ip']
m2 = d['master']['private_dns']
b1 = d['backup']['public_ip']
b2 = d['backup']['private_dns']
s1 = d['slave']['public_ip']
s2 = d['slave']['private_dns']
# zip(*[iter(...)]*3) consumes the slave pairs three at a time.
groups = list(zip(zip(m1, m2), zip(b1, b2), zip(*[iter(zip(s1, s2))]*3)))
There's probably a better solution for resolving all the lists from the dictionary, but this should work.
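For the flat master/backup/slave rows shown in the question, a minimal sketch could look like the following. It assumes every host type carries parallel public_ip and private_dns lists; the function name group_hosts, the slaves_per_group parameter, and the sample values are illustrative and not part of the original post.

```python
def group_hosts(hosts, slaves_per_group=3):
    """Return (host_type, public_ip, private_dns) rows, grouped 1 master, 1 backup, N slaves."""
    def pairs(host_type):
        entry = hosts[host_type]
        return list(zip(entry["public_ip"], entry["private_dns"]))

    masters, backups, slaves = pairs("master"), pairs("backup"), pairs("slave")
    # Only complete groups are emitted; leftover hosts are ignored.
    n_groups = min(len(masters), len(backups), len(slaves) // slaves_per_group)
    rows = []
    for g in range(n_groups):
        rows.append(("master",) + masters[g])
        rows.append(("backup",) + backups[g])
        start = g * slaves_per_group
        for pair in slaves[start:start + slaves_per_group]:
            rows.append(("slave",) + pair)
    return rows


# Hypothetical sample data in the same shape as the question's dictionary.
sample = {
    "master": {"public_ip": ["m0", "m1"], "private_dns": ["md0", "md1"]},
    "slave": {"public_ip": [f"s{i}" for i in range(6)],
              "private_dns": [f"sd{i}" for i in range(6)]},
    "backup": {"public_ip": ["b0", "b1"], "private_dns": ["bd0", "bd1"]},
}
for row in group_hosts(sample):
    print(",".join(row))
```

Hosts that do not fill a complete group are dropped here; whether that is the right choice depends on how partial groups should be handled.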

Related

How to take 2 Tables of data, and come up with combination that fit the restrictions

What I would like to do is take these two tables and find a combination of items from Table 1 whose total is at most the total of a combination from Table 2 plus 1,500, but never below the Table 2 total plus 500. The combination should be returned at the end so it can be used in the rest of the code.
For example, let's say we pick a combination that uses all 4 items in Table 2, which is allowed since it meets the restrictions. Adding up all the values in Table 2 gives 11,620. Now we have to come up with a combination from Table 1 whose value is at least 12,120 but no more than 13,120.
If you need more detail about what I'm trying to achieve here, please let me know!
Restrictions
Each combination can only have up to 4 items.
The value of each item is defined by its "value" field.
Table 1
[
{
"UAID":143071570,
"assetId":19027209,
"name":"Perfectly Legitimate Business Hat",
"value":10549
},
{
"UAID":143334875,
"assetId":19027209,
"name":"Perfectly Legitimate Business Hat",
"value":10549
},
{
"UAID":1235149469,
"assetId":100425864,
"name":"Deluxe Game Headset",
"value":1795
},
{
"UAID":2756318596,
"assetId":20573078,
"name":"Shaggy",
"value":1565
},
{
"UAID":3499638196,
"assetId":20573078,
"name":"Shaggy",
"value":1565
},
{
"UAID":11002211144,
"assetId":102618797,
"name":"DJ Remix's Goldphones",
"value":7393
},
{
"UAID":50913661583,
"assetId":4390875496,
"name":"Diamond Crystal Circlet",
"value":4886
}
]
Table 2
[
{
"UAID":672099668,
"assetId":60888284,
"name":"DarkAge Ninjas: Dual Kamas",
"value":4461
},
{
"UAID":6599510068,
"assetId":554663566,
"name":"Manicbot 10000",
"value":4319
},
{
"UAID":63414825508,
"assetId":91679217,
"name":"Sailing Hat",
"value":1886
},
{
"UAID":150428091864,
"assetId":8785277745,
"name":"Cincinnati Bengals Super Bowl LVI Helmet",
"value":954
}
]

This is a technique that I've shown multiple times, and nobody else seems to have heard of.
The idea is the same as Finding all possible combinations of numbers to reach a given sum but more complicated since we need to do it repeatedly. It is to create a data structure from which subsets can be easily found that sum to a particular value.
class SubsetSumIter:
    def __init__(self, data):
        self.data = data
        # Build up an auxiliary data structure to find solutions:
        # for each reachable sum, the indexes of the items that can come last.
        last_index = {0: [-1]}
        for i in range(len(data)):
            for s in list(last_index.keys()):
                new_s = s + data[i]['value']
                if new_s in last_index:
                    last_index[new_s].append(i)
                else:
                    last_index[new_s] = [i]
        self.last_index_by_target = last_index
        self.targets = sorted(last_index.keys())
    def subsets_at_target(self, target, max_i=None):
        if max_i is None:
            max_i = len(self.data)
        for i in self.last_index_by_target[target]:
            if i == -1:
                yield []  # empty sum
            elif max_i <= i:
                break  # went past our solutions
            else:
                for answer in self.subsets_at_target(target - self.data[i]["value"], i):
                    answer.append(self.data[i])
                    yield answer
    def subsets_in_range(self, lower, upper):
        # Binary search for the last target that is below lower.
        i_min = 0
        i_max = len(self.targets)
        while 1 < i_max - i_min:
            i_mid = (i_min + i_max) // 2
            if self.targets[i_mid] < lower:
                i_min = i_mid
            else:
                i_max = i_mid
        i = i_min + 1
        while i < len(self.targets) and self.targets[i] <= upper:
            for answer in self.subsets_at_target(self.targets[i]):
                yield answer
            i = i + 1
From this we can create your desired join condition as follows:
def complex_join(table1, table2):
    iter1 = SubsetSumIter(table1)
    iter2 = SubsetSumIter(table2)
    # For each sum from table2
    for target in iter2.targets:
        # For each combination from table 1 in our desired range
        for subset1 in iter1.subsets_in_range(target + 500, target + 1500):
            # For each combination from table 2 that gets that target
            for subset2 in iter2.subsets_at_target(target):
                yield (subset1, subset2)
And to find all 38 solutions to your example:
t1 = [
{
"UAID":143071570,
"assetId":19027209,
"name":"Perfectly Legitimate Business Hat",
"value":10549
},
{
"UAID":143334875,
"assetId":19027209,
"name":"Perfectly Legitimate Business Hat",
"value":10549
},
{
"UAID":1235149469,
"assetId":100425864,
"name":"Deluxe Game Headset",
"value":1795
},
{
"UAID":2756318596,
"assetId":20573078,
"name":"Shaggy",
"value":1565
},
{
"UAID":3499638196,
"assetId":20573078,
"name":"Shaggy",
"value":1565
},
{
"UAID":11002211144,
"assetId":102618797,
"name":"DJ Remix's Goldphones",
"value":7393
},
{
"UAID":50913661583,
"assetId":4390875496,
"name":"Diamond Crystal Circlet",
"value":4886
}
]
t2 = [
{
"UAID":672099668,
"assetId":60888284,
"name":"DarkAge Ninjas: Dual Kamas",
"value":4461
},
{
"UAID":6599510068,
"assetId":554663566,
"name":"Manicbot 10000",
"value":4319
},
{
"UAID":63414825508,
"assetId":91679217,
"name":"Sailing Hat",
"value":1886
},
{
"UAID":150428091864,
"assetId":8785277745,
"name":"Cincinnati Bengals Super Bowl LVI Helmet",
"value":954
}
]
for answer in complex_join(t1, t2):
    print(answer)
And if you want to get a (possibly large) list at the end you can simply list(complex_join(t1, t2)).
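The code above does not apply the question's limit of four items per combination. For tables this small, a plain brute-force alternative (not the subset-sum structure above) can apply that limit directly with itertools.combinations. This is only a sketch: the names limited_subsets and brute_force_join, the keyword parameters, and the tiny demo tables are assumptions made for illustration.

```python
from itertools import combinations


def limited_subsets(items, max_items=4):
    """Yield every non-empty combination of at most max_items items."""
    for size in range(1, min(max_items, len(items)) + 1):
        yield from combinations(items, size)


def brute_force_join(table1, table2, low=500, high=1500, max_items=4):
    """Pair each table2 combination with table1 combinations worth total2+low .. total2+high."""
    results = []
    for subset2 in limited_subsets(table2, max_items):
        total2 = sum(item["value"] for item in subset2)
        for subset1 in limited_subsets(table1, max_items):
            total1 = sum(item["value"] for item in subset1)
            if total2 + low <= total1 <= total2 + high:
                results.append((subset1, subset2))
    return results


# Hypothetical demo tables, much smaller than the ones in the question.
demo1 = [{"name": "a", "value": 600}, {"name": "b", "value": 1000}]
demo2 = [{"name": "x", "value": 100}]
for subset1, subset2 in brute_force_join(demo1, demo2):
    print([item["name"] for item in subset1], [item["name"] for item in subset2])
```

The number of combinations grows quickly with table size, so this approach only suits short tables; the precomputed structure above is the better fit when the tables are large.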

How to combine / concatenate a variable and "object path" in a for loop?

I am working on a script that can transform a .json format into a .idf format for energy simulations in EnergyPlus. As part of this script, I need to create schedules based on a number of points in time and the values until each given time. As an example, this is one of the .json elements I am trying to convert:
"BOT": {"SpacesInModel": [
{"IndoorClimateZone": {
"Schedules": {
"PeopleSchedule": {"Timer": [
{ "$numberInt": "0" },
{ "$numberInt": "10" },
{ "$numberInt": "20" },
{ "$numberInt": "24" }
],
"Load": [{ "$numberDouble": "0.5" }, { "$numberInt": "1" }, { "$numberDouble": "0.5" }]
}
I have currently created the following code that reads and writes the required input for the .idf file, but I would like to do this as a for loop rather than a number of if statements
### Define helper function determining the number format
def IntOrDouble(path_string):
    try:
        return_value = path_string["$numberInt"]
    except Exception:
        return_value = path_string["$numberDouble"]
    return(return_value)
### Create .idf object and loop over .json format extracting schedule inputs
for item in data['BOT']['SpacesInModel']:
    InputFile.newidfobject("Schedule:Day:Interval") #.idf object
    DailySchedule = InputFile.idfobjects["Schedule:Day:Interval"][-1]
    People_Time = item["IndoorClimateZone"]["Schedules"]["PeopleSchedule"]["Timer"]
    People_Load = item["IndoorClimateZone"]["Schedules"]["PeopleSchedule"]["Load"]
    if len(People_Time) >= 0 and len(People_Time) != 0:
        DailySchedule.Time_2 = IntOrDouble(People_Time[0])
    if len(People_Time) >= 1 and len(People_Time) != 1:
        DailySchedule.Time_2 = IntOrDouble(People_Time[1])
        DailySchedule.Value_Until_Time_2 = IntOrDouble(People_Load[0])
    if len(People_Time) >= 2 and len(People_Time) != 2:
        DailySchedule.Time_3 = IntOrDouble(People_Time[2])
        DailySchedule.Value_Until_Time_3 = IntOrDouble(People_Load[1])
    if len(People_Time) >= 3 and len(People_Time) != 3:
        DailySchedule.Time_4 = IntOrDouble(People_Time[3])
        DailySchedule.Value_Until_Time_4 = IntOrDouble(People_Load[2])
    if len(People_Time) >= 4 and len(People_Time) != 4:
        DailySchedule.Time_5 = IntOrDouble(People_Time[4])
        DailySchedule.Value_Until_Time_4 = IntOrDouble(People_Load[3])
My problem is that I do not know how to "concatenate" the variable DailySchedule with the changeable object name /path e.g. Time_1 or Load_4. The index of Time_i Load_i would have to follow the index of the for loop. As of now, this is the closest I got (knowing that this is not a real solution :-) )
for i in range(len(People_Time)):
DailySchedule."Time_{0}".format(i+1) = IntOrDouble(People_Time[i+1])
DailySchedule."Load_{0}".format(i+1) = IntOrDouble(People_Load[i])

You can use Python's f-strings to build the field name and square bracket notation to set that field on the object. Loop over People_Load so that People_Time[i+1] stays in range, since the timer list has one more entry than the load list:
for i in range(len(People_Load)):
    DailySchedule[f"Time_{i+1}"] = IntOrDouble(People_Time[i+1])
    DailySchedule[f"Value_Until_Time_{i+1}"] = IntOrDouble(People_Load[i])
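Whether square-bracket assignment works depends on the object the .idf library hands back. The built-in setattr is the general Python way to assign an attribute whose name is computed at run time, so a sketch with setattr may be a useful fallback. The example below uses types.SimpleNamespace as a stand-in for the schedule object; the helper names int_or_double and fill_schedule are illustrative, and the pairing of load i with timer entry i + 1 is an assumption about how the schedule is meant to be read.

```python
from types import SimpleNamespace


def int_or_double(entry):
    """Return the number stored under "$numberInt" or "$numberDouble"."""
    if "$numberInt" in entry:
        return entry["$numberInt"]
    return entry["$numberDouble"]


def fill_schedule(schedule, times, loads):
    # Assumption: times[0] is the start of the day, so load i applies until times[i + 1].
    for i, load in enumerate(loads):
        setattr(schedule, f"Time_{i + 1}", int_or_double(times[i + 1]))
        setattr(schedule, f"Value_Until_Time_{i + 1}", int_or_double(load))
    return schedule


# Timer and Load values copied from the question's JSON fragment.
times = [{"$numberInt": "0"}, {"$numberInt": "10"},
         {"$numberInt": "20"}, {"$numberInt": "24"}]
loads = [{"$numberDouble": "0.5"}, {"$numberInt": "1"}, {"$numberDouble": "0.5"}]

# SimpleNamespace stands in for the real schedule object here.
schedule = fill_schedule(SimpleNamespace(), times, loads)
print(schedule.Time_1, schedule.Value_Until_Time_1)
```

Iterating over the loads rather than the times keeps the i + 1 index inside the timer list.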

Kotlin set Array as key for a HashMap

I'm doing a bit of LeetCode and I'm stuck on the Group Anagrams problem. I have a Python background, where I can do the following:
res = defaultdict(list)
count = [0] * 26
res[tuple(count)].append(s)
As we can see, the array converted to a tuple can be used as the key for the dictionary. I want to do the same thing in Kotlin; however, when I write this logic in a for loop in Kotlin, I get a different key object every time.
fun groupAnagrams(strs: Array<String>): List<List<String>> {
    val hashMap = hashMapOf<IntArray, ArrayList<String>>()
    for (word in strs) {
        val array = IntArray(26) { 0 }
        for (char in word) {
            val charInt = char - 'a'
            array[charInt] += 1
        }
        if (hashMap.containsKey(array)) {
            hashMap[array]!!.add(word)
        } else {
            hashMap[array] = ArrayList<String>().apply { add(word) }
        }
    }
    return hashMap.values.toList()
}
Is this something that can be done in Kotlin?

Equality for IntArray is checked based on its reference. You can use a List here in place of IntArray. Two Lists are equal if they contain the same elements.
Modified code will be like this:
fun groupAnagrams(strs: Array<String>): List<List<String>> {
    val hashMap = hashMapOf<List<Int>, ArrayList<String>>()
    for (word in strs) {
        // Lists compare by content, so equal letter counts give equal keys.
        val array = MutableList(26) { 0 }
        for (char in word) {
            val charInt = char - 'a'
            array[charInt] += 1
        }
        if (hashMap.containsKey(array)) {
            hashMap[array]!!.add(word)
        } else {
            hashMap[array] = ArrayList<String>().apply { add(word) }
        }
    }
    return hashMap.values.toList()
}

You can avoid the problem you ran into (equality of arrays) by using String keys:
import java.security.MessageDigest
fun groupAnagramsWithHashing(strs: Array<String>): List<List<String>> {
    val map = hashMapOf<String, MutableList<String>>()
    MessageDigest.getInstance("SHA-1").also { sha ->
        for (word in strs) {
            // Anagrams share the same sorted bytes, so they hash to the same key.
            word.toByteArray().sorted().forEach { sha.update(it) }
            val key = sha.digest().joinToString()
            map.computeIfAbsent(key) { mutableListOf() }.add(word)
        }
    }
    return map.values.toList()
}
fun main() {
    val input = arrayOf("eat", "tea", "tan", "ate", "nat", "bat")
    groupAnagramsWithHashing(input).also { println(it) }
    // [[eat, tea, ate], [bat], [tan, nat]]
}

Avoid iterating too much time - Algorithm construction

I have a list - memory_per_instance - which looks like the following:
[
{
'mem_used': '14868480',
'rsrc_name': 'node-5b5cf484-g582f'
},
{
'mem_used': '106618880',
'rsrc_name': 'infrastructure-656cf59bbb-xc6bb'
},
{
'mem_used': '27566080',
'rsrc_name': 'infrastructuret-l6fl'
},
{
'mem_used': '215556096',
'rsrc_name': 'node-62lnc'
}
]
Here we can see that there are 2 resource groups: node and infrastructure.
I would like to create an array whose entries contain the name of the resource group (node or infrastructure) and a mem_used that is the sum of the group's mem_used values.
I was already able to differentiate the two groups with a regex.
From here, how can I create an array - memory_per_group - with a result such as
[
{
'mem_used': '230424576',
'rsrc_name': 'node'
},
{
'mem_used': '134184960',
'rsrc_name': 'infrastructure'
},
]
I could store the name of the rsrc in a tmp variable, so something like:
memory_per_pod_group = []
for item in memory_per_pod_instance:
    tmp_rsrc = item['rsrc_name']
    if(item['rsrc_name'] == tmp_rsrc):
        memory_per_pod_group.append({'rsrc_name':get_group(tmp_rsrc, pod_hash_map), 'mem_used':mem_used})
        memory_per_pod_instance.remove(item)
pprint.pprint(memory_per_pod_group)
But then I would iterate through the list a non-negligible number of times.
Is there a more efficient way?

Well, sure. You only need one iteration:
data = [
{
'mem_used': '14868480',
'rsrc_name': 'node-5b5cf484-g582f'
},
{
'mem_used': '106618880',
'rsrc_name': 'infrastructure-656cf59bbb-xc6bb'
},
{
'mem_used': '27566080',
'rsrc_name': 'infrastructuret-l6fl'
},
{
'mem_used': '215556096',
'rsrc_name': 'node-62lnc'
}
]
def get_group(item):
    rsrc_name = item['rsrc_name']
    # The group is everything before the first hyphen.
    index = rsrc_name.index('-')
    return rsrc_name[0:index]
def summary(items):
    totals = {}
    for item in items:
        group = get_group(item)
        if group not in totals:
            totals[group] = 0
        totals[group] += int(item['mem_used'])
    result = []
    for rsrc_name, mem_used in totals.items():
        result.append({'rsrc_name': rsrc_name, 'mem_used': str(mem_used)})
    return result
if __name__ == '__main__':
    print(summary(data))
Result:
[{'rsrc_name': 'node', 'mem_used': '230424576'}, {'rsrc_name': 'infrastructure', 'mem_used': '106618880'}, {'rsrc_name': 'infrastructuret', 'mem_used': '27566080'}]
Note, that get_group might be too simple for your use case. The result has three groups since one of the resources has key 'infrastructuret' with a "t" at the end.
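If the group names are known in advance, one way around the 'infrastructuret' case is to match each resource name against those prefixes instead of splitting on the hyphen. The following is a sketch under that assumption; KNOWN_GROUPS, group_of, and memory_per_group are illustrative names, and the real grouping rule in the question (a regex plus pod_hash_map) may differ.

```python
from collections import defaultdict

# Assumption: the set of group names is known up front.
KNOWN_GROUPS = ("node", "infrastructure")


def group_of(rsrc_name, known_groups=KNOWN_GROUPS):
    """Return the first known group that prefixes the name, else the text before the first hyphen."""
    for group in known_groups:
        if rsrc_name.startswith(group):
            return group
    return rsrc_name.split("-", 1)[0]


def memory_per_group(instances):
    """Sum mem_used per group in a single pass over the list."""
    totals = defaultdict(int)
    for item in instances:
        totals[group_of(item["rsrc_name"])] += int(item["mem_used"])
    return [{"rsrc_name": name, "mem_used": str(total)} for name, total in totals.items()]


memory_per_instance = [
    {"mem_used": "14868480", "rsrc_name": "node-5b5cf484-g582f"},
    {"mem_used": "106618880", "rsrc_name": "infrastructure-656cf59bbb-xc6bb"},
    {"mem_used": "27566080", "rsrc_name": "infrastructuret-l6fl"},
    {"mem_used": "215556096", "rsrc_name": "node-62lnc"},
]
print(memory_per_group(memory_per_instance))
```

Prefix matching picks the first matching entry in KNOWN_GROUPS, so if one group name is a prefix of another, the longer name should be listed first.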

You could iterate through it a single time, check each name with a simple startswith, and add directly to the dictionary key that you want. Since mem_used is stored as a string, convert it with int() before adding.
Something like
memory_total = {'node': 0, 'infrastructure': 0}
for item in memory_per_instance:
    if item['rsrc_name'].startswith('node'):
        memory_total['node'] += int(item['mem_used'])
    if item['rsrc_name'].startswith('infrastructure'):
        memory_total['infrastructure'] += int(item['mem_used'])

Logic for building converter using python dictionary values

I have this slice of JSON loaded into a Python dictionary (size_dict):
{
"sizeOptionName":"XS",
"sizeOptionId":"1528",
"sortOrderNumber":"7017"
},
{
"sizeOptionName":"S",
"sizeOptionId":"1529",
"sortOrderNumber":"7047"
},
{
"sizeOptionName":"M",
"sizeOptionId":"1530",
"sortOrderNumber":"7095"
}
and I have products with size Id (dictionary_prod):
{
"catalogItemId":"7627712",
"catalogItemTypeId":"3",
"regularPrice":"0.0",
"sizeDimension1Id":"1528",
"sizeDimension2Id":"0",
}
I need to produce output like this for each product:
result_dict = {'variant':
[{"catalogItemId":"7627712", ...some other info...,
'sizeName': 'XS', 'sizeId': '1528'}]}
So I need to look up the size ID and add the size information to the new result object.
What is the most Pythonic way to do this?
I don't know how to get the right data from size_dict:
if int(dictionary_prod['sizeDimension1Id']) > 0:
    (result_dict['variant']).append('sizeName': size_dict???)

As Tommy mentioned, this is best facilitated by mapping the size IDs to their respective dictionaries.
size_dict = \
[
{
"sizeOptionName":"XS",
"sizeOptionId":"1528",
"sortOrderNumber":"7017"
},
{
"sizeOptionName":"S",
"sizeOptionId":"1529",
"sortOrderNumber":"7047"
},
{
"sizeOptionName":"M",
"sizeOptionId":"1530",
"sortOrderNumber":"7095"
}
]
size_id_map = {size["sizeOptionId"] : size for size in size_dict}
production_dict = \
[
{
"catalogItemId":"7627712",
"catalogItemTypeId":"3",
"regularPrice":"0.0",
"sizeDimension1Id":"1528",
"sizeDimension2Id":"0",
}
]
def make_variant(idict):
    odict = idict.copy()
    size_id = odict.pop("sizeDimension1Id")
    odict.pop("sizeDimension2Id")
    odict["sizeName"] = size_id_map[size_id]["sizeOptionName"]
    odict["sizeId"] = size_id
    return odict
result_dict = \
{
"variant" : [make_variant(product) for product in production_dict]
}
print(result_dict)
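A direct lookup such as size_id_map[size_id] raises KeyError when a product carries a size ID that is not in the size list (the sample product has sizeDimension2Id set to "0", for instance). A minimal sketch of a tolerant lookup follows, assuming unknown IDs should simply leave the size fields out; the helper name add_size_info, its id_field parameter, and the shortened sample data are illustrative.

```python
def add_size_info(product, size_by_id, id_field="sizeDimension1Id"):
    """Return a copy of product with sizeName/sizeId added when the size ID is known."""
    variant = dict(product)
    # dict.get returns None instead of raising when the ID is missing.
    size = size_by_id.get(variant.get(id_field))
    if size is not None:
        variant["sizeName"] = size["sizeOptionName"]
        variant["sizeId"] = size["sizeOptionId"]
    return variant


# Shortened, hypothetical versions of the question's data.
sizes = [
    {"sizeOptionName": "XS", "sizeOptionId": "1528"},
    {"sizeOptionName": "S", "sizeOptionId": "1529"},
]
size_by_id = {size["sizeOptionId"]: size for size in sizes}

known = add_size_info({"catalogItemId": "7627712", "sizeDimension1Id": "1528"}, size_by_id)
unknown = add_size_info({"catalogItemId": "1", "sizeDimension1Id": "0"}, size_by_id)
print(known)
print(unknown)
```

Passing id_field="sizeDimension2Id" would apply the same lookup to the second size dimension.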

Your question is a little confusing, but it looks like you have a list (size_dict) of dictionaries that contain some information, and you want to do a lookup to find the particular element whose sizeOptionName you are interested in, so that you can read off its sizeOptionId.
So first you could organise your size_dict as a dictionary rather than a list, i.e.
sizeDict = {"XS":{
"sizeOptionName":"XS",
"sizeOptionId":"1528",
"sortOrderNumber":"7017"
}, "S": {
"sizeOptionName":"S",
"sizeOptionId":"1529",
"sortOrderNumber":"7047"
}, ...
You could then read off the sizeOptionId you need by doing:
sizeDict[sizeNameYouAreLookingFor]["sizeOptionId"]
Alternatively, you could keep your current structure and just search the list of dictionaries that is size_dict.
So:
for elem in size_dict:
    if elem["sizeOptionName"] == sizeNameYouAreLookingFor:
        OptionID = elem["sizeOptionId"]
Or perhaps you are asking something else?
