I had a list containing a single long string, and I wanted to print the output in a particular form
(see "convert list to a particular json in python"),
but after the conversion the order of the data changed. How can I maintain the same order?
input_data =
[
"21:15-21:30 IllegalAgrumentsException 1,
21:15-21:30 NullPointerException 2,
22:00-22:15 UserNotFoundException 1,
22:15-22:30 NullPointerException 1
....."
]
Code to convert the data into a particular JSON form:
import collections
import re

input_data = input[0]  # input is a list containing a single long string
input_data = re.split(r',\s*', input_data)
output = collections.defaultdict(collections.Counter)
# print(output)
for line in input_data:
    time, error, count = line.split(None, 2)
    output[time][error] += int(count)
print(output)
response = [
    {
        "time": time,
        "logs": [
            {"exception": exception, "count": count}
            for (exception, count) in counter.items()
        ],
    }
    for (time, counter) in output.items()
]
print(response)
My output :
{
"response": [
{
"logs": [
{
"count": 1,
"exception": "UserNotFoundException"
}
],
"time": "22:45-23:00"
},
{
"logs": [
{
"count": 1,
"exception": "NullPointerException"
}
],
"time": "23:00-23:15"
}...
]
}
So the order has changed, but I need my data to stay in the same order, i.e. starting from 21:15-21:30 and so on. How can I maintain the same order?
Your timestamps are already sortable, so if you don't care about the order of individual exceptions, you can just do:
for (time, counter) in sorted(output.items())
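which sorts lexicographically by the time string. Because dictionary keys are unique, the sort never has to fall back to comparing the counters, so this is equivalent to sorted(output.items(), key=lambda x: x[0]). If you also want the exceptions within each time slot ordered by count descending, iterate over counter.most_common() instead of counter.items().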
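As a minimal sketch of the sorted approach, assuming the time ranges never wrap past midnight (so that string order equals chronological order): the sample dictionary below is made up for illustration and is deliberately built out of order.

```python
from collections import Counter

# Hypothetical grouped data, intentionally inserted out of time order.
output = {
    "22:00-22:15": Counter({"UserNotFoundException": 1}),
    "21:15-21:30": Counter({"IllegalAgrumentsException": 1, "NullPointerException": 2}),
}

response = [
    {
        "time": time,
        # most_common() yields (exception, count) pairs, highest count first.
        "logs": [{"exception": exc, "count": n} for exc, n in counter.most_common()],
    }
    # Sort the (time, counter) pairs by the time string only.
    for time, counter in sorted(output.items(), key=lambda item: item[0])
]
print(response)
```

If the ranges can cross midnight, string sorting is no longer chronological and the original input order has to be tracked instead.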
The data is read into a dictionary, a defaultdict to be precise:
output[time][error] += int(count)
This data structure groups the data by time and by error type, which implies that there may be multiple items with the same time and the same error type. There is no way to keep the "same order" once the data has been regrouped like that.
On the other hand, you probably expect the times to be ordered in the input, and even if they are not, you want the output ordered by time. So you just need to do that; instead of this:
for (time, counter) in output.items()
do this:
for time in sorted(output)
and then get the counter as
counter = output[time]
EDIT: the times are in order but do not start at 0:00, so sorting by the time string is not correct (a range after midnight would sort before the evening ones). Instead, sort by the order in which the times originally appear.
Therefore, remember the original time order:
time_order = []
for line in input_data:
    time, error, count = line.split(None, 2)
    output[time][error] += int(count)
    time_order.append(time)
Then later sort by it:
for time in sorted(output, key=time_order.index)
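An alternative sketch that avoids the sort entirely: group into an OrderedDict, which keeps keys in first-seen order on any Python version (plain dicts also do this from Python 3.7). The function name group_logs is made up for this example, and the input string is the sample from the question.

```python
import re
from collections import Counter, OrderedDict

def group_logs(raw):
    """Group 'time error count' records by time, keeping first-seen time order."""
    grouped = OrderedDict()
    for line in re.split(r",\s*", raw.strip()):
        time, error, count = line.split(None, 2)
        # setdefault creates the Counter the first time a time slot is seen.
        grouped.setdefault(time, Counter())[error] += int(count)
    return [
        {
            "time": time,
            "logs": [{"exception": exc, "count": n} for exc, n in counter.items()],
        }
        for time, counter in grouped.items()
    ]

raw = """21:15-21:30 IllegalAgrumentsException 1,
21:15-21:30 NullPointerException 2,
22:00-22:15 UserNotFoundException 1,
22:15-22:30 NullPointerException 1"""
response = group_logs(raw)
print(response)
```

Note that if the result is later serialized by a framework that sorts dictionary keys, only the key order inside each object changes; the order of the list elements is preserved.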
I have the code below to count the elements in a JSON list and remove duplicates.
My problem is that when it needs to read thousands of lines of data, this code takes a long time to finish.
Can anyone tell me whether there is a better way to do this?
# count json elements
BattleAmount = []
for i in DATA:
    amount = DATA.count(i)
    j = copy.deepcopy(i)
    j['md']["amount"] = j['md']["amount"] + amount
    BattleAmount.append(j)
print("Number of BattleAmount are ", len(BattleAmount))
# remove duplicates
duplicates = []
for i in BattleAmount:
    if BattleAmount.count(i) > 1:
        if i not in duplicates:
            duplicates.append(i)
The JSON has this format:
[{"_id": {"$oid": "SL"}, "md": {"mana": 24, "rule_set": "Standard", "amount": 12}, "team": {other dict here}
full JSON structure as below
thank you
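If you do not care about the order of the elements in BattleAmount, you can use set(), provided the elements are hashable. Dictionaries are not hashable, so set(BattleAmount) raises a TypeError here; serialize each element first, for example with json.dumps:
unique_elements = {json.dumps(item, sort_keys=True) for item in BattleAmount}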
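A sketch of a single-pass alternative to the repeated list.count() calls, which make the original loops quadratic. It assumes every record is JSON-serializable and that adding the number of occurrences to md.amount once per distinct record is the intended result; the function name count_and_dedupe and the sample records are made up for illustration.

```python
import json
from collections import Counter

def count_and_dedupe(data):
    """Return one copy of each distinct record, with md.amount raised by its occurrence count."""
    # A canonical JSON string is hashable, so it can serve as a Counter key.
    counts = Counter(json.dumps(item, sort_keys=True) for item in data)
    result = []
    for key, occurrences in counts.items():
        item = json.loads(key)  # fresh copy, so the input is not modified
        item["md"]["amount"] += occurrences
        result.append(item)
    return result

# Hypothetical sample: two identical records and one different record.
DATA = [
    {"_id": {"$oid": "SL"}, "md": {"mana": 24, "rule_set": "Standard", "amount": 12}},
    {"_id": {"$oid": "SL"}, "md": {"mana": 24, "rule_set": "Standard", "amount": 12}},
    {"_id": {"$oid": "XY"}, "md": {"mana": 30, "rule_set": "Standard", "amount": 5}},
]
unique = count_and_dedupe(DATA)
print(len(unique), [u["md"]["amount"] for u in unique])
```

The round trip through json.dumps with sort_keys=True changes the key order inside each record, which does not matter for equality but may matter if the exact layout is needed.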
I have two dictionaries of data that need to be compared, fetching the matching data from one dictionary for the entries of the other:
netname = []
netstatus = []
Dict1:
data1: {
"node1":["id1",["net1","net2"]],
"node2":["id2",["net3","net4"]],
"node3":["id3",["net5","net1"]],
"node4":["id4",["net2","net5"]],
....
....
....
}
Dict2:
data2: {
"detail1":["net1","id1","netone","available"],
"detail2":["net2","id2","nettwo","available"],
"detail3":["net1","id3","netthree","not available"],
"detail4":["net4","id4","netfour","not available"],
"detail5":["net5","id4","netfive","available"],
"detail6":["net2","id2","netsix","available"],
....
....
}
I am trying to get the complete details of every node in a tabular format using prettytable.
The code I am trying is:
for node, values in data1.items():
    id = values[0]
    networks = values[1]
    for network in networks:
        if any(any(network in x for x in netlist) for netlist in data2.values()):
            if any((network in y for y in data2.values() if y[0] == network and y[1] == id)):
                for val in data2.values():
                    if (val[0] == network and val[1] == id):
                        nwinfo = netname.append(val[2])
                        nwstatus = netstatus.append(val[3])
            else:
                print("node id", id, "is not registered in network", network)
        else:
            print("Node is not registered in any networks..")
When I execute this code, I get wrong values. Are the any() conditions correct here, or do I need to add anything to display the correct values after comparing data1 with data2?
The first any() condition in the above script is meant to check whether the id from data1 is present anywhere in data2.
The second any() condition is meant to check whether the id is connected to the respective network.
In short, I want each id and net(n) pair from data1 to be matched against data2 and the respective values displayed.
One way forward would be to reshape your inputs to make the join easier. I would do that with comprehensions based around a new "key" that would be the combination of "id" and "net".
You can uncomment the print statements to get a better look at what the reshaping is doing if you like.
data1 = {
    "node1": ["id1", ["net1", "net2"]],
    "node2": ["id2", ["net3", "net4"]],
    "node3": ["id3", ["net5", "net1"]],
    "node4": ["id4", ["net2", "net5"]],
}
data2 = {
    "detail1": ["net1", "id1", "netone", "available"],
    "detail2": ["net2", "id2", "nettwo", "available"],
    "detail3": ["net1", "id3", "netthree", "not available"],
    "detail4": ["net4", "id4", "netfour", "not available"],
    "detail5": ["net5", "id4", "netfive", "available"],
    "detail6": ["net2", "id2", "netsix", "available"],
}
data1_reshaped = [
    f"{cell}:{row[0]}"
    for row in data1.values() for cell in row[1]
]
#print(data1_reshaped)
data2_reshaped = {
    f"{x[0]}:{x[1]}": {"name": x[2], "status": x[3]}
    for x in data2.values()
}
#print(data2_reshaped)
netname = []
netstatus = []
## now it is a simple lookup based on an array of keys
for key in data1_reshaped:
    match = data2_reshaped.get(key)
    if not match:
        continue
    netname.append(match["name"])
    netstatus.append(match["status"])
print(netname)
print(netstatus)
This should give you:
['netone', 'netthree', 'netfive']
['available', 'not available', 'available']
If in the end you want something more like:
{
'netone': 'available',
'netthree': 'not available',
'netfive': 'available'
}
then the join step is even simpler.
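One way that simpler join might look, as a sketch that repeats the reshaping so it runs on its own; status_by_name is a name chosen here for illustration.

```python
data1 = {
    "node1": ["id1", ["net1", "net2"]],
    "node2": ["id2", ["net3", "net4"]],
    "node3": ["id3", ["net5", "net1"]],
    "node4": ["id4", ["net2", "net5"]],
}
data2 = {
    "detail1": ["net1", "id1", "netone", "available"],
    "detail2": ["net2", "id2", "nettwo", "available"],
    "detail3": ["net1", "id3", "netthree", "not available"],
    "detail4": ["net4", "id4", "netfour", "not available"],
    "detail5": ["net5", "id4", "netfive", "available"],
    "detail6": ["net2", "id2", "netsix", "available"],
}

# "net:id" keys for every (network, id) pair listed in data1.
data1_reshaped = [f"{net}:{row[0]}" for row in data1.values() for net in row[1]]
# Same key format for data2, mapped to the fields we want to fetch.
data2_reshaped = {
    f"{x[0]}:{x[1]}": {"name": x[2], "status": x[3]} for x in data2.values()
}

# Join: keep only the keys present on both sides, mapping name -> status.
status_by_name = {
    data2_reshaped[key]["name"]: data2_reshaped[key]["status"]
    for key in data1_reshaped
    if key in data2_reshaped
}
print(status_by_name)
```

Because data2_reshaped is a dictionary, two details with the same net and id (such as detail2 and detail6) collapse to the later one; that does not affect this result, since "net2:id2" is not a pair in data1.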
I have a JSON object which I first convert into a dictionary. In this nested dictionary, I have two sections with respective values:
SKU
Page_Limit
Now I want to programmatically add a string separator like "--" after every n-th SKU id. The n depends on the "Page_Limit". Here is the data:
import json
postitemsonpage = {
    "SKU": [
        {
            "text": "socks"
        }
    ],
    "Page_Limit": [
        {
            "index": 0
        },
        {
            "index": 2
        }
    ]
}
From here, I am not sure how to bring in the last piece of adding the "Page_limit" element into the output where the "--" appears after every n-th value (based on "index").
Using enumerate with start=1 and a set comprehension, you can get the output you want:
def solution(postitemsonpage):
    content = json.loads(json.dumps(postitemsonpage))
    limits = {x["index"] for x in content['Page_Limit']}
    for count, i in enumerate(content['SKU'], 1):
        sku = int(i['id'][1])
        text = i['text']
        print(sku, text)
        if count in limits:
            print("--")
Output:
0 socks
1 shoes
--
2 tshirt
3 ring
--
4 bra
5 leggins
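Since the data shown in the question is incomplete, here is a self-contained sketch of the same idea on made-up data. The "id" values ("s0", "s1", ...) and the limits 2 and 4 are assumptions chosen to reproduce the output above, not the asker's real data, and with_separators is a hypothetical helper name.

```python
def with_separators(content):
    """Return the printed lines, adding '--' after each position listed in Page_Limit."""
    limits = {entry["index"] for entry in content["Page_Limit"]}
    lines = []
    # Counting from 1 so that index 2 means "after the second SKU".
    for count, item in enumerate(content["SKU"], 1):
        sku = int(item["id"][1:])  # assumed id format: a letter followed by digits
        lines.append(f"{sku} {item['text']}")
        if count in limits:
            lines.append("--")
    return lines

# Hypothetical data for illustration only.
sample = {
    "SKU": [
        {"id": "s0", "text": "socks"},
        {"id": "s1", "text": "shoes"},
        {"id": "s2", "text": "tshirt"},
        {"id": "s3", "text": "ring"},
        {"id": "s4", "text": "bra"},
        {"id": "s5", "text": "leggins"},
    ],
    "Page_Limit": [{"index": 2}, {"index": 4}],
}
lines = with_separators(sample)
print("\n".join(lines))
```

Collecting the limits in a set keeps the membership test constant-time, and an index of 0 simply never matches because counting starts at 1.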
This is a straightforward question: how can I use Python to process a log file (consider it a JSON string for now)? Below is the JSON data:
{
"voltas": {
"ac": [
{
"timestamp":1590761564,
"is_connected":true,
"reconnection_status":"N/A"
},
{
"timestamp":1590761566,
"is_connected":true,
"reconnection_status":"N/A"
},
{
"timestamp":1590761568,
"is_connected":false,
"reconnection_status":"true"
},
{
"timestamp":1590761570,
"is_connected":true,
"reconnection_status":"N/A"
},
{
"timestamp":1590761572,
"is_connected":true,
"reconnection_status":"N/A"
},
{
"timestamp":1590761574,
"is_connected":false,
"reconnection_status":"false"
},
{
"timestamp":1590761576,
"is_connected":false,
"reconnection_status":"true"
}
]
}
}
Since the question is just regarding how to process the json data, I am skipping the discussion about the data in json. Now, what I need is the analysed data as below.
{
"voltas" : [
"ac": {
"number_of_actual_connection_drops": 3,
"time_interval_between_droppings": [4, 8],
"number_of_successful_reconnections": 2,
"number_of_failure_reconnections": 1
}
]
}
This is how the data is analysed:
"number_of_actual_connection_drops": the number of items with "is_connected" == false.
"time_interval_between_droppings": a list that is populated from the end (each new value is inserted at the beginning). Start from the last item that has "is_connected": false and "reconnection_status": "true"; here that is the 7th item, with timestamp 1590761576. Then find the previous item with "is_connected": false and "reconnection_status": "true"; here that is the 3rd item, with timestamp 1590761568. The difference between these timestamps, 8, becomes the last element of the list, which is now [8].
The current timestamp is now 1590761568, and there is no earlier item with "is_connected": false and "reconnection_status": "true", so we take the first item's timestamp, 1590761564, which gives a difference of 4. The list is now [4, 8].
"number_of_successful_reconnections": the number of items with "reconnection_status" = "true".
"number_of_failure_reconnections": the number of items with "reconnection_status" = "false".
We can achieve this with for loops and some if conditions. I am interested in doing this using functional programming ways (reduce, map, filter) in python.
For simplification I have mentioned only "ac". There will be many items similar to this. Thanks.
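A possible sketch of the rules above using filter, map and zip rather than explicit loops. It assumes each device's list is non-empty and sorted by timestamp, and it returns nested dictionaries (brand, then device) because the desired output shown above is not valid JSON as written. The names analyse and analyse_all are made up for this example.

```python
def analyse(entries):
    """Summarise one device's log entries according to the rules described above."""
    drops = list(filter(lambda e: not e["is_connected"], entries))
    successes = list(filter(lambda e: e["reconnection_status"] == "true", entries))
    failures = list(filter(lambda e: e["reconnection_status"] == "false", entries))
    # Interval boundaries: the first timestamp, then every drop that reconnected.
    reconnected_drops = filter(lambda e: e["reconnection_status"] == "true", drops)
    marks = [entries[0]["timestamp"]] + list(map(lambda e: e["timestamp"], reconnected_drops))
    # Differences between consecutive boundaries, oldest first.
    intervals = list(map(lambda pair: pair[1] - pair[0], zip(marks, marks[1:])))
    return {
        "number_of_actual_connection_drops": len(drops),
        "time_interval_between_droppings": intervals,
        "number_of_successful_reconnections": len(successes),
        "number_of_failure_reconnections": len(failures),
    }

def analyse_all(log):
    """Apply analyse to every device of every brand."""
    return {
        brand: {device: analyse(entries) for device, entries in devices.items()}
        for brand, devices in log.items()
    }

# The sample data from the question, written compactly.
rows = [
    (1590761564, True, "N/A"), (1590761566, True, "N/A"), (1590761568, False, "true"),
    (1590761570, True, "N/A"), (1590761572, True, "N/A"), (1590761574, False, "false"),
    (1590761576, False, "true"),
]
log = {"voltas": {"ac": [
    {"timestamp": t, "is_connected": c, "reconnection_status": s} for t, c, s in rows
]}}
summary = analyse_all(log)
print(summary)
```

Building the boundaries front to back and differencing adjacent pairs gives the same list as walking backwards from the last item and inserting at the beginning.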
I am trying to grab this data and print it into a string of text, and I am having a really hard time getting it to work.
Here is the source I am working with, for a better understanding. I am working on an environmental controller combined with my Sonoff switch:
https://github.com/FirstCypress/LiV/blob/master/software/liv/iotConnectors/sonoff/sonoff.py (this code will work for two pages once completed, so ignore the keys for temperature etc.)
m = json.loads(content)
co2 = m["Value"]
I need the value of "Value" under "TaskValues". It should be either 1 or 0 in almost every case. How would I pull that key in the right form?
"Sensors":[
{
"TaskValues": [
{"ValueNumber":1,
"Name":"Switch",
"NrDecimals":0,
"Value":0
}],
"DataAcquisition": [
{"Controller":1,
"IDX":0,
"Enabled":"false"
},
{"Controller":2,
"IDX":0,
"Enabled":"false"
},
{"Controller":3,
"IDX":0,
"Enabled":"false"
}],
"TaskInterval":0,
"Type":"Switch input - Switch",
"TaskName":"relias",
"TaskEnabled":"true",
"TaskNumber":1
}
],
"TTL":60000
}
You can get it by
m['Sensors'][0]['TaskValues'][0]['Value']
"Value" is nested in your json, as you've mentioned. To get what you want, you'll need to traverse the parent data structures:
m = json.loads(content)
# This is a list
a = m.get('Sensors')
# This is a dictionary
sensor = a[0]
# This is a list
taskvalue = sensor.get('TaskValues')
# Your answer
value = taskvalue[0].get('Value')
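If the payload might be missing a level (for example an empty "Sensors" list), a defensive variant can be sketched as follows. The helper name switch_value and the trimmed sample payload are assumptions for illustration.

```python
import json

def switch_value(content):
    """Return Sensors[0].TaskValues[0].Value, or None if any level is missing."""
    m = json.loads(content)
    try:
        return m["Sensors"][0]["TaskValues"][0]["Value"]
    except (KeyError, IndexError, TypeError):
        # KeyError: missing key; IndexError: empty list; TypeError: unexpected type.
        return None

# Trimmed-down sample payload with the same nesting as the question's JSON.
content = json.dumps({
    "Sensors": [
        {"TaskValues": [{"ValueNumber": 1, "Name": "Switch", "NrDecimals": 0, "Value": 0}]}
    ],
    "TTL": 60000,
})
co2 = switch_value(content)
print(co2)
```

Compare the result with `is None` rather than testing truthiness, since a valid switch value of 0 is falsy.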