Intersection of two Python lists based on condition - python

I want to compute the intersection of these two Python lists:
list_1_begin = ["i", "love", "to", "eat", "fresh", "apples", "yeah", "eat", "fresh"]
list_2_find = ["eat", "fresh"]
And my expected result should look like this:
expected_result = ["0", "0", "0", "1", "1", "0", "0", "1", "1"]
This can be done with two for loops, but what if the first list has 10000 elements and the second has 100, and the phrase can repeat multiple times? Is there a more Pythonic way?
Important: the elements must match as a contiguous phrase. For example:
list_1_begin = ["i", "love", "to", "eat", "the", "fresh", "apples", "yeah", "eat", "fresh"]
list_2_find = ["eat", "fresh"]
The solution should look like this:
expected_result = ["0", "0", "0", "0", "0", "0", "0", "0", "1", "1"]
So elements are marked only where all elements of list_2_find appear in list_1_begin consecutively, in the exact same order.

To keep it Pythonic and efficient, convert list_2_find to a set and use a list comprehension:
list_1_begin = ["i", "love", "to", "eat", "fresh", "apples", "yeah", "eat", "fresh"]
list_2_find = ["eat", "fresh"]
set_2_find = set(list_2_find)
result = [str(int(e in set_2_find)) for e in list_1_begin]
print(result)
Output
['0', '0', '0', '1', '1', '0', '0', '1', '1']
Alternatively, an f-string with the d format specifier formats the bool as an int:
result = [f"{(e in set_2_find):d}" for e in list_1_begin]
Output
['0', '0', '0', '1', '1', '0', '0', '1', '1']
More information on f-string formatting can be found in the Python documentation on formatted string literals.
UPDATE
If the matches must be sequential, use:
from itertools import chain
list_1_begin = ["i", "love", "to", "eat", "the", "fresh", "apples", "yeah", "eat", "fresh"]
list_2_find = ["eat", "fresh"]
len_1 = len(list_1_begin)
len_2 = len(list_2_find)
pos = chain.from_iterable([range(e, e + len_2) for e in range(len_1) if list_1_begin[e:e + len_2] == list_2_find])
positions_set = set(pos)
result = [f"{(i in positions_set):d}" for i in range(len_1)]
print(result)
Output
['0', '0', '0', '0', '0', '0', '0', '0', '1', '1']
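The sequential match above can also be wrapped in a small function. This is only a sketch: the name mark_phrase and its parameters are made up for illustration, and it assumes both inputs are plain lists, since it compares slices directly.

```python
def mark_phrase(tokens, phrase):
    """Mark with "1" every token that belongs to a contiguous,
    in-order occurrence of phrase; all other tokens get "0".
    (mark_phrase is a hypothetical helper, not part of any library.)"""
    n, m = len(tokens), len(phrase)
    flags = ["0"] * n
    if m == 0:
        return flags  # an empty phrase matches nothing
    for start in range(n - m + 1):
        # Compare a window of len(phrase) tokens against the phrase.
        if tokens[start:start + m] == phrase:
            flags[start:start + m] = ["1"] * m
    return flags


list_1_begin = ["i", "love", "to", "eat", "the", "fresh", "apples", "yeah", "eat", "fresh"]
list_2_find = ["eat", "fresh"]
print(mark_phrase(list_1_begin, list_2_find))
```

Like the chain-based version, this does O(len(tokens) * len(phrase)) work in the worst case, which should be fine for 10000 tokens and a 100-element phrase.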

Related

How to catch a JSON request when it's empty and stop it from crashing the code execution in Python? [duplicate]

I have been trying to seed a Django DB with some COVID data from an API and get a KeyError for a particular data type. In the source it is a floating_timestamp ("lab_report_date" : "2014-10-13T00:00:00.000"). (Edit: not sure if the type is relevant, but trying to be comprehensive here.)
I tried a simpler API request in Python but get the same KeyError. Below is my code and the error message.
import requests
response = requests.get("https://data.cityofchicago.org/resource/naz8-j4nc.json")
print(response.json())
The output looks like this:
[
    {
        "cases_age_0_17": "1",
        "cases_age_18_29": "1",
        "cases_age_30_39": "0",
        "cases_age_40_49": "1",
        "cases_age_50_59": "0",
        "cases_age_60_69": "0",
        "cases_age_70_79": "1",
        "cases_age_80_": "0",
        "cases_age_unknown": "0",
        "cases_asian_non_latinx": "1",
        "cases_black_non_latinx": "0",
        "cases_female": "1",
        "cases_latinx": "1",
        "cases_male": "3",
        "cases_other_non_latinx": "0",
        "cases_total": "4",
        "cases_unknown_gender": "0",
        "cases_unknown_race_eth": "1",
        "cases_white_non_latinx": "1",
        "deaths_0_17_yrs": "0",
        "deaths_18_29_yrs": "0",
        "deaths_30_39_yrs": "0",
        "deaths_40_49_yrs": "0",
        show more (open the raw output data in a text editor) ...
        "hospitalizations_unknown_gender": "3",
        "hospitalizations_unknown_race_ethnicity": "16",
        "hospitalizations_white_non_latinx": "135"
    }
]
So far so good, but if I try to extract the problem key, I get the KeyError:
report_date = []
for i in response.json():
    ls = i['lab_report_date']
    report_date.append(ls)
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/var/folders/h3/5wlbmz0s3jb978hyhtvf9f4h0000gn/T/ipykernel_2163/2095152945.py in <module>
1 report_date = []
2 for i in response.json():
----> 3 ls = i['lab_report_date']
4 report_date.append(ls)
KeyError: 'lab_report_date'
This issue occurs with or without using a for loop. I've gotten myself real turned around, so apologies if there are any errors or omissions in my code.
This happens because at least one item in the array returned by response.json() does not contain the key lab_report_date, which is common when the backend data is not clean.
Use a try-except block to handle the exception. The following code runs without crashing.
import requests
response = requests.get("https://data.cityofchicago.org/resource/naz8-j4nc.json")
records = response.json()  # parse the body once and reuse it
print("The total length of response is %s" % len(records))
report_date = []
for i in records:
    try:
        report_date.append(i['lab_report_date'])
    except KeyError:
        # Catch only the missing-key case so unrelated errors are not hidden.
        print("There is an item in the response containing no key lab_report_date:")
        print(i)
print("The length of report_date is %s" % len(report_date))
The output of the above code is as follows.
The total length of response is 592
There is an item in the response containing no key lab_report_date:
{'cases_total': '504', 'deaths_total': '1', 'hospitalizations_total': '654', 'cases_age_0_17': '28', 'cases_age_18_29': '116', 'cases_age_30_39': '105', 'cases_age_40_49': '83', 'cases_age_50_59': '72', 'cases_age_60_69': '61', 'cases_age_70_79': '25', 'cases_age_80_': '14', 'cases_age_unknown': '0', 'cases_female': '264', 'cases_male': '233', 'cases_unknown_gender': '7', 'cases_latinx': '122', 'cases_asian_non_latinx': '15', 'cases_black_non_latinx': '116', 'cases_white_non_latinx': '122', 'cases_other_non_latinx': '30', 'cases_unknown_race_eth': '99', 'deaths_0_17_yrs': '0', 'deaths_18_29_yrs': '0', 'deaths_30_39_yrs': '0', 'deaths_40_49_yrs': '1', 'deaths_50_59_yrs': '0', 'deaths_60_69_yrs': '0', 'deaths_70_79_yrs': '0', 'deaths_80_yrs': '0', 'deaths_unknown_age': '0', 'deaths_female': '0', 'deaths_male': '1', 'deaths_unknown_gender': '0', 'deaths_latinx': '0', 'deaths_asian_non_latinx': '0', 'deaths_black_non_latinx': '0', 'deaths_white_non_latinx': '1', 'deaths_other_non_latinx': '0', 'deaths_unknown_race_eth': '0', 'hospitalizations_age_0_17': '30', 'hospitalizations_age_18_29': '78', 'hospitalizations_age_30_39': '74', 'hospitalizations_age_40_49': '96', 'hospitalizations_age_50_59': '105', 'hospitalizations_age_60_69': '111', 'hospitalizations_age_70_79': '89', 'hospitalizations_age_80_': '71', 'hospitalizations_age_unknown': '0', 'hospitalizations_female': '310', 'hospitalizations_male': '341', 'hospitalizations_unknown_gender': '3', 'hospitalizations_latinx': '216', 'hospitalizations_asian_non_latinx': '48', 'hospitalizations_black_non_latinx': '208', 'hospitalizations_white_non_latinx': '135', 'hospitalizations_other_race_non_latinx': '31', 'hospitalizations_unknown_race_ethnicity': '16'}
The length of report_date is 591
You can use the dict get method to read the data from the JSON response, like this:
report_date = []
for i in response.json():
    if isinstance(i, dict):  # Skip non-dict items to avoid a runtime error.
        ls = i.get('lab_report_date')  # Returns None when the key is missing.
        if ls:
            report_date.append(ls)
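If you also want to know which records were skipped, the same idea can be written as a small helper. This is a sketch only: collect_field is a made-up name, and it works on already-parsed data (the result of response.json()), so the sample records below are invented stand-ins rather than real API output.

```python
def collect_field(records, key):
    """Return (values, missing): the values found under key, and the
    records that do not have that key. Hypothetical helper for illustration."""
    values, missing = [], []
    for record in records:
        if isinstance(record, dict) and key in record:
            values.append(record[key])
        else:
            missing.append(record)
    return values, missing


# Invented sample data standing in for response.json()
sample = [
    {"lab_report_date": "2014-10-13T00:00:00.000", "cases_total": "4"},
    {"cases_total": "504"},  # no lab_report_date key
]
dates, skipped = collect_field(sample, "lab_report_date")
print(dates)
print(len(skipped))
```

Unlike the "if ls:" check, testing "key in record" keeps values that are present but falsy, such as an empty string.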
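Hi, I have a similar issue: sometimes the response from the API request comes back empty, which stops the code execution.
Note that comparing the lookup against KeyError does not help, because the exception is raised before the comparison runs. A simple fix is to chain dict.get calls with defaults.
Let's say you have:
requestfromapi = requests.get("https://api-server")
payload = requestfromapi.json()
# .get returns the default instead of raising KeyError when a key is missing
something = payload.get('data', {}).get('something')
if something is not None:
    print(something)
# This makes sure that a missing key will not stop your code from executing.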

cleaning dict keys in nested dict of dicts & lists of dicts

I have a nested dict that also contains lists of dicts, and some of my keys have special characters. What is the best way to remove those special characters from the keys?
My attempt below works on dicts of dicts, but how can I extend it to handle lists of dicts as well?
>>> a={"#pipeline": "start", "#args": "-vv", "#start": "1598331637", "#info": {"#pipeline_stage": "tasks","#taskbegin": [{"#task": "1", "#time": "1598331638"}, {"#task": "2", "#time": "1598331638"}, {"#task": "3", "#time": "1598331638"}]}}
>>> a
{'#pipeline': 'start', '#args': '-vv', '#start': '1598331637', '#info': {'#pipeline_stage': 'tasks', '#taskbegin': [{'#task': '1', '#time': '1598331638'}, {'#task': '2', '#time': '1598331638'}, {'#task': '3', '#time': '1598331638'}]}}
>>> def _clean_keys(d):
...     return {''.join(filter(str.isalnum, k)): _clean_keys(v) for k, v in d.items()} if isinstance(d, dict) else d
...
>>> _clean_keys(a)
{'pipeline': 'start', 'args': '-vv', 'start': '1598331637', 'info': {'pipelinestage': 'tasks', 'taskbegin': [{'#task': '1', '#time': '1598331638'}, {'#task': '2', '#time': '1598331638'}, {'#task': '3', '#time': '1598331638'}]}}
>>>
As you can see, the taskbegin list is not cleaned.
Using recursion
Ex:
a={"#pipeline": "start", "#args": "-vv", "#start": "1598331637", "#info": {"#pipeline_stage": "tasks","#taskbegin": [{"#task": "1", "#time": "1598331638"}, {"#task": "2", "#time": "1598331638"}, {"#task": "3", "#time": "1598331638"}]}}
def _clean_keys(d):
    res = {}
    if isinstance(d, dict):
        for k, v in d.items():
            k = ''.join(filter(str.isalnum, k))
            if isinstance(v, list):  # Check if type of value is list
                res[k] = [_clean_keys(i) for i in v]  # use recursion
            else:
                res[k] = _clean_keys(v)
    else:
        res = d
    return res
print(_clean_keys(a))
Output:
{'args': '-vv',
'info': {'pipelinestage': 'tasks',
'taskbegin': [{'task': '1', 'time': '1598331638'},
{'task': '2', 'time': '1598331638'},
{'task': '3', 'time': '1598331638'}]},
'pipeline': 'start',
'start': '1598331637'}
Try this; it works fine.
Code
def clean_dict(val):
    if isinstance(val, list):
        return clean_list(val)
    if isinstance(val, dict):
        return {clean(k): clean_dict(v) for k, v in val.items()}
    return val
def clean_list(val):
    return [clean_dict(v) for v in val]
def clean(val):
    # Keep only alphanumeric characters; the return was missing originally.
    return ''.join([c for c in val if c.isalnum()])
Output
a={"#pipeline": "start", "#args": "-vv", "#start": "1598331637", "#info": {"#pipeline_stage": "tasks","#taskbegin": [{"#task": "1", "#time": "1598331638"}, {"#task": "2", "#time": "1598331638"}, {"#task": "3", "#time": "1598331638"}]}}
clean_dict(a)
Out[8]:
{'pipeline': 'start',
'args': '-vv',
'start': '1598331637',
'info': {'pipelinestage': 'tasks',
'taskbegin': [{'task': '1', 'time': '1598331638'},
{'task': '2', 'time': '1598331638'},
{'task': '3', 'time': '1598331638'}]}}
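Both answers can be folded into one recursive function that treats dicts and lists symmetrically, so lists nested inside lists are handled too. A minimal sketch, assuming keys should keep only alphanumeric characters as in the question; the name clean_keys is just for illustration.

```python
def clean_keys(obj):
    """Recursively strip non-alphanumeric characters from dict keys,
    descending into both dicts and lists (hypothetical helper)."""
    if isinstance(obj, dict):
        return {
            "".join(ch for ch in str(key) if ch.isalnum()): clean_keys(value)
            for key, value in obj.items()
        }
    if isinstance(obj, list):
        return [clean_keys(item) for item in obj]
    return obj  # strings, numbers, None, etc. are returned unchanged


a = {"#pipeline": "start", "#info": {"#pipeline_stage": "tasks",
     "#taskbegin": [{"#task": "1", "#time": "1598331638"}]}}
print(clean_keys(a))
```

One thing to watch: two different keys can collapse to the same cleaned key (for example "#a" and "a"), in which case the later one silently overwrites the earlier one.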

How to combine a list and dictionary in Python?

1) How can we combine a dict with a list and return the result as JSON?
I have tried to combine list_1 (a list) and list_2 (a dict), but I get an error. After converting them to strings I can combine them, but I cannot decode the result back to JSON (see the expected result below).
2) Also, how can a value within JSON be replaced while keeping it valid JSON?
list_1 = [{'title': 'NEWBOOK', 'downloads': '4', 'views': '88'}]
list_2 = {'title': 'MASTERMIND', 'downloads': '16', 'views': '156'}
list_3 = {
'a': 'b',
'c': 'd',
'e': [{
'f': 'g',
'l': 'm'
}]
}
Script which I have tried as below.
combine = list_1 + list_2
for z in list_3['e']:
    list_3 = list_3.replace(z, combine)
Expected_json = json.dumps(list_3)
print(list_3)
Error1:
combine = list_1 + list_2
TypeError: can only concatenate list (not "dict") to list
Error2:
list_3 = list_3.replace(z, combine)
AttributeError: 'dict' object has no attribute 'replace'
Expected result:
list_3 = {
"a": "b",
"c": "d",
"e": [{
"f": "g",
"l": "m"
},
{
"title": "NEWBOOK",
"downloads": "4",
"views": "88"
},
{
"title": "MASTERMIND",
"downloads": "16",
"views": "156"
}
]
}
Simply append to the list in the dictionary
list_3['e'].append(list_2)
list_3['e'].append(list_1[0])
print(list_3)
{
'a':
'b',
'c':
'd',
'e': [{
'f': 'g',
'l': 'm'
}, {
'title': 'MASTERMIND',
'downloads': '16',
'views': '156'
}, {
'title': 'NEWBOOK',
'downloads': '4',
'views': '88'
}]
}
import json
list_1 = [{'title': 'NEWBOOK', 'downloads': '4', 'views': '88'}]
list_2 = {'title': 'MASTERMIND', 'downloads': '16', 'views': '156'}
list_3 = {
'a': 'b',
'c': 'd',
'e': [{
'f': 'g',
'l': 'm'
}]
}
list_3['e'].append(list_1[0])
list_3['e'].append(list_2)
json_list = json.dumps(list_3)
If you want to add more dicts at that location, do the following:
b = json.loads(json_list)
b['e'].append(your_new_dict)
json_list = json.dumps(b)
If you do not know in advance what list_1 and list_2 are, you can test their types and append them accordingly, like this:
if isinstance(list_1, list):
    list_3['e'].append(list_1[0])
if isinstance(list_2, dict):
    list_3['e'].append(list_2)
If you do not know under which key in list_3 the list lives, you can do something like the following, assuming there is only one list in list_3:
for x in list_3.values():
    if isinstance(x, list):
        x.append(list_1[0])
        x.append(list_2)
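The type checks above can be collected into one helper that accepts either a dict or a list of dicts. This is a sketch under assumptions: append_records is a made-up name, and it assumes the target key already holds a list. Note that it modifies the container in place before serialising it.

```python
import json


def append_records(container, key, *sources):
    """Append each source to container[key] and return the JSON text.
    A dict is appended as one item; a list contributes all its items.
    (append_records is a hypothetical helper for illustration.)"""
    for source in sources:
        if isinstance(source, dict):
            container[key].append(source)
        elif isinstance(source, list):
            container[key].extend(source)
        else:
            raise TypeError("expected a dict or a list of dicts")
    return json.dumps(container)


list_1 = [{'title': 'NEWBOOK', 'downloads': '4', 'views': '88'}]
list_2 = {'title': 'MASTERMIND', 'downloads': '16', 'views': '156'}
list_3 = {'a': 'b', 'c': 'd', 'e': [{'f': 'g', 'l': 'm'}]}

json_text = append_records(list_3, 'e', list_1, list_2)
print(json_text)
```

Using extend for lists means every dict in list_1 is added, not only list_1[0].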

Replace group of value in an array

I have an array, let's say this:
arr = ["60", "DD", "81", "01", "01", "29", "B8", "1B", "00", "30",
"2C", "46", "00", "0A", "81", "02", "0D", "25", "00", "37", "41",
"31", "00", "C2", "7F", "06", "00", "17", "94", "1A", "00", "48",
"06", "05", "00", "5C", "7F", "3E", "87", "FF", "0F", "B8", "0A",
"38", "0C"]
I am trying to replace every occurrence of "81", "01" with "81" and every occurrence of "81", "02" with "82". I tried, but it is not replacing the values correctly. Here is my code.
import numpy as np
values = np.array(arr)
searchval = ["81", "01"]
N = len(searchval)
possibles = np.where(values == searchval[0])[0]
solns = []
for p in possibles:
    check = values[p:p+N]
    if np.all(check == searchval):
        arr.pop(p+1)
        solns.append(p)
print(solns)
It would be great if someone can help me solving this. Thank you.
Given that your entries are two-character strings, you could join the list into a single string, do the replacements with str.replace, then split to get the transformed list back:
s = ' '.join(arr)
s = s.replace('81 01', '81')
s = s.replace('81 02', '82')
print(s.split())
# ['60', 'DD', '81', '01', '29', 'B8', '1B', '00', '30', '2C', '46', '00', '0A', '82', '0D', '25', '00', '37', '41', '31', '00', 'C2', '7F', '06', '00', '17', '94', '1A', '00', '48', '06', '05', '00', '5C', '7F', '3E', '87', 'FF', '0F', 'B8', '0A', '38', '0C']
Not very efficient, but quite concise and readable.
For an approach that works just with the list, and would generalize to other kinds of values:
solns = [arr[0]]
for i, entry in enumerate(arr[:-1]):
    if entry == '81':
        if arr[i+1] == '02':
            solns[-1] = '82'
        elif not arr[i+1] == '01':
            solns.append(arr[i+1])
    else:
        solns.append(arr[i+1])
Or, if you prefer:
solns = [arr[0]]
for i, entry in enumerate(arr[:-1]):
    if entry == '81':
        if arr[i+1] == '02':
            solns[-1] = '82'
            continue
        elif arr[i+1] == '01':
            continue
    solns.append(arr[i+1])
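If more pairs need replacing later, the rules can be moved into a dict so the loop itself stays the same. A sketch only: replace_pairs and the mapping layout are made up for illustration, and it scans left to right without re-examining values it has already emitted.

```python
def replace_pairs(items, mapping):
    """Replace each adjacent pair found in mapping with its single
    replacement value, scanning left to right (hypothetical helper)."""
    result = []
    i = 0
    while i < len(items):
        pair = tuple(items[i:i + 2])
        if pair in mapping:
            result.append(mapping[pair])
            i += 2  # both elements of the pair are consumed
        else:
            result.append(items[i])
            i += 1
    return result


rules = {("81", "01"): "81", ("81", "02"): "82"}
sample = ["60", "DD", "81", "01", "01", "29", "0A", "81", "02", "0D"]
print(replace_pairs(sample, rules))
```

Because a replaced pair is skipped in one step, the "81" written for "81", "01" is never combined with whatever follows it.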
