How to compare between two dictionaries using threads - python

I'm working on a script that needs to compare two dictionaries. The first request does a GET and scrapes the page into a dictionary; on the next request I want to scrape again with the same method and check whether anything on the webpage has changed. This is what I have so far:
import random
import threading
import time
from concurrent.futures import as_completed
from concurrent.futures.thread import ThreadPoolExecutor
import requests
from bs4 import BeautifulSoup
URLS = [
    'https://github.com/search?q=hello+world',
    'https://github.com/search?q=python+3',
    'https://github.com/search?q=world',
    'https://github.com/search?q=i+love+python',
    'https://github.com/search?q=sport+today',
    'https://github.com/search?q=how+to+code',
    'https://github.com/search?q=banana',
    'https://github.com/search?q=android+vs+iphone',
    'https://github.com/search?q=please+help+me',
    'https://github.com/search?q=batman',
]
def doRequest(url):
    response = requests.get(url)
    time.sleep(random.randint(10, 30))
    return response, url
def doScrape(response):
    soup = BeautifulSoup(response.text, 'html.parser')
    return {
        'title': soup.find("input", {"name": "q"})['value'],
        'repo_count': soup.find("span", {"data-search-type": "Repositories"}).text.strip()
    }
def checkDifference(parsed, url):
    pass  # not implemented yet: this is the part the question is about
def threadPoolLoop():
    with ThreadPoolExecutor(max_workers=1) as executor:
        future_tasks = [
            executor.submit(
                doRequest,
                url
            ) for url in URLS]
        for future in as_completed(future_tasks):
            response, url = future.result()
            if response.status_code == 200:
                checkDifference(doScrape(response), url)
while True:
    t = threading.Thread(target=threadPoolLoop, )
    t.start()
    print('Joining thread and waiting for it to finish...')
    t.join()
My problem is that I don't know how to print a message whenever title and/or repo_count has changed. (The whole point is that this script will run 24/7, and I want it to print every time there has been a change.)

If you're looking for a simple method to compare two dictionaries, there are a few different options.
Some good resources to begin:
mCoding: zipping together Python dicts
StackOverflow: Comparing two dictionaries and checking how many (key, value) pairs are equal
Let's start with two dictionaries to compare 👇 Some added elements, some removed, some changed, some same.
dict1 = {
    "value_2": 2,
    "value_3": 3,
    "value_4": 4,
    "value_5": "five",
    "value_6": "six",
}
dict2 = {
    "value_1": 1,
    "value_2": 3,
    "value_4": 4
}
You could probably use the unittest library. Like this:
>>> from unittest import TestCase
>>> TestCase().assertDictEqual(dict1, dict1) # <-- No output, because they are the same
>>> TestCase().assertDictEqual(dict1, dict2) # <-- Will raise error and display elements which are different
AssertionError: {'value_2': 2, 'value_3': 3, 'value_4': 4, 'value_5': 'five', 'value_6': 'six'} != {'value_1': 1, 'value_2': 3, 'value_4': 4}
- {'value_2': 2, 'value_3': 3, 'value_4': 4, 'value_5': 'five', 'value_6': 'six'}
+ {'value_1': 1, 'value_2': 3, 'value_4': 4}
But the challenge there is that it will raise an error when they are different; which is probably not what you're looking for. You simply want to see when they are different.
Another method is the deepdiff library. Like this:
>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> pprint(DeepDiff(dict1, dict2))
{'dictionary_item_added': [root['value_1']],
 'dictionary_item_removed': [root['value_3'], root['value_5'], root['value_6']],
 'values_changed': {"root['value_2']": {'new_value': 3, 'old_value': 2}}}
Or, you could easily craft your own functions. Like this 👇 (functions copied from here)
>>> from pprint import pprint
>>> def compare_dict(d1, d2):
...     return {k: d1[k] for k in d1 if k in d2 and d1[k] == d2[k]}
...
>>> pprint(compare_dict(dict1, dict2))
{'value_4': 4}
>>> def dict_compare(d1, d2):
...     d1_keys = set(d1.keys())
...     d2_keys = set(d2.keys())
...     shared_keys = d1_keys.intersection(d2_keys)
...     added = d2_keys - d1_keys  # keys that exist only in the new dict (d2)
...     removed = d1_keys - d2_keys  # keys that exist only in the old dict (d1)
...     modified = {o: {"old": d1[o], "new": d2[o]} for o in shared_keys if d1[o] != d2[o]}
...     same = set(o for o in shared_keys if d1[o] == d2[o])
...     return {"added": added, "removed": removed, "modified": modified, "same": same}
...
>>> pprint(dict_compare(dict1, dict2))
{'added': {'value_1'},
 'modified': {'value_2': {'new': 3, 'old': 2}},
 'removed': {'value_3', 'value_5', 'value_6'},
 'same': {'value_4'}}
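To tie this back to the polling loop in the question: one way to report changes is to remember the last scraped dictionary per URL and diff every new result against it. The following is only a minimal sketch under that assumption. ChangeTracker, diff_dicts and the sample values are hypothetical names and data introduced here, and the lock only matters if results are reported from more than one thread.

```python
import threading


def diff_dicts(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    changed = {}
    for key in old.keys() | new.keys():
        before, after = old.get(key), new.get(key)
        if before != after:
            changed[key] = (before, after)
    return changed


class ChangeTracker:
    """Hypothetical helper: remembers the last result per URL and reports changes."""

    def __init__(self):
        self._previous = {}
        self._lock = threading.Lock()

    def check(self, url, parsed):
        # Swap in the new snapshot under the lock, then diff outside it.
        with self._lock:
            old = self._previous.get(url)
            self._previous[url] = parsed
        if old is None:
            return {}  # first visit: nothing to compare against yet
        return diff_dicts(old, parsed)


tracker = ChangeTracker()
url = "https://github.com/search?q=batman"
# Sample values, made up for the demonstration.
first = tracker.check(url, {"title": "batman", "repo_count": "10K"})
second = tracker.check(url, {"title": "batman", "repo_count": "11K"})
for key, (before, after) in second.items():
    print(f"{url}: {key} changed from {before!r} to {after!r}")
```

In the question's code, checkDifference(parsed, url) could call tracker.check(url, parsed) on a tracker created once outside the while loop and print whenever the returned dictionary is not empty.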

Related

How edit python dict in example?

I have a dict:
my_dict = {'some.key' : 'value'}
and I want to change it into this:
result = {'some' : {'key' : 'value'}}
How can I do this?
I need this to create nested classes from dicts, for example:
my_dict = {'nested.key' : 'value'}
class Nested:
    key : str
class MyDict:
    nested : Nested
If you need this for real use, and not as a coding exercise, you can install extradict and use extradict.NestedData:
In [1]: from extradict import NestedData
In [2]: a = NestedData({'some.key' : 'value'})
In [3]: a["some"]
Out[3]: {'key': <str>}
In [4]: a["some"]["key"]
Out[4]: 'value'
In [5]: a.data
Out[5]: {'some': {'key': 'value'}}
(disclaimer: I am the package author)
Not quite sure if I understand your question, but would
result = {key.split('.')[0]: {key.split('.')[1]: value} for key, value in my_dict.items()}
do the trick?
I hope this function will help you
def foo(obj):
    result = {}
    for k, v in obj.items():
        keys = k.split('.')
        caret = result
        for i in range(len(keys)):
            curr_key = keys[i]
            if i == len(keys) - 1:
                caret[curr_key] = v
            else:
                caret.setdefault(curr_key, {})
                caret = caret[curr_key]
    return result
With recursion it could look like this (it is essential that all keys are unique):
my_dict = {'key0' : 'value0',
           'nested.key' : 'value',
           'nested1.nested1.key1' : 'value1',
           'nested2.nested2.nested2.key2' : 'value2'}
def func(k,v):
    if not '.' in k: return {k:v}
    k1,k = k.split('.',1)
    return {k1:func(k,v)}
res = {}
for k,v in my_dict.items():
    res.update(func(k,v))
>>> res
{'key0': 'value0',
 'nested': {'key': 'value'},
 'nested1': {'nested1': {'key1': 'value1'}},
 'nested2': {'nested2': {'nested2': {'key2': 'value2'}}}}
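The uniqueness caveat matters because dict.update replaces a whole top-level branch: if two dotted keys share a prefix, the later one overwrites the earlier one. Here is a small sketch that shows the overwrite and a merging variant built on setdefault. The name nest and the sample data are made up for illustration, and the merging variant still assumes that no key is used both as a plain value and as a prefix.

```python
def func(k, v):
    """The recursive helper from the answer above, reformatted."""
    if "." not in k:
        return {k: v}
    head, rest = k.split(".", 1)
    return {head: func(rest, v)}


def nest(flat):
    """Expand dotted keys into nested dicts, merging shared prefixes."""
    result = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            # Reuse the existing branch instead of replacing it.
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


flat = {"nested.a": 1, "nested.b": 2, "top": 3}

overwritten = {}
for k, v in flat.items():
    overwritten.update(func(k, v))

print(overwritten)  # {'nested': {'b': 2}, 'top': 3}
print(nest(flat))   # {'nested': {'a': 1, 'b': 2}, 'top': 3}
```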

Apply json patch to a Mongoengine document

I'm trying to apply a json-patch to a Mongoengine Document.
I'm using this json-patch library: https://github.com/stefankoegl/python-json-patch and mongoengine 0.14.3 with Python 3.6.3.
This is my actual code:
import json
from jsonpatch import JsonPatch
json_patch = JsonPatch.from_string(jp_string)
document = Document.objects(id=document_id)
json_documents = json.loads(document.as_pymongo().to_json())
json_patched_document = json_patch.apply(json_documents[0])
Document.objects(id=document_id).first().delete()
Document.from_json(json.dumps(json_patched_document)).save(force_insert=True)
Is there a better way to save an edited json document?
I've improved the code a little:
json_patch = JsonPatch.from_string(jp_string)
document = Document.objects(id=document_id)
json_documents = json.loads(document.as_pymongo().to_json())
json_patched_document = json_patch.apply(json_documents[0])
Document.from_json(json.dumps(json_patched_document), created=True).save()
But is there a way to avoid converting the document to JSON?
I had a slightly similar problem: I didn't want to save the complete Document, I just wanted to update the fields that were modified or added.
Here is the code, which I tested on the inputs below:
import jsonpatch
def tryjsonpatch():
    doc_in_db = {'foo': 'bar', "name": "aj", 'numbers': [1, 3, 7, 8]}
    input = {'foo': 'bar', "name": "dj", 'numbers': [1, 3, 4, 8]}
    input2 = {'foo': 'bar', "name": "aj", 'numbers': [1, 3, 7, 8], "extera": "12"}
    input3 = {'foo': 'bar', "name": "dj", 'numbers': [1, 3, 4, 8], "extera": "12"}
    patch = jsonpatch.JsonPatch.from_diff(doc_in_db, input3)
    print("\n***patch***\n", patch)
    doc = get_minimal_doc(doc_in_db, patch)
    result = patch.apply(doc, in_place=True)
    print("\n###result###\n", result,
          "\n###present###\n", doc_in_db)
def get_minimal_doc(present, patch):
    cur_dc = {}
    for change in patch.patch:
        if change['op'] != "add":
            keys = change['path'].split("/")[1:]
            present_move = {}
            old_key = 1
            first = True
            for key in keys:
                if key.isdigit():  # old_key represented an array
                    cur_dc[old_key] = present_move
                else:
                    if first:
                        cur_dc[key] = {}
                        first = False
                    else:
                        cur_dc[old_key][key] = {}
                    old_key = key
                    present_move = present[old_key]
    return cur_dc
tryjsonpatch()
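Another option, if the goal is to avoid re-inserting the whole document, is to translate the patch into mongoengine's set__<field> / unset__<field> update keywords for only the top-level fields the patch touches. The sketch below is an untested idea rather than an established recipe: build_update_kwargs and top_level_field are hypothetical helpers, the operations are plain RFC 6902 dicts, and it assumes the JSON keys match the document's field names (so the _id key from as_pymongo() would need separate handling).

```python
def top_level_field(path):
    """Return the first reference token of a JSON Pointer (RFC 6901), unescaped."""
    if not path.startswith("/"):
        return ""  # "" addresses the whole document, not a single field
    token = path.split("/")[1]
    return token.replace("~1", "/").replace("~0", "~")


def build_update_kwargs(operations, patched):
    """Map patch operations to set__/unset__ keyword arguments (hypothetical helper)."""
    kwargs = {}
    for op in operations:
        # "move" and "copy" carry a second pointer in "from".
        for path in (op["path"], op.get("from")):
            if not path:
                continue
            field = top_level_field(path)
            if not field:
                continue
            if field in patched:
                kwargs["set__" + field] = patched[field]
            else:
                kwargs["unset__" + field] = 1
    return kwargs


# Sample operations and the document as it looks after applying them.
operations = [
    {"op": "replace", "path": "/name", "value": "dj"},
    {"op": "replace", "path": "/numbers/2", "value": 4},
    {"op": "add", "path": "/extera", "value": "12"},
    {"op": "remove", "path": "/foo"},
]
patched = {"name": "dj", "numbers": [1, 3, 4, 8], "extera": "12"}

update_kwargs = build_update_kwargs(operations, patched)
print(update_kwargs)
# Assumed usage, not run here:
# Document.objects(id=document_id).update_one(**update_kwargs)
```

Nested changes such as /numbers/2 are written back as the whole top-level field, which keeps the mapping simple at the cost of sending a little more data than strictly necessary.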

How to store Python dictionary with QSettings

The code below stores Python data dictionary using QSettings object.
After reading it back the dictionary comes with all its keys as QString like so:
{PyQt4.QtCore.QString(u'one'): 1, PyQt4.QtCore.QString(u'two'): 2}
I wonder if it would be possible to read the dictionary with a regular string keys like this:
{'one': 1, 'two': 2}
Code:
from PyQt4 import QtCore, QtGui
app = QtGui.QApplication([])
settings = QtCore.QSettings('apps', 'settings')
data = {'one': 1, 'two': 2}
settings.setValue('data', data)
data = settings.value('data').toPyObject()
print data
Python2
It is not possible directly; you have to convert the result to a regular dictionary.
d = {}
for k, v in data.items():
    d[str(k)] = v
Complete code:
from PyQt4 import QtCore, QtGui
app = QtGui.QApplication([])
settings = QtCore.QSettings('apps', 'settings')
data = {'one': 1, 'two': 2}
settings.setValue('data', data)
data = settings.value('data').toPyObject()
d = {}
for k, v in data.items():
    d[str(k)] = v
print(d)
output:
{'two': 2, 'one': 1}
Python3
Under Python 3 this problem does not arise: settings.value('data') returns a regular dictionary, so it is no longer necessary to convert it with toPyObject().
from PyQt4 import QtCore, QtGui
app = QtGui.QApplication([])
settings = QtCore.QSettings('apps', 'settings')
data = {'one': 1, 'two': 2}
settings.setValue('data', data)
data = settings.value('data')
print(data)
output:
{'one': 1, 'two': 2}
original = {PyQt4.QtCore.QString(u'one'): 1, PyQt4.QtCore.QString(u'two'): 2}
converted = {str(k): v for k, v in original.items()}
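The comprehension above only converts the top-level keys. If the stored dictionary contains nested dictionaries, a recursive variant along the following lines might help. This is a sketch only: plain_keys is a made-up name, and FakeQString is a hypothetical stand-in so the example runs without PyQt4. Under Python 2, unicode(k) would probably be the safer conversion for non-ASCII keys.

```python
def plain_keys(value):
    """Recursively convert mapping keys to str, leaving other values untouched."""
    if isinstance(value, dict):
        return {str(k): plain_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [plain_keys(item) for item in value]
    return value


class FakeQString:
    """Hypothetical stand-in for QString so the sketch runs without PyQt4."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


stored = {FakeQString("one"): 1, FakeQString("nested"): {FakeQString("two"): 2}}
restored = plain_keys(stored)
print(restored)  # {'one': 1, 'nested': {'two': 2}}
```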

Python: Append Multiple Values for One Key in Nested Dictionary

I have the below list of tuples:
p = [("01","Master"),("02","Node"),("03","Node"),("04","Server")]
I want my output to look like:
y = {
"Master":{"number":["01"]},
"Node":{"number":["02", "03"]},
"Server":{"number":["04"]}
}
I have tried the below code:
y = {}
for line in p:
    if line[1] in y:
        y[line[1]] = {}
        y[line[1]]["number"].append(line[0])
    else:
        y[line[1]] = {}
        y[line[1]]["number"] = [line[0]]
And I get the below error:
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
KeyError: 'number'
How do I solve this?
from collections import defaultdict
d = defaultdict(lambda: defaultdict(list))
for v, k in p:
    d[k]["number"].append(v)
print(d)
defaultdict(<function <lambda> at 0x7f8005097578>, {'Node': defaultdict(<type 'list'>, {'number': ['02', '03']}), 'Master': defaultdict(<type 'list'>, {'number': ['01']}), 'Server': defaultdict(<type 'list'>, {'number': ['04']})})
without defaultdict:
d = {}
from pprint import pprint as pp
for v, k in p:
    d.setdefault(k,{"number":[]})
    d[k]["number"].append(v)
pp(d)
{'Master': {'number': ['01']},
 'Node': {'number': ['02', '03']},
 'Server': {'number': ['04']}}
It's because you don't initialize your dictionary when needed, and you reset it when not needed.
Try this:
p = [("01","Master"),("02","Node"),("03","Node"),("04","Server")]
y = {}
for (number, category) in p:
    if not y.get(category, False):
        # initializes your sub-dictionary
        y[category] = {"number": []}
    # adds the correct number to the sub-dictionary
    y[category]["number"].append(number)
Note that using a tuple unpacking for (number, category) in p allows your code to be more readable inside your loop.
You are resetting the dictionary!
for line in p:
    if line[1] in y:
        #y[line[1]] = {} -- RESET! ["number"] will now disappear.
        #.. which leads to error in the next line.
        y[line[1]]["number"].append(line[0])
    else:
        y[line[1]] = {}
        y[line[1]]["number"] = [line[0]]
A more pythonic way of achieving the same thing would be by using a defaultdict as demonstrated in other answers.
Do not assign {} to key when key is already present in y.
y = {}
for line in p:
    try:
        y[line[1]]["number"].append(line[0])
    except KeyError:
        y[line[1]] = {}
        y[line[1]]["number"] = [line[0]]
Or use a defaultdict:
>>> from collections import defaultdict
>>> p = [("01","Master"),("02","Node"),("03","Node"),("04","Server")]
>>> d = defaultdict(list)
>>> for k, v in p:
...     d[v].append(k)
...
>>> d
defaultdict(<type 'list'>, {'Node': ['02', '03'], 'Master': ['01'], 'Server': ['04']})
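Note that a flat defaultdict(list) groups the numbers but leaves out the inner "number" level that the question asks for. A minimal sketch of one way to add that layer afterwards, and to end up with plain dicts instead of defaultdicts:

```python
from collections import defaultdict

p = [("01", "Master"), ("02", "Node"), ("03", "Node"), ("04", "Server")]

# Group the numbers by category first.
grouped = defaultdict(list)
for number, category in p:
    grouped[category].append(number)

# Wrap each list in the {"number": [...]} layer from the desired output.
y = {category: {"number": numbers} for category, numbers in grouped.items()}
print(y)
```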

Get information from different dict by dict name

I have a data/character_data.py:
CHARACTER_A = { 1: {"level": 1, "name":"Ann", "skill_level" : 1},
                2: {"level": 2, "name":"Tom", "skill_level" : 1}}
CHARACTER_B = { 1: {"level": 1, "name":"Kai", "skill_level" : 1},
                2: {"level": 2, "name":"Mel", "skill_level" : 1}}
In main.py, I can do this:
from data import character_data as character_data
print character_data.CHARACTER_A[1]["name"]
>>> output: Ann
print character_data.CHARACTER_B[2]["name"]
>>> output: Mel
How do I achieve this?
from data import character_data as character_data
character_type = "CHARACTER_A"
character_id = 1
print character_data.character_type[character_id]["name"]
>>> correct output should be: Ann
I get an AttributeError when I try to use character_type as "CHARACTER_A".
How about this
In [38]: from data import character_data as character_data
In [39]: character_type = "CHARACTER_A"
In [40]: character_id = 1
In [41]: getattr(character_data, character_type)[character_id]["name"]
Out[41]: 'Ann'
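If the type name comes from user input or a file, it may be worth guarding the getattr lookup so that an unknown name does not raise. The following is a sketch under that assumption: lookup_name is a hypothetical helper, and a SimpleNamespace stands in for the data.character_data module so the example runs on its own.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the data.character_data module from the question.
character_data = SimpleNamespace(
    CHARACTER_A={1: {"level": 1, "name": "Ann", "skill_level": 1}},
    CHARACTER_B={2: {"level": 2, "name": "Mel", "skill_level": 1}},
)


def lookup_name(module, character_type, character_id):
    """Return the character's name, or None if the type or id is unknown."""
    table = getattr(module, character_type, None)
    if not isinstance(table, dict):
        return None
    entry = table.get(character_id)
    return entry["name"] if entry else None


print(lookup_name(character_data, "CHARACTER_A", 1))  # Ann
print(lookup_name(character_data, "CHARACTER_C", 1))  # None
```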
You can use locals():
>>> from data.character_data import CHARACTER_A, CHARACTER_B
>>> character_id = 1
>>> character_type = "CHARACTER_A"
>>> locals()[character_type][character_id]["name"]
'Ann'
Though, think about merging CHARACTER_A and CHARACTER_B into one dict and access this dict instead of locals().
Also, see Dive into Python: locals and globals.
You need to structure your data properly.
characters = {}
characters['type_a'] = {1: {"level": 1, "name":"Ann", "skill_level" : 1},
                        2: {"level": 2, "name":"Tom", "skill_level" : 1}}
characters['type_b'] = ...
Or, the better solution is to create your own "character" type, and use that instead:
class Character(object):
    def __init__(self, type, level, name, skill):
        self.type = type
        self.level = level
        self.name = name
        self.skill = skill
characters = []
characters.append(Character('A',1,'Ann',1))
characters.append(Character('A',2,'Tom',1))
characters.append(Character('B',1,'Kai',1)) # and so on
Then,
all_type_a = []
looking_for = 'A'
for i in characters:
    if i.type == looking_for:
        all_type_a.append(i)
Or, the shorter way:
all_type_a = [i for i in characters if i.type == looking_for]
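If lookups by type happen often, the list could also be indexed once instead of being filtered every time. A small sketch of that idea follows. It uses a namedtuple in place of the class above purely to keep the example short, and by_type is a name introduced here.

```python
from collections import defaultdict, namedtuple

# Lightweight stand-in for the Character class above.
Character = namedtuple("Character", "type level name skill")

characters = [
    Character("A", 1, "Ann", 1),
    Character("A", 2, "Tom", 1),
    Character("B", 1, "Kai", 1),
    Character("B", 2, "Mel", 1),
]

# Build the index once; each lookup afterwards is a single dict access.
by_type = defaultdict(list)
for character in characters:
    by_type[character.type].append(character)

names_a = [c.name for c in by_type["A"]]
print(names_a)  # ['Ann', 'Tom']
```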
