I'm working on a script that compares two dictionaries. The first request does a GET and scrapes the data into a dictionary; I then want to compare that against the result of the next request, made the same way, to see whether anything on the webpage has changed. This is what I have so far:
import random
import threading
import time
from concurrent.futures import as_completed
from concurrent.futures.thread import ThreadPoolExecutor
import requests
from bs4 import BeautifulSoup
URLS = [
    'https://github.com/search?q=hello+world',
    'https://github.com/search?q=python+3',
    'https://github.com/search?q=world',
    'https://github.com/search?q=i+love+python',
    'https://github.com/search?q=sport+today',
    'https://github.com/search?q=how+to+code',
    'https://github.com/search?q=banana',
    'https://github.com/search?q=android+vs+iphone',
    'https://github.com/search?q=please+help+me',
    'https://github.com/search?q=batman',
]
def doRequest(url):
    response = requests.get(url)
    time.sleep(random.randint(10, 30))
    return response, url
def doScrape(response):
    soup = BeautifulSoup(response.text, 'html.parser')
    return {
        'title': soup.find("input", {"name": "q"})['value'],
        'repo_count': soup.find("span", {"data-search-type": "Repositories"}).text.strip()
    }
def checkDifference(parsed, url):
    pass  # not implemented yet: this comparison is what the question is about
def threadPoolLoop():
    with ThreadPoolExecutor(max_workers=1) as executor:
        future_tasks = [
            executor.submit(
                doRequest,
                url
            ) for url in URLS]
        for future in as_completed(future_tasks):
            response, url = future.result()
            if response.status_code == 200:
                checkDifference(doScrape(response), url)
while True:
    t = threading.Thread(target=threadPoolLoop)
    t.start()
    print('Joining thread and waiting for it to finish...')
    t.join()
My problem is that I do not know how to print a message whenever title and/or repo_count has changed. (The script will run 24/7, and I want it to print every time there is a change.)
If you're looking for a simple method to compare two dictionaries, there are a few different options.
Some good resources to begin:
mCoding: zipping together Python dicts
StackOverflow: Comparing two dictionaries and checking how many (key, value) pairs are equal
Let's start with two dictionaries to compare 👇 Some added elements, some removed, some changed, some same.
dict1 = {
    "value_2": 2,
    "value_3": 3,
    "value_4": 4,
    "value_5": "five",
    "value_6": "six",
}
dict2 = {
    "value_1": 1,
    "value_2": 3,
    "value_4": 4
}
You could probably use the unittest library. Like this:
>>> from unittest import TestCase
>>> TestCase().assertDictEqual(dict1, dict1) # <-- No output, because they are the same
>>> TestCase().assertDictEqual(dict1, dict2) # <-- Will raise error and display elements which are different
AssertionError: {'value_2': 2, 'value_3': 3, 'value_4': 4, 'value_5': 'five', 'value_6': 'six'} != {'value_1': 1, 'value_2': 3, 'value_4': 4}
- {'value_2': 2, 'value_3': 3, 'value_4': 4, 'value_5': 'five', 'value_6': 'six'}
+ {'value_1': 1, 'value_2': 3, 'value_4': 4}
The drawback is that it raises an error when the dictionaries differ, which is probably not what you want. You simply want to see what is different.
Another method is the deepdiff library. Like this:
>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> pprint(DeepDiff(dict1, dict2))
{'dictionary_item_added': [root['value_1']],
 'dictionary_item_removed': [root['value_3'], root['value_5'], root['value_6']],
 'values_changed': {"root['value_2']": {'new_value': 3, 'old_value': 2}}}
Or, you could easily craft your own functions. Like this 👇 (functions copied from here)
>>> from pprint import pprint
>>> def compare_dict(d1, d2):
...     return {k: d1[k] for k in d1 if k in d2 and d1[k] == d2[k]}
>>> pprint(compare_dict(dict1, dict2))
{'value_4': 4}
>>> def dict_compare(d1, d2):
...     d1_keys = set(d1.keys())
...     d2_keys = set(d2.keys())
...     shared_keys = d1_keys.intersection(d2_keys)
...     added = d2_keys - d1_keys  # keys only in the new dict (d2)
...     removed = d1_keys - d2_keys  # keys only in the old dict (d1)
...     modified = {o: {"old": d1[o], "new": d2[o]} for o in shared_keys if d1[o] != d2[o]}
...     same = set(o for o in shared_keys if d1[o] == d2[o])
...     return {"added": added, "removed": removed, "modified": modified, "same": same}
>>> pprint(dict_compare(dict1, dict2))
{'added': {'value_1'},
 'modified': {'value_2': {'new': 3, 'old': 2}},
 'removed': {'value_3', 'value_5', 'value_6'},
 'same': {'value_4'}}
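Applied to the scraper in the question, one possible shape for checkDifference is to remember the last parsed dictionary per URL and report only the keys whose values differ. This is a minimal sketch, not the asker's code: the name check_difference, the module-level previous store and the returned changes dict are assumptions introduced for illustration. With more than one worker thread the store would also need a lock.

```python
# Last parsed result seen for each URL (hypothetical in-memory store).
previous = {}


def check_difference(parsed, url):
    """Print and return the fields that changed since the last scrape of url."""
    old = previous.get(url)
    previous[url] = parsed
    if old is None:
        # First scrape of this URL: nothing to compare against yet.
        return {}
    changes = {
        key: {"old": old.get(key), "new": parsed.get(key)}
        for key in old.keys() | parsed.keys()
        if old.get(key) != parsed.get(key)
    }
    for key, diff in changes.items():
        print(f"{url}: {key} changed from {diff['old']!r} to {diff['new']!r}")
    return changes
```

Because the store lives only in memory, a restart of the script forgets the previous values; persisting them (to a file, for example) would be a separate design decision.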
Related
I have a dict:
my_dict = {'some.key' : 'value'}
and I want to change it like this:
result = {'some' : {'key' : 'value'}}
How can I do this?
I need this to create nested classes from dicts.
Example:
my_dict = {'nested.key' : 'value'}
class Nested:
    key : str
class MyDict:
    nested : Nested
If you need this for real use, and not as a coding exercise, you can install extradict and use extradict.NestedData:
In [1]: from extradict import NestedData
In [2]: a = NestedData({'some.key' : 'value'})
In [3]: a["some"]
Out[3]: {'key': <str>}
In [4]: a["some"]["key"]
Out[4]: 'value'
In [5]: a.data
Out[5]: {'some': {'key': 'value'}}
(disclaimer: I am the package author)
Not quite sure if I understand your question, but would
result = {key.split('.')[0]: {key.split('.')[1]: value} for key, value in my_dict.items()}
do the trick?
I hope this function will help you:
def foo(obj):
    result = {}
    for k, v in obj.items():
        keys = k.split('.')
        caret = result
        for i in range(len(keys)):
            curr_key = keys[i]
            if i == len(keys) - 1:
                caret[curr_key] = v
            else:
                caret.setdefault(curr_key, {})
                caret = caret[curr_key]
    return result
With recursion it could look like this (keys must not share a first segment, because update() replaces the whole top-level entry):
my_dict = {'key0' : 'value0',
           'nested.key' : 'value',
           'nested1.nested1.key1' : 'value1',
           'nested2.nested2.nested2.key2' : 'value2'}
def func(k, v):
    if '.' not in k:
        return {k: v}
    k1, k = k.split('.', 1)
    return {k1: func(k, v)}
res = {}
for k, v in my_dict.items():
    res.update(func(k, v))
>>> res
{'key0': 'value0',
 'nested': {'key': 'value'},
 'nested1': {'nested1': {'key1': 'value1'}},
 'nested2': {'nested2': {'nested2': {'key2': 'value2'}}}}
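Since the question ultimately wants attribute-style access (nested classes), the nested dictionary can be wrapped after it is built. The following is only a sketch under assumptions: the helper names nest and to_namespace are made up for illustration, and types.SimpleNamespace stands in for the hand-written classes in the question. It assumes every key segment is a valid Python identifier and that no key is both a leaf and a prefix of another key (for example 'a' and 'a.b' together would fail).

```python
from types import SimpleNamespace


def nest(flat):
    """Expand dotted keys into nested dicts, merging shared prefixes."""
    result = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            # Reuse the existing sub-dict when two keys share a prefix.
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


def to_namespace(obj):
    """Recursively wrap nested dicts so values are reachable as attributes."""
    if isinstance(obj, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in obj.items()})
    return obj


config = to_namespace(nest({"nested.key": "value", "nested.other": 1}))
print(config.nested.key)  # value
```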
I'm trying to apply a json-patch to a Mongoengine Document.
I'm using this json-patch library: https://github.com/stefankoegl/python-json-patch and mongoengine 0.14.3 with Python 3.6.3.
This is my current code:
import json
from jsonpatch import JsonPatch
json_patch = JsonPatch.from_string(jp_string)
document = Document.objects(id=document_id)
json_documents = json.loads(document.as_pymongo().to_json())
json_patched_document = json_patch.apply(json_documents[0])
Document.objects(id=document_id).first().delete()
Document.from_json(json.dumps(json_patched_document)).save(force_insert=True)
Is there a better way to save an edited json document?
I've improved the code a little:
json_patch = JsonPatch.from_string(jp_string)
document = Document.objects(id=document_id)
json_documents = json.loads(document.as_pymongo().to_json())
json_patched_document = json_patch.apply(json_documents[0])
Document.from_json(json.dumps(json_patched_document), created=True).save()
But is there a way to avoid converting the document to JSON?
I had a somewhat similar problem, except that I did not want to save the complete Document; I only wanted to update the fields that were modified or added.
Here is the code, tested on the inputs below:
import jsonpatch
def tryjsonpatch():
    doc_in_db = {'foo': 'bar', "name": "aj", 'numbers': [1, 3, 7, 8]}
    input = {'foo': 'bar', "name": "dj", 'numbers': [1, 3, 4, 8]}
    input2 = {'foo': 'bar', "name": "aj", 'numbers': [1, 3, 7, 8], "extera": "12"}
    input3 = {'foo': 'bar', "name": "dj", 'numbers': [1, 3, 4, 8], "extera": "12"}
    patch = jsonpatch.JsonPatch.from_diff(doc_in_db, input3)
    print("\n***patch***\n", patch)
    doc = get_minimal_doc(doc_in_db, patch)
    result = patch.apply(doc, in_place=True)
    print("\n###result###\n", result,
          "\n###present###\n", doc_in_db)
def get_minimal_doc(present, patch):
    # Build the smallest document the patch can be applied to.
    cur_dc = {}
    for change in patch.patch:
        if change['op'] not in ("add",):
            keys = change['path'].split("/")[1:]
            present_move = {}
            old_key = 1
            first = True
            for key in keys:
                if key.isdigit():  # old_key referred to an array
                    cur_dc[old_key] = present_move
                else:
                    if first:
                        cur_dc[key] = {}
                        first = False
                    else:
                        cur_dc[old_key][key] = {}
                    old_key = key
                    present_move = present[old_key]
    return cur_dc
tryjsonpatch()
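If the goal is to touch only the changed fields, another option is to translate the patch operations into a MongoDB update document instead of patching a copy of the document. The sketch below is an illustration under stated assumptions, not part of either library: patch_to_mongo_update is a hypothetical helper, it handles only the add, replace and remove operations, and it ignores JSON Pointer escapes (~0, ~1). Note also that a JSON Patch "add" on an array index inserts an element, whereas $set on "numbers.2" overwrites it, so array edits are not equivalent in general.

```python
def patch_to_mongo_update(operations):
    """Translate simple JSON Patch operations into a MongoDB update document."""
    update = {}
    for op in operations:
        # "/numbers/2" -> "numbers.2" (MongoDB dot notation).
        field = op["path"].lstrip("/").replace("/", ".")
        if op["op"] in ("add", "replace"):
            update.setdefault("$set", {})[field] = op["value"]
        elif op["op"] == "remove":
            update.setdefault("$unset", {})[field] = ""
        else:
            raise ValueError(f"unsupported operation: {op['op']}")
    return update


ops = [
    {"op": "replace", "path": "/name", "value": "dj"},
    {"op": "replace", "path": "/numbers/2", "value": 4},
    {"op": "add", "path": "/extera", "value": "12"},
]
print(patch_to_mongo_update(ops))
# {'$set': {'name': 'dj', 'numbers.2': 4, 'extera': '12'}}
```

The resulting dictionary could then be sent as a raw update, for example through pymongo's update_one on the underlying collection; whether that bypasses validation you rely on in mongoengine is something to check for your models.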
The code below stores a Python dictionary using a QSettings object.
After reading it back, all of the dictionary's keys come back as QString, like so:
{PyQt4.QtCore.QString(u'one'): 1, PyQt4.QtCore.QString(u'two'): 2}
I wonder if it is possible to read the dictionary back with regular string keys, like this:
{'one': 1, 'two': 2}
Code:
from PyQt4 import QtCore, QtGui
app = QtGui.QApplication([])
settings = QtCore.QSettings('apps', 'settings')
data = {'one': 1, 'two': 2}
settings.setValue('data', data)
data = settings.value('data').toPyObject()
print data
Python2
This is not possible directly; you have to convert the keys and build a regular dictionary:
d = {}
for k, v in data.items():
    d[str(k)] = v
Complete code:
from PyQt4 import QtCore, QtGui
app = QtGui.QApplication([])
settings = QtCore.QSettings('apps', 'settings')
data = {'one': 1, 'two': 2}
settings.setValue('data', data)
data = settings.value('data').toPyObject()
d = {}
for k, v in data.items():
    d[str(k)] = v
print(d)
output:
{'two': 2, 'one': 1}
Python3
This problem does not exist: settings.value() already returns a regular dictionary, so converting it with toPyObject() is no longer necessary.
from PyQt4 import QtCore, QtGui
app = QtGui.QApplication([])
settings = QtCore.QSettings('apps', 'settings')
data = {'one': 1, 'two': 2}
settings.setValue('data', data)
data = settings.value('data')
print(data)
output:
{'one': 1, 'two': 2}
original = {PyQt4.QtCore.QString(u'one'): 1, PyQt4.QtCore.QString(u'two'): 2}
converted = {str(k): v for k, v in original.items()}
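If the stored settings contain dictionaries inside dictionaries, the same conversion can be applied recursively. This is a rough sketch rather than anything from PyQt itself: stringify_keys is a hypothetical helper, and FakeQString is a stand-in class used only so the example runs without PyQt4 installed. Only keys are converted; values that are themselves QString objects are left untouched.

```python
def stringify_keys(obj):
    """Return a copy of obj with every dict key converted via str()."""
    if isinstance(obj, dict):
        return {str(key): stringify_keys(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [stringify_keys(item) for item in obj]
    return obj


class FakeQString:
    """Stand-in for QString so the sketch runs without PyQt4 installed."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


data = {FakeQString("one"): 1, FakeQString("two"): {FakeQString("three"): [3]}}
print(stringify_keys(data))  # {'one': 1, 'two': {'three': [3]}}
```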
I have the below list of tuples:
p = [("01","Master"),("02","Node"),("03","Node"),("04","Server")]
I want my output to look like:
y = {
    "Master":{"number":["01"]},
    "Node":{"number":["02", "03"]},
    "Server":{"number":["04"]}
}
I have tried the below code:
y = {}
for line in p:
    if line[1] in y:
        y[line[1]] = {}
        y[line[1]]["number"].append(line[0])
    else:
        y[line[1]] = {}
        y[line[1]]["number"] = [line[0]]
And I get the below error:
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
KeyError: 'number'
How do I solve this?
from collections import defaultdict
d = defaultdict(lambda: defaultdict(list))
for v, k in p:
    d[k]["number"].append(v)
print(d)
defaultdict(<function <lambda> at 0x7f8005097578>, {'Node': defaultdict(<type 'list'>, {'number': ['02', '03']}), 'Master': defaultdict(<type 'list'>, {'number': ['01']}), 'Server': defaultdict(<type 'list'>, {'number': ['04']})})
Without defaultdict:
d = {}
from pprint import pprint as pp
for v, k in p:
    d.setdefault(k,{"number":[]})
    d[k]["number"].append(v)
pp(d)
{'Master': {'number': ['01']},
 'Node': {'number': ['02', '03']},
 'Server': {'number': ['04']}}
It's because you don't initialize your dictionary when needed, and you reset it when not needed.
Try this:
p = [("01","Master"),("02","Node"),("03","Node"),("04","Server")]
y = {}
for (number, category) in p:
    if not y.get(category, False):
        # initializes your sub-dictionary
        y[category] = {"number": []}
    # adds the correct number to the sub-dictionary
    y[category]["number"].append(number)
Note that tuple unpacking in "for (number, category) in p" makes the code inside the loop more readable.
You are resetting the dictionary!
for line in p:
    if line[1] in y:
        #y[line[1]] = {} -- RESET! ["number"] will now disappear.
        #.. which leads to error in the next line.
        y[line[1]]["number"].append(line[0])
    else:
        y[line[1]] = {}
        y[line[1]]["number"] = [line[0]]
A more pythonic way of achieving the same thing would be by using a defaultdict as demonstrated in other answers.
Do not assign {} to the key when it is already present in y.
y = {}
for line in p:
    try:
        y[line[1]]["number"].append(line[0])
    except KeyError:
        y[line[1]] = {}
        y[line[1]]["number"] = [line[0]]
Or use defaultdict:
>>> from collections import defaultdict
>>> p = [("01","Master"),("02","Node"),("03","Node"),("04","Server")]
>>> d = defaultdict(list)
>>> for k, v in p:
...     d[v].append(k)
...
>>> d
defaultdict(<type 'list'>, {'Node': ['02', '03'], 'Master': ['01'], 'Server': ['04']})
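The defaultdict(list) result above is flatter than the structure the question asks for. A small follow-up step can add the missing {"number": [...]} layer and produce plain dicts. This is a sketch; the variable names grouped, number and category are chosen here for readability and are not from the answers above.

```python
from collections import defaultdict

p = [("01", "Master"), ("02", "Node"), ("03", "Node"), ("04", "Server")]

grouped = defaultdict(list)
for number, category in p:
    grouped[category].append(number)

# Wrap each list in the {"number": [...]} layer the question asks for.
y = {category: {"number": numbers} for category, numbers in grouped.items()}
print(y)
```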
I have a data/character_data.py:
CHARACTER_A = { 1: {"level": 1, "name":"Ann", "skill_level" : 1},
                2: {"level": 2, "name":"Tom", "skill_level" : 1}}
CHARACTER_B = { 1: {"level": 1, "name":"Kai", "skill_level" : 1},
                2: {"level": 2, "name":"Mel", "skill_level" : 1}}
In main.py, I can do this:
from data import character_data as character_data
print character_data.CHARACTER_A[1]["name"]
>>> output: Ann
print character_data.CHARACTER_B[2]["name"]
>>> output: Mel
How do I achieve this?
from data import character_data as character_data
character_type = "CHARACTER_A"
character_id = 1
print character_data.character_type[character_id]["name"]
>>> correct output should be: Ann
I get an AttributeError when I try to use character_type as "CHARACTER_A".
How about this:
In [38]: from data import character_data as character_data
In [39]: character_type = "CHARACTER_A"
In [40]: character_id = 1
In [41]: getattr(character_data, character_type)[character_id]["name"]
Out[41]: 'Ann'
You can use locals():
>>> from data.character_data import CHARACTER_A, CHARACTER_B
>>> character_id = 1
>>> character_type = "CHARACTER_A"
>>> locals()[character_type][character_id]["name"]
'Ann'
Though, think about merging CHARACTER_A and CHARACTER_B into one dict and access this dict instead of locals().
Also, see Dive into Python: locals and globals.
You need to structure your data properly.
characters = {}
characters['type_a'] = {1: {"level": 1, "name":"Ann", "skill_level" : 1},
                        2: {"level": 2, "name":"Tom", "skill_level" : 1}}
characters['type_b'] = ...
Or, the better solution is to create your own "character" type, and use that instead:
class Character(object):
    def __init__(self, type, level, name, skill):
        self.type = type
        self.level = level
        self.name = name
        self.skill = skill
characters = []
characters.append(Character('A',1,'Ann',1))
characters.append(Character('A',2,'Tom',1))
characters.append(Character('B',1,'Kai',1)) # and so on
Then,
all_type_a = []
looking_for = 'A'
for i in characters:
    if i.type == looking_for:
        all_type_a.append(i)
Or, the shorter way:
all_type_a = [i for i in characters if i.type == looking_for]
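To answer the original lookup (a type plus an id giving a name) with this object-based layout, the characters can be indexed by a (type, id) pair. The following is a sketch under assumptions: it uses a dataclass (Python 3.7+) instead of the hand-written class above, adds an id field that the class above does not have, and the names roster and by_key are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Character:
    type: str
    id: int
    level: int
    name: str
    skill_level: int


roster = [
    Character("A", 1, 1, "Ann", 1),
    Character("A", 2, 2, "Tom", 1),
    Character("B", 1, 1, "Kai", 1),
    Character("B", 2, 2, "Mel", 1),
]

# Index by (type, id) so a lookup needs no attribute-name tricks.
by_key = {(c.type, c.id): c for c in roster}

character_type = "A"
character_id = 1
print(by_key[(character_type, character_id)].name)  # Ann
```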