Defensive conditions when JSON field is missing in API

Defensive conditions when JSON field is missing in API - python

I am developing a small Python script in order to get the weather data from forecast.io Once I get the JSON document, I call a Class in order to create a new record to be saved in the database. The problem is that some fields (which are also attributes in my Class) are not always informed in the API so I must include some kind of defensive code or the script will break when field is not found.
I've found this answer of #Alex Martelli which seams pretty good: Reading from Python dict if key might not be present
If you want to do something different than using a default value (say,
skip the printing completely when the key is absent), then you need a
bit more structure, i.e., either:
for r in results:
if 'key_name' in r:
print r['key_name']
or
for r in results:
try: print r['key_name']
except KeyError: pass
But I am wondering if I must include an "if" or a "try" on every field I want to save or is there a prettier way to do this? (I want to save 27 fields and 27 "if" seems ugly)
This is the code I have so far:
from datetime import datetime
import tornado.web
import tornado.httpclient
from tornado import gen
from src.db.city import list_cities
from src.db.weather import Weather
from motorengine import *
#gen.coroutine
def forecastio_api():
http_client = tornado.httpclient.AsyncHTTPClient()
base_url = "https://api.forecast.io/forecast/APIKEY"
city yield list_cities()
for city in city:
url = base_url + "/%s,%s" %(str(city.loc[0]), str(city.loc[1]))
response = yield http_client.fetch(url)
json = tornado.escape.json_decode(response.body)
for day in json['daily']['data']:
weather = Weather(city=city,
time = datetime.fromtimestamp(day['time']),
summary = day.get('summary'),
icon = day.get('icon'),
sunrise_time = datetime.fromtimestamp(day.get('sunriseTime')),
sunset_time = datetime.fromtimestamp(day.get('sunsetTime')),
moon_phase = day.get('moonPhase'),
precip_intensity = day.get('precipIntensity'),
precip_intensity_max = day.get('precipIntensityMax'),
precip_intensity_max_time = datetime.fromtimestamp(day.get('precipIntensityMaxTime')),
precip_probability = day.get('precipProbability'),
precip_type = day.get('precipType'),
temperature_min = day.get('temperatureMin'),
temperature_min_time = datetime.fromtimestamp(day.get('temperatureMinTime')),
temperature_max = day.get('temperatureMax'),
temperature_max_time = datetime.fromtimestamp(day.get('temperatureMaxTime')),
apparent_temperature_min = day.get('apparentTemperatureMin'),
apparent_temperature_min_time = datetime.fromtimestamp(day.get('apparentTemperatureMinTime')),
apparent_temperature_max = day.get('apparentTemperatureMax'),
apparent_temperature_max_time = datetime.fromtimestamp(day.get('apparentTemperatureMaxTime')),
dew_point = day.get('dewPoint'),
humidity = day.get('humidity'),
wind_speed = day.get('windSpeed'),
wind_bearing = day.get('windBearing'),
visibility = day.get('visibility'),
cloud_cover = day.get('cloudCover'),
pressure = day.get('pressure'),
ozone = day.get('ozone')
)
weather.create()
if __name__ == '__main__':
io_loop = tornado.ioloop.IOLoop.instance()
connect("DATABASE", host="localhost", port=27017, io_loop=io_loop)
forecastio_api()
io_loop.start()
and this is the Weather Class using Motornegine:
from tornado import gen
from motorengine import Document
from motorengine.fields import DateTimeField, DecimalField, ReferenceField, StringField
from src.db.city import City
class Weather(Document):
__collection__ = 'weather'
__lazy__ = False
city = ReferenceField(reference_document_type=City)
time = DateTimeField(required=True)
summary = StringField()
icon = StringField()
sunrise_time = DateTimeField()
sunset_time = DateTimeField()
moon_phase = DecimalField(precision=2)
precip_intensity = DecimalField(precision=4)
precip_intensity_max = DecimalField(precision=4)
precip_intensity_max_time = DateTimeField()
precip_probability = DecimalField(precision=2)
precip_type = StringField()
temperature_min = DecimalField(precision=2)
temperature_min_time = DateTimeField()
temperature_max = DecimalField(precision=2)
temperature_max_time = DateTimeField()
apparent_temperature_min = DecimalField(precision=2)
apparent_temperature_min_time = DateTimeField()
apparent_temperature_max = DecimalField(precision=2)
apparent_temperature_max_time = DateTimeField()
dew_point = DecimalField(precision=2)
humidity = DecimalField(precision=2)
wind_speed = DecimalField(precision=2)
wind_bearing = DecimalField(precision=2)
visibility = DecimalField(precision=2)
cloud_cover = DecimalField(precision=2)
pressure = DecimalField(precision=2)
ozone = DecimalField(precision=2)
create_time = DateTimeField(auto_now_on_insert=True)
#gen.coroutine
def create(self):
yield self.save()

You can check Schematics. This library helps you define objects that can be easily populated from dicts(you can easily turn json to python dict). It allows you to define validation rules on each property. The object will throw ModelValidationError error when some properties are missing or in the wrong format. Schematics allows you to add default values and a lot more nice stuff when you define your models.

Related

How to access data from custom_export in Otree?

I am trying to use ExtraModel and custom_export to export data from live_pages. However, when I go on devserver and check the data tab, it is nowhere to be found. If I download the excel (bottom right of the page) the new variables are not included in the data.
Where can I find the data from the custom export? Or am I defining the function wrong? Any help greatly appreciated.
See MWE below
from otree.api import *
import random
doc = """
Your app description
"""
class C(BaseConstants):
NAME_IN_URL = 'mwe_export'
PLAYERS_PER_GROUP = None
NUM_ROUNDS = 1
NUM_EMPLOYERS = 3
class Subsession(BaseSubsession):
pass
class Group(BaseGroup):
pass
class Player(BasePlayer):
pass
class Offer(ExtraModel):
group = models.Link(Group)
sender = models.Link(Player)
wage = models.IntegerField()
effort = models.IntegerField()
job_id = models.IntegerField()
information_type = models.StringField()
# FUNCTIONS
def to_dict(offer: Offer):
return dict(sender=offer.sender.id_in_group,
wage=offer.wage,
effort=offer.effort,
job_id=offer.job_id,
information_type=offer.information_type)
# PAGES
class MyPage(Page):
#staticmethod
def js_vars(player: Player):
return dict(my_id=player.id_in_group)
#staticmethod
def live_method(player: Player, data):
print(data)
group = player.group
job_id = random.randint(1, 1000)
wage = data['wage']
effort = data['effort']
information_type = data['information_type']
if data['information_type'] == 'offer':
offer = Offer.create(group=group,
sender=player,
job_id=job_id,
wage=wage,
effort=effort,
information_type=information_type)
print(offer)
print(to_dict(offer))
return {0: to_dict(offer)}
page_sequence = [MyPage]
def custom_export(players):
yield ['session.code', 'participant_code', 'id_in_session']
offers = Offer.filter()
for offer in offers:
player = offer.sender
participant = player.participant
yield [participant.code, participant.id_in_session, offer.job_id, offer.wage, offer.effort]

In the menu at the top of the admin page there is also a "Data" item. The custom export for your app should be available there under the heading "Per-app".

Cannot access ListField elements from another python code

Hello I want to share a mongodb between two django apps (monitor , manager) using one application called common.
I can create database instances in the manager application easily but when accessing the book authors i cannot.
it return this error
mongoengine.errors.FieldDoesNotExist: The fields "{'_id'}" do not
exist on the document "author"
models.py
from mongoengine import *
class author(Document):
name = StringField(required = True)
meta = {'abstract': True , 'allow_inheritance':True}
class book(Document):
name = StringField(required = True)
authors = ListField(ReferenceField(author))
meta = {'abstract': True , 'allow_inheritance':True}
manager.py
from mongoengine import *
from models import *
class author(author):
def rand(self):
print("i am useless")
class book(book):
def rand2(self):
print("i am also useless")
if __name__ == "__main__":
connect('test', host = '0.0.0.0',port = 27017)
a1 = author(name = "Charef")
a1.save()
a2 = author(name = "hamid")
a2.save()
a3 = author(name = "djoudi")
a3.save()
a4 = author(name = "cheb khaled")
a4.save()
book1_authors = [a1,a2,a4]
book2_authors = [a1,a3]
book1 = book(name = "Hello Django", authors = book1_authors)
book1.save()
book2 = book(name = "Hello python", authors = book2_authors)
book2.save()
monitor
from mongoengine import *
from models import *
class author(author):
def say_hi(self):
print("Hi, my name is {} and this is my book".format(self.name))
class book(book):
def book_info(self):
for author in self.authors:
print(author.say_hi())
print("done !! ")
if __name__ == "__main__":
connect("test",host = "0.0.0.0", port = 27017)
s_book = book.objects()[0]
print(s_book.name)
print(len(s_book.authors))

Use unique names for the different classes (e.g. BaseBook for the abstract class and Book for the concrete class). Some internals of mongoengine relies on the uniqueness of the class names so it's not a good idea to go against that.
Using the following works:
class BaseAuthor(Document):
name = StringField(required=True)
meta = {'abstract': True , 'allow_inheritance':True}
class BaseBook(Document):
name = StringField(required=True)
authors = ListField(ReferenceField(BaseAuthor))
meta = {'abstract': True , 'allow_inheritance':True}
class Author(BaseAuthor):
def rand(self):
print("i am useless")
class Book(BaseBook):
def rand2(self):
print("i am also useless")
Also, if possible use the same Book/Author classes in both monitor and manager

How to speed up writing in a database?

I have a function which search for json files in a directory, parse the file and write data in the database. My problem is writing in database, because it take around 30 minutes. Any idea how can I speed up writting in a database? I have few quite big files to parse, but parsing the file is not a problem it take around 3 minutes. Currently I am using sqlite but in the future I will change it to PostgreSQL.
Here is my function:
def create_database():
with transaction.atomic():
directory = os.fsencode('data/web_files/unzip')
for file in os.listdir(directory):
filename = os.fsdecode(file)
with open('data/web_files/unzip/{}'.format(filename.strip()), encoding="utf8") as f:
data = json.load(f)
cve_items = data['CVE_Items']
for i in range(len(cve_items)):
database_object = DataNist()
try:
impact = cve_items[i]['impact']['baseMetricV2']
database_object.severity = impact['severity']
database_object.exp_score = impact['exploitabilityScore']
database_object.impact_score = impact['impactScore']
database_object.cvss_score = impact['cvssV2']['baseScore']
except KeyError:
database_object.severity = ''
database_object.exp_score = ''
database_object.impact_score = ''
database_object.cvss_score = ''
for vendor_data in cve_items[i]['cve']['affects']['vendor']['vendor_data']:
database_object.vendor_name = vendor_data['vendor_name']
for description_data in cve_items[i]['cve']['description']['description_data']:
database_object.description = description_data['value']
for product_data in vendor_data['product']['product_data']:
database_object.product_name = product_data['product_name']
database_object.save()
for version_data in product_data['version']['version_data']:
if version_data['version_value'] != '-':
database_object.versions_set.create(version=version_data['version_value'])
My models.py:
class DataNist(models.Model):
vendor_name = models.CharField(max_length=100)
product_name = models.CharField(max_length=100)
description = models.TextField()
date = models.DateTimeField(default=timezone.now)
severity = models.CharField(max_length=10)
exp_score = models.IntegerField()
impact_score = models.IntegerField()
cvss_score = models.IntegerField()
def __str__(self):
return self.vendor_name + "-" + self.product_name
class Versions(models.Model):
data = models.ForeignKey(DataNist, on_delete=models.CASCADE)
version = models.CharField(max_length=50)
def __str__(self):
return self.version
I will appreciate if you can give me any advice how can I improve my code.

Okay, given the structure of the data, something like this might work for you.
This is standalone code aside from that .objects.bulk_create() call; as commented in the code, the two classes defined would actually be models within your Django app.
(By the way, you probably want to save the CVE ID as an unique field too.)
Your original code had the misassumption that every "leaf entry" in the affected version data would have the same vendor, which may not be true. That's why the model structure here has a separate product-version model that has vendor, product and version fields. (If you wanted to optimize things a little, you might deduplicate the AffectedProductVersions even across DataNists (which, as an aside, is not a perfect name for a model)).
And of course, as you had already done in your original code, the importing should be run within a transaction (transaction.atomic()).
Hope this helps.
import json
import os
import types
class DataNist(types.SimpleNamespace): # this would actually be a model
severity = ""
exp_score = ""
impact_score = ""
cvss_score = ""
def save(self):
pass
class AffectedProductVersion(types.SimpleNamespace): # this too
# (foreign key to DataNist here)
vendor_name = ""
product_name = ""
version_value = ""
def import_item(item):
database_object = DataNist()
try:
impact = item["impact"]["baseMetricV2"]
except KeyError: # no impact object available
pass
else:
database_object.severity = impact.get("severity", "")
database_object.exp_score = impact.get("exploitabilityScore", "")
database_object.impact_score = impact.get("impactScore", "")
if "cvssV2" in impact:
database_object.cvss_score = impact["cvssV2"]["baseScore"]
for description_data in item["cve"]["description"]["description_data"]:
database_object.description = description_data["value"]
break # only grab the first description
database_object.save() # save the base object
affected_versions = []
for vendor_data in item["cve"]["affects"]["vendor"]["vendor_data"]:
for product_data in vendor_data["product"]["product_data"]:
for version_data in product_data["version"]["version_data"]:
affected_versions.append(
AffectedProductVersion(
data_nist=database_object,
vendor_name=vendor_data["vendor_name"],
product_name=product_data["product_name"],
version_name=version_data["version_value"],
)
)
AffectedProductVersion.objects.bulk_create(
affected_versions
) # save all the version information
return database_object # in case the caller needs it
with open("nvdcve-1.0-2019.json") as infp:
data = json.load(infp)
for item in data["CVE_Items"]:
import_item(item)

associate temporary values with appengine model querysets

The following is my model:
I have two tables match and team:
class Match(DictModel):
date_time = db.DateTimeProperty()
team1 = db.StringProperty()
team2 = db.StringProperty()
venue = db.StringProperty()
result = db.IntegerProperty()
class Team(DictModel):
tslug = db.StringProperty()
name = db.StringProperty()
matches_played = db.IntegerProperty()
matches_won = db.IntegerProperty()
rating = db.FloatProperty()
At runtime, when a post request is made to one of the handler functions, i want to dynamically associate a team rating with the queryset of Match and send the value, this is how i try to do:
matches = Match.all()
matches.filter('date_time <=', end)
matches.filter('date_time >=', start)
match_dict = functs.create_dict(matches)
self.response.out.write(match_dict)
and i have a custom function to get fetch the rating from the current team, it is as follows:
def to_dict(self):
return dict([(p, unicode(getattr(self, p))) for p in self.properties()])
def create_dict(matches):
lst = []
for m in matches:
t1 = m.team1
t2 = m.team2
te1 = Team.all().filter("name =", t1).get()
te2 = Team.all().filter("name =", t2).get()
m.setattr('rating1', te1.rating)
m.setattr('rating2', te2.rating)
lst.append(m)
data_dict = json.dumps([l.to_dict() for l in lst])
return data_dict
Trouble: i get error in setattr in place of setattr i also tried m.rating1 = te1 and m.rating2 = te2 but even that does not seem to work. Everything else is working flawlessly.
Please help thanks!

The syntax is setattr(m, 'rating1', te1.rating1); but this is no different from m.rating1 = te1.rating1.
May we see the trace back?

Retrieving a set of model objects that are indirectly related

I have 4 models:
class TransitLine(models.Model):
name = models.CharField(max_length=32)
class Stop(models.Model):
line = models.ForeignKey(TransitLine, related_name='stops')
class ExitType(models.Model):
name = models.CharField(max_length=32)
people_limit = models.PositiveSmallIntegerField()
class Exits(models.Model):
TOKEN_BOOTH = 0
GATE = 1
REVOLVING_DOOR = 2
EXIT_TYPES = (
(TOKEN_BOOTH, 'Token booth'),
(GATE, 'Gate'),
(REVOLVING_DOOR, 'Revolving door'),
)
name = models.CharField(max_length=32)
stop = models.ForeignKey(Stop, related_name='exits')
type = models.ForeignKey(ExitType, related_name='exits')
I have one TransitLine object. I want to retrieve all the unique ExitType objects that are related to the Stop objects of the TransitLine (that was a mouth full).
Some semi-pseudo code of what I want to do:
tl = TransitLine.objects.get(id=1)
exit_types = []
for s in tl.stops:
exit_types.append([e.type for e in s.exits])
unique_exit_types = list(set(exit_types))
Obviously, prefer to do this in one QuerySet call. Any suggestions?

I would try something like this:
ExitType.objects.filter(exits__stop__line=line).distinct()

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Defensive conditions when JSON field is missing in API - python

Related

How to access data from custom_export in Otree?

Cannot access ListField elements from another python code

How to speed up writing in a database?

associate temporary values with appengine model querysets

Retrieving a set of model objects that are indirectly related

Categories

Resources