I am trying to port a web crawler app from .Net to Python. It receives json responses similar to the following:
[
{
"Code": "AAA",
"Date": "/Date(1481875200000)/",
"Value": 12345.00
}
]
Newtonsoft Json deserializes this easily. However, I can't seem to deserialize it with Python's built-in JSON decoder:
from django.db import models

class ItemModel(models.Model):
    code = models.CharField(max_length=5)
    date = models.DateTimeField()
    value = models.IntegerField(default=0)

import json

parsed_data = json.loads(json_data, encoding='utf-8')
new_model = ItemModel()
new_model.code = parsed_data["Code"]
new_model.date = parsed_data["Date"]
new_model.value = parsed_data["Value"]
new_model.save()
which gives
ValidationError: [u"'/Date(1481875200000)/' value has an invalid
format. It must be in YYYY-MM-DD HH:MM[:ss[.uuuuuu]][TZ] format."]
Edit: I now know this happens because I am assigning a string to a DateTimeField.
Is there a way to parse this data into the Django model? I have no way to modify the JSON response. Also, is this the right way to do it? The code seems intuitively iffy to me.
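You need to implement a custom decoder for the 'Date' field. The number inside /Date(...)/ is a Unix timestamp in milliseconds, so divide it by 1000 before passing it to datetime.fromtimestamp.
import json
from datetime import datetime, timezone

def parseMyData(dct):
    # Convert a "/Date(<milliseconds>)/" string into a timezone-aware datetime.
    if 'Date' in dct:
        millis = int(dct['Date'][6:-2])
        dct['Date'] = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return dct

jdata = '''{
    "Code": "AAA",
    "Date": "/Date(1481875200000)/",
    "Value": 12345.00
}
'''
json.loads(jdata, object_hook=parseMyData)
which returns
{'Code': 'AAA',
 'Date': datetime.datetime(2016, 12, 16, 8, 0, tzinfo=datetime.timezone.utc),
 'Value': 12345.0}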
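A slightly more defensive variation on the object_hook idea, as a sketch only. It assumes the number is Unix milliseconds, and that some payloads may carry an offset suffix such as /Date(1481875200000+0100)/, which this sketch ignores. The names decode_ms_dates and _MS_DATE are made up for illustration.

```python
import json
import re
from datetime import datetime, timezone

# Matches "/Date(1481875200000)/" and "/Date(1481875200000+0100)/".
# Assumption: the optional offset suffix is ignored and the value is treated as UTC.
_MS_DATE = re.compile(r"^/Date\((-?\d+)(?:[+-]\d{4})?\)/$")

def decode_ms_dates(dct):
    """object_hook that converts every /Date(ms)/ string value in a JSON object."""
    for key, value in dct.items():
        if isinstance(value, str):
            match = _MS_DATE.match(value)
            if match:
                millis = int(match.group(1))
                dct[key] = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return dct

payload = '[{"Code": "AAA", "Date": "/Date(1481875200000)/", "Value": 12345.00}]'
rows = json.loads(payload, object_hook=decode_ms_dates)
print(rows[0]["Date"].isoformat())
```

Because the hook checks every string value rather than one hard-coded key, it leaves other fields untouched, and the resulting datetime can be assigned directly to a DateTimeField such as new_model.date.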
I want to store key-value JSON data in AWS DynamoDB, where the key is a date string in YYYY-mm-dd format and the value is entries, a Python dictionary. When I used the boto3 client to save the data, it was stored in DynamoDB's typed object format, which I don't want. My purpose is simple: store JSON data against a key which is a date, so that I can later query the data by that date. I am struggling with this because I have not found any relevant link that explains how to store JSON data and retrieve it without any conversion.
I need help solving this in Python.
What I am doing now:
item = {
    "entries": [
        {
            "path": [
                {
                    "name": "test1",
                    "count": 1
                },
                {
                    "name": "test2",
                    "count": 2
                }
            ],
            "repo": "test3"
        }
    ],
    "date": "2022-10-11"
}
dynamodb_client = boto3.resource('dynamodb')
table = dynamodb_client.Table(table_name)
response = table.put_item(Item=item)
What actually saved:
[{"M":{"path":{"L":[{"M":{"name":{"S":"test1"},"count":{"N":"1"}}},{"M":{"name":{"S":"test2"},"count":{"N":"2"}}}]},"repo":{"S":"test3"}}}]
But I want to save exactly the same JSON data as it is, without any conversion at all.
When I retrieve it programmatically, you can see the differences: single quotes, and the count values changed to Decimal.
response = table.get_item(
    Key={
        "date": "2022-10-12"
    }
)
Output
{'Item': {'entries': [{'path': [{'name': 'test1', 'count': Decimal('1')}, {'name': 'test2', 'count': Decimal('2')}], 'repo': 'test3'}], 'date': '2022-10-12'}}
Sample picture:
Why not store it as a single attribute of type string? Then you’ll get out exactly what you put in, byte for byte.
When you store this in DynamoDB, you get exactly what you provided: the key is your date and you have a list of entries.
If you need it stored in a different format, you have to provide JSON that matches that format. It's important to note that DynamoDB is a key-value store not a document store. It is worth looking up the differences between the two.
I figured out how to solve this issue. I have two columns, date and entries, in my DynamoDB table (also visible in the screenshot in the question).
I convert the entries value from a list to a string and then save it in the database. At retrieval time I do the reverse: parse the string, build a proper JSON response, and return it.
I am also sharing sample code below so that anybody else dealing with the same situation has at least one option.
# While storing:
import json
import boto3

entries_string = json.dumps([
    {
        "path": [
            {
                "name": "test1",
                "count": 1
            },
            {
                "name": "test2",
                "count": 2
            }
        ],
        "repo": "test3"
    }
])
item = {
    "entries": entries_string,
    "date": "2022-10-12"
}
dynamodb_client = boto3.resource('dynamodb')
table = dynamodb_client.Table(<TABLE-NAME>)
table.put_item(Item=item)
-------------------------
# While fetching:
response = table.get_item(
    Key={
        "date": "2022-10-12"
    }
)['Item']
# The stored value is a JSON string, so parse it back into Python objects.
entries_string = response['entries']
entries_dic = json.loads(entries_string)
response['entries'] = entries_dic
print(json.dumps(response))
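If you would rather keep entries as native DynamoDB lists and maps, the Decimal values seen in the question can be converted when serializing the fetched item. The following is only a sketch of that alternative: the item literal mimics the get_item output quoted in the question, and decimal_default is a hypothetical helper name.

```python
import json
from decimal import Decimal

def decimal_default(obj):
    """json.dumps fallback: turn Decimal into int when integral, else float."""
    if isinstance(obj, Decimal):
        return int(obj) if obj == obj.to_integral_value() else float(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

# Same shape as the 'Item' returned by table.get_item(...) in the question.
item = {
    "entries": [
        {
            "path": [
                {"name": "test1", "count": Decimal("1")},
                {"name": "test2", "count": Decimal("2")},
            ],
            "repo": "test3",
        }
    ],
    "date": "2022-10-12",
}

print(json.dumps(item, default=decimal_default))
```

The trade-off is that the string approach returns exactly what was stored, while this one keeps the data queryable as structured attributes at the cost of a conversion step on read.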
I get an error when I try to convert JSON data into Django model instances.
How can I solve it?
class Persons(models.Model):
    rank = models.IntegerField()
    employer = models.CharField(max_length=100)
    employeesCount = models.IntegerField()
    medianSalary = models.IntegerField()
Object creator:
for json in json_string:
    Persons.objects.create(id=json['rank'], employer=json['employer'], employeesCount=json['employeesCount'], medianSalary=json['medianSalary'])
JSON reader:
f = open('data.json')
json_string = f.read()
f.close()
JSON file:
[
{
"rank": 1,
"employer": "Walmart",
"employeesCount": 2300000,
"medianSalary": 19177
},
{
"rank": 2,
"employer": "Amazon",
"employeesCount": 566000,
"medianSalary": 38466
}
]
Your code expects dictionaries, but it is iterating over a plain string. Convert the JSON string using the built-in json library:
import json
your_json = json.loads(f.read())
The json_string is just a string, not a list of dictionaries. You can use the json module to deserialize it:
from json import load as json_load

with open('data.json') as f:
    json_data = json_load(f)

Persons.objects.bulk_create([
    Persons(
        id=record['rank'],
        employer=record['employer'],
        employeesCount=record['employeesCount'],
        medianSalary=record['medianSalary']
    )
    for record in json_data
])
By using bulk_create you create the objects in bulk in the database, which reduces the number of round trips to the database.
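To see why the original loop failed, here is a small standalone illustration using only the standard library. The shortened json_string literal is made up for the example. Iterating over the raw text yields single characters, and indexing a character with a key such as 'rank' raises a TypeError; after parsing, iteration yields dictionaries.

```python
import json

json_string = '[{"rank": 1, "employer": "Walmart"}, {"rank": 2, "employer": "Amazon"}]'

# Iterating over the raw string yields single characters, not records.
first_char = next(iter(json_string))
print(repr(first_char))

# After parsing, iteration yields dictionaries that support record["rank"].
records = json.loads(json_string)
ranks = [record["rank"] for record in records]
print(ranks)
```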
I am filtering content from a web API. I use a GET request to obtain the complete data set, then apply some filters to the response to get the results I want. The whole process of retrieving, filtering, and displaying the results takes around 3-5 minutes, which is too long for users to wait. I want to use a POST request to filter directly in the URL request instead of retrieving the complete data set; that way I can get rid of my custom filters and greatly reduce the waiting time. How can I achieve this?
This is the current code I have:
from django.shortcuts import render
from django.http import JsonResponse
from rest_framework.views import APIView
from rest_framework.response import Response
from collections import Counter
from datetime import datetime, timedelta
import json, urllib.request, dateutil.parser, urllib.parse

class ChartData(APIView):
    def get(self, request, format=None):
        # Request access to the PO database and parse the JSON object
        with urllib.request.urlopen(
                "http://10.21.200.98:8081/T/ansdb/api/rows/PO/tickets?User_0001=Pat%20Trevor",
                timeout=15) as url:
            complete_data_user_0001 = json.loads(url.read().decode())
        # Custom filter to the JSON response
        # Count the number of times the user has created a PO between two given dates where the PO is not equal to N/A values
        # (start_date and end_date are not shown in this snippet)
        data = Counter([k['user_id'] for k in complete_data_user_0001 if
                        start_date < dateutil.parser.parse(
                            k.get('DateTime')) < end_date and
                        k['PO_value'] != 'N/A'])
        return Response(data)
The filters that I would like to apply with a POST method are:
{
    "filter": {
        "filters": [
            {
                "field": "CreatedOnDate",
                "operator": "gte",
                "value": "2017-05-31 00:00:00"
            },
            {
                "field": "CreatedOnDate",
                "operator": "lte",
                "value": "2017-06-04 00:00:00"
            },
            {
                "field": "Triage_Subcategory",
                "operator": "neq",
                "value": "N/A"
            }
        ],
        "logic": "and"
    }
}
The structure of the JSON object is:
[{
    user_id : 0001,
    CreatedOn: "2017-02-16 15:54:48",
    Problem: "AVAILABILILTY",
    VIP: "YES",
    PO_value: N/A
},
{
    user_id : 0001,
    CreatedOn: "2017-01-10 18:14:28",
    Problem: "AVAILABILILTY",
    VIP: "YES",
    PO_value: 00098324
},
...
}]
Any suggestion, approach or piece of code is appreciated.
You have to add a post method to your view and make the POST call to your API from it with urllib.
class ChartData(APIView):
    def post(self, request, format=None):
        # Your POST call to the API goes here
        ...
Alternatively, you can use the requests library to make the POST call.
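As a rough sketch of what the POST call could look like with urllib: this assumes the remote API accepts the filter document from the question as a JSON body on the same tickets endpoint, which the question does not confirm. build_filter_request is a hypothetical helper; it only builds the request, and the lines that would send it are left as comments.

```python
import json
import urllib.request

def build_filter_request(url, filters, logic="and"):
    """Build (but do not send) a POST request carrying the filter as a JSON body."""
    body = json.dumps({"filter": {"filters": filters, "logic": logic}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

filters = [
    {"field": "CreatedOnDate", "operator": "gte", "value": "2017-05-31 00:00:00"},
    {"field": "CreatedOnDate", "operator": "lte", "value": "2017-06-04 00:00:00"},
    {"field": "Triage_Subcategory", "operator": "neq", "value": "N/A"},
]
# Assumption: the endpoint from the question, without its query string, accepts POST.
req = build_filter_request("http://10.21.200.98:8081/T/ansdb/api/rows/PO/tickets", filters)

# Sending it would look like this (not executed here; requires network access):
# with urllib.request.urlopen(req, timeout=15) as resp:
#     filtered_rows = json.loads(resp.read().decode())
```

If the server does filter on its side, the view only has to count the rows it receives, and the Counter comprehension with the date comparison is no longer needed.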
I am trying to save JSON data as Django model instances. I am new to django-rest-framework.
Here is my model:
class objective(models.Model):
    description = models.CharField(max_length=200)
    profile_name = models.CharField(max_length=100)
    pid = models.ForeignKey('personal_info')
serializer.py:
class objective_Serilaizer(serializers.Serializer):
    description = serializers.CharField(max_length=200)
    profile_name = serializers.CharField(max_length=100)
    pid = serializers.IntegerField()

    def restore_object(self, attrs, instance=None):
        if instance:
            instance.description = attrs.get('description', instance.description)
            instance.profile_name = attrs.get('profile_name', instance.profile_name)
            instance.pid = attrs.get('pid', instance.pid)
            return instance
        return objective(**attrs)
JSON:
{
    "objective": {
        "description": "To obtain job focusing in information technology.",
        "profile_name": "Default",
        "id": 1
    }
}
I tried
>>> stream = StringIO(json)
>>> data = JSONParser().parse(stream)
I am getting following error
raise ParseError('JSON parse error - %s' % six.text_type(exc))
ParseError: JSON parse error - No JSON object could be decoded
Use:
objective_Serilaizer(data=json)
or, more likely, since your JSON arrives as data on the request object:
objective_Serilaizer(data=request.DATA)
The Django REST framework docs have a good walkthrough of this.
If you are sure your JSON as a string is correct then it should be easy to parse without going to the lengths you currently are:
>>> import json
>>> stream = """\
... {
... "objective": {
... "description": "To obtain job focusing in information technology.",
... "profile_name": "Default",
... "id": 1
... }
... }"""
>>> json.loads(stream)
{u'objective': {u'profile_name': u'Default', u'description': u'To obtain job focusing in information technology.', u'id': 1}}
So surely the question is how come you aren't able to parse it. Where is that JSON string you quote actually coming from? Once your JSON object is parsed, you need to address the top-level "objective" key to access the individual data elements in the record you want to create.
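For illustration, addressing the top-level "objective" key and reshaping the record for the serializer might look like the sketch below. The mapping of the JSON "id" onto the serializer's pid field is an assumption; the question does not say which field it corresponds to.

```python
import json

raw = """
{
    "objective": {
        "description": "To obtain job focusing in information technology.",
        "profile_name": "Default",
        "id": 1
    }
}
"""

# The record to create lives under the top-level "objective" key.
record = json.loads(raw)["objective"]

# Assumption: the JSON "id" is the personal_info key, i.e. the serializer's "pid".
serializer_input = {
    "description": record["description"],
    "profile_name": record["profile_name"],
    "pid": record["id"],
}
print(serializer_input)
```

A dictionary shaped like serializer_input is what would then be passed as data= to the serializer.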
The problem I'm having is with mixing serialized Django models using django.core.serializers with some other piece of data and then trying to serialize the entire thing using json.dumps.
Example code:
scores = []
for indicator in indicators:
    score_data = {}
    score_data["indicator"] = serializers.serialize("json", [indicator,])
    score_data["score"] = evaluation.indicator_percent_score(indicator.id)
    score_data["score_descriptor"] = \
        serializers.serialize("json",
                              [form.getDescriptorByPercent(score_data["score"]),],
                              fields=("order", "value", "title"))
    scores.append(score_data)
scores = json.dumps(scores)
return HttpResponse(scores)
Which returns a list of objects like this:
{
indicator: "{"pk": 345345, "model": "forms.indicator", "fields": {"order": 1, "description": "Blah description.", "elements": [10933, 4535], "title": "Blah Title"}}",
score: 37.5,
score_descriptor: "{"pk": 66666, "model": "forms.descriptor", "fields": {"order": 1, "value": "1.00", "title": "Unsatisfactory"}}"
}
The problem can be seen in the JSON: the serialized Django models are wrapped in multiple sets of quotation marks. This makes it very hard to work with on the client side, because when I try to do something like
indicator.fields.order
it evaluates to nothing, since the browser thinks it is dealing with a string.
Ideally I would like valid JSON without the conflicting quotation marks that make it unreadable. Something akin to a list of objects like this:
{
    indicator: {
        pk: 12931231,
        fields: {
            order: 1,
            title: "Blah Title",
        }
    },
    etc.
}
Should I be doing this in a different order, using a different data structure, different serializer?
My solution involved dropping the use of django.core.serializers and instead using django.forms.model_to_dict and django.core.serializers.json.DjangoJSONEncoder.
The resulting code looked like this:
import json
from django.core.serializers.json import DjangoJSONEncoder
from django.forms import model_to_dict

scores = []
for indicator in indicators:
    score_data = {}
    score_data["indicator"] = model_to_dict(indicator)
    score_data["score"] = evaluation.indicator_percent_score(indicator.id)
    score_data["score_descriptor"] = \
        model_to_dict(form.getDescriptorByPercent(score_data["score"]),
                      fields=("order", "value", "title"))
    scores.append(score_data)
scores = json.dumps(scores, cls=DjangoJSONEncoder)
The problem arose because I was essentially serializing the Django models twice: once with the django.core.serializers function and once with json.dumps.
The solution converts each model to a dictionary first, puts it in the dictionary with the other data, and then serializes everything once with json.dumps and the DjangoJSONEncoder.
Hope this helps someone, because I couldn't find my specific issue but was able to piece it together from answers to other Stack Overflow posts.
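The double-serialization effect can be reproduced without Django. In this minimal sketch, the indicator dictionary is a made-up stand-in for a serialized model; it contrasts calling json.dumps on an already-serialized string with calling it once on nested dictionaries.

```python
import json

# Stand-in for a model converted to a plain dictionary.
indicator = {"pk": 345345, "fields": {"order": 1, "title": "Blah Title"}}

# Serializing twice: the inner JSON text becomes an escaped string value.
twice = json.dumps({"indicator": json.dumps(indicator), "score": 37.5})
# Serializing once: the inner dict stays a nested object.
once = json.dumps({"indicator": indicator, "score": 37.5})

print(type(json.loads(twice)["indicator"]).__name__)
print(json.loads(once)["indicator"]["fields"]["order"])
```

In the first case the client receives a string and would have to parse it a second time; in the second case indicator.fields.order works directly.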