I'm fetching data from an API at a regular interval and want to store the JSON data in a database so I can access and use it later.
The API returns data in this format each time:
'{"data": {"cursor": null, "files": {"nodes": [{u'code': u'BOPhmYQg5Vm', u'date': 1482244678,u'counts': 2, u'id': u'1409492981312099686'}, {u'code': u'g5VmBOPhmYQ', u'date': 1482244678,u'counts': 5, u'id': u'1209968614094929813'}]}}}'
I can call json_data = json.loads(above_data) and then fetch the nodes with nodes_data = json_data["data"]["files"]["nodes"], which gives a list of nodes.
I want to store this nodes data in a DB column of Text type, data = Column(db.Text). Each time there will be 10-15 values in the nodes list.
How do I store it? There are multiple nodes, and I need to be able to append more nodes to the existing data column later.
I would also like to call json.loads(db_data_col) so that I get valid JSON and can loop over all the nodes to read their fields.
I'm confused about how to store the data in the DB and read it back later as valid JSON.
Edit 1: I'm using SQLite for testing and may use PostgreSQL in the future. The Text column type is the main point.
If you are using Django 1.8, you can create your own model field that stores JSON. The class also makes sure that the stored value is valid JSON.
import json
import logging

from django.db import models

# Assumption: BAD_DATA is a logger; the original snippet does not define it.
BAD_DATA = logging.getLogger(__name__)


class JsonField(models.TextField):
    """
    Stores json-able python objects as json.
    """

    def get_db_prep_value(self, value, connection, prepared=False):
        try:
            return json.dumps(value)
        except TypeError:
            BAD_DATA.error(
                "cannot serialize %s to store in a JsonField", str(value)
            )
            return ""

    def from_db_value(self, value, expression, connection, context):
        if value == "":
            return None
        try:
            return json.loads(value)
        except (TypeError, ValueError):
            # json.loads raises ValueError on malformed JSON, TypeError on non-string input
            BAD_DATA.error("cannot load dictionary field -- invalid JSON or wrong type")
            return None
I found a way to store JSON data in the DB. Since I'm fetching nodes from a remote service that returns a list of nodes on every request, I need to build proper JSON to store in and retrieve from the DB.
Say the API returned this JSON text: '{"cursor": null, "nodes": [{"name": "Test1", "value": 1}, {"name": "Test2", "value": 2}, ...]}'
So, first we need to access the nodes list:
data = json.loads(api_data)
nodes = data['nodes']
For the first entry in the DB column, we do the following:
str_data = json.dumps({"nodes": nodes})
str_data is now a valid JSON string with a "nodes" key, which we can store in the DB.
For the second and later entries, we do the following:
# get the data string from the DB column and parse it
db_data = json.loads(db_col_data)
# get the latest 'nodes' data from the API as explained above
# and append it to the nodes already stored
latest_data = db_data["nodes"] + new_api_nodes
# wrap the list under the "nodes" key again so the next json.loads()
# can still read db_data["nodes"], then serialize it
db_col_data = json.dumps({"nodes": latest_data})
# write db_col_data to the DB column and commit
This keeps the stored value valid JSON as nodes are added or removed.
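The load, append, and dump cycle described above can be sketched end to end against SQLite, which the question mentions using for testing. This is only a minimal sketch: the table name api_data, its columns, and the helper names append_nodes and load_nodes are assumptions made for illustration, not part of the original code.

```python
import json
import sqlite3


def append_nodes(conn, row_id, new_nodes):
    """Append new_nodes to the JSON list stored in the row's TEXT column."""
    row = conn.execute(
        "SELECT data FROM api_data WHERE id = ?", (row_id,)
    ).fetchone()
    if row is None:
        # First write for this row: store the new nodes under the "nodes" key.
        payload = {"nodes": list(new_nodes)}
        conn.execute(
            "INSERT INTO api_data (id, data) VALUES (?, ?)",
            (row_id, json.dumps(payload)),
        )
    else:
        # Later writes: parse the stored text, extend the list, write it back.
        payload = json.loads(row[0])
        payload["nodes"].extend(new_nodes)
        conn.execute(
            "UPDATE api_data SET data = ? WHERE id = ?",
            (json.dumps(payload), row_id),
        )
    conn.commit()
    return payload


def load_nodes(conn, row_id):
    """Return the stored list of nodes, or an empty list if the row is missing."""
    row = conn.execute(
        "SELECT data FROM api_data WHERE id = ?", (row_id,)
    ).fetchone()
    return json.loads(row[0])["nodes"] if row else []


# Hypothetical table with a single TEXT column holding the JSON string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE api_data (id INTEGER PRIMARY KEY, data TEXT)")

append_nodes(conn, 1, [{"name": "Test1", "value": 1}])
append_nodes(conn, 1, [{"name": "Test2", "value": 2}])
stored_nodes = load_nodes(conn, 1)
```

Because the column always holds an object with a "nodes" key, every read can go through json.loads and loop over the list without special cases.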
Thanks!
Related
Hello, I am new to Python and still learning. I am developing a web app in Django and using Ajax to send requests. In view.py I receive a JSON object, but I have not been able to extract its attributes individually to pass them to a SQL query. I have tried everything I can think of and would appreciate any help.
def profesionales(request):
    body_unicode = request.body.decode('utf-8')
    received_json = json.loads(body_unicode)
    data = JsonResponse(received_json, safe=False)
    return data
data returns the following:
{opcion: 2, fecha_ini: "2021-02-01", fecha_fin: "2021-02-08", profesional: "168", sede: "Modulo 7", grafico: "2"}
This is the response I get, and I need to extract the value of each key into its own variable.
You can treat this as a dict:
for key in received_json:
    print(key, received_json[key])
    # do your stuff here
But if it's always an object with the same (fixed) keys, you can access each value directly by its key name:
opcion = received_json["opcion"]
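Since the goal is to pass the extracted values to a SQL query, here is a small sketch of that step using the standard-library sqlite3 module. The table citas, its columns, and the sample rows are invented for illustration. The point is only that each value is read by key and passed as a query parameter rather than formatted into the SQL string.

```python
import json
import sqlite3

# The request body from the question, written as valid JSON text.
body = (
    '{"opcion": 2, "fecha_ini": "2021-02-01", "fecha_fin": "2021-02-08", '
    '"profesional": "168", "sede": "Modulo 7", "grafico": "2"}'
)
received_json = json.loads(body)

# Read each value by its key into its own variable.
fecha_ini = received_json["fecha_ini"]
fecha_fin = received_json["fecha_fin"]
profesional = received_json["profesional"]

# Hypothetical table and rows, only so the query has something to run against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE citas (profesional TEXT, fecha TEXT)")
conn.executemany(
    "INSERT INTO citas VALUES (?, ?)",
    [("168", "2021-02-03"), ("168", "2021-03-01"), ("200", "2021-02-04")],
)

# Pass the values as parameters; the driver handles quoting.
rows = conn.execute(
    "SELECT profesional, fecha FROM citas "
    "WHERE profesional = ? AND fecha BETWEEN ? AND ?",
    (profesional, fecha_ini, fecha_fin),
).fetchall()
```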
I have the following json file
{"columnwithoutname":"structureet","nofinesset":810001792,"nofinessej":810001784}
{"columnwithoutname":"structureet","nofinesset":670797117,"nofinessej":670010339}
I want to insert it in DynamoDB using Lambda. This is what I did:
def lambda_handler(event, context):
    bucket = event['b']
    file_key = event['c']
    table = event['t']
    recList = []
    s3 = boto3.client('s3')
    dynamodb = boto3.client('dynamodb')
    obj = s3.get_object(Bucket=bucket, Key=file_key)
    recList = obj['Body'].read().split('\n')
    for row in recList:
        response = dynamodb.put_item(TableName='test-abe', Item=row)
But I have this error:
"errorMessage": "Parameter validation failed:\nInvalid type for parameter Item
Apparently I also need to specify the type of each column for it to be accepted. Is there any way to do this automatically? I want all columns to be strings. Thank you.
The DynamoDB client expects the Item parameter to be a dict (per https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#DynamoDB.Client.put_item).
When you do recList = obj['Body'].read().split('\n'), what you get is a list of str, so passing a str to a parameter that expects a dict fails validation.
One more thing to consider is that the DynamoDB client expects the item in a very specific format, with explicitly specified attribute data types. If you want to read JSON and simply write it, I suggest using the DynamoDB resource, something like this:
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(table_name)
# put_item takes keyword arguments only; item is a plain dict
table.put_item(Item=item)
Table.put_item() accepts a plain dict with no need to specify the data type of every attribute, so you can simply read from the file, convert each line to a dict, and send it away (https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#DynamoDB.Table.put_item).
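You need to convert each row so that every value is in the 'S' (string) format:
import json
for row in recList:
    if not row.strip():
        continue  # skip blank lines, e.g. after a trailing newline
    row_dict = json.loads(row)
    # the 'S' type requires string values, so cast numbers with str()
    ddb_row_dict = {k: {"S": str(v)} for (k, v) in row_dict.items()}
    response = dynamodb.put_item(TableName='test-abe', Item=ddb_row_dict)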
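The conversion can be checked offline, without calling AWS, by isolating it in a helper. The sketch below assumes Python 3, where the S3 object body is read as bytes and has to be decoded before splitting into lines. The function name to_string_items is hypothetical, and the sample bytes are the two records from the question.

```python
import json


def to_string_items(raw_bytes):
    """Turn newline-delimited JSON bytes into DynamoDB-style items typed as 'S'."""
    items = []
    for line in raw_bytes.decode("utf-8").splitlines():
        if not line.strip():
            continue  # ignore blank lines
        record = json.loads(line)
        # Every attribute becomes {"S": "<value as text>"}, including numbers.
        items.append({key: {"S": str(value)} for key, value in record.items()})
    return items


# Stand-in for obj['Body'].read(): the two records from the question as bytes.
sample = (
    b'{"columnwithoutname":"structureet","nofinesset":810001792,"nofinessej":810001784}\n'
    b'{"columnwithoutname":"structureet","nofinesset":670797117,"nofinessej":670010339}\n'
)
items = to_string_items(sample)
```

Each element of items has the shape the low-level client's put_item expects for its Item argument when all attributes are strings.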
Inside the expenses collection I have this Json:
{
"_id" : ObjectId("5ad0870d2602ff20497b71b8"),
"Hotel" : {}
}
I want to insert a document or another object if possible inside Hotel using Python.
My Python code:
from pymongo import MongoClient
client = MongoClient('localhost', 27017)
db = client['db']
collection_expenses = db ['expenses']
#insert
d = int(input('Insert how many days did you stay?: '))
founded_expenses = collection_expenses.insert_one({'days':d})
The code above inserts the document into the collection. What should I change to add the days inside the Hotel object?
Thanks in advance.
Instead of using insert_one, you may want to take a look at the save method, which is a little more permissive.
Assuming your document already exists in the collection:
[...]
expenses = db['expenses']
# Find your document
expense = expenses.find_one({})
expense["Hotel"] = {"days": d}
# This will either update the document or save it as a new one,
# depending on whether or not the dict already has an _id key
expenses.save(expense)
Note that find_one returns None if no such document exists, so in that case you may want to upsert a document, which save also lets you do.
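As an alternative to save, which newer PyMongo releases no longer provide, the same change can be expressed with update_one and the $set operator. Dot notation ("Hotel.days") sets a single field inside the embedded Hotel object without replacing the rest of it. The helper names below are hypothetical, and the collection argument is assumed to be a PyMongo Collection such as db['expenses'].

```python
def hotel_days_update(days):
    """Build the update document that sets only the 'days' field inside 'Hotel'."""
    return {"$set": {"Hotel.days": days}}


def set_hotel_days(collection, doc_filter, days):
    """Apply the update; with upsert=True a document is created if none matches."""
    # `collection` is assumed to be a pymongo Collection, e.g. db['expenses'].
    return collection.update_one(doc_filter, hotel_days_update(days), upsert=True)
```

With a real collection, a call would look like set_hotel_days(db['expenses'], {"_id": some_id}, d), where some_id stands for the _id of the document to change.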
I am trying to update an existing document by ID. My intention is to find the doc by its ID, replace its "firstName" with the new value coming in "json", and then save it back to the CouchDB database.
Here is my code:
def updateDoc(self, id, json):
    doc = self.db.get(id)
    doc["firstName"] = json["firstName"]
    doc_id, doc_rev = self.db.save(doc)
    print doc_id, doc_rev
    print "Saved"
# "json" is retrieved from the PUT request (request.json)
At self.db.save(doc) I get the exception "too many values to unpack".
I am using the Bottle framework, Python 2.7, and Couch Query.
How do I update the document by ID? What is the right way to do it?
In couchdb-python, the db.save(doc) method returns a tuple of _id and _rev. You're using couch-query, a somewhat different project that also has a db.save(doc) method, but it returns a different result. So your code should look like this:
def updateDoc(self, id, json):
    doc = self.db.get(id)
    doc["firstName"] = json["firstName"]
    doc = self.db.save(doc)
    print doc['_id'], doc['_rev']
    print "Saved"
I am posting a JSON object back to the server side and retrieving that information through a request. Right now this is my code for my views.py
@csrf_exempt
def save(request):
    if request.method == 'POST':
        rawdata = request.body
        JSONData = json.dumps(rawdata)
        return HttpResponse(rawdata)
When I return rawdata, my response looks like this:
[{"time_elapsed":"0","volts":"239.3","amps":"19.3","kW":"4.618","kWh":"0","session":"1"},...]
When I return JSONData, my response looks like this:
"[{\"time_elapsed\":\"0\",\"volts\":\"239.1\",\"amps\":\"20.8\",\"kW\":\"4.973\",\"kWh\":\"0\",\"session\":\"1\"},....]
Which response is better when trying to insert this data into a SQLite database using Python/Django?
Also, how would I start a loop for this? Do I need code like the following?
conn = sqlite3.connect('sqlite.db')
c = conn.cursor()
c.execute("INSERT STATEMENTS")
I assume I have to loop over the INSERT STATEMENTS portion of that code, but I don't have any key to work from. In my data, everything between {} is one row. How do I iterate through this array so that every {...data...} is inserted as a new row?
Here is how I eventually solved my problem. It was a matter of figuring out how to translate the JSON object to something python could recognize and then writing a simple loop to iterate through all the data that was produced.
@csrf_exempt
def save(request):
    if request.method == 'POST':
        rawdata1 = request.body
        # json.loads turns the JSON text into a list of dicts
        rawdata2 = json.loads(rawdata1)
        length = len(rawdata2)
        for i in range(0, length, 1):
            x = meterdata(time_elapsed=rawdata2[i]['time_elapsed'], volts=rawdata2[i]['volts'], amps=rawdata2[i]['amps'], kW=rawdata2[i]['kW'], kWh=rawdata2[i]['kWh'], session=rawdata2[i]['session'])
            x.save()
        return HttpResponse("Success!")
The big differences are using json.loads rather than json.dumps and, in the for loop, how the newly converted data is accessed. The first bracket specifies the row to look in, and the second specifies which item to look for. For the longest time I was trying to do data[0][0]. I hope this helps anyone who finds this in the future.
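For the plain sqlite3 approach the question asks about, the same idea applies: parse the body with json.loads, then insert one row per dict. A minimal sketch follows. The table name meterdata mirrors the model above, but the schema, the in-memory database, and the second sample record are assumptions made for illustration.

```python
import json
import sqlite3

# Illustrative request body: a JSON array in which each object is one row.
raw_body = (
    '[{"time_elapsed":"0","volts":"239.3","amps":"19.3","kW":"4.618","kWh":"0","session":"1"},'
    ' {"time_elapsed":"1","volts":"239.1","amps":"20.8","kW":"4.973","kWh":"0","session":"1"}]'
)

readings = json.loads(raw_body)  # list of dicts, one per row

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE meterdata "
    "(time_elapsed TEXT, volts TEXT, amps TEXT, kW TEXT, kWh TEXT, session TEXT)"
)

# executemany runs the INSERT once per dict; the :name placeholders
# are filled from the matching dictionary keys.
conn.executemany(
    "INSERT INTO meterdata VALUES (:time_elapsed, :volts, :amps, :kW, :kWh, :session)",
    readings,
)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM meterdata").fetchone()[0]
```

Using named placeholders means no explicit loop or positional indexing is needed; each dict in the list supplies the values for one row.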
If you need to store that data in a DB, it is probably best to create a model representing it and then a ModelForm associated with that model to handle your POST.
That way, saving the model to the DB is trivial, and serializing it as a JSON response looks something like this:
data = serializers.serialize('json',
                             YourModel.objects.filter(id=id),
                             fields=('list', 'of', 'fields'))
# current Django versions take content_type here; very old ones used mimetype
return HttpResponse(data, content_type='application/json')