how to insert specific json into sqlite database using python 3 - python

Using python 3, I want to download API data, which is returned as JSON, and then I want to insert only specific pieces of it (columns or fields or whatever?) into a sqlite database. So, here's what I've got and the issues I have:
Using Python's requests module:
##### import modules
import sqlite3
import requests
import json
headers = {
    'Authorization': 'ujbOsdlknfsodiflksdonosB4aA=',
    'Accept': 'application/json'
}
r = requests.get(
    'https://api.lendingclub.com/api/investor/v1/accounts/94837758/detailednotes',
    headers=headers
)
Okay, first issue is how I get the requested JSON data into something (a dictionary?) that python can use. Is that...
jason.loads(r.text)
Then I create the table into which I want to insert the specific data:
curs.execute('''CREATE TABLE data(
    loanId INTEGER NOT NULL,
    noteAmount REAL NOT NULL
)''')
No problem there...but now, even though the JSON data looks something like this (although there are hundreds of records)...
{
  "myNotes": [
    {
      "loanId": 11111,
      "noteId": 22222,
      "orderId": 33333,
      "purpose": "Debt consolidation",
      "canBeTraded": true,
      "creditTrend": "DOWN",
      "loanAmount": 10800,
      "noteAmount": 25,
      "paymentsReceived": 5.88,
      "accruedInterest": 12.1,
      "principalPending": 20.94
    },
    {
      "loanId": 11111,
      "noteId": 22222,
      "orderId": 33333,
      "purpose": "Credit card refinancing",
      "canBeTraded": true,
      "creditTrend": "UP",
      "loanAmount": 3000,
      "noteAmount": 25,
      "paymentsReceived": 7.65,
      "accruedInterest": 11.92,
      "principalPending": 19.76
    }
  ]
}
I only want to insert 2 data points into the sqlite database, the "loanId" and the "noteAmount". I believe inserting the data into the database will look something like this (but I know this is incorrect):
curs.execute('INSERT INTO data (loanId, noteAmount) VALUES (?,?)', (loanID, noteAmount))
But I am now at a total loss as to how to do that, so I guess I have 2 main issues: getting the downloaded data into something that Python can use, and then how exactly to insert specific data into the database from the object that holds the downloaded data. I'm guessing looping is part of the answer...but looping over what? Thanks in advance!

As the documentation says:
The sqlite3 module supports two kinds of placeholders: question marks
(qmark style) and named placeholders (named style).
Note that you can even insert all rows at once using executemany.
So in your case:
curs.executemany('INSERT INTO data (loanId, noteAmount) '
                 'VALUES (:loanId, :noteAmount)', json.loads(...)['myNotes'])
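Put together with the request from the question, a minimal end-to-end sketch might look like this (the database filename is an assumption, and error handling is omitted):
import json
import sqlite3
import requests

headers = {
    'Authorization': 'ujbOsdlknfsodiflksdonosB4aA=',
    'Accept': 'application/json'
}
r = requests.get(
    'https://api.lendingclub.com/api/investor/v1/accounts/94837758/detailednotes',
    headers=headers
)

# Parse the JSON response into a Python dict
data = json.loads(r.text)  # or simply r.json()

conn = sqlite3.connect('notes.db')  # assumed filename
curs = conn.cursor()
curs.execute('''CREATE TABLE IF NOT EXISTS data(
    loanId INTEGER NOT NULL,
    noteAmount REAL NOT NULL
)''')

# executemany with named placeholders picks out just the two fields
# it needs from each record in the list of dicts
curs.executemany('INSERT INTO data (loanId, noteAmount) '
                 'VALUES (:loanId, :noteAmount)', data['myNotes'])
conn.commit()
conn.close()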

First off, it's js = json.loads(r.text), so you're very close.
Next, if you want to insert just the loanId and noteAmount fields of each record, then you'll need to loop and do something like
for record in js['myNotes']:
    curs.execute('INSERT INTO data (loanId, noteAmount) VALUES (?,?)',
                 (record['loanId'], record['noteAmount']))
If you play with it a bit, you could coerce the JSON into one big INSERT call.
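For instance, here is a rough sketch of that (assuming js and curs from above; note that SQLite caps the number of bound parameters per statement, so executemany is usually the simpler choice):
records = js['myNotes']

# Build a single INSERT with one (?,?) group per record
placeholders = ','.join(['(?,?)'] * len(records))
values = []
for record in records:
    values.extend([record['loanId'], record['noteAmount']])

curs.execute('INSERT INTO data (loanId, noteAmount) VALUES ' + placeholders, values)
# remember to commit on the connection afterwards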

Related

How to replace JSON data in database?

The JSON data below is stored in the field "JsonData" of the table "Profiles". In this JSON data, I need to replace the value of "Name" with "Other name" using SQLite in Python.
{"Id":"jwefwawlct6hlb6vs2ekotettc1dxvfv00d238jmbupfr1fnrz","Name":"CarlRisinger20409#outlook.com,"SaveType":1,"IdOnClould":"j0ZyVflWPD"}
I executed SELECT JSON_REPLACE(JsonData, "$.Name", "Other name") FROM "Profiles" WHERE name = "CarlRisinger20409#outlook.com" in SQLite.
It showed {"Id":"jwefwawlct6hlb6vs2ekotettc1dxvfv00d238jmbupfr1fnrz","Name":"Other name","SaveType":1,"IdOnClould":"j0ZyVflWPD"}, but the change is not saved to the database.
Please let me know of a method to replace JSON values in the database with Python. Thank you.
Since it seems you are doing this in Python and you did not show the code, I suggest checking whether you commit the changes:
con.commit()
check out the sample code
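For illustration, here is a rough sketch of the whole flow (assuming the replacement is applied with an UPDATE, since a plain SELECT only returns the modified value; the database filename is a placeholder, and Name is assumed to be a real column, as in the question's WHERE clause):
import sqlite3

con = sqlite3.connect('profiles.db')  # placeholder filename
cur = con.cursor()

# Apply json_replace to the stored row, then persist the change
cur.execute(
    "UPDATE Profiles "
    "SET JsonData = json_replace(JsonData, '$.Name', ?) "
    "WHERE Name = ?",
    ("Other name", "CarlRisinger20409#outlook.com"),
)
con.commit()  # without this, the change is not written to the database
con.close()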

Postgres update returning value giving string instead of json

I am attempting to update a json column in Postgres (this is a bulk update using execute_values). I am receiving a json object via an API. I insert the entire object into one column that is classified as a json column:
CREATE TABLE my_table(
    id SERIAL PRIMARY KEY,
    event_data json NOT NULL default '{}'::JSON,
    createdt timestamp NOT NULL DEFAULT now()
);
My update script looks like this:
UPDATE my_table AS t
SET event_data = e.event_data::json
FROM (VALUES %s) AS e(id, event_data)
WHERE t.id = e.id
RETURNING *
I do a json.dumps on all the JSON beforehand:
event_list.append([event['id'], json.dumps(event['data'])])
Once I get the completed rows I handle the data as such:
return json.loads(json.dumps(update_data, default=date_converter))
This all works properly when doing a straight insert into the json column: I dump the values before the insert and then do the json.dumps/loads on the returned rows. Everything works fine except the update method.
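For context, date_converter is not shown in the question; a typical helper of that kind (an assumption, not the asker's actual code) would serialize dates and datetimes to ISO strings:
import datetime

def date_converter(obj):
    # Hypothetical json.dumps default= hook: turn date/datetime values
    # into ISO-8601 strings so they are JSON serializable
    if isinstance(obj, (datetime.date, datetime.datetime)):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")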
Here is how the data is returned via the api on the update:
[
    {
        "id": 170152,
        "event_data": "{\"commenttxt\": \"Test comment\", \"descrtxt\": \"HELLO WORLD\", \"eventcmpltflg\": false, \"eventcmpltontmflg\": false, \"active\": true}",
        "createdt": "2021-03-18T08:34:07Z"
    }
]
And this is how I receive it when doing an insert:
[
    {
        "id": 170152,
        "event_data": {
            "commenttxt": "Test comment",
            "descrtxt": "Test descr",
            "eventcmpltflg": false,
            "eventcmpltontmflg": false,
            "active": true
        },
        "createdt": "2021-03-18T08:34:07Z"
    }
]
If I remove the json.dumps in the event_list.append section I get the error "Can't adapt type of dict".
For some context, I am not replacing individual elements inside the json; I am updating the entire column with a new set of json. I use a different table for tracking changes for historical/audit trails of what has changed. I use a json column because different teams use different values as their needs differ, so rather than using a table with a million columns to handle different teams, json seemed the best way to manage it.
I appreciate any help.
OK, so I found the solution. It turns out that because I was just using RETURNING *, Postgres was returning the already-dumped values I supplied instead of reading from the table row directly. I had to modify the SQL accordingly:
UPDATE my_table AS t
SET event_data = e.event_data::json
FROM (VALUES %s) AS e(id, event_data)
WHERE t.id = e.id
RETURNING t.*
So basically, in my RETURNING I had to specify which table the data came from, and since I aliased my table as "t", it had to be t.* (or, for specific columns, t.column_name).
I had assumed it would automatically return the data coming from the table and not from the pseudo-table created by the FROM clause.
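For reference, a rough sketch of how the fixed query fits together with psycopg2's execute_values (the connection string and the events payload are assumptions, not from the question):
import json
import psycopg2
from psycopg2.extras import execute_values

conn = psycopg2.connect("dbname=mydb user=me")  # assumed connection string

update_sql = """
    UPDATE my_table AS t
    SET event_data = e.event_data::json
    FROM (VALUES %s) AS e(id, event_data)
    WHERE t.id = e.id
    RETURNING t.*
"""

# events stands in for the parsed API payload
event_list = [[event['id'], json.dumps(event['data'])] for event in events]

with conn, conn.cursor() as cur:
    # fetch=True collects the rows produced by RETURNING t.*
    updated_rows = execute_values(cur, update_sql, event_list, fetch=True)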

Way to automatically create SnowFlake table based on inferred field types from API Endpoint? (Python)

Say I have a dataframe that has a row like:
{'ID':'123245','Comment':'This is my longer comment','Tax':1.07,'Units':2.0}
Is there a way in Python to do something like:
max([len(str(i)) for i in set(df['Comment'])])
And infer the max varchar length and other metadata, from which I could then construct a SQL query to create that table (in my case, for Snowflake)?
Since it would take additional logic not mentioned (e.g. try to cast as int, float, datetime, etc.), perhaps this is commonly done in an existing library.
Right now, for each endpoint, it takes me some time to manually check the fields and infer how to create each table in Snowflake. I would like to automate this process.
Of course, one aspect of automating this without something more sophisticated like a library is that the max field lengths observed now (such as a comment that's 199 characters long) will likely soon be violated by future inputs, unless they are rounded up to a 'max' varchar, for example by telling such an algorithm a minimum varchar size to use when it can't convert a field to float/int/date/etc.
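To make the idea concrete, here is a rough sketch of that kind of naive inference (the type checks and the minimum-VARCHAR padding rule are assumptions, and the column names come from the example row):
import pandas as pd

def infer_snowflake_type(series: pd.Series, min_varchar: int = 255) -> str:
    # Naive mapping from pandas dtypes to Snowflake column types
    if pd.api.types.is_integer_dtype(series):
        return "NUMBER(38,0)"
    if pd.api.types.is_float_dtype(series):
        return "FLOAT"
    if pd.api.types.is_datetime64_any_dtype(series):
        return "TIMESTAMP_NTZ"
    # Fall back to VARCHAR sized by the longest observed value,
    # rounded up to a minimum so future rows are less likely to overflow
    max_len = max((len(str(v)) for v in series.dropna()), default=0)
    return f"VARCHAR({max(max_len, min_varchar)})"

df = pd.DataFrame([{'ID': '123245', 'Comment': 'This is my longer comment',
                    'Tax': 1.07, 'Units': 2.0}])
columns_sql = ", ".join(f'"{col}" {infer_snowflake_type(df[col])}' for col in df.columns)
print(f"CREATE TABLE my_table ({columns_sql})")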
First off, as mentioned in the Snowflake docs, explicitly setting the maximum length of a VARCHAR column has no impact on performance and storage, so don't bother with that.
Regarding your general question, you can use their native Python connector to simply upload the DataFrame to your environment. Matching Python types to Snowflake types is done automatically.
If you want to only create the table without inserting data, upload df.iloc[:0]. And if you want to get the create table SQL, you can use get_ddl. Below is an example implementation.
import pandas as pd
import snowflake.connector
from snowflake.connector.pandas_tools import pd_writer
from snowflake.sqlalchemy import URL
import sqlalchemy

credentials = {**your_snowflake_credentials}

# Create example DataFrame
data = {
    "ID": "123245",
    "COMMENT": "This is my longer comment",
    "TAX": 1.07,
    "UNITS": 2,
}
df = pd.DataFrame([data])

# Upload empty DataFrame
df.iloc[:0].to_sql(
    "test_table",
    sqlalchemy.create_engine(URL(**credentials)),
    index=False,
    method=pd_writer,
)

# Retrieve the CREATE TABLE statement and drop the temporary table
# (if you really want to)
sql = "select get_ddl('table', 'test_table')"
with snowflake.connector.connect(**credentials) as connection:
    with connection.cursor() as cursor:
        create_table_sql = cursor.execute(sql).fetchone()[0]
        cursor.execute("drop table test_table")
print(create_table_sql)
Output:
CREATE OR REPLACE TABLE TEST_TABLE (
    ID VARCHAR(16777216),
    COMMENT VARCHAR(16777216),
    TAX FLOAT,
    UNITS NUMBER(38,0)
);

insert new field in mongodb database

I'm a beginner with MongoDB and PyMongo, and I'm working on a project where I have a Students MongoDB collection. What I want is to add a new field, specifically the address of a student, to each element in my collection (the field is obviously added everywhere as null and will be filled in by me later).
However, when I try using this specific example to add a new field, I get the following syntax error:
client = MongoClient('mongodb://localhost:27017/') #connect to local mongodb
db = client['InfoSys'] #choose infosys database
students = db['Students']
students.update( { $set : {"address":1} } ) #set address field to every column (error happens here)
How can I fix this error?
You are using the update operation in the wrong manner. The update operation has the following syntax:
db.collection.update(
    <query>,
    <update>,
    <options>
)
The main parameter <query> is not mentioned at all. It has to be at least an empty document like {}. In your case the following query will work:
db.students.update(
    {},                       // To update all the documents.
    {$set: {"address": 1}},   // Update the address field.
    {multi: true}             // To do multiple updates, otherwise MongoDB will just update the first matching document.
)
So, in Python, you can use update_many to achieve this. It will look like:
students.update_many(
    {},
    {"$set": {"address": 1}}
)
You can read more about this operation here.
The previous answer here is spot on, but it looks like your question may relate more to PyMongo and how it manages updates to collections. https://pymongo.readthedocs.io/en/stable/api/pymongo/collection.html
According to the docs, it looks like you may want to use the update_many() function. You will still need to pass your query (all documents, in this case) as the first argument, and the second argument is the operation to perform on all matching records.
client = MongoClient('mongodb://localhost:27017/') #connect to local mongodb
db = client['InfoSys'] #choose infosys database
students = db['Students']
students.update_many({}, {"$set": {"address": 1}})
I solved my problem by iterating through every element in my collection and inserting the address field into each one.
cursor = students.find({})
for student in cursor:
    students.update_one(student, {'$set': {'address': '1'}})

how can I make a mongodb query (or find()) with a variable using python?

I need to transfer data from a MySQL database to MongoDB.
I have a MySQL query that returns some data:
SELECT data FROM table where data BETWEEN r1 AND r2
which I stored in a list.
So my problem is:
when I try to find the data in MongoDB (where the data already is), I do this:
datamongo = collection.find({"data" : data[x]})
and the result is nothing, literally.
I need to create a loop over the list and search for every value from the list in MongoDB.
I tried find() and find_one(), but neither of them works.
BUT everything works if I put a constant instead of a variable.
I hope someone can help me.
Here comes the regex part. You need to use that value as explained below:
import re
collection.find({"data": re.compile(data[x], re.IGNORECASE)})
Or you can also use
collection.find( { 'data' : { '$regex' : data[x], '$options' : 'i' } } )
Note the re.IGNORECASE (and the 'i' in '$options'); this is just for case-insensitive comparison. Remove it if you want a strict comparison.
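For illustration, a minimal usage sketch in Python (the connection string and database/collection names are assumptions, not from the question):
import re
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')  # assumed connection string
collection = client['mydb']['mycollection']         # assumed database/collection names

data = ['value1', 'value2']  # the list previously filled from the MySQL query

# Loop over the list and query MongoDB for each value
for x in range(len(data)):
    # str() because $regex expects a string pattern
    for doc in collection.find({'data': {'$regex': str(data[x]), '$options': 'i'}}):
        print(doc)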
Hope it helps.
Thanks
