I have a text field in an SQLite database table that stores JSON. I want my Python script to query all rows and combine the JSON strings into a single JSON array (or string, since I'm going to send it through a REST API).
What is the best approach for doing so? Note that there will be thousands of rows.
Thanks!
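A minimal sketch of one way to do this, assuming a hypothetical table `events` with a TEXT column `payload` (adjust the query to your schema): parse each stored JSON string, collect the results into one Python list, and serialize once at the end.

```python
import json
import sqlite3

def rows_to_json(conn, query="SELECT payload FROM events"):
    """Parse each row's stored JSON string and return one JSON array string.
    The table/column names in the default query are hypothetical."""
    # Each row comes back as a 1-tuple; json.loads turns the string into a Python object.
    combined = [json.loads(payload) for (payload,) in conn.execute(query)]
    # Serialize the whole list once, ready to send in a REST response body.
    return json.dumps(combined)
```

For thousands of rows this stays cheap because the list holds parsed objects, and `json.dumps` is called only once.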
I'm trying to create a Spark job that can read in thousands of JSON files, perform some actions, then write out to file (S3) again.
It is taking a long time and I keep running out of memory. I know that Spark tries to infer the schema if one isn't given. The obvious thing to do would be to supply the schema when reading. However, the schema changes from file to file depending on many factors that aren't important here. There are about 100 'core' columns that appear in all files, and these are the only ones I want.
Is it possible to write a partial schema that only reads the specific fields I want into spark using pyspark?
First, it is recommended to use a JSONL file in which each line contains a single JSON record. In general, you can read just a specific set of fields from a big JSON document, but that should not be considered Spark's job. You should have an initial method that converts your JSON input into an object of a serializable datatype, and feed that object into your Spark pipeline.
Passing the schema is not an appropriate design here and just makes the problem worse. Instead, define a single method that extracts the specific fields after loading the data from the files. You can use the following link to find out how to extract specific fields from a JSON string in Python: How to extract specific fields and values from a JSON with python?
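The extraction step the answer describes can be sketched like this (the three field names below are hypothetical stand-ins for your ~100 core columns): parse each record and keep only the core fields, defaulting to `None` where a file happens to lack one, so every record comes out with a uniform shape.

```python
import json

# Hypothetical core field names; replace with your real ~100 core columns.
CORE_FIELDS = ["id", "timestamp", "value"]

def extract_core(json_line):
    """Parse one JSON record and keep only the core fields.
    Fields missing from this particular file become None."""
    record = json.loads(json_line)
    return {field: record.get(field) for field in CORE_FIELDS}
```

Because the output dicts always have the same keys, the downstream Spark pipeline no longer has to reconcile differing schemas across files.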
I need to import JSON data from an API into a MySQL database using Python. I can get the JSON data in the Python script, but I have no idea how to insert this data into a MySQL database.
Finally did it. I saved the JSON file locally first, then parsed it using key values in a for loop, and lastly ran a query to insert into the MySQL table.
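A sketch of that approach, with hypothetical field and table names (`id`, `name`, `price`, `products`) standing in for whatever the API actually returns: parse the JSON into tuples, then hand them to a parameterized `INSERT` so values are escaped safely.

```python
import json

def rows_from_json(payload):
    """Parse a JSON array of records into tuples ready for a
    parameterized INSERT. Field names here are hypothetical."""
    records = json.loads(payload)
    return [(r["id"], r["name"], r["price"]) for r in records]

# With mysql-connector-python installed, the insert itself would look like:
# import mysql.connector
# conn = mysql.connector.connect(host="localhost", user="u", password="p", database="db")
# cur = conn.cursor()
# cur.executemany(
#     "INSERT INTO products (id, name, price) VALUES (%s, %s, %s)",
#     rows_from_json(payload),
# )
# conn.commit()
```

Using `executemany` with placeholders avoids both SQL injection and string-formatting bugs that come from building the query by hand.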
Serialize the JSON to a string and insert the string into the DB, then deserialize it when you want to use it. I use this method, but I don't know if it is optimal.
I am trying to bulk insert data to MondoDB without overwriting existing data. I want to insert new data to the database if no match with unique id (sourceID). Looking at the documentation for Pymongo I have written some code but cannot make it work. Any ideas to what I am doing wrong?
db.bulk_write(UpdateMany({"sourceID"}, test, upsert=True))
db is the name of my database, sourceID is the unique ID of the documents that I don't want to overwrite in the existing data, and test is the array that I am trying to insert.
Either I don't understand your requirement, or you misunderstand the UpdateMany operation. Per the documentation, this operation modifies existing documents (those matching the filter) and, only if no documents match and upsert=True, inserts a new document. Are you sure you don't want to use the insert_many method?
Also, in your example, the first parameter, which should be the filter for the update, is not a valid query; it has to be in the form {"key": "value"}.
If we have a JSON file that stores all of our database content, such as table names, rows, and columns, how can we use a DB-API object to insert/update/delete data from the JSON file into a database such as SQLite or MySQL? Or please share if you have a better idea of how to handle this. People say it is good to save database data in JSON format, which makes it much more convenient to work with the database in Python.
Thanks so much! Please advise!
There's no magic way, you'll have to write a Python program to load your JSON data in a database. SQLAlchemy is a good tool to make it easier.
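As a plain DB-API sketch (no SQLAlchemy), assuming the JSON maps table names to lists of row dicts, e.g. `{"users": [{"id": 1, "name": "Ann"}]}`, with column names taken from the first row of each table:

```python
import json
import sqlite3

def load_json_into_sqlite(json_text, conn):
    """Create one table per top-level key and insert its rows.
    Assumes the structure {"table": [{"col": value, ...}, ...]}."""
    data = json.loads(json_text)
    cur = conn.cursor()
    for table, rows in data.items():
        if not rows:
            continue
        cols = list(rows[0])
        placeholders = ", ".join("?" for _ in cols)
        cur.execute(f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(cols)})")
        # Parameterized executemany keeps the values safely escaped.
        cur.executemany(
            f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})",
            [tuple(r[c] for c in cols) for r in rows],
        )
    conn.commit()
```

SQLAlchemy would handle type mapping and dialect differences (SQLite vs MySQL) for you, which is why it's recommended once the data gets less uniform than this.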
I know that in JavaScript there is the stringify command, but is there something like this in Python for Pyramid applications? Right now I have a view callable that takes an uploaded STL file and parses it into a format like this: data = [[[x1,x2,x3],...],[[v1,v2,v3],...]]. How can I convert this into a JSON string so that it can be stored in an SQLite database? Can I insert the JavaScript stringify command into my views.py file? Is there an easier way to do this?
You can use the json module to do this:
import json
data_str = json.dumps(data)
There are other array representations that can be stored in a database as well (see pickle).
However, if you're actually constructing a database, you should know that it's considered a violation of basic database principles (first normal form) to store multiple values in a single cell of a relational database. What you should do is decompose the array into rows (and possibly separate tables) and store a single value in each "cell". That will allow you to query and analyze the data using SQL.
If you're not trying to build an actual database (if the array is completely opaque to your application and you'll never want to search, sort, aggregate, or report by the values inside the array), you don't need to worry so much about normal form, but you may also find that you don't need the overhead of an SQL database.
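The normalized alternative described above can be sketched like this, with hypothetical table and column names: one row per vertex instead of one JSON blob per upload.

```python
import sqlite3

# Hypothetical schema: each vertex of an upload becomes its own row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vertices (upload_id INTEGER, idx INTEGER, x REAL, y REAL, z REAL)"
)

data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # stand-in for the parsed STL data
conn.executemany(
    "INSERT INTO vertices VALUES (1, ?, ?, ?, ?)",
    [(i, *v) for i, v in enumerate(data)],
)

# Now individual coordinates are queryable with plain SQL:
rows = conn.execute("SELECT x FROM vertices ORDER BY idx").fetchall()
```

With the JSON-blob approach, a question like "which uploads contain a vertex with x > 100?" requires deserializing every blob in Python; here it's a single WHERE clause.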
You can also use cjson, which is faster than the standard json library:
import cjson
json_str = cjson.encode(data)