How to save an SVG image in MySQL (from Python 3.6) - python

How can one save a large SVG image in a MySQL table?
My problem is that my SVGs are up to 200K characters, which appears to be too much to save in my table.
When trying to save the column as TEXT, Python (using Python 3.6 with Anaconda) and SQLAlchemy tell me the following:
sqlalchemy.exc.DataError: (pymysql.err.DataError) (1406, "Data too long for column 'cantons_svg' at row 27") [SQL: 'INSERT INTO ...]

I encountered this problem today when I tried to store videos in TiDB. I am using Flask as the backend framework and SQLAlchemy as the ORM, connecting to my database with the PyMySQL connector.
The log is as follows:
sqlalchemy.exc.DataError: (pymysql.err.DataError) (1406, "Data too long for column 'video' at row 1")
[SQL: INSERT INTO videos (user_id, token_id, video) VALUES (%(user_id)s, %(token_id)s, %(video)s)]
I found little advice about this situation; among it, one suggestion was to check whether SQLAlchemy supports self-defined storage types. That seems quite complicated. (If anyone finds a doc or something that gives clear guidance on this, please tell me.)
As for me, I just used SQLAlchemy's BLOB type to initialize the database, and then ran
alter table videos modify column video LONGBLOB DEFAULT NULL;
to manually change the column type. This works fine for me.
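If you would rather avoid the manual ALTER TABLE, SQLAlchemy's MySQL dialect exposes the larger column types directly, so the table can be created with a LONGTEXT (for SVG markup) or LONGBLOB column from the start. A minimal sketch, assuming hypothetical model, column, and connection names:

from sqlalchemy import create_engine, Column, Integer
from sqlalchemy.dialects.mysql import LONGTEXT, LONGBLOB
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class CantonMap(Base):  # hypothetical model name
    __tablename__ = "maps"
    id = Column(Integer, primary_key=True)
    cantons_svg = Column(LONGTEXT)   # up to 4 GB of text, plenty for a 200K-character SVG
    video = Column(LONGBLOB)         # binary payloads such as video files

engine = create_engine("mysql+pymysql://user:password@localhost/mydb")  # hypothetical DSN
Base.metadata.create_all(engine)

Declaring the dialect type up front keeps the schema reproducible, so a fresh create_all() will not silently fall back to a 64 KB TEXT/BLOB column.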

Related

BigQuery data not getting inserted

I'm using the Python client library to insert data into a BigQuery table. The code is as follows:
from google.cloud import bigquery

client = bigquery.Client(project_id)
errors = client.insert_rows_json(table=tablename, json_rows=data_to_insert)
assert errors == []
There are no errors, but the data is also not getting inserted.
Sample JSON rows:
[{'a':'b','c':'d'},{'a':'f','q':'r'}, ...]
What's the problem? There's no exception either.
The client.insert_rows_json method uses streaming inserts.
Inserting data into BigQuery with streaming inserts causes latency in the table preview on the BigQuery console.
The data does not appear there immediately, so
you need to query the table to confirm that the data was inserted.
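For example, a quick count query confirms whether the streamed rows have landed even while the console preview still looks empty. A minimal sketch, assuming the same client as in the question and a hypothetical fully qualified table id:

# Streamed rows are usually queryable right away, even while the preview lags behind.
rows = client.query(
    "SELECT COUNT(*) AS n FROM `my-project.my_dataset.my_table`"   # hypothetical table id
).result()
print(list(rows)[0].n)   # a non-zero count confirms the inserts landed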
It can be one of two possible situations:
your data does not match the schema
your table is freshly created, and the update is just not yet available
References:
Related GitHub issue
Data availability
I got the answer to my question. The problem was that I was inserting data for one extra column that was not in the table. I found a hack to figure out whether data is failing to insert into a BigQuery table.
Change the data to newline-delimited JSON, with the keys as the column names and the values as the values you want for that particular column.
bq --location=US load --source_format=NEWLINE_DELIMITED_JSON dataset.tablename newline_delimited_json_file.json
Run this command in your terminal and see if it throws any errors. If it throws an error, it's likely that something is wrong with your data or table schema.
Change the data/table schema as per the error and retry inserting it via Python.
It would be better if the Python API threw an error/exception the way the terminal does; that would be helpful.
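A similar check can be done from Python itself, because load jobs (unlike streaming inserts) raise an exception with the job's error details when the data does not match the schema. A rough sketch, assuming the same client and data_to_insert as above and a hypothetical table id:

from google.cloud import bigquery

# load_table_from_json converts the list of dicts to newline-delimited JSON itself
job = client.load_table_from_json(
    data_to_insert,                        # the same rows passed to insert_rows_json
    "my-project.my_dataset.my_table",      # hypothetical table id
)
try:
    job.result()                           # blocks until done; raises if the job failed
except Exception:
    print(job.errors)                      # lists the offending rows/fields
    raise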

Creating Create Table statements for Redshift by reading Oracle DDL statement in python

I have 5 tables in an Oracle database. I need to create similar structures for them in AWS Redshift. I am using cx_Oracle to connect to Oracle and dump the DDL into a CSV file. But changing that DDL for each datatype in Python to make it run in Redshift is turning out to be a very tedious process.
Is there any easy way to do this in Python? Is there any library or function to do it seamlessly?
PS: I tried to use the AWS Schema Conversion Tool for this. The tables got created in Redshift, but with a glitch: every datatype got doubled in Redshift.
For example: varchar(100) in Oracle became varchar(200) in Redshift.
Has anyone faced a similar issue before with SCT?
The cx_OracleTools project, and specifically the DescribeObject tool within that project, can extract the DDL from an Oracle database. You may be able to use that.
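If you would rather stay inside your own script, Oracle's DBMS_METADATA.GET_DDL can pull the CREATE TABLE statement through cx_Oracle, after which a simple type substitution gets you part of the way to Redshift syntax. A rough sketch with hypothetical connection details and an intentionally incomplete type map (Oracle storage clauses will still need to be stripped by hand or with further rules):

import cx_Oracle

conn = cx_Oracle.connect("user/password@host:1521/service")  # hypothetical DSN
cur = conn.cursor()

# Very rough Oracle -> Redshift substitutions; extend to match your columns.
TYPE_MAP = {
    "VARCHAR2": "VARCHAR",
    "NUMBER": "NUMERIC",
    "CLOB": "VARCHAR(65535)",
    "DATE": "TIMESTAMP",
}

def oracle_ddl(table_name):
    # GET_DDL returns a CLOB, so read() it into a plain string
    cur.execute("SELECT DBMS_METADATA.GET_DDL('TABLE', :t) FROM dual", t=table_name)
    return cur.fetchone()[0].read()

def to_redshift(ddl):
    for oracle_type, redshift_type in TYPE_MAP.items():
        ddl = ddl.replace(oracle_type, redshift_type)
    return ddl

for table in ["CUSTOMERS", "ORDERS"]:  # hypothetical table names
    print(to_redshift(oracle_ddl(table)))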

Importing data from multiple related tables in mySQL to SQLite3 or postgreSQL

I'm updating from an ancient language to Django. I want to keep the data from the old project in the new one.
But the old project uses MySQL, and I'm currently using SQLite3 in development. I've read that PostgreSQL is the most capable, so the first question is: is it better to set up PostgreSQL while still in development, or is it an easy transition from SQLite3 to PostgreSQL later?
And for the data in the old project: I am reworking the table structure from the old MySQL structure, since it has many related tables, and those relations are handled in Django with ForeignKey and ManyToMany fields (the same in SQLite3 and PostgreSQL, I guess).
So I'm thinking about how to transfer the data. It's not really much data, maybe 3,000-5,000 rows.
The problem is that I don't want to keep the same table structure, so a straight import would be a terrible idea. I want the sweet functionality provided by SQLite3/PostgreSQL.
One idea I had was to join all the data and create a nested JSON document for each post, and then define which table each part goes into so the relations are kept.
But this is just my guessing, so I'm asking you: is there a proper way to do this?
Thanks!
Better to create the PostgreSQL database first, then write a Python script that takes the data from the MySQL database and imports it into the PostgreSQL database.
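One way to do that while keeping the new relational structure is to read the old MySQL rows directly and create objects through the Django ORM, so the ForeignKey/ManyToMany relations are built as you go. A minimal sketch, meant to run inside a Django shell or management command; the legacy table and the Author/Post models are hypothetical:

import pymysql
from myapp.models import Author, Post  # hypothetical Django app and models

# Connection to the legacy MySQL database (hypothetical credentials)
legacy = pymysql.connect(host="localhost", user="olduser", password="secret",
                         database="legacy_db",
                         cursorclass=pymysql.cursors.DictCursor)

with legacy.cursor() as cur:
    cur.execute("SELECT title, body, author_name FROM old_posts")  # hypothetical table
    for row in cur.fetchall():
        # get_or_create rebuilds the relation without relying on the old surrogate keys
        author, _ = Author.objects.get_or_create(name=row["author_name"])
        Post.objects.create(title=row["title"], body=row["body"], author=author)

Because the Django ORM issues the inserts, the same script works unchanged whether the new project is on SQLite3 or PostgreSQL.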

Dynamically add columns to Exsisting BigQuery table

Background
I am loading files from my local machine to BigQuery. Each file has a variable number of fields, so I am using autodetect=True when running the load job.
The issue is that when the load job is run for the first time and the destination table doesn't exist, BigQuery creates the table by inferring the fields present in the file, and that becomes the new table's schema.
Now, when I run a load job with a different file which contains some extra fields (e.g. "Middle Name": "xyz"), BigQuery throws an error saying the field doesn't exist in the table.
From this post: BigQuery : add new column to existing tables using python BQ API, I learnt that columns can be added dynamically. However, what I don't understand is:
Query
How will my program know that the file being uploaded contains extra fields and that a schema mismatch will occur? (Not a problem if the table doesn't exist, because a new table will be created.)
If my program can somehow infer the extra fields present in the file being uploaded, I could add those columns to the existing table and then run the load job.
I am using the Python BQ API.
Any thoughts on how to automate this process would be helpful.
You should check the schema update options. There is an option named ALLOW_FIELD_ADDITION that will help you.
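With the Python client, that option is passed on the load job configuration, so BigQuery itself adds any new columns during an appending load and no manual diff is needed. A minimal sketch with a hypothetical table id and file name, assuming the files are newline-delimited JSON:

from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=True,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    # Let the load job extend the table schema with any new fields it finds
    schema_update_options=[bigquery.SchemaUpdateOption.ALLOW_FIELD_ADDITION],
)

with open("batch_with_extra_fields.json", "rb") as f:  # hypothetical file
    job = client.load_table_from_file(
        f, "my-project.my_dataset.my_table", job_config=job_config
    )
job.result()  # raises if the load still fails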
A naive solution would be:
1. Get the target table's schema using
service.tables().get(projectId=projectId, datasetId=datasetId, tableId=tableId)
2. Generate the schema of the data in your file.
3. Compare the schemas (a kind of "diff") and then add to the target table those columns which are extra in your data schema (see the sketch below).
Any better ideas or approaches would be highly appreciated!
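A rough sketch of that diff approach with the current google-cloud-bigquery client, assuming the file is newline-delimited JSON and that every new field can be added as a nullable string (both assumptions are mine):

import json
from google.cloud import bigquery

client = bigquery.Client()
table = client.get_table("my-project.my_dataset.my_table")  # hypothetical table id
existing = {field.name for field in table.schema}

# Collect the field names that appear anywhere in the file
file_fields = set()
with open("batch_with_extra_fields.json") as f:  # hypothetical file
    for line in f:
        file_fields.update(json.loads(line).keys())

extra = file_fields - existing
if extra:
    # BigQuery only allows adding new (nullable) columns, never removing them
    table.schema = list(table.schema) + [
        bigquery.SchemaField(name, "STRING", mode="NULLABLE") for name in sorted(extra)
    ]
    client.update_table(table, ["schema"])
# ...then run the load job as usual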

Select Data from Table and Insert into a different DB

I'm using Python and psycopg2 to remotely query some PostgreSQL databases, and I'm trying to figure out the best way to select the data I need from a remote table and insert it into a table in a separate database (on the local application server).
Most of what I've read has directed me to avoid executemany and look toward COPY operations, but I'm unsure how to apply this to a specific SELECT statement as opposed to the entire table. Should I be headed this way, or am I completely off?
but I'm unsure how to implement this on a specific select statement as opposed to the entire table
COPY isn't limited to tables; you can use a query as the source as well. Check out the examples in the manual, which show how to use COPY to create a text file based on a query:
http://www.postgresql.org/docs/current/static/sql-copy.html#AEN59055
(3rd example)
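In psycopg2 the same idea works with copy_expert, streaming the result of COPY (SELECT ...) TO STDOUT from the remote connection into COPY ... FROM STDIN on the local one. A minimal sketch; the connection strings, the query, and the local_table column list are hypothetical:

import io
import psycopg2

remote = psycopg2.connect("host=remote.example.com dbname=source user=app")  # hypothetical
local = psycopg2.connect("host=localhost dbname=dest user=app")              # hypothetical

buf = io.StringIO()
with remote.cursor() as rcur:
    # COPY accepts an arbitrary query as its source, not just a table name
    rcur.copy_expert(
        "COPY (SELECT id, name FROM big_table WHERE updated_at > now() - interval '1 day') "
        "TO STDOUT WITH CSV",
        buf,
    )

buf.seek(0)
with local.cursor() as lcur:
    lcur.copy_expert("COPY local_table (id, name) FROM STDIN WITH CSV", buf)
local.commit()

For result sets too large to hold in memory, the same pattern works with a temporary file or a pipe instead of StringIO.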
Take a look at http://ryrobes.com/featured-articles/using-a-simple-python-script-for-end-to-end-data-transformation-and-etl-part-1/
Granted, this is pulling from Oracle and inserting into SQL Server, but the concepts should be the same.
