I am new to the world of Python and have some problems loading data from a CSV file into PostgreSQL.
I have successfully connected to my database from Python and created my table. But when I load the data from the CSV file into the created table, the table stays empty whether I use an INSERT or the COPY function, even after committing:
cur.execute('''COPY my_schema.sheet(id, length, width, change_d, change_h,change_t, change_a, name)
FROM '/private/tmp/data.csv' DELIMITER ',' CSV HEADER;''')
dbase.commit()
I am not sure what I am missing. Can anyone please help with this, or advise a better way to load CSV data using a Python script?
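It is hard to say without an error message, but one robust alternative is psycopg2's `copy_expert`, which streams the file from the client side, so the PostgreSQL server never needs read access to `/private/tmp/data.csv`. A minimal sketch (the table and column names are taken from the question; the connection string is a placeholder):

```python
# Stream the CSV from the client with psycopg2's copy_expert, so the file
# only needs to be readable by the Python process, not by the server.
COPY_SQL = (
    "COPY my_schema.sheet(id, length, width, change_d, change_h, "
    "change_t, change_a, name) FROM STDIN WITH (FORMAT csv, HEADER true)"
)

def load_csv(conn, csv_path):
    """Load csv_path into my_schema.sheet and commit."""
    with open(csv_path) as f, conn.cursor() as cur:
        cur.copy_expert(COPY_SQL, f)
    conn.commit()  # without this commit, the rows are rolled back

# usage (assumes a reachable database):
# import psycopg2
# conn = psycopg2.connect("dbname=mydb user=me")
# load_csv(conn, '/private/tmp/data.csv')
```

Also double-check that you are committing on the same connection the cursor came from, and querying the same database afterwards.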
I am trying to automate a job where files are written to GCS using data queried from BigQuery (BQ).
I have a BigQuery table and I need to export files to GCS, named according to a particular field.
field1 file_name
w filea
x fileb
y filec
z filed
So in this case, I need to produce 4 CSV files: filea.csv, fileb.csv, filec.csv, filed.csv.
Is there a way I can automate this with Python, so that if a new file name shows up (e.g. fileE) in the BQ table, the job exports it to GCS with the proper name fileE.csv?
Thank you!
I tried exporting them one by one using BQ data export and it worked, but I was looking for a Python solution.
Thanks
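One way to sketch this in Python is to loop over the distinct file names in the table and run a BigQuery `EXPORT DATA` statement for each, writing the matching rows to GCS. The project/dataset/table and bucket names below are placeholders, not from the question:

```python
# Build one EXPORT DATA statement per file name. Because the driver loop
# queries DISTINCT file_name on every run, a new name (e.g. fileE) is
# picked up and exported automatically the next time the job runs.
def build_export_sql(file_name, table="myproject.mydataset.mytable",
                     bucket="my-bucket"):
    return f"""
    EXPORT DATA OPTIONS(
      uri='gs://{bucket}/{file_name}*.csv',
      format='CSV',
      overwrite=true,
      header=true
    ) AS
    SELECT * FROM `{table}` WHERE file_name = '{file_name}'
    """

# usage (assumes google-cloud-bigquery is installed and credentials are set):
# from google.cloud import bigquery
# client = bigquery.Client()
# names = client.query(
#     "SELECT DISTINCT file_name FROM `myproject.mydataset.mytable`")
# for row in names:
#     client.query(build_export_sql(row.file_name)).result()
```

Note that the URI needs the `*` wildcard because BigQuery may split a large export across several files.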
I have some large (500+ MB) .CSV files that I need to import into a PostgreSQL database.
I am looking for a script or tool that helps me to:
Generate the CREATE TABLE SQL code for the columns, ideally inspecting the data in the .CSV file in order to choose the optimal data type for each column.
Use the header of the .CSV as the column names.
It would be perfect if such functionality existed in PostgreSQL or could be added as an add-on.
Thank you very much
You can use the open-source tool pgfutter to create a table from your CSV file.
GitHub link
PostgreSQL also has COPY functionality; however, COPY expects the table to already exist.
So I am building a database for a larger program and do not have much experience in this area of coding (mostly embedded-systems programming). My task is to import a large Excel file into Python. Since it is large, I am assuming I must convert it to a CSV, then truncate it by parsing and partitioning, then import it, to avoid crashing my computer. Once the file is imported, I must be able to extract/search specific information based on the column titles. There are other user-interactive aspects that are simply string-based, so not very difficult. As for the rest, I am getting the picture but would like a more efficient and specific design. Can anyone offer me guidance on this?
An Excel or CSV file can be read into Python using pandas. The data is stored as rows and columns in a structure called a DataFrame. To import data into such a structure, import pandas first and then read the CSV (with read_csv) or the Excel file (with read_excel) into a DataFrame.
import pandas as pd
df1 = pd.read_csv('excelfilename.csv')
This DataFrame structure is similar to a table, and you can perform joins of different DataFrames, grouping of data, etc.
I am not sure if this is what you need; let me know if you need any further clarifications.
I would recommend actually loading it into a proper database such as MariaDB or PostgreSQL. This will allow you to access the data from other applications, and it takes the load off of you for writing a database. You can then use an ORM if you would like to interact with the data, or simply use plain SQL from Python.
Read the CSV:
import sqlite3
import pandas as pd
df = pd.read_csv('sample.csv')
Connect to a database:
conn = sqlite3.connect("Any_Database_Name.db")  # if the db does not exist, this creates an Any_Database_Name.db file in the current directory
Store your table in the database:
df.to_sql('Some_Table_Name', conn)
Read a SQL query out of your database and into a pandas DataFrame:
sql_string = 'SELECT * FROM Some_Table_Name'
df = pd.read_sql(sql_string, conn)
I have a folder that contains a set of .txt files with the same structure.
The folder directory is E:\DataExport
It contains 1000 .txt files: text1.txt, text2.txt, ...
Each .txt file contains price data for one ticker. The data is updated daily. An example of one .txt file is below:
Ticker,Date/Time,Open,High,Low,Close,Volume
AAA,7/15/2010,19.581,20.347,18.429,18.698,174100
AAA,7/16/2010,19.002,19.002,17.855,17.855,109200
AAA,7/19/2010,19.002,19.002,17.777,17.777,104900
....
My question is, I want to:
Load all the files into a MySQL database through a line of code. I can do it one by one in the MySQL command line, but I do not know how to import all of them at once into one table in MySQL (I have already created the table).
Every day, when I get new data, how can I update only the new data into the table in MySQL?
I would like a solution using either Python or MySQL. I have tried to google and apply some solutions but without success; the data does not load into MySQL.
You could use the Python package pandas.
Read the data with read_csv into a DataFrame and use the to_sql method with the proper con and schema.
You will have to keep track of what has been imported. You could, for example, record in a file or in the database that you last imported the 54th line of the 1032nd file, and perform an update that reads the rest and imports it.
I have a lot of data in an Excel sheet. I used Python to read that data with xlrd and am now outputting all of it from Python. My question is: how do I take that data that I am outputting through Python and upload it to MongoDB? I understand that pymongo must be used, but I am not quite sure how to do it. Any help is greatly appreciated.
Let's assume you've read the tutorials but still don't get it...
You'll need to convert your xlrd data into a list of dictionaries, one dictionary for each row in your spreadsheet. Here's a clue: Python Creating Dictionary from excel data
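A minimal sketch of that conversion: zip each data row with the header row to get one dict per spreadsheet row. The xlrd calls that would produce `header` and `rows` are shown in the comments and are not run here:

```python
# import xlrd
# sheet = xlrd.open_workbook("data.xls").sheet_by_index(0)
# header = sheet.row_values(0)
# rows = [sheet.row_values(i) for i in range(1, sheet.nrows)]

def rows_to_dicts(header, rows):
    # One dictionary per row, keyed by the column titles in the header.
    return [dict(zip(header, row)) for row in rows]

# list_of_rows = rows_to_dicts(header, rows)
```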
Once you have the list of dictionaries/rows, make sure you have MongoDB running on your machine, then:
from pymongo import MongoClient

db = MongoClient().mydb  # lazily creates a database called 'mydb' on first write
for row_dict in list_of_rows:
    db.rows.insert_one(row_dict)  # saves each row in a collection called "rows" (save() was removed in PyMongo 4)