How to export table schema into a csv file - python

Basically, I want to export a Hive table's schema into a CSV file. I can create a dataframe and then show its schema, but I want to write that schema to a CSV file. It seems pretty simple, but it won't work.

In case you want to do it within the Hive console, this is how you do it:
hive>
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/user1/file1'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM tablename;
And then in Unix
[user1]$
cat file1/* > file1.csv
zip file1 file1.csv
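
If you are working from PySpark rather than the Hive CLI, here is a minimal sketch of writing a dataframe's schema (column names and types) to a CSV with the csv module. The table name, output path, and Hive-enabled SparkSession are assumptions; adjust them to your setup.
import csv
from pyspark.sql import SparkSession

# Assumed: a Hive-enabled SparkSession and a table called "tablename"
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
df = spark.table("tablename")

# Write one row per column: name and data type
with open("/tmp/schema.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["column_name", "data_type"])
    for field in df.schema.fields:
        writer.writerow([field.name, field.dataType.simpleString()])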

Related

How to copy from CSV file to PostgreSQL table with headers (including special characters) in CSV file?

I have 500 different CSV files in a folder.
I want to take a CSV file and import to Postgres table.
There is an unknown number of columns in each CSV, so I do not want to keep opening a CSV file, creating a table, and then importing it using \copy.
I know I can do this:
COPY users FROM 'user_data.csv' DELIMITER ';' CSV HEADER
However, the CSV file is something like:
user_id,5username,pas$.word
1,test,pass
2,test2,query
I have to convert this to Postgres, but Postgres does not allow a column name to start with a number, or special characters like . and $ in the column name.
I want the postgres table to look something like:
user_id ___5username pas______word
1 test pass
2 test2 query
I want to replace special characters with ___ and, if a column name starts with a number, prefix it with ___.
Is there a way to do this? I am open to a Python or Postgres solution.
If pandas is an option for you, try to:
Create data frames from the CSV files using .read_csv()
Save the created data frames into the SQL database with .to_sql()
You can also see my tutorial on pandas IO API.
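A minimal sketch of that approach is below; the connection string, target table name, and the exact renaming rule are assumptions, not part of the original question.
import re
import pandas as pd
from sqlalchemy import create_engine

# Assumed connection string and target table name -- adjust to your setup
engine = create_engine("postgresql://user:password@localhost/mydb")

df = pd.read_csv("user_data.csv")

def sanitize(name):
    # Replace each character Postgres dislikes with ___ and
    # prefix names that start with a digit
    name = re.sub(r"[^0-9a-zA-Z_]", "___", name)
    if name[0].isdigit():
        name = "___" + name
    return name

df.columns = [sanitize(c) for c in df.columns]
df.to_sql("users", engine, if_exists="replace", index=False)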

Python: Sqlite3 query output to .csv file

I would like to execute this query:
select datetime(date/1000,'unixepoch','localtime') as DATE, address as RECEIVED, body as BODY from sms;
And save its output to a .csv file in a specified directory. In the Ubuntu terminal it is usually far easier to give commands manually to save the output of the above query to a file, but I am not familiar with Python's sqlite3 module. I would like to know how to execute this query and save its output to a custom directory as a .csv file. Please help me out!
Quick and dirty:
import sqlite3
db = sqlite3.connect('database_file')
cursor = db.cursor()
cursor.execute("SELECT ...")
rows = cursor.fetchall()
# Iterate over rows and write your CSV
cursor.close()
db.close()
rows will be a list of all matching records, which you can then iterate over and write into your CSV file.
If you just want to make a CSV file, look at the csv module. The following page should get you going: https://docs.python.org/2/library/csv.html
You can also look at the pandas module to help create the file.
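Putting the two together, a rough sketch for this particular query might look like the following; the database filename and output path are assumptions.
import csv
import sqlite3

# Assumed database filename and output path -- adjust to your setup
conn = sqlite3.connect("database_file")
cursor = conn.cursor()
cursor.execute(
    "select datetime(date/1000,'unixepoch','localtime') as DATE, "
    "address as RECEIVED, body as BODY from sms"
)

with open("/path/to/output/sms.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([col[0] for col in cursor.description])  # header row
    writer.writerows(cursor.fetchall())

cursor.close()
conn.close()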

Insert data from files into SQLite database

I have an SQLite database on this form:
Table1
Column1 | Column 2 | Column 3 | Column 4
I want to populate this database with data stored in a few hundred .out files of this form, where every file has millions of rows:
value1;value2;value3;value4;
2value1;2value2;2value3;2value4;
... etc
Is there a fast way to populate the database with this data? One way would be to read the data line by line in Python and insert it, but there should probably be a faster way to just load the whole file.
Bash, SQLite, or Python, preferably.
SQLite has a .import command.
.import FILE TABLE Import data from FILE into TABLE
You can use it like this (in the shell):
for f in *.out
do
sqlite3 -separator ';' my.db ".import $f Table1"
done
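If you would rather stay in Python, a rough sketch using sqlite3's executemany is below; the file pattern, database name, and four-column layout are taken from the question, and the handling of the trailing ';' is an assumption about the file format.
import csv
import glob
import sqlite3

conn = sqlite3.connect("my.db")

for path in glob.glob("*.out"):
    with open(path, newline="") as f:
        # Lines end with ';', so csv.reader yields an empty last field; drop it
        rows = (row[:4] for row in csv.reader(f, delimiter=";"))
        conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?)", rows)

conn.commit()
conn.close()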

Reading a SQLite file using Python

I am working on an assignment in which we were provided a bunch of CSV files to work on and extract information from. I have successfully completed that part. As a bonus question, we have one SQLite file with a .db extension. I wanted to know if any module exists to convert such files to .csv or to read them directly.
In case such a method doesn't exist, I'll probably insert the file into a database and use the Python sqlite3 module to extract the data I need.
You can use the sqlite commandline tool to dump table data to CSV.
To export an SQLite table (or part of a table) as CSV, simply set the "mode" to "csv" and then run a query to extract the desired rows of the table.
sqlite> .header on
sqlite> .mode csv
sqlite> .once c:/work/dataout.csv
sqlite> SELECT * FROM tab1;
In the example above, the ".header on" line causes column labels to be
printed as the first row of output. This means that the first row of
the resulting CSV file will contain column labels. If column labels
are not desired, set ".header off" instead. (The ".header off" setting
is the default and can be omitted if the headers have not been
previously turned on.)
The line ".once FILENAME" causes all query output to go into the named
file instead of being printed on the console. In the example above,
that line causes the CSV content to be written into a file named
"C:/work/dataout.csv".
http://www.sqlite.org/cli.html
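To do the same directly from Python instead of the sqlite3 CLI, a minimal sketch using sqlite3 and pandas is below; the .db filename is an assumption.
import sqlite3
import pandas as pd

# Assumed filename for the provided .db file
conn = sqlite3.connect("bonus.db")

# List the user tables, then dump each one to its own CSV file
tables = pd.read_sql_query(
    "SELECT name FROM sqlite_master WHERE type='table'", conn
)
for name in tables["name"]:
    df = pd.read_sql_query(f"SELECT * FROM {name}", conn)
    df.to_csv(f"{name}.csv", index=False)

conn.close()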

CSV to Postgres database: "Error: extra data after last expected column"

I am trying to load a .csv file into a Postgres database. I have set up the table with the appropriate number of columns to match the .csv file. I have also taken care to strip all "," (comma characters) from the data in the .csv file.
Here is the command I am typing into psql:
COPY newtable FROM 'path/to/file.csv' CSV HEADER;
I have tried everything I can think of. Any idea how to fix this?
