streamlining spreadsheet to DB copying Python - python

A friend of mine has asked me to write a quick Python script.
He has a small SQLite database (3 tables) and has to copy a bunch of data to it from an excel spreadsheet. The spreadsheet only has 2 data fields but alot of rows of data.
He asked if I would write a quick Python script to transfer the ss data to the db to so he doesn't have to spend a crapload of time manually copying it. I told him that I would do my best.
My question is: Where do I start? what do I need to research for this? Does anyone know if there is a pre-existing module to do this? Im trying to research this myself, but haven't come up with anything concrete yet and am not sure of any other search terms to use to narrow down my search.
Im just hoping someone wont mind giving my some guidance in the right direction.
Blessings and thanks
F

Do you really need Python?
SQLite Database Browser is a freeware, public domain, open source visual tool used to create, design and edit database files compatible with SQLite. Controls and wizards are available for users to:
Import and export tables from/to CSV files
If the file is simple enough (no commas in the content), you can import directly from SQLite:
For simple CSV files, you can use the SQLite shell to import the file into your SQLite database.
If you need to import a complex CSV file and the SQLite shell doesn't handle it, you may want to try a different front end, such as SQLite Database Browser.

Related

Extract webscraped python data to SQLite, excel or xml?

I'm kinda new to Python and webscraping but I'm currently at a point where I need to extract data to a database. Can someone tell me the pros and cons by using sqlite, excel or xml?
I've read that sqlite should be the fastest, so I may go for that database structure, but can someone then tell me what IDE you use to handle sqlite data after I've extracted it from python?
Edit: I hope my post makes sense. I'm currently trying to use a web scraper from here: https://github.com/gingeleski/odds-portal-scraper
Thanks in advance.
For the short term, Excel is a good way to examine your data and prototype analysis and visualizations. It gets old using it for very large datasets, or multiple similar datasets. Basically as soon as you start doing the same thing more than twice or writing VB code you should switch to the pandas/matplotlib solution.
It looks like the scraper you are using already puts the results in an SQLITE database, but if you have your data in a list or dictionary, I'd suggest using pandas to do calculations and matplotlib for visualizations, as that will give you a robust, extensible solution over the long term. It is very easy to read and write data between an SQLITE database and pandas.
A good way of viewing the data in the DB is a must. I'm currently using SQLiteStudio.
When you say IDE, I'm assuming you're looking for a way to view the SQLite data? If so, DBeaver is a free, open source SQL client. You could use this to view the data quite easily.

Using Python / pandaSDMX to process & upload to Postgresql

I am seeking a way to take SDMX files (like here: http://www12.statcan.gc.ca/datasets/Alternative.cfm?PID=105929&EXT=SDMX) and process them into a Postgresql table.
I can use rsdmx (https://cran.r-project.org/web/packages/rsdmx/index.html) for smaller datasets but for large ones we reach a number of limitations in R.
PandaSDMX (https://pandasdmx.readthedocs.io/en/latest/) appears to resolve some of these issues, but I am not experienced in Python and can't seem to get the syntax to work. I'm able to use Response.get() to load a local file as a response object but am not sure where to go to from there.
I know I need to apply the code tables (structure file) but I'm not sure how to do that or make it so I can use odo (http://odo.pydata.org/en/latest/) to send it to postgresql.
Hoping someone can help out or suggest another method to pursue.

update/insert access database using python

Is it possible in python by which I can write a simple .py script to update my access database records or insert new one if any i have on behalf of me? new records are to be pulled from Excel and pushed to be in the database.
MS-Access2010 i am using.
Thanks,
It's definitely possible. You'll probably want to do it with the comtypes module, which allows communication between Windows processes using the Component Object Model (COM).
Here's an example of a script that does that posted in another question.
Getting the information out of Microsoft Excel can be done with a lot of modules, but one I've had success with is openpyxl. Some examples of reading Excel workbooks with it can be found here.

migrating data from tomcat .dbx files

I want to migrate data from an old Tomcat/Jetty website to a new one which runs on Python & Django. Ideally I would like to populate the new website by directly reading the data from the old database and storing them in the new one.
Problem is that the database I was given comes in the form of a bunch of WEB-INF/data/*.dbx and I didn't find any way to read them. So, I have a few questions.
Which format do the WEB-INF/data/*.dbx use?
Is there a python module for directly reading from the WEB-INF/data/*.dbx files?
Is there some external tool for dumpint the WEB-INF/data/*.dbx to an ascii format that will be parsable by python?
If someone has attempted a similar data migration, how does it compare against scraping the data from the old website? (assuming that all important data can be scraped)
Thanks!
The ".dbx" suffix has been used by various softwares over the years so it could be almost anything. The only way to know what you really have here is to browse the source code of the legacy java app (or the relevant doc or ask the author etc).
wrt/ scraping, it's probably going to be a lot of a pain for not much results, depending on the app.

Use Python to load data into Mysql

is it possible to set up tables for Mysql in Python?
Here's my problem, I have bunch of .txt files which I want to load into Mysql database. Instead of creating tables in phpmyadmin manually, is it possible to do the following things all in Python?
Create table, including data type definition.
Load many files one by one. I only know this LOAD DATA LOCAL INFILE command to load one file.
Many thanks
Yes, it is possible, you'll need to read the data from the CSV files using CSV module.
http://docs.python.org/library/csv.html
And the inject the data using Python MySQL binding. Here is a good starter tutorial:
http://zetcode.com/databases/mysqlpythontutorial/
If you already know python it will be easy
It is. Typically what you want to do is use an Object-Retlational Mapping library.
Probably the most widely used in the python ecosystem is SQLAlchemy, but there is a lot of magic going on in it, so if you want to keep a tighter control on your DB schema, or if you are learning about relational DB's and want to follow along what the code does, you might be better off with something lighter like Canonical's storm.
EDIT: Just thought to add. The reason to use ORM's is that they provide a very handy way to manipulate data / interface to the DB. But if all you will ever want to do is to do a script to convert textual data to MySQL tables, than you might get along with something even easier. Check the tutorial linked from the official MySQL website, for example.
HTH!

Categories