More specifically, can I use some bridge for this? For example, first copy the data from MongoDB to Excel, and then import that Excel sheet's data into MySQL with a script, such as one written in Python.
MongoDB does not offer any direct tool to do this, but you have many options to achieve this.
You can:
Write your own tool, in your favorite language, that connects to MongoDB and MySQL and copies the data (see the small sketch after the note below)
Use mongoexport to create files and mysqlimport to reimport them into MySQL
Use an ETL (Extract, Transform, Load) tool that connects to MongoDB, lets you transform the data, and pushes it into MySQL. You can for example use Talend, which has a connector for MongoDB, but there are many other solutions.
Note: Keep in mind that even a simple document can contain complex structures such as arrays/lists, sub-documents, and even an array of sub-documents. These structures cannot be imported directly into a single table record, which is why most of the time you need a small transformation/mapping layer.
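For the first option, here is a minimal sketch of such a copy script, assuming pymongo and mysql-connector-python, a MongoDB collection test.people, and an existing MySQL table people(name, city); all names and credentials are placeholders:

from pymongo import MongoClient
import mysql.connector

mongo = MongoClient("mongodb://localhost:27017")
collection = mongo["test"]["people"]

cnx = mysql.connector.connect(user="root", password="secret", database="test")
cursor = cnx.cursor()

for doc in collection.find():
    # Map only simple, top-level fields; arrays and sub-documents would need
    # their own mapping (for example, separate child tables).
    cursor.execute(
        "INSERT INTO people (name, city) VALUES (%s, %s)",
        (doc.get("name"), doc.get("address", {}).get("city")))

cnx.commit()
cursor.close()
cnx.close()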
Related
I am new to SQL. I am working on a research project, and we have years' worth of data from different sources summing up to hundreds of terabytes. I currently have it parsed into Python data frames. I need help to set up SQL literally from scratch, and I also need help to compile all our data into a SQL database. Please tell me everything I need to know about SQL as a beginner.
Probably the easiest is to get started with one of the free RDBMS options, MySQL (https://www.mysql.com/) or PostgreSQL (https://www.postgresql.org/).
Once you've got that installed and configured, and have created the tables you wish to load, you can go with one of two routes to get your data in.
Either you can install the appropriate Python libraries to connect to the server you've installed and then INSERT the data (a short pandas sketch follows below).
Or, if there is a lot of data, look at dumping the data out into a flat file (.csv) and then use the bulk loader to push it into your tables (this is more hassle, but for larger data sets it will be faster).
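Since the data is already in pandas data frames, a minimal sketch of the first route might look like this (assumes SQLAlchemy plus a PostgreSQL driver such as psycopg2; the connection string, table name, and columns are placeholders):

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@localhost:5432/research")

# Stand-in for one of your real data frames.
df = pd.DataFrame({"station": ["A", "B"], "reading": [1.2, 3.4]})

# Appends the frame to the table, creating the table if it does not exist yet.
df.to_sql("readings", engine, if_exists="append", index=False)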
I've made an address book app using MySQL in PHP but now I want to make it in Python using a text file. Is it possible to perform CRUD operations with a text file instead of a database?
You can use the shelve module that's in the Python standard library. It basically gives you a dictionary that is easy to save to a file. However, you don't get a lot of relational database features like joining tables; it's just a key-value dictionary.
The documentation for it is at https://docs.python.org/3/library/shelve.html
This doesn't really scale as well as using a database though.
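For illustration, a quick CRUD sketch with shelve (the file name and record layout are just examples):

import shelve

with shelve.open("addressbook") as book:
    # Create
    book["alice"] = {"phone": "555-0100", "city": "Oslo"}
    # Read
    print(book["alice"])
    # Update: reassign the whole value so the change is actually written back
    entry = book["alice"]
    entry["phone"] = "555-0199"
    book["alice"] = entry
    # Delete
    del book["alice"]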
I am trying to export Cassandra table to CSV format using Python. But I couldn't do it. However, I am able to execute 'select' statement from Python. I have used the following code:
from cassandra.cluster import Cluster
cluster = Cluster ()
session = cluster.connect('chandan') ### 'chandan' is the name of the keyspace
## name of the table is 'emp'
session.execute(""" copy emp (id,name) to 'E:\HANA\emp.csv' with HEADER = true """ )
print "Exported to the CSV file"
Please help me in this regard.
This is not working for you because COPY is not a part of CQL.
COPY is a cqlsh-only tool.
You can invoke this via command line or script by using the -e flag:
cqlsh 127.0.0.1 -u username -p password -e "copy chandan.emp (id,name) to 'E:\HANA\emp.csv' with HEADER = true"
Edit 20170106:
export Cassandra table to CSV format using Python
Essentially... How do I export an entire Cassandra table?
I get asked this a lot. The short answer...is DON'T.
Cassandra is best used to store millions or even billions of rows. It can do this because it distributes its load (both operational and size) over multiple nodes. What it's not good at are things like deletes, in-place updates, and unbound queries. I tell people not to do things like full exports (unbound queries) for a couple of reasons.
First of all, running an unbound query on a large table in a distributed environment is usually a very bad idea (introducing LOTS of network time and traffic into your query). Secondly, you're taking a large result set that is stored on multiple nodes, and condensing all of that data into a single file...probably also not a good idea.
Bottom line: Cassandra is not a relational database, so why would you treat it like one?
That being said, there are tools out there designed to handle things like this; Apache Spark being one of them.
Please help me to execute the query with session.execute() statement.
If you insist on using Python, then you'll need to do a few things. For a large table, you'll want to query by token range. You'll also want to do that in small batches/pages, so that you don't tip over your coordinator node. But to keep you from re-inventing the wheel, I'll tell you that there already is a tool (written in Python) that does exactly this: cqlsh COPY.
In fact the newer versions of cqlsh COPY have features (PAGESIZE and PAGETIMEOUT) that allow it to avoid timeouts on large data sets. I have used the new cqlsh to successfully export 370 million rows before, so I know it can be done.
Summary: Don't re-invent the wheel. Write a script that uses cqlsh COPY, and leverages all of those things I just talked about.
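As a rough sketch of that kind of script (assuming cqlsh is on the PATH and reusing the keyspace, table, credentials, and path from the question; the PAGESIZE and PAGETIMEOUT values are only illustrative):

import subprocess

copy_cmd = ("COPY chandan.emp (id, name) TO 'E:/HANA/emp.csv' "
            "WITH HEADER = true AND PAGESIZE = 1000 AND PAGETIMEOUT = 60")

# Drive cqlsh COPY from Python instead of re-implementing the export.
subprocess.run(
    ["cqlsh", "127.0.0.1", "-u", "username", "-p", "password", "-e", copy_cmd],
    check=True)
print("Exported to the CSV file")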
Is it possible to set up tables for MySQL in Python?
Here's my problem: I have a bunch of .txt files which I want to load into a MySQL database. Instead of creating the tables manually in phpMyAdmin, is it possible to do the following things all in Python?
Create the table, including the data type definitions.
Load many files one by one. I only know the LOAD DATA LOCAL INFILE command, which loads a single file.
Many thanks
Yes, it is possible. You'll need to read the data from the text files using the csv module:
http://docs.python.org/library/csv.html
And then insert the data using a Python MySQL binding. Here is a good starter tutorial:
http://zetcode.com/databases/mysqlpythontutorial/
If you already know Python it will be easy.
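As a rough end-to-end sketch (assuming mysql-connector-python, tab-separated .txt files in a data/ directory, a made-up table layout, and a server with local_infile enabled):

import glob
import mysql.connector

cnx = mysql.connector.connect(user="root", password="secret",
                              database="test", allow_local_infile=True)
cursor = cnx.cursor()

# 1. Create the table, including the data type definitions (layout is made up).
cursor.execute("""
    CREATE TABLE IF NOT EXISTS measurements (
        id INT PRIMARY KEY,
        taken_at DATETIME,
        value DOUBLE
    )
""")

# 2. Load every .txt file, one LOAD DATA LOCAL INFILE statement per file.
for path in glob.glob("data/*.txt"):
    cursor.execute(
        "LOAD DATA LOCAL INFILE '{}' INTO TABLE measurements "
        "FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n'".format(path))

cnx.commit()
cursor.close()
cnx.close()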
It is. Typically what you want to do is use an Object-Relational Mapping (ORM) library.
Probably the most widely used one in the Python ecosystem is SQLAlchemy, but there is a lot of magic going on in it, so if you want to keep tighter control over your DB schema, or if you are learning about relational DBs and want to follow what the code does, you might be better off with something lighter like Canonical's Storm.
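For instance, a minimal SQLAlchemy sketch (SQLAlchemy 1.4+ and a MySQL driver such as PyMySQL are assumed; the table layout and connection string are made up):

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class Contact(Base):
    __tablename__ = "contacts"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))
    phone = Column(String(30))

engine = create_engine("mysql+pymysql://root:secret@localhost/test")
Base.metadata.create_all(engine)  # creates the table from the class definition

with Session(engine) as session:
    session.add(Contact(name="Alice", phone="555-0100"))
    session.commit()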
EDIT: Just thought to add: the reason to use ORMs is that they provide a very handy way to manipulate data and interface with the DB. But if all you will ever want to do is write a script to convert textual data to MySQL tables, then you might get along with something even easier. Check the tutorial linked from the official MySQL website, for example.
HTH!
I want to quickly put data into a SQL Server database. Is there a way to access the bulk copy functionality of SqlBulkCopy from CPython? I know it would be possible from IronPython, and I also know that I could create text files which I could load via T-SQL. But I would prefer a solution where I can pass in data directly from CPython.
The flat-file solution would be the simplest, but if you really need an alternative, you might be able to use ctypes to drive the SQL Server ODBC bulk copy extension.
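For completeness, a rough sketch of the flat-file route (assuming pyodbc, a local SQL Server instance, and a staging path that the server process itself can read; all table, path, and connection names are placeholders):

import csv
import pyodbc

rows = [(1, "alpha"), (2, "beta")]  # the data coming from CPython

# Dump the rows to a staging file that the SQL Server process can reach.
with open(r"C:\temp\staging.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

cnx = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                     "SERVER=localhost;DATABASE=test;Trusted_Connection=yes;")
cursor = cnx.cursor()
cursor.execute("""
    BULK INSERT dbo.items
    FROM 'C:\\temp\\staging.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n')
""")
cnx.commit()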