Is it possible to have a database driver written in pure Python that doesn't need an underlying system library / shared object to connect to a database?
Apologies for the necro-bump, but this still comes up in a Google search for pure Python drivers. So:
Implementing a database driver in pure Python is conceptually quite straightforward, but only if the wire protocol it uses is documented. Then you (just) write a handler for each type of message to and from the database server in byte format, and away you go. The devil is in the detail, of course, and that's why the protocol has to be documented, unless you are patient enough to reverse engineer it (and to handle undocumented changes!)
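To make that concrete, here is a minimal sketch of the kind of message framing such a driver deals with. The "1-byte type code plus 4-byte big-endian length" layout is loosely modelled on PostgreSQL's wire protocol, purely for illustration; it is not taken from any real driver:

import struct

def recv_exact(sock, n):
    # recv() may return fewer bytes than asked for, so loop until done.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("server closed the connection")
        buf += chunk
    return buf

def read_message(sock):
    # sock is a connected socket.socket; each message is a 1-byte type
    # code followed by a 4-byte big-endian payload length.
    header = recv_exact(sock, 5)
    msg_type = header[:1]
    (length,) = struct.unpack("!I", header[1:])
    return msg_type, recv_exact(sock, length)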
There is a pure Python driver for MSSQL (python-tds), and there has been for a long time (v1.0, Jan 2013). There are also pure Python drivers for PostgreSQL (pg8000) and MySQL (PyMySQL, if memory serves). I haven't done an exhaustive search for other databases, as I don't generally use them.
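Because these drivers implement the standard DBAPI, using one looks just like using a C-backed driver. A minimal sketch with pg8000 (the connection parameters are placeholders):

import pg8000

# pg8000 speaks the PostgreSQL wire protocol in pure Python; no libpq needed.
conn = pg8000.connect(user="me", password="secret",
                      host="localhost", database="mydb")
cur = conn.cursor()
cur.execute("SELECT version()")
print(cur.fetchone())
conn.close()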
Pure Python drivers are excellent for cross-platform development, for alternative Python implementations, and for simplifying packaging. I especially like them for putting a Python program onto Android: you don't need to worry about how to cross-compile DB client libraries.
Yes. It is possible to implement the Python database API as specified in PEP 249.
Even more: such database API implementations already exist.
E.g. nuodb-python
Related
I am attempting to make a small mafia-style game, and I am using Replit. Would there be a way to use a PHP server (or an HTML server) as a database that I can connect to from a Python project?
Neither HTML nor PHP is a database. The "LAMP" stack uses PHP with MySQL / MariaDB as its database, which might be what you're referring to... However, the "P" there could also be Python ¯\_(ツ)_/¯
What you need in Python is a data persistence layer, which could be just a simple CSV / JSON file; however, the pickle module is easier for working with native Python types.
The sqlite3 module can be used if you want the data to be more portable to other frameworks/languages; a quick sketch of both options follows below.
And the final option is to actually run your own database server externally and expose it over a remote TCP / HTTP API connection (I don't think Repl.it supports running Docker containers).
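As promised above, a minimal sketch of the two file-based options (the filenames and game data are placeholders):

import pickle
import sqlite3

players = {"alice": {"role": "mafia"}, "bob": {"role": "villager"}}

# Option 1: pickle round-trips native Python objects, but only Python can read it.
with open("game.pickle", "wb") as f:
    pickle.dump(players, f)

# Option 2: sqlite3 stores the data in a single file other languages can open.
conn = sqlite3.connect("game.db")
conn.execute("CREATE TABLE IF NOT EXISTS players (name TEXT, role TEXT)")
conn.executemany("INSERT INTO players VALUES (?, ?)",
                 [(name, p["role"]) for name, p in players.items()])
conn.commit()
conn.close()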
If you have access to the actual machine, you can run something like SQLite on the machine. You are running dangerously close to needing more security and what not though.
Security is important, but if you just want to "play around", something like SQLite should cover that first pass.
I have good experience working with the Perl DBI module. The DBI module acts as a single API for multiple databases like Oracle, Postgres, etc.
I have recently started working with Python, and I noticed that there is a separate API for each database in Python.
Following are my questions:
1. Isn't there a single DB API in Python?
2. If not, isn't this a disadvantage of Python?
There is no Python equivalent to Perl's DBI-centric ecosystem. Instead:
The DBAPI (PEP 249) defines a common low-level interface that relational database drivers are expected to provide.
Some projects like SQLAlchemy Core abstract over multiple drivers, using the common DBAPI interface.
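For instance, with SQLAlchemy Core the query code stays the same and only the connection URL names the driver; a minimal sketch using the stdlib SQLite backend:

from sqlalchemy import create_engine, text

# Swapping databases usually means changing only this URL,
# e.g. "postgresql+psycopg2://user:pass@host/db".
engine = create_engine("sqlite:///example.db")
with engine.connect() as conn:
    for row in conn.execute(text("SELECT 1 AS answer")):
        print(row.answer)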
Python's lack of a proper DBI equivalent is less of a disadvantage than it would be in Perl, due to the different module system. Assuming you restrict yourself to a common SQL subset and to the DBAPI (rather than driver-specific extensions), switching to a different driver can be as simple as changing an import and updating the connection information:
- import somedatabase as db
+ import differentdriver as db
In practice, neither Python's DBAPI nor Perl's DBI will let you switch databases on a whim. However, Perl's DBI makes it much easier to write software that works with multiple databases.
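The shared surface is the connect/cursor/execute pattern; here it is with the stdlib's sqlite3, which is itself a DBAPI module:

import sqlite3 as db  # any DBAPI module exposes this same shape

conn = db.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (x INTEGER)")
cur.execute("INSERT INTO t VALUES (?)", (1,))
conn.commit()
cur.execute("SELECT x FROM t")
print(cur.fetchall())
conn.close()

One caveat: drivers differ in paramstyle (sqlite3 uses ?, psycopg2 uses %s), which is one of the details that keeps the switch from being entirely free.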
I'm currently building a web service using python / flask and would like to build my data layer on top of neo4j, since my core data structure is inherently a graph.
I'm a bit confused by the different technologies offered by Neo4j for this case. Especially:
I originally planned on using the REST API through py2neo, but the lack of transactions is a bit of a problem.
The "embedded database" neo4j doesn't seem to suit my case very well. I guess it's useful when you're working with batch and one-time analytics, and don't need to store the database on a different server from the web server.
I've stumbled upon the neo4django project, but I'm not sure it offers transaction support (since there is no native Python client for Neo4j), nor whether it would be a problem to use it outside Django itself. In fact, after having looked at the project's documentation, I feel it has exactly the same limitation, i.e. no transactions (but then, how can you build a real-world service when a single connection timeout can corrupt your model?). I don't even understand what the use of that project is.
Could anyone recommend anything? I feel completely stuck.
Thanks
None of the REST API clients will be able to explicitly support (proper) transactions, since that functionality is not available through the Neo4j REST API. There are a few alternatives, such as Cypher queries and batched execution, which each operate within a single atomic transaction on the server side; however, my general approach for client applications is to try to build code which can gracefully handle partially complete data, removing the need for explicit transaction control.
Often, this approach will make heavy use of unique indexing, and this is one reason I have provided a large number of "get_or_create" type methods within py2neo. Cypher itself is incredibly powerful and also provides uniqueness capabilities, in particular through the CREATE UNIQUE clause. Using these, you can make your writes idempotent, and you can err on the side of "doing it more than once", safe in the knowledge that you won't end up with duplicate data.
Agreed, this approach doesn't give you transactions per se but in most cases it can give you an equivalent end result. It's certainly worth challenging yourself as to where in your application transactions are truly necessary.
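As a rough illustration of the idempotent-write idea, here is a sketch against the py2neo 1.x API of that era (the method names changed across releases, so treat this as an assumption and check your version's docs):

from py2neo import neo4j

graph_db = neo4j.GraphDatabaseService("http://localhost:7474/db/data/")

# Running this twice still yields exactly one "alice" node, so a retry
# after a connection timeout cannot create a duplicate.
alice = graph_db.get_or_create_indexed_node(
    "people", "email", "alice@example.com", {"name": "Alice"})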
Hope this helps
Nigel
I think neo4django makes use of neo4j-rest-client, which does support transactions through the batch resource of the Neo4j REST interface.
The syntax is quite similar to that of the Neo4j Python embedded API:
>>> n = gdb.nodes.create()
>>> n["age"] = 25
>>> n["place"] = "Houston"
>>> n.properties
{'age': 25, 'place': 'Houston'}
>>> with gdb.transaction():
....: n.delete("age")
....:
>>> n.properties
{u'place': u'Houston'}
More information can be found in the neo4j-rest-client documentation about transactions.
My plan is to develop a multi-tier, multi-platform database application.
I would like to consume the data from cocoa/objective c apps, .net apps, and web browsers.
I don’t really know where to start and have been looking at Python, but I can't find out whether Cocoa/Objective-C apps can consume Python data objects.
Can anyone point me in the right direction as to how to achieve my goal?
My requirements are:
The data layer should be platform independent.
The whole system should be scalable, hence the multiple tiers.
Data access can be from Cocoa, .NET, and web-based clients.
You can make Python and Objective-C work together. Since Objective-C accepts 100% normal C, you can use the Python C API; it's very tedious, though.
There's also PyObjC, which acts as a bridge between Objective-C and Python. The documentation is pretty good, and it will be much simpler than using the Python C API directly.
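As a taste of the bridge, Foundation classes can be used directly from Python; colons in Objective-C selectors become underscores:

from Foundation import NSMutableArray

# -[NSMutableArray addObject:] is spelled addObject_ on the Python side.
names = NSMutableArray.array()
names.addObject_("Alice")
print(names)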
You could also try using Thrift. Thrift is like Google's Protocol Buffers, but it supports generating Objective-C classes. You will have to write some boilerplate code to convert your data objects into Thrift objects, but after that is done you can pass information among any of the languages Thrift supports. Documentation is on the thin side; I wrote a tutorial on using Thrift with Objective-C on the Thrift wiki some time ago, though I'm not sure it is up to date, as there have been several releases of Thrift since then.
There's an API for Twisted apps to talk to a database in a scalable way: twisted.enterprise.adbapi
The confusing thing is, which database to pick?
The database will have a Twisted app that is mostly making inserts and updates and relatively few selects, and then other strictly-read-only clients that are accessing the database directly making selects.
(The read-only users are not necessarily selecting the data that the Twisted app is inserting; it's not as though the database is being used as a message queue.)
My understanding - which I'd like corrected/advised on - is that:
Postgres is a great DB, but almost all the Python bindings - and there is a confusing maze of them - are abandonware
There is psycopg2 for postgres, but that makes a lot of noise about doing its own connection-pooling and things; does this co-exist gracefully/usefully/transparently with the Twisted async database connection pooling and such?
SQLite is a great database for little things, but if used in a multi-user way it does whole-database locking, so performance would suck under the usage pattern I envisage; it also has a different mechanism for typing column values?
MySQL - after the Oracle takeover, who'd want to adopt it now or adopt a fork?
Is there anything else out there?
Scalability
twisted.enterprise.adbapi isn't necessarily an interface for talking to databases in a scalable way. Scalability is a problem you get to solve separately. The only thing twisted.enterprise.adbapi really claims to do is let you use DB-API 2.0 modules without the blocking that normally implies.
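For reference, the basic adbapi pattern looks like this (the driver name and connection parameters are placeholders):

from twisted.enterprise import adbapi
from twisted.internet import reactor

# ConnectionPool runs blocking DB-API 2.0 calls in a thread pool and
# hands the results back as Deferreds, so the reactor never blocks.
dbpool = adbapi.ConnectionPool("psycopg2", dbname="mydb", user="me")

d = dbpool.runQuery("SELECT name FROM users WHERE id = %s", (1,))
d.addCallback(print)
d.addBoth(lambda _: reactor.stop())
reactor.run()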
Postgres
Yes. This is the correct answer. I don't think all of the Python bindings are abandonware; psycopg2, for example, seems to be actively maintained. In fact, it just grew some new bindings for async access, for which Twisted might eventually offer an interface.
SQLite3 is pretty cool too. You might want to make it possible to use either Postgres or SQLite3 in your app; your unit tests will definitely be happier running against SQLite3, for example, even if you want to deploy against Postgres.
Other?
It's hard to know if another database entirely (something non-relational, perhaps) would fit your application better than Postgres. That depends a lot on the specific data you're going to be storing and the queries you need to run against it. If there are interesting relationships in your database, Postgres does seem like a pretty good answer. If all your queries look like "SELECT foo, bar FROM baz" though, there might be a simpler, higher performance option.
There is the txpostgres library, which is a drop-in replacement for twisted.enterprise.adbapi: instead of a thread pool and blocking DB IO, it is fully asynchronous, leveraging the built-in async capabilities of psycopg2.
We are using it in production in a big corporation and it has been serving us very well so far. It is also actively developed; a bug we reported recently was fixed very quickly.
http://pypi.python.org/pypi/txpostgres
https://github.com/wulczer/txpostgres
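A minimal sketch of the txpostgres style, based on its documented Connection API (the connection string is a placeholder; check the project docs for current signatures):

from twisted.internet import reactor
from txpostgres import txpostgres

conn = txpostgres.Connection()
d = conn.connect("dbname=mydb user=me")
# runQuery returns a Deferred firing with the result rows; no thread pool involved.
d.addCallback(lambda _: conn.runQuery("SELECT 1"))
d.addCallback(print)
d.addBoth(lambda _: reactor.stop())
reactor.run()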
You could look at NoSQL databases like MongoDB or CouchDB with Twisted.
Scaling out can be rather easier with NoSQL databases than with MySQL or Postgres.
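For MongoDB there is txmongo; a rough sketch, assuming its classic MongoConnection API of the time (verify against the current txmongo docs, as the method names have since changed):

from twisted.internet import defer, reactor
import txmongo

@defer.inlineCallbacks
def demo():
    # MongoConnection() returns a Deferred firing with the connection.
    mongo = yield txmongo.MongoConnection()
    players = mongo.mydb.players  # database "mydb", collection "players"
    yield players.insert({"name": "Alice"}, safe=True)
    docs = yield players.find(limit=10)
    for doc in docs:
        print(doc)

demo().addBoth(lambda _: reactor.stop())
reactor.run()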