portable non-relational database - python

I want to experiment/play around with non-relational databases, it'd be best if the solution was:
portable, meaning it doesn't require an installation. ideally just copy-pasting the directory to someplace would make it work. I don't mind if it requires editing some configuration files or running a configuration tool for first time usage.
accessible from python
works on both windows and linux
What can you recommend for me?
Essentially, I would like to be able to install this system on a shared linux server where I have little user privileges.

I recommend you consider BerkelyDB with awareness of the licensing issues.
I am getting very tired of people recommending BerkleyDB without qualification - you can only distribute BDB systems under GPL or some unknown and not publicly visible licensing fee from Oracle.
For "local" playing around where it is not in use by external parties, it's probably a good idea. Just be aware that there is a license waiting to bite you.
This is also a reminder that it is a good idea when asking for technology recommendations to say whether or not GPL is acceptable.
From my own question about a portable C API database, whilst a range of other products were suggested, none of the embedded ones have Python bindings.

Metakit is an interesting non-relational embedded database that supports Python.
Installation requires just copying a single shared library and .py file. It works on Windows, Linux and Mac and is open-source (MIT licensed).

BerkleyDB

If you're used to thinking a relational database has to be huge and heavy like PostgreSQL or MySQL, then you'll be pleasantly surprised by SQLite.
It is relational, very small, uses a single file, has Python bindings, requires no extra priviledges, and works on Linux, Windows, and many other platforms.

Have you looked at CouchDB? It's non-relational, data can be migrated with relative ease and it has a Python API in the form of couchdb-python. It does have some fairly unusual dependencies in the form of Spidermonkey and Erlang though.
As for pure python solutions, I don't know how far along PyDBLite has come but it might be worth checking out nonetheless.

BerkeleyDB : (it seems that there is an API binding to python : http://www.jcea.es/programacion/pybsddb.htm)

Have you looked at Zope Object Database?
Also, SQLAlchemy or Django's ORM layer makes schema management over SQLite almost transparent.
Edit
Start with http://www.sqlalchemy.org/docs/05/ormtutorial.html#define-and-create-a-table
to see how to create SQL tables and how they map to Python objects.
While your question is vague, your comments seem to indicate that you might want to define the Python objects first, get those to work, then map them to relational schema objects via SQLAlchemy.

If you're only coming and going from Python you might think about using Pickle to serialize the objects. Not going to work if you're looking to use other tools to access the same data of course. It's built into python, so you shouldn't have any privileged problems, but it's not a true database so it may not suit the needs of your experiment.

Adding a reference to TinyDB here since this page is showing at the top of many searches. It is a portable non-relational database in python. It stores python dicts into a local json file and makes them available for database ops similar to mongodb. It also has an extension to port to mongodb's commands, the difference being that instead of working on another system server you'll be operating on a local json file.
And unlike the presently chosen answer, it is under a permissive MIT open license.
Links:
TinyDB site
Tinydb on Github
Basic Usage
Usage
Support forum
Implementations

Related

Solution for storing digital asset management metadata and dependencies in Python

I'm planning to write some sort of digital management system to help users to deal with files in VCS (SVN, Perforce...) easily. The main premise is, that all files custom metadata and dependencies are stored alongside real files in VCS and not on separate database server.
But when querying the metadata it would be super slow to load everything from VCS on demand, so I would like to cache all metadata and dependencies locally and just update them incrementally when needed.
I need to write the whole system in Python, since it have to run in several environments that are embedding python.
Theoretically my needs will be fulfilled by nosql embedded graph database with multiprocess access, but sadly I can find anything to match this criteria:
every file can have different metadata structure, so I can't use schemas, thus no SQL db
I need to store dependencies
ability to search metadata and dependencies
several processes need to be able to read the database at once
serverless solution (only local machine will use it)
Python support
Optionally a way to inform connected processes about database update
I would really appreciate if someone more experienced could point me to the right direction. I'm not looking exclusively for one silver-bullet software that would fulfill my needs, it can also be an combination of several solutions. I just don't like reinventing the well, so I would like to use 3rd party solution rather than writing something on my own.
Thank you
ZODB satisfies most of your criteria:
support for arbitrary python constructs
usable mostly like having everything in memory as normal objects
if opened in read-only mode multiple processes can read at the same time (i think)
if using with ZEO (server) many clients can access at the same time with automatic notification on changes. See sample application in ZEO guide
You can load the same database with ZEO or ZODB so you can switch.
There is some tutorial stuff and general info at http://www.zodb.org

Build postgreSQL in Windows with Python

Can i install and build postgreSQL from Source Code in Windows by using Python? Is it solid? Currently in their documentation they have Visual C++ as their only option!
And if it is possible (and reliable) where can i find prime material to use Python to build and customize my postgreSQL? I won't go and learn Visual C++ unless it is the only way. I also have GitBash if that helps...
For people who have no idea on db's. They never say that in documentations because they think it for granted.
IT ALL BOILS DOWN TO: It is another thing building_from_source/installing a db and a totally another thing interacting/working with a db.
There is library for accessing Postgres from python called psycopg. There is a tutorial on its use here.
To the extent that you are just learning about databases, you might also take a look at Sqlite, which is a more lightweight database and included in the python standard library
Since you want to use PostgreSQL you really don't need to compile anything. Simply install PostgreSQL and the windows binaries of psycopg2 (the PostgreSQL client library for python).
If you want to build the PostgreSQL database server - something you only want/need to do if you want to modify PostgreSQL itself, and which is something you probably don't want to do as a beginner - you should use whatever compiler the Postgres developers suggest since they most likely only have project/make files for that compiler.

Python automatic build tool

at my work i write numerous small python scripts for DB management, most scripts use one or two common libraries which i sometimes update,
before distribution i freeze the scripts with cxfreeze and copy over some resources and upload to a server.
i would like to set up some up some system which would allow me to automatically rebuild/freeze all of the scripts
, copy over some files, archive and upload to server.
I'm not sure where even to look for such a system, because most of what if found is complicated server based systems like AHP for compiled languages, while i need something small for a single computer
obviously i can write something fast and dirty in python but it seems illogical that there isn't something ready made for these simple requirements.
please forgive my ignorance, I'm still learning.
Have you considered using make? If you're familiar with it from another language, it might be the easiest. There's also a list on the python wiki here.
Have a look at buildout. From their site:
Buildout is a Python-based build system for creating, assembling and deploying applications from multiple parts, some of which may be non-Python-based. It lets you create a buildout configuration and reproduce the same software later.
We are using it to do exactly what you describe. It is flexible and easy to extend through recipes.

Can I call external *python* functions from google refine?

I'm investigating Google refine to speed up some of my data work -- never used it before this week, but I like a lot of what I see.
My biggest question so far is whether it's possible to call external python functions from Refine. I know you can call jython internally, but that doesn't provide access to C-based python libraries (e.g. lxml), and I have scripts elsewhere that I'd like to integrate, without lots of copy-paste or rewrite hassle.
What options are there for doing this in Refine? I'm willing to get creative -- I just want a stable, re-usable solution.
As Google Refine Wiki says:
lxml will NOT work in Jython, since lxml has C bindings for CPython (regular Python), hence will not work in Refine which is Jython / Java only, and has no CPython interpreter built-in
But you can try Google Refine Python Client Library to create projects and manipulate your data programmatically.
I'm going to mark reclosedev's answer as accepted, but there's still a litle more to the story.
The other answer to this question is that you can set up your own python-based API. For this project, I was able to set up a django app running on a local server. It only took an hour or so to build the API to my existing library.
More hassle than I'd have liked, but it fit the bill for this project without soaking up too much time.

I need a beginners guide to setting up windows for python development

I currently work with .NET exclusively and would like to have a go at python. To this end I need to set up a python development environment. I guide to this would be handy. I guess I would be doing web development so will need a web server and probably a database. I also need pointers to popular ORM's, an MVC framework, and a testing library.
One of my main criteria with all this is that I want to understand how it works, and I want it to be as isolated as possible. This is important as i am wary of polluting what is a working .NET environment with 3rd party web and database servers. I am perfectly happy using SQLite to start with if this is possible.
If I get on well with this I am also likely to want to set up automated build and ci server (On a virtual machine, probably ubuntu). Any suggestions for these would be useful.
My ultimate aim if i like python is to have similar sorts of tools that i have available with .NET and to really understand the build and deployment of it all. To start with I will settle for a simple development environment that is as isolated as possible and will be easy to remove if I don't like it. I don't want to use IronPython as I want the full experience of developing a python solution using the tools and frameworks that are generally used.
It's not that hard to set up a Python environment, and I've never had it muck up my .NET work. Basically, install Python --- I'd use 2.6 rather than 3.0, which is not yet broadly accepted --- and add it to your PATH, and you're ready to go with the language. I wouldn't recommend using a Ubuntu VM as your development environment; if you're working on Windows, you might as well develop on Windows, and I've had no significant problems doing so. I go back and forth from Windows to Linux with no trouble.
If you have an editor that you're comfortable with that has basic support for Python, I'd stick with it. If not, I've found Geany to be a nice, light, easy-to-use editor with good Python support, though I use Emacs myself because I know it; other people like SCITE, NotePad++, or any of a slew of others. I'd avoid fancy IDEs for Python, because they don't match the character of the language, and I wouldn't bother with IDLE (included with Python), because it's a royal pain to use.
Suggestions for libraries and frameworks:
Django is the standard web framework, but it's big and you have to work django's way; I prefer CherryPy, which is also actively supported, but is light, gives you great freedom, and contains a nice, solid webserver that can be replaced easily with httpd.
Django includes its own ORM, which is nice enough; there's a standalone one for Python, though, which is even nicer: SQL Alchemy
As far as a testing library goes, pyunit seems to me to be the obvious choice
Good luck, and welcome to a really fun language!
EDIT summary: I originally recommended Karrigell, but can't any more: since the 3.0 release, it's been continuously broken, and the community is not large enough to solve the problems. CherryPy is a good substitute if you like a light, simple framework that doesn't get in your way, so I've changed the above to suggest it instead.
Well, if you're thinking of setting up an Ubuntu VM anyway, you might as well make that your development environment. Then you can install Apache and MySQL or Postgres on that VM just via the standard packaging tools (apt-get install), and there's no danger of polluting your Windows environment.
You can either do the actual development on your Windows machine via your favourite IDE, using the VM as a networked drive and saving the code there, or you can just use the VM as a full desktop environment and do everything there, which is what I would recommend.
Install the pre-configured ActivePython release from activestate.
Among other features, it includes the PythonWin IDE (Windows only) which makes it easy to explore Python interactively.
The recommended reference is Dive Into Python, mentioned many times on similar SO discussions.
You should install python 2.4, python 2.5, python 2.6 and python 3.0, and add to your path the one you use more often (Add c:\Pythonxx\ and c:\Pythonxx\Scripts).
For every python 2.x, install easy_install; Download ez_setup.py and then from the cmd:
c:\Python2x\python.exe x:\path\to\ez_setup.py
c:\Python2x\Scripts\easy_install virtualenv
Then each time you start a new project create a new virtual environment to isolate the specific package you needs for your project:
mkdir <project name>
cd <project name>
c:\Python2x\Scripts\virtualenv --no-site-packages .\v
It creates a copy of python and its libraries in .v\Scripts and .\v\Lib. Every third party packages you install in that environment will be put into .\v\Lib\site-packages. The -no-site-packages don't give access to the global site package, so you can be sure all your dependencies are in .\v\Lib\site-packages.
To activate the virtual environment:
.\v\Scripts\activate
For the frameworks, there are many. Django is great and very well documented but you should probably look at Pylons first for its documentions on unicode, packaging, deployment and testing, and for its better WSGI support.
For the IDE, Python comes with IDLE which is enough for learning, however you might want to look at Eclipse+PyDev, Komodo or Wingware Python IDE. Netbean 6.5 has beta support for python that looks promising (See top 5 python IDE).
For the webserver, you don't need any; Python has its own and all web framework come with their own. You might want to install MySql or ProgreSql; it's often better to develop on the same DB you will use for production.
Also, when you have learnt Python, look at Foundations of Agile Python Development or Expert Python Programming.
Using Python on Windows
SO: Python tutorial for total beginners?
Take a look at Pylons, read about WSGI and Paste.
There's nice introductory Google tech talk about them: ReUsable Web Components with Python and Future Python Web Development.
Here's my answer to similar question:
Django vs other Python web frameworks?
NOTE: I included a lot of links to frameworks, projects and what-not, but as a new user I was limited to 1 link per answer. If someone else with enough reputation to edit wants/can edit them into this answer instead of the footnotes, I'd be grateful.
There are some Python IDE's such as Wing IDE[1], I believe some people use Eclipse[2] with a python plugin[3] as well. A lot of people in the #python channel of FreeNode seem to prefer vim, emacs, nano and similar text editors in favor of IDE's. My personal preffered editor is Vim, but if you've mostly done .NET development on windows, presumably with the usual Visual X IDE's, vim and emacs will probably cause you culture shock and you'd be better of using an IDE.
Nearly all python web frameworks* support the WSGI standard[4], most of the large web servers have some sort of plugin to support WSGI, the others support WSGI via fast cgi or plain cgi.
The Zope[5] and Django[6] frameworks have their own ORM's, of other ORM's the two most well known appear to be SQL Alchemy[7] and SQL Object[8]. I only have experience with the former, but both support all possible sane database choices, including SQLite which is installed together with Python and hence perfectly suited to testing and experimenting without polluting your .NET environment with 3rd part web servers and database servers.
The builtin unittest[9] and pyunit[10] frameworks seem to be the preffered solutions for unit testing, but I don't have much experience with these.
bpython[11] and ipython[12] offer enhanced interactive python shells which can greatly help speed up and testing small bits of code and hence worth looking in to.
As for a list of well known and often used web frameworks, look into the following frameworks**:
Twisted[13] is a generic networking framework, which supports almost every single protocol under the sun.
Pylons[14] is light-weight framework aimed at being as flexible as possible and leaving all the choices about what ORM, templating language and what-not to you.
CherryPy[15] tries to provide an interface to expose Python objects to the web.
Django[6] attempts to be an all-in-one solution, builtin template system, ORM, admin pages and internationalization. While the previous frameworks have more DIY wiring together various frameworks work involved with them.
Zope[5] is aimed to be suitable for large enterprise applications, I've heard nothing but good things about it, but consensus seems to be that for smaller you're probably better off with one of the simpler and smaller frameworks.
TurboGears[16] is the framework I know the least about, but it seems to be mostly competition for Django.
This is everything I can think of right now, I'll edit and add stuff if I can think of it. I hope this helps you some in the wonderful world of python.
* - The main exception would be Apache's mod_python, which you should avoid for exactly that reason, use mod_wsgi instead.
** - Word of warning, I have not personally used these frameworks this is just a very short impression I have gotten from talking to other people about each framework, it may be wildly inaccurate. (If anyone has any corrections, do comment and I'll try to edit and fix this answer).
(The http:// is missing since they're recognized as links otherwise)
[1] www.wingware.com/
[2] www.eclipse.org/
[3] pydev.sourceforge.net/
[4] wsgi.org/wsgi/
[5] www.zope.org/
[6] www.djangoproject.com/
[7] www.sqlalchemy.org/
[8] www.sqlobject.org/
[9] docs.python.org/library/unittest.html
[10] pyunit.sourceforge.net/pyunit.html
[11] www.bpython-interpreter.org/
[12] ipython.scipy.org/
[13] twistedmatrix.com/trac/
[14] pylonshq.com/
[15] www.cherrypy.org/
[16] turbogears.org/
Environment?
Here is the simplest solution:
Install Active Python 2.6. Its the Python itself, but comes with some extra handy useful stuff, like DiveintoPython chm.
Use Komodo Edit 5. It is among the good free editor you can use for Python.
Use IDLE. Its the best simplest short snippet editor, with syntax highlighting and auto complete unmatched by most other IDEs. It comes bundled with python.
Use Ipython. Its a shell that does syntax highlighting and auto complete, bash functions, pretty print, logging, history and many such things.
Install easy_install and/or pip for installing various 3rd party apps easily.
Coming from Visual Studio and .Net it will sound a lot different, but its an entirely different world.
For the framework, django works the best. Walk thro the tutorial and you will be impressed enough. The documentation rocks. The community, you have to see for yourself, to know how wonderful it is!!
Python has build in SQL like database and web server, so you wouldn't need to install any third party apps. Remember Python comes with batteries included.
If you've worked with Eclipse before you could give Pydev a try

Categories