I have some experience with Python/Django, including its REST framework, a reasonable understanding of geographic information (and what GIS is about) and of databases, and I would like to develop a "geo-aware" REST service on my Windows machine based on Django. The application will be limited to the REST service; visual exploration and other features will be developed independently. Side remark: once my application is running, it will be ported to a Linux machine, and PostGIS will then be used instead of SpatiaLite.
After several hours of web-searching, I still haven't come up with a good "Quickstart" guide. There are many tutorials and docs about various aspects related to my task, but they either refer to Linux, or their installation instructions are outdated. Here is what I have done so far:
1) I use miniconda and Pycharm
2) I created a new virtual environment like so:
conda create -n locations pip numpy requests
activate locations
conda install -c conda-forge django djangorestframework djangorestframework-gis
3) I set up the new Django project and my app, and performed a database migration:
python [path-to..]\django-admin.py startproject locations
cd locations
python [path-to..]\django-admin.py startapp myapp
cd ..
python manage.py migrate
4) I added "rest_framework" and "myapp.apps.MyAppConfig" to the APPLICATIONS in settings.py
5) I stopped reading the general Django REST framework tutorial and began searching for django-rest-framework-gis specific information. What I understood from this is that I need to enhance my SQLite database to become a SpatiaLite database. Windows binaries for SpatiaLite are available at gaia-sins -- but which of these do I really need? I downloaded the mod_spatialite-4.3.0a-win-x86.7z file, unpacked it, and added SPATIALITE_LIBRARY_PATH = '[path-to..]\mod_spatialite-4.3.0a-win-x86\mod_spatialite.dll' to my settings.py.
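For context, here is roughly what the relevant pieces of settings.py have to look like for a SpatiaLite-backed GeoDjango project. This is a sketch only; the paths are placeholders, and BASE_DIR is the variable the generated settings.py already defines:

# settings.py -- sketch only, placeholder paths
import os

INSTALLED_APPS = [
    # ... the usual django.contrib apps ...
    'django.contrib.gis',            # enables the GeoDjango machinery
    'rest_framework',
    'rest_framework_gis',            # per the djangorestframework-gis install notes
    'myapp.apps.MyAppConfig',
]

DATABASES = {
    'default': {
        'ENGINE': 'django.contrib.gis.db.backends.spatialite',   # instead of ...backends.sqlite3
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
    }
}

# Windows: tell Django where the SpatiaLite DLL lives (placeholder path)
SPATIALITE_LIBRARY_PATH = r'C:\libs\mod_spatialite-4.3.0a-win-x86\mod_spatialite.dll'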
What comes next?
Specific questions:
1) Do I really need to upgrade my SQLite database if I am not planning to store geospatial information, but merely want to build a REST service that delivers information in GeoJSON which comes from other sources (for example, weather model output in netCDF format)? Or would it suffice to describe my Django model in this case and simply ignore any database-related issues?
2) What is the minimum code to get the basic "wiring" done? This could be an extremely simple service which accepts a lat/lon coordinate as a parameter in the URL and returns this location in GeoJSON format. Such code should highlight the differences between using the "normal" Django REST framework and the GIS version. Once I have this, I will probably find my way through the existing documentation (for example the Miguel Grinberg tutorial or the GitHub description).
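To make question 2 concrete, this is roughly the shape of endpoint I have in mind; a minimal sketch using plain Django REST framework only, with made-up view and URL names, and no geo-enabled database involved:

# myapp/views.py -- sketch only
from rest_framework.views import APIView
from rest_framework.response import Response

class LocationEcho(APIView):
    """Echo lat/lon query parameters back as a GeoJSON Point."""

    def get(self, request):
        lat = float(request.query_params.get('lat', 0.0))
        lon = float(request.query_params.get('lon', 0.0))
        # GeoJSON orders coordinates as [longitude, latitude]
        return Response({'type': 'Point', 'coordinates': [lon, lat]})

# urls.py (sketch): path('location/', LocationEcho.as_view())

What I cannot judge is whether and where django-rest-framework-gis (e.g. its GeoFeatureModelSerializer) would have to come in for a case as simple as this.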
OK, after another day of searching and experimenting, I acknowledge that this has been the wrong question to ask - therefore I answer myself and close this issue.
Apparently, I have been too naive about setting up a "geo-aware" service, thinking that I could get away with a special datatype or two for coordinates, a special kind of serializer for GeoJSON, and, if really necessary, a geo-enabled database.
It turns out that what I want to build is, in the end, a GeoDjango application (even if I will use only a tiny fraction of what GeoDjango can do), and so the GeoDjango docs are the place to start from, in particular their installation guide.
The story isn't over yet, as I am still having trouble loading the required libraries from Django, but the direction is clearer now.
More specifically, the issue I ran into wasn't primarily a SpatiaLite issue. I was able to install SpatiaLite and enhance an existing SQLite database by running SELECT load_extension('mod_spatialite'); SELECT InitSpatialMetaData(); (see also this post). Django (python manage.py check) complained about not finding the GDAL library, and once it found it, it was the wrong version. The GeoDjango docs report that this is indeed the most common problem when installing GeoDjango. It would be helpful if the error messages from ctypes were a bit more verbose, to make it easier to search for solutions. It took several hops across various websites, and an extra print statement in the ctypes __init__.py file, before I found out that one needs to match the version of GDAL, the version of Python, and the compiler ("DLL hell", someone called it).
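For anyone hitting the same wall: GeoDjango can be pointed at specific library files from settings.py, which is what eventually got me past the "library not found" stage. A sketch with placeholder paths and DLL names (the actual names depend on the GDAL/GEOS builds you install):

# settings.py -- sketch only; paths and DLL names are placeholders
import os

OSGEO4W_ROOT = r'C:\OSGeo4W64'
# make the bundled DLLs findable, along the lines of the GeoDjango install docs
os.environ['PATH'] = os.path.join(OSGEO4W_ROOT, 'bin') + ';' + os.environ['PATH']

GDAL_LIBRARY_PATH = os.path.join(OSGEO4W_ROOT, 'bin', 'gdal204.dll')   # DLL name varies with the GDAL version
GEOS_LIBRARY_PATH = os.path.join(OSGEO4W_ROOT, 'bin', 'geos_c.dll')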
Another part of the confusion arises from the manifold dependencies among the various required "geo packages". For example, SpatiaLite already comes with a GDAL library, so why the need to install GDAL separately? Indeed, the GeoDjango docs recommend working with OSGeo4W, because this program suite bundles everything together. Yet this is not trivial if one starts from a system where OSGeo4W and Python/Django have been installed independently, which is my situation: I installed OSGeo4W primarily to work with QGIS, and I installed Python and Django for other tasks. The realisation that the two must be linked for a GeoDjango application only came afterwards. I might need to start from scratch, but it would be good to know whether people have recently been successful in a Windows 10, x64 environment with Python >= 3.4.
I have yet to come across an answer that makes me WANT to start using virtual environments. I understand how they work, but what I don't understand is this: someone like me can have hundreds of Python projects on their drive, almost all of which use the same packages (like Pandas and NumPy), yet if they were all in separate venv's, you'd have to pip install those same packages over and over and over again, wasting so much space for no reason. Not to mention if any of those also require a package like TensorFlow.
The only real benefit I can see to using venv’s in my case is to mitigate version issues, but for me, that’s really not as big of an issue as it’s portrayed. Any project of mine that becomes out of date, I update the packages for it.
Why install the same dependency for every project when you can just do it once for all of them on a global scale? I know you can also specify --system-site-packages (or whatever the flag is) when creating a new venv, but since ALL of my Python packages are installed globally (hundreds of dependencies are pip installed already), I don't want the new venv to make use of ALL of them. Can I specify only specific global packages to use in a venv? That would make more sense.
What else am I missing?
UPDATE
I’m going to elaborate and clarify my question a bit as there seems to be some confusion.
I’m not so much interested in understanding HOW venv’s work, and I understand the benefits that can come with using them. What I’m asking is:
Why would someone who has (for example) 100 different projects that all require TensorFlow install it into each project's own venv? That would mean you have to install TensorFlow 100 separate times. That's not just a "little" extra space being wasted; that's a lot.
I understand they mitigate dependency versioning issues: you can "freeze" packages at their current working versions and forget about them, great. And maybe I'm just unique in this respect, but the versioning issue (besides the obvious difference between Python 2 and 3) really hasn't been THAT big of an issue. Yes, I've run into it, but isn't it better practice to keep your projects up to date with the current working/stable versions than to freeze them with old, possibly no-longer-supported versions? Sure it works, but that doesn't seem to be the "best" option to me either.
To reiterate the second part of my question: if I have (for example) TensorFlow installed globally, and I create a venv for each of my 100 TensorFlow projects, is there not a way to make use of the already globally installed TensorFlow inside the venv, without having to install it again? I know that in PyCharm, and possibly on the command line, you can use a --system-site-packages argument (or whatever it is) to make that happen, but I don't want to include ALL of the globally installed dependencies, because I have hundreds of those too. Is --system-site-packages tensorflow, for example, a thing?
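For reference, the switch I mean is exposed both on the command line (python -m venv --system-site-packages env-tf) and programmatically in the standard library, and as far as I can tell it is all-or-nothing, which is exactly my problem. A minimal sketch (the directory name env-tf is made up):

# sketch: create a venv that can also see the globally installed packages
import venv

venv.create('env-tf', system_site_packages=True, with_pip=True)
# there is no built-in per-package variant such as "--system-site-packages tensorflow"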
Hope that helps clarify what I'm looking for out of this discussion, because so far I have no use for venv's other than everyone else claiming how great they are, but I guess I see it a bit differently :P
(FINAL?) UPDATE
From the great discussions I've had with the contributors below, here is a summation of where I think venv's are of benefit and where they're not:
USE a venv:
You're working on one BIG project with multiple people, to mitigate versioning issues among the contributors
You don't plan on updating your dependencies very often for all projects
To have a clearer separation of your projects
To containerize your project (again, for distribution)
Your portfolio is fairly small (especially relevant in the data science world, where packages like TensorFlow are large and used across nearly all projects, so with a big portfolio you'd have to pip install the same heavy package into each venv)
DO NOT use a venv:
Your portfolio of projects is large AND requires a lot of heavy dependencies (like TensorFlow), so that you avoid installing the same package into every venv you create
You're not distributing your projects across a team of people
You're actively maintaining your projects and keeping global dependency versions up to date across all of them (maybe I'm the only one who's actually doing this, but whatever)
As was recently mentioned, I guess it depends on your use case. Working on a website that requires contributions from many people at once, it makes sense for everyone to work out of one environment; but for someone like me, with a massive portfolio of TensorFlow projects that have no versioning issues and no other team members, it doesn't. Maybe if you plan on containerizing or distributing a project it makes sense to do so on an individual basis, but (going back to this example) keeping 100 TensorFlow projects in 100 different venvs makes no sense, as you'd have to install TensorFlow 100 times, once into each of them, which is no different from having to pip install tensorflow==2.2.0 for the specific old projects you want to run; in that case, just keep your projects up to date.
Maybe I'm missing something else major here, but that's the best I've come up with so far. Hope it helps someone else who's had a similar thought.
I'm a data scientist and sometimes I run into these things called "virtual environments" and I don't get what the use case is? I already have all of these packages and modules and widgets downloaded! Why should I set up a separate place where I manage all of the stuff I'm already managing globally?
Python is a very powerful tool. In this answer, consider two different ways to swing that metaphorical hammer:
Data Science
Software Engineering
For a data scientist (working alone) using Python to write a proof of concept for a research paper, build an LSTM neural network, or predict the price of TSLA based on the frequency of Elon Musk's tweets, all that really matters is being able to use the best library (TensorFlow, PyTorch, sklearn, ...) for whatever task they're trying to get done, in whatever directory they happen to be working in when they need it. It is very tempting to use one global Python installation and just use the same stuff everywhere. Frankly, this is probably fine, as it's just one person managing their own space. So the configuration of their machine would be one single Python environment that everything, everywhere uses. Or, if the data scientist wanted to, they could have a single directory containing one virtual environment and some subdirectories containing all the scripts (projects) they work on.
Now consider a software engineer who has multiple git repos with complete CI/CD pipelines, each of which builds into a separate artifact that then gets deployed to some cloud environment. They and the nine other people on their team need to be sure that they are all making changes that won't break any piece of the code. For example, in Python 3.7 dict.popitem() subtly changed from removing an arbitrary item to guaranteed LIFO order. It's pretty easy to see how that could cause issues if Jerry had implemented a function that relies on the original arbitrary behavior and Bob implemented a function that assumes the guaranteed LIFO behavior. This team of engineers would have git repos that each contain a single virtual environment (a single isolated Python environment), which lets them manage dependencies for that "project".
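A tiny illustration of that kind of subtle change (the output shown assumes CPython 3.7 or newer):

d = {'a': 1, 'b': 2, 'c': 3}
print(d.popitem())   # ('c', 3) -- the most recently inserted pair; LIFO order is guaranteed since 3.7
print(d.popitem())   # ('b', 2)
# On older interpreters an arbitrary pair could come back, so code written
# against either behavior can break when teammates run different versions.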
The data scientist has one Python installation/environment that allows them to do whatever.
The engineer has a Python installation and a bunch of environments so that they can work across multiple repos with multiple people and (hopefully) nothing breaks.
I can see where you're coming from with your question. It can seem like a lot of work to set up and maintain multiple virtual environments (venvs), especially when many of your projects might use similar or even the same packages.
However, there are some good reasons for using venvs even in cases where you might be tempted to just use a single global environment. One is that a clear separation between your projects helps with organization; it also matters as soon as different projects need different versions of the same package.
If you try to share a single venv among all of your projects, it becomes difficult to use different package versions where necessary, because the packages in that venv are shared by every project that uses it. If one project needs a different version of a package, you have to change the version in the venv, which then affects every other project using it. This makes it hard to keep track of which versions of which packages are being used where.
Sharing a single environment also makes it harder to share your code with others, because they would need to reproduce that same environment, which contains lots of things unrelated to the single project you are trying to share. This is confusing and inconvenient for them.
So, while it might seem like a lot of work to set up and maintain multiple virtual environments, in most cases it is worth the effort in order to have a clear separation between your projects and to avoid confusion when sharing your code with others.
It's the same principle as single-user vs multi-user systems, virtualization vs no virtualization, containers vs no containers, monolithic apps vs microservices, and so on: isolation helps you avoid conflicts, maintain order and easily identify a failing state, and it brings other benefits such as scalability and portability. Apply it where necessary, and always keep the KISS philosophy in mind as well: manage complexity, don't create more of it.
And, as you have already mentioned, do consider that resources are finite.
Besides, a set of projects that all share the same base of dependencies is of course not the best example of the need for separation.
In addition, the tooling keeps evolving precisely to avoid redundant copies of commonly used resources.
Well, there are a few advantages:
with virtual environments, you know what your project's dependencies are. Without them, your actual environment becomes a yarn ball of old and new libraries and dependencies; with a venv, if you want to deploy a thing somewhere else (which may just mean running it on the new computer you just bought), you can reproduce the environment it was working in.
you're eventually going to run into something like the following issue: project alpha needs version 7 of library A, but project beta needs library B, which requires version 3 of library A. If you install version 3, project alpha will probably break, but you really need to get B working.
it's really not that complicated, and it will save you a lot of grief in the long term.
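A sketch of how that alpha/beta conflict is usually sidestepped: one environment per project, each pinning its own version. The package name libA below stands in for the hypothetical "library A" above, so this won't install anything real as written:

# one venv per project, each with its own pinned version of the hypothetical libA
import subprocess, sys, venv
from pathlib import Path

for project, pin in [('alpha', 'libA==7.0'), ('beta', 'libA==3.0')]:
    env_dir = Path(project) / '.venv'
    venv.create(env_dir, with_pip=True)
    pip = env_dir / ('Scripts' if sys.platform == 'win32' else 'bin') / 'pip'
    subprocess.run([str(pip), 'install', pin], check=True)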
There are several motivations for venvs, or for their moral equivalent: conda environments.
1. author a package
You create a cool "scrape my favorite site" package which graphs a timeseries of some widget product. Naturally it depends on BeautifulSoup. You happened to have html5lib 1.1 lying around due to some previous project, so you tested with that. A user downloads your scrape-widget package from PyPI, happens to have lxml 4.7.1 available, and finds that scraping crashes when using that library. Wouldn't it have been better for your package to specify that the user shall run against the same deps that you tested with?
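One hedged way to do that, assuming a setuptools-based project; the package name comes from the example above and the version bounds are purely illustrative:

# setup.py -- sketch: declare the dependency versions you actually tested against
from setuptools import setup

setup(
    name='scrape-widget',
    version='0.1.0',
    install_requires=[
        'beautifulsoup4>=4.9,<5',    # illustrative bound, not a verified constraint
        'html5lib==1.1',             # the parser version the author tested with
    ],
)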
2. use a package
Same scenario, but now you're using someone's scrape-widget package. The author tested with lxml 4.7.1 but you have lxml 4.9.1, which behaves differently, and this makes the app behave differently, crashing in ways the author never saw.
3. use two packages
You want to run both scrape-frobozz-magic-widgets and scrape-acme-widget. Their authors tested using different versions of requests, and of lxml. Changing a dep changes the app behavior. You can only use one or the other, unless you're willing to re-run pip quite frequently.
4. collaborate on a team
You write code that has deps. So does your colleague. You have to coordinate things, so that testing on one laptop instills confidence that the tests would succeed on other laptops.
5. use CI
You have a teammate named Jenkins, and you want to communicate to him that you used a specific version of a dep when you saw the test succeed.
6. get a new laptop
Things were working. Then your laptop exploded, you got a new one, and you (quickly) want to see things work again. Some of your deps were downrev, due to recently released bugs and breaking changes. Reading a file full of dep versions from your GitHub repo lets you immediately reproduce the state of the world back when things were working.
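A sketch of producing such a file from inside the still-working environment, assuming Python 3.8+ for importlib.metadata; it is the rough equivalent of pip freeze > requirements.txt, and pip install -r requirements.txt replays it on the new laptop:

# record the exact versions of everything installed in the current environment
from importlib.metadata import distributions

with open('requirements.txt', 'w') as f:
    for dist in sorted(distributions(), key=lambda d: d.metadata['Name'].lower()):
        f.write(f"{dist.metadata['Name']}=={dist.version}\n")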
I ran a website on Zope/Plone for several years, until the server crashed catastrophically. I have no interest in bringing the website up again (nor am I convinced that I even could, since the versions of both Zope and Plone that it was based on have been obsolete for at least six years), but I'd love to be able to get the content out so that I can use it. It's stored in a Zope filesystem storage (i.e. Plone-2.5.3/zeocluster/server/var/Data.fs). Is there any set of tools out there that would allow me to write a Python script to save the content to files? Or is there another way to get at the content short of attempting to reinstall the whole website? Data.fs is over half a gigabyte, so there's a fair bit in there.
I'm running Python 2.7 and 3.4 on Ubuntu 14.04, in case that's relevant.
Bite the bullet and re-install Plone 2.5.3. While you can theoretically read the ZODB, its contents consist of Python pickles which won't make sense without the classes that were pickled.
Re-installing 2.5.3 will have some challenges, but if you start with the "Unified Installer" for 2.5.3 (https://plone.org/products/plone/releases/2.5.3), you'll at least have all the source pieces, including the matching Python.
Once you've got 2.5.3 running again, you'll have the code base you need to read and export the data via Python script.
Zope/Plone uses ZODB for storage; you can read it directly with the ZODB libraries. Try that on a copy of Data.fs.
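A minimal sketch of what that looks like with the ZODB package; as the other answer points out, the Plone objects inside will only unpickle cleanly if the original classes are importable, so expect broken objects without a matching Plone code base:

# open a copy of Data.fs read-only and inspect the root object
from ZODB.FileStorage import FileStorage
from ZODB.DB import DB

storage = FileStorage('Data.fs', read_only=True)   # always work on a copy
db = DB(storage)
conn = db.open()
root = conn.root()
print(list(root.keys()))   # a Zope Data.fs typically exposes an 'Application' entry here
conn.close()
db.close()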
I love Tim Pope's rails.vim, and I'm wondering if there's an equivalent vim plugin for Django. I'm especially looking for easy navigation of the Django file structure via vim command mode.
I use django.vim for Django Templates
There's nothing as well structured as that plugin.
As far as quick navigation goes, I have this in my vimrc:
http://code.djangoproject.com/wiki/UsingVimWithDjango#Mappings (That whole doc will give you some good starting points)
Also, I've published a couple of offerings on vim.org for some navigation tasks:
http://www.vim.org/scripts/script.php?script_id=2781 (For Reverse url and template jumping)
http://www.vim.org/scripts/script.php?script_id=2780 (completing imports)
Other than that, general-purpose vim fu can take you a long way.
I've created a repo that I want to add a lot of branches to for vim config (django/python centric). There are already a few branches and some path-hacking for settings.py. Feel free to fork/branch and share!
http://github.com/skyl/vim-config-python-ide
I haven't gotten around to adding nerdtree, but I think that is a really popular plugin for a filebrowser.
The Django wiki page on using vim now lists the pony.vim plugin, which seems to give you similar things to rails.vim, including the ability to jump between models, views, templates, etc. per app, as well as run some of the Django commands right from within vim. Part of it is that, quite simply, Django's folder structure is different from Rails's (less complicated? less defined? certainly a different ethos overall). But pony.vim seems to cover most of the bases.
rope-vim can make completions easier, though it does require just a tiny bit of customization, plus it adds direct access to the docs on autocomplete, which is quite nice.
(I'm answering this here because this is the top result on Google when searching for rails.vim equivalent for Django :P)
Update 10/8/2013
I'm now using a jedi-driven Python vim configuration (along with some tmux config):
https://github.com/JarrodCTaylor/imt_dotfiles
I also have a fairly complete vim config for django development (if you are interested).
https://github.com/toranb/vimfiles
I use rope-vim as mentioned by others, but I also have a few other useful plugins to ensure you can run unit tests (using nose) in your Django project with QTPY.
A few things I ran into that others never seem to mention when doing Python/Django development on OS X and Ubuntu (day-job dev / night-time dev) include:
https://github.com/lambdalisue/vim-django-support
https://github.com/jmcantrell/vim-virtualenv
If you ensure vim has the virtualenv activated (assuming you are using virtualenv), the rope plugin will know where to find your site-packages for quick "go to definition" lookups, along with other refactoring support.
I use this without any need for PyCharm now, as I get full autocompletion with rope-vim and supertab. I also have the command-t plugin for quick "find by file" lookups, etc.
I recently found that using basic ctags on OS X + Ubuntu enabled me to "find symbol" using the commands below. I also added a simple "recent files" lookup using the buffer finder, plus a few shortcuts to show a fuzzy-finder-style search from the directory of the file I currently have open; I use this to pull up other related files quickly.
find by symbol equiv (shows classes / methods in a fuzzy finder using your ctags file)
:FufTag
find in buffer (recent files)
:FufBuffer
show fuzzy finder w/ other files in the current dir
:FufFileWithCurrentBufferDir
I've been using zc.buildout more and more, and I'm encountering problems with some recipes that I have solutions to.
These packages generally fall into several categories:
Package with no obvious links to a project site
Package with links to a free hosting service like GitHub or Google Code
Setup #2 is better than #1, but not by much, because in both situations I would have to wait for the developer to apply these changes before I can use the updated package in buildout.
What I've been doing up to this point is basically forking the package, giving it a different name, and uploading it to PyPI, but this creates redundancy and I think it only aggravates the problem.
One possible solution is to use a personal package index server where I would upload updated versions of the code until the developer updates his or her package. This is doable, but it adds extra work that I would prefer to avoid.
Is there a better way to do this?
Thank you
Your "upload my personalized fork" solution sounds like a terrible idea. You should try http://pypi.python.org/pypi/collective.recipe.patch which lets you automatically patch eggs. Try setting up a local PyPi-compatible index. I think you can also point find-links = at a directory (not just a http:// url) containing your personal versions of those "almost good enough" packages. You can also try monkey patching the defective package, or take advantage of the Zope component model to override the necessary bits in a new package. Often the real authors are listed somewhere in the source code of a package, even if they decided not to put their names up on PyPi.
I've been trying to cut down on the number of custom versions of packages I use. Usually I work with customized packages as develop eggs by linking src/some.project to my checkout of that project's code. I don't have to build a new egg or reinstall every time I edit those packages.
A lot of Python packages used in buildouts are hosted in Plone's svn collective. It's relatively easy to get commit access to that repository.
I tried several Python IDEs (on the Windows platform), but in the end only Eclipse + PyDev met my needs. This set of tools is really comfortable and easy to use. I'm currently working on a fairly big project. I'd like to be able to use CVS or any other version control system installed on my local hard drive (I recently moved house and don't yet have internet access).
It doesn't matter to me whether it's CVS; it could be any other version control system. It would be great if it were not too hard to configure with Eclipse. Can anyone suggest a possible solution? Any hints?
Regards and thanks in advance for any clues. Please forgive my English ;)
Last time I tried this, Eclipse did not support direct access to local repositories in the same way that command-line cvs does, because command-line cvs has both client and server functionality, whereas Eclipse has only client functionality and needs to go through (e.g.) pserver, so you would probably need to have a CVS server running.
It turns out that I didn't really need it anyway, as Eclipse keeps its own history of all changes, so I only needed to do an occasional manual update to CVS at major milestones.
[Eventually I decided not to use cvs at all with Eclipse under Linux as it got confused by symlinks and started deleting my include files when it "synchronised" with the repository.]
If you don't mind a switch to Subversion, Eclipse has its Subclipse plugin.
As others have indicated, there are plugins available for Eclipse for SVN, Bazaar, Mercurial, and Git.
Even so, despite their presence, I find using the command line the most comfortable.
svn commit -m 'now committing'
Assuming you are not committing for more than several times a day, this should work well enough. Is there anything specific that is preventing you from using the command line?
I tried Eclipse + Subclipse and Eclipse + the Bazaar plugin. Both work very well, but I have found that the Tortoise versions of those version control tools are so good that I gave up on the Eclipse plugins. On Windows, the Tortoise XXX tools are my choice: they integrate with the shell (Explorer or Total Commander), change the icon overlay when a file has changed, show logs, compare revisions, and so on.
I would definitely recommend switching over to a different VCS—I prefer Mercurial, along with a lot of the Python community. That way, you'll be able to work locally, but still have the ability to publish your changes to the world later.
You can install TortoiseHg for Windows Explorer, and the MercurialEclipse plugin for Eclipse.
There's even a Mercurial for CVS users document to help you change over, and a list of mostly-equivalent commands.
I believe Eclipse does have CVS support built in - or at least it did have when I last used it a couple of years ago.
For further information on how to use CVS with Eclipse see the Eclipse CVS FAQ
I recently moved house and don't yet have internet access.
CVS and SVN are centralized version control systems. Rather than having to install them on your local system just for solo version control, you could use a DVCS like Mercurial or Git.
When you clone a Mercurial Repository, you have literally all versions of all the repo files available locally.
I use Eclipse with a local CVS repository without issue. The only catch is that you cannot use the ":local:" CVS protocol. Since you're on Windows, I recommend installing TortoiseCVS and then configuring the included CVSNT server as follows:
Control Panel: CVSNT
Repository configuration: create a repository and publish it
Note the Server Name and make sure it matches your hostname
Eclipse: Create a new repository location using the :pserver: connection type and point it to your local hostname
This (or any actual source control system) has the advantage over the Eclipse Local History of being able to associate check-in comments with changes, group changes into change sets, etc. You can use the Eclipse Local History to recover from minor mistakes, but it's no replacement for source control (and it expires as well: see Window -> Preferences -> General -> Workspace -> Local History).