I have many lines of georeferenced hydrological data with weekly resolution:
Station name, Lat, Long, Week 1 average, Week 2 average ... Week 52 average
Unfortunately, I also have some data with only monthly resolution:
Station name, Lat, Long, January average, February average ... December average
Rather than "reinventing the wheel," can anyone recommend a favorite module, package, or technique that would provide a reasonable interpolation of weekly values from monthly values? Linear would be fine, but it would be nice if we could use the coordinates to improve the interpolation based on nearby stations.
I've tagged this post with python because it's the language I've been using recently (although not its statistical functions). If the answer is "use a stats program like R," so be it, but I'm curious as to what's out there for Python. Thanks!
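For reference, the naive baseline I have in mind is plain linear interpolation with NumPy, ignoring the coordinates entirely; a minimal sketch, with made-up data for one station:

    import numpy as np

    # Assume each monthly average applies at the month's midpoint, expressed
    # in fractional weeks (non-leap year)
    month_days = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])
    month_mid_weeks = (np.cumsum(month_days) - month_days / 2.0) / 7.0

    weeks = np.arange(1, 53)           # target weekly grid: weeks 1..52
    monthly_avgs = np.random.rand(12)  # placeholder for one station's 12 monthly averages

    # Piecewise-linear interpolation; weeks outside mid-January..mid-December
    # are clamped to the January/December values
    weekly_est = np.interp(weeks, month_mid_weeks, monthly_avgs)

Something coordinate-aware (e.g. kriging) would still be preferable.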
I haven't had a chance to dig into it, but the hpgl (High Performance Geostatistics Library) provides a number of kriging (geospatial interpolation) methods:
Algorithms
Simple Kriging (SK)
Ordinary Kriging (OK)
Indicator Kriging (IK)
Local Varying Mean Kriging (LVM Kriging)
Simple CoKriging (Markov Models 1 & 2)
Sequential Indicator Simulation (SIS)
Correlogram Local Varying Mean SIS (CLVM SIS)
Local Varying Mean SIS (LVM SIS)
Sequential Gaussian Simulation (SGS)
If you are interested in expanding your experience into R, there are a number of good, well-used and well-documented packages out there. I would start by looking at the Spatial Task View, which lists the packages that can be used for spatial data. One of its paragraphs deals with interpolation. I am most familiar with automap/gstat (I wrote automap); gstat in particular is a powerful geostatistics package which supports a wide range of methods.
http://cran.r-project.org/web/views/Spatial.html
Integrating Python and R can be done in multiple ways, e.g. using system calls or an in-memory link using RPy. See also:
Python interface for R Programming Language
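As a tiny illustration of the in-memory route, a minimal rpy2 sketch (it assumes R and the gstat package are already installed):

    import rpy2.robjects as robjects

    # Evaluate R code from Python: load gstat and read back R's version string
    robjects.r('library(gstat)')
    print(robjects.r('R.version.string')[0])

From there you can pass data back and forth and call gstat's interpolation functions directly.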
I am looking into doing the same thing, and I found this kriging module written by Sat Kumar Tomer at AMBHAS.
There appear to be methods for producing variograms and performing ordinary kriging.
I'll update this answer if I use this and make further discoveries.
Since I originally posted this question (in 2012!), an actively developed Python kriging module has been released: https://github.com/bsmurphy/PyKrige
There's also this older option:
https://github.com/capaulson/pyKriging
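For anyone wanting a quick start, a minimal ordinary-kriging sketch with PyKrige; the coordinates and values below are made up (in the original problem they would be station longitude/latitude and one week's average):

    import numpy as np
    from pykrige.ok import OrdinaryKriging

    # Made-up station coordinates and values
    x = np.array([0.5, 1.2, 3.1, 4.7])  # e.g. longitudes
    y = np.array([0.8, 2.6, 1.9, 3.3])  # e.g. latitudes
    z = np.array([1.1, 0.7, 1.9, 1.4])  # e.g. one week's average

    ok = OrdinaryKriging(x, y, z, variogram_model="linear")

    # Estimate on a regular grid; ss is the kriging variance
    gridx = np.linspace(0.0, 5.0, 50)
    gridy = np.linspace(0.0, 4.0, 50)
    zhat, ss = ok.execute("grid", gridx, gridy)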
I am new to OpenMM and I would appreciate some guidance on the following matter:
Currently I am not interested in running molecular dynamics simulations; for starters I would just like to compute the forces or free energies between individual pairs of atoms, using OpenMM's AMBER force field for example. Essentially I would like to end up with a heat map of atom pairs, where the numbers represent the strength of the force or the value of the free energy.
I have trouble finding out how to access such lower-level functionality of OpenMM, where I could write a custom script that calculates only the desired forces given the 3D coordinates of atoms and their types. In the tutorials I have only found how to run fully fledged simulations by providing force field data and PDB files of molecular systems.
Preferably I would like to achieve this with Python.
Any concrete example or guidance is much appreciated.
I have found an answer in OpenMM's issue tracker on GitHub.
In short: there is no API to achieve exactly that in OpenMM, as what I am trying to do is not well defined from a purely physical/chemical perspective. My best bet is to compute something that looks like an energy based only on pairwise inter-atom distances, which can be queried from an OpenMM state like this (as suggested in the discussion referenced above):
    from openmm.unit import nanometer  # on older installs: from simtk.unit import nanometer

    state = simulation.context.getState(getPositions=True)
    positions = state.getPositions(asNumpy=True).value_in_unit(nanometer)
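Building on that, here is a minimal sketch of turning those positions into an energy-like pairwise heat map. The 12-6 functional form and the sigma/epsilon constants below are arbitrary placeholders of my own, not anything OpenMM computes for you:

    import numpy as np

    pos = np.asarray(positions)               # (n_atoms, 3), in nm
    diff = pos[:, None, :] - pos[None, :, :]  # pairwise displacement vectors
    dist = np.linalg.norm(diff, axis=-1)      # (n_atoms, n_atoms) distance matrix

    # Arbitrary 12-6 score just to fill the heat map; sigma and epsilon are
    # made-up constants, not per-atom force-field parameters
    np.fill_diagonal(dist, np.inf)            # avoid division by zero on the diagonal
    sigma, epsilon = 0.34, 1.0
    energy = 4 * epsilon * ((sigma / dist) ** 12 - (sigma / dist) ** 6)

The resulting matrix can be passed straight to a heat map plotter such as matplotlib's imshow.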
I have two sets of shapefiles with polygons. One set contains just the US counties I'm interested in, and this varies across firms and years. The other set contains the business areas of firms, which of course also vary across firms and years. I need the intersection of these two layers for each firm in each year. So far the function overlay(df1, df2, how='intersection') has accomplished my goal, but it takes around 300 s per firm-year. Given that I have a long list of firms and many years, this would take days to finish. Is there any way to improve this performance?
I notice that if I do the same thing in ArcGIS, the 300 s come down to a few seconds. But I'm a new ArcGIS user, not yet familiar with its Python scripting.
If you look at the current geopandas overlay source code, you'll see the overlay function has been updated to use R-tree spatial indexing internally! I don't think building the R-tree manually would be any faster (it will probably be slower) at this point in time.
See source code here: https://github.com/geopandas/geopandas/blob/master/geopandas/tools/overlay.py
Hopefully you've figured this out by now, but the solution is to use geopandas' R-tree spatial index. You can achieve orders-of-magnitude improvements by applying it appropriately.
Geoff Boeing has written an excellent tutorial.
http://geoffboeing.com/2016/10/r-tree-spatial-index-python/
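If you are stuck on a geopandas version without the built-in index, a minimal sketch of the manual pre-filter (the file paths are placeholders):

    import geopandas as gpd

    counties = gpd.read_file("counties.shp")    # placeholder paths
    firm_area = gpd.read_file("firm_area.shp")

    # Build the spatial index once, then use cheap bounding-box queries to keep
    # only plausible county candidates before the expensive exact intersection
    sindex = counties.sindex
    candidates = set()
    for geom in firm_area.geometry:
        candidates.update(sindex.intersection(geom.bounds))

    subset = counties.iloc[sorted(candidates)]
    result = gpd.overlay(firm_area, subset, how="intersection")

The bounding-box pass discards most counties cheaply, so the exact intersection only runs on a small subset.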
I'm trying to implement the well-known multilateration algorithm to pinpoint a radiation emission source, given a set of arrival times at various detectors. I have the necessary data, but I'm still having trouble implementing the calculation; I am relatively new to Python.
I know that, if I were to do this by hand, I would use matrices and carry out elementary row operations (EROs) to find my three unknowns (x, y, z), but I'm not sure how to code this. Is there a way to have Python perform EROs, or is there a better way to carry out the computation?
Depending on your needs, you could try:
NumPy, if you're interested in numerical solutions. As far as I remember, it can solve systems of linear equations (see numpy.linalg.solve and numpy.linalg.lstsq); I don't know how it deals with non-linear systems.
SymPy, for symbolic math. It solves linear equations symbolically, according to its main page.
The two above are "generic" math packages. I doubt you will (easily) find any dedicated (and maintained) library for your specific need. There was already a question on that topic here: Multilateration of GPS Coordinates
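To make the NumPy route concrete, here is a minimal sketch of the standard linearization: subtract the first range equation from the others and solve the resulting linear system by least squares. The receiver positions and ranges below are made up, and the ranges are assumed to be already converted from arrival times:

    import numpy as np

    # Receiver positions (rows) and measured ranges to the source; at least
    # four receivers are needed for three unknowns after differencing
    P = np.array([[0.0, 0.0, 0.0],
                  [10.0, 0.0, 0.0],
                  [0.0, 10.0, 0.0],
                  [0.0, 0.0, 10.0],
                  [10.0, 10.0, 10.0]])
    d = np.array([8.66, 8.66, 8.66, 8.66, 8.66])  # made-up ranges (source near (5, 5, 5))

    # Subtracting the first sphere equation from the others linearizes the system:
    # 2 * (p1 - pi) . x = di^2 - d1^2 + ||p1||^2 - ||pi||^2
    A = 2.0 * (P[0] - P[1:])
    b = d[1:] ** 2 - d[0] ** 2 + P[0] @ P[0] - np.sum(P[1:] ** 2, axis=1)

    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(x)  # estimated (x, y, z) of the source

Least squares handles the overdetermined case gracefully, which matters when the arrival times are noisy.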
So I have a 2D vector field {u(x,y,t), v(x,y,t)} representing the velocities of an unsteady flow at different instants in time. I don't have an analytical description of the flow, just the two components u and v over time.
I am aware of matplotlib.quiver and of the answer to this question, which suggests using it for plotting streamlines.
Now I want to also plot a couple of pathlines and streaklines of the vector field.
Is there any tool capable of doing this (preferably a Python package)? This seems to be a common task, but I couldn't find anything and don't want to waste time reinventing the wheel.
Currently, there is no functionality in matplotlib to plot streaklines. However, Tom Flannaghan's streamline plotting utility has been improved and merged into the codebase. It will be available in matplotlib version 1.2, which is to be released in the next few weeks.
At present, your best bet is to solve the streakline ODE given in the Wikipedia page you linked to. If you want to use Python for this, you can use scipy.integrate.odeint. This is essentially what matplotlib's streamplot does at present for streamlines.
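As a starting point, here is a minimal pathline sketch along those lines; the grid and velocity arrays are placeholders, with u and v assumed sampled on a regular x/y grid at the times in t:

    import numpy as np
    from scipy.integrate import odeint
    from scipy.interpolate import RegularGridInterpolator

    # Placeholder grid and velocity samples; u and v have shape (nt, ny, nx)
    t = np.linspace(0.0, 1.0, 11)
    y = np.linspace(0.0, 1.0, 21)
    x = np.linspace(0.0, 1.0, 21)
    u = np.ones((t.size, y.size, x.size))
    v = np.zeros((t.size, y.size, x.size))

    # Interpolators give u and v at arbitrary (t, y, x); zero outside the domain
    u_i = RegularGridInterpolator((t, y, x), u, bounds_error=False, fill_value=0.0)
    v_i = RegularGridInterpolator((t, y, x), v, bounds_error=False, fill_value=0.0)

    def rhs(p, ti):
        # Pathline ODE: dx/dt = u(x, y, t), dy/dt = v(x, y, t)
        xi, yi = p
        return [u_i([ti, yi, xi])[0], v_i([ti, yi, xi])[0]]

    path = odeint(rhs, [0.1, 0.5], t)  # positions of a particle released at (0.1, 0.5)

A streakline can then be approximated by releasing particles from the same point at successive times and integrating each one up to the current time.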
I'd like to know if there is any existing Python library for GPS trajectory pre-processing, such as compression, smoothing, filtering, etc.
Expanding on my comment, a Kalman filter is the usual choice for estimating position and velocity from noisy sensor readings.
Here's what Wikipedia has to say on the topic (emphasis mine):
The Kalman filter is an algorithm, commonly used since the 1960s for improving vehicle navigation (among other applications, although aerospace is typical), that yields an optimized estimate of the system's state (e.g. position and velocity). The algorithm works recursively in real time on streams of noisy input observation data (typically, sensor measurements) and filters out errors using a least-squares curve-fit optimized with a mathematical prediction of the future state generated through a modeling of the system's physical characteristics.
The Kalman filter is the basic version; there's also the extended Kalman filter and unscented Kalman filter (though my control systems lecturer never got around to telling us what those were actually used for.)
#stark has provided a link to an implementation of the Kalman filter in Python (I'm not sure of its quality). You may be able to find others, or roll your own with SciPy.
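If you do roll your own, here is a minimal constant-velocity Kalman filter sketch in plain NumPy for 2D position fixes; the noise magnitudes are made-up tuning values, not anything calibrated:

    import numpy as np

    dt = 1.0  # time step between GPS fixes
    # State is [x, y, vx, vy] with a constant-velocity motion model
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)  # we observe position only
    Q = 0.01 * np.eye(4)  # process noise (made-up tuning value)
    R = 5.0 * np.eye(2)   # measurement noise (made-up tuning value)

    x = np.zeros(4)        # initial state
    P = 100.0 * np.eye(4)  # initial uncertainty

    def kalman_step(x, P, z):
        # Predict the next state, then correct it with the measurement z
        x = F @ x
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
        return x, P

    for z in np.array([[0.0, 0.0], [1.1, 0.2], [1.9, -0.1]]):  # fake GPS fixes
        x, P = kalman_step(x, P, z)

The filtered states give you smoothed positions plus velocity estimates for free.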
Not GPS-specific, but NumPy offers general statistics and scientific algorithms. For example, if you want a best-fit line through a series of points, you would run a linear regression on the data.
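For instance, a minimal best-fit-line sketch (sample points made up):

    import numpy as np

    xs = np.array([0.0, 1.0, 2.0, 3.0])
    ys = np.array([0.1, 0.9, 2.1, 2.9])
    slope, intercept = np.polyfit(xs, ys, 1)  # degree-1 fit = linear regression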