Use GeoDataFrame as an osgeo.ogr DataSource - python

I have read a shapefile into a GeoDataFrame and made some modifications to it:
import geopandas as gpd
# Read shapefile into geodataframe
geodf = gpd.read_file("shapefile.shp")
# Do some "pandas-like" modifications to shapefile
geodf = modify_geodf(geodf)
However, I would also like to apply some functionality of osgeo.ogr module on it:
from osgeo import ogr
# Read shapefile into ogr DataSource
ds = ogr.Open("shapefile.shp")
# Do some "gdal/ogr-like" modifications to shapefile
ds = modify_ds(ds)
QUESTION: Is there any way to use or convert an already in-memory shapefile, currently in a form of a GeoDataFrame, as an osgeo.ogr.DataSource directly?
The way I do it so far is to save GeoDataFrame to file with to_file() and then osgeo.ogr.Open() it again, but this seems kind of redundant to me.

Yes, it is possible. You can pass the GeoDataFrame to ogr.Open as GeoJSON, which is a supported OGR vector format. Hence, you don't need to save a temporary file to disk:
import geopandas as gpd
from osgeo import ogr
# Read file into GeoDataFrame
data = gpd.read_file("my_shapefile.shp")
# Pass the GeoDataFrame into ogr as GeoJSON
shp = ogr.Open(data.to_json())
# Do your stuff with ogr ...
Hopefully this helps!

No. Only formats supported by OGR can be opened with ogr.Open().

Related

Text file writing: next line starting from last value

I have a list of networkx graphs, and I am trying to write a text file containing a massive edge list of all graphs. If you run the following code:
from torch_geometric.datasets import TUDataset
dataset = TUDataset(root='data/TUDataset', name='MUTAG')
Then go to data->TUDataset->MUTAG->raw; I am trying to replicate the raw files but using my own data.
My raw data is a MATLAB .mat file containing a struct whose first column A holds each individual graph's adjacency matrix, from which I create the networkx graphs:
from scipy.io import loadmat
import pandas as pd
raw_data = loadmat('data_final3.mat', squeeze_me=True)
data = pd.DataFrame(raw_data['Graphs'])
import networkx as nx
A = data.pop('A')
nx_graph = []
for i in range(len(A)):
    nx_graph.append(nx.Graph(A[i]))
I created the MUTAG_graph_indicator file using:
with open('graph_indicator.txt', 'w') as f:
    for i in range(len(nx_graph)):
        f.write((str(i)+'\n')*len(nx_graph[i].nodes))
If there is a way to do this either using python or MATLAB, I would greatly appreciate the help. Yes, torch_geometric does have from_networkx, but it doesn't seem to contain the same information as if I created the torch_geometric graphs the same way as the sample data.
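A minimal sketch of writing TUDataset-style raw files (the graph indicator plus a global edge list) from a list of networkx graphs; the two tiny path graphs stand in for the MATLAB-derived list, and the 1-based numbering follows the MUTAG raw-file convention:

```python
import networkx as nx

# Two small graphs standing in for the MATLAB-derived list (hypothetical data)
nx_graph = [nx.path_graph(3), nx.path_graph(2)]

# graph_indicator: one line per node, giving the 1-based id of its graph
with open('graph_indicator.txt', 'w') as f:
    for i, g in enumerate(nx_graph, start=1):
        f.write(f"{i}\n" * g.number_of_nodes())

# A.txt: edge list with node ids made global via a cumulative offset
offset = 0
with open('A.txt', 'w') as f:
    for g in nx_graph:
        mapping = {n: j + 1 + offset for j, n in enumerate(g.nodes)}
        for u, v in g.edges:
            f.write(f"{mapping[u]}, {mapping[v]}\n")
            f.write(f"{mapping[v]}, {mapping[u]}\n")  # undirected: both directions
        offset += g.number_of_nodes()
```

The same loop structure extends to node/edge label files if your .mat struct carries those columns as well.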

How do I create a dataframe from a geojson file without using geopandas?

I'm looking to turn a geojson file into a pandas dataframe that I can work with using python. However, for some reason, the geojson package will not install on my computer.
So I wanted to know how I could turn a geojson file into a dataframe without using the geojson package.
This is what I have so far
import json
import pandas as pd
with open('Local_Authority_Districts_(December_2020)_UK_BGC.geojson') as f:
    data = json.load(f)
Here is a link to the geojson file that I'm working with. I'm new to python, so any help would be much appreciated. https://drive.google.com/file/d/1V4WljiJcASqq9ksh8CHM_2nBC0K2PR18/view?usp=sharing
You could use geopandas. It's as easy as this:
import geopandas as gpd
gdf = gpd.read_file('Local_Authority_Districts_(December_2020)_UK_BGC.geojson')
You can turn the resulting geodataframe into a regular dataframe with:
import pandas as pd
df = pd.DataFrame(gdf)
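If geopandas really isn't an option, the file can be flattened with only json and pandas, since a GeoJSON file is just a FeatureCollection dict. A minimal sketch (the tiny inline FeatureCollection and its property names stand in for the linked file):

```python
import pandas as pd

# A tiny FeatureCollection standing in for the real file (hypothetical data);
# in practice you would build this dict with json.load(open(path))
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"LAD20NM": "Hartlepool", "LAD20CD": "E06000001"},
         "geometry": {"type": "Point", "coordinates": [-1.27, 54.68]}},
        {"type": "Feature",
         "properties": {"LAD20NM": "Middlesbrough", "LAD20CD": "E06000002"},
         "geometry": {"type": "Point", "coordinates": [-1.21, 54.54]}},
    ],
}

# Flatten each feature's nested properties/geometry dicts into dotted columns
df = pd.json_normalize(geojson["features"])
```

The geometry stays as raw coordinate lists in a `geometry.coordinates` column, which is fine for attribute work but loses the spatial operations geopandas would give you.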

netCDF file has no variables in python when importing with xarray

I'm VERY new to xarray, and I tried to import a satellite netCDF file into python using xarray, with this file: https://tropomi.gesdisc.eosdis.nasa.gov/data//S5P_TROPOMI_Level2/S5P_L2__NO2____HiR.1/2020/003/S5P_OFFL_L2__NO2____20200103T170946_20200103T185116_11525_01_010302_20200105T100506.nc
This is the code I used:
import xarray as xr
import numpy as np
import pandas as pd
tropomi = xr.open_dataset('test2.nc', engine = 'netcdf4')
tropomi
But the output does not present any variables, and has 53 attributes - why is this happening?
Thanks!
I figured it out. When you open the file without a group defined, you get the global attributes with no variables. You need to include group='PRODUCT' to get the data products, like this:
tropomi = xr.open_dataset('test2.nc', group='PRODUCT')

Reading datasets for geopandas

I am trying to read data for a dataset with the help of geopandas, but the interpreter writes:
File "/home/divinitytoffee/PycharmProjects/Radar/venv/lib/python3.5/site-packages/geopandas/datasets/__init__.py", line 33, in get_path
    raise ValueError(msg)
ValueError: The dataset 'resource/RAVL_vLuki/rd0a0h.00d' is not available
import geopandas as gpd
import fiona.ogrext
import pandas as pd
gpd_data = gpd.read_file(gpd.datasets.get_path('resource/RAVL_vLuki/rd0a0h.00d'))
So the question is: how do I fix this?
The data are presented in the form of *.00d files.
The geopandas.datasets.get_path function is meant to return the path of the few example datasets that are included in the geopandas library itself.
When reading your own file, you need to pass the path directly to read_file:
gpd_data = gpd.read_file('resource/RAVL_vLuki/rd0a0h.00d')
Do you want to import it as a data frame?
I don't know if this will help.
import geopandas as gpd
gpd_data = gpd.GeoDataFrame.from_file('resource/RAVL_vLuki/rd0a0h.00d')

python script converting .dat to json

I have a .dat file that I want to use in my script, which draws a scatter graph from that file's data. I have been manually converting .dat files to .csv for this purpose, but I find that unsatisfactory.
This is what I am using currently.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
filename=raw_input('Enter filename ')
csv = pd.read_csv(filename)
data=csv[['deformation','stress']]
data=data.astype(float)
x=data['deformation']
y=data['stress']
plt.scatter(x,y,s=0.5)
fit=np.polyfit(x,y,15)
p=np.poly1d(fit)
plt.plot(x,p(x),"r--")
plt.show()
Programmer friend told me that it would be more convenient to convert it to JSON and use it as such. How would I go about this?
Try using numpy's file-reading functions:
import numpy as np
# fromfile reads binary data and needs an explicit dtype, e.g. dtype=float
yourArray = np.fromfile('YourData.dat', dtype=float)
# loadtxt parses plain-text files and is more flexible than fromfile
yourArray = np.loadtxt('YourData.dat')
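For the JSON route the friend suggested, pandas can read a whitespace-delimited .dat file directly and write JSON once, so no manual .csv conversion is needed. A minimal sketch (the sample file contents and column names are made up to match the script above):

```python
import pandas as pd

# Write a tiny whitespace-delimited .dat file to demonstrate (hypothetical data)
with open('sample.dat', 'w') as f:
    f.write("deformation stress\n0.1 10.0\n0.2 20.5\n")

# sep=r'\s+' lets read_csv parse space/tab separated .dat files directly
df = pd.read_csv('sample.dat', sep=r'\s+')

# Convert to JSON once; 'records' gives one {column: value} object per row
df.to_json('sample.json', orient='records')

# Later runs can load the JSON straight back into a dataframe for plotting
data = pd.read_json('sample.json')
```

With that, the plotting script only needs `pd.read_json(filename)` in place of `pd.read_csv(filename)`.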
