Issues Installing Geopandas using Conda - python

I am trying to install the geopandas package for Python. I have tried every installation method listed in the Geopandas documentation: https://geopandas.org/en/stable/getting_started/install.html. I thought I had it working after creating a new environment (the package appears to be installed), but when I try to read a shapefile I get an error. I have also gone through the Stack Exchange answers on installing Geopandas, and none of the solutions work for me. See the code and error below. Can anyone help me install geopandas successfully?
I am running Python version 3.9.13, on Windows, using Spyder.
CODE:
import geopandas as gpd
# Set filepath
fp = "filepath.shp"
# Read file using gpd.read_file()
data = gpd.read_file(fp)
ERROR:
fp = "filepath.shp"
data = gpd.read_file(fp)
Traceback (most recent call last):
Input In [3] in <cell line: 1>
data = gpd.read_file(fp)
File ~\Anaconda3\lib\site-packages\geopandas\io\file.py:81 in read_file
if hasattr(features.crs, "to_dict"):
File ~\Anaconda3\lib\site-packages\fiona\collection.py:214 in crs
self._crs = self.session.get_crs()
File fiona/ogrext.pyx:634 in fiona.ogrext.Session.get_crs
File fiona/_err.pyx:259 in fiona._err.exc_wrap_pointer
CPLE_OpenFailedError: Unable to open EPSG support file gcs.csv. Try setting the GDAL_DATA environment variable to point to the directory containing EPSG csv files.
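The error message itself suggests pointing GDAL_DATA at the directory that holds the EPSG csv files. Below is a minimal sketch of that idea. The helper name find_gdal_data and the candidate directories are assumptions for illustration (typical conda layouts), not something confirmed for this installation, and newer GDAL releases may not ship gcs.csv at all. To be safe, set the variable before importing geopandas or fiona.

```python
import os
import sys
from pathlib import Path


def find_gdal_data(prefix=sys.prefix, marker="gcs.csv"):
    """Return the first candidate directory under prefix that contains marker, else None.

    find_gdal_data is a hypothetical helper; the candidate list is an assumption.
    """
    candidates = [
        Path(prefix) / "Library" / "share" / "gdal",  # conda on Windows (assumed layout)
        Path(prefix) / "share" / "gdal",              # conda on Linux/macOS (assumed layout)
    ]
    for candidate in candidates:
        if (candidate / marker).is_file():
            return candidate
    return None


gdal_data = find_gdal_data()
if gdal_data is not None:
    # Set this before `import geopandas`, to be safe.
    os.environ["GDAL_DATA"] = str(gdal_data)
print("GDAL_DATA:", os.environ.get("GDAL_DATA"))
```

If nothing is found, searching the Anaconda folder for gcs.csv by hand and passing that folder's parent as prefix (or setting GDAL_DATA directly) would be the next thing to try.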

Related

Writing a dask dataframe to parquet using to_parquet() results "RuntimeError: file metadata is only available after writer close"

I am trying to store a Dask dataframe in Parquet files. I have the pyarrow library installed.
import numpy as np
import pandas as pd
import dask.dataframe as dd
df = pd.DataFrame(np.random.randint(100, size=(100000, 20)), columns=list('ABCDEFGHIJKLMNOPQRST'))
ddf = dd.from_pandas(df, npartitions=10)
ddf.to_parquet('saved_data_prqt', compression='snappy')
However, running this code produces the following error:
---------------------------------------------------------------------------
ArrowNotImplementedError Traceback (most recent call last)
~\anaconda3\lib\site-packages\pyarrow\parquet.py in write_table(table, where, row_group_size, version, use_dictionary, compression, write_statistics, use_deprecated_int96_timestamps, coerce_timestamps, allow_truncated_timestamps, data_page_size, flavor, filesystem, compression_level, use_byte_stream_split, data_page_version, use_compliant_nested_type, **kwargs)
~\anaconda3\lib\site-packages\pyarrow\parquet.py in close(self)
682 self.is_open = False
683 if self._metadata_collector is not None:
--> 684 self._metadata_collector.append(self.writer.metadata)
685 if self.file_handle is not None:
686 self.file_handle.close()
.................. (The error output is long, so I shortened it. If the full text is needed, please let me know in the comments and I'll add it.)
~\anaconda3\lib\site-packages\pyarrow\_parquet.pyx in pyarrow._parquet.ParquetWriter.metadata.__get__()
RuntimeError: file metadata is only available after writer close
Does anybody know what causes this error and how to debug it?
Thank you!
I ran your exact code snippets and the Parquet files were written without any error. This code snippet also works:
ddf.to_parquet("saved_data_prqt", compression="snappy", engine="pyarrow")
I'm using Python 3.9.7, Dask 2021.8.1, and pyarrow 5.0.0. What versions are you using?
Here's the notebook I ran and here's the environment if you'd like to replicate my computations exactly.
I fixed the error by creating an isolated conda environment with Python 3.9 and pyarrow 5.0, then installing the corresponding Python kernel in Jupyter Notebook.
It's important to activate the environment in conda and then launch Jupyter Notebook from conda. Otherwise (for an unknown reason), if I open Jupyter Notebook from the Windows Start menu, the error persists.
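Since the fix depended on which environment the notebook actually runs in, it can help to print the interpreter path and package versions from inside the notebook. A small sketch using only the standard library follows; report_versions is a hypothetical helper name, and the package list is just an example.

```python
import sys
from importlib import metadata


def report_versions(packages):
    """Return the running interpreter and the installed version of each package (None if absent)."""
    report = {"python": sys.version.split()[0], "executable": sys.executable}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


for key, value in report_versions(["dask", "pyarrow"]).items():
    print(f"{key}: {value}")
```

If the executable path printed in the notebook is not inside the conda environment you created, the notebook is using a different installation than the one you fixed.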

Change where pyodbc expects libodbc.2.dylib to live (changing default odbc file locations)

When importing pyodbc, I get the following error:
❯ python
>>> import pyodbc
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: dlopen(/Users/pcosta/Documents/test/myenv/lib/python3.7/site-packages/pyodbc.cpython-37m-darwin.so, 2): Library not loaded: /usr/local/opt/unixodbc/lib/libodbc.2.dylib
Referenced from: /Users/pcosta/Documents/test/myenv/lib/python3.7/site-packages/pyodbc.cpython-37m-darwin.so
Reason: image not found
I know why this is happening, as I don't have libodbc.2.dylib in the expected location. The reason is I do not have permission to write to /usr/local/, so I have Homebrew installing into ~/.brew. This mostly works fine. I am even able to get both tsql and isql working as expected by following the steps outlined here: https://github.com/mkleehammer/pyodbc/wiki/Connecting-to-SQL-Server-from-Mac-OSX.
So I do have libodbc.2.dylib, it's just that it lives in /Users/pcosta/.brew/lib, not /usr/local/opt/unixodbc/lib.
The main question is: can I get pyodbc to look for libodbc.2.dylib (and other associated files) in another directory?
I have all the files needed and have configured them correctly, I just need to repoint pyodbc somehow.
Thanks!
Thanks in part to guidance from this GitHub issue, I was able to come up with a solution.
Assuming you have run brew install unixodbc:
Add the following exports (to .zshrc, .bashrc, or .bash_profile):
export LDFLAGS="-L/Users/pcosta/homebrew/opt/unixodbc/lib $LDFLAGS"
export CPPFLAGS="-I/Users/pcosta/homebrew/opt/unixodbc/include $CPPFLAGS"
export PKG_CONFIG_PATH="/Users/pcosta/homebrew/opt/unixodbc/lib/pkgconfig:$PKG_CONFIG_PATH"
Run pip install --no-binary pyodbc pyodbc to skip the prebuilt binary and build pyodbc from source against your own unixODBC.
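For anyone who prefers not to edit their shell profile, the same variables could be assembled in Python and passed to a one-off pip run. This is only a sketch: unixodbc_build_env is a hypothetical helper, and the prefix argument must be whatever directory your unixODBC actually lives in.

```python
import os


def unixodbc_build_env(prefix, base_env=None):
    """Return a copy of base_env with build flags pointing at a unixODBC prefix.

    unixodbc_build_env is a hypothetical helper, not part of pyodbc or pip.
    """
    env = dict(os.environ if base_env is None else base_env)
    lib, include = f"{prefix}/lib", f"{prefix}/include"
    # Prepend our flags, keeping anything already set.
    env["LDFLAGS"] = f"-L{lib} {env.get('LDFLAGS', '')}".strip()
    env["CPPFLAGS"] = f"-I{include} {env.get('CPPFLAGS', '')}".strip()
    existing = env.get("PKG_CONFIG_PATH", "")
    # PKG_CONFIG_PATH entries are separated by ':' on Unix-like systems.
    parts = [f"{lib}/pkgconfig"] + ([existing] if existing else [])
    env["PKG_CONFIG_PATH"] = ":".join(parts)
    return env


env = unixodbc_build_env("/Users/pcosta/homebrew/opt/unixodbc")
print(env["LDFLAGS"])
# The build itself could then be started with, for example:
# subprocess.run([sys.executable, "-m", "pip", "install",
#                 "--no-binary", "pyodbc", "pyodbc"], env=env, check=True)
```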

RuntimeError: b'no arguments in initialization list'

I've tried to solve this on my own but couldn't. I have run this code in every setup I can think of, including ArcGIS Pro, and the result is the same. I can't find this error message in any other issue. From similar issues, it seems some data files could be missing?
import geopandas as gpd
import json
import numpy as np
from shapely.geometry import LineString, Point, box
import ast
from pyproj import Proj
paths = road_features.SHAPE.map(lambda x: np.array(ast.literal_eval(x)["paths"][0]))
pathLineStrings = paths.map(LineString)
gdf = gpd.GeoDataFrame(road_features,geometry=pathLineStrings)
#gdf.crs = {'init': 'epsg:3857'}
gdf.crs = {'init': 'epsg:4326'}
gdf = gdf.to_crs({'init': 'epsg:4326'})
I get this error:
RuntimeError: b'no arguments in initialization list'
I also tried it in ArcGIS Pro and got the same error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\geopandas\geodataframe.py", line 443, in to_crs
geom = df.geometry.to_crs(crs=crs, epsg=epsg)
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\geopandas\geoseries.py", line 304, in to_crs
proj_in = pyproj.Proj(self.crs, preserve_units=True)
File "C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Lib\site-packages\pyproj\__init__.py", line 362, in __new__
return _proj.Proj.__new__(self, projstring)
File "_proj.pyx", line 129, in _proj.Proj.__cinit__
RuntimeError: b'no arguments in initialization list'
To check whether this is a pyproj error rather than a geopandas one, run:
import pyproj
pyproj.Proj("+init=epsg:4326")
If this raises the same RuntimeError, the error is coming from pyproj.
In that case, conda remove pyproj and install it with pip:
pip install pyproj
At least, this worked for me.
Update (July 30): I reinstalled from Miniconda and conda remove pyproj did not work for me. Instead, pip uninstall pyproj followed by pip install pyproj fixed everything.
The problem is probably in the pyproj installation that Anaconda provides on Windows. As Stephen said, the solution is to edit the path in "datadir.py" (located in ...Anaconda3\Lib\site-packages\pyproj).
The correct path is ".../Anaconda3/Library/share". Make sure the full path is complete (it may contain your username, etc.). I also needed to change \ to /.
This change worked for me. After making it, you need to restart Spyder (or whatever you use).
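Before editing datadir.py, it may be worth checking that the directory you plan to point it at really contains the epsg definitions file, and writing the path with forward slashes. A rough sketch follows; both helper names are made up for illustration, and the assumption that the file is called epsg applies to older PROJ/pyproj versions like the one in this question.

```python
from pathlib import Path, PureWindowsPath


def to_forward_slashes(windows_path):
    """Rewrite a Windows path with forward slashes, which are safe inside Python strings."""
    return PureWindowsPath(windows_path).as_posix()


def has_epsg_file(datadir):
    """Return True if datadir contains a file named 'epsg' (assumed name, older PROJ releases)."""
    return (Path(datadir) / "epsg").is_file()


candidate = to_forward_slashes(r"C:\Anaconda3\Library\share")
print(candidate, "contains epsg:", has_epsg_file(candidate))
```

Using a raw string (the r prefix) or forward slashes avoids backslashes being read as escape sequences in the path.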
Is an initial CRS defined?
I ran into the same problem only when I passed just the EPSG string: gdf.to_crs('epsg:4326').
As you show,
my_geoseries.crs = {'init' :'epsg:3857'}
should be the first step, followed by the transformation:
gdf = gdf.to_crs({'init': 'epsg:4326'})
If you are working in ArcGIS, you could also check in the layer properties whether the initial EPSG code is defined.
I'm using Pycharm.
I had to use a combination of both Stone Shi's remark and Dorregaray's.
import pyproj
pyproj.Proj("+init=epsg:4326")
> RuntimeError: b'no arguments in initialization list'
According to Stone Shi, the above proves that it's a pyproj error.
So I used PyCharm -> Settings and reinstalled pyproj.
Then
import pyproj
pyproj.Proj("+init=epsg:4326")
> RuntimeError: b'no arguments in initialization list'
So it is a pyproj error, but reinstalling pyproj through PyCharm -> Settings did not help me.
I then edited my C:\Anaconda3\Lib\site-packages\pyproj\datadir.py
from:
pyproj_datadir="C:/Anaconda3\share\proj"
to Dorregaray's:
pyproj_datadir="C:\Anaconda3\Library\share"
Then test again:
import pyproj
pyproj.Proj("+init=epsg:4326")
>Process finished with exit code 0
No Runtime Error!
Then test on my
wgs84 = data.to_crs({'init': 'epsg:4269'})
>Process finished with exit code 0
For me, upgrading pyproj and geopandas fixed this issue:
pip install pyproj --upgrade
pip install geopandas --upgrade
Using Geopandas, try this (it should work):
gdf = gpd.GeoDataFrame(gdf, geometry=gdf['geometry'])
gdf.crs = {'init' :'epsg:2154'}
gdf = gdf.to_crs({'init' :'epsg:4326'})
First rebuild your GeoDataFrame properly,
then define its initial coordinate reference system,
and finally convert it to the target one.
Don't forget to drop any NaN values.
I came across the same error. I was working with Python version 3.6.3 and Geopandas version 0.4.0. It was solved by using the following instead of df = df.to_crs({'init': 'epsg:4326'}):
df = df.to_crs(epsg=4326)
You can force-reinstall pyproj directly from pip using
pip install --upgrade --force-reinstall pyproj
instead of uninstalling and reinstalling it, which would also uninstall all the dependent libraries.

Jupyter gives IOError for .text file

I am using Jupyter notebook to import some data from a text file.
The folder from which I have imported the notebook has another file, data.txt, but when I try to use the loadtxt() function, the following error appears:
IOError Traceback (most recent call last)
<ipython-input-4-a129a96139d0> in <module>()
----> 1 our_data = loadtxt("data.txt")
IOError: data.txt not found.
I looked for a solution and the manual in the notebook stated that the file may not be in the same directory or folder as your notebook.
I checked twice and found that the folder on my computer contains both the notebook and the data.txt file in the same location.
What is the issue?
The file is simply not in the folder printed by this code:
import os
print(os.getcwd())
You need to either put the data.txt file in this folder or load the file with a path that points to the file.
As far as I know, the loadtxt() function is from numpy, so you should add import numpy as np and call it as np.loadtxt().
Hope this helps!
Can you try using a full path instead of just data.txt?
Maybe the current directory for jupyter is not where the notebook is.
Or you could try printing the current directory, or current directory contents like this to be sure:
import os;print(os.listdir("."))
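The two suggestions above can be combined into one quick check that shows exactly which path a bare file name resolves to and whether anything is there. A minimal sketch follows; resolve_from_cwd is a hypothetical helper name.

```python
import os
from pathlib import Path


def resolve_from_cwd(filename):
    """Return (absolute_path, exists) for filename as seen from the current working directory."""
    path = Path(os.getcwd()) / filename
    return path, path.is_file()


path, exists = resolve_from_cwd("data.txt")
print(f"{path} exists: {exists}")
```

If this reports that the file does not exist, either move data.txt to the printed location or pass the file's full path to loadtxt().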

Why do I get ImportError of xlwings when running RunPython from Excel on Mac?

I am trying to run a Python script from Excel 2016 on Mac. When I run the code nothing happens, and the status bar in Excel gets stuck on "Running". I have checked the xlwings log file and I can see that the error is
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/dano/Desktop/hello.py", line 3, in <module>
import xlwings as xw
ImportError: No module named xlwings
However, when I import xlwings from a Python shell it works fine, and I have also managed to write to the active workbook from the Python shell using xlwings. Why does it say that there is no module named xlwings when I clearly have it installed?
I am using the simple hello.py example from the xlwings documentation:
import numpy as np
import xlwings as xw
def world():
    wb = xw.Book.caller()
    wb.sheets[0].range('A1').value = 'Hello World!'
The .py file and the excel file are located on my Desktop. I am running Python 3.6 and have installed xlwings using pip3.
xlwings takes the default Python installation as defined in your .bash_profile file, see the docs.
That is, you either need to include python3 in your PATH (given that you used pip3) or you need to set the Python interpreter via xlwings.
To set it in your .bash_profile, you would do something like:
export PATH="/path/to/python3/bin:$PATH"
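To see whether the interpreter found first on PATH is the one that has xlwings, a quick comparison like the following may help. It is only a sketch: interpreter_report is a hypothetical helper, and the command name python is an assumption about what gets looked up in your setup.

```python
import importlib.util
import shutil
import sys


def interpreter_report(module_name, command="python"):
    """Compare the running interpreter with the one found on PATH and check for a module."""
    return {
        "running": sys.executable,
        "on_path": shutil.which(command),  # None if the command is not on PATH
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


for key, value in interpreter_report("xlwings").items():
    print(f"{key}: {value}")
```

Run it once with python and once with python3. If only the python3 run reports module_found as True, the default interpreter is a different installation from the one pip3 installed into.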
