koalas pip install fails on pyarrow dependency - python

I tried installing Databricks' new koalas package using the recommended pip install koalas, but it failed on the pyarrow install.
I then installed pyarrow and retried koalas, but it still failed on pyarrow. I visited the GitHub page, which informed me:
If this fails to install the pyarrow dependency, you may want to try
installing with Python 3.6.x, as pip install arrow does not work out
of the box for 3.7 https://github.com/apache/arrow/issues/1125.
I searched through the discussions and could not make sense of the "solutions", perhaps because there aren't any. I am using Python 3.7.3. The error messages I get are:
creating build/temp.macosx-10.7-x86_64-3.7
-- Runnning cmake for pyarrow
cmake -DPYTHON_EXECUTABLE=/anaconda3/bin/python -DPYARROW_BOOST_USE_SHARED=on -DCMAKE_BUILD_TYPE=release /private/tmp/pip-install-uhdr9agf/pyarrow
unable to execute 'cmake': No such file or directory
error: command 'cmake' failed with exit status 1
----------------------------------------
Failed building wheel for pyarrow
Running setup.py clean for pyarrow
Failed to build pyarrow
Installing collected packages: pyarrow, koalas
Found existing installation: pyarrow 0.13.0
Uninstalling pyarrow-0.13.0:
Successfully uninstalled pyarrow-0.13.0
Running setup.py install for pyarrow ... error
Complete output from command /anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/private/tmp/pip-install-uhdr9agf/pyarrow/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/tmp/pip-record-i7k4nwil/install-record.txt --single-version-externally-managed --compile:
...
-- Runnning cmake for pyarrow
cmake -DPYTHON_EXECUTABLE=/anaconda3/bin/python -DPYARROW_BOOST_USE_SHARED=on -DCMAKE_BUILD_TYPE=release /private/tmp/pip-install-uhdr9agf/pyarrow
unable to execute 'cmake': No such file or directory
error: command 'cmake' failed with exit status 1
----------------------------------------
Rolling back uninstall of pyarrow
...
Command "/anaconda3/bin/python -u -c "import setuptools, tokenize;__file__='/private/tmp/pip-install-uhdr9agf/pyarrow/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/tmp/pip-record-i7k4nwil/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/tmp/pip-install-uhdr9agf/pyarrow/
I have tried pip install koalas, sudo pip install koalas, and sudo -H pip install koalas and all have the same error message.
Has anyone found a solution to these errors? Or is koalas not (yet) compatible with 3.7?

You probably saw this, but the GitHub issue you mentioned regarding arrow says: "It does work for Python<3.7. For Python 3.7, you need to have installed the Arrow C++ packages via different means."
I was able to get koalas working on a single machine in Spark local mode with Python 3.6, and ran the GitHub sample script successfully. The koalas page also specifies "pyspark>=2.4.0 is recommended".
I expect it will work for you if you try 3.6.
import sys
print(sys.version)  # 3.6.8
import pandas as pd
import databricks.koalas as ks
import pyarrow as pa
pdf = pd.DataFrame({'x': range(3), 'y': ['a', 'b', 'b'], 'z': ['a', 'b', 'b']})
print(pdf.head())
#    x  y  z
# 0  0  a  a
# 1  1  b  b
# 2  2  b  b
df = ks.from_pandas(pdf)
df.columns = ['x', 'y', 'z1']
df['x2'] = df.x * df.x
df['x2']
# 0    0
# 1    1
# 2    4
# Name: x2, dtype: int64
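The cmake error in the question suggests that pip found no prebuilt pyarrow wheel for the interpreter and fell back to building from source. Below is a minimal diagnostic sketch, assuming the Python 3.7 cutoff quoted from the koalas page still applies. diagnose_pyarrow_build is a hypothetical helper written for illustration, not part of koalas or pyarrow.

```python
import shutil
import sys


def diagnose_pyarrow_build(version_info=sys.version_info, which=shutil.which):
    """Return hints about why a pyarrow source build may fail.

    Assumption: (3, 7) is the cutoff quoted in the question. It is not
    looked up on PyPI and may be out of date.
    """
    hints = []
    # Compare only (major, minor) against the assumed cutoff.
    if tuple(version_info[:2]) >= (3, 7):
        hints.append("Python >= 3.7: a prebuilt pyarrow wheel may be missing")
    # A source build of pyarrow invokes cmake, as the log above shows.
    if which("cmake") is None:
        hints.append("cmake not on PATH: building pyarrow from source will fail")
    return hints
```

Calling diagnose_pyarrow_build() with no arguments inspects the running interpreter and the current PATH. An empty list means neither of these two conditions was detected, not that the install is guaranteed to succeed.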

Related

brew install python3 fails on macOS 10.9

I tried to run brew install python3 on macOS 10.9, but I got the error below. What can I do? By the way, I am using a late-2009 Mac mini. Do I have to upgrade macOS to 10.10 or 10.11? I am not sure what the highest supported version is for my Mac mini.
==> Auto-updated Homebrew!
Updated 1 tap (homebrew/cask).
==> Installing dependencies for python: sphinx-doc, gdbm, makedepend, openssl, readline, sqlite, xz
==> Installing python dependency: sphinx-doc
==> Downloading https://files.pythonhosted.org/packages/ac/54/4ef326d0c654da1ed91341a7a1f43efc18a8c770ddd2b8e45df97cb79d82/Sphinx-1.7.8.tar.gz
Already downloaded: /Users/mi/Library/Caches/Homebrew/downloads/7724e7147d6bde066d1f33877c84313f0022f9e968578891746ce90feb85f567--Sphinx-1.7.8.tar.gz
==> Downloading https://files.pythonhosted.org/packages/ef/1d/201c13e353956a1c840f5d0fbf041bd45bbd678ea4843ebf25924e8984c/setuptools-40.2.0.zip
Already downloaded: /Users/mi/Library/Caches/Homebrew/downloads/16d798a247946c3552651aa8f309c94303bdc666a9598610c65187e5da345a77--setuptools-40.2.0.zip
==> python -c import setuptools... --no-user-cfg install --prefix=/usr/local/Cellar/sphinx-doc/1.7.8/libexec/vendor --single-version-externally-managed --record=installed.txt
Last 15 lines from /Users/mi/Library/Logs/Homebrew/sphinx-doc/01.python: 2018-09-05 22:19:48 +0900
python
-c
import setuptools, tokenize
__file__ = 'setup.py'
exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec')) --no-user-cfg install
--prefix=/usr/local/Cellar/sphinx-doc/1.7.8/libexec/vendor
--single-version-externally-managed
--record=installed.txt
Do not report this issue to Homebrew/brew or Homebrew/core!
Error: You are using macOS 10.9.
We (and Apple) do not provide support for this old version.
You will encounter build failures and other breakages.
Please create pull-requests instead of asking for help on Homebrew's
GitHub, Discourse, Twitter or IRC. As you are running this old version,
you are responsible for resolving any issues you experience.

pip install scikit-learn error

The error is:
Command "c:\users\samuel\appdata\local\programs\python\python37\python.exe -
u -c "import setuptools,
tokenize;__file__='C:\\Users\\Samuel\\AppData\\Local\\Temp\\pip-install-
bnoak8r5\\scikit-learn\\setup.py';f=getattr(tokenize, 'open', open)
(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code,
__file__, 'exec'))" install --record C:\Users\Samuel\AppData\Local\Temp\pip-
record-4yww_x3p\install-record.txt --single-version-externally-managed --
compile" failed with error code 1 in C:\Users\Samuel\AppData\Local\Temp\pip-
install-bnoak8r5\scikit-learn\
On the website it says this:
Python (>= 2.7 or >= 3.3),
NumPy (>= 1.8.2),
SciPy (>= 0.13.3).
I have Python 3.7, NumPy 1.14.3 and SciPy 1.1.0.
So is my Python version too new, along with my SciPy version?
When I download the file and try to install it, I get this error:
Command "python setup.py egg_info" failed with error code 1 in C:\Users\Samuel\AppData\Local\Temp\pip-req-build-6eptglns\
I suggest you download Anaconda and use conda install as your package manager for numpy, scipy, scikit-learn, etc. I can't guarantee it will fix this problem, but conda is generally better suited to installing the scientific packages that scikit-learn depends on. It would at least change this error, because it sets up a separate Python installation inside Anaconda's directories. Doing this has fixed the same problem for some people in the past.
I hope this helps.
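On the version question itself: the requirements listed in the question are minimums, so a newer NumPy or SciPy still satisfies them. Below is a minimal sketch of that comparison. parse_version and meets_minimum are hypothetical helpers that assume purely numeric dotted versions; real version strings can carry suffixes that this does not handle.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '1.14.3' into (1, 14, 3)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum.

    Comparing tuples of ints avoids the string-comparison trap where
    '1.14.3' sorts before '1.8.2'.
    """
    return parse_version(installed) >= parse_version(minimum)


# Versions from the question, checked against the listed minimums.
numpy_ok = meets_minimum("1.14.3", "1.8.2")
scipy_ok = meets_minimum("1.1.0", "0.13.3")
```

Both checks come out true for the versions in the question, which points away from the listed minimums as the cause of the failure.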

Scipy Python wheel

I have a problem installing SciPy on Python 2.7 on Windows, in IPython.
When I enter "pip install scipy", I first get this error message:
"Failed building wheel for scipy", and then at the end:
"
Command "c:\python27\python.exe -c "import setuptools,tokenize;__file__='c:\\us
ers\\admini~1\\appdata\\local\\temp\\pip-build-e3yebj\\scipy\\setup.py';exec(com
pile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __f
ile__, 'exec'))" install --record c:\users\admini~1\appdata\local\temp\pip-mwhxl
d-record\install-record.txt --single-version-externally-managed --compile" faile
d with error code 1 in c:\users\admini~1\appdata\local\temp\pip-build-e3yebj\sci
py
"
I do not know how to solve this problem. Thanks if you have any ideas.
You can download the wheel from this web site:
http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy
You need to pick the right one. For example, in
scipy-0.19.0-cp34-cp34m-win32.whl
cp34 means it will work with Python 3.4.
In your case, make sure you go with this one:
scipy-0.19.0-cp27-cp27m-win32.whl, or the 64-bit equivalent, depending on whether your Python build is 32 or 64 bit.
Once it is downloaded, go to the command line, change to the directory containing the file, and run:
python -m pip install scipy-0.19.0-cp27-cp27m-win32.whl
That should solve the problem.
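The tag reading described above can be sketched in a few lines. This is a simplified illustration, assuming a single CPython tag such as cp27 and the standard dash-separated wheel filename layout. parse_wheel_tags and matches_interpreter are hypothetical names, and pip's own compatibility check covers more cases than this.

```python
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # The last three fields are always python tag, ABI tag, platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_interpreter(filename, version_info=sys.version_info):
    """Return True if the wheel's python tag is cpXY for this interpreter."""
    expected = "cp{}{}".format(version_info[0], version_info[1])
    return parse_wheel_tags(filename)["python"] == expected
```

For example, matches_interpreter("scipy-0.19.0-cp27-cp27m-win32.whl", (2, 7)) is true, while the cp34 file from the answer would be rejected for the same interpreter. The platform tag (win32 here) still has to be checked against the bitness of the Python build.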
I also met this problem and I then installed a Python distribution like Anaconda instead!

Installing NumPy on Windows 8.1 with Python 2.7.x

I'm very new to Python and the programming world, and have been following the tutorials from newcoder.io. I have been following the instructions, but when I try to install NumPy I get an error.
"
error: Microsoft Visual C++ 9.0 is required (Unable to find vcvarsall.bat). Get it from http://aka.ms/vcpython27
Command "C:\Users\HP.virtualenvs\DataVizProj\Scripts\python.exe -c "import setuptools, tokenize;file='c:\users
\appdata\local\temp\pip-build-lsj5sj\numpy\setup.py';exec(compile(getattr(tokenize, 'open', open)(file).re
.replace('\r\n', '\n'), file, 'exec'))" install --record c:\users\hp\appdata\local\temp\pip-6jei4k-record\instal
cord.txt --single-version-externally-managed --compile --install-headers C:\Users\HP.virtualenvs\DataVizProj\includ
te\python2.7" failed with error code 1 in c:\users\hp\appdata\local\temp\pip-build-lsj5sj\numpy
"
I tried installing VCForPython27.msi from the given link, but I still get the same error.
Please help!
I recommend installing the Anaconda distribution of Python. It contains more packages than you can dream of, including numpy of course: http://continuum.io/downloads
The installation is as straightforward as installing the usual Python, no matter what OS you are on.
A good solution is to download the wheel file that matches your Python version from the unofficial Python binaries page.
Then, in the command prompt, go to the folder containing the .whl file and install the wheel file xxxxx.whl with:
>python -m pip install xxxxx.whl
This will install the library in the Lib\site-packages folder, you can check it afterwards.
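One way to do that check from Python rather than by browsing folders is sketched below. It assumes Python 3.4 or later, because importlib.util.find_spec does not exist on the Python 2.7 used in the question. installed_location is a hypothetical helper name.

```python
import importlib.util


def installed_location(module_name):
    """Return the file a top-level module would be loaded from, or None.

    find_spec locates the module without importing it, so a broken
    install does not raise here.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin
```

After a successful wheel install, installed_location("numpy") should return a path under the environment's site-packages folder. A result of None means the interpreter you ran cannot see the package, which often indicates it was installed into a different Python.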

error: command 'gcc' failed with exit status 1 installing Fatiando (Python Package)

I am trying to install fatiando, a geophysical modelling package for Python.
I have a Mac with OS X v10.9.5. I am getting all the dependencies for Fatiando (via Anaconda) by following the recommended installation suggested on the package site. I have Xcode installed.
I get a list of warnings and a final error message:
fatiando/gravmag/_polyprism.c:349:10: fatal error: 'omp.h' file not found
#include "omp.h"
^
1 warning and 1 error generated.
error: command 'gcc' failed with exit status 1
----------------------------------------
Command "//anaconda/bin/python -c "import setuptools, tokenize;__file__='/var/folders/32/mwq0jhwd3dx7vjqmm8hkljp80000gn/T/pip-QFjo6d-build/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/32/mwq0jhwd3dx7vjqmm8hkljp80000gn/T/pip-CY4vyX-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /var/folders/32/mwq0jhwd3dx7vjqmm8hkljp80000gn/T/pip-QFjo6d-build
Macintosh-5:fatiando matteoniccoli$
The full Terminal output (1100+ lines) can be found here.
I already contacted the developers; this does not seem to be a Fatiando issue.
Any suggestions?
UPDATE, March 15
When I first posted this I did not have Xcode, then I downloaded the latest Xcode from the App Store. I tried again and got the same message. Then I read this and downloaded gcc from here, and installed it directly. When I type gcc --version in the terminal, I get this: i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)
After that, though, I still get similar messages. Following another stackoverflow lead, I tried to install setuptools from here
using curl https://bootstrap.pypa.io/ez_setup.py -o - | python
Now I get a different error (at the end again of a long output) when I try to install fatiando:
fatiando/gravmag/_polyprism.c:349:10: fatal error: 'omp.h' file not found
#include "omp.h"
^
1 warning and 1 error generated.
error: command '/usr/bin/clang' failed with exit status 1
----------------------------------------
Command "//anaconda/bin/python -c "import setuptools, tokenize;__file__='/private/var/folders/32/mwq0jhwd3dx7vjqmm8hkljp80000gn/T/pip-build-m1ieVO/fatiando/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/32/mwq0jhwd3dx7vjqmm8hkljp80000gn/T/pip-9wI6Z7-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/var/folders/32/mwq0jhwd3dx7vjqmm8hkljp80000gn/T/pip-build-m1ieVO/fatiando
Someone from a forum asked me by email:
Re Fatiando, did you install Xcode command line tools? Eg see this
http://railsapps.github.io/xcode-command-line-tools.html
But when I try to verify that I’ve successfully installed the Xcode Command Line Tools as suggested there, I get this, so I assume that was not the issue:
-bash: /Library/Developer/CommandLineTools: is a directory
UPDATE MARCH 16
Tried solution suggested by Leo Uieda.
pip install --upgrade https://github.com/fatiando/fatiando/archive/kill-omp.zip went without a problem, but
pip install --upgrade https://github.com/fatiando/fatiando/archive/master.zip gets me back at square 1:
...
...
fatiando/gravmag/_polyprism.c:349:10: fatal error: 'omp.h' file not found
#include "omp.h"
^
1 warning and 1 error generated.
error: command '/usr/bin/clang' failed with exit status 1
----------------------------------------
Rolling back uninstall of fatiando
This is a very common problem with the Fatiando install, especially on Windows and Mac. OpenMP was introduced in PR 106 for the fatiando.gravmag forward modeling modules. It was easy to implement (just replace a range(ndata) with a prange(ndata)) and resulted in a 1.5-2x speedup over sequential execution. Also, the parallel execution was automatic. So it seemed like a good trade-off at the time ("Just install an extra dependency? What could go wrong?").
The problems began when the Anaconda gcc and the default Mac gcc didn't come with OpenMP. So Windows users had to install an extra dependency (in a very specific order, like a satanic ritual) and Mac users had to fend for themselves.
OpenMP and compiled Cython modules are being removed from Fatiando (#169) in favor of multiprocessing and numba. This would make it a pure Python package (no compilation necessary), and most of the install issues should be resolved.
In the meantime, PR 177 removes the OpenMP requirement from the Cython modules. This should fix your current install problems. To get the changes right away, you can install the version from the kill-omp branch by running:
pip install --upgrade https://github.com/fatiando/fatiando/archive/kill-omp.zip
If the above command doesn't work, it means that the pull request has been merged into the main branch of the project (master). If that's the case, you can install the latest version from the master branch:
pip install --upgrade https://github.com/fatiando/fatiando/archive/master.zip
These changes will be included in the future v0.4 release. Hope this fixes your problem.
(It would be useful to know which version of gcc you are using.)
gcc did not ship with OpenMP prior to v4.9.
This answer could help you update gcc using Xcode.
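To find out whether a given compiler can build OpenMP code at all, one option is to compile a tiny probe program, which reproduces the 'omp.h' file not found failure in isolation. Below is a minimal sketch, assuming the compiler accepts the -fopenmp flag as gcc and clang do. compiler_has_openmp is a hypothetical helper, not part of Fatiando or setuptools.

```python
import os
import subprocess
import tempfile

# Smallest C program that needs both the OpenMP header and runtime.
OPENMP_PROBE = (
    "#include <omp.h>\n"
    "int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }\n"
)


def compiler_has_openmp(compiler="gcc"):
    """Return True if `compiler` can compile and link an OpenMP program."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "probe.c")
        binary = os.path.join(workdir, "probe")
        with open(source, "w") as handle:
            handle.write(OPENMP_PROBE)
        try:
            result = subprocess.run(
                [compiler, "-fopenmp", source, "-o", binary],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # The compiler executable itself could not be run.
            return False
        return result.returncode == 0
```

Running compiler_has_openmp("gcc") and compiler_has_openmp("/usr/bin/clang") on the affected machine would show whether either compiler named in the error output supports OpenMP, independently of the Fatiando build.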
