On my Ubuntu 18.04, I have Python 3 and R installed. I am about to study some data science, and just found https://www.anaconda.com/distribution/.
Anaconda comes with Python, R, and some packages. Will installing Anaconda conflict with my existing installation of Python 3 and R?
Should I install Anaconda, or should I install the packages manually and individually on demand?
How do people in data science install the tools?
Thanks.
I have a system Python and R as well as Anaconda on the same OS as you, and they don't seem to conflict. Conda, the package and environment manager that comes with Anaconda, supposedly does not mix well with pip, meaning that within a given environment you should use one or the other. However, I have mixed them without any difficulties.
I prefer creating virtual environments the more standard way, but there are some scientific packages that are much easier to install using conda.
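If you are unsure which installation a given python command belongs to, a minimal sketch like the following can help. The function name describe_environment is made up for illustration, and the check assumes that conda activate has set the CONDA_PREFIX environment variable and that the environment was created with venv or a recent virtualenv.

```python
import os
import sys


def describe_environment():
    """Return a short label for the kind of Python environment in use."""
    # venv (and recent virtualenv): sys.prefix points at the environment,
    # while sys.base_prefix still points at the interpreter it was built from.
    if sys.prefix != sys.base_prefix:
        return "virtual environment"
    # Assumption: `conda activate` exported CONDA_PREFIX for the active env.
    conda_prefix = os.environ.get("CONDA_PREFIX")
    if conda_prefix and os.path.realpath(sys.prefix) == os.path.realpath(conda_prefix):
        return "conda environment"
    return "system or standalone installation"


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("environment:", describe_environment())
```

Running it once with the system python3 and once after activating a conda environment should show two different interpreter paths.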
Related
I am using a Mac. I am wondering whether it is possible to have two versions of TensorFlow co-existing on my computer. I pip-installed tensorflow-1.13 and tensorflow-1.8 in two Python virtual environments. However, there seem to be some problems ...
How do I find the corresponding C++ TensorFlow libraries on my Mac? Where are they installed? Thanks!
Yes, you can do this with virtual environments: each virtual environment will contain a different version of TensorFlow, and you can switch from one to the other easily. There are many solutions to create virtual environments, but some of the most popular are:
conda
virtualenv
pipenv
Conda is a general-purpose, cross-platform package manager, mostly used with Python, but it can also install many other software packages. A conda environment includes everything, including Python itself and the system binaries for the libraries you use. So you can have different conda environments with different versions of Python and different versions of every package you want, including TensorFlow and any C++ library your code relies on. You can install Anaconda, which is a bundle that includes conda + Python + many scientific libraries. Or you can install Miniconda, which includes the bare minimum to run conda.
Virtualenv is a Python library which allows you to create virtual environments strictly for Python.
pipenv is also a Python library that seems to be gaining a lot of momentum right now, and includes a lot of the functionality of virtualenv.
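To illustrate the idea of several isolated environments side by side, here is a small sketch that uses the standard-library venv module. Note that venv is a stand-in here, not one of the three tools listed above, and the helper name make_envs and the environment names are hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def make_envs(base_dir, names):
    """Create one isolated environment per name and return their paths."""
    created = {}
    for name in names:
        env_dir = Path(base_dir) / name
        # with_pip=False keeps the sketch fast and offline; pass
        # with_pip=True to bootstrap pip into each environment.
        venv.create(env_dir, with_pip=False)
        created[name] = env_dir
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        envs = make_envs(tmp, ["tf-old", "tf-new"])
        for name, path in envs.items():
            # Each environment gets its own pyvenv.cfg and site-packages.
            print(name, "->", path, (path / "pyvenv.cfg").exists())
```

Each directory holds its own site-packages, so a different TensorFlow version could be installed into each one without the two interfering.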
If you are a beginner, I would recommend going with conda. You will usually run into fewer issues.
First, download and install either Anaconda or Miniconda.
Next, create a virtual environment:
conda create --name myenv
Then activate this virtual environment:
conda activate myenv
Now you can install all the libraries you need:
conda install whatever-library-you-need
However, not all libraries are available in conda. For example, TensorFlow 2.0 is not there yet (as of May 13th 2019). But that's okay, you can also use pip!
pip install --pre tensorflow
This will install TF 2.0 alpha.
You can then create another environment and install a different version of TF.
You can read more about the interaction between conda and pip on the web, but the short story is that they work well together as long as you use pip last: install everything you can with conda, and finish with pip.
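After activating an environment, you may want to confirm which version of a package it actually sees. The following is a minimal sketch using importlib.metadata from the standard library (Python 3.8 or later); the helper name installed_version is made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run this inside each activated environment to compare what it sees.
    for name in ("tensorflow", "numpy"):
        print(name, installed_version(name))
```

Running it in two different environments should report the two different TensorFlow versions, or None where the package is missing.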
I am about to create a python interface in R with the package Reticulate. In order to access the python functions in R, the respective python packages need to be installed.
Two questions came to my mind:
1) If you use the reticulate package, does the Anaconda package need to be installed? Or is it sufficient to install the python packages only?
2) Is it possible to install python packages in R, similar to install.packages("r_package")?
Does anyone have experience with this Topic? Thanks in advance!
I'll add a little bit of nuance to the previous answer.
Like @f0nzie said, Anaconda is not a package but a Python distribution that ships with the conda package manager. Ideally, you will create a conda environment to assist with your package management and version control. The documentation for conda environments is here.
Now, you can install Python packages into your conda environment from R. That is possible using reticulate::conda_install(envname, packages). The documentation for conda_install() can be found here.
1) The R package reticulate can work with the default Python or with Anaconda2 or Anaconda3. If you want Anaconda to work with R, you will have to install Anaconda first. Once it is installed, call library(reticulate) and run py_config() or reticulate::py_discover_config(), which will give you the paths and environment used by the Python installation. Then, once you know the Python path, add a line like use_python("/opt/miniconda2/bin/python") right after library(reticulate), and you are in business.
2) To install Python packages so that R (or reticulate) can see them, you have to install them as regular Python packages from a terminal or console, not from R. Example: conda install numpy to install numpy, or conda install scipy to install scipy, and so on.
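Before pointing reticulate at an interpreter, it can help to check from the Python side that the packages you installed are visible to that same interpreter. This is only a sketch; the helper name missing_modules is hypothetical, and numpy and scipy are just the examples from the answer above.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names this interpreter cannot import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # This path is what you would hand to use_python() in R.
    print("interpreter:", sys.executable)
    print("missing:", missing_modules(["numpy", "scipy"]))
```

Run it with the exact Python binary you plan to pass to use_python(); an empty "missing" list suggests reticulate should be able to import those modules too.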
I am just doing all this in a Docker container rocker/rstudio. It should be easier in a standard OS.
Here are step-by-step instructions: rstudio reticulate
Cheers!
If you need a specific version of a Python module, append == and the version number to the module name. For example, the following installs specific versions of three modules using pip:
reticulate::conda_install(c("PyMuPDF==1.14.20", "PyPDF2==1.26.0", "reportlab==3.5.23"),
envname = "myenv", pip = TRUE)
The Anaconda website mentions that the installer includes hundreds of pre-built packages. Even the installer size of 500 MB hints that it should contain some pre-built packages.
Yet when we want to use any of the packages, we have to install them with a command, e.g. conda install nltk
This basically downloads the package from the internet and then installs it, which seems counterintuitive, since the website already says that nltk is included in the installer.
Can anybody throw some light on this?
There are two parts:
Conda - Package and environment management system. This gives you the conda command and serves a similar function to pip and virtualenv.
Anaconda - Python package distribution containing hundreds of scientific packages that are tested and verified to work together.
If you install Miniconda, you will just get conda without the full Anaconda distribution. If you install Anaconda, you will get both the conda management system and the Python distribution. You can also get Anaconda after only having installed conda by running conda install anaconda.
Title basically states it all. I upgraded my version of Python in order to hopefully play more nicely with Mac OS 10.9, but am now unable to use some modules I need for my work (NumPy, Pandas, SciPy, Scikit-Learn, etc.) Does this upgrade automatically wipe out any previously installed modules? Do I just need to install them again? Thanks in advance.
When you upgraded, it created a new site-packages directory structure. Your packages are not installed for the new version, so yes, you need to reinstall them into it.
Before you do that, take a good look at virtual environments rather than installing the modules and packages globally.
http://docs.python-guide.org/en/latest/dev/virtualenvs will get you started, then google virtualenvwrapper.
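To see why packages seem to vanish after an upgrade, you can ask each interpreter where its own site-packages directory lives and what is installed there. This is a rough sketch using the standard library (Python 3.8 or later); the function names are made up for illustration.

```python
import sysconfig
from importlib import metadata


def site_packages_dir():
    """Return the directory where this interpreter installs pure-Python packages."""
    return sysconfig.get_paths()["purelib"]


def installed_requirements():
    """Return sorted 'name==version' strings for every installed distribution."""
    pins = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip entries with broken or missing metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins)


if __name__ == "__main__":
    print(site_packages_dir())
    print("\n".join(installed_requirements()))
```

Running it with the old and the new interpreter should print two different directories, and the list from the old one can serve as a checklist of what to reinstall.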
I would recommend you try out the Anaconda Python distribution. It comes with all of these packages pre-installed, and it's free. Also, in addition to pip, you can use the conda package manager, which is much better for scientific packages. See http://technicaldiscovery.blogspot.com/2013/12/why-i-promote-conda.html for an explanation.
With conda, you can install numpy/scipy/pandas/etc with conda install numpy scipy pandas and it just works, and takes about 10 seconds. No compilation necessary (OTOH pip install scipy can take over 15 minutes, requires a fortran compiler, and is generally very tricky).
link: http://continuum.io/downloads
I don't know what I did, but now when I use pip to install a package, it installs it for Python 3 (the python3.3 folder) rather than for Python 2.7.
Another problem: I installed django_debug_toolbar, and now my Django version is 1.6.4 instead of the 1.3 I had installed.
Now I can't remove Django 1.6.4 with pip. Do you have a solution?
Learn to use virtualenv. It lets you have different environments, each with an isolated version of Python and its own set of installed packages. Each virtual environment you create has pip installed by default.
You messed things up (as you know very well), probably by reinstalling pip for another version of Python.
You might find several versions of pip on your system. Check which version of Python each one uses (on Linux, look at the shebang on the first line). Use the explicit path to the proper pip to manage packages for the corresponding Python.
Often people install pip and rename it or give it an alias, with names like pip33 or pip27.
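The shebang check described above can be sketched in a few lines. This assumes a Unix-like system where pip launchers are small scripts; the function name pip_shebangs is hypothetical, and compiled launchers (for example on Windows) are simply reported as None.

```python
import os
from pathlib import Path


def pip_shebangs(path_dirs):
    """Map each pip* script found in `path_dirs` to its shebang line."""
    found = {}
    for directory in path_dirs:
        folder = Path(directory)
        if not folder.is_dir():
            continue
        for script in sorted(folder.glob("pip*")):
            if not script.is_file():
                continue
            try:
                with open(script, "rb") as handle:
                    first_line = handle.readline().decode("utf-8", "replace").strip()
            except OSError:
                continue  # unreadable entry, skip it
            # Only launcher scripts start with '#!'; anything else maps to None.
            found[str(script)] = first_line if first_line.startswith("#!") else None
    return found


if __name__ == "__main__":
    # Drop empty PATH entries so the current directory is not scanned by accident.
    dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    for script, shebang in pip_shebangs(dirs).items():
        print(script, "->", shebang)
```

The interpreter named in each shebang is the Python that the corresponding pip installs packages for.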
Note that virtualenv allows you to create different environments (with different Python versions) without needing to install virtualenv for each of these Pythons.
With virtualenv I would also highly recommend using virtualenvwrapper which adds a few very handy commands.
My problem came when I installed django_debug_toolbar. When I pip-installed django-debug-toolbar, the latest version of Django was installed automatically.
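This behaviour is presumably explained by the package declaring Django as a dependency, so that pip replaces a Django that does not satisfy the declared version range. A sketch for inspecting what an installed package declares, using importlib.metadata (Python 3.8 or later), is shown here; the helper name declared_requirements is made up for illustration.

```python
from importlib import metadata


def declared_requirements(package):
    """Return the requirement strings `package` declares, or [] if unavailable."""
    try:
        # requires() returns None when the package declares no dependencies.
        return metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []


if __name__ == "__main__":
    for requirement in declared_requirements("django-debug-toolbar"):
        print(requirement)
```

If the printed list contains a Django entry with a minimum version, that would account for the automatic upgrade.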