Pip build option to use multicore - python

I found that pip only uses a single core when it compiles packages. Since some Python packages take a while to build, I'd like to use multiple cores on the machine. With a Makefile, I can do that with the following command:
make -j4
How can I achieve the same thing with pip?

The Ultimate Way to Resolve This Problem
Because all the C / C++ files are compiled by the make command, and make has an option that specifies how many CPU cores should be used to compile the source code, we can play a trick on make.
Back up your original make command:
sudo cp /usr/bin/make /usr/bin/make.bak
write a "fake" make command, which will append --jobs=6 to its parameter list and pass them to the original make command make.bak:
make.bak --jobs=6 $#
So after that, not even compile python with c libs, but also others contain c libs would speed up on compilation by 6 cores. Actually all files compiled by using make command will speed up.
And good luck.
Use: --install-option="--jobs=6" (pip docs).
pip3 install --install-option="--jobs=6" PyXXX
I have the same need to speed up the compile step with pip install. My target package is PySide. At first I used pip3 install pyside; it took nearly 30 minutes (AMD 1055T 6-core, 10 GB RAM) with only one core at 100% load.
There were no clues in pip3 --help, but I found lots of options like pip install -u pyXXX; I didn't know what '-u' was, and that parameter wasn't in pip --help either. I tried pip3 install --help and there was the answer: --install-option.
I read PySide's source code and found another clue: OPTION_JOBS = has_option('jobs'). I put ipdb.set_trace() there and finally understood how to use multiple cores to compile with pip install.
It took me about 6 minutes.
--------------------------update------------------------------
As suggested in the comments below, I finally used a trick like this:
cd /usr/bin
sudo mv make make.bak
touch make
Then edit make (vim make, or any other way you like) and type this:
#!/bin/sh
make.bak --jobs=6 "$@"
I'm not very familiar with bash, so I'm not sure this is exactly right (I'm writing this comment on Windows); you also need to make the new file executable with chmod +x /usr/bin/make. The key is to rename make to make.bak, then create a new make that calls make.bak with the added parameter --jobs=6.

Tested, this works:
https://stackoverflow.com/a/57014278/6147756
Single command:
MAKEFLAGS="-j$(nproc)" pip install xxx
Enable for all commands in a script:
export MAKEFLAGS="-j$(nproc)"

From what I can tell, it does not look like pip has this ability, but I may be mistaken.
To do multiprocessing in Python you use the multiprocessing package. Here is a guide I found about how to do it: http://pymotw.com/2/multiprocessing/basics.html, and the Python docs cover the module as well. I also found the question Multiprocessing vs Threading Python useful to make sure that multiprocessing does what I thought it does, namely take advantage of multiple CPUs.
I have gone through the pip source code (available here) looking for a reference to the multiprocessing package and did not find any use of it. This suggests that pip does not use or support multiprocessing. From what I can tell, the /pip/commands/install.py file is the one of interest for your question, as it is called when you run pip install <package>. For this file specifically, the imports are
from __future__ import absolute_import
import logging
import os
import tempfile
import shutil
import warnings
from pip.req import InstallRequirement, RequirementSet, parse_requirements
from pip.locations import virtualenv_no_global, distutils_scheme
from pip.basecommand import Command
from pip.index import PackageFinder
from pip.exceptions import (
    InstallationError, CommandError, PreviousBuildDirError,
)
from pip import cmdoptions
from pip.utils.deprecation import RemovedInPip7Warning, RemovedInPip8Warning
which, as you can see, contains no reference to the multiprocessing package, but I did check all of the other files just to be sure.
Furthermore, I checked the pip install documentation and found no reference to installing using multiple cores.
TL;DR: Pip doesn't do what you are asking. I may be wrong, as I didn't look at the source for that long, but I'm pretty sure it just doesn't support it.

Related

How to fix the error that I receive when installing numpy in Python?

I have installed Python 3.10.6 and PyCharm Community Edition.
Everything was working until I tried to use numpy.
pip3 install numpy
import numpy as np
This is the error message:
pip3 install numpy
^^^^^^^
SyntaxError: invalid syntax
I have also tried pip install numpy, pip2 install numpy, and pip3 install numpy scipy, but I get the same error. Reinstalling both Python and PyCharm didn't help.
Ah, I understand your problem more specifically now. I also use PyCharm, and this same problem happened to me. It was very frustrating, and took me lots of reading to fix it.
PyCharm and other IDEs (integrated development environment) have something called 'run configurations' attached to each file you are working on. These run configurations basically specify which directory on the hard drive the file will use to read and execute your commands. The directory will contain the libraries you need to run your code.
They use these configurations to make it easy to quickly choose which directory (and which libraries) you want a certain file to use. You must specify these configurations in PyCharm for your specific file to run using Numpy. The great thing about PyCharm is that you can actually specify libraries you want to use within the IDE itself (and bypass having to specify a computer-native directory).
Here's How
Go to PyCharm Preferences
Expand the arrow that says 'Project: (your project name)'
Click on 'Python Interpreter'
Click the small '+' symbol
Type in 'numpy' to search for the library (package)
Click install package
Now try to run your file and it should be good to go!
Note that you must do this for each package you wish to use when accessing your file, and as you advance your programming knowledge it will be necessary to learn how to specify the directory you want PyCharm to run the Python Interpreter from. Since you are only using one library though, I think this solution should be fine for the time being.
You should install numpy with that command in your bash/zsh shell, not inside the Python interpreter.
pip3 install numpy
The Python script can then import it.
To test, run pip3 install numpy in your shell, then run python to open a Python shell. You'll see the prompt:
>>>
Type import numpy as np and be sure it imports. It should now.
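A quick sanity check you can paste into that Python shell (printing the version is just one way to confirm the install, not something from the original answer):
import numpy as np

# If this prints a version string, numpy is installed and importable.
print(np.__version__)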
It can be maddeningly confusing when first starting out with python and trying to figure out how to download libraries. Here are a few critical things I wish I understood before starting my Python journey, as well as the answer to your question.
Python is the language, and the files that support its functionality are located on the hard drive.
Libraries (like Numpy) can be thought of almost as interpreters (note that we are not using the computer definition of 'interpreter') and are stored alongside the Python files on the hard drive. They give Python more flexibility in terms of what it is able to do by increasing what commands Python is able to understand.
Once a library is downloaded, it must be imported to your Python script before you start writing library-specific commands. Importing a library tells Python: "Hey, I'm going to be writing some commands that you haven't seen before, but here is the library with the commands and what they want you to do in a way that you understand."
'pip' is Python's installer for these libraries.
Ex) I have a csv file that I want to read. I learn that Pandas has a csv reader function:
pandas.read_csv()
If I were to type this function in a script, Python would have no idea what I meant. But if I were to download Pandas, then import it into my script, Python would understand exactly what I'm saying.
How to Download Numpy
Assuming you are on Windows, open the terminal (command prompt) and run the command:
py -m pip install numpy
If you don't already have it, the terminal will print a few lines and end with something like 'Successfully installed numpy'.
You can check to see if you have it by running the following command in your terminal:
py -m pip list
This command provides you with a list of all the downloaded libraries. You can check among them to make sure Numpy is downloaded.
Importing Libraries
Once you've downloaded the libraries you need, you need to import them into your script (the Python file where you are writing your code) in order for it to run properly. This is accomplished using the import command. One important thing to note is that you can import libraries and assign them a nickname using the as modifier.
Ex) Back to that csv file I want to read. I don't want to type 'pandas' in front of all the Pandas commands, so when I import it into the script I abbreviate it as 'pd':
import pandas as pd
pd.read_csv()
See the difference?
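The same idea works for NumPy; here is a small sketch with made-up data, just to illustrate the np alias:
import numpy as np

data = np.array([1, 2, 3, 4])  # made-up example data
print(np.mean(data))           # prints 2.5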
TL;DR for Your Scenario
Go to the terminal, and use the py -m pip list command to check if you have Numpy downloaded. If not, run the py -m pip install numpy command. Then go to your script with your actual python code, and import numpy with the import numpy command. Common Python practice is to import numpy as np, FYI.
Hope this clears things up.
It may say that you need to upgrade pip, which is fine, and it should give you a command to run that will upgrade pip to the newest version.

Python Libraries - Making them work on PCs that aren't mine

Apologies if this is a very stupid question, but I am new to Python and, although I have done some googling, I cannot think how to phrase my search query.
I am writing a Python script that relies on some libraries (pandas, numpy and others). At some point in the future I will be passing this script to my university so they can mark it. I am fairly confident that the lecturer will have Python installed on their PC, but I cannot be sure they will have the relevant libraries.
I have included a comments section at the top of the script outlining the install instructions for each library, but is there a better way of doing this so I can be sure the script will work regardless of which libraries they have?
An example of my script header
############### - Instructions on how to import libraries - ###############
# install openpyxl using the command: pip install openpyxl
#########################################################################
import openpyxl
import random
import datetime
Distributing code is a huge chapter where you can invest enormous amounts of time in order to get things right, according to current best practices and what not. I think there are different degrees of rightness to solutions to your problem, with more rightness meaning more work. So you have to pick the degree you are comfortable with and you are good to go.
The best route
Python supports packaging, and the safest way to distribute code is to package it. This allows you to specify requirements in a way that installing your code will automatically install all dependencies as well.
You can use existing cookiecutters, which are project-templates, to create the base you need to build packages:
pip install cookiecutter
cookiecutter https://github.com/audreyr/cookiecutter-pypackage
Running this, and answering the ensuing questions, will leave you with python code that can be packaged. You can add the packages you need to the setup.py file:
requirements = ['openpyxl']
Then you add your script under the source directory and build the package with:
pip wheel .
Let's say you called your project my_script; you now have a fresh my_script-0.1.0-py2.py3-none-any.whl file that you can send to your lecturer. When they install it with pip, openpyxl will be installed automatically in case it isn't already.
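For orientation, here is a minimal stand-alone setup.py sketch; the project name and version are assumptions, and the cookiecutter template generates a more complete version of this for you:
from setuptools import setup, find_packages

setup(
    name='my_script',               # hypothetical project name
    version='0.1.0',
    packages=find_packages(),
    install_requires=['openpyxl'],  # installed automatically alongside your package
)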
Unfortunately, if they should also be able to execute your code you are not done yet. You need to add a __main__.py file to the my_script folder before packaging it, in which you import and execute the parts of your code that are runnable:
my_script/my_script/__main__.py:
from . import runnable_script
if __name__ == '__main__':
    runnable_script.run()
The installed package can then be run as a module with python -m my_script
The next best route
If you really only have a single file and want to communicate to your lecturer which requirements are needed to run the script, send them both your script and a file called requirements.txt, which contains the following line:
openpyxl
.. and that's it. If there are other requirements, put them on separate lines. If the lecturer has spent any amount of time working with python, they should know that running pip install -r requirements.txt will install the requirements needed to run the code you have submitted.
The if-you-really-have-to route
If all your lecturer knows how to do is run python followed by the name of your script, use DudeCoders' approach. But be aware that silently installing requirements without even an interactive prompt for the user is a huge no-no in the software-engineering world. If you plan to work in programming, you should start with good practices sooner rather than later.
You can first make sure that the respective library is installed by using try/except, like so:
try:
    import numpy
except ImportError:
    print('Numpy is not installed, install now to continue')
    exit()
Now, if numpy is installed on their computer, the script will simply import it and move on; but if numpy is not installed, the script will exit after printing which module is missing.
And do the same for each and every library you are using.
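If you have several required libraries, a small loop keeps the check compact; this is only a sketch and the list of module names is hypothetical:
import importlib
import sys

REQUIRED = ['numpy', 'pandas', 'openpyxl']  # hypothetical list of required modules

missing = []
for name in REQUIRED:
    try:
        importlib.import_module(name)
    except ImportError:
        missing.append(name)

if missing:
    print('Missing libraries: ' + ', '.join(missing))
    print('Install them with: pip install ' + ' '.join(missing))
    sys.exit(1)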
But if you want to directly install a library that is not installed, you can use this:
Note: Installing libraries silently is not a recommended way.
import os

try:
    import numpy
except ImportError:
    print('Numpy is not installed, installing now......')
    resultCode = os.system('pip install numpy')
    if resultCode == 0:
        print('Numpy installed!')
        import numpy
    else:
        print('Error occurred while installing numpy')
        exit()
Here, if numpy is already installed, the script simply imports it and moves on; if it is not, the script installs it first and then imports it.

pip hangs after installing package

I'm using Python 3 on Windows. I'm trying to install a package from within a script.
The purpose is that I don't want to explain to the person I'm sending the script to how to install the packages they need, so I'm hoping to do it on the fly from within the script.
Here's my code:
import pip
pip.main(["install", 'pyetrade'])
import pyetrade
Everything installs correctly with pip.main, but it never moves on to import pyetrade or the rest of the code. It just hangs there.
Any ideas how to get around this? The same thing seems to happen when I use the command prompt -- it just hangs after installation.
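No answer to this question is included here, but a commonly suggested workaround (my assumption, not from the original thread) is to run pip in a separate process instead of calling pip.main() in the current interpreter, so the script can continue once the child process exits:
import subprocess
import sys

# Run pip in a child process; check_call raises CalledProcessError if the install fails.
subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'pyetrade'])

import pyetrade  # imported only after the install has finished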

automated script to install python package over a proxy?

Would it be possible to write a little Python script that automatically installs a needed library if it isn't already installed?
Currently I am using
try:
    import xlrd  # Library to iterate through excel docs
except ImportError:
    raise ImportError('XLRD not installed, use "sudo pip install xlrd"\n')
but would like something more automated.
As mentioned in the comments the setup.py approach is a good one and quite easy.
Beyond that, if you need to install dependencies across multiple servers, I suggest you have a look at solutions like http://www.ansible.com/home or https://puppetlabs.com/
As a one-off you can do that too:
ssh user@yourserver.com 'pip install xlrd'
Edit - Following up on the comments:
Once you have created your setup.py specifying the dependency on xlrd, you have to run it on every machine where you want to install your app. In order to do that, check out the automation tools I mentioned above.
You could also do this, but it's really dirty:
try:
    import xlrd  # Library to iterate through excel docs
except ImportError:
    from subprocess import call
    call(["pip", "install", "xlrd"])

use "pip install/uninstall" inside a python script [duplicate]

This question already has answers here: How can I Install a Python module within code? (12 answers). Closed 2 years ago.
How can I install packages using pip from inside a Python script?
I don't want to use os.system; I want to import pip and use it.
pip.main() no longer works in pip version 10 and above. You need to use:
from pip._internal import main as pipmain
pipmain(['install', 'package-name'])
For backwards compatibility you can use:
try:
    from pip import main as pipmain
except ImportError:
    from pip._internal import main as pipmain
I think those answers are outdated. In fact you can do:
import pip
failed = pip.main(["install", nameOfPackage])
and insert any additional args in the list that you pass to main(). It returns 0 on success and a non-zero value on failure.
It's not a good idea to install packages inside a Python script because it requires root rights. You should ship additional modules alongside the script you created, or check whether the module is installed:
try:
    import ModuleName
except ImportError:
    print('Error, Module ModuleName is required')
If you insist on installing the package using pip inside your script, you'll have to look into call from the subprocess module (os.system() is discouraged).
There is no pip module but you could easily create one using the method above.
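For reference, a minimal sketch of the subprocess approach mentioned above; the package name is just the example used earlier:
import subprocess
import sys

# Install a package by invoking pip through the current interpreter.
subprocess.call([sys.executable, '-m', 'pip', 'install', 'ModuleName'])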
I used os.system to emulate installing a pip module from the terminal (I know os.system is discouraged, but it still works and it is also the easiest way to do it). For example, I am making a game engine with multiple Python scripts that all use Pygame; in the startup file I use this code to install Pygame on the user's system if they don't have it:
import os
os.system('pip install pygame')
Unfortunately, I don't know how to install pip if they don't have it, so this script depends on pip.
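As an aside (my addition, not part of the original answer): the standard library ships an ensurepip module that can bootstrap pip itself, so a fallback could look roughly like this:
import os

try:
    import pip  # just checking that pip is importable
except ImportError:
    # Bootstrap pip from the standard library if it is missing.
    import ensurepip
    ensurepip.bootstrap()

os.system('pip install pygame')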
If you are behind a proxy, you can install a module within code as follows...
import pip
pip.main(['install', '--proxy=user:password@proxy:port', 'packagename'])
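An alternative sketch (my assumption, not from the original answer) is to set the standard proxy environment variables and invoke pip in a subprocess; the credentials and host are placeholders:
import os
import subprocess
import sys

# pip honours the standard HTTPS_PROXY / HTTP_PROXY environment variables.
os.environ['HTTPS_PROXY'] = 'http://user:password@proxy:port'  # placeholder values

subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'packagename'])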
This is a comment to this post that didn't fit in the space allotted to comments.
Note that the use case of installing a package can arise inside setup.py itself. For example, generating ply parser tables and storing them to disk. These tables must be generated before setuptools.setup runs, because they have to be copied to site_packages, together with the package that is being installed.
There does exist the setup_requires option of setuptools.setup, however that does not install the packages.
So a dependency that is required both for the installation process and for the installed package will not be installed this way.
Placing such a dependency inside install_requires does not always work as expected.
Even if it worked, one would have to pass some function to setuptools.setup, to be run between installation of dependencies in setup_requires and installation of the package itself. This approach is nested, and thus against PEP 20.
So the two flat approaches that remain, are:
run setup.py twice, either automatically (preferred) or manually (by notifying the user that the tables failed to build prior to setuptools.setup).
first call pip (or some other equivalent solution), in order to install the required dependencies. Then proceed with building the tables (or whatever pre-installation task is necessary), and call setuptools.setup last.
Personally, I prefer No.2, because No.1 can be confusing to a user observing the console output during installation, unless they already know the intent of calling setuptools.setup twice.
Besides, whatever rights are needed for installation (e.g., root, if so desired), are certainly present when setup.py is run (and exactly then). So setup.py could be considered as the "canonical" use case for this type of action.
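A rough sketch of approach No. 2 inside setup.py, assuming ply is the dependency needed at build time; the table-generation step and package name are illustrative only:
import subprocess
import sys

# 1. Install the dependency needed both for the build step and at run time.
subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'ply'])

# 2. Perform the pre-installation task that needs the dependency,
#    e.g. generating the parser tables (placeholder for the real step).
import my_table_builder  # hypothetical helper module
my_table_builder.build_tables()

# 3. Call setuptools.setup last, listing the dependency normally as well.
from setuptools import setup

setup(
    name='my_package',              # hypothetical package name
    version='0.1.0',
    packages=['my_package'],
    install_requires=['ply'],
)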
