python - How to handle file locations and paths when installing package?

I'm working on building my first Python package for a client to use. At the most, I am envisioning the user pulling the code from GitHub, then pip installing (pip install .). This package will be used in a Windows environment. What is the convention or where is the easiest place to put log files? Is there a way to tell setup.py to make a log directory that is easily accessible for the user?
For a more specific example, let's say I had the code base locally at C:\Users\iamuser\projects\client_project. I pip install . while in the client_project directory. There is a logs\ directory (C:\Users\iamuser\projects\client_project\logs) that I'd like the log files to be placed into. Is there a way to have my setup.py place log files in that directory? If not, are there any other tools I should try?
I have tried something like this, but any paths acquired while running setup are not where the original setup.py file was located (example: os.path.abspath(__file__) shows some other location than within the client_project directory).

When creating a Python package, I would not make any assumptions about the user's filesystem or the permissions on it (ideally not even about the OS). If your package writes log messages, you could use Python's built-in logging system, and the logging output would go wherever the user wants it to go (stderr by default).
If your package generates files, the user should have the option to decide where they go, either through a settings or config file or through environment variables.
There are a few places where you could safely store them by default, such as the home directory or the current working directory (or subfolders of them; see the pros and cons in the comments). The important thing is to build paths relative to ~, os.getcwd(), or the __file__ attribute of your script rather than hard-coding absolute paths. Linux systems have a few locations that are usually present and can be used, such as /tmp or /var/log, but those do not exist on Windows.
Sometimes I store output files in the parent of the current working directory so that they are not checked into a Git repo, but this relies on additional assumptions.
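
As a rough illustration of the advice above, here is a minimal sketch that lets the user choose the log directory through an environment variable and otherwise falls back to a subfolder of the home directory. The variable name MYPKG_LOG_DIR, the package name mypkg, and the default folder .mypkg/logs are made up for this example and do not come from the question.

```python
import logging
import os
from pathlib import Path


def resolve_log_dir(env_var="MYPKG_LOG_DIR", default_subdir=".mypkg/logs"):
    """Return the log directory, creating it if needed.

    The environment variable (a hypothetical name) wins when it is set;
    otherwise a subfolder of the user's home directory is used.
    """
    override = os.environ.get(env_var)
    log_dir = Path(override) if override else Path.home() / default_subdir
    log_dir.mkdir(parents=True, exist_ok=True)
    return log_dir


def get_logger(name="mypkg", log_dir=None):
    """Return a logger that writes to <log_dir>/<name>.log."""
    log_dir = Path(log_dir) if log_dir is not None else resolve_log_dir()
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching a second handler on repeat calls
        handler = logging.FileHandler(log_dir / f"{name}.log", encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

With this layout nothing depends on where setup.py ran: the client could set MYPKG_LOG_DIR to C:\Users\iamuser\projects\client_project\logs, or leave it unset and find the logs under the home directory.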

Related

Prevent an embedded Python from looking in my default path C:\Python38 for modules

I'm using Cython in --embed mode to produce a .exe. I'm evaluating the Minimal set of files required to distribute an embed-Cython-compiled code and make it work on any machine. To do this, I only copy a minimal number of files from the Python Windows embeddable package.
In order to check this, I need to be sure that the current process I'm testing doesn't in fact use my system default Python install, i.e. C:\Python38.
To do this, I open a new cmd.exe and do set PATH= which temporarily removes everything from the PATH. Then I can test any self-compiled app.exe and make sure it doesn't reuse C:\Python38's files under the hood.
It works, except for the modules. Even after doing set PATH=, my code app.py
import json
print(json.dumps({"a":"b"}))
works when compiled into a .exe with Cython --embed, but it still uses C:\Python38\Lib\json\__init__.py! I know this for sure, because if I temporarily remove this file, my .exe fails because it cannot find the json module.
How can I completely remove any link to C:\Python38 when debugging a Python program which shouldn't use these files?
Why isn't set PATH= enough? Which other environment variable does it use for modules? I checked all my system variables and I couldn't find any that seems related to Python.

Python has a quite complicated heuristic for finding its "installation" (see for example this SO-question or this description), so probably it doesn't find the installation you are providing but the "default" installation.
Probably the simplest way is to set the environment variable PYTHONPATH to point to the desired installation before the embedded interpreter starts.
By examining sys.path, one can check whether the correct installation was found.
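
A minimal sketch of that sys.path check, assuming you want to know whether any search-path entry lives under a given install folder. The helper name entries_under is made up for this example.

```python
import os
import sys


def entries_under(prefix, paths=None):
    """Return the search-path entries located at or below `prefix`.

    `paths` defaults to sys.path; pass a list to check something else.
    """
    paths = sys.path if paths is None else paths
    norm_prefix = os.path.normcase(os.path.abspath(prefix))
    hits = []
    for entry in paths:
        if not entry:  # an empty string means "current directory"; skip it
            continue
        norm_entry = os.path.normcase(os.path.abspath(entry))
        if norm_entry == norm_prefix or norm_entry.startswith(norm_prefix + os.sep):
            hits.append(entry)
    return hits
```

Calling entries_under(r"C:\Python38") from inside the embedded program and getting a non-empty list would suggest that the system install is still being picked up.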

Thanks to @ead's answer and his link to getpath.c, which redirects to getpathp.c in the case of Windows, we can learn that the rule for building the module search path is:
current directory first
PYTHONPATH env. variable
registry key HKEY_LOCAL_MACHINE\SOFTWARE\Python or the same in HKCU
PYTHONHOME env. variable
finally:
Iff - we can not locate the Python Home, have not had a PYTHONPATH
specified, and can't locate any Registry entries (ie, we have nothing
we can assume is a good path), a default path with relative entries is
used (eg. .\Lib;.\DLLs, etc)
Conclusion: in order to debug an embedded version of Python, without interfering with the default system install (C:\Python38 in my case), I finally solved it by temporarily renaming the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Python to HKEY_LOCAL_MACHINE\SOFTWARE\PythonOld.
Side note: I'm not sure I will ever revert this registry key back to normal. My normal Python install shouldn't need it to find its path anyway: when I run python.exe from anywhere (it is in the PATH for everyday use), it automatically looks in .\Lib\ and .\DLLs\, which is correct. I don't see a single use case in which my normal python.exe wouldn't find its .\Lib\ or .\DLLs\ subdirectories and would need the registry for this. In which use case would the registry be necessary? If python.exe has been started, then its path has been found, and it can use its .\Lib subfolder without help from the registry. I think that 99.99% of the time this registry feature does more harm than good, preventing a Python install from being really "portable" (i.e. one that we can move from one folder to another).
Notes:
To be 100% sure, I also did this in command line, but I don't think it's necessary:
set PATH=
set PYTHONPATH=
set PYTHONHOME=
A check that may help when debugging an embedded Python: import ctypes. If you don't have _ctypes.pyd and libffi-7.dll in your embedded install folder, the import should fail. If it doesn't, Python is looking somewhere else (probably in your default system-wide Python install).
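
Rather than removing files to see what breaks, one can also ask the import system where a module would come from. A minimal sketch using the standard importlib.util.find_spec; the helper name module_origin is made up for this example.

```python
import importlib.util


def module_origin(name):
    """Return the location a top-level module would be loaded from.

    Returns None when the module cannot be found. Built-in and frozen
    modules report a marker string instead of a file path.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Printing module_origin("json") and module_origin("_ctypes") from the embedded program shows directly whether the files come from the embedded folder or from C:\Python38.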

How do I install my module onto my local copy of Python on Windows?

I'm reading Head First Python and have just completed the section where I created a module for printing nested list items. I've created the code and the setup file and placed them in a folder labeled "Nester" that is sitting on my desktop. The book is now asking me to install this module onto my local copy of Python. The thing is, in the example he is using the Mac terminal, and I'm on Windows. I tried to google it, but I'm still a novice and a lot of the explanations just go over my head. Can someone give me a clear, thorough guide?

On Windows systems, third-party modules (single files containing one or more functions or classes) and third-party packages (a folder [a.k.a. directory] that contains more than one module, and sometimes other folders/directories) are usually kept in one of two places: c:\Program Files\Python\Lib\site-packages\ and c:\Users\[you]\AppData\Roaming\Python\.
The location in Program Files is usually not accessible to normal users, so when pip installs new modules/packages on Windows it places them in the user-accessible folder in the Users location indicated above. You have direct access to that, though by default the AppData folder is "hidden": it is not displayed in the File Explorer list unless you set File Explorer to show hidden items (which is a good thing to do anyway, IMHO). You can put the module you're working on in the AppData\Roaming\Python\ folder.
You still need to make sure the folder you put it in is on Python's module search path. Python does not use the Windows PATH variable to find modules (PATH tells Windows where to look for executables); it searches the folders listed in sys.path, which you can extend with the PYTHONPATH environment variable. Google "set windows environment variable" to find how to check and set PYTHONPATH, then just go ahead and put your module in a folder that's listed there.
Of course, since you can add any folder/directory you want to PYTHONPATH, you could put your module anywhere you wanted, including leaving it on the Desktop, as long as the location is included in PYTHONPATH. You could, for instance, have a folder such as Documents\Programming\Python\Lib to put your personal modules in, and use Documents\Programming\Python\Source for your Python programs. You'd just need to include the Lib folder in the PYTHONPATH variable.
FYI: Personally, I don't like the way Python is (by default) installed on Windows (because I don't have easy access to c:\Program Files), so I installed Python in a folder off the drive root: c:\Python36. In this way, I have direct access to the \Lib\site-packages\ folder.
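
The exact folders differ between installs, so it may be easier to ask the interpreter which ones it actually uses. A minimal sketch with the standard site and sys modules; the helper name search_locations is made up for this example, and the getattr fallbacks are there because some older virtualenv environments ship a site module without these functions.

```python
import site
import sys


def search_locations():
    """Collect the folders this Python install uses for modules."""
    get_user_site = getattr(site, "getusersitepackages", lambda: None)
    get_site_packages = getattr(site, "getsitepackages", lambda: [])
    return {
        "user_site": get_user_site(),          # per-user install folder
        "site_packages": get_site_packages(),  # install-wide folders
        "sys_path": list(sys.path),            # everything import searches
    }
```

Running it and printing the result shows where an installed module would end up and every folder that an import statement will search.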

What is the expected behaviour in a virtualenv python?

Here is my problem: I am trying to make an application which copies data files during its setup. When I run pip install, the setup copies a few files to a directory.
Now my question is: inside a virtual environment, what behaviour does the customer expect? Does he want all the created data files inside the virtual environment directory, one copy per virtualenv, or should all the files be copied into a common directory outside the virtual environment directory?
While the application runs, new files will be created and stored alongside these copied files. What is the ideal behaviour expected from a Python virtualenv: common or isolated?

virtualenv is more for development, not for deployment. There are many ways to deploy a Python app, but if you prefer to use virtualenv and you have common files, they can be anywhere IMHO, because virtualenv is not real isolation; it is only a mechanism that modifies Python's paths (unlike "chroot"). So you decide where to place your common files, even /usr/share/my-app-1.0/datafiles/. Also, virtualenv is used to isolate binaries/scripts, but if the data files are static, you can place them wherever you prefer.

In my opinion, that depends on the application you are creating. Virtualenv is just a way of running multiple applications with different dependencies on the same machine. Application data is another matter.
If I were writing a web application that would be the single app on a server, I would use one directory.
On the other hand, if I were writing a GUI app, things would be different. If the data must change with every version but the end user does not touch it directly (e.g. dictionaries, translations, etc.), I would put it in dist-packages alongside the application package (see package data in setup.py).
If, however, the user can "touch" and use those files, then I would put them in the user's home directory. See How to find the real user home directory using python?
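
The distinction drawn above could be sketched as follows. This is only an illustration: the function names, the "data" subfolder, and the dot-folder naming in the home directory are assumptions, not an established convention from the answers.

```python
import sys
from pathlib import Path


def in_virtualenv():
    """True when running inside a venv or virtualenv environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def data_dir(app_name, user_editable, package_dir):
    """Pick a folder for data files.

    user_editable=True  -> a folder in the user's home directory,
                           shared by every environment
    user_editable=False -> a "data" folder next to the package code,
                           so each environment keeps its own copy
    """
    if user_editable:
        return Path.home() / f".{app_name}"
    return Path(package_dir) / "data"
```

Inside a package, package_dir would typically be Path(__file__).parent, which resolves inside the virtualenv's site-packages when the package is installed there.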

Use .py or .pyc file when sharing/backing up?

This answer tells me that a .pyc file gets created when a .py file is run, which I understand saves loading time when re-run. Which makes me wonder what the point of the .py file is after the .pyc is created.
When backing up my code, or sharing it, I don't want to include redundant or extraneous files. Which filetype should I focus on?
Side question: I have one script that calls another. After running them, the called script got a .pyc file written, but the master script that does the calling did not. Why would that be?

Python .pyc files are generated when a module is imported, not when a top level script is run. I'm not sure what you mean by calling, but if you ran your master script from the command line and it imported the other script, then only the imported one gets a .pyc.
As for distributing .pyc files, they are sensitive to the Python minor version. If you bundle your own Python, or distribute a separate set of files for each Python version, then maybe. But best practice is to distribute the .py files.
Python's script and module rules seem a bit odd until you consider its installation model. A common installation model is that executables are installed somewhere on the system's PATH and shared libraries are installed somewhere in a library path.
Python's setup.py does the same thing. Top-level scripts go on the PATH, but modules and packages go in a library path. For instance, on my system pdb3 (a top-level script) is at /usr/bin/pdb3 and os (an imported module) is at /usr/lib/python3.4/os.py. Suppose Python compiled pdb3 to pdb3.pyc. Well, I'd still call pdb3, and the .pyc would be useless. So why clutter the path?
It's common for installs to run as root or administrator so that you have write access to those paths. But you wouldn't have write access to them later as a regular user. You can have setup.py generate .pyc files during install. You get the right .pyc files for whatever Python you happen to have, and since you are running as root/admin during install, you still have access to the directories. Trying to build .pyc files later is a problem because a regular user doesn't have access to the directories.
So, best practice is to distribute .py files and have setup.py build the .pyc files during install.
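
For illustration, the byte-compilation step that an installer performs can be reproduced by hand with the standard py_compile module. A minimal sketch; the helper name compile_sources is made up for this example.

```python
import py_compile
from pathlib import Path


def compile_sources(src_dir):
    """Byte-compile every .py file under src_dir.

    Returns the paths of the generated .pyc files, which land in a
    __pycache__ folder next to each source file and carry the
    interpreter version in their file name.
    """
    compiled = []
    for source in sorted(Path(src_dir).rglob("*.py")):
        # doraise=True turns compile errors into exceptions
        compiled.append(py_compile.compile(str(source), doraise=True))
    return compiled
```

The version tag in the generated file names is the reason .pyc files are poor candidates for sharing or backups, while the .py sources work on any compatible interpreter.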

If you simply want to run your Python script, all you really need is the .pyc file, which is the bytecode generated from your source code. See here for details on running a .pyc file. I will warn you that some of the details are a bit twisty.
However, I recommend including your source code and leaving out your .pyc files, as they are generated automatically by the Python interpreter. Besides, if you or another person wanted to revise or revisit your source code at a later point, you would need the .py files. Furthermore, it is usually best practice to include just your source code.

Where should I put my Python scripts in Linux?

My python program consists of several files:
the main execution python script
python modules in *.py files
config file
log files
executable scripts in other languages.
All these files should be available only to root. The main script should run on startup, e.g. via upstart.
Where should I put all these files in the Linux filesystem?
What's the best way to distribute my program? pip, easy_install, deb, ...? I haven't worked with any of these tools, so I want something easy for me.
The minimum supported Linux distribution should be Ubuntu.

For sure, if this program is to be available only to root, then the main execution Python script has to go to /usr/sbin/.
Config files ought to go to /etc/, and log files to /var/log/.
Other python files should be deployed to /usr/share/pyshared/.
Executable scripts of other languages will go either in /usr/bin/ or /usr/sbin/ depending on whether they should be available to all users, or for root only.

If only root should access the scripts, why not put them in /root/?
Secondly, if you're going to distribute your application, you'll probably need easy_install or something similar. Otherwise, if only a few people will access it, just tar.gz the stuff.
It all depends on your scale.
Pyglet, wxPython and similar projects have a huge user base, and the same goes for BeautifulSoup, but they still tar.gz the stuff and you just use setuptools to deploy it (which is another option).
