Which Python packages aren't used in application - python

Can you recommend a CLI / python package to tell me which modules aren't being imported into my application?
I considered using coverage + nosetests, but many of the unwanted/not needed packages have tests (to the credit of the previous developer).
As background, I'm dealing with a legacy code base and want to remove whatever isn't being used, so I can reduce my mental load before a refactor.

There's a Python module called vulture for doing exactly this. I haven't used it, but:
Vulture finds unused classes, functions and variables in your code.
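If you'd rather stay in the standard library, modulefinder can statically collect every module a given entry point imports; anything installed but absent from that set is a candidate for removal. A minimal sketch (the throwaway entry.py written here is a stand-in for your real entry point):

```python
import tempfile
from modulefinder import ModuleFinder
from pathlib import Path

# Stand-in for your application's real entry point.
script = Path(tempfile.mkdtemp()) / "entry.py"
script.write_text("import json\nimport os\n")

# Statically walk the import graph starting from the script.
finder = ModuleFinder()
finder.run_script(str(script))
used = set(finder.modules)

print("json" in used, "os" in used)
```

Diffing `used` against your installed packages (or the package's own modules) gives a first-pass "never imported" list; note that dynamic imports (importlib, `__import__`) will be missed.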


Python - how to show useful error on failed import

Some background: I am revamping the package we use in my company as the code base for many projects. This is an internal package, and is used/developed by a very limited number of people, so we allow ourselves to sometimes make major changes without a proper deprecation protocol.
The problem at hand: I am removing a few modules from the package as part of the code revamp. These contain some old classes and functions, which are no longer used in new projects. Just in case someone ever needs these objects, perhaps when running some old notebook, I would like to move these modules to a different "legacy" package. When someone attempts to import a now-deleted module in the main package, instead of the standard ImportError, I would like to be able to give a useful error message instructing them to import from the legacy package. Is there any way to achieve this? Perhaps using the __init__.py file?
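One possible approach (not from the thread above): since Python 3.7, a package's __init__.py can define a module-level __getattr__ (PEP 562), which is consulted when `from mainpkg import old_module` fails normal lookup. Note that a plain `import mainpkg.old_module` still raises ModuleNotFoundError, so this only covers the `from`-style imports. A sketch with hypothetical names (mainpkg, old_module, legacypkg); the temp-dir scaffolding exists only to make the demo self-contained — in real use, just the __getattr__ goes in your package's __init__.py:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Demo scaffolding: build a throwaway package on disk.
root = Path(tempfile.mkdtemp())
pkg = root / "mainpkg"
pkg.mkdir()
(pkg / "__init__.py").write_text(
    "_MOVED = {'old_module'}\n"
    "\n"
    "def __getattr__(name):  # PEP 562, Python 3.7+\n"
    "    if name in _MOVED:\n"
    "        raise ImportError(f\"{name} has moved; \"\n"
    "                          f\"use 'from legacypkg import {name}' instead\")\n"
    "    raise AttributeError(f'module has no attribute {name!r}')\n"
)

sys.path.insert(0, str(root))
mainpkg = importlib.import_module("mainpkg")

msg = ""
try:
    mainpkg.old_module  # the lookup `from mainpkg import old_module` performs
except ImportError as exc:
    msg = str(exc)
print(msg)
```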

add functions to Python standard library?

Is there a way to add functions I create to the Python standard library on my local machine?
I come from the MATLAB world where things aren't really efficient and fast, but there are looooads of functions at my fingertips without having to import their files. My problem is that if I make a function in Python and want to use it, then I will also need to remember the module it's in. My memory is shite. I understand that Python is structured that way for efficiency, but if I'm adding only a handful of functions to the standard library that I consider very important, I'd guess the impact on performance is practically negligible.
Python has a namespace called __builtins__ in which you can stick stuff that you want available all the time. You probably shouldn't, but you can. Be careful not to clobber anything: Python won't stop you from reusing the name of a built-in function, and if you do, it'll probably break a lot of things. (Outside the main module, __builtins__ may be a plain dict rather than a module, so the reliable spelling is import builtins.)
import builtins  # works in any module; __builtins__ may be a dict elsewhere

# define a function so it is always available
def fart():
    print("poot!")
builtins.fart = fart

# make the re module always available without an import
import re
builtins.re = re
Now the question is how to get Python to run that code for you each time you start up the interpreter. The answer is usercustomize.py. Follow these instructions to find out where the correct directory is on your machine, then put a new file called usercustomize.py in that directory that defines all the stuff you want to have in __builtins__.
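If you're unsure where that per-user directory lives on your machine, the stdlib site module will tell you directly (the exact path varies by platform and Python version):

```python
import site

# usercustomize.py is searched for in the per-user site-packages directory.
# (It is skipped when user site-packages is disabled, e.g. under python -S
# or in some virtual environments.)
user_dir = site.getusersitepackages()
print(user_dir)
```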
There's also an environment variable, PYTHONSTARTUP, that you can set to have a Python script run whenever you start the interpreter in interactive mode (i.e. to a command prompt). I can see the benefit of e.g. having your favorite modules available when exploring in the REPL. More details here.
It sounds like you want to create your own packages & modules with tools you plan on using in the future on other projects. If that is the case, you want to look into the documentation on packaging your own project:
https://packaging.python.org/tutorials/packaging-projects/
You may also find this useful:
How to install a Python package system-wide on Linux?
How to make my Python module available system wide on Linux?
How can I create a simple system wide python library?

Fully embedded SymPy+Matplotlib+others within a C/C++ application

I've read the Python documentation chapter explaining how to embed the Python interpreter in a C/C++ application. Also, I've read that you can install Python modules either in a system-wide fashion, or locally to a given user.
But let's suppose my C/C++ application will use some Python modules such as SymPy, Matplotlib, and other related modules. And let's suppose end users of my application won't have any kind of Python installation in their machines.
This means that my application needs to ship with "pseudo-installed" modules, inside its data directories (just like the application has a folder for icons and other resources, it will need to have a directory for Python modules).
Another requirement is that the absolute path of my application installation isn't fixed: the user can "drag" the application bundle to another directory and it will run fine there (it already works this way but that's prior to embedding Python in it, and I wish it continues being this way after embedding Python).
I guess my question could be expressed more concisely as "how can I use Python without installing Python, neither system-wide, nor user-wide?"
There are various ways you could attempt to do this, but none of them is a general solution. From the docs:
5.5. Embedding Python in C++
It is also possible to embed Python in a C++ program; precisely how this is done will depend on the details of the C++ system used; in general you will need to write the main program in C++, and use the C++ compiler to compile and link your program. There is no need to recompile Python itself using C++.
This is the shortest section in the document, and is roughly equivalent to "left as an exercise for the reader". I do not believe you will find any straightforward solutions.
Use pyinstaller to gather the pieces:
This means that my application needs to ship with "pseudo-installed" modules, inside its data directories (just like the application has a folder for icons and other resources, it will need to have a directory for Python modules).
If I needed to tackle this problem, I would use pyinstaller as a base. (Disclosure: I am an occasional contributor.) One of the major functions of pyinstaller is to gather up all of the needed resources for a python program. In onedir mode, all of the things needed to let the program run are gathered into one directory.
You could include this tool into your make system, and have it place all of the needed pieces into your python data directory in your build tree.
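On the Python side, keeping the bundle relocatable mostly means computing the module directory relative to the running executable at startup rather than baking in an absolute path, and then putting it on sys.path before any bundled module is imported. A sketch (the "python-modules" directory name is hypothetical; in a real embedded interpreter you would do the equivalent from C via the init configuration):

```python
import os
import sys

# Resolve the application directory from the running binary/script, then put
# the bundled module directory at the front of the module search path.
app_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
bundled = os.path.join(app_dir, "python-modules")
sys.path.insert(0, bundled)
print(sys.path[0])
```

Because the path is derived at runtime, dragging the bundle to a new location keeps imports working without any reinstallation.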

A tool to validate the structure of a Python Package?

I started writing Python code not too long ago and everything just works, but I have been having problems writing a package. I was wondering if there is such a thing as a "package validation tool". I know, I could just start up a REPL and start importing the module but...is there a better way? Is there a tool that could tell me "you have these possible errors"?
Or maybe there is something in the middle: is there a way to test a Python's package structure?
As always, thanks in advance!
If you call a module using:
python -m module
Python will load/execute the module, so you should catch crude syntax errors. Also, if the module has a block like:
if __name__ == "__main__":
    do_something()
it will be called. For some small self-contained modules I often use this block to run tests.
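If actually importing the package is too heavy-handed (imports execute module-level code), the stdlib's compileall will byte-compile every file in a tree and flag syntax errors without running anything. A self-contained sketch (the tiny throwaway package is built here only for the demo):

```python
import compileall
import tempfile
from pathlib import Path

# Build a tiny throwaway package to compile.
pkg = Path(tempfile.mkdtemp()) / "mypkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("x = 1\n")
(pkg / "util.py").write_text("def f():\n    return 42\n")

# compile_dir returns a true value only when every file compiled cleanly.
ok = compileall.compile_dir(str(pkg), quiet=1)
print(bool(ok))
```

From the command line the equivalent is `python -m compileall mypkg/`.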
Given the very dynamic nature of Python, it is very hard to check for correctness if the module author is not using TDD. There is no silver bullet here. There are tools that will check for "code smells" and compliance with standards (dynamic languages tend to generate a profusion of linters).
pylint
PyChecker
PyFlakes
PEP8
A good IDE like PyCharm can help, if you like IDEs.
These tools can help, but are still far from the assurance of static languages where the compiler can catch many errors at compile time. For example, Go seems to be designed to have a very pedantic compiler. Haskell programs are said to be like mathematical proofs.
If you are coming from a language with strong compile-time checks, just relax. Python is kind of a "throw it against the wall and see if it sticks" language. Some of the Python mantras:
duck typing
EAFP
We are all consenting adults
There is no tool to test the package structure per se, and I'm unsure of what would be tested. Almost any structure is a valid structure...
But there are some tools to help you test your package data if you are distributing your module, they may be useful:
Pyroma will check the packages meta data.
check-manifest will check the MANIFEST.in file.
I have both of them installed and also use zest.releaser, which has some basic sanity checks as well. But none of these will check that the code is OK; they won't look for missing __init__.py files, for example.

Python project deployment design

Here is the situation: the company that I'm working in right now gave me the freedom to work with either Java or Python to develop my applications. The company has mainly experience in Java.
I have decided to go with Python, so they were very happy to ask me to maintain all the Python projects/scripts related to the database maintenance that they have.
It's not that bad to handle all that stuff, and it's kind of fun to see how much free time I have compared to the Java programmers. There is just one "but": the project layout is a mess.
There are many scripts that simply live in virtual machines all over the company. Some of them have complex functionality that is spread across a few modules (4 at maximum).
While thinking about it, I realized that I don't know how to address that, so here are 3 questions.
Where do I put standalone scripts? We use git as our versioning system.
How do I structure the project's layout so that users do not need to dig deep into folders to run the programs? (In Java I created a JAR, or a JAR plus a shell script to handle some bootstrap operations.)
What is a standard way to create modules that allow easy reusability (mycompany.myapp.mymodule)?
Where do I put standalone scripts?
You organize them "functionally" -- based on what they do and why people use them.
The language (Python vs. Java) is irrelevant.
You have to think of scripts as small applications focused on some need and create appropriate directory structures for that application.
We use /opt/thisapp and /opt/thatapp. If you want a shared mount-point, you might use a different path.
How do I structure the project's layout so that users do not need to dig deep into folders to run the programs?
You organize them "functionally" -- based on what they do and why people use them. At the top level of a /opt/thisapp directory, you might have an __init__.py (because it's a package) and perhaps a main.py script which starts the real work.
In Python 2.7 and Python 3, you have the runpy module. With this you would name your top-level main script __main__.py.
http://docs.python.org/library/runpy.html#module-runpy
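A self-contained sketch of that layout, using runpy (the machinery behind `python -m`) to start a package via its __main__.py; the package name "myapp" and the temp-dir scaffolding are made up for the demo:

```python
import runpy
import sys
import tempfile
from pathlib import Path

# Build a throwaway package with a __main__.py entry point.
root = Path(tempfile.mkdtemp())
app = root / "myapp"
app.mkdir()
(app / "__init__.py").write_text("")
(app / "__main__.py").write_text("result = 'started'\n")

# "python -m myapp" would do the equivalent of this call.
sys.path.insert(0, str(root))
ns = runpy.run_module("myapp", run_name="__main__")
print(ns["result"])
```

run_module returns the executed module's globals, which is handy for checking in tests that the entry point really ran.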
What is a standard way to create modules that allow easy reusability (mycompany.myapp.mymodule)?
Read about packages. http://docs.python.org/tutorial/modules.html#packages
A package is a way of creating a module hierarchy: if you make a file called __init__.py in a directory, Python will treat that directory as a package and allow you to import its contents using dotted imports:
spam/
    __init__.py
    ham.py
    eggs.py

import spam.ham
The modules inside a package can reference each other -- see the docs.
If these are all DB maintenance scripts, I would make a package called ourDB or something, and place them all in it. You can have subpackages for the more complicated ones. So if you had a script for, I don't know, cleaning up the transaction logs, you could put it in ourDB.clean and do
import ourDB.clean
ourDB.clean.transaction_logs()
