Suppose I have a code like this in module a.py
import numpy as np
def sqrt(x):
return np.sqrt(x)
And I have a module b.py written like this:
import a
print(a.sqrt(25))
print(a.np.sqrt(25))
I will see that the code runs fine and that when using autocomplete in most IDEs, I found that a.np is accessible. I want to make a.np private so that only a code can see that variable.
I don't want b to be able to access a.np.
What is a good approach to make this possible?
Why do I want a.np to be inaccessible? Because I want it to not show in the autocomplete when I type a. and press Tab in Jupyter Lab. It hides what the modules can do because there are so many imports that I use in my module.
The solution is the same as for "protected" attributes / methods in a class (names defined in a module are actually - at runtime - attributes of the module object): prefix those names with a single leading underscore, ie
import numpy as _np
def sqrt(x):
return _np.sqrt(x)
Note that this will NOT prevent someone to use a._np.sqrt(x), but at least it makes it quite clear that he is using a protected attribute.
I see 2 approaches here:
more user-friendly solution: change alias names to "underscored" ones
import numpy as _np
...
this will not prevent from importing it, but it will say to user that this are implementation details and one should not depend on them.
preferred-by-me solution: do nothing, leave it as it is, use semver and bump versions accordingly.
Related
It is recommended to not to use import * in Python.
Can anyone please share the reason for that, so that I can avoid it doing next time?
Because it puts a lot of stuff into your namespace (might shadow some other object from previous import and you won't know about it).
Because you don't know exactly what is imported and can't easily find from which module a certain thing was imported (readability).
Because you can't use cool tools like pyflakes to statically detect errors in your code.
According to the Zen of Python:
Explicit is better than implicit.
... can't argue with that, surely?
You don't pass **locals() to functions, do you?
Since Python lacks an "include" statement, and the self parameter is explicit, and scoping rules are quite simple, it's usually very easy to point a finger at a variable and tell where that object comes from -- without reading other modules and without any kind of IDE (which are limited in the way of introspection anyway, by the fact the language is very dynamic).
The import * breaks all that.
Also, it has a concrete possibility of hiding bugs.
import os, sys, foo, sqlalchemy, mystuff
from bar import *
Now, if the bar module has any of the "os", "mystuff", etc... attributes, they will override the explicitly imported ones, and possibly point to very different things. Defining __all__ in bar is often wise -- this states what will implicitly be imported - but still it's hard to trace where objects come from, without reading and parsing the bar module and following its imports. A network of import * is the first thing I fix when I take ownership of a project.
Don't misunderstand me: if the import * were missing, I would cry to have it. But it has to be used carefully. A good use case is to provide a facade interface over another module.
Likewise, the use of conditional import statements, or imports inside function/class namespaces, requires a bit of discipline.
I think in medium-to-big projects, or small ones with several contributors, a minimum of hygiene is needed in terms of statical analysis -- running at least pyflakes or even better a properly configured pylint -- to catch several kind of bugs before they happen.
Of course since this is python -- feel free to break rules, and to explore -- but be wary of projects that could grow tenfold, if the source code is missing discipline it will be a problem.
That is because you are polluting the namespace. You will import all the functions and classes in your own namespace, which may clash with the functions you define yourself.
Furthermore, I think using a qualified name is more clear for the maintenance task; you see on the code line itself where a function comes from, so you can check out the docs much more easily.
In module foo:
def myFunc():
print 1
In your code:
from foo import *
def doThis():
myFunc() # Which myFunc is called?
def myFunc():
print 2
It is OK to do from ... import * in an interactive session.
Say you have the following code in a module called foo:
import ElementTree as etree
and then in your own module you have:
from lxml import etree
from foo import *
You now have a difficult-to-debug module that looks like it has lxml's etree in it, but really has ElementTree instead.
Understood the valid points people put here. However, I do have one argument that, sometimes, "star import" may not always be a bad practice:
When I want to structure my code in such a way that all the constants go to a module called const.py:
If I do import const, then for every constant, I have to refer it as const.SOMETHING, which is probably not the most convenient way.
If I do from const import SOMETHING_A, SOMETHING_B ..., then obviously it's way too verbose and defeats the purpose of the structuring.
Thus I feel in this case, doing a from const import * may be a better choice.
http://docs.python.org/tutorial/modules.html
Note that in general the practice of importing * from a module or package is frowned upon, since it often causes poorly readable code.
These are all good answers. I'm going to add that when teaching new people to code in Python, dealing with import * is very difficult. Even if you or they didn't write the code, it's still a stumbling block.
I teach children (about 8 years old) to program in Python to manipulate Minecraft. I like to give them a helpful coding environment to work with (Atom Editor) and teach REPL-driven development (via bpython). In Atom I find that the hints/completion works just as effectively as bpython. Luckily, unlike some other statistical analysis tools, Atom is not fooled by import *.
However, lets take this example... In this wrapper they from local_module import * a bunch modules including this list of blocks. Let's ignore the risk of namespace collisions. By doing from mcpi.block import * they make this entire list of obscure types of blocks something that you have to go look at to know what is available. If they had instead used from mcpi import block, then you could type walls = block. and then an autocomplete list would pop up.
It is a very BAD practice for two reasons:
Code Readability
Risk of overriding the variables/functions etc
For point 1:
Let's see an example of this:
from module1 import *
from module2 import *
from module3 import *
a = b + c - d
Here, on seeing the code no one will get idea regarding from which module b, c and d actually belongs.
On the other way, if you do it like:
# v v will know that these are from module1
from module1 import b, c # way 1
import module2 # way 2
a = b + c - module2.d
# ^ will know it is from module2
It is much cleaner for you, and also the new person joining your team will have better idea.
For point 2: Let say both module1 and module2 have variable as b. When I do:
from module1 import *
from module2 import *
print b # will print the value from module2
Here the value from module1 is lost. It will be hard to debug why the code is not working even if b is declared in module1 and I have written the code expecting my code to use module1.b
If you have same variables in different modules, and you do not want to import entire module, you may even do:
from module1 import b as mod1b
from module2 import b as mod2b
As a test, I created a module test.py with 2 functions A and B, which respectively print "A 1" and "B 1". After importing test.py with:
import test
. . . I can run the 2 functions as test.A() and test.B(), and "test" shows up as a module in the namespace, so if I edit test.py I can reload it with:
import importlib
importlib.reload(test)
But if I do the following:
from test import *
there is no reference to "test" in the namespace, so there is no way to reload it after an edit (as far as I can tell), which is a problem in an interactive session. Whereas either of the following:
import test
import test as tt
will add "test" or "tt" (respectively) as module names in the namespace, which will allow re-loading.
If I do:
from test import *
the names "A" and "B" show up in the namespace as functions. If I edit test.py, and repeat the above command, the modified versions of the functions do not get reloaded.
And the following command elicits an error message.
importlib.reload(test) # Error - name 'test' is not defined
If someone knows how to reload a module loaded with "from module import *", please post. Otherwise, this would be another reason to avoid the form:
from module import *
As suggested in the docs, you should (almost) never use import * in production code.
While importing * from a module is bad, importing * from a package is probably even worse.
By default, from package import * imports whatever names are defined by the package's __init__.py, including any submodules of the package that were loaded by previous import statements.
If a package’s __init__.py code defines a list named __all__, it is taken to be the list of submodule names that should be imported when from package import * is encountered.
Now consider this example (assuming there's no __all__ defined in sound/effects/__init__.py):
# anywhere in the code before import *
import sound.effects.echo
import sound.effects.surround
# in your module
from sound.effects import *
The last statement will import the echo and surround modules into the current namespace (possibly overriding previous definitions) because they are defined in the sound.effects package when the import statement is executed.
I am building a python library. The functions I want available for users are in stemmer.py. Stemmer.py uses stemmerutil.py
I was wondering whether there is a way to make stemmerutil.py not accessible to users.
If you want to hide implementation details from your users, there are two routes that you can go. The first uses conventions to signal what is and isn't part of the public API, and the other is a hack.
The convention for declaring an API within a python library is to add all classes/functions/names that should be exposed into an __all__-list of the topmost __init__.py. It doesn't do that many useful things, its main purpose nowadays is a symbolic "please use this and nothing else". Yours would probably look somewhat like this:
urdu/urdu/__init__.py
from urdu.stemmer import Foo, Bar, Baz
__all__ = [Foo, Bar, Baz]
To emphasize the point, you can also give all definitions within stemmerUtil.py an underscore before their name, e.g. def privateFunc(): ... becomes def _privateFunc(): ...
But you can also just hide the code from the interpreter by making it a resource instead of a module within the package and loading it dynamically. This is a hack, and probably a bad idea, but it is technically possible.
First, you rename stemmerUtil.py to just stemmerUtil - now it is no longer a python module and can't be imported with the import keyword. Next, update this line in stemmer.py
import stemmerUtil
with
import importlib.util
import importlib.resources
# in python3.7 and lower, this is importlib_resources and needs to be installed first
stemmer_util_spec = importlib.util.spec_from_loader("stemmerUtil", loader=None)
stemmerUtil = importlib.util.module_from_spec(stemmer_util_spec)
with importlib.resources.path("urdu", "stemmerUtil") as stemmer_util_path:
with open(stemmer_util_path) as stemmer_util_file:
stemmer_util_code = stemmer_util_file.read()
exec(stemmer_util_code, stemmerUtil.__dict__)
After running this code, you can use the stemmerUtil module as if you had imported it, but it is invisible to anyone who installed your package - unless they run this exact code as well.
But as I said, if you just want to communicate to your users which part of your package is the public API, the first solution is vastly preferable.
Basically I have 3 modules that all communicate with eachother and import eachother's functions. I'm trying to import a function from my shigui.py module that creates a gui for the program. Now I have a function that gets the values of user entries in the gui and I want to pass them to the other module. I'm trying to pass the function below:
def valueget():
keywords = kw.get()
delay = dlay.get()
category = catg.get()
All imports go fine, up until I try to import this function with
from shigui import valueget to another module that would use the values. In fact, I can't import any function to any module from this file. Also I should add that they are in the same directory. I'm appreciative of any help on this matter.
Well, I am not entirely sure of what imports what, but here is what I can tell you. Python can sometimes allow for circular dependencies. However, it depends on what the layout of your dependencies is. First and foremost, I would say see if there is any way you can avoid this happening (restructuring your code, etc.). If it is unavoidable then there is one thing you can try. When Python imports modules, it does so in order of code execution. This means that if you have a definition before an import, you can sometimes access the definition in the first module by importing that first module in the second module. Let me give an example. Consider you have two modules, A and B.
A:
def someFunc():
# use B's functionality from before B's import of A
pass
import B
B:
def otherFunc():
# use A's functionality from before A's import of B
pass
import A
In a situation like that, Python will allow this. However, everything after the imports is not always fair game so be careful. You can read up on Python's module system more if you want to know why this works.
Helpful, but not complete link: https://docs.python.org/3/tutorial/modules.html
I'm having some real headaches right now trying to figure out how to import stuff properly. I had my application structured like so:
main.py
util_functions.py
widgets/
- __init__.py
- chooser.py
- controller.py
I would always run my applications from the root directory, so most of my imports would be something like this
from util_functions import *
from widgets.chooser import *
from widgets.controller import *
# ...
And my widgets/__init__.py was setup like this:
from widgets.chooser import Chooser
from widgets.controller import MainPanel, Switch, Lever
__all__ = [
'Chooser', 'MainPanel', 'Switch', 'Lever',
]
It was working all fine, except that widgets/controller.py was getting kind of lengthy, and I wanted it to split it up into multiple files:
main.py
util_functions.py
widgets/
- __init__.py
- chooser.py
- controller/
- __init__.py
- mainpanel.py
- switch.py
- lever.py
One of issues is that the Switch and Lever classes have static members where each class needs to access the other one. Using imports with the from ___ import ___ syntax that created circular imports. So when I tried to run my re-factored application, everything broke at the imports.
My question is this: How can I fix my imports so I can have this nice project structure? I cannot remove the static dependencies of Switch and Lever on each other.
This is covered in the official Python FAQ under How can I have modules that mutually import each other.
As the FAQ makes clear, there's no silvery bullet that magically fixes the problem. The options described in the FAQ (with a little more detail than is in the FAQ) are:
Never put anything at the top level except classes, functions, and variables initialized with constants or builtins, never from spam import anything, and then the circular import problems usually don't arise. Clean and simple, but there are cases where you can't follow those rules.
Refactor the modules to move the imports into the middle of the module, where each module defines the things that need to be exported before importing the other module. This can means splitting classes into two parts, an "interface" class that can go above the line, and an "implementation" subclass that goes below the line.
Refactor the modules in a similar way, but move the "export" code (with the "interface" classes) into a separate module, instead of moving them above the imports. Then each implementation module can import all of the interface modules. This has the same effect as the previous one, with the advantage that your code is idiomatic, and more readable by both humans and automated tools that expect imports at the top of a module, but the disadvantage that you have more modules.
As the FAQ notes, "These solutions are not mutually exclusive." In particular, you can try to move as much top-level code as possible into function bodies, replace as many from spam import … statements with import spam as is reasonable… and then, if you still have circular dependencies, resolve them by refactoring into import-free export code above the line or in a separate module.
With the generalities out of the way, let's look at your specific problem.
Your switch.Switch and lever.Lever classes have "static members where each class needs to access the other one". I assume by this you mean they have class attributes that are initialized using class attributes or class or static methods from the other class?
Following the first solution, you could change things so that these values are initialized after import time. Let's assume your code looked like this:
class Lever:
switch_stuff = Switch.do_stuff()
# ...
You could change that to:
class Lever:
#classmethod
def init_class(cls):
cls.switch_stuff = Switch.do_stuff()
Now, in the __init__.py, right after this:
from lever import Lever
from switch import Switch
… you add:
Lever.init_class()
Switch.init_class()
That's the trick: you're resolving the ambiguous initialization order by making the initialization explicit, and picking an explicit order.
Alternatively, following the second or third solution, you could split Lever up into Lever and LeverImpl. Then you do this (whether as separate lever.py and leverimpl.py files, or as one file with the imports in the middle):
class Lever:
#classmethod
def get_switch_stuff(cls):
return cls.switch_stuff
from switch import Swift
class LeverImpl(Lever):
switch_stuff = Switch.do_stuff()
Now you don't need any kind of init_class method. Of course you do need to change the attribute to a method—but if you don't like that, with a bit of work, you can always change it into a "class #property" (either by writing a custom descriptor, or by using #property in a metaclass).
Note that you don't actually need to fix both classes to resolve the circularity, just one. In theory, it's cleaner to fix both, but in practice, if the fixes are ugly, it may be better to just fix the one that's less ugly to fix and leave the dependency in the opposite direction alone.
I'm creating a class to extend a package, and prior to class instantiation I don't know which subset of the package's namespace I need. I've been careful about avoiding namespace conflicts in my code, so, does
from package import *
create problems besides name conflicts?
Is it better to examine the class's input and import only the names I need (at runtime) in the __init__ ??
Can python import from a set [] ?
does
for name in [namespace,namespace]:
from package import name
make any sense?
I hope this question doesn't seem like unnecessary hand-ringing, i'm just super new to python and don't want to do the one thing every 'beginnger's guide' says not to do (from pkg import * ) unless I'm sure there's no alternative.
thoughts, advice welcome.
In order:
It does not create other problems - however, name conflicts can be much more of a problem than you'd expect.
Definitely defer your imports if you can. Even though Python variable scoping is simplistic, you also gain the benefit of not having to import the module if the functionality that needs it never gets called.
I don't know what you mean. Square brackets are used to make lists, not sets. You can import multiple names from a module in one line - just use a comma-delimited list:
from awesome_module import spam, ham, eggs, baked_beans
# awesome_module defines lots of other names, but they aren't pulled in.
No, that won't do what you want - name is an identifier, and as such, each time through the loop the code will attempt to import the name name, and not the name that corresponds to the string referred to by the name variable.
However, you can get this kind of "dynamic import" effect, using the __import__ function. Consult the documentation for more information, and make sure you have a real reason for using it first. We get into some pretty advanced uses of the language here pretty quickly, and it usually isn't as necessary as it first appears. Don't get too clever. We hates them tricksy hobbitses.
When importing * you get everything in the module dumped straight into your namespace. This is not always a good thing as you could accentually overwrite something like;
from time import *
sleep = None
This would render the time.sleep function useless...
The other way of taking functions, variables and classes from a module would be saying
from time import sleep
This is a nicer way but often the best way is to just import the module and reference the module directly like
import time
time.sleep(3)
you can import like from PIL import Image, ImageDraw
what is imported by from x import * is limited to the list __all__ in x if it exists
importing at runtime if the module name isn't know or fixed in the code must be done with __import__ but you shouldn't have to do that
This syntax constructions help you to avoid any name collision:
from package import somename as another_name
import package as another_package_name