Using externally defined function/variable vs internal method/attribute

I have been wondering about this for a while and am still not sure of the right answer.
Apologies if a good answer already exists somewhere.
When is it appropriate for a class to use a function or variable defined at module level, rather than defining it inside the class as a method or attribute?
Example:
import os
import pandas as pd
PATH_TO_DIR = "abc\\def"
class Reader:
    def __init__(self, file_name):
        self.file_name = file_name
    def read_file(self):
        return pd.read_excel(os.path.join(PATH_TO_DIR, self.file_name))
or
class Reader:
    PATH_TO_DIR = "abc\\def"
    def __init__(self, file_name):
        self.file_name = file_name
    def read_file(self):
        return pd.read_excel(os.path.join(self.PATH_TO_DIR, self.file_name))
The same question applies to functions versus methods: for example, we could define read_file() as a module-level function and call it from within the class.
Defining these as methods/attributes feels more sensible to me, but I have seen a lot of code where they are defined externally.
I am asking about good practice in Python programming. I know the language tolerates a lot of strange things, but that is not what I am asking about ;)

I would lean towards option 3: pass the correct absolute path to Reader.__init__. The job of Reader, presumably, is to parse a file, not worry about file-system layout.
PATH_TO_DIR = "abc\\def"
class Reader:
    def __init__(self, file_name):
        self.file_name = file_name
    def read_file(self):
        return pd.read_excel(self.file_name)
r = Reader(os.path.join(PATH_TO_DIR, "foo.xl"))
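A runnable sketch of this option, under the assumption that a plain-text read stands in for pd.read_excel so that only the standard library is needed. The file name foo.txt and the temporary directory are made up for illustration.

```python
import tempfile
from pathlib import Path


class Reader:
    """Reads whatever path it is given; it knows nothing about directory layout."""

    def __init__(self, file_path):
        self.file_path = Path(file_path)

    def read_file(self):
        # Stand-in for pd.read_excel(self.file_path) so the sketch runs without pandas.
        return self.file_path.read_text(encoding="utf-8")


# The caller decides where files live; a temporary directory plays PATH_TO_DIR here.
with tempfile.TemporaryDirectory() as path_to_dir:
    target = Path(path_to_dir) / "foo.txt"
    target.write_text("hello", encoding="utf-8")
    content = Reader(target).read_file()
```

Because Reader never builds the path itself, a test can point it at any file without touching a module-level constant.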

I believe it is good practice to define it externally, because that makes the function easier to reuse. You can also reuse the same variable in other functions and classes.
In your first example, you define a variable that can be used by multiple classes. The class itself could also be imported by another script that you did not design it for.
In the second example, the variable is tied to this class, and if you want to reuse it somewhere else you have to overwrite it after initialization, which means running the __init__() method.
Personally, I avoid defining variables inside classes and functions.
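For comparison, here is a small sketch of how the two placements behave. ArchiveReader and full_path are hypothetical names introduced only for this illustration. It suggests that a class-level value can be read without creating an instance and can be overridden per subclass, which may be worth weighing against the reuse argument above.

```python
import os

PATH_TO_DIR = "abc"  # module-level: any function or class in the module can use it


class Reader:
    PATH_TO_DIR = "abc"  # class-level: namespaced under Reader

    def __init__(self, file_name):
        self.file_name = file_name

    def full_path(self):
        # Looked up through the instance, so subclass overrides are honoured.
        return os.path.join(self.PATH_TO_DIR, self.file_name)


class ArchiveReader(Reader):
    PATH_TO_DIR = "archive"  # a subclass can override the class-level value


class_level = Reader.PATH_TO_DIR  # readable without creating an instance
default_path = Reader("foo.xl").full_path()
archive_path = ArchiveReader("foo.xl").full_path()
```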

Related

Variables inside constructor without "self" - Python

I am trying to understand whether declaring variables inside constructors is acceptable practice. I only need these variables inside the constructor. I am asking because most constructors I have seen contain only self variables. I tried to find an answer on the internet but had no luck.
Here is some example code:
class Patient:
    def __init__(self, all_images_path: str):
        all_patient_images = os.listdir(all_images_path)
        all_patient_images = [(all_images_path + x) for x in all_patient_images if 'colourlay' in x]
        self.images: [Image] = [Image.open(img_path) for img_path in all_patient_images]
Is there anything wrong with the above code? If yes, what?
Thank you!

__init__ is just a normal function that has a special purpose (initializing the newly created object).
You can do (almost) whatever you like in there. You typically see lines like self.name = arg because assignments like these are often required to set up the internal state of the object. It's perfectly fine to have local variables though, and if you find they make your code cleaner, use them.
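A minimal sketch of the difference, using a made-up Circle class: the local variable disappears when __init__ returns, while the self assignment becomes part of the object's state.

```python
import math


class Circle:
    def __init__(self, diameter):
        radius = diameter / 2               # local: exists only while __init__ runs
        self.area = math.pi * radius ** 2   # attribute: stored on the instance


c = Circle(4)
attribute_names = sorted(vars(c))  # only names assigned through self appear here
```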

From a design standpoint, Patient.__init__ is doing too much. I would keep __init__ as simple as possible:
class Patient:
    def __init__(self, images: list[Image]):
        self.images = images
The caller of __init__ is responsible for producing that list of Image values. However, that doesn't mean your user is the one calling __init__. You can define a class method to handle the messy details of extracting images from a path, and calling __init__. (Note it gets less messy if you use the pathlib module.)
from pathlib import Path
class Patient:
    def __init__(self, images: list[Image]):
        self.images = images
    @classmethod
    def from_path(cls, all_images_path: Path):
        files = all_images_path.glob('*colourlay*')
        return cls([Image.open(f) for f in files])
Note that Image itself seems to take this approach: you aren't constructing an instance of Image directly, but using a class method to extract the image data from a file.
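A runnable variant of the same pattern, assuming Path.read_bytes as a stand-in for Image.open so that it works without an imaging library. The file names are invented for the demonstration.

```python
import tempfile
from pathlib import Path


class Patient:
    def __init__(self, images):
        self.images = images

    @classmethod
    def from_path(cls, all_images_path):
        # read_bytes stands in for Image.open; sorted() makes the order predictable.
        files = sorted(Path(all_images_path).glob("*colourlay*"))
        return cls([f.read_bytes() for f in files])


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a_colourlay.png").write_bytes(b"one")
    (Path(tmp) / "b_other.png").write_bytes(b"skip")
    from_disk = Patient.from_path(tmp)

in_memory = Patient([b"fake"])  # tests can bypass the file system entirely
```

Keeping __init__ trivial means a test can build a Patient from in-memory data, while production code goes through from_path.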

Patching in Python

I have a python file say
python_file_a.py
def load_content():
    dir = "/down/model/"
    model = Model(model_dir=dir)
    return model
model = load_content()
def invoke(req):
    return model.execute(req)
test_python_file_a.py
@patch("module.python_file_a.load_content")
@patch("module.python_file_a.model", Mock(spec=Model))
def test_invoke():
    from module.python_file_a import model, invoke
    model.execute = Mock(return_value="Some response")
    invoke("some request")
This is still trying to load the actual model from the path "/down/model/" in the test. What is the correct way of patching so that the load_content function is mocked in the test?

Without knowing more about what your code does or how it is used, it is hard to say exactly, but in this case, as in many others, the correct approach is not to hard-code values as local variables in functions. Change your load_content() function to take an argument, like:
def load_content(dirname):
    ...
or even give it a default value like
def load_content(dirname="/default/path"):
    pass
For the test don't use the model instance instantiated at module level (arguably you should not be doing this in the first place, but again it depends on what you're trying to do).
Update: Upon closer inspection, the problem really seems to stem from you instantiating a module-global instance at import time. Maybe try to avoid doing that and use lazy instantiation instead, like:
model = None
then if you really must write a function that accesses the global variable:
def invoke(req):
    global model
    if model is None:
        model = load_content()
    return model.execute(req)
Alternatively you can use a PEP 562 module-level __getattr__ function.
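A minimal sketch of the PEP 562 idea. In a real file you would write def __getattr__(name) at the top level of the module; here a types.ModuleType object is used only so the sketch fits in one runnable block, and load_content is a hypothetical stand-in that returns a dict instead of a real Model.

```python
import types

load_calls = []


def load_content():
    # Hypothetical stand-in for the expensive Model(...) construction.
    load_calls.append("loaded")
    return {"name": "model"}


def _lazy_getattr(name):
    # Invoked only when normal module attribute lookup fails.
    if name == "model":
        value = load_content()
        setattr(lazy_module, "model", value)  # cache so later lookups skip this hook
        return value
    raise AttributeError(f"module {lazy_module.__name__!r} has no attribute {name!r}")


lazy_module = types.ModuleType("python_file_a")
lazy_module.__getattr__ = _lazy_getattr  # same effect as a top-level def in the file

calls_before = len(load_calls)  # nothing has been loaded yet
first = lazy_module.model       # triggers the load
second = lazy_module.model      # served from the cached attribute
```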
Or write a class instead of putting everything at module-level.
import functools
class ModelInvoker:
    def __init__(self, dirname='/path/to/content'):
        self.dirname = dirname
    @functools.cached_property
    def model(self):
        return load_content(self.dirname)
    def invoke(self, req):
        return self.model.execute(req)
Many other approaches to this depending on your use case. But finding some form of encapsulation is what you need if you want to be able to easily mock and replace parts of some code, and not execute code unnecessarily at import time.
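One way such encapsulation can make the test straightforward is sketched below. The loader parameter is an assumption added for illustration and is not part of the code above; it lets a test hand in a fake instead of patching module globals.

```python
import functools
from unittest.mock import Mock


def load_content(dirname):
    # Placeholder for the real loader; it must never run in the test below.
    raise RuntimeError("would load a real model from " + dirname)


class ModelInvoker:
    def __init__(self, dirname="/down/model/", loader=load_content):
        self.dirname = dirname
        self._loader = loader  # hypothetical injection point

    @functools.cached_property
    def model(self):
        # Nothing is loaded until the first access, so importing is side-effect free.
        return self._loader(self.dirname)

    def invoke(self, req):
        return self.model.execute(req)


fake_model = Mock()
fake_model.execute.return_value = "Some response"
fake_loader = Mock(return_value=fake_model)

invoker = ModelInvoker(loader=fake_loader)
result = invoker.invoke("some request")
```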

In Python, is referencing an instance variable from a staticmethod possible?

I know the question has been asked before, but I find myself bumping into situations where a staticmethod is most appropriate, but there is also a need to reference an instance variable inside this class. As an example, let's say I have the following class:
class ExampleClass(object):
    def __init__(self, filename='defaultFilename'):
        self.file_name = filename
    @staticmethod
    def doSomethingWithFiles(file_2, file_1=None):
        # if user didn't supply a file use the instance variable
        if file_1 is None:
            # no idea how to handle the uninitialized class case to create
            # self.file_name.
            file_1 = __class__.__init__().__dict__['file_name']  # <--- this seems sketchy
        else:
            file_1 = file_1
        with open(file_1, 'r') as f1, open(file_2, 'w') as f2:
            ...  # you get the idea
    def moreMethodsThatUseSelf(self):
        pass
Now suppose I had a few instances of the ExampleClass (E1, E2, E3) with different filenames passed into __init__, but want to retain the ability to use either an uninitialized class ExampleClass.doSomethingWithFiles(file_2 = E1.file_name, file_1 = E2.file_name) or E1.doSomethingWithFiles(file_2 = E2.file_name, file_1 = 'some_other_file') as the situation requires.
Is there any reason for me to try to find a way to do what I am thinking, or am I making a mess?
UPDATE
I think the comments are helpful and I also think it's an issue I'm encountering due to bad design.
The issue started out as a way to prevent concurrent access to HDF5 files by giving each class instance an rlock that I could use as a context manager to block any other attempt to access the file while it was in use. Each class instance had its own rlock, which it acquired and released when it was done with whatever it needed to do. I was also using @staticmethod to perform a routine that generated a file, which was then passed into its own __init__() and was unique to each class instance. At the time it seemed clever, but I regret it. I also think I am unsure when @staticmethod is ever appropriate and may have been confusing it with @classmethod, but a class variable would no longer allow the rlocks and files to be unique to each class instance. I should probably think more about design rather than trying to justify using a construct I do not really understand in a way it was designed to protect against.

If you think you keep bumping into situations where a staticmethod is most appropriate, you're probably wrong—good uses for them are very rare. And if your staticmethod needs to access instance variables, you're definitely wrong.
A staticmethod cannot access instance variables directly. There can be no instances of the class, or a thousand; which one would you access the variables from?
What you're trying to do is to create a new instance, just to access its instance variables. This can occasionally be useful, although it's more often a good sign you didn't need a class in the first place. (And, when it is useful, it's unusual enough to be usually worth signaling, by having the caller write ExampleClass().doSomethingWithFiles instead of ExampleClass.doSomethingWithFiles.)
That's legal, but you do it by just calling the class, not by calling its __init__ method. That __init__ never returns anything; it receives an already-created self and modifies it. If you really want to, you can call its __new__ method, but that effectively just means the same thing as calling the class. (In the minor ways in which they're different, it's calling the class that you want.)
Also, once you've got an instance, you can just use it normally; you don't need to look at its __dict__. (Even if you only had the attribute name as a string variable, getattr(obj, name) is almost always what you want there, not obj.__dict__[name].)
So:
file_1 = __class__().file_name
So, what should you do instead?
Well, look at your design. The only thing an ExampleClass instance does is hold a filename, which has a default value. You don't need an object for that, just a plain old string variable that you pass in, or store as a global. (You may have heard that global variables are bad—but global variables in disguise are just as bad, and have the additional problem that they're in disguise. And that's basically what you've designed. And sometimes, global variables are the right answer.)
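To make the earlier points about __init__ and getattr concrete, here is a minimal sketch. The names init_result and attr_name are illustrative only.

```python
class ExampleClass:
    def __init__(self, filename="defaultFilename"):
        self.file_name = filename


# __init__ only fills in an already-created object and returns None...
blank = ExampleClass.__new__(ExampleClass)
init_result = ExampleClass.__init__(blank)

# ...whereas calling the class creates the object and then runs __init__ on it.
instance = ExampleClass()

# With the attribute name held in a string, getattr is the usual tool.
attr_name = "file_name"
via_getattr = getattr(instance, attr_name)
```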

Why not pass the instance as a parameter to the static method? I hope this code is helpful.
class ClassA:
    def __init__(self, fname):
        self.fname = fname
    def print(self):
        print('fname=', self.fname)
    @staticmethod
    def check(f):
        if isinstance(f, ClassA):
            print('f exists.')
            f.print()
            print('f.fname=', f.fname)
        else:
            print('f does not exist: new ClassA')
            newa = ClassA(f)
            return newa
a = ClassA('temp')
b = ClassA('test')
ClassA.check(a)
ClassA.check(b)
newa = ClassA.check('hello')
newa.print()

You cannot refer to an instance attribute from a static method. Suppose multiple instances exist, which one would you pick the attribute from?
What you seem to need is to have a class attribute and a class method. You can define one by using the classmethod decorator.
class ExampleClass(object):
    file_name = 'foo'
    @classmethod
    def doSomethingWithFiles(cls, file_2, file_1=None):
        if file_1 is None:
            file_1 = cls.file_name
        # Do stuff

Maybe I'm misunderstanding what your intentions are but I think you're misusing the default parameter.
It appears you're trying to use 'defaultFilename' as the default parameter value. Why not just skip the awkward
if file_1 is None:
    # no idea how to handle the uninitialized class case to create
    # self.file_name.
    file_1 = __class__.__init__().__dict__['file_name']  # <--- this seems sketchy
and change the function as follows,
def doSomethingWithFiles(file_2, file_1='defaultFilename'):
If hardcoding that value makes you uncomfortable maybe try
class ExampleClass(object):
    DEFAULT_FILE_NAME = 'defaultFilename'
    def __init__(self, filename=DEFAULT_FILE_NAME):
        self.file_name = filename
    @staticmethod
    def doSomethingWithFiles(file_2, file_1=DEFAULT_FILE_NAME):
        with open(file_1, 'r') as f1, open(file_2, 'w') as f2:
            pass  # do magic in here
    def moreMethodsThatUseSelf(self):
        pass
In general, though, you're probably modeling your problem wrong if you want to access an instance variable inside a static method.
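One caveat with a default taken from a class constant is that default values are evaluated once, when the function is defined. The sketch below, using hypothetical method names frozen_default and late_default and a made-up Child subclass, contrasts that with resolving a None sentinel at call time.

```python
class ExampleClass:
    DEFAULT_FILE_NAME = "defaultFilename"

    @staticmethod
    def frozen_default(file_1=DEFAULT_FILE_NAME):
        # The default was captured once, while the class body was being executed.
        return file_1

    @classmethod
    def late_default(cls, file_1=None):
        # Resolving None at call time picks up subclass overrides.
        return cls.DEFAULT_FILE_NAME if file_1 is None else file_1


class Child(ExampleClass):
    DEFAULT_FILE_NAME = "childFilename"


frozen = Child.frozen_default()  # still the parent's value
late = Child.late_default()      # the subclass's value
```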

How do you decide which level a function should be at in python?

I have a file called file_parsers.py and it contains the following class:
class FileParser():
    def __init__(self, file_text):
        self.file_text = file_text
    def do_something(self):
        my_value = func_with_no_state()
I'm not sure what questions to ask when deciding whether func_with_no_state() should be inside the class or outside of the class as a file-level function.
Also, is it easier to stub this function when it is at a file-level or inside the class?

So... Does any other class use func_with_no_state? If not, it should be hidden within FileParser. If something else does use it, you have a bigger question. If OtherClass uses func_with_no_state pretty frequently (on par with FileParser) then it would be a good idea to keep func_with_no_state outside so that both classes can use it. But if FileParser is by far the main user, then OtherClass could just pull the function from FileParser's definition.
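On the stubbing part of the question, a rough sketch with unittest.mock suggests that both placements are about equally easy to replace. The static method helper_in_class is a hypothetical addition for comparison, and sys.modules[__name__] is used only because the sketch lives in a single file; in a test suite you would more likely write patch("file_parsers.func_with_no_state").

```python
import sys
from unittest.mock import patch


def func_with_no_state():
    return "real"


class FileParser:
    def __init__(self, file_text):
        self.file_text = file_text

    @staticmethod
    def helper_in_class():
        return "real"

    def do_something(self):
        return func_with_no_state(), self.helper_in_class()


parser = FileParser("text")
this_module = sys.modules[__name__]

# Stub the module-level function by replacing the global name the class looks up.
with patch.object(this_module, "func_with_no_state", return_value="stub"):
    module_stubbed = parser.do_something()

# Stub the static method by patching the attribute on the class.
with patch.object(FileParser, "helper_in_class", return_value="stub"):
    class_stubbed = parser.do_something()

unpatched = parser.do_something()  # both originals are restored afterwards
```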

Preferred pythonic way to associate upload_to function with model class?

I have a class with a Django ImageField and I have been struggling to decide between two alternatives for storing that field's upload_to function. The first approach is pretty straightforward. The function is defined at the module level (cf. https://stackoverflow.com/a/1190866/790075, https://stackoverflow.com/a/3091864/790075):
def get_car_photo_file_path(instance, filename):
    ext = filename.split('.')[-1]
    filename = "%s.%s" % (uuid.uuid4(), ext)  # chance of collision <1e-50
    return os.path.join('uploads/cars/photos', filename)
class CarPhoto(models.Model):
    photo = models.ImageField(upload_to=get_car_photo_file_path)
This is simple and easy to understand, but pollutes the module scope by adding a function that is really only pertinent to the CarPhoto class.
In the second approach, I use the callable-class pattern to associate the function more closely with the CarPhoto class. This moves the upload_to function out of module scope but feels unnecessarily complicated.
class CarPhoto(models.Model):
    class getCarPhotoFilePath():
        # Either use this pattern or associate function with module instead of this class
        def __call__(self, instance, filename):
            ext = filename.split('.')[-1]
            filename = "%s.%s" % (uuid.uuid4(), ext)  # chance of collision <1e-50
            return os.path.join('uploads/cars/photos', filename)
    photo = models.ImageField(upload_to=getCarPhotoFilePath())
I have seen suggestions for using the @staticmethod and @classmethod decorators (cf. https://stackoverflow.com/a/9264153/790075), but I find that when I do this the function never executes and the filename ends up looking like: /path/to/file/<classmethod object>, with the method object embedded in the file path, which is certainly not intended!
Which of these is the preferred pattern? Is there a better way?

I would recommend that you:
import this
To me, this falls under the Zen of Python's section stating:
Simple is better than complex.
Complex is better than complicated.
I think your simple solution is better. But your complex solution doesn't feel overly complicated. I think you will probably be OK either way. Just my two cents.

There's a naming convention to prevent name pollution.
Use _get_car_photo_file_path to mark your function as internal (though not hidden);
use __get_car_photo_file_path inside the class to trigger name mangling, which makes access from outside your class harder (though not impossible).
You can add a classmethod or staticmethod like this to your CarPhoto class, which is simpler than adding a callable class (the latter reminds me of Java's way to define an anonymous class for the sake of one method).
The name will cleanly show that _get_car_photo_file_path is an implementation detail and not a part of the interface, thus preventing pollution of class's namespace. Being CarPhoto's method, the function will not pollute module's namespace.
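A short sketch of what the two prefixes actually do, using a plain class without Django and a simplified, made-up method signature:

```python
class CarPhoto:
    @staticmethod
    def _get_file_path(filename):
        # Single underscore: a convention only, still reachable from outside.
        return "uploads/" + filename

    @staticmethod
    def __get_file_path(filename):
        # Double underscore: stored under the mangled name _CarPhoto__get_file_path.
        return "uploads/" + filename


single = CarPhoto._get_file_path("a.png")
has_plain_name = hasattr(CarPhoto, "__get_file_path")     # the unmangled name is absent
mangled = CarPhoto._CarPhoto__get_file_path("a.png")      # but the mangled one works
```

Neither prefix hides the function completely; both mainly signal intent to readers of the code.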

Currently in the code I'm working with we have the variation of the simplest one. The only difference is that since the function is intended for internal use, it's marked so with _ prefix.
def _get_car_photo_file_path(instance, filename):
    [...]
class CarPhoto(models.Model):
    photo = models.ImageField(upload_to=_get_car_photo_file_path)
However, I do believe this would be more Pythonic (or rather more OOP):
class CarPhoto(models.Model):
    @staticmethod
    def _get_file_path(instance, filename):
        [...]
    photo = models.ImageField(upload_to=_get_file_path)
