How to group/collect variables and keep IDE (PyCharm) highlighting? - python

In Python, I often collect several variables that I use as parameters or settings into one container.
For instance, I might organize settings (or parameters, or options) like
nr_of_users = 100
file_label = "copy"
into a dictionary, like so:
options = {}
options['nr_of_users'] = 100
options['file_label'] = "copy"
This makes it easy to print and review these parameters, and to pass them between functions in one chunk, since they are all collected under one "handle".
And I can add fields as they come up, without first defining a class and its members.
The issue with this approach is that my IDE doesn't recognize these entries as variables, so I miss out on variable highlighting (linking definition to use) and refactoring (renaming) support.
(My IDE is PyCharm, but I suspect the issue is not unique to it.)
Is there a better approach?
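For reference, a minimal sketch (assuming Python 3.7+ for dataclasses; the class name is hypothetical) of the kind of class-based container that keeps the single "handle" while giving the IDE real attributes to highlight and rename:
from dataclasses import dataclass

@dataclass
class Options:
    # each field is a real attribute the IDE can link and rename
    nr_of_users: int = 100
    file_label: str = "copy"

options = Options()
print(options.nr_of_users)  # IDE-navigable, unlike options['nr_of_users']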

Related

No Auto Variables Debugging Python in VS2019

I like VS2019 and I want to do as much development in it as possible without constantly switching IDEs. To that end, I tried coding in Python, but when it came to debugging, it really doesn't hold a candle to PyCharm.
For one, the "Autos" variables don't show up on my end. This is with a project I created within VS2019. Instead, to see variables I have to go to the super cluttered "Locals" tab, which for whatever reason also includes collections and a bunch of packages cluttering up my debug monitor. I can't even remove these entries to get a cleaner window.
In C++, Autos was automatically populated with the variables in scope of the current function call. In Locals, everything is there, including stuff I don't care about.
The worst part: with classes holding multiple values, the object in the debug window can't even expand to show the values it holds, like it does so well in PyCharm.
Is there a way to fix this? Are there different debug monitor windows you can use to make variable tracking as close and intuitive as PyCharm's?
I fixed it by going to Tools -> Options -> Python -> Debugging and enabling "use legacy debugger". Unfortunately it doesn't track multiple variables in Autos, just the most recent one.

Tracking changes in python source files?

I'm learning Python and ran into a situation where I need to change the behaviour of a function. I'm originally a Java programmer, and in the Java world a change to a function makes Eclipse show errors in many source files, so I know which files need to be modified. But how would one do such a thing in Python, considering there are no static types? I'm using TextMate2 for Python coding.
Currently I'm doing it the brute-force way: opening every Python script file, checking where I use that function, and then modifying it. But I'm sure this is not the way to deal with large projects!
Edit: as an example, I define a class called Graph in a Python script file. Graph has two instance variables. I created many objects of this class (each with a different name!) in many script files, and then decided that I want to rename the instance variables. Now I'm going through each file and re-reading my code in order to change the names :(. Please help!
Example: file A has objects x, y, z of class C. File B has objects xx, yy, zz of class C. Class C has two instance variables whose names should change, Foo to Poo and Foo1 to Poo1. Also consider many files like A and B. What would you do to solve this? Are you seriously going to open each file, search for x, y, z, xx, yy, zz, and change the names individually?!
Sounds like you can only code inside an IDE!
Two steps to free yourself from your IDE and become a better programmer.
Write unit tests for your code.
Learn how to use grep
Unit tests will exercise your code and provide reassurance that it is always doing what you want it to do. They make refactoring MUCH easier.
grep, what a wonderful tool. grep -R 'my_function_name' src will find every reference to your function in files under the directory src.
Also, see this rather wonderful blog post: Unix as an IDE.
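For illustration, a minimal sketch of such a test (the Graph class and the Poo/Poo1 names come from your question; the graph module name and the no-argument constructor are assumptions):
import unittest
from graph import Graph  # hypothetical module holding the Graph class

class TestGraph(unittest.TestCase):
    def test_renamed_instance_variables(self):
        g = Graph()
        # fails loudly if the renamed attributes are missing; any test
        # exercising stale call sites fails with AttributeError too
        self.assertTrue(hasattr(g, "Poo"))
        self.assertTrue(hasattr(g, "Poo1"))

if __name__ == '__main__':
    unittest.main()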
Whoa, slow down. The coding process you described is not scalable.
How exactly did you change the behavior of the function? Give specifics, please.
UPDATE: This all sounds like you're trying to implement a class and its methods by cobbling together a motley patchwork of functions and local variables, like I wrongly did when I first learned OO coding in Python. The code smell is that when the type/class of some class internal changes, it should generally not affect the class's methods. If you're refactoring all your code every 10 minutes, you're doing something seriously wrong. Step back and think about a clean decomposition into objects, methods, and data members.
(Please give more specifics if you want a more useful answer.)
If you were only changing input types, there might be no need to change the calling code.
(Unless the new fn does something very different from the old one, in which case what was the argument against giving it a different name?)
If you changed the return type, and you can't find a common ancestor type or container (tuple, sequence, etc.) to put the return values in, then yes, you need to change the calling code. However...
...however, if the function should really be a method of a class, declare that class and the method already. The previous paragraph was a code smell that your function really should have been a method, specifically a polymorphic method.
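To make that concrete, a minimal sketch (all names hypothetical) of turning a type-switching free function into a polymorphic method:
class Shape(object):
    def describe(self):
        raise NotImplementedError

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius
    def describe(self):
        # callers never inspect the type; dispatch is polymorphic
        return "circle of radius %s" % self.radius

class Square(Shape):
    def __init__(self, side):
        self.side = side
    def describe(self):
        return "square of side %s" % self.side

for shape in (Circle(2), Square(3)):
    print(shape.describe())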
Read about code smells, anti-patterns and When do you know you're dealing with an anti-pattern?. There, for example, you will find a recommendation for the video "Recovery from Addiction - A taste of the Python programming language's concision and elegance from someone who once suffered an addiction to the Java programming language." - Sean Kelly
Also, it sounds like you want to use Test-Driven Development and add some unit tests.
If you give us the specifics we can critique it better.
You won't get this functionality in a text editor. I use Sublime Text 3, and I love it, but it doesn't have this functionality. It does, however, jump to files and functions via its 'Goto Anything' (Ctrl+P) feature, and its Multiple Selections / Multi Edit is great for small refactoring tasks.
However, when it comes to IDEs, JetBrains PyCharm has some of the amazing refactoring tools you might be looking for.
The also-free Python Tools for Visual Studio (see free install options here, which can use the free VS Shell) has some excellent refactoring capabilities and a superb REPL to boot.
I use all three: I spend most of my time in Sublime Text, I like PyCharm for refactoring, and I find PT4VS excellent for very involved prototyping.
Despite Python being a dynamically typed language, IDEs can still introspect to a reasonable degree. But, of course, it won't approach the level of Java or C# IDEs. Incidentally, if you are coming over from Java, you may have come across JetBrains IntelliJ, which PyCharm will feel almost identical to.
One's programming style is certainly different between a statically typed language like C# and a dynamic language like Python. I find myself working in smaller, testable modules; the iteration speed is faster. And in a dynamic language one relies less on IDE tools and more on unit tests that cover the key functionality. If you don't have these, you will break things when you refactor.
One answer, specific only to your edit:
If your old code works and does not need to be modified, you can keep the old names as aliases of the new ones, so your old code is not broken. Example:
import time

class MyClass(object):
    def __init__(self):
        self.t = time.time()

    # creating new names
    def new_foo(self, arg):
        return 'new_foo', arg

    def new_bar(self, arg):
        return 'new_bar', arg

    # now creating method aliases under the old names
    foo = new_foo
    bar = new_bar
If your code needs rework, rewrite your common code, execute everything, and correct any failures. You could also look for any import/instantiation of your class.
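Since your rename involves instance variables rather than methods, the same aliasing trick can be done with a property; a sketch using the Foo/Poo names from your question (the class body is otherwise hypothetical):
class C(object):
    def __init__(self):
        self.Poo = 0   # renamed from Foo
        self.Poo1 = 0  # renamed from Foo1

    # property alias: old call sites reading or writing c.Foo keep working
    @property
    def Foo(self):
        return self.Poo

    @Foo.setter
    def Foo(self, value):
        self.Poo = value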
One of the tradeoffs between statically and dynamically typed languages is that the latter require less scaffolding in the form of type declarations, but also provide less help with refactoring tools and compile-time error detection. Some Python IDEs do offer a certain level of type inference and help with refactoring, but even the best of them will not be able to match the tools developed for statically typed languages.
Dynamic language programmers typically ensure correctness while refactoring in one or more of the following ways:
Use grep to look for function invocation sites, and fix them. (You would have to do that in languages like Java as well if you wanted to handle reflection.)
Start the application and see what goes wrong.
Write unit tests, if you don't already have them, use a coverage tool to make sure that they cover your whole program, and run the test suite after each change to check that everything still works.

Dynamically broadcast configuration changes in python twisted

I am about to refactor the code of a python project built on top of twisted. So far I have been using a simple settings.py module to store constants and dictionaries like:
# settings.py
MY_CONSTANT = 'whatever'
A_SLIGHTLY_COMPLEX_CONF = {'param_a': 'a', 'param_b': 'b'}
A great many modules import settings.py to do their stuff.
The reason I want to refactor the project is that I need to change/add configuration parameters on the fly. The approach I am about to take is to gather all configuration in a singleton and access its instance whenever I need to:
from settings import MyBloatedConfig

def first_interesting_function():
    cfg = MyBloatedConfig.get_instance()
    a_much_needed_param = cfg["a_respectable_key"]
    # do stuff

# several thousand functions later...
def gazillionth_function_in_module():
    tired_cfg = MyBloatedConfig.get_instance()
    a_frustrated_value = tired_cfg["another_respectable_key"]
    # do other stuff
This approach works but feels unpythonic and bloated. An alternative would be to externalize the cfg object in the module, like this:
CONFIG = MyBloatedConfig.get_instance()

def a_suspiciously_slimmer_function():
    suspicious_value = CONFIG["a_shady_parameter_key"]
Unfortunately this does not work if the MyBloatedConfig instance is replaced (rather than mutated in place) from another module. Since I am using the reactor pattern, storing stuff in a thread local is out of the question, as is using a queue.
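To spell out that failure mode, a sketch (module names hypothetical) of why a module-level binding goes stale when the instance is replaced elsewhere:
# config_holder.py (hypothetical module)
CONFIG = {"a_shady_parameter_key": "old"}

# consumer.py (hypothetical module)
from config_holder import CONFIG  # binds the *current* object at import time

def a_suspiciously_slimmer_function():
    return CONFIG["a_shady_parameter_key"]

# elsewhere: rebinding the name in config_holder does NOT reach consumer
import config_holder
config_holder.CONFIG = {"a_shady_parameter_key": "new"}
# consumer.a_suspiciously_slimmer_function() still returns "old";
# mutating the shared object in place (instead of rebinding it)
# would have been visible everywhere.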
For completeness, the following is the implementation I am using for the singleton pattern:
from functools import wraps

instances = {}

def singleton(cls):
    """Use class as singleton."""
    @wraps(cls)
    def get_instance(*args, **kwargs):
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]
    return get_instance

@singleton
class MyBloatedConfig(dict):
    ...
Is there some other more pythonic way to broadcast configuration changes across different modules?
The big, global (often singleton) config object is an anti-pattern.
Whether you have settings.py, a singleton in the style of MyBloatedConfig.get_instance(), or any of the other approaches you've outlined here, you're basically using the same anti-pattern. The exact spelling doesn't matter, these are all just ways to have a true global (as distinct from a Python module level global) shared by all of the code in your entire project.
This is an anti-pattern for a number of reasons:
It makes your code difficult to unit test. Any code that changes its behavior based on this global is going to require some kind of hacking - often monkey-patching - in order to let you unit test its behavior under different configurations. Compare this to code which is instead written to accept arguments (as in, function arguments) and alters its behavior based on the values passed to it.
It makes your code less re-usable. Since the configuration is global, you'll have to jump through hoops if you ever want to use any of the code that relies on that configuration object under two different configurations. Your singleton can only represent one configuration. So instead you'll have to swap global state back and forth to get the different behavior you want.
It makes your code harder to understand. If you look at a piece of code that uses the global configuration and you want to know how it works, you'll have to go look at the configuration. Much worse than this, though, is if you want to change your configuration you'll have to look through your entire codebase to find any code that this might affect. This leads to the configuration growing over time, as you add new items to it and only infrequently remove or modify old ones, for fear of breaking something (or for lack of time to properly track down all users of the old item).
The above problems should hint to you what the solution is. If you have a function that needs to know the value of some constant, make it accept that value as an argument. If you have a function that needs a lot of values, then create a class that can wrap up those values in a convenient container and pass an instance of that class to the function.
The part of this solution that often bothers people is that they don't want to spend the time typing out all of this argument passing. Whereas before you had functions that might have taken one or two (or even zero) arguments, now you'll have functions that might need to take three or four. And if you're converting an application written in the settings.py style, you may find that some of your functions used half a dozen or more items from your global configuration, and those functions suddenly have a really long signature.
I won't dispute that this is a potential issue, but it should be seen mostly as an issue with the structure and organization of the existing code. The functions that end up with grossly long signatures depended on all of that data before; the fact was just obscured from you. And as with most programming patterns that hide aspects of your program from you, this is a bad thing. Once you are passing all of these values around explicitly, you'll see where your abstractions need work. Maybe that 10-parameter function is doing too much and would work better as three different functions. Or maybe you'll notice that half of those parameters are actually related and always belong together as part of a container object. Perhaps you can even put some logic related to the manipulation of those parameters onto that container object.
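As a minimal sketch of the argument-passing style advocated above (all names hypothetical):
class DatabaseConfig(object):
    """Wraps related configuration values in one convenient container."""
    def __init__(self, host, port, timeout):
        self.host = host
        self.port = port
        self.timeout = timeout

def connect(config):
    # behavior depends only on what was passed in, so a unit test can
    # simply hand this function a different DatabaseConfig
    return "connecting to %s:%d (timeout=%ss)" % (
        config.host, config.port, config.timeout)

print(connect(DatabaseConfig("localhost", 5432, 10)))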

Are verbose __init__ methods in Python bad?

I have a program that I am writing in Python that does the following:
The user enters the name of a folder. Inside that folder are 8-15 .dat files with different extensions.
The program opens those .dat files, enters them into a SQL database, and then lets the user apply various changes to the database. The database is then exported back to the .dat files. There are about 5-10 different operations that can be performed.
The way I had planned to design this was to create a standard class for each group of files. The user enters the name of a folder, and an object is created with certain attributes (file names, a dictionary of files, the version of the files (there are different versions), etc.). Determining these attributes requires opening a few of the files, reading file names, and so on.
Should these actions be carried out in the __init__ method? Or should they be carried out in separate instance methods that get called from __init__? Or should these methods live somewhere else and only be called when the attribute is required elsewhere in the program?
I have already written this program in Java, where I had a constructor that called other methods in the class to set the object's attributes. But I was wondering what standard practice in Python would be.
Well, there is nothing special about good OOP practices in Python. Decomposing one big method into a bunch of small ones is a great idea in Java and in Python alike. Among other things, small methods give you the opportunity to write different constructors:
class GroupDescriptor(object):
    def __init__(self, file_dictionary):
        self.file_dict = file_dictionary
        self.load_something(self.file_dict['file_with_some_info'])

    @classmethod
    def from_filelist(cls, list_of_files):
        file_dict = cls.get_file_dict(list_of_files)
        return cls(file_dict)

    @classmethod
    def from_dirpath(cls, directory_path):
        files = cls.list_dir(directory_path)
        return cls.from_filelist(files)
Besides, I don't know how it is in Java, but in Python you don't have to worry about exceptions in a constructor, because they are handled cleanly. It is therefore totally normal to work with exception-prone things like files there.
It looks like the actions you are describing are initialization, so it'd be perfectly OK to put them into __init__. On the other hand, these actions seem to be pretty expensive, and probably useful in other parts of the program, so you might want to encapsulate them in separate functions.
There's no problem with having a long __init__ method, but I would avoid it simply because it's more difficult to test. My approach is to create smaller methods that are called from __init__. This way you can test them and the initialization separately.
Whether they should be called when needed or run up front really depends on what you need them to do. If they are expensive operations that are usually not all needed, then maybe it's better to call them only when needed. On the other hand, you might want to run them up front so that there is no lag when the attributes are required.
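For the "call them when needed" case, a sketch of a lazily computed attribute (names hypothetical; on Python 3.8+, functools.cached_property does the caching for you):
class FileGroup(object):
    def __init__(self, folder):
        self.folder = folder    # cheap work happens eagerly
        self._version = None    # expensive work is deferred

    @property
    def version(self):
        # computed on first access, then cached
        if self._version is None:
            self._version = self._detect_version()
        return self._version

    def _detect_version(self):
        return "1.0"  # placeholder for the expensive file-reading logic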
It's not clear from your question whether you actually need a class, though. I have no experience with Java, but I understand that everything in it is a class. In Python it is perfectly acceptable to just have a function if that's all that's required, and to create classes only when you need instances and other classy things.
The __init__ method is called when the object is instantiated.
Coming from a C++ background, I believe it's not good to do actual work beyond initialization in the constructor.

What's a good way to keep track of class instance variables in Python?

I'm a C++ programmer just starting to learn Python. I'd like to know how you keep track of instance variables in large Python classes. I'm used to having a .h file that gives me a neat list (complete with comments) of all the class' members. But since Python allows you to add new instance variables on the fly, how do you keep track of them all?
I'm picturing a scenario where I mistakenly add a new instance variable when I already had one - but it was 1000 lines away from where I was working. Are there standard practices for avoiding this?
Edit: It appears I created some confusion with the term "member variable." I really mean instance variable, and I've edited my question accordingly.
I would say, the standard practice to avoid this is to not write classes where you can be 1000 lines away from anything!
Seriously, that's way too much for just about any useful class, especially in a language as expressive as Python. Using more of what the standard library offers and abstracting code away into separate modules should help keep your LOC count down.
The largest classes in the standard library are well below that size!
First of all: class attributes, or instance attributes? Or both? =)
Usually you just add instance attributes in __init__ and class attributes in the class definition, often before the method definitions... which should probably cover 90% of use cases.
If code adds attributes on the fly, it probably (hopefully :-) has good reasons for doing so... leveraging dynamic features, introspection, etc. Other than that, adding attributes this way is probably less common than you think.
pylint can statically detect attributes that aren't defined in __init__, along with many other potential bugs.
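For instance, a sketch of the kind of slip this catches (pylint reports attributes defined outside __init__ as W0201):
class Point(object):
    def __init__(self):
        self.x = 0

    def move(self):
        self.xx = 1  # likely a typo for self.x; pylint flags this
                     # attribute as defined outside __init__ (W0201)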
I'd also recommend writing unit tests and running your code often to detect these types of "whoopsie" programming mistakes.
Instance variables should, in general, be initialized in the class's __init__() method.
If that's not possible, you can use __dict__ to get a dictionary of all instance variables of an object at runtime. If you really need to track this in documentation, add a list of the instance variables you are using to the class's docstring.
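For example, a tiny sketch of the __dict__ inspection just described:
class Foo(object):
    def __init__(self):
        self.a = 1
        self.b = 2

print(vars(Foo()))  # {'a': 1, 'b': 2}, the same dict as Foo().__dict__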
It sounds like you're talking about instance variables and not class variables. Note that in the following code a is a class variable and b is an instance variable.
class foo:
    a = 0  # class variable

    def __init__(self):
        self.b = 0  # instance variable
Regarding the hypothetical where you create an unneeded instance variable because the other one was about a thousand lines away: the best solution is to not have classes that are a thousand lines long. If you can't avoid the length, then your class should have a well-defined purpose, which will enable you to keep all of its complexities in your head at once.
A documentation generation system such as Epydoc can be used as a reference for what instance/class variables an object has, and if you're worried about accidentally creating new variables via typos you can use PyChecker to check your code for this.
This is a common concern I hear from many programmers who come from a C, C++, or other statically typed language where variables are pre-declared. In fact it was one of the biggest concerns we heard when we were persuading programmers at our organization to abandon C for high-level programs and use Python instead.
In theory, yes you can add instance variables to an object at any time. Yes it can happen from typos, etc. In practice, it rarely results in a bug. When it does, the bugs are generally not hard to find.
As long as your classes are not bloated (1000 lines is pretty huge!) and you have ample unit tests, you should rarely run into a real problem. In case you do, it's easy to drop into a Python console at almost any time and inspect things as much as you wish.
It seems to me that the main issue here is that you're thinking in terms of C++ while working in Python.
Having a 1000-line class is not very wise in Python anyway (I know it happens a lot in C++, though).
Learn to exploit the dynamism that Python gives you; for instance, you can combine lists and dictionaries in very creative ways and save yourself hundreds of useless lines of code.
For example, if you're mapping strings to functions (for dispatching), you can exploit the fact that functions are first-class objects and have a dictionary like:
d = {'command1': func1, 'command2': func2, 'command3': func3}
# then somewhere else, use this dict to dispatch:
# given a command string cmd
func = d[cmd]
func()  # call the function!
Something like this in C++ would take up sooo many lines of code!
The easiest way is to use an IDE. PyDev is a plugin for Eclipse.
I'm not a full-on expert in all ways Pythonic, but in general I define my class members right under the class definition in Python, so if I add members, they're all in one place.
My personal opinion is that class members should be declared in one section, for this specific reason.
Locally scoped variables, on the other hand, should be defined closest to where they are used (except in C, which I believe still requires variables to be declared at the beginning of a function).
Consider using slots.
For example:
class Foo(object):
    __slots__ = "a b c".split()

x = Foo()
x.a = 1   # ok
x.b = 1   # ok
x.c = 1   # ok
x.bb = 1  # raises AttributeError: 'Foo' object has no attribute 'bb'
It is a general concern in any dynamic programming language, i.e. any language that does not require variable declaration, that a typo in a variable name creates a new variable instead of raising an exception or causing a compile-time error. Slots help with instance variables, but don't help with module-scope variables, globals, local variables, etc. There's no silver bullet for this; it's part of the trade-off of not having to declare variables.
