Cyclomatic complexity metric practices for Python [closed] - python

I have a relatively large Python project that I work on, and we don't have any cyclomatic complexity tools as a part of our automated test and deployment process.
How important are cyclomatic complexity tools in Python? Do you or your project use them and find them effective? I'd like a nice before/after story if anyone has one so we can take a bit of the subjectiveness out of the answers (i.e. before we didn't have a cyclo-comp tool either, and after we introduced it, good thing A happened, bad thing B happened, etc). There are a lot of other general answers to this type of question, but I didn't find one for Python projects in particular.
I'm ultimately trying to decide whether or not it's worth it for me to add it to our processes, and what particular metric and tool/library is best for large Python projects. One of our major goals is long term maintenance.

We used Radon in one of our projects, which deals with test automation.
As features and requirements changed, we had to add, modify, update, and delete code in that project, with 4-5 people working on it. So, as part of the review process, we adopted Radon, since we wanted our code to stay maintainable and readable.
Based on the Radon output, we refactored our code several times: we added more methods and restructured loops.
Please let me know if this is useful to you.

Python isn't special when it comes to cyclomatic complexity. CC measures how much branching logic is in a chunk of code.
Experience shows that when the branching is "high", that code is harder to understand and change reliably than code in which the branching is lower.
With metrics, it typically isn't absolute values that matter; it is relative values as experienced by your organization. What you should do is to measure various metrics (CC is one) and look for a knee in the curve that relates that metric to bugs-found-in-code. Once you know where the knee is, ask coders to write modules whose complexity is below the knee. This is the connection to long-term maintenance.
What you don't measure, you can't control.
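To make "how much branching logic is in a chunk of code" concrete, here is a minimal sketch that approximates cyclomatic complexity with the standard-library ast module. The function name approx_complexity and the choice of node types are my own assumptions; real tools such as radon or mccabe apply more careful rules, so treat the numbers as rough. Nested functions are also counted towards their parent here.

```python
import ast

# Node types treated as decision points in this rough approximation.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.Assert, ast.comprehension)


def approx_complexity(source: str) -> dict:
    """Return {function_name: approximate cyclomatic complexity}."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1  # a function without branches has exactly one path
            for child in ast.walk(node):
                if isinstance(child, _BRANCH_NODES):
                    score += 1
                elif isinstance(child, ast.BoolOp):
                    # "a and b and c" adds two extra decision points
                    score += len(child.values) - 1
            result[node.name] = score
    return result


sample = '''
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "prime-ish"
'''

print(approx_complexity(sample))  # {'classify': 6}
```

Logging such a number per function next to your bug tracker data is one way to look for the "knee" described above.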

wemake-python-styleguide supports both radon and mccabe implementations of Cyclomatic Complexity.
There are also different complexity metrics that are not covered by just Cyclomatic Complexity, including:
Number of function decorators; lower is better
Number of arguments; lower is better
Number of annotations; higher is better
Number of local variables; lower is better
Number of returns, yields, awaits; lower is better
Number of statements and expressions; lower is better
Read more about why it is important to obey them: https://sobolevn.me/2019/10/complexity-waterfall
They are all covered by wemake-python-styleguide.
Repo: https://github.com/wemake-services/wemake-python-styleguide
Docs: https://wemake-python-stylegui.de

You can also use the mccabe library. It measures only McCabe complexity and can be integrated into your flake8 linter.


Is there a good option for creation of custom branching rules in branch-and-bound for MILPs in Python?

Basically, I want to recreate the conceptual results from the paper "Learning to Branch in Mixed Integer Programming" by Khalil et al., while avoiding, if possible:
1) The necessity of obtaining an academic license for CPLEX (which was used in the paper) or a similar serious commercial solver
2) The necessity of using a C-based API. This is not a strict requirement, but Python has the benefit of having good and very accessible ML libraries, which seems like a great advantage for this specific goal
I am aware that there are a great number of open-source Python-based MILP solvers, but a lot of them focus, in their presentation, on the end-to-end solution of relatively simple problems. If we also consider that a lot of them (if not all) hook up to other C-based solvers, it is far from obvious which ones actually have the needed customization potential.
So, if anyone has more in-depth experience with trying to customize Python solvers for their highly specific needs, I would appreciate the advice.
I'm afraid you will hit a roadblock at some point there. It's really hard to do that without doing C/C++ work (imho).
Python-way
I only know three projects with some low-level functionality (and it's still hard to say if those fit your needs).
https://github.com/coin-or/python-mip
relatively new
promises interactive cut-gen
has a chapter Developing Customized Branch-&-Cut algorithms
but i'm not sure if there is enough freedom for your task (seems to focus on cuts for now)
built around open-source solver Cbc/Clp (besides Gurobi)
https://github.com/coin-or/CyLP
not much development for years now
the whole python-3 dev was sad (see issues; pull-request not processed for years; it's a resource problem: the maintainers are nice people!)
was designed to research pivoting
but it also says: For example, you may
.. define cut generators, branch-and-bound strategies
hard to see how to achieve what you look for except for abstract LP-relax - fix - resolve
might be hard to control specifics (warm-start vs. hot-start)
built around open-source Cbc/Clp
https://github.com/SCIP-Interfaces/PySCIPOpt
basic docs show more high-level usage
but its internal code at least has entries for branchexeclp and co.
maybe it's ready to use (maybe not)
raw list of interface classes
as those things (maybe) wrap the original C-API, there is a lot of good documentation in the parent-project!
built around open-source solver SCIP
easier to grab the solver in academic setting, but by no means free (i'm not a lawyer and won't try to find the right words)
at least one developer of it is active on StackOverflow
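To show what a pluggable branching rule looks like inside the abstract "LP-relax - fix - resolve" loop, here is a toy sketch in pure Python. It does not use python-mip, CyLP or PySCIPOpt at all: the problem is a 0/1 knapsack whose LP relaxation can be solved greedily, and all names (solve_relaxation, most_fractional, branch_and_bound) are made up for illustration. The rule is assumed to return the index of an unfixed variable.

```python
from typing import Callable, Dict, List

Fixing = Dict[int, int]  # variable index -> value it is fixed to (0 or 1)
EPS = 1e-9


def solve_relaxation(values, weights, capacity, fixed: Fixing):
    """Greedy LP relaxation of a 0/1 knapsack (weights must be positive).

    Returns (bound, x), or None if the fixings already exceed the capacity.
    """
    n = len(values)
    x = [0.0] * n
    room = float(capacity)
    for i, v in fixed.items():
        x[i] = float(v)
        room -= weights[i] * v
    if room < -EPS:
        return None
    free = sorted((i for i in range(n) if i not in fixed),
                  key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        if room <= EPS:
            break
        x[i] = min(1.0, room / weights[i])  # last item may be fractional
        room -= x[i] * weights[i]
    return sum(values[i] * x[i] for i in range(n)), x


def most_fractional(x: List[float], fixed: Fixing) -> int:
    """Default rule: pick the unfixed variable closest to 0.5."""
    free = [i for i in range(len(x)) if i not in fixed]
    return min(free, key=lambda i: abs(x[i] - 0.5))


def branch_and_bound(values, weights, capacity,
                     rule: Callable[[List[float], Fixing], int] = most_fractional):
    best_value, best_x = float("-inf"), None
    stack: List[Fixing] = [{}]
    while stack:
        fixed = stack.pop()
        solved = solve_relaxation(values, weights, capacity, fixed)
        if solved is None:
            continue  # infeasible node
        bound, x = solved
        if bound <= best_value + EPS:
            continue  # prune: cannot beat the incumbent
        if all(abs(xi - round(xi)) < EPS for xi in x):
            best_value, best_x = bound, [int(round(xi)) for xi in x]
            continue  # integral solution becomes the new incumbent
        var = rule(x, fixed)  # the custom branching decision
        stack.append({**fixed, var: 0})
        stack.append({**fixed, var: 1})
    return best_value, best_x


values, weights, capacity = [60, 100, 120], [10, 20, 30], 50
print(branch_and_bound(values, weights, capacity))  # (220.0, [0, 1, 1])

# A custom rule: always branch on the heaviest unfixed item.
heaviest = lambda x, fixed: max(
    (i for i in range(len(x)) if i not in fixed), key=lambda i: weights[i])
print(branch_and_bound(values, weights, capacity, rule=heaviest))
```

A learned rule would replace most_fractional with a function that scores candidate variables from features of x and the node. Whether a given solver wrapper exposes an equivalent hook is exactly the thing to check in its docs.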
Alternative: C++
If you want full control (which might be needed) with minimal need to understand the underlying solver in all its details, you probably want to use C/C++ within Coin OSI. Sadly the Cbc part is unmaintained, but depending on your exact task, you might only need Clp, for example.
Alternative: Julia
I did not follow the recent developments there, but the language did have a strong early focus on Mathematical Optimization (driven by a big group of people), surpassing python even in its early days (imho!).
But i'm not sure if MathOptInterface is fine-grained enough for your task.

What is the difference between Property Based Testing and Mutation testing?

My context for this question is in Python.
Hypothesis Testing Library (i.e. Property Based Testing):
https://hypothesis.readthedocs.io/en/latest/
Mutation Testing Library:
https://github.com/sixty-north/cosmic-ray
These are very different beasts but both would improve the value and quality of your tests. Both tools contribute to and make the "My code coverage is N%" statement more meaningful.
Hypothesis would help you to generate all sorts of test inputs in the defined scope for a function under test.
Usually, when you need to test a function, you provide multiple example values, trying to cover all the use cases and edge cases, driven by the code coverage reports. This is so-called "example-based testing". Hypothesis, on the other hand, implements property-based testing: it generates a whole bunch of different inputs and input combinations, helping to catch common errors such as division by zero, None, 0, off-by-one errors, etc., and helping to find hidden bugs.
Mutation testing is all about changing your code under test on the fly while executing your tests against a modified version of your code.
This really helps to see if your tests are actually testing what they are supposed to be testing, and to understand the value of your tests. Mutation testing really shines if you already have a rich test code base and good code coverage.
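To make the contrast concrete, here is a small standard-library sketch that uses neither Hypothesis nor Cosmic Ray. The function clamp, the hand-written mutant, and the test helpers are all hypothetical. Real tools generate inputs and mutants automatically and far more cleverly.

```python
import random


def clamp(value, low, high):
    """Function under test: restrict value to the range [low, high]."""
    return max(low, min(value, high))


def clamp_mutant(value, low, high):
    """A mutant, as a mutation tester might produce: the min() call is dropped."""
    return max(low, value)


def example_test(fn):
    """Example-based test: one hand-picked case."""
    return fn(5, 0, 10) == 5


def property_test(fn, runs=200, seed=0):
    """Property-style test: the result always lies within [low, high]
    and in-range values come back unchanged."""
    rng = random.Random(seed)
    for _ in range(runs):
        low = rng.randint(-100, 100)
        high = low + rng.randint(0, 100)
        # boundary values plus one random value per run
        for value in (low - 1, low, high, high + 1, rng.randint(-300, 300)):
            result = fn(value, low, high)
            if not (low <= result <= high):
                return False
            if low <= value <= high and result != value:
                return False
    return True


# The weak example test passes for both versions: the mutant survives,
# which tells you the test suite is not checking the upper bound.
print(example_test(clamp), example_test(clamp_mutant))

# The property test passes for the original and kills the mutant.
print(property_test(clamp), property_test(clamp_mutant))
```

Mutation testing is the part that swaps clamp for clamp_mutant and reports the survivor. Property-based testing is one way to write a test strong enough to kill it.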
What helped me to get ahold of these concepts were these Python Podcasts:
Property-based Testing with Hypothesis
Validating Python tests with mutation testing
Hypothesis with David MacIver
I'm the author of mutmut, the (imo) best mutation tester for python. @alecxe has a very good answer but I would like to expand on it. Read his answer before mine for basic context.
There are some other big differences. For example, PBT requires mental work to specify the rules for each function under test, while MT requires you to justify all behavior in the code, which requires much less cognitive effort.
MT is effectively white box and PBT black box.
Another difference is that MT is the exploration of a (fairly small) finite space, while PBT is an exploration of an infinite space (practically speaking). A practical consequence is that you can trivially know when you are done with MT, while you can have a PBT run going for years and still not know if it has searched the relevant parts of the space. Better rules for PBT radically cut the run time for this reason.
Mutation testing also forces minimal code. This is a surprising effect, but it's something I have experienced again and again. This is a nice little bonus for MT.
You can also use MT as a simple checklist to get to 100% mutation coverage. You don't need to start with 100% coverage, not at all. But with PBT you can start way below 100% coverage, in essence at 0% before you start.
I hope this clarifies the situation a bit more.

Python - side effects/purity analysis tools? [closed]

Are there any existing tools for side effects/purity analysis in Python, similar to http://jppa.sourceforge.net in Java?
I don't know of any that exist, but here are some general approaches to making one:
Analysing source files as text - using regular expressions to find things that show a function definitely isn't pure - e.g. the global keyword. For practical purposes, most decently written functions that only have a return statement in the body are likely to be pure. On the other hand, if a function doesn't have a return statement, it is either useless, or impure.
Analysing functions in a source file as code. If testing a function in isolation produces a NameError, you know that it is either impure (because it doesn't have access to variables at a higher level), or has a mistake in it (referring to a variable before it is defined or some such), however the latter case should be covered by normal testing. The inspect module's function isfunction may be useful if you want to do this.
For each function you test, if it has a relatively small domain (e.g. one input that can either be 1, 2, 3 or 4) then you could exhaustively test all possible inputs, and get a certain answer this way. If it has a limited, or finite but large domain (e.g. all the real numbers between 0 and 1000 (infinite but limited), or all the integers between -12345 and 67890) then you could try sampling a selection of inputs in that domain, and use that to get a probability of purity. However, this approach may not be very useful, as the domain of the function is unlikely to be specified, so you may only be able to check it if you wrote the function, in which case you may not need to analyse it anyway.
Doing something clever, possibly in combination with the above techniques. For instance, making a neural network, with the input as the text of a function, and the output as the likelihood of it being pure. You could then train the network on examples of functions you know to be pure or impure, and then use it on functions of unknown purity.
Edit:
I came back to this question, after someone downvoted it, with new knowledge! The ast module should make it relatively easy to write your own analysis tool like this, as it gives you access to the abstract syntax tree of the code. It should be fairly easy to walk through this tree and see if there is anything preventing purity. This is a much better approach than analysing source files as text, and I might have a go at it at some point.
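As a rough illustration of the ast idea, here is a minimal sketch that walks each function and flags a few things that rule out purity. The function name purity_warnings and the list of "impure" built-ins are my own assumptions and far from exhaustive. An empty list means "nothing suspicious found", not "proven pure".

```python
import ast

# Illustrative only: a few built-ins with obvious side effects.
IMPURE_CALLS = {"print", "input", "open", "exec", "eval"}


def purity_warnings(source: str) -> dict:
    """Map each function name to a list of reasons it may be impure."""
    report = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        reasons = []
        for node in ast.walk(func):
            if isinstance(node, (ast.Global, ast.Nonlocal)):
                reasons.append(
                    f"declares {', '.join(node.names)} as global/nonlocal")
            elif (isinstance(node, ast.Call)
                  and isinstance(node.func, ast.Name)
                  and node.func.id in IMPURE_CALLS):
                reasons.append(f"calls {node.func.id}()")
            elif isinstance(node, (ast.Assign, ast.AugAssign)):
                targets = (node.targets if isinstance(node, ast.Assign)
                           else [node.target])
                # writing to obj.attr or obj[key] may mutate shared state
                if any(isinstance(t, (ast.Attribute, ast.Subscript))
                       for t in targets):
                    reasons.append("assigns to an attribute or item")
        report[func.name] = reasons
    return report


code = '''
def add(a, b):
    return a + b

def log_and_add(a, b):
    global counter
    counter += 1
    print(a, b)
    return a + b
'''

for name, reasons in purity_warnings(code).items():
    print(name, reasons)
```

Method calls such as some_list.append(x) and calls into other impure functions are not detected here. A serious tool would need to follow calls and track aliasing.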
Finally, this question might also be useful, and also this one, which is basically a duplicate of this question.

Graph library API [closed]

I am creating a library to support some standard graph traversals. Some of the graphs are defined explicitly: i.e., all edges are added by providing a data structure, or by repeatedly calling a relevant method. Some graphs are only defined implicitly: i.e., I only can provide a function that, given a node, will return its children (in particular, all the infinite graphs I traverse must be defined implicitly, of course).
The traversal generator needs to be highly customizable. For example, I should be able to specify whether I want DFS post-order/pre-order/in-order, BFS, etc.; in which order the children should be visited (if I provide a key that sorts them); whether the set of visited nodes should be maintained; whether the back-pointer (pointer to parent) should be yielded along with the node; etc.
I am struggling with the API design for this library (the implementation is not complicated at all, once the API is clear). I want it to be elegant, logical, and concise. Is there any graph library that meets these criteria that I can use as a template (doesn't have to be in Python)?
Of course, if there's a Python library that already does all of this, I'd like to know, so I can avoid coding my own.
(I'm using Python 3.)
if you need to handle infinite graphs then you are going to need some kind of functional interface to graphs (as you say in the q). so i would make that the standard representation and provide helper functions that take other representations and generate a functional representation.
for the results, maybe you can yield (you imply a generator and i think that is a good idea) a series of result objects, each of which represents a node. if the user wants more info, like backlinks, they call a method on that, and the extra information is provided (calculated lazily, where possible, so that you avoid that cost for people that don't need it).
you don't mention if the graph is directed or not. obviously you can treat all graphs as directed and return both directions. but then the implementation is not as efficient. typically (eg jgrapht) libraries have different interfaces for different kinds of graph.
(i suspect you're going to have to iterate a lot on this, before you get a good balance between elegant api and efficiency)
finally, are you aware of the functional graph library? i am not sure how it will help, but i remember thinking (years ago!) that the api there was a nice one.
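a minimal sketch of the suggestions above, assuming hashable nodes: the graph is always a "children" function, a helper converts an adjacency dict into that form, and the generator yields result objects whose back-pointer path is only computed when asked for. all names here (Visit, from_adjacency, traverse) are hypothetical, and only bfs and pre-order dfs are covered.

```python
from collections import deque
from itertools import islice
from typing import Callable, Iterable, Iterator, Optional


class Visit:
    """Result object yielded per node; extra info is derived on demand."""

    def __init__(self, node, parent: Optional["Visit"]):
        self.node = node
        self.parent = parent

    def path(self) -> list:
        """Follow the back-pointers to rebuild the path from the start node."""
        steps, cur = [], self
        while cur is not None:
            steps.append(cur.node)
            cur = cur.parent
        return steps[::-1]


def from_adjacency(adj: dict) -> Callable[[object], Iterable]:
    """Helper: turn an explicit adjacency dict into the functional form."""
    return lambda node: adj.get(node, ())


def traverse(start, children, order="bfs", key=None,
             track_visited=True) -> Iterator[Visit]:
    """Lazily yield Visit objects in BFS or pre-order DFS order."""
    if order not in ("bfs", "dfs"):
        raise ValueError("order must be 'bfs' or 'dfs'")
    frontier = deque([Visit(start, None)])
    seen = set() if track_visited else None
    while frontier:
        visit = frontier.popleft() if order == "bfs" else frontier.pop()
        if seen is not None:
            if visit.node in seen:
                continue
            seen.add(visit.node)
        yield visit
        kids = list(children(visit.node))
        if key is not None:
            kids.sort(key=key)
        if order == "dfs":
            kids.reverse()  # so that the first child is popped first
        frontier.extend(Visit(kid, visit) for kid in kids)


# explicit graph
graph = from_adjacency({"a": ["b", "c", "d"], "b": ["d"]})
print([v.node for v in traverse("a", graph, order="dfs")])  # ['a', 'b', 'd', 'c']

# implicit, infinite graph: only ever consume a finite prefix
binary_tree = lambda n: (2 * n, 2 * n + 1)
print([v.node for v in islice(traverse(1, binary_tree), 5)])  # [1, 2, 3, 4, 5]
```

marking nodes as visited when they are popped keeps the dfs order correct on graphs with shared children, at the cost of a larger frontier.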
The traversal algorithm and the graph data structure implementation should be separate, and should talk to each other only through the standard API. (If they are coupled, each traversal algorithm would have to be rewritten for every implementation.)
So my question really has two parts:
How to design the API for the graph data structure (used by graph algorithms such as traversals and by client code that creates/accesses graphs)
How to design the API for the graph traversal algorithms (used by client code that needs to traverse a graph)
I believe C++ Boost Graph Library answers both parts of my question very well. I would expect it can be (theoretically) rewritten in Python, although there may be some obstacles I don't see until I try.
Incidentally, I found a website that deals with question 1 in the context of Python: http://wiki.python.org/moin/PythonGraphApi. Unfortunately, it hasn't been updated since Aug 2011.

Good geometry library in python? [closed]

I am looking for a good and well developed library for geometrical manipulations and evaluations in python, like:
evaluate the intersection between two lines in 2D and 3D (if present)
evaluate the point of intersection between a plane and a line, or the line of intersection between two planes
evaluate the minimum distance between a line and a point
find the orthonormal to a plane passing through a point
rotate, translate, mirror a set of points
find the dihedral angle defined by four points
I have a compendium book for all these operations, and I could implement it but unfortunately I have no time, so I would enjoy a library that does it. Most operations are useful for gaming purposes, so I am sure that some of these functionalities can be found in gaming libraries, but I would prefer not to include functionalities (such as graphics) I don't need.
Any suggestions? Thanks.
Perhaps take a look at SymPy.
Shapely is a nice python wrapper around the popular GEOS library.
I found pyeuclid to be a great simple general purpose euclidean math package. Though the library may not contain exactly the problems that you mentioned, its infrastructure is good enough to make it easy to write these on your own.
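To give a feel for how little code some of the requested operations need once basic vector helpers exist, here is a pure-Python sketch of three of them. It does not use pyeuclid's API. Points and vectors are plain 3-tuples, and every function name is my own. Degenerate inputs (zero-length directions, collinear points) are not handled.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def point_line_distance(p, a, d):
    """Minimum distance from point p to the line through a with direction d."""
    return norm(cross(sub(p, a), d)) / norm(d)


def line_plane_intersection(a, d, p0, n, eps=1e-12):
    """Intersection of the line a + t*d with the plane through p0 with
    normal n. Returns None if the line is parallel to the plane."""
    denom = dot(n, d)
    if abs(denom) < eps:
        return None
    t = dot(n, sub(p0, a)) / denom
    return tuple(ai + t * di for ai, di in zip(a, d))


def dihedral_angle(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0, b1, b2 = sub(p0, p1), sub(p2, p1), sub(p3, p2)
    length = norm(b1)
    b1 = tuple(x / length for x in b1)
    # components of b0 and b2 perpendicular to the central bond b1
    v = sub(b0, tuple(dot(b0, b1) * x for x in b1))
    w = sub(b2, tuple(dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(dot(cross(b1, v), w), dot(v, w)))


print(point_line_distance((0, 3, 4), (0, 0, 0), (1, 0, 0)))
print(line_plane_intersection((0, 0, -1), (0, 0, 1), (0, 0, 2), (0, 0, 1)))
print(dihedral_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

The sign convention of the dihedral angle varies between fields, so check it against a known case before relying on it.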
CGAL has Python bindings too.
I really want a good answer to this question, and the ones above left me dissatisfied. However, I just came across pythonocc which looks great, apart from lacking good docs and still having some trouble with installation (not yet pypi compatible). The last update was 4 days ago (June 19th, 2011). It wraps OpenCascade which has a ton of geometry and modeling functionality. From the pythonocc website:
pythonOCC is a 3D CAD/CAE/PLM development framework for the Python programming language. It provides features such as advanced topological and geometrical operations, data exchange (STEP, IGES, STL import/export), 2D and 3D meshing, rigid body simulation, parametric modeling.
[EDIT: I've now downloaded pythonocc and begun working through some of the examples]
I believe it can perform all of the tasks mentioned, but I found it to be unintuitive to use. It is created almost entirely from SWIG wrappers, and as a result, introspection of the commands becomes difficult.
geometry-simple has the classes Point, Line, Plane, and Movement in ~300 lines, using only numpy; take a look.
You may be interested in Python module SpaceFuncs from OpenOpt project, http://openopt.org
SpaceFuncs is a tool for 2D, 3D, N-dimensional geometric modeling with possibilities of parametrized calculations, numerical optimization and solving systems of geometrical equations
Python Wild Magic is another SWIG-wrapped library. It is a gaming library, but you could manipulate the SWIG library file to exclude any undesired graphics stuff from the Python API.
