PyCharm: regex string intentions for function arguments - python

I have a function that takes a string argument that will be compiled into a regular expression, like so:
import re

class Pattern:
    def __init__(self, pattern, **kwargs) -> None:
        self.re = re.compile(pattern)
        self.extras = dict(**kwargs)

    # ... more methods ...
When I initialize a Pattern object, I'd love for PyCharm (actually, I'm using IntelliJ IDEA with the Python plugin, but I think that's essentially the same platform) to know that my pattern argument string should conform to Python's regular expression syntax, and can be syntax-highlighted, just like when using re.compile(pattern):
# This won't be recognized as a regex intention:
p1 = Pattern(r'^(?P<site>[A-Z]{4}).(?P<unit>\d{4})')
# But this will:
p2 = re.compile(r'^(?P<site>[A-Z]{4}).(?P<unit>\d{4})')
Is there any way to give the IDE a hint here, that whatever is passed as the pattern argument to Pattern should have regex syntax? Ideally in some kind of hint comment or type hint that I could put in the code so it would work for other developers too.
I do know that:
I could create a plugin for this - in my case that would probably be vast overkill.
I can do "Inject language or reference" on an ad-hoc basis for any particular call to Pattern(...) - but that's temporary, it's tedious when there are a lot of such calls, and it doesn't help other developers read the code more easily.
Edit:
This question is the direct analog of "Can one automatically get Intellij's regex assistance for one's own regex parameters", but for Python instead of Java. The IntelliLang solution in that answer seems to be only applicable to editing Java or XML files.

I found one way, but it does not work as smoothly as I expected.
File - Settings - Editor - Language Injections - Add:
Generic Python
ID: RegExp
Places Patterns:
+ pyLiteralExpression().and(pyMethodArgument("Pattern", 0, "foo.bar.Pattern"))
Now RegExp will be injected into arguments to Pattern.
The next step is to share this injection. It was created with IDE-level scope, so it is stored in the IDE settings, which cannot be shared.
To put the injection into the project settings, you have to move it to Project-level scope (the corresponding button is below the button that adds an injection), but then it stops working :( The only way left is to export and import the injection, which is not convenient :(
Feel free to create an issue here.
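One alternative that lives in the source itself is an injection comment. The following is a minimal sketch, assuming the IDE honours a `# language=RegExp` comment placed on the line directly above a string literal. JetBrains documents this comment form for language injection, but whether it applies to your IDE version and to this usage is an assumption worth verifying. The `Pattern` class is a stand-in for the one in the question, and `SITE_UNIT` is a hypothetical name:

```python
import re


class Pattern:
    """Stand-in for the question's class: compiles its first argument."""

    def __init__(self, pattern, **kwargs) -> None:
        self.re = re.compile(pattern)
        self.extras = dict(**kwargs)


# Assumption: the IDE treats the literal below the next comment as RegExp.
# language=RegExp
SITE_UNIT = r'^(?P<site>[A-Z]{4}).(?P<unit>\d{4})'

p1 = Pattern(SITE_UNIT, source="example")
match = p1.re.match("ABCD-1234")
print(match.group("site"), match.group("unit"))
```

Because the hint is an ordinary comment, it travels with the code, though it still has to be repeated for each pattern literal.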

Keeping alias types simple in Python documentation?

I'm trying to use the typing module to document my Python package, and I have a number of situations where several different types are allowable for a function parameter. For instance, you can either pass a number, an Envelope object (one of the classes in my package), or a list of numbers from which an Envelope is constructed, or a list of lists of numbers from which an envelope is constructed. So I make an alias type as follows:
NumberOrEnvelope = Union[Sequence[Real], Sequence[Sequence[Real]], Real, Envelope]
Then I write the function:
def example_function(parameter: NumberOrEnvelope):
    ...
And that looks great to me. However, when I create the documentation using Sphinx, I end up with this horrifically unreadable function signature:
example_function(parameter: Union[Sequence[numbers.Real], Sequence[Sequence[numbers.Real]], numbers.Real, expenvelope.envelope.Envelope])
Same thing also with the hints that pop up when I start to try to use the function in PyCharm.
Is there some way I can have it just leave it as "NumberOrEnvelope". Ideally that would also link in the documentation to a clarification of what "NumberOrEnvelope" is, though even if it didn't it would be way better than what's appearing now.
I had the same issue and used https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#confval-autodoc_type_aliases, introduced in version 3.3.
In your Sphinx conf.py, insert this section. It does not seem to make much sense at first sight, but it does the trick:
autodoc_type_aliases = dict(NumberOrEnvelope='NumberOrEnvelope')
Warning: It only works in modules that start with from __future__ import annotations
Note: If there is a target in the documentation, type references even have a hyperlink to the definition. I have classes, documented elsewhere with autoclass, which are used as types of function parameters, and the docs show the nice names of the types with links.
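To see why the `__future__` import matters, here is a small sketch. With postponed evaluation, the annotation is stored as the alias name rather than the expanded Union, which gives autodoc something to match against. The `Envelope` class below is a hypothetical stand-in for the real one:

```python
from __future__ import annotations  # keeps annotations as unevaluated strings

from numbers import Real
from typing import Sequence, Union


class Envelope:
    """Hypothetical stand-in for the package's Envelope class."""


NumberOrEnvelope = Union[Sequence[Real], Sequence[Sequence[Real]], Real, Envelope]


def example_function(parameter: NumberOrEnvelope) -> None:
    """Accept a number, an Envelope, or (nested) sequences of numbers."""


# The annotation is stored as the alias name, not the expanded Union.
print(example_function.__annotations__["parameter"])
```

This prints `NumberOrEnvelope`, which is the key that the `autodoc_type_aliases` entry in conf.py refers to.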
Support for this appears to be in the works.
See Issue #6518.
That issue can be closed by the recent updates to Pull Request #8007 (under review).
If you want the fix ASAP, you can perhaps try using that build.
EDIT: This doesn't quite work, sadly.
Turns out after a little more searching, I found what I was looking for. Instead of:
NumberOrEnvelope = Union[Sequence[Real], Sequence[Sequence[Real]], Real, Envelope]
I found that you can create your own compound type that does the same thing:
NumberOrEnvelope = TypeVar("NumberOrEnvelope", Sequence[Real], Sequence[Sequence[Real]], Real, Envelope)
This displays in documentation as "NumberOrEnvelope", just as I wanted.
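A self-contained sketch of this workaround, with a hypothetical stand-in for `Envelope`. Note that a constrained TypeVar is not the same thing as a Union for a type checker (it is meant for generic functions), so this trades some typing accuracy for a readable name:

```python
from numbers import Real
from typing import Sequence, TypeVar


class Envelope:
    """Hypothetical stand-in for the package's Envelope class."""


# A TypeVar carries its own name, so tools can display it unexpanded.
NumberOrEnvelope = TypeVar(
    "NumberOrEnvelope", Sequence[Real], Sequence[Sequence[Real]], Real, Envelope
)


def example_function(parameter: NumberOrEnvelope) -> None:
    """Accept a number, an Envelope, or (nested) sequences of numbers."""


print(NumberOrEnvelope.__name__)              # the name shown in signatures
print(len(NumberOrEnvelope.__constraints__))  # the four allowed types
```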

Python: tell my IDE what type an object is

When I write a function in Python (v2.7), I very often have a type in mind for one of the arguments. I'm working with the unbelievably brilliant pandas library at the moment, so my arguments are often 'intended' to be pandas.DataFrames.
In my favorite IDE (Spyder), when you type a period . a list of methods appear. Also, when you type the opening parenthesis of a method, the docstring appears in a little window.
But for these things to work, the IDE has to know what type a variable is. But of course, it never does. Am I missing something obvious about how to write Pythonic code? (I've read Python Is Not Java, but it doesn't mention this IDE autocomplete issue.)
Any thoughts?
I don't know if it works in Spyder, but many completion engines (e.g. Jedi) also support assertions to tell them what type a variable is. For example:
def foo(param):
    assert isinstance(param, str)
    # now param will be considered a str
    param.|capitalize
           center
           count
           decode
           ...
Actually, I use IntelliJ IDEA (a.k.a. PyCharm), and it offers multiple ways to specify variable types:
1. Specify Simple Variable
Very simple: just add a comment with the type information after the definition. From then on, PyCharm supports autocompletion! E.g.:
def route():
    json = request.get_json() # type: dict
Source: https://www.jetbrains.com/help/pycharm/type-hinting-in-pycharm.html
2. Specify Parameter:
Type three quotation marks right after the beginning of a method and the IDE will autocomplete a docstring skeleton in which you can specify the parameter types.
Source: https://www.jetbrains.com/help/pycharm/using-docstrings-to-specify-types.html
(Currently on my mobile, going to make it pretty later)
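As an illustration of the docstring approach, here is a minimal sketch using reStructuredText-style `:type` and `:rtype:` fields, one of the docstring formats PyCharm can read. The `greet` function and its parameters are made up for the example:

```python
def greet(name, age):
    """Build a greeting.

    :param name: person to greet
    :type name: str
    :param age: age in years
    :type age: int
    :rtype: str
    """
    return "Hello {0}, you are {1} years old".format(name, age)


print(greet("Ada", 36))
```

This form also works on Python 2.7, where annotations in the signature are not available.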
If you're using Python 3, you can use function annotations. As an example:
#typechecked
def greet(name: str, age: int) -> str:
    return "Hello {0}, you are {1} years old".format(name, age)
I don't use Spyder, but I would assume there's a way for it to read the annotations and act appropriately.
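Annotations are ordinary runtime data, which is what lets tools pick them up. A short sketch of reading them back with the standard library's `typing.get_type_hints` (how any particular IDE consumes them is not something this shows):

```python
from typing import get_type_hints


def greet(name: str, age: int) -> str:
    return "Hello {0}, you are {1} years old".format(name, age)


# Maps each annotated parameter (and "return") to its declared type.
hints = get_type_hints(greet)
print(hints)
```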
I don't know whether Spyder reads docstrings, but PyDev does:
http://pydev.org/manual_adv_type_hints.html
So you can document the expected type in the docstring, e.g. as in:
def test(arg):
    ':type arg: str'
    arg.<hit tab>
And you'll get the corresponding tab completion for strings.
Similarly you can document the return-type of your functions, so that you can get tab-completion on foo for foo = someFunction().
At the same time, docstrings make auto-generated documentation much more helpful.
The problem lies in the dynamic features of Python. I use Spyder, and I've used many other Python IDEs (PyCharm, IDLE, WingIDE, PyDev, ...), and ALL of them have the problem you describe. So when I want code completion, I just assign the variable an instance of the type I want and then type ".". For example, if you know your variable df will be a DataFrame in some piece of code, you can write df = DataFrame(), and from then on code completion should work. Just do not forget to delete (or comment out) the line df = DataFrame() when you finish editing the code.

Force str() and the two string literals to create a custom string?

I have a subclass of the 'str' object and I need to apply it to the application domain. Since the application is full of string literal and str() casts it will take a long time to change all of them to the custom string class. I know it is possible to override 'str()' but I'm not sure about string literals.
I know it is not a good idea but the application requires it. Is it possible to do it in Python? And if not, does it require me to modify 'stringlib' which is the C implementation of the 'str' object?
I'm also aware of the fact that modifying the parser to make the application run is considered to be a very bad programming practice. Can this be achieved via an extension rather than a parser modification?
You cannot modify built-in types, e.g. you cannot add new attributes at runtime [1]:
>>> setattr(str, "hello", lambda: "Hello custom str!")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'str'
I would try to parse all literal strings and replace them with the constructor of your custom string subclass. This is first and foremost explicit, which makes it easier for you and others to remember that this string is different.
Typically i18n libraries have some kind of string literal detection routines for code, which could come in handy during detection. Once these strings are gathered, it's more or less just replace.
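The detection step can also be sketched with the standard library's `ast` module. This is a minimal example for Python 3.8+ (the question targets Python 2, where the node types differ), and `find_string_literals` is a hypothetical helper name:

```python
import ast


def find_string_literals(source):
    """Return (line, column, value) for every string literal in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # String literals appear as Constant nodes holding a str value.
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            found.append((node.lineno, node.col_offset, node.value))
    return sorted(found)


sample = 'x = "a"\ny = str(1) + \'b\'\n'
for line, col, value in find_string_literals(sample):
    print(line, col, repr(value))
```

The positions tell you where to wrap each literal in the custom class. Docstrings are reported too, so a real tool would need to decide whether to skip them.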
[1] This is also one reason Python isn't as flexible as Ruby. You can't just write test libraries that monkey-patch core classes and add methods like should_equal to every object in the runtime. Here is an entertaining tech talk about these nuances and their consequences: http://blog.extracheese.org/2010/02/python-vs-ruby-a-battle-to-the-death.html
I ran into this question because I was trying to do something similar - str.format behavior from python >=2.7 working on python >= 2.4. FormattableString from stringformat (https://github.com/florentx/stringformat) seemed like the right thing, but it was only useful if I could make it work on literals.
Despite most comments, monkey-patching builtins and literals in python does appear to be possible using code I found here: https://gist.github.com/295200. This appears to be written by Armin Ronacher - certainly not by me.
In the end, I was able to do something like this in python 2.6 even though it normally works only in 2.7.
dx = get_class_dict(str)
dx['format'] = lambda x,*args,**kargs: FormattableString(x).format(*args,**kargs)
print "We've monkey patched {}, {}".format('str', 'with multiple parameters')
The example in the link above shows how to add a new method instead of replacing an existing one.
This little exercise prompted me to add the str and unicode monkey patching to the stringformat package, so now you can just do this:
# should work on python down to 2.4
import stringformat
print "[{0:{width}.{precision}s}]".format('hello world', width=8, precision=5)
This can be cloned from here: "https://github.com/igg/stringformat", until/unless it gets pulled into the "official" stringformat. Ha. Only two hyperlinks for me, so no clicky for you!

Pylint best practices

Pylint looks like a good tool for running analysis of Python code.
However, our main objective is to catch any potential bugs and not coding conventions. Enabling all Pylint checks seems to generate a lot of noise. What is the set of Pylint features you use and is effective?
You can block any warnings/errors you don't like, via:
pylint --disable=error1,error2
I've blocked the following (description from http://www.logilab.org/card/pylintfeatures):
W0511: Used when a warning note as FIXME or XXX is detected
W0142: Used * or ** magic. Used when a function or method is called using *args or **kwargs to dispatch arguments. This doesn't improve readability and should be used with care.
W0141: Used builtin function %r. Used when a black listed builtin function is used (see the bad-function option). Usual black listed functions are the ones like map, or filter, where Python offers now some cleaner alternative like list comprehension.
R0912: Too many branches (%s/%s). Used when a function or method has too many branches, making it hard to follow.
R0913: Too many arguments (%s/%s). Used when a function or method takes too many arguments.
R0914: Too many local variables (%s/%s). Used when a function or method has too many local variables.
R0903: Too few public methods (%s/%s). Used when class has too few public methods, so be sure it's really worth it.
W0212: Access to a protected member %s of a client class. Used when a protected member (i.e. class member with a name beginning with an underscore) is access outside the class or a descendant of the class where it's defined.
W0312: Found indentation with %ss instead of %ss. Used when there are some mixed tabs and spaces in a module.
C0111: Missing docstring. Used when a module, function, class or method has no docstring. Some special methods like __init__ don't necessarily require a docstring.
C0103: Invalid name "%s" (should match %s). Used when the name doesn't match the regular expression associated to its type (constant, variable, class...).
To persistently disable warnings and conventions:
Create a ~/.pylintrc file by running pylint --generate-rcfile > ~/.pylintrc
Edit ~/.pylintrc
Uncomment disable= and change that line to disable=W,C
Pyflakes should serve your purpose well.
-E will only flag what Pylint thinks is an error (i.e., no warnings, no conventions, etc.)
Using grep like:
pylint my_file.py | grep -v "^C"
Edit :
As mentioned in the question, to remove the convention messages from the pylint output, you remove the lines that start with an uppercase C.
According to the pylint documentation, the output consists of lines in the format
MESSAGE_TYPE: LINE_NUM:[OBJECT:] MESSAGE
and the message type can be:
[R]efactor for a “good practice” metric violation
[C]onvention for coding standard violation
[W]arning for stylistic problems, or minor programming issues
[E]rror for important programming issues (i.e. most probably bug)
[F]atal for errors which prevented further processing
Only the first letter is displayed, so you can play with grep to select/remove the level of message type you want.
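The same filtering can be done in Python instead of grep. This is a minimal sketch that assumes the output has already been captured as a list of lines in the format above. `filter_messages` is a hypothetical helper and the sample lines are invented:

```python
def filter_messages(lines, drop="C"):
    """Keep only lines whose message-type letter is not in drop."""
    # str.startswith accepts a tuple, so "CR" drops both C and R lines.
    return [line for line in lines if not line.startswith(tuple(drop))]


sample_output = [
    "C:  1: Missing docstring",
    "W:  3: Unused variable 'x'",
    "R:  5: Too many branches (14/12)",
    "E:  7: Undefined variable 'y'",
]

for line in filter_messages(sample_output, drop="CR"):
    print(line)
```

With `drop="CR"` only the warning and error lines remain, which matches the goal of catching likely bugs rather than style issues.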
I haven't used Pylint recently, but I would probably use one of Pylint's own options to do this.

Ruby equivalent to Python's help()?

When working in interactive Python, I tend to rely on the built-in help() function to tell me what something expects and/or returns, and print out any documentation that might help me. Is there a Ruby equivalent to this function?
I'm looking for something I could use in irb. For example, in interactive Python I could type:
>>> help(1)
which would then print
Help on int object:

class int(object)
 |  int(x[, base]) -> integer
 |
 |  Convert a string or number to an integer, if possible.  A ...
It's now late 2014, and here are two ways to get something similar* to Python's help(), as long as you have the Ruby docs installed:
From inside irb, You can call the help method with a string describing what you're looking for.
Example 1: help 'Array' for the Array class
Example 2: help 'Array#each' for the Array class each method.
From the command line, outside of irb, you can use the ri program:
Example 1: ri 'Array' for the Array class
Example 2: ri 'Array#each' for the Array class each method.
* Not quite as good as Python's, but still better than nothing
It's definitely a poor cousin to IPython's help, and one of the main features I miss after moving to Ruby, but you can also use ri from within irb. I'd recommend the wirble gem as an easy way to set this up.
Try using ri from the command line.
It takes a class name, method, or module as an argument, and gives you appropriate documentation. Many popular gems come with this form of documentation, as well, so it should typically work even beyond the scope of core Ruby modules.
There's supposed to be irb_help. But as mentioned in that post, it's broken in my Ruby install as well.
For quick shell access to ruby documentation, just type ri followed by the method you're wanting to learn more about (from your shell).
For example:
ri puts
This must be fired up in your shell, not your irb (interactive ruby environment)
If you're in your irb environment, another way is to simply type help followed by the method you want to learn more about, as follows:
help puts
However, this assumes that you have configured your Ruby environment correctly for that (help) to work properly within irb. I usually just have another shell open, and just use the ri directly for quick access when I'm in doubt about a certain method or arguments to a method.
