I am writing a python script that uses the string.replace() method a lot. I know that if I was importing a python method from a module I could change its name using from like so:
from time import sleep as x
And then, x(5) would be the same as time.sleep(5). But how can I do that to the replace() function? It isn't from any external module, and I have tried to do this:
x = replace()
But it doesn't work. It says NameError: name 'replace' is not defined.
So please tell me how to "rename" the built-in replace function.
First, you should almost never be using string.replace. As the docs say:
The following list of functions are also defined as methods of string and Unicode objects; see section String Methods for more information on those. You should consider these functions as deprecated…
Second, this is wrong for two separate reasons:
x = replace()
First, there is no builtin named replace, and there's no global in your module named replace either. If you did from string import * or from string import replace, then you wouldn't get this error—but if you were doing that, you could just do from string import replace as x, exactly as you're already doing for time.sleep.
Second, those parentheses mean that you're calling the function, and assigning its return value to x, not using the function itself as a value.
So, I think what you want is this:
x = str.replace
That's accessing the replace method on str objects, and storing it in x. An "unbound method" like this can be called by passing an instance of str (that is, any normal string) as the first argument. So:
x(my_string, ' ', '_')
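As a minimal sketch of that alias in action (the sample string is made up for illustration), the string instance goes in as the first argument:

```python
# Alias the method through the class; no parentheses, so nothing is called yet.
x = str.replace

my_string = "hello world again"

# The string instance comes first, followed by the usual replace() arguments.
result = x(my_string, " ", "_")
print(result)  # hello_world_again
```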
If you want to add the name x as a method on the str class itself, what you want is called "monkeypatching", and normally, it's very simple:
str.x = str.replace
Unfortunately, it doesn't work with most of the built-in types, at least in CPython; you'll get an error like this:
TypeError: can't set attributes of built-in/extension type 'str'
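If you want to see this for yourself, here is a small sketch, assuming CPython. The exact wording of the message differs between Python versions, so the code only checks that a TypeError is raised; the names patched and message are just for illustration.

```python
# Attempt the monkeypatch and record whether it was accepted.
try:
    str.x = str.replace
except TypeError as exc:
    patched = False
    message = exc.args[0]  # wording varies between Python versions
else:
    patched = True
    message = ""

print(patched)
print(message)
```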
You could of course create your own subclass of str, and use that all over the place instead of str… but that won't help with string literals, strings you get back from other functions, etc., unless you wrap them explicitly. And I'm not sure it's worth the effort. But if you want to:
class s(str):
    x = str.replace
Now you can do this:
z = s(function_returning_a_string())
z = z.x(' ', '_')
But notice that at the end, z is back to being a str rather than an s, so if you want to keep using x, you have to do this:
z = s(z.x(' ', '_'))
… and at a certain point, even if you're saving a few keystrokes, you're not saving nearly enough for the cost in readability and idiomaticitalnessity.
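To make the "back to being a str" point concrete, here is a short sketch of the subclass approach with a made-up input string:

```python
class s(str):
    # Short alias for str.replace, available on instances of s only.
    x = str.replace


z = s("a b c")
out = z.x(" ", "_")

print(out)               # a_b_c
print(type(out) is str)  # True: the result is a plain str, not an s
print(hasattr(out, "x")) # False: the alias is gone until you wrap again
```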
You can sort of do what you want:
a = 'asdf'
a_replace = a.replace
Now a_replace is a bound method object that does a.replace(whatever) when you call a_replace(whatever). You can also do
my_replace = str.replace
my_replace('asdf', 'a', 'f') # calls 'asdf'.replace('a', 'f')
However, what you probably want is
some_magic()
'asdf'.my_replace('a', 'f') # calls 'asdf'.replace('a', 'f')
and that's not possible without messing with things you're really not supposed to mess with:
# Awful hack. Last resort only.
import gc
for referrer in gc.get_referrers(str.replace):
    if type(referrer) is dict and str.__dict__ in gc.get_referrers(referrer):
        # Almost certainly the dict behind str's dictproxy.
        referrer['my_replace'] = str.replace
        break
This is absolutely NOT advisable to do in any of your projects. But this is possible:
We could do some metahacking: replacing the builtin str with our custom one.
import builtins
class myStr(str):
    def my_replace(self, old, new, count=-1):
        return self.replace(old, new, count)
builtins.str = myStr
Now every lookup of the name str resolves to our implementation, and you can add or change whatever you want. String literals are still instances of the original str, though, so the new method is only available on strings created by calling str(...) explicitly. On those, both replace methods work, one inherited from str and one created by us:
print(str('dsadas').replace('d', '_'))
>>> _sa_as
print(str('dsadas').my_replace('d', '_'))
>>> _sa_as
But remember: this is fun, but using these techniques in a real project could ruin a lot of other functionality.
I'm looking at a case like this:
def parse_item(self, response):
    item = MetrocItem()
    def ver(string):
        if string:
            return string
        else:
            return 'null'
    item['latitude'] = ver(response.xpath('//input[@id="latitude"]/@value').extract_first())
It works, but is there a better way to do this?
As @Graipher mentioned in the comments, this is certainly valid in some cases, but in your particular case it is unnecessary. If the inner function depended on a local variable, it would be a closure that has to be rebuilt every time you call the method. But here the function behaves the same way on every call, so it makes more sense to define it once at module level as a private helper, or even as a lambda in your case.
ver = lambda x: x if x else 'null'
But the preferred approach would be to simply define it globally and start the name with an underscore to make the intention clear.
def _ver(string):
    ...
You can get rid of that function completely:
string = response.xpath('//input[@id="latitude"]/@value').extract_first()
item['latitude'] = string if string else 'null'
There are use cases for having a local function - either in a class method or other method, say if you want a closure capturing something local.
In your case you want something like null coalescing:
e.g.
>>> s = None
>>> s or "None"
'None'
So, you can use
item['latitude'] = response.xpath('//input[@id="latitude"]/@value').extract_first() or "null"
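One thing worth knowing about the or idiom is that it replaces every falsy value, not only None. The sketch below uses two hypothetical helper names (first_or_null and none_to_null, not from the question) to show the difference:

```python
def first_or_null(value):
    # `or` returns the right operand whenever the left one is falsy,
    # so both None and '' become 'null'.
    return value or "null"


def none_to_null(value):
    # Replaces only None and keeps other falsy values such as ''.
    return "null" if value is None else value


print(first_or_null("28.61"))  # 28.61
print(first_or_null(None))     # null
print(first_or_null(""))       # null
print(repr(none_to_null("")))  # ''
```

For an extracted attribute value, treating an empty string as missing is probably what you want, in which case the plain or version is fine.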
It is not bad practice, but here's an alternative.
Since this small function doesn't depend on anything in parse_item and you might want it to be constructed only once, for performance, you could do this:
def parse_item(self, response):
    ...
def _ver(string):
    ...
parse_item.ver = _ver
del _ver
and refer to it inside parse_item as parse_item.ver instead of just ver. This avoids cluttering the global namespace, even with an underscored method that Python partially hides for you.
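Here is a runnable sketch of that pattern. It assumes parse_item is a plain module-level function that takes the already extracted value (a simplification of the question's method), and the body of _ver is the check from the question:

```python
def parse_item(response):
    # Look the helper up as an attribute of the function object itself.
    return parse_item.ver(response)


def _ver(string):
    return string if string else "null"


# Attach the helper once, then drop the temporary global name.
parse_item.ver = _ver
del _ver

print(parse_item("28.61"))  # 28.61
print(parse_item(None))     # null
```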
I would have recommended the lambda expression as another alternative, but Silvio got to it first.
Also, the comments that it could be a global function, visible to all, only make sense if there's any chance it might be used anywhere else.
Also also, you can eliminate this example completely, as doctorlove and Niklas have suggested, but I assume that you mean this as an example of more complex cases.
I guess parse_item is a method inside your class. Then you can use Python's "private" methods to make your code more readable and cleaner.
Check how to write Python "private" method and then check why it's not actually a private method. ;)
I'm trying to convert a list into callable functions. I'm writing a top level program to interface with a program that reads different data types.
I previously loaded a module that has these functions defined as well as many others, but we only want to use these 4. We want to be able to adjust from a top level the dataList
dataList= ['sat1', 'sat2', 'sat3', 'sat4'] #(top line input by user)
readData = ['ob.rd_'+L for L in dataList]
for n in range(0,np.size(dataList)): readData[n]('file.'+dataList[n])
The loaded module has functions ob.rd_sat1, ob.rd_sat2, ob.rd_sat3, etc. (over 50 different read functions) defined and ready to go. I generate a list with only the functions I want to call, but it is a list of function names stored as strings, not the functions themselves.
Here is where I am pulling my hair out by the roots. I cannot get the program to recognize the list of names as functions.
I get
Error: 'str' object is not callable.
I've googled this but cannot find any answer that works. Seems simple but ....
If the functions are in a module, or are methods on an object, you can use getattr to convert a string to the function of the same name.
Example:
import random
names = ("randint", "uniform")
for name in names:
    func = getattr(random, name)
    result = func(0,100)
    print("{}: {}".format(name, result))
In your case, if you want to convert "sat1" into the function ob.rd_sat1, you can do it like this:
import ob
...
dataList= ['sat1', 'sat2', 'sat3', 'sat4']
funcs = [getattr(ob, "rd_" + name) for name in dataList]
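Since I don't have your ob module, here is a self-contained sketch that uses a stand-in namespace with two fake readers. The stand-in and its return values are assumptions for illustration only; with the real module you would drop the SimpleNamespace part and keep the getattr lookup.

```python
from types import SimpleNamespace

# Stand-in for the real `ob` module (assumption: each reader takes a file name).
ob = SimpleNamespace(
    rd_sat1=lambda path: "sat1 read " + path,
    rd_sat2=lambda path: "sat2 read " + path,
)

dataList = ["sat1", "sat2"]

# Map each short name to the function object itself, not to a string.
readers = {name: getattr(ob, "rd_" + name) for name in dataList}

# Call every selected reader with its matching file name.
results = [readers[name]("file." + name) for name in dataList]
print(results)
```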
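For the third statement, if the functions were imported directly into the module's namespace (for example with from ob import *), build the names without the ob. prefix and look them up in globals():
for n in range(0,np.size(dataList)): globals()['rd_'+dataList[n]]('file.'+dataList[n])
In case the functions are not global, use locals() instead of globals()!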
Update 1:
As @Bryan Oakley has pointed out,
you can also use
dataList= ['sat1', 'sat2', 'sat3', 'sat4']
readData = ['rd_'+L for L in dataList]
for n in range(0,np.size(dataList)): getattr(ob, readData[n])('file.'+dataList[n])
How can I avoid lines like:
this_long_variable_name = this_long_variable_name.replace('a', 'b')
I thought I could avoid it by making a function, repl,
def repl(myfind, myreplace, s):
    s = s.replace(myfind, myreplace)
    print(s) # for testing purposes
    return s
but because of stuff about the local vs. global namespaces that I don't understand, I can't get the function to return a changed value for this_long_variable_name. Here's what I've tried:
this_long_variable_name = 'abbbc'
repl('b', 'x', this_long_variable_name)
print('after call to repl, this_long_variable_name =', this_long_variable_name)
The internal print statement shows the expected: axxxc
The print statement after the call to repl shows the unchanged: abbbc
Of course, it works if I give up and accept the redundant typing:
this_long_variable_name = repl('b', 'x', this_long_variable_name)
BTW, it's not just about the length of what has to be retyped, even if the variable's name were 'a,' I would not like retyping a = a.replace(...)
Since in the function s is a parameter, I can't do:
global s
I even tried:
this_long_variable_name.repl('b', 'x')
which shows you both how little I understand and how desperate I am.
The issue you're running into is that Python strings are immutable. str.replace() returns an entirely new string, so in s = s.replace(myfind, myreplace), the name s no longer refers to the original string, which is why you don't see any change on the outside of the function's namespace.
There probably isn't a great solution to your problem. I recommend using a modern IDE or Python REPL with autocompletion to alleviate it. Trying to abuse the standard way of writing things like this may feel good to you, but it will confuse anyone else looking at your code.
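A small sketch of what is going on, reusing the function from the question without the debugging print (the name returned is just for illustration):

```python
def repl(myfind, myreplace, s):
    # Rebinds only the local name `s`; the caller's variable is untouched.
    s = s.replace(myfind, myreplace)
    return s


this_long_variable_name = "abbbc"
returned = repl("b", "x", this_long_variable_name)

print(this_long_variable_name)  # abbbc
print(returned)                 # axxxc
```

The new string only reaches the caller through the return value, which is why the assignment on the calling side cannot be avoided.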
Harry, it does not work because inside your repl function s is a local name. Python does not hand over a copy; it passes a reference to the same string object (often called "pass by assignment"). Rebinding s to a new string inside the function only changes that local name and leaves this_long_variable_name pointing at the original string. Check also here:
Python: How do I pass a string by reference?
Also strings are immutable in python so if you wanna change them you always create a new modified version. Check here:
Aren't Python strings immutable?
Question would be why should you need long variable names in the first place?
I found similar code somewhere:
USER_CONTROLLED = 'a'
open("settings.py", "w").write("USER_CONTROLLED = %s" % eval(repr(a)))
And in another file:
import settings
x = settings.USER_CONTROLLED * [0]
Is this a security vulnerability?
In contrast to what you were told on IRC, there definitely is an x that makes eval(repr(x)) dangerous, so saying it just like that without any restrictions is wrong too.
Imagine a custom object that implements __repr__ differently. The documentation says on __repr__ that it “should look like a valid Python expression that could be used to recreate an object with the same value”. But there is simply nothing that can possibly enforce this guideline.
So instead, we could create a class that has a custom __repr__ that returns a string which when evaluated runs arbitrary code. For example:
class MyObj:
    def __repr__ (self):
        return "__import__('urllib.request').request.urlopen('http://example.com').read()"
Calling repr() on an object of that type shows that it returns a string that can surely be evaluated:
>>> repr(MyObj())
"__import__('urllib.request').request.urlopen('http://example.com').read()"
Here, that would just involve making a request to example.com. But as you can see, we can import arbitrary modules here and run code with them. And that code can have any kind of side effects. So it’s definitely dangerous.
If we however limit that x to known types of which we know what calling repr() on them will do, then we can indeed say when it’s impossible to run arbitrary code with it. For example, if that x is a string, then the implementation of unicode_repr makes sure that everything is properly escaped and that evaluating the repr() of that object will always return a proper string (which even equals x) without any side effects.
So we should check for the type before evaluating it:
if type(a) is not str:
    raise Exception('Only strings are allowed!')
something = eval(repr(a))
Note that we do not use isinstance here to do an inheritance-aware type check. Because I could absolutely make MyObj above inherit from str:
>>> x = MyObj()
>>> isinstance(x, str)
True
>>> type(x)
<class '__main__.MyObj'>
So you should really test against concrete types here.
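A harmless sketch of the difference between the two checks: Sneaky and is_plain_str are hypothetical names, and the custom repr evaluates to an innocent 1 + 1 instead of anything dangerous.

```python
class Sneaky(str):
    # Hypothetical str subclass whose repr is not a string literal.
    def __repr__(self):
        return "1 + 1"


def is_plain_str(value):
    # Exact type check: subclasses of str are rejected.
    return type(value) is str


x = Sneaky("harmless")

print(isinstance(x, str))         # True
print(is_plain_str(x))            # False
print(is_plain_str("harmless"))   # True
print(eval(repr(x)))              # 2, not the original string
```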
Note that for strings, there is actually no reason to call eval(repr(x)) because as mentioned above, this will result in x itself. So you could just assign x directly.
Coming to your actual use case however, you do have a very big security problem here. You want to create a variable assignment and store that code in a Python file to be later run by an actual Python interpreter. So you should absolutely make sure that the right side of the assignment is not arbitrary code but actually the repr of a string:
>>> a = 'runMaliciousCode()'
>>> "USER_CONTROLLED = %s" % eval(repr(a))
'USER_CONTROLLED = runMaliciousCode()'
>>> "USER_CONTROLLED = %s" % repr(a)
"USER_CONTROLLED = 'runMaliciousCode()'"
As you can see, evaluating the repr() will put the actual content on the right side of the assignment (since it’s equivalent to "…" % a). But that can then lead to malicious code running when you import that file. So you should really just insert the repr of the string there, and completely forget about using eval altogether.
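As a sketch of the safer variant: the line is built with repr only, and, as an addition of mine that is not part of the original code, the value is read back with ast.literal_eval from the standard library, which accepts literals only and does not execute code. Whether parsing the file this way fits your setup instead of importing it is an assumption.

```python
import ast

a = "runMaliciousCode()"

# repr() yields a quoted, escaped string literal, so the right-hand side is data.
line = "USER_CONTROLLED = %s" % repr(a)
print(line)  # USER_CONTROLLED = 'runMaliciousCode()'

# Parse the value back as a literal instead of executing the file.
_, _, rhs = line.partition(" = ")
value = ast.literal_eval(rhs)
print(value == a)  # True
```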
Can I use the word type in my own code or is it reserved? My function header:
def get(
    self,
    region='Delhi',
    city='Delhi',
    category='Apartments',
    type='For sale',
    limit=60,
    PAGESIZE=5,
    year=2012,
    month=1,
    day=1,
    next_page=None,
    threetapspage=0,
):
Using type as a keyword argument to a function will mask the built-in function "type" within the scope of the function. So while doing so does not raise a SyntaxError, it is not considered good practice, and I would avoid doing so.
Neither. It's not a reserved word (a list of which can be found at http://docs.python.org/reference/lexical_analysis.html#keywords ), but it's generally a bad idea to shadow any builtin.
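You can check this from Python itself. A quick sketch using the standard keyword and builtins modules:

```python
import builtins
import keyword

# `type` is a name in the builtins namespace, not a language keyword.
print(keyword.iskeyword("type"))   # False
print(hasattr(builtins, "type"))   # True

# For comparison, `class` really is reserved.
print(keyword.iskeyword("class"))  # True
```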
While others have pointed out that it's bad form to shadow Python built-ins, this is only the case when you name a function or a function parameter type. However:
It should be noted that the python built-in type is not shadowed in any way if you were to name a class attribute as type.
Even when referencing your class attribute, it would always be prefixed by the class instance self or a custom instance variable - and the python built-in would not be hindered.
For example:
Okay:
>>> class SomeClass():
...     type = 'foobar'
...
...     def someFunction(self):
...         return self.type
Not Okay:
>>> def type(): # Overrides python built-in in global scope
...     pass
...
>>> def foobar(type):
...     return type # Overrides python built-in within func
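To show what the parameter version costs you in practice, here is a sketch with a hypothetical function describe that takes type as a keyword argument, like the get method in the question:

```python
import builtins


def describe(value, type="For sale"):
    # Inside this function, `type` is the string parameter, not the builtin.
    try:
        return type(value)
    except TypeError:
        # Calling a str fails; the builtin is still reachable via builtins.
        return builtins.type(value)


print(describe(5) is int)  # True, but only thanks to the fallback
```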
This question is more than a decade old, and to be on the safe side I would recommend using kind instead of type as the argument name.
For a long time I was considering building a list of rename recommendations for all reserved words and builtins, but seeing your question made me finally do it.
Please check python-keyword-aliases.md and feel free to propose new entries to that list.
type should absolutely be considered a reserved word. While it can be tempting to use this word for database fields, consider that type() is one of the most important debugging/testing functions, because it tells you the class of an object.
$ python
>>> x = 5
>>> s = "rockets"
>>> y = [1,2,3]
>>> print(type(x))
<class 'int'>
>>> print(type(s))
<class 'str'>
>>> print(type(y))
<class 'list'>
An atlas would be classified a type of book, but consider using the word "category" instead.
A definitive con with shadowing builtin variables like type and id: if you happen to copy/paste your code that overshadows id for example, and then you replace all the instances of your variable with something else, but forget to rename one instance so you still have dangling id used, the code will still be syntactically valid and linters may very well not even find the error, but you'll get really really weird results when you run the code, because id won't be a string or number as you expected. It'll be the builtin function id!
This tripped me up badly a few times.
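A tiny sketch of that failure mode, with a hypothetical function where the parameter was renamed from id to user_id but one use was missed:

```python
def build_label(user_id):
    # Leftover `id` from an incomplete rename: this is the builtin function,
    # so no NameError is raised and the label is silently wrong.
    return "user-" + str(id)


print(build_label(42))  # user-<built-in function id>
```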