Set variable without default datatype - python

I am writing a function that parses a page with Selenium:
    def get_position_links(start_url, browser):
        """
        Retrieve the position_links
        """
        position_links = []
        next_page_element = ""
        next_page_attribute = ""
        # Kick off
        browser.get(start_url)

        def get_position_links_and_next_page_elememnt_in_the_current_page(position_links):
            # Get the position links within the page;
            # the browser changes appropriately with the page change
            nonlocal next_page_element
            nonlocal next_page_attribute
            position_elements = browser.find_elements_by_class_name("position_link")  # Retrieve the position link elements
            # Keep only those that contain "python" in the title
            position_elements = [p for p in position_elements if "python" in p.text.lower()]
            # position_links is the list defined at the top of the enclosing function
            position_links.extend([p.get_attribute("href") for p in position_elements])
            # nonlocal to avoid repeated returns
            next_page_element = browser.find_element_by_class_name("pager_next")
            # next_page_attribute is the flag for the while loop
            next_page_attribute = next_page_element.get_attribute("class").strip()

        # Handle the start_url
        get_position_links_and_next_page_elememnt_in_the_current_page(position_links)
        # Traverse until there are no more pages
        while not next_page_attribute.endswith("disabled"):
            # time.sleep(random.uniform(1,20))
            next_page_element.click()
            get_position_links_and_next_page_elememnt_in_the_current_page(position_links)
        return position_links
In the enclosing function I declared next_page_element = "" and next_page_attribute = "", although I am not sure what their data types will be.
I only gave them an arbitrary type so that the names exist.
How can I declare a variable without a default data type, like
    var nextPageElement
    var nextPageAttribute
in JavaScript?

You can use the type() function to find the data type of a variable.
For example,
    a = 1.0
    print(type(a))
Output: <class 'float'>
Explicit data type conversion is called 'typecasting'.
The general form of explicit data type conversion is
> (required_data_type)(expression)
You can dig into some of the commonly used explicit data type conversions here:
https://www.datacamp.com/community/tutorials/python-data-type-conversion
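As a small illustration of the general form above, here is a minimal sketch of a few built-in conversions. The variable names are made up for the example.

```python
a = 1.0

# Each built-in type can be called like a function to convert a value.
as_int = int(a)        # float -> int (truncates toward zero)
as_text = str(a)       # float -> str
parsed = float("2.5")  # str -> float

print(type(as_int), type(as_text), type(parsed))
```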

I can't see any reason to use nonlocal variables here at all. You should just return the values from the function.
    def get_position_links_and_next_page_elememnt_in_the_current_page(position_links):
        ...
        return next_page_element, next_page_attribute

    next_page_element, next_page_attribute = get_position_links_and_next_page_elememnt_in_the_current_page(position_links)
Now you don't need nonlocal, you don't need to predefine the elements, and you don't even really need a nested function.
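A minimal sketch of that return-based shape, with no browser involved. PAGES, scrape_page and the example.com URLs are hypothetical stand-ins for the Selenium calls. Note that Python names need no declaration; where a placeholder is wanted, None is the usual choice.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for the browser: each "page" is (links, has_next_page).
PAGES = [
    (["https://example.com/python-dev", "https://example.com/java-dev"], True),
    (["https://example.com/senior-python"], False),
]


def scrape_page(index: int) -> Tuple[List[str], Optional[int]]:
    """Return the Python-related links on one page and the next page index (None on the last page)."""
    links, has_next = PAGES[index]
    python_links = [link for link in links if "python" in link.lower()]
    next_index = index + 1 if has_next else None
    return python_links, next_index


def get_position_links() -> List[str]:
    position_links: List[str] = []
    next_index: Optional[int] = 0
    # Loop until the helper reports that there is no next page.
    while next_index is not None:
        links, next_index = scrape_page(next_index)
        position_links.extend(links)
    return position_links


result = get_position_links()
print(result)
```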

Related

Playwright page.wait_for_event function how to access the page and other variables from inside the callable?

I'm trying to use the playwright page.wait_for_event function. One of the kwargs accepts a Callable. I'm trying to use a helper function that would take two arguments: the event that I'm waiting to fire, and a global variable. But I can't seem to figure out how to find and/or use a variable for the event to pass into the helper function. Most examples I see that use the wait_for_event function use a lambda function for the predicate argument which works great, but I need to perform an action before I return a boolean value which I also don't know how to do with a lambda function.
My apologies if my terminology is incorrect.
The function I'm trying to use as an argument:
    def test(event, global_variable):
        page.locator('//*[@id="n-currentevents"]/a').click()  # Action before boolean
        if event.url == 'https://en.wikipedia.org/':
            return True
The variations of the page.wait_for_event function I tried:
    # Doesn't work
    r = page.wait_for_event('request', test('request', global_variable))
    r = page.wait_for_event('request', test(event, global_variable))
    r = page.wait_for_event(event='request', test(event, global_variable))
    r = page.wait_for_event(event:'request', test(event, global_variable))
    # Lambda works, but I need to click an element before returning the truth value
    r = page.wait_for_event(event='request', predicate=lambda req: req.url == 'https://en.wikipedia.org/')
The wait_for_event function:
    def wait_for_event(
        self, event: str, predicate: typing.Callable = None, *, timeout: float = None
    ) -> typing.Any:
        """Page.wait_for_event

        > NOTE: In most cases, you should use `page.expect_event()`.

        Waits for given `event` to fire. If predicate is provided, it passes event's value into the `predicate` function and
        waits for `predicate(event)` to return a truthy value. Will throw an error if the page is closed before the `event` is
        fired.

        Parameters
        ----------
        event : str
            Event name, same one typically passed into `*.on(event)`.
        predicate : Union[Callable, NoneType]
            Receives the event data and resolves to truthy value when the waiting should resolve.
        timeout : Union[float, NoneType]
            Maximum time to wait for in milliseconds. Defaults to `30000` (30 seconds). Pass `0` to disable timeout. The default
            value can be changed by using the `browser_context.set_default_timeout()`.

        Returns
        -------
        Any
        """
        return mapping.from_maybe_impl(
            self._sync(
                self._impl_obj.wait_for_event(
                    event=event,
                    predicate=self._wrap_handler(predicate),
                    timeout=timeout,
                )
            )
        )
Update: I was able to accomplish my task with a more appropriate method.
    URL = "https://en.wikipedia.org"
    with page.expect_response(lambda response: response.url == URL) as response_info:
        page.locator('//xpath').click()
    response = response_info.value
You can use a factory function here to pass the page, and the global_variable. Keep in mind that navigating away from the page from inside the callable will lead to an error. So make sure whatever you are clicking does not change the current URL of the page.
    def wrapper(page, global_variable):
        def test(event):
            page.locator('//*[@id="n-currentevents"]/a').click()  # Action before boolean
            if event.url == 'https://en.wikipedia.org/':
                return True
        return test
Then, you can register the above function using page.wait_for_event like this:
    page.wait_for_event('request', wrapper(page, global_variable))
Remember: you need to pass functions/callables (not their return values) to page.wait_for_event.
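An alternative to writing the factory by hand is functools.partial, which pre-binds the extra arguments and leaves a one-argument callable. The sketch below is only an illustration: the event objects are SimpleNamespace stand-ins rather than Playwright requests, and url_matches and seen_urls are hypothetical names.

```python
from functools import partial
from types import SimpleNamespace


def url_matches(event, target_url, seen):
    """Record the event URL (the 'action before the boolean'), then report a match."""
    seen.append(event.url)
    return event.url == target_url


seen_urls = []
# partial() returns a callable that still expects the single `event` argument.
predicate = partial(url_matches, target_url="https://en.wikipedia.org/", seen=seen_urls)

# Stand-in events; a real event loop would pass request objects instead.
events = [
    SimpleNamespace(url="https://example.org/"),
    SimpleNamespace(url="https://en.wikipedia.org/"),
]
results = [predicate(e) for e in events]
print(results, seen_urls)
```

The resulting predicate is a callable, not a return value, which is the shape a predicate parameter expects.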

Is there a way to give a function access to the (external-scope) name of the variable being passed in? [duplicate]

Is it possible to get the original variable name of a variable passed to a function? E.g.
    foobar = "foo"

    def func(var):
        print(var.origname)
So that:
    func(foobar)
Returns:
    >>foobar
EDIT:
All I was trying to do was make a function like:
    def log(soup):
        with open(varname + '.html', 'w') as f:
            f.write(soup.prettify())
... and have the function generate the filename from the name of the variable passed to it.
I suppose if it's not possible I'll just have to pass the variable and the variable's name as a string each time.
EDIT: To make it clear, I don't recommend using this AT ALL, it will break, it's a mess, it won't help you in any way, but it's doable for entertainment/education purposes.
You can hack around with the inspect module, I don't recommend that, but you can do it...
    import inspect

    def foo(a, f, b):
        frame = inspect.currentframe()
        frame = inspect.getouterframes(frame)[1]
        string = inspect.getframeinfo(frame[0]).code_context[0].strip()
        args = string[string.find('(') + 1:-1].split(',')
        names = []
        for i in args:
            if i.find('=') != -1:
                names.append(i.split('=')[1].strip())
            else:
                names.append(i.strip())  # strip the space left over from splitting on ','
        print(names)

    def main():
        e = 1
        c = 2
        foo(e, 1000, b = c)

    main()
Output:
['e', '1000', 'c']
To add to Michael Mrozek's answer, you can extract the exact parameters versus the full code by:
    import re
    import traceback

    def func(var):
        stack = traceback.extract_stack()
        filename, lineno, function_name, code = stack[-2]
        vars_name = re.compile(r'\((.*?)\).*$').search(code).groups()[0]
        print(vars_name)
        return

    foobar = "foo"
    func(foobar)
    # PRINTS: foobar
Looks like Ivo beat me to inspect, but here's another implementation:
    import inspect

    def varName(var):
        lcls = inspect.stack()[2][0].f_locals
        for name in lcls:
            if id(var) == id(lcls[name]):
                return name
        return None

    def foo(x=None):
        lcl = 'not me'
        return varName(x)

    def bar():
        lcl = 'hi'
        return foo(lcl)

    bar()
    # 'lcl'
Of course, it can be fooled:
    def baz():
        lcl = 'hi'
        x = 'hi'
        return foo(lcl)

    baz()
    # 'x'
Moral: don't do it.
Another way you can try if you know what the calling code will look like is to use traceback:
    import traceback

    def func(var):
        stack = traceback.extract_stack()
        filename, lineno, function_name, code = stack[-2]
Here, code will contain the line of code that was used to call func (in your example, it would be the string func(foobar)). You can parse that to pull out the argument.
You can't. It's evaluated before being passed to the function. All you can do is pass it as a string.
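Passing the name explicitly is the simple and robust route. A minimal sketch, assuming a plain string in place of the soup object and a temporary directory for the output; the directory parameter and the sample text are assumptions added for the example.

```python
import os
import tempfile


def log(text, name, directory):
    """Write text to <directory>/<name>.html and return the path."""
    path = os.path.join(directory, name + ".html")
    with open(path, "w") as f:
        f.write(text)
    return path


out_dir = tempfile.mkdtemp()
foobar = "<html><body>foo</body></html>"
# The caller states the name once, next to the variable itself.
written = log(foobar, "foobar", out_dir)
print(written)
```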
@Ivo Wetzel's answer works when the function call is made on one line, like
    e = 1 + 7
    c = 3
    foo(e, 100, b=c)
If the function call spans several lines, like:
    e = 1 + 7
    c = 3
    foo(e,
        1000,
        b = c)
the code below works:
    import inspect, ast

    def foo(a, f, b):
        frame = inspect.currentframe()
        frame = inspect.getouterframes(frame)[1]
        string = inspect.findsource(frame[0])[0]
        nodes = ast.parse(''.join(string))
        i_expr = -1
        for (i, node) in enumerate(nodes.body):
            if (hasattr(node, 'value') and isinstance(node.value, ast.Call)
                    and hasattr(node.value.func, 'id')
                    and node.value.func.id == 'foo'):  # Here goes the name of the function
                i_expr = i
                break
        i_expr_next = min(i_expr + 1, len(nodes.body) - 1)
        lineno_start = nodes.body[i_expr].lineno
        # Stop on the line before the next statement so that it is not included in the call text
        lineno_end = nodes.body[i_expr_next].lineno - 1 if i_expr_next != i_expr else len(string)
        str_func_call = ''.join([i.strip() for i in string[lineno_start - 1: lineno_end]])
        params = str_func_call[str_func_call.find('(') + 1:-1].split(',')
        print(params)
You will get:
[u'e', u'1000', u'b = c']
But still, this might break.
You can use the python-varname package:
    from varname import nameof
    s = 'Hey!'
    print(nameof(s))
Output:
    s
Package:
https://github.com/pwwang/python-varname
For posterity, here's some code I wrote for this task. In general, I think Python is missing a module that gives everyone nice and robust inspection of the caller environment, similar to what the rlang eval framework provides for R.
    import re, inspect, ast

    # Convoluted frame stack walk and source scrape to get what the calling statement to a function looked like.
    # Specifically return the name of the variable passed as parameter found at position pos in the parameter list.
    def _caller_param_name(pos):
        # The parameter name to return
        param = None
        # Get the frame object for this function call
        thisframe = inspect.currentframe()
        try:
            # Get the parent calling frames' details
            frames = inspect.getouterframes(thisframe)
            # Function this function was just called from that we wish to find the calling parameter name for
            function = frames[1][3]
            # Get all the details of where the calling statement was
            frame, filename, line_number, function_name, source, source_index = frames[2]
            # Read in the source file in the parent calling frame up to where the call was made
            with open(filename) as source_file:
                head = [next(source_file) for _ in range(line_number)]
            # Build all lines of the calling statement; this deals with a function called with parameters listed on each line
            lines = []
            # Compile a regex for matching the start of the function being called
            regex = re.compile(r'\.?\s*%s\s*\(' % (function))
            # Work backwards from the parent calling frame line number until we see the start of the calling statement (usually the same line!)
            for line in reversed(head):
                lines.append(line.strip())
                if re.search(regex, line):
                    break
            # Put the lines we have grokked back into source file order rather than reverse order
            lines.reverse()
            # Join all the lines that were part of the calling statement
            call = "".join(lines)
            # Grab the parameter list from the calling statement for the function we were called from
            match = re.search(r'\.?\s*%s\s*\((.*)\)' % (function), call)
            paramlist = match.group(1)
            # If the function was called with no parameters raise an exception
            if paramlist == "":
                raise LookupError("Function called with no parameters.")
            # Use the Python abstract syntax tree parser to create a parsed form of the function parameter list; 'Name' nodes are variable names
            parameter = ast.parse(paramlist).body[0].value
            # If there were multiple parameters get the positional one requested
            if type(parameter).__name__ == 'Tuple':
                # If we asked for a parameter outside of what was passed, complain
                if pos >= len(parameter.elts):
                    raise LookupError("The function call did not have a parameter at position %s" % pos)
                parameter = parameter.elts[pos]
            # If there was only a single parameter and another was requested raise an exception
            elif pos != 0:
                raise LookupError("There was only a single calling parameter found. Parameter indices start at 0.")
            # If the parameter was the name of a variable we can use it, otherwise pass back None
            if type(parameter).__name__ == 'Name':
                param = parameter.id
        finally:
            # Remove the frame reference to prevent cyclic references confusing the garbage collector
            del thisframe
        # Return the parameter name we found
        return param
If you want a Key Value Pair relationship, maybe using a Dictionary would be better?
...or if you're trying to create some auto-documentation from your code, perhaps something like Doxygen (http://www.doxygen.nl/) could do the job for you?
I wondered how IceCream solves this problem. So I looked into the source code and came up with the following (slightly simplified) solution. It might not be 100% bullet-proof (e.g. I dropped get_text_with_indentation and I assume exactly one function argument), but it works well for different test cases. It does not need to parse source code itself, so it should be more robust and simpler than previous solutions.
    #!/usr/bin/env python3
    import inspect
    from executing import Source

    def func(var):
        callFrame = inspect.currentframe().f_back
        callNode = Source.executing(callFrame).node
        source = Source.for_frame(callFrame)
        expression = source.asttokens().get_text(callNode.args[0])
        print(expression, '=', var)

    i = 1
    f = 2.0
    s = 'string'
    dct = {'key': 'value'}
    obj = type('', (), {'value': 42})

    func(i)
    func(f)
    func(s)
    func(dct['key'])
    func(obj.value)
Output:
    i = 1
    f = 2.0
    s = string
    dct['key'] = value
    obj.value = 42
Update: If you want to move the "magic" into a separate function, you simply have to go one frame further back with an additional f_back.
    def get_name_of_argument():
        callFrame = inspect.currentframe().f_back.f_back
        callNode = Source.executing(callFrame).node
        source = Source.for_frame(callFrame)
        return source.asttokens().get_text(callNode.args[0])

    def func(var):
        print(get_name_of_argument(), '=', var)
If you want to get the caller params as in @Matt Oates's answer without reading the source file (e.g. from a Jupyter Notebook), this code (combined with @Aeon's answer) will do the trick, at least in some simple cases:
    import ast
    import inspect

    def get_caller_params():
        # get the frame object for this function call
        thisframe = inspect.currentframe()
        # get the parent calling frames' details
        frames = inspect.getouterframes(thisframe)
        # frame 0 is the frame of this function
        # frame 1 is the frame of the caller function (the one we want to inspect)
        # frame 2 is the frame of the code that calls the caller
        caller_function_name = frames[1][3]
        code_that_calls_caller = inspect.findsource(frames[2][0])[0]
        # parse code to get nodes of the abstract syntax tree of the call
        nodes = ast.parse(''.join(code_that_calls_caller))
        # find the node that calls the function
        i_expr = -1
        for (i, node) in enumerate(nodes.body):
            if _node_is_our_function_call(node, caller_function_name):
                i_expr = i
                break
        # line where the call starts
        idx_start = nodes.body[i_expr].lineno - 1
        # line where the call ends
        if i_expr < len(nodes.body) - 1:
            # next expression marks the end of the call
            idx_end = nodes.body[i_expr + 1].lineno - 1
        else:
            # end of the source marks the end of the call
            idx_end = len(code_that_calls_caller)
        call_lines = code_that_calls_caller[idx_start:idx_end]
        str_func_call = ''.join([line.strip() for line in call_lines])
        str_call_params = str_func_call[str_func_call.find('(') + 1:-1]
        params = [p.strip() for p in str_call_params.split(',')]
        return params

    def _node_is_our_function_call(node, our_function_name):
        node_is_call = hasattr(node, 'value') and isinstance(node.value, ast.Call)
        if not node_is_call:
            return False
        function_name_correct = hasattr(node.value.func, 'id') and node.value.func.id == our_function_name
        return function_name_correct
You can then run it as this:
    def test(*par_values):
        par_names = get_caller_params()
        for name, val in zip(par_names, par_values):
            print(name, val)

    a = 1
    b = 2
    string = 'text'
    test(a, b,
         string
    )
to get the desired output:
    a 1
    b 2
    string text
Since several variables can hold the same content, instead of passing the variable (its content), it might be safer (and will be simpler) to pass its name as a string and get the content from the locals dictionary in the caller's stack frame:
    def displayvar(name):
        import sys
        return name + " = " + repr(sys._getframe(1).f_locals[name])
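A short usage sketch of this approach. The demo function and its local variables are hypothetical, and displayvar is repeated here only so that the block runs on its own.

```python
import sys


def displayvar(name):
    # Look the name up in the local variables of the calling frame.
    return name + " = " + repr(sys._getframe(1).f_locals[name])


def demo():
    count = 3
    label = "hi"
    return displayvar("count"), displayvar("label")


shown = demo()
print(shown)
```

This only sees the caller's local variables; a name that exists only at some other scope raises KeyError.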
If the variable happens to be a callable (function), it will have a __name__ attribute.
E.g. a wrapper to log the execution time of a function:
    from time import perf_counter

    def time_it(func, *args, **kwargs):
        start = perf_counter()
        result = func(*args, **kwargs)
        duration = perf_counter() - start
        print(f'{func.__name__} ran in {duration * 1000}ms')
        return result

How do you initialize a global variable only when it's not defined?

I have a global dictionary variable that will be used in a function that gets called multiple times. I don't have control of when the function is called, or a scope outside of the function I'm writing. I need to initialize the variable only if it's not initialized. Once initialized, I will add values to it.
    global dict
    if dict is None:
        dict = {}
    dict[lldb.thread.GetThreadID()] = dict[lldb.thread.GetThreadID()] + 1
Unfortunately, I get
NameError: global name 'dict' is not defined
I understand that I should define the variable, but since this code is called multiple times, by just saying dict = {} I would be RE-defining the variable every time the code is called, unless I can somehow check if it's not defined, and only define it then.
Catching the error:
    global myDict  # must come before the first use of the name
    try:
        myDict
    except NameError:
        myDict = {}
IMPORTANT NOTE:
Do NOT use dict or any other built-in type as a variable name.
A more idiomatic way to do this is to set the name ahead of time to a sentinel value and then check against that:
    _my_dict = None
    ...
    def increment_thing():
        global _my_dict
        if _my_dict is None:
            _my_dict = {}
        thread_id = lldb.thread.GetThreadID()
        _my_dict[thread_id] = _my_dict.get(thread_id, 0) + 1
Note, I don't know anything about lldb, but if it is using Python threads, you might be better off using a threading.local:
    import threading

    # Thread local storage
    _tls = threading.local()

    def increment_thing():
        counter = getattr(_tls, 'counter', 0)
        _tls.counter = counter + 1
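When no module-level statement can be added at all, one option is to create the dictionary on first use with globals().setdefault. This is only a sketch: the name _thread_hits is hypothetical, and plain integers stand in for the IDs that lldb.thread.GetThreadID() would return.

```python
def increment_thing(thread_id):
    # setdefault creates the module-level dict on the first call and reuses it afterwards.
    counts = globals().setdefault("_thread_hits", {})
    # .get(..., 0) avoids a KeyError the first time a thread ID is seen.
    counts[thread_id] = counts.get(thread_id, 0) + 1
    return counts[thread_id]


first = increment_thing(7)
second = increment_thing(7)
other = increment_thing(9)
print(first, second, other)
```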

Dynamically call a var inside string in function

I'm new to Python. I have a variable (a string) which is an XPath query, and I want to pass the variable i into it. A simple example below:
    i = 0
    self.var = 'li['+i+']'

    def test(self):
        while(i<10):
            print(self.var)  # 'li[0]', 'li[1]' ...
            i += 1
You need to call str on the variable i, because you cannot concatenate an int and a str:
    'li['+str(i)+']'
Or just use str.format. You can also use range, and since XPath indexing is 1-based you would start at 1:
    self.var = "li[{}]"

    def test(self):
        for i in range(1, 11):
            print(self.var.format(i))
Or, if you are using lxml for your XPath queries, you can use an XPath variable like below:
    .xpath("li[$i]", i=i)
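The same string building can be shown as a standalone sketch, using str.format and, as an alternative not mentioned above, an f-string (Python 3.6+). The variable names are made up for the example.

```python
template = "li[{}]"

# XPath positions are 1-based, so the range starts at 1.
queries = [template.format(i) for i in range(1, 11)]

# Equivalent f-string form for the first three positions.
f_queries = [f"li[{i}]" for i in range(1, 4)]

print(queries[0], queries[-1], f_queries)
```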
You can create a list with the first string, the integer, and the second string.
    i = 0
    var = ['li[', i, ']']

    def test(var):
        while(var[1] < 10):
            print(var[0] + str(var[1]) + var[2])
            var[1] += 1

    test(var)

Python / YAML: How to initialize additional objects not just from the YAML file, within loadConfig?

I have what I think is a small misconception with loading some YAML objects. I defined the class below.
What I want to do is load some objects with the overridden loadConfig function for YAMLObjects. Some of these come from my .yaml file, but others should be built out of objects loaded from the YAML file.
For instance, in the class below, I load a member object named "keep" which is a string naming some items to keep in the region. But I want to also parse this into a list and have the list stored as a member object too. And I don't want the user to have to give both the string and list version of this parameter in the YAML.
My current work around has been to override the __getattr__ function inside Region and make it create the defaults if it looks and doesn't find them. But this is clunky and more complicated than needed for just initializing objects.
What convention am I misunderstanding here? Why doesn't the loadConfig method create additional things not found in the YAML?
    import yaml, pdb

    class Region(yaml.YAMLObject):
        yaml_tag = u'!Region'

        def __init__(self, name, keep, drop):
            self.name = name
            self.keep = keep
            self.drop = drop
            self.keep_list = self.keep.split("+")
            self.drop_list = self.drop.split("+")
            self.pattern = "+".join(self.keep_list) + "-" + "-".join(self.drop_list)
        ###

        def loadConfig(self, yamlConfig):
            yml = yaml.load_all(file(yamlConfig))
            for data in yml:
                # These get created fine
                self.name = data["name"]
                self.keep = data["keep"]
                self.drop = data["drop"]
                # These do not get created.
                self.keep_list = self.keep.split("+")
                self.drop_list = self.drop.split("+")
                self.pattern = "+".join(self.keep_list) + "-" + "-".join(self.drop_list)
        ###
    ### End Region

    if __name__ == "__main__":
        my_yaml = "/home/path/to/test.yaml"
        region_iterator = yaml.load_all(file(my_yaml))
        # Set a debug breakpoint to play with region_iterator and
        # confirm the extra stuff isn't created.
        pdb.set_trace()
And here is test.yaml so you can run all of this and see what I mean:
    Regions:
      # Note: the string conventions below are for an
      # existing system. This is a shortened, representative
      # example.
      Market1:
        !Region
        name: USAndGB
        keep: US+GB
        drop: !!null
      Market2:
        !Region
        name: CanadaAndAustralia
        keep: CA+AU
        drop: !!null
And here, for example, is what it looks like for me when I run this in an IPython shell and explore the loaded object:
In [57]: %run "/home/espears/testWorkspace/testRegions.py"
--Return--
> /home/espears/testWorkspace/testRegions.py(38)<module>()->None
-> pdb.set_trace()
(Pdb) region_iterator
<generator object load_all at 0x1139d820>
(Pdb) tmp = region_iterator.next()
(Pdb) tmp
{'Regions': {'Market2': <__main__.Region object at 0x1f858550>, 'Market1': <__main__.Region object at 0x11a91e50>}}
(Pdb) us = tmp['Regions']['Market1']
(Pdb) us
<__main__.Region object at 0x11a91e50>
(Pdb) us.name
'USAndGB'
(Pdb) us.keep
'US+GB'
(Pdb) us.keep_list
*** AttributeError: 'Region' object has no attribute 'keep_list'
A pattern I have found useful for working with yaml for classes that are basically storage is to have the loader use the constructor so that objects are created in the same way as when you make them normally. If I understand what you are attempting to do correctly, this kind of structure might be useful:
    import inspect
    import numpy as np
    import yaml
    from collections import OrderedDict

    class Serializable(yaml.YAMLObject):
        __metaclass__ = yaml.YAMLObjectMetaclass

        @property
        def _dict(self):
            dump_dict = OrderedDict()
            for var in inspect.getfullargspec(self.__init__).args[1:]:
                if getattr(self, var, None) is not None:
                    item = getattr(self, var)
                    if isinstance(item, np.ndarray) and item.ndim == 1:
                        item = list(item)
                    dump_dict[var] = item
            return dump_dict

        @classmethod
        def to_yaml(cls, dumper, data):
            return ordered_dump(dumper, '!{0}'.format(data.__class__.__name__),
                                data._dict)

        @classmethod
        def from_yaml(cls, loader, node):
            fields = loader.construct_mapping(node, deep=True)
            return cls(**fields)

    def ordered_dump(dumper, tag, data):
        value = []
        node = yaml.nodes.MappingNode(tag, value)
        for key, item in data.items():
            node_key = dumper.represent_data(key)
            node_value = dumper.represent_data(item)
            value.append((node_key, node_value))
        return node
You would then want to have your Region class inherit from Serializable, and remove the loadConfig stuff. The code I posted inspects the constructor to see what data to save to the yaml file, and then when loading a yaml file calls the constructor with that same set of data. That way you just have to get the logic right in your constructor and the yaml loading should get it for free.
That code was ripped from one of my projects, apologies in advance if it doesn't quite work. It is also slightly more complicated than it needs to be because I wanted to control the order of output by using OrderedDict. You could replace my ordered_dump function with a call to dumper.represent_dict.
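A different and smaller option, not taken from the code above, is to compute the derived values on demand with properties, so they exist however the object was built. The sketch below uses no YAML library: it simulates a loader that skips __init__ by filling __dict__ directly, which is an assumption made for illustration. The None handling for drop is also an addition, since the sample file sets drop to null.

```python
class Region:
    def __init__(self, name, keep, drop=None):
        self.name = name
        self.keep = keep
        self.drop = drop

    @property
    def keep_list(self):
        # Derived from keep each time, so it never needs to be stored or loaded.
        return self.keep.split("+") if self.keep else []

    @property
    def drop_list(self):
        return self.drop.split("+") if self.drop else []


# Simulate a loader that creates the object without calling __init__.
region = Region.__new__(Region)
region.__dict__.update({"name": "USAndGB", "keep": "US+GB", "drop": None})

print(region.keep_list, region.drop_list)
```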
