I am looking into characterising some software by running many simulations with different parameters.
Each simulation can be treated as a test with different input parameters.
The test specification lists the different parameters:
param_a = 1
param_b = range(1,10)
param_c = {'package_1':1.1, 'params':[1,2,34]}
function = algo_1
and that would generate a list of tests:
['test-0': {'param_a': 1, 'param_b': 1, 'param_c': ...},
 'test-1': {'param_a': 1, 'param_b': 2, 'param_c': ...},
 ...]
and call the function with these parameters. The return value of the function is the test result, which should be reported in a 'friendly' way:
test-0: performance = X%, accuracy = Y%, runtime = Zsec ...
For example, Erlang's Common Test and Quickcheck are very suitable for this task, and provide HTML reporting of the tests.
Is there anything similar in Python?
You could give Robot Framework a chance. It is easy/native to call your Python code from Robot test cases, and you will get nice HTML reports as well. If you get blocked you can get help on SO (tag robotframework) or on the Robot User Mailing List.
Considering the lack of suitable packages, here is an implementation of the wanted features:
test definition:
a Python file that contains a config variable, which is a dictionary of static requirements, and a variables variable, which is a dictionary of varying requirements (stored as lists).
config = {'db' : 'database_1'}
variables = {'threshold' : [1,2,3,4]}
The test specification is imported using imp, after parsing the script's arguments into args:
testspec = imp.load_source("testspec", args.test)
test generation:
The list of tests is generated using a small wrapper around product from itertools:
from itertools import izip, product

def my_product(dicts):
    return (dict(izip(dicts, x)) for x in product(*dicts.itervalues()))

def generate_tests(testspec):
    return [dict(testspec.config.items() + x.items()) for x in my_product(testspec.variables)]
which returns:
[{'db': 'database_1', 'threshold': 1},
{'db': 'database_1', 'threshold': 2},
{'db': 'database_1', 'threshold': 3},
{'db': 'database_1', 'threshold': 4}]
dynamic module loading:
To load the correct module database_1 under the generic name db, I again used imp, in combination with the testspec, in the class that uses the module:
dbModule = testspec['db']
global db
db = imp.load_source('db', 'config/'+dbModule+'.py')
pretty printing:
not much here, just logging to terminal.
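For reference, here is a minimal sketch of a runner loop that ties these pieces together and logs results to the terminal; the testspec module, the function under test, and the metric names in the returned dictionary are assumptions taken from the question, not a fixed API:

import logging

logging.basicConfig(level=logging.INFO, format='%(message)s')
log = logging.getLogger('testrunner')

def run_tests(testspec, func):
    # Generate all parameter combinations, call the function under test with
    # each of them, and log the returned metrics per test case.
    for i, params in enumerate(generate_tests(testspec)):
        result = func(**params)  # assumed to return a dict of metrics
        metrics = ', '.join('%s = %s' % (k, v) for k, v in sorted(result.items()))
        log.info('test-%d: %s', i, metrics)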
Say I have a function in src/f1.py with the following signature:
def my_func(a1: int, a2: bool) -> float:
...
In a separate file src/f2.py I create a dictionary:
from src.f1 import my_func
my_dict = {
"func": my_func
}
In my final file src/test1.py I call the function:
from src.f2 import my_dict
print(my_dict["func"](1, True))
I'm getting IDE auto-suggestions for my_func in src/f2.py, but not in src/test1.py. I have tried using typing.Callable, but it doesn't create the same signature and it loses the function documentation. Is there any way I can get these in src/test1.py without changing the structure of my code?
I don't want to change the files in which my functions, dictionaries, or tests are declared.
I use VSCode version 1.73.1 and Python version 3.8.13. I cannot change my Python version.
I tried creating different types of Callable objects, but had problems getting the desired results:
They seem to have no docstring support.
Some of the arguments are optional; I couldn't get that to work.
They only work with data types, not variable names. I want the variable names (the argument names of the function) to show up in the IDE suggestion.
What am I trying to do really?
I am trying to implement a mechanism where a user can set the configurations for a library in a single file. That single file is where all the dictionaries are stored (and it imports all essential functions).
This "configuration dictionary" is called in the main python file (or wherever needed). I have functions in a set of files for accomplishing a specific set of tasks.
Say functions fa1 and fa2 for task A; fb1, fb2, and fb3 for task B; and fc1 for task C. I want configurations (choices) for task A, followed by B, then C. So I do
work_ABC = {"A": fa1, "B": fb2, "C": fc1}
Now, I have a function like
wd = work_ABC

def do_work_abc(i, do_B=True):
    res_a = wd["A"](i)
    res_b = res_a
    if do_B:
        res_b = wd["B"](res_a)
    res_c = wd["C"](res_b)
    return res_c
If you have a more efficient way how I can implement the same thing, I'd love to hear it.
I want IntelliSense to give me the function signature of the function set for the dictionary.
There is no type annotation construct in Python that covers docstrings or parameter names of functions. There isn't even one for positional-only or keyword-only parameters (which would be actually meaningful in a type sense).
As I already mentioned in my comment, docstrings and names are not type-related. Ultimately, this is an IDE issue. PyCharm for example has no problem inferring those details with the setup you provided. I get the auto-suggestion for my_dict["func"] with the parameter names because PyCharm is smart/heavy enough to track it to the source. But it has its limits. If I change the code in f2 to this:
from src.f1 import my_func
_other_dict = {
"func": my_func
}
my_dict = {}
my_dict.update(_other_dict)
Then the suggestion engine is lost.
The reason is simply the discrepancy between runtime and static analysis. At some point it becomes silly/unreasonable to expect a static analysis tool to essentially run the code for you. This is why I always say:
Static type checkers don't execute your code, they just read it.
Even the fact that PyCharm "knows" the signature of my_func with your setup entails it running some non-trivial code in the background to back-track from the dictionary key to the dictionary to the actual function definition.
So in short: It appears you are out of luck with VSCode. And parameter names and docstrings are not part of the type system.
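To make the limitation concrete, here is a minimal sketch of the Callable-based annotation the question mentions (the alias name FuncT is made up for illustration): it records only the parameter and return types, so the parameter names a1/a2 and the docstring of my_func are simply not part of the annotation and cannot be surfaced by the IDE.

# src/f2.py -- illustrative only
from typing import Callable, Dict

from src.f1 import my_func

# The annotation captures types only: (int, bool) -> float.
# Parameter names and the docstring of my_func are not representable here.
FuncT = Callable[[int, bool], float]

my_dict: Dict[str, FuncT] = {
    "func": my_func,
}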
I am trying to call a dynamic method created using exec(). When I call it via globals(), pytest treats the parameters as fixtures and fails with the error fixture 'Template_SI' not found.
Can someone help on how to pass dynamic parameters when calling globals()[function_name](params)?
import pytest
import input_csv

datalist = input_csv.csvdata()


def display():
    return 10 + 5


for data in datalist:
    functionname = data['TCID']
    parameters = [data['Template_name'], data['File_Type']]
    body = 'print(display())'

    def createfunc(name, *params, code):
        exec('''
@pytest.mark.regression
def {}({}):
    {}'''.format(name, ', '.join(params), code), globals(), globals())

    createfunc(functionname, data['Template_name'], data['File_Type'], code=body)
    templateName = data['Template_name']
    fileType = data['File_Type']
    globals()[functionname](templateName, fileType)
It looks like you're trying to automate the generation of lots of different tests based on different input data. If that's the case, using exec is probably not the best way to go.
pytest provides parameterized tests: https://docs.pytest.org/en/6.2.x/example/parametrize.html
which accomplish test generation by hiding all the details inside Metafunc.parametrize().
If you really want to generate the tests yourself, consider adapting Metafunc to your own purposes, or, alternatively, check out the unittest framework.
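For illustration, here is a hedged sketch of the parametrized approach, assuming the input_csv.csvdata() helper from the question returns a list of dicts with TCID, Template_name and File_Type keys; the test body itself is just a placeholder:

# test_templates.py -- a sketch, not a drop-in replacement
import pytest
import input_csv

datalist = input_csv.csvdata()


@pytest.mark.regression
@pytest.mark.parametrize(
    "template_name, file_type",
    [(data['Template_name'], data['File_Type']) for data in datalist],
    ids=[data['TCID'] for data in datalist],
)
def test_template(template_name, file_type):
    # One test case is generated per CSV row, named after its TCID,
    # and the row's values arrive as ordinary function arguments.
    assert template_name
    assert file_type

This avoids exec entirely: pytest collects one test per parameter set and reports each under its own id.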
I've written a simple AWS step functions workflow with a single step:
from stepfunctions.inputs import ExecutionInput
from stepfunctions.steps import Chain, TuningStep
from stepfunctions.workflow import Workflow
import train_utils
def main():
    workflow_execution_role = 'arn:aws:iam::MY ARN'
    execution_input = ExecutionInput(schema={
        'app_id': str
    })
    estimator = train_utils.get_estimator()
    tuner = train_utils.get_tuner(estimator)
    tuning_step = TuningStep(state_id="HP Tuning", tuner=tuner, data={
        'train': f's3://my-bucket/{execution_input["app_id"]}/data/'},
        wait_for_completion=True,
        job_name='HP-Tuning')
    workflow_definition = Chain([
        tuning_step
    ])
    workflow = Workflow(
        name='HP-Tuning',
        definition=workflow_definition,
        role=workflow_execution_role,
        execution_input=execution_input
    )
    workflow.create()


if __name__ == '__main__':
    main()
My goal is to have the train input pulled from the execution JSON provided at runtime. When I execute the workflow (from the Step Functions console), providing the JSON {"app_id": "My App ID"}, the tuning step does not get the right data; instead it gets the string representation of the stepfunctions.inputs.placeholders.ExecutionInput object. Furthermore, when looking at the generated ASL I can see that the execution input was rendered as a string:
...
"DataSource": {
"S3DataSource": {
"S3DataType": "S3Prefix",
"S3Uri": "s3://my-bucket/<stepfunctions.inputs.placeholders.ExecutionInput object at 0x12261f7d0>/data/",
"S3DataDistributionType": "FullyReplicated"
}
},
...
What am I doing wrong?
Update:
As mentioned by @yoodan, the SDK is probably behind, so I'll have to edit the definition before calling create. I can see there is a way to review the definition before calling create, but can I modify the graph definition? How?
The Python SDK for Step Functions generates the state machine definition for you; to accomplish what you want, you need the string concatenation / format capability built into the Amazon States Language.
Recently, in August 2020, the Amazon States Language introduced built-in functions such as string format into its language spec: https://states-language.net/#appendix-b
Unfortunately, the Python SDK is not up to date and does not support the new changes.
As a workaround, maybe manually modify the definition before calling workflow.create()?
Your step function definition is good and it looks like it should work.
It is unclear from your code example where the execution actually happens (you stated directly from the console), which leaves several options that may cause the problem, and I believe that's the source of the issue.
Please provide more information regarding that.
Are you somehow exporting your created Workflow object so it can be executed?
It seems that something is missing... As a sanity check, append the following to your main function:
workflow.execute(inputs={"app_id": "My App ID"})
and check the logs again.
It seems like what you're looking for is not supported by the SDK.
https://github.com/aws/aws-step-functions-data-science-sdk-python/issues/79
You can, however, change the definition before creating the state machine. Here is a function that iterates over the definition dict and replaces placeholders surrounded with {{ }} (e.g. {{app_id}}) with the corresponding intrinsic-function syntax, for example:
s3://my-bucket/{{app_id}}/data/
import re


def get_updated_definition(data):
    if isinstance(data, dict):
        for k, v in data.copy().items():
            if isinstance(v, dict):  # For DICT
                data[k] = get_updated_definition(v)
            elif isinstance(v, list):  # For LIST
                data[k] = [get_updated_definition(i) for i in v]
            elif isinstance(v, str) and re.search(r'{{([a-z_]+)}}', v):  # Update Key-Value
                # data.pop(k)
                # OR
                del data[k]
                keys = re.findall(r'{{([a-z_]+)}}', v)
                data[f"{k}.$"] = f"States.Format('{re.sub(r'{{[a-z_]+}}', '{}', v)}',{','.join(['$.'+k for k in keys])})"
    return data
usage:
workflow_definition = get_updated_definition(workflow.definition.to_dict())
create_state_machine(json.dumps(workflow_definition)) #<-- implement using boto3 (https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/stepfunctions.html#SFN.Client.create_state_machine)
For a python application including the following modules:
commons.py
from functools import lru_cache


@lru_cache(maxsize=2)
def memoized_f(x):
    ...
pipeline_a.py
from commons import memoized_f
x = memoized_f(10)
y = memoized_f(11)
pipeline_b.py
from commons import memoized_f
x = memoized_f(20)
y = memoized_f(21)
Does Python store one memoized_f cache per pipeline_* module, so that in the example above there would be two caches in total for memoized_f? Or,
because the caching is defined on memoized_f, is there only one cache for memoized_f in the application containing all the modules above?
@functools.lru_cache doesn't do any magic. It is a function decorator, meaning it takes the decorated function (here: memoized_f) as input and defines a new function. Essentially it does memoized_f = lru_cache(maxsize=2)(memoized_f) (see here).
So your question boils down to whether a (function in a) module imported by two other modules shares state, to which the answer is yes, there is a common cache. This is because each module is only imported once.
See e.g. the official documentation
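A minimal sketch to observe this at runtime (cache_info() is part of functools.lru_cache; the module names follow the example above):

# check_cache.py -- illustrative only
import pipeline_a  # calls memoized_f(10) and memoized_f(11)
import pipeline_b  # calls memoized_f(20) and memoized_f(21)

from commons import memoized_f

# There is exactly one cache, attached to the memoized_f object itself,
# regardless of how many modules import it. With maxsize=2 and four
# distinct arguments, the expected output is along the lines of
# CacheInfo(hits=0, misses=4, maxsize=2, currsize=2).
print(memoized_f.cache_info())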
In my code I have:
def fn1():
    """
    fn1 description

    Parameters
    ----------
    components_list : list
        List of IDs of the reference groups

    Return
    ------
    ret : str
        A string representing the aggregate expression (See example)

    Example
    -------
    >>> reference_groups = [1,2,3,4]
    >>> expression = suite.getComponentsExpression(reference_groups)
    (tier1 and tier2) or (my_group)
    """
    # some code here
When I run ./manage.py test, it tries to run the example as a doctest. Is there a way to prevent this from happening, e.g. maybe a CLI option to skip docstrings?
Django 1.6 introduced a new test runner, which does not run doctests. So one way to prevent these tests from running is to upgrade Django!
If that's not possible, the proper fix would be to subclass DjangoTestSuiteRunner, disable the doctest loading, and tell Django to use your new test runner with the TEST_RUNNER setting.
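A minimal sketch of such a runner, assuming a pre-1.6 Django where django.test.simple.DjangoTestSuiteRunner collects doctests (the module path myapp.test_runner is hypothetical, and a nested suite may need a recursive walk instead of the flat filter shown here):

# myapp/test_runner.py -- untested sketch
import unittest

from django.test.simple import DjangoTestSuiteRunner


class NoDoctestSuiteRunner(DjangoTestSuiteRunner):
    def build_suite(self, *args, **kwargs):
        suite = super(NoDoctestSuiteRunner, self).build_suite(*args, **kwargs)
        # Older Django versions may collect doctests via a bundled copy of the
        # doctest module, so filter by class name rather than isinstance().
        tests = [t for t in suite if type(t).__name__ != 'DocTestCase']
        return unittest.TestSuite(tests)

Then point Django at it in settings.py with TEST_RUNNER = 'myapp.test_runner.NoDoctestSuiteRunner'.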
Or if that's too tricky, a quick hack would be to change your docstrings so that they don't look like doctests, e.g.
>.> reference_groups = [1,2,3,4]