Where to create multiple variables? - python

I have some python code which reads in a series of user-specified variables from another file, including some if statements for variables that aren't always selected by the user. An example of the code I have is as follows:
import sys
if __name__=="__main__":
    #Read-in user input file
    filename = sys.argv[-1]
    #Import user-defined parameters
    m = __import__(filename)
    directory = m.directory
    experimentNumber = m.experimentNumber
    dataInput = m.dataInput
    outputFile = m.outputFile
    save_flag = m.save_flag
    cores = m.cores
    npts1 = m.npts1
    npts2_recorded = m.npts2_recorded
    weightingFuncNameDim2 = m.weightingFuncNameDim2
    weightingFuncParamsDim2 = m.weightingFuncParamsDim2
    if int(m.ndim) >= 2:
        weightingFuncNameDim3 = m.weightingFuncNameDim3
        weightingFuncParamsDim3 = m.weightingFuncParamsDim3
    else:
        weightingFuncNameDim3 = '-'
        weightingFuncParamsDim3 = '-'
In total, around 50 of these variables are imported from the user-defined file, and all of them are used by other functions later in the code. The way I have coded this seems to make them global variables so that they can easily be used by other functions. Is there a more Pythonic way to do this? I'm not sure what the best approach would be to make these variables available without passing 50 variables into each function, which would be cumbersome.
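One possible way to avoid both globals and 50-argument functions is to bundle the parameters into a single object and pass that object around. The following is only a sketch under assumptions: `load_params`, `describe`, and `OPTIONAL_DEFAULTS` are hypothetical names, and a `types.ModuleType` instance stands in for the user's settings file so that the example runs on its own.

```python
from types import ModuleType, SimpleNamespace

# Hypothetical defaults for parameters the user may leave out.
OPTIONAL_DEFAULTS = {
    "weightingFuncNameDim3": "-",
    "weightingFuncParamsDim3": "-",
}

def load_params(module):
    """Collect the public names of a settings module into one namespace."""
    values = {
        name: value
        for name, value in vars(module).items()
        if not name.startswith("_")
    }
    # Fill in defaults only for names the user did not define.
    for name, default in OPTIONAL_DEFAULTS.items():
        values.setdefault(name, default)
    return SimpleNamespace(**values)

def describe(params):
    """Example consumer: takes one argument instead of ~50."""
    return f"{params.experimentNumber} -> {params.outputFile} ({params.cores} cores)"

# Stand-in for the user's file; in real use the module could come from
# importlib.import_module(name) with the name taken from sys.argv.
user_settings = ModuleType("user_settings")
user_settings.experimentNumber = 7
user_settings.outputFile = "out.dat"
user_settings.cores = 4

params = load_params(user_settings)
print(describe(params))
print(params.weightingFuncNameDim3)
```

Each function then receives `params` and reads only the attributes it needs, which keeps the dependency on the settings explicit without a long argument list.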

Related

Modify or add functions to an existing package

I want to modify functions from a package in Python. To be precise, I'll take the example of the fastkde package. I have two questions.
Looking at the source code of the function pdf_at_points, the object returned is pdf. Instead, I'd like to return the _pdfobj object. How can I do so without modifying the source code? Ideally, I'd like to do it from an arbitrary script, so that my code is "transferable".
The last two lines of the pdf_at_points function call the applyBernacchiaFilter and __transformphiSC_points__ functions, which cannot be called from a script. How can I make them accessible from an arbitrary script?
Here's an idea of the ideal output:
from fastkde import fastKDE
import numpy as np
N = int(2e5)  # size must be an integer, not a float
var1 = 50*np.random.normal(size=N) + 0.1
var2 = 0.01*np.random.normal(size=N) - 300
var3 = 50*np.random.normal(size=N) + 0.1
var4 = 0.01*np.random.normal(size=N) - 300
test_points = list(zip(var3, var4))
# Some lines of code here that I'm looking for,
# that would create the pdf_at_points_modified function
# and that would make applyBernacchiaFilter and __transformphiSC_points__
# functions usable
myPDF = fastKDE.pdf_at_points_modified(var1,var2)
myPDF.applyBernacchiaFilterModified()
pred = myPDF.__transformphiSC_points__(test_points)
pdf_at_points_modified corresponds to the modified pdf_at_points function.
Thanks in advance.

Is it possible to use wildcard_constraints to exclude certain keywords from being matched in Snakemake?

I have a rule that calculates new variables based on a set of existing variables, which are stored in separate files. I have another rule that calculates the averages of all the different variables in the whole database. My problem is that Snakemake tries to find my derived variables in the original database, where of course they do not exist.
Is there a way to constrain the averaging rule so that it calculates the average for all variables except those in a list of derived variables?
Pseudocode of what the rule looks like:
rule calc_average:
    input:
        pi_clim_var = lambda w: get_control_path(w, w.variable),
    output:
        outpath = outdir+'{experiment}/{variable}/{variable}_{experiment}_{model}_{freq}.nc'
    log:
        "logs/calc_average/{variable}_{model}_{experiment}_{freq}.log"
    wildcard_constraints:
        variable= '!calculated1!calculated2' # "orrvar1|orrvar2"....
    notebook:
        "../notebooks/calc_clim.py.ipynb"
I can of course make a list of all the variables that I would like to have in the database and then do:
wildcard_constraints:
    variable="|".join(list_of_vars)
But I was wondering if it is possible to do it the other way round, e.g.:
wildcard_constraints:
    variable="!".join(negate_list_of_vars) # don't match these wildcards
EDIT:
The get_control_path(w, w.variable) function constructs the path to the input file based on a lookup table that uses the wildcards as keys.
def get_control_path(w, variable, grid_label=None):
    if grid_label is None:
        grid_label = config['default_grid_label']
    try:
        paths = get_paths(w, variable,'piClim-control', grid_label,activity='RFMIP', control=True)
    except KeyError:
        paths = get_paths(w, variable,'piClim-control', grid_label,activity='AerChemMIP', control=True)
    return paths
def get_paths(w, variable,experiment, grid_label=None, activity=None, control=False):
    """
    Get CMIP6 model paths in database based on the lookup tables.
    Parameters:
    -----------
    w : snake.wildcards
        a named tuple that contains the snakemake wildcards
    """
    if w.model in ["NorESM2-LM", "NorESM2-MM"]:
        root_path = f'{ROOT_PATH_NORESM}/{CMIP_VER}'
        look_fnames = LOOK_FNAMES_NORESM
    else:
        root_path = f'{ROOT_PATH}/{CMIP_VER}'
        look_fnames = LOOK_FNAMES
    # Fall back to the lookup table when no activity is given
    if not activity:
        activity = LOOK_EXP[experiment]
    model = w.model
    if control:
        variant = config['model_specific_variant']['control'].get(model, config['variant_default'])
    else:
        variant = config['model_specific_variant']['experiment'].get(model, config['variant_default'])
    table_id = TABLE_IDS.get(variable,DEFAULT_TABLE_ID)
    institution = LOOK_INSTITU[model]
    try:
        file_endings = look_fnames[activity][model][experiment][variant][table_id]['fn']
    except Exception:
        raise KeyError(f"File ending is not defined for this combination of {activity}, {model}, {experiment}, {variant} and {table_id} " +
                       "please update config/lookup_file_endings.yaml accordingly")
    if grid_label is None:
        grid_label = look_fnames[activity][model][experiment][variant][table_id]['gl'][0]
    check_path = f'{root_path}/{activity}/{institution}/{model}/{experiment}/{variant}/{table_id}/{variable}/{grid_label}'
    # Try the other known grid labels until an existing directory is found
    if not os.path.exists(check_path):
        grid_labels = ['gr','gn', 'gl','grz', 'gr1']
        i = 0
        while not os.path.exists(check_path) and i < len(grid_labels):
            grid_label = grid_labels[i]
            check_path = f'{root_path}/{activity}/{institution}/{model}/{experiment}/{variant}/{table_id}/{variable}/{grid_label}'
            i += 1
    if control:
        version = config['version']['version_control'].get(w.model, 'latest')
    else:
        version = config['version']['version_exp'].get(w.model, 'latest')
    fname = f'{variable}_{table_id}_{model}_{experiment}_{variant}_{grid_label}'
    paths = expand(
        f'{root_path}/{activity}/{institution}/{model}/{experiment}/{variant}/{table_id}/{variable}/{grid_label}/{version}/{fname}_{{file_endings}}'
        ,file_endings=file_endings)
    # Sometimes the versions are just messed up... try one more time with latest
    if not os.path.exists(paths[0]):
        paths=expand(
            f'{root_path}/{activity}/{institution}/{model}/{experiment}/{variant}/{table_id}/{variable}/{grid_label}/latest/{fname}_{{file_endings}}'
            ,file_endings=file_endings)
    # Sometimes the file endings are different depending on the variable
    if not os.path.exists(paths[0]) and len(paths) >= 2:
        paths = [paths[1]]
    return paths
Your description is a little too abstract for me to fully grasp your intent. You may be able to use a regex to solve this, but depending on the number of variables to consider, matching could be very slow. Here are some other ideas; if they don't seem right, please update your question with a little more context (the rule requesting calc_average and the get_control_path function).
Place derived and original files in different subdirectories. Then you can restrict the average rule to just be the original files.
Incorporate the logic into an input function/expand. Say the requesting rule is doing something like
rule average:
    input: expand('/path/to/{input}', input=[input for input in inputs if input not in negate_list_of_vars])
    output: 'path/to/average'
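As for the regex route mentioned at the start of this answer, a negative lookahead is one way to express "anything except these names". The sketch below only demonstrates the pattern with Python's `re` module. The path layout in `path_regex` and the helper `variable_of` are made up for illustration, and whether the same string behaves identically when passed as `variable=...` in `wildcard_constraints` is an assumption that has not been tested against this workflow.

```python
import re

# Names of the derived variables that should NOT match.
negate_list_of_vars = ["calculated1", "calculated2"]

# Negative lookahead: reject a value that is exactly one of the excluded
# names (i.e. the name is followed by "/" or the end of the string),
# otherwise accept any run of characters that are not path separators.
excluded = "|".join(re.escape(name) for name in negate_list_of_vars)
variable_constraint = rf"(?!(?:{excluded})(?:/|$))[^/]+"

# Hypothetical stand-in for an output path where {variable} is a directory.
path_regex = re.compile(rf"out/(?P<variable>{variable_constraint})/data\.nc")

def variable_of(path):
    """Return the matched variable name, or None if the path is rejected."""
    match = path_regex.fullmatch(path)
    return match.group("variable") if match else None

for path in ["out/tas/data.nc", "out/calculated1/data.nc", "out/calculated10/data.nc"]:
    print(path, "->", variable_of(path))
```

Note that the lookahead checks for a following "/" rather than only the end of the string, because the wildcard sits in the middle of a longer path.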

I want to create and use different variables (with a numeric suffix) in a loop (specifically a for loop)

I want to create multiple variables in a for loop so that I can use and compare them later in the program.
Here is the code:
for i in range(0,len(Header_list)):
    (f'len_{i} = {len(Header_list[i]) + 2 }')
    print(len_0);print(f'len{i}')
    for company in Donar_list:
        print(company[i],f"len_{i}")
        if len(str(company[i])) > len((f"len_{i}")) :
            (f'len_{i}') = len(str(company[i]))
            print(f"len_{i}")
The problem is this: although I managed to create the variables len_0, len_1, len_2, ... in line 2, and in line 3 I can print them directly with print(len_0), I can't print their values with print(f'len_{i}').
In line 5 I also can't make the comparison in the condition work as I intended. I want to create the variables and compare them later as necessary, in this case inside the for loop. What should I do? I am a beginner; I could do it with if statements, but that wouldn't be efficient, and I would also prefer not to create any data structure for this.
I'm not sure I have managed to explain what I mean. In short, I want to create different variables using a suffix and compare them in a for loop in this scenario.
Instead of dynamically creating variables, I would HIGHLY recommend checking out dictionaries.
Dictionaries allow you to store variables with an associated key, as so:
variable_dict = dict()
for i in range(len(Header_list)):
    # Start from the header width plus two characters of padding
    variable_dict[f'len_{i}'] = len(Header_list[i]) + 2
    for company in Donar_list:
        # Widen the entry if a cell is longer than the current value
        if len(str(company[i])) > variable_dict[f'len_{i}']:
            variable_dict[f'len_{i}'] = len(str(company[i]))
    print(f'len_{i}', variable_dict[f'len_{i}'])
This allows you to access the values using the same key:
len_of_4 = variable_dict['len_4']
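To make the dictionary approach concrete, here is a small self-contained sketch. The contents of `Header_list` and `Donar_list` are made up for illustration, since the question does not show the real data, and `column_widths` is a hypothetical name.

```python
# Made-up sample data standing in for the asker's lists.
Header_list = ["Name", "City", "Amount"]
Donar_list = [
    ["Acme Corporation", "Oslo", 1200],
    ["Bolt", "San Francisco", 75],
]

column_widths = {}
for i, header in enumerate(Header_list):
    # Start from the header width plus two characters of padding.
    width = len(header) + 2
    for company in Donar_list:
        # Keep the longest cell seen so far in this column.
        width = max(width, len(str(company[i])))
    column_widths[f"len_{i}"] = width

print(column_widths)  # {'len_0': 16, 'len_1': 13, 'len_2': 8}
```

The keys play the role of the suffixed variable names, so `column_widths["len_1"]` can be read or compared anywhere the program would otherwise have used `len_1`.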
If you REALLY REALLY need to dynamically create variables, you could use the exec function to run strings as python code. It's important to note that the exec function is not safe in python, and could be used to run any potentially malicious code:
for i in range(len(Header_list)):
    # Note: creating variables this way only works reliably at module level
    exec(f'len_{i} = {len(Header_list[i]) + 2}')
    for company in Donar_list:
        # exec() always returns None, so use eval() to evaluate the comparison
        if eval(f'len(str(company[i])) > len_{i}'):
            exec(f'len_{i} = len(str(company[i]))')
    print(f'len_{i} =', eval(f'len_{i}'))
In Python everything is an object, so you can treat the current module as an object and set attributes on it like this:
import sys
module = sys.modules[__name__]
Header_list=[0,1,2,3,4]
len_ = len(Header_list)
for i in range(len_):
    setattr(module, f"len_{i}", Header_list[i]+2)
print(len_0)
print(len_1)
Header_list=[0,1,2,3,4]
for i in range(0,5):
    exec(f'len_{i} = {Header_list[i] + 2 }')
    print(f'len{i}')
output:
len0
len1
len2
len3
len4

dynamic name for a file in Python

I have a subject_id which is a dynamic number. For instance, it could be equal to 60. I am manually defining some file names as follows:
x_file = "50.txt"
x_csv_file = "50.csv"
The number (50) could have been 1 or any other number. Is there any way that I can define subject_id=50 just once and then use it to build the names, as in x_file = "subject_id.txt" and x_csv_file = "subject_id.csv"?
Thanks for your help.
You might want to define a simple function for this:
def file_name(subject_id):
    x_file = '{}.txt'.format(subject_id)
    x_csv_file = '{}.csv'.format(subject_id)
    return x_file, x_csv_file
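A short usage sketch, written with f-strings as an equivalent alternative to `str.format`. The function is repeated here so that the snippet runs on its own.

```python
def file_name(subject_id):
    # Build both file names from the single subject_id value.
    x_file = f"{subject_id}.txt"
    x_csv_file = f"{subject_id}.csv"
    return x_file, x_csv_file

subject_id = 50  # defined once
x_file, x_csv_file = file_name(subject_id)
print(x_file, x_csv_file)  # 50.txt 50.csv
```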

How do I print out the constant names when printing all the constants in a C file using pycparser?

I'm working on automating a tool that prints out all constants in a C file. So far, I have managed to print out all the constants, but I can't figure out a way to show the variable names they are associated with without printing out the whole abstract syntax tree, which contains a lot of information I don't need. Does anyone have any ideas? Right now, it prints out the constants and their types. Here is my code:
import sys
from pycparser import c_parser, c_ast, parse_file
class ConstantVisitor(c_ast.NodeVisitor):
    def __init__(self):
        self.values = []
    def visit_Constant(self, node):
        self.values.append(node.value)
        node.show(showcoord=True,nodenames=True,attrnames=True)
def show_tree(filename):
    # Note that cpp is used. Provide a path to your own cpp or
    # make sure one exists in PATH.
    ast = parse_file(filename, use_cpp=True,cpp_args=['-E', r'-Iutils/fake_libc_include'])
    cv = ConstantVisitor()
    cv.visit(ast)
if __name__ == "__main__":
    if len(sys.argv) > 1:
        filename = sys.argv[1]
    else:
        filename = 'xmrig-master/src/crypto/c_blake256.c'
    show_tree(filename)
edit:
current output: constant: type=int, value=0x243456BE
desired output: constant: type=int, name=variable name constant belongs to(usually an array name), value=0x243456BE
You may need to create a more sophisticated visitor if you want to retain information about parent nodes of the visited node. See this FAQ answer for an example.
That said, you will also need to define your goal much more clearly. Constant nodes are not always associated with a variable. For example:
return a + 30 + 50;
Has two Constant nodes in it (for 30 and for 50); what variable are they associated with?
Perhaps what you're looking for is variable declarations - Decl nodes with names. Then once you find a Decl node, do another visiting under this node looking for all Constant nodes.
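The two-pass idea above (find named Decl nodes, then look for Constant nodes beneath each one) might look roughly like this. It is a minimal sketch: the class names `ConstantCollector` and `DeclConstantVisitor` and the sample C source are made up for illustration, only the initializer of each declaration is searched, and constants that are not part of a declaration (such as those in `return a + 30 + 50;`) are deliberately ignored.

```python
from pycparser import c_parser, c_ast

class ConstantCollector(c_ast.NodeVisitor):
    """Collect (type, value) for every Constant under the visited node."""
    def __init__(self):
        self.constants = []

    def visit_Constant(self, node):
        self.constants.append((node.type, node.value))

class DeclConstantVisitor(c_ast.NodeVisitor):
    """Record (name, type, value) for constants in declaration initializers."""
    def __init__(self):
        self.found = []

    def visit_Decl(self, node):
        if node.init is None:
            return  # declaration without an initializer
        collector = ConstantCollector()
        collector.visit(node.init)
        for const_type, value in collector.constants:
            self.found.append((node.name, const_type, value))

# Made-up sample source; it needs no preprocessing, so CParser is enough.
source = """
static const unsigned int tbl[2] = {0x243F6A88, 0x85A308D3};
int x = 5;
int y;
"""

ast = c_parser.CParser().parse(source)
visitor = DeclConstantVisitor()
visitor.visit(ast)
for name, const_type, value in visitor.found:
    print(f"constant: type={const_type}, name={name}, value={value}")
```

For a real file, the `ast` could instead come from `parse_file` as in the question; the visitor itself would stay the same.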
