I am trying to create one table from each call of a function and append the tables together. The first call works and fills the table corr_tb with its output. But when I call the function again, expecting the new output to be appended to corr_tb, nothing happens: corr_tb does not change. Is this a global versus local variable issue?
corr_tb = pd.DataFrame()
def corr_tbl(df, key, var, with_var):
    #calculate correlation by key
    output = pd.DataFrame(df.groupby(key)[[var, with_var]].corr().ix[1::2,var]).reset_index()[key+[var]]
    global corr_tb
    if corr_tb.empty:
        corr_tb = output
    else:
        corr_tb.append(output)
    #print(output.head()) #result could be print but cannot be appended
#call function
corr_tbl(final, ['key1','key2'], 'var1','Sales')
corr_tbl(final, ['key1','key2'], 'var2','Sales')
The pandas.DataFrame.append method does not modify the original object; it returns a new one. Assign the result back to corr_tb:
def corr_tbl(...):
    # did some calculation here
    global corr_tb
    if ...:
        # unchanged
    else:
        corr_tb = corr_tb.append(output)  # <<<<<< try modifying this line
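Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions the same idea is usually written with pd.concat. Below is a minimal sketch of one way to do it without a global: the function returns its table and the caller concatenates the results. The sample data in final, the "corr" and "variable" column names, and the use of groupby(...).apply are assumptions made for illustration, not the asker's actual code.

```python
import pandas as pd

def corr_tbl(df, key, var, with_var):
    """Return a new table with the per-group correlation of var and with_var."""
    corr = df.groupby(key).apply(lambda g: g[var].corr(g[with_var]))
    out = corr.rename("corr").reset_index()
    out["variable"] = var  # remember which variable this table describes
    return out

# Hypothetical sample data standing in for the asker's `final` table.
final = pd.DataFrame({
    "key1": ["a", "a", "a", "b", "b", "b"],
    "key2": ["x", "x", "x", "y", "y", "y"],
    "var1": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "var2": [3.0, 2.0, 1.0, 1.0, 3.0, 2.0],
    "Sales": [10.0, 20.0, 30.0, 30.0, 20.0, 10.0],
})

# Collect one table per call, then combine them once at the end.
tables = [corr_tbl(final, ["key1", "key2"], v, "Sales") for v in ("var1", "var2")]
corr_tb = pd.concat(tables, ignore_index=True)
print(corr_tb)
```

Returning the table and concatenating in the caller avoids the global statement entirely, and concatenating once at the end is generally cheaper than growing a DataFrame on every call.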
for i in vl:
    if i.startswith("$"):
        print(i.split(" ")[0])
I want to store the output of that last print statement in a variable, but I don't know how. Assigning it to a variable inside the for loop, or after it, gives a "doesn't do anything" error.
Define a variable before the loop and then fill it. This way the variable exists even if the if condition is never true, e.g.:
variable = 'Default' # create variable
for i in vl:
    if i.startswith('$'):
        variable = i.split(' ')[0] # fill variable if the if condition is true
print(variable)
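The approach above keeps only the last matching value. If every match is needed, one option is to collect them in a list. The sketch below shows both variants; the contents of vl are made-up sample data, since the original list is not shown.

```python
# Hypothetical sample data; the real contents of vl are not shown in the question.
vl = ["$12.50 total", "tax included", "$3.00 shipping"]

# Variant 1: keep only the last match, with a fallback if nothing matches.
last_match = None
for item in vl:
    if item.startswith("$"):
        last_match = item.split(" ")[0]

# Variant 2: keep every match by building a list.
all_matches = [item.split(" ")[0] for item in vl if item.startswith("$")]

print(last_match)
print(all_matches)
```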
Hello, I am pretty new to Python and I want to do the following:
I have a function that opens a file, reads the file, closes the file and returns the data:
def getFastaFromFile(filename):
    """ Read a fasta file (filename) from disk and return
    its full contents as a string"""
    inf=open(filename)
    data=inf.read()
    inf.close()
    return data
The returned data is a few lines of text.
What I want is another function that takes the data from the first function and uses the .readlines(), .readline() and .count() commands on it.
My second function:
def printTableFromFasta(fastarec):
    a= data.readlines()
    for i in range(a)
        b= data.readline()
        c= b.count('A')
        print(c)
As output I would like to print the number of times the string "A" appears in each line of the data. The problem with this code is that data is not recognized.
First, you need to pass the data you want to read into the second function as a parameter, like so:
def printTableFromFasta(data):
To get this data from your first function, return the contents of the file as a list of lines:
def getFastaFromFile(filename):
    with open(filename, 'r') as inf: # handles open and close
        data = inf.readlines() # Returns the entire file as a list of strings
    return data
Your function call will look something like this
printTableFromFasta(getFastaFromFile(filename))
Then, in your second function, you don't need to call readlines because data is already a list of lines.
def printTableFromFasta(data):
    for line in data: # look at each line
        print(line.count('A')) # count 'A'
Edit:
To read the file directly in the second function, without using the first function:
def printTableFromFasta(filename):
    with open(filename, 'r') as inf: # handles open and close
        for line in inf.readlines(): # look at each line
            print(line.count('A')) # count 'A'
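Putting the pieces together, here is a small self-contained sketch of the two-function version. It writes a made-up FASTA-like file to a temporary location so it can be run as is; the function names get_fasta_lines and count_per_line and the file contents are illustrative assumptions. It returns the counts instead of printing inside the function, which makes the result easier to reuse.

```python
import os
import tempfile

def get_fasta_lines(filename):
    """Read a file and return its lines as a list of strings."""
    with open(filename) as inf:
        return inf.readlines()

def count_per_line(lines, letter="A"):
    """Return how many times letter occurs in each line."""
    return [line.count(letter) for line in lines]

# Hypothetical sample file, written to a temporary path for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">seq1\nAACGTA\nGGCA\n")
    path = tmp.name

# The return value of the first function is passed straight into the second.
counts = count_per_line(get_fasta_lines(path))
os.remove(path)

for count in counts:
    print(count)
```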
Remember that the data variable in the first function is local. It cannot be accessed from outside the function it is defined in.
For example, getName() below returns a value that is called data only inside the function; outside, you get that value by calling the function and assigning the result.
def getName(user_id):
    data = "Your name is " + str(user_id)
    return data
# Throws an error, because data is undefined outside the function
name = getName("Bobby")
print(data)
# Working code, prints "Your name is Bobby"
name = getName("Bobby")
print(name)
There are no rules against calling one function from inside another.
Instead of a = data.readlines(), try a = getFastaFromFile("dna.fasta"), and change data = inf.read() to data = inf.readlines() in the first function.
This is probably a very basic question but I haven't been able to figure this out.
I'm currently using the following to append values to an empty list
shoes = {'groups':['running','walking']}
df_shoes_group_names = pd.DataFrame(shoes)
shoes_group_name=[]
for type in df_shoes_group_names['groups']:
    shoes_group_name.append(type)
shoes_group_name
['running', 'walking']
I'm trying to accomplish the same using a function; however, when I call the function the list comes back empty:
shoes_group_name=[]
def list_builder(dataframe_name):
    if 'shoes' in dataframe_name:
        for type in df_shoes_group_names['groups']:
            shoes_group_name.append(type)
list_builder(df_shoes_group_names)
shoes_group_name
[]
The reason for the function is that eventually I'll have multiple DataFrames for different products, so I'd like to have if statements within the function to handle the creation of each list.
For example, a future version could look like this:
df_shoes_group_names
df_boots_group_names
df_sandals_group_names
shoes_group_name=[]
boots_group_name=[]
sandals_group_name=[]
def list_builder(dataframe_name):
    if 'shoes' in dataframe_name:
        for type in df_shoes_group_names['groups']:
            shoes_group_name.append(type)
    elif 'boots' in dataframe_name:
        for type in df_boots_group_names['groups']:
            boots_group_name.append(type)
    elif 'sandals' in dataframe_name:
        for type in df_sandals_group_names['groups']:
            sandals_group_name.append(type)
list_builder(df_shoes_group_names)
list_builder(df_boots_group_names)
list_builder(df_sandals_group_names)
Not sure if I'm approaching this the right way so any advice would be appreciated.
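You should never call or search a variable name as if it were a string. The test 'shoes' in dataframe_name does not look at the variable's name: for a DataFrame, in checks the column labels, and the only column here is 'groups', so the condition is False and nothing is appended.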
Instead, use a dictionary to store a variable number of variables.
Bad practice
# dataframes
df_shoes_group_names = pd.DataFrame(...)
df_boots_group_names = pd.DataFrame(...)
df_sandals_group_names = pd.DataFrame(...)
def foo(x):
    if shoes in df_shoes_group_names: # <-- THIS WILL NOT WORK
        # do something with x
Good practice
# dataframes
df_shoes_group_names = pd.DataFrame(...)
df_boots_group_names = pd.DataFrame(...)
df_sandals_group_names = pd.DataFrame(...)
dfs = {'shoes': df_shoes_group_names,
       'boots': df_boots_group_names,
       'sandals': df_sandals_group_names}
def foo(key):
    if 'shoes' in key: # <-- THIS WILL WORK
        # do something with dfs[key]
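As a rough illustration of the dictionary approach, the sketch below stores the DataFrames under string keys and builds one list per key, so no if/elif chain is needed. The boots and sandals values and the group_names dictionary are invented for the example; only the shoes data comes from the question.

```python
import pandas as pd

# One DataFrame per product, looked up by a string key.
# Only the 'shoes' values come from the question; the rest are made up.
dfs = {
    "shoes": pd.DataFrame({"groups": ["running", "walking"]}),
    "boots": pd.DataFrame({"groups": ["hiking", "work"]}),
    "sandals": pd.DataFrame({"groups": ["beach"]}),
}

def list_builder(key):
    """Return the 'groups' column of the DataFrame stored under key as a list."""
    return dfs[key]["groups"].tolist()

# One list per product, again stored in a dictionary instead of separate variables.
group_names = {key: list_builder(key) for key in dfs}
print(group_names["shoes"])

# Why the original test failed: `in` on a DataFrame checks column labels.
print("shoes" in dfs["shoes"])   # False, there is no column named 'shoes'
print("groups" in dfs["shoes"])  # True
```

Adding a new product then only requires a new entry in dfs; list_builder itself does not change.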
I am new to Python. I want to call a function that detects empty values in an Excel table and passes "Fail" to a variable stat.
def checkBlank(tb):
    book = xlrd.open_workbook(reportFile)
    sheet = book.sheet_by_name(tb)
    s = "Pass"
    for i in range(sheet.nrows):
        row = sheet.row_values(i)
        for cell in row:
            if cell == '':
                s = "Fail"
    return s
print(checkBlank('Sheet1'))
The code above returns "Fail",
but the code below gives: NameError: name 'stat' is not defined
def checkBlank(tb, stat):
    book = xlrd.open_workbook(reportFile)
    sheet = book.sheet_by_name(tb)
    s = "Pass"
    for i in range(sheet.nrows):
        row = sheet.row_values(i)
        for cell in row:
            if cell == '':
                s = "Fail"
print(checkBlank('Sheet1', stat))
print(stat)
How can I assign "Fail" to stat if the function finds an empty cell?
It appears that you are attempting to pass a variable into a function to be used as the return value. This is not Pythonic and in general will not work. Instead, your function should return a value, and you should assign that value to a new "variable."
def function(arg):
    if arg:
        return 'Pass'
    else:
        return 'Fail'
status = function(False)
print(status) # 'Fail'
In Python, you don't want to write a function that relies on call by reference, because variables don't exist in Python in the same way that they do in C. Instead, Python has names, which are more like labels bound to objects: assigning to a parameter inside a function rebinds the local name only and leaves the caller's name untouched. This article and many other Stack Overflow answers go into this in more depth.
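To make the point about names concrete, here is a small sketch. set_status tries to change its argument and has no effect on the caller, while check_blank returns the result so the caller can assign it. Both functions are hypothetical, and a list of rows stands in for the xlrd sheet so the example runs without an Excel file.

```python
def set_status(stat):
    # Rebinds the local name only; the caller's variable is unaffected.
    stat = "Fail"

def check_blank(rows):
    """Return "Fail" if any cell is an empty string, otherwise "Pass"."""
    for row in rows:
        for cell in row:
            if cell == "":
                return "Fail"
    return "Pass"

stat = "Pass"
set_status(stat)
unchanged = stat  # still "Pass": the function could not rebind our name

# Hypothetical rows standing in for sheet.row_values(i) from the question.
rows = [["id", "name"], [1, ""]]
stat = check_blank(rows)  # assign the returned value instead

print(unchanged)
print(stat)
```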
def make(node): # takes some input
    for reg_names in reg.names # dont worry about reg_names and reg.names
        if reg.size > 0: #reg.size is an inbuilt function
            found_dict = {} # first dictionary
            found_dict['reg.name'] = 'reg.size' # i want to save the name of the register : size of the register in the format name : size
        else:
            not_found_dict = {}
            not_found_dict['reg.name'] = 'reg.size' #again, i want to save the name of the register : size of the register in the format name : size
    return found_dict, not_found_dict
OK, can you tell me whether the constructs in the for loop above for creating the dictionaries (found_dict and not_found_dict) are correct, assuming reg.name and reg.size are valid constructs?
I then want to use found_dict in function_one and not_found_dict in function_two like below:
def function_one(input): # should this input be the function 'make' as I only want found_dict?
    for name, size in found_dict.items(): #just for the names in found_dict
        name_pulled = found_dict['reg.name'] # save the names temporarily to name_pulled using the key reg.name of found_dict
        final_names[] = final_names.append(name_pulled) #save names from name_pulled into the list final_names and append them through the for loop. will this work?
def function_two(input): # i need not_found_dict so what should this input be?
    for name, size in not_found_dict.items(): #using the names in not_found_dict
        discard_name_pulled = not_found_dict['reg.name'] # save the names temporarily to discard_name_pulled using on the 'reg.name' from not_found_dict which is essentially the key to the dict
        not_used_names[] = not_used_names.append(discard_name_pulled) # in the same way in function_one, save the names to the list not_used_names and append them through the for loop. Will this construct work?
The main question is: since make returns two dictionaries (found_dict and not_found_dict), how do I correctly pass found_dict to function_one and not_found_dict to function_two?
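First, in the for loop of make, every time you run found_dict = {} or not_found_dict = {} you replace the dictionary with a new empty one, so earlier entries are lost. I'm not sure this is what you want; if not, create both dictionaries once before the loop. Also, found_dict['reg.name'] = 'reg.size' stores the literal strings 'reg.name' and 'reg.size'; to store the actual name and size, drop the quotes: found_dict[reg.name] = reg.size.
Second, if you want to return more than one thing from a function, you can return them as a list or a tuple, something like this: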
return [found_dict, not_found_dict]
Look at this question for more information.
After you return your list or tuple, you can store it in another variable like this:
result=make(inputVariable)
This lets you use each element as you want:
result[0]
result[1]
You can pass them into the functions you want like this:
def function_one(inputParameter, found_dict):
    #code ....
def function_two(inputParameter, not_found_dict):
    #code ....
function_one(inputVariable, result[0])
function_two(inputVariable, result[1])
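A runnable sketch of the whole flow might look like the following. The Register namedtuple and the sample registers are hypothetical stand-ins for the asker's reg objects, which are not shown. Both dictionaries are created once before the loop, and the returned tuple is unpacked into two names instead of being indexed.

```python
from collections import namedtuple

# Hypothetical stand-in for the register objects in the question.
Register = namedtuple("Register", ["name", "size"])

def make(registers):
    """Split registers into two dicts of name -> size, based on size > 0."""
    found_dict, not_found_dict = {}, {}  # created once, before the loop
    for reg in registers:
        if reg.size > 0:
            found_dict[reg.name] = reg.size
        else:
            not_found_dict[reg.name] = reg.size
    return found_dict, not_found_dict  # returned together as a tuple

def function_one(found_dict):
    """Return the names of the registers that were found."""
    return list(found_dict)

def function_two(not_found_dict):
    """Return the names of the registers that were not found."""
    return list(not_found_dict)

registers = [Register("r0", 4), Register("r1", 0), Register("r2", 8)]

# Tuple unpacking gives each dictionary its own name.
found_dict, not_found_dict = make(registers)
final_names = function_one(found_dict)
not_used_names = function_two(not_found_dict)

print(final_names)
print(not_used_names)
```

Note that list.append returns None, so a line such as final_names = final_names.append(name) would replace the list with None; returning a new list, as above, sidesteps that.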