I am working on a project and I have a list of lists containing names, monetary values, etc. I am running into trouble trying to update the individual sub-lists within the primary list when a user enters a value.
For example, my list contains 4 columns (constant) and an indeterminate number of rows based on user entries. I am including the whole program for reference in case there are questions about what it all looks like:
spacing = '- ' * 45  # formatting for DONOR header
data_list = [['NAMES', 'DONATION AMOUNT', 'Number of Gifts', 'Avg Gifts'],
             ['Rudolph S', 1500, 3, 0],
             ['Josef M', 250, 5, 0],
             ['Joye A', 5000, 2, None],
             ['Joni M', 2750, 1, None],
             ['Rachelle L', 750, 3, None],
             ['Vena U', 1000, 7, None],
             ['Efrain L', 10000, 1, None],
             ['Mee H', 15000, 2, None],
             ['Tanya E', 50000, 1, None],
             ['Garrett H', 800, 2, None]]
def addtolist():
    """Method for sending 'Thank You' messages to Donors, using names *"""
    while True:
        print("Enter the name of the person you are writing to (or enter 'list' to see a list of names or Q to quit) ")
        fname_prompt = input("First Name: ").strip().capitalize()
        if fname_prompt.upper() == "Q":
            break
        elif fname_prompt.lower() == "list":
            if len(data_list) - 1 % 2 != 0:
                for i in range(0, int(len(data_list) - 1 / 2)):
                    cut_off = int((len(data_list)) / 2)
                    if i == 0:
                        print(spacing)
                        print('{:>44s}'.format(str(data_list[i][0])))
                        print(spacing)
                    elif cut_off + i >= len(data_list):
                        continue
                    else:
                        print('{:>30s}'.format(data_list[i][0]), '{:>35s}'.format(data_list[cut_off + i][0]))
            else:
                if i == 0:
                    print(spacing)
                    print('{:>20s}'.format(str(data_list[i])))
                    print(spacing)
                else:
                    print('{:>15s}'.format(data_list[i][0]), '{:>30s}'.format(data_list[cut_off + i][0]))
        else:
            lname_prompt = input("Last Name: ").strip().capitalize()
            if lname_prompt.upper() == "Q":
                break
            elif lname_prompt.lower() == "list":
                if len(data_list) - 1 % 2 != 0:
                    for i in range(0, int(len(data_list) - 1 / 2)):
                        cut_off = int((len(data_list)) / 2)
                        if i == 0:
                            print(spacing)
                            print('{:>44s}'.format(str(data_list[i][0])))
                            print(spacing)
                        elif cut_off + i >= len(data_list):
                            continue
                        else:
                            print('{:>30s}'.format(data_list[i][0]), '{:>35s}'.format(data_list[cut_off + i][0]))
                else:
                    if i == 0:  # for each item in list / 2 (5 x)
                        print(spacing)
                        print('{:>20s}'.format(str(data_list[i][0])))
                        print(spacing)
                    else:
                        print('{:>15s}'.format(data_list[i][0]), '{:>30s}'.format(data_list[cut_off + i][0]))
            else:
                full_name = fname_prompt + " " + lname_prompt
                if full_name != "List List" or full_name != "list ":
                    name_found = False
                    for vals in data_list:
                        if full_name in vals:
                            name_found = True
                        else:
                            name_found = False
                    if name_found is False:
                        add_name = input("That name is not in the Donor list. Do you want to add it to the list? ").upper()
                        if add_name == "Y":
                            data_list.append([full_name])
                            if len(data_list) - 1 % 2 != 0:
                                for i in range(0, int(len(data_list) - (len(data_list) - 2) / 2)):
                                    cut_off = int((len(data_list)) / 2)
                                    if i == 0:
                                        print(spacing)
                                        print('{:>44s}'.format(str(data_list[i][0])))
                                        print(spacing)
                                    elif cut_off + i >= len(data_list):
                                        print('{:>30s}'.format(data_list[i][0]))
                                        continue
                                    else:
                                        print('{:>30s}'.format(data_list[i][0]), '{:>35s}'.format(data_list[cut_off + i][0]))
                            else:
                                if i == 0:  # for each item in list / 2 (5 x)
                                    print(spacing)
                                    print('{:>20s}'.format(str(data_list[i][0])))
                                    print(spacing)
                                else:
                                    print('{:>15s}'.format(data_list[i][0]), '{:>30s}'.format(data_list[cut_off + i][0]))
                    donation_amt = int(input("Enter in the donation amount from Donor {0}: $".format(full_name)))
                    print('{0} has donated ${1}'.format(full_name, donation_amt))
                    data_list.append(donation_amt)  # difficulty HERE
                    print(data_list)
The main line I am having difficulty with is at the very end, marked with the comment "difficulty HERE":
data_list.append(donation_amt) # difficulty HERE
I am trying to make this work so that when the user enters a new name and a new donation amount (or simply selects an existing name and attaches a donation amount to it), the program appends or inserts the monetary value into the associated sub-list (the one for the name it belongs to). The way I have it set up now, it just appends the amount to the end of the outer list, and I have been unsuccessful in attaching the value to the sub-list. Has anyone done anything like this before?
Two-dimensional lists in Python are merely lists of lists. Thus, each element of data_list is, itself, a list. Here is an example of accessing an element, the first element below your row of headers (thus, index 1):
>>> first_entry = data_list[1]
>>> first_entry
['Rudolph S', 1500, 3, 0]
Since data_list[1] (which we have stored in a variable called first_entry) is itself a list, we can access its fourth element (at index 3, since list indexing begins at 0) as follows:
>>> first_entry = data_list[1]
>>> fourth_element = first_entry[3]
>>> fourth_element
0
Or, more succinctly:
>>> data_list[1][3]
0
So, to begin to answer your question, if your goal was to update the donation amount of "Joye A", you would use data_list[3][1] = donation_amt. This is because Joye's entry is at index 3 of the main list and donations are recorded at index 1 of her sub list.
Unfortunately, this doesn't really solve your problem, since you want to take an arbitrary name for which to either create a new entry or update an existing entry. The real answer here is that you are using the wrong data structure. For the sake of educational value, though, I'll go ahead and describe how you could do this with your existing structure.
Using your matrix
First you need to determine whether the name already exists. To do that, it is easiest to build an extra 1-d list containing only the first column. You could do this in any number of ways; I'll show it as a list comprehension:
>>> names_only = [e[0] for e in data_list]
>>> names_only
['NAMES', 'Rudolph S', 'Josef M', ...]
I won't explain this here, but there are plenty of threads explaining how list comprehensions work for any readers who aren't aware.
Next, check whether the name already exists in the matrix:
>>> 'Josef M' in names_only
True
If so, you now need to find the index of the name you're looking for. Lists in Python have an index function:
>>> idx = names_only.index('Josef M')
>>> idx
2
You now update his donation amount as described above:
>>> data_list[idx][1] = donation_amt
If he isn't in the matrix yet, we need to make a whole new row. Let's imagine we're processing a donor named 'bob' who is not yet in the matrix. Here you use append on the outer list:
>>> data_list.append(['bob', donation_amt, 1, None])
Where 1 and None can be replaced with whatever your default values are. Putting it all together as a function:
>>> def update_or_create(name, amt):
...     names = [e[0] for e in data_list]
...     if name in names:
...         idx = names.index(name)
...         data_list[idx][1] = amt
...     else:
...         data_list.append([name, amt, 1, None])
This should do what you're asking for.
Finally
It would be better to use a different structure for this. I would propose a dict structure like:
new_structure = {NAME: {'donation': DONATION_AMT, 'num_gifts': NUM_GIFTS, 'avg_amt': AVERAGE_DONATION},...}
Without going into too much detail, following this format would allow the following function to perform the same task:
>>> def update_or_create(name, amt):
...     if name in new_structure:
...         new_structure[name]['donation'] = amt
...     else:
...         new_structure[name] = {'donation': amt, 'num_gifts': 1, 'avg_amt': None}
Much nicer.
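To make the dict idea concrete, here is a minimal runnable sketch. It assumes the same row layout as the question (header row first) and uses a shortened copy of the data; the name donors and the sample donor 'Bob B' are only illustrative.

```python
# Header row first, then one row per donor (shortened copy of the question's data).
data_list = [['NAMES', 'DONATION AMOUNT', 'Number of Gifts', 'Avg Gifts'],
             ['Rudolph S', 1500, 3, 0],
             ['Josef M', 250, 5, 0],
             ['Joye A', 5000, 2, None]]

# Build {name: {...}} from every row after the header.
donors = {
    name: {'donation': donation, 'num_gifts': num_gifts, 'avg_amt': avg_amt}
    for name, donation, num_gifts, avg_amt in data_list[1:]
}

def update_or_create(name, amt):
    """Set the donation for an existing donor, or add a new donor record."""
    if name in donors:
        donors[name]['donation'] = amt
    else:
        donors[name] = {'donation': amt, 'num_gifts': 1, 'avg_amt': None}

update_or_create('Josef M', 300)      # existing donor: value is overwritten
update_or_create('Bob B', 50)         # new donor: record is created
print(donors['Josef M']['donation'])  # 300
print(donors['Bob B'])
```

The lookup by name is a single dict access, so there is no need to keep a separate list of names or track row indexes.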
Two-dimensional lists (lists of lists) let you append either to the main list or to one of the sub-lists.
data_list.append(donation_amt)
would append to the main list, which means that with a list like yours the bare value ends up at the very end, after the last row:
[['NAMES', 'DONATION AMOUNT', 'Number of Gifts', 'Avg Gifts'],
['Rudolph S', 1500, 3, 0],
['Josef M', 250, 5, 0],
['Joye A', 5000, 2, None],
['Joni M', 2750, 1, None],
['Rachelle L', 750, 3, None],
['Vena U', 1000, 7, None],
['Efrain L', 10000, 1, None],
['Mee H', 15000, 2, None],
['Tanya E', 50000, 1, None],
['Garrett H', 800, 2, None],
donation_amt]
If you want to add a donation amount to the sub-list at a specific index, use
data_list[index].append(donation_amt)
Please let me know if this doesn't work or if you want a more detailed explanation; if it doesn't work, there may be a different issue.
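A small sketch of the difference between the two calls, using a two-row stand-in for the donor list (the variable name rows is just for illustration):

```python
rows = [['Rudolph S', 1500], ['Josef M', 250]]

rows.append(100)       # outer list grows: the bare number becomes a third element
print(rows)            # [['Rudolph S', 1500], ['Josef M', 250], 100]

rows.pop()             # undo the append above
rows[1].append(100)    # the row at index 1 grows instead
print(rows)            # [['Rudolph S', 1500], ['Josef M', 250, 100]]
```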
It seems you have 4 static columns and an indeterminate number of rows.
Have you considered using a list of dictionaries, something like a json doc?
data_list = [{
'NAME':'Rudolph S',
'DONATION AMOUNT' : 1500,
'Number of Gifts' : 3,
'Avg Gifts' : 0
},{
'NAME':'Josef M',
'DONATION AMOUNT' : 250,
'Number of Gifts' : 5,
'Avg Gifts' : None
}]
And so on. I think you might have an easier time working with the data if you can reference the individual keys and update their values, instead of working with lists and index values.
In order to append donation_amt properly to the correct sublist, you need to first determine the index in the list where the donor belongs. Once you find the index, you can then append the donation amount to the sublist at that index. To achieve this, replace:
data_list.append(donation_amt) # difficulty HERE
with:
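# Determine index where the donor belongs
idx = -1
for item in range(0, len(data_list)):
    if data_list[item][0] == full_name:
        idx = item
        break
# Append to the sublist (only if the donor was found; with idx == -1 the
# value would silently be appended to the last row instead)
if idx != -1:
    data_list[idx].append(donation_amt)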
I tried this out and it works for me.
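One caveat with an index lookup like this is the case where the name is not found. The following is a sketch only, with hypothetical helper names (find_donor_index, add_donation) and the assumption that a brand-new donor should get a fresh row; it returns None for "not found" so the caller has to handle that case explicitly.

```python
data_list = [['NAMES', 'DONATION AMOUNT', 'Number of Gifts', 'Avg Gifts'],
             ['Rudolph S', 1500, 3, 0],
             ['Josef M', 250, 5, 0]]

def find_donor_index(rows, full_name):
    """Return the index of the row whose first column equals full_name, else None."""
    for idx, row in enumerate(rows):
        if idx > 0 and row[0] == full_name:   # skip the header row at index 0
            return idx
    return None

def add_donation(rows, full_name, donation_amt):
    """Append the amount to the donor's row, creating the row if needed (assumption)."""
    idx = find_donor_index(rows, full_name)
    if idx is None:
        rows.append([full_name, donation_amt])   # new donor: start a fresh row
    else:
        rows[idx].append(donation_amt)           # known donor: extend that row

add_donation(data_list, 'Josef M', 40)
add_donation(data_list, 'Bob B', 10)
print(data_list[2])    # ['Josef M', 250, 5, 0, 40]
print(data_list[-1])   # ['Bob B', 10]
```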
I have a dataframe with 3 million rows. I need to transform the values in a column. The column contains strings joined together with ";". The transformation involves breaking up the string into its components and then choosing one of the strings based on some priority rules.
Here is the sample dataset and the function:
import pandas as pd
data = {'Name': ['X1', 'X2', 'X3', 'X4', 'X5','X6'], 'category': ['CatA;CatB', 'CatB', None, 'CatB;CatC;CatA', 'CatA;CatB', 'CatB;CatD;CatB;CatC;CatA']}
sample_dataframe = pd.DataFrame(data)
def cat_name(x):
    if x:
        x = pd.Series(x.split(";"))
        y = x[(x!='CatA') & x.notna()]
        custom_dict = {'CatC': 0, 'CatD':1, 'CatB': 2, 'CatE': 3}
        if x.count() == 1:
            return x.iloc[0]
        elif y.count() > 1:
            y = y.sort_values(key=lambda x: x.map(custom_dict))
            if y.count() > 2:
                return '3 or more'
            else:
                return y.iloc[0]+'+'
        elif y.count() == 1:
            return y.iloc[0]
        else:
            return None
    else:
        return None
I am using the apply method test_data = sample_dataframe['category'].apply(cat_name) to run the function on the column. For my dataset of 3 million rows, the function takes almost 10 minutes to run.
How can I optimize the function to run faster?
Also, I have two sets of category rules, and the output categories need to be stored in two columns. Currently I am using the apply function twice. Kind of dumb and slow, I know, but it works.
Is there a way to run the function at the same time for a different priority dictionary and return two output values? I tried to use
test_data['CAT_NAME'], test_data['MAIN_CAT_NAME']=zip(*sample_dataframe['category'].apply(joint_cat_name)) with the function
def joint_cat_name(x):
    cat_string = x
    if cat_string:
        string_series = pd.Series(cat_string.split(";"))
        y = string_series[(string_series!='CatA') & string_series.notna()]
        custom_dict = {'CatB': 0, 'CatC':1, 'CatD': 2, 'CatE': 3}
        if string_series.count() == 1:
            return string_series.iloc[0], string_series.iloc[0]
        elif y.count() > 1:
            y = y.sort_values(key=lambda x: x.map(custom_dict))
            if y.count() > 2:
                return '3 or more', y.iloc[0]
            elif y.count() == 1:
                return y.iloc[0]+'+', y.iloc[0]
        elif y.count() == 1:
            return y.iloc[0], y.iloc[0]
        else:
            return None, None
    else:
        return None, None
But I got the error TypeError: 'NoneType' object is not iterable when the zip function encountered a tuple containing Nones, i.e. it threw an error when the output was (None, None).
Thanks a lot in advance.
Your function does a lot of unnecessary work. Even if you just reorder some conditionals it will run much faster.
custom_dict = {"CatC": 0, "CatD": 1, "CatB": 2, "CatE": 3}
def cat_name(x):
    if x is None:
        return x
    xs = x.split(";")
    if len(xs) == 1:
        return xs[0]
    ys = [x for x in xs if x != "CatA"]
    l = len(ys)
    if l == 0:
        return None
    if l == 1:
        return ys[0]
    if l == 2:
        return min(ys, key=lambda k: custom_dict[k]) + "+"
    if l > 2:
        return "3 or more"
Faster than running one Python function on each row might be to make several passes over your dataframe, each using a vectorized pandas operation. You'd have to rewrite your code something like this:
# select empty categories
no_cat = sample_dataframe['category'].isna()
# select category strings with only one category
single_cat = ~no_cat & (sample_dataframe['category'].str.count(";") == 0)
# get number of categories
num_cats = sample_dataframe['category'].str.count(";") + 1
# has a "CatA" category
has_cat_A = sample_dataframe['category'].str.contains("CatA", na=False)
# "CatA" is not counted by cat_name, so only rows without it are certain to be "3 or more"
three_or_more = (num_cats > 2) & ~has_cat_A
# then also write these selected rows in a custom way
# (assigning through .loc avoids chained assignment on a column copy)
sample_dataframe["cat_name"] = None
sample_dataframe.loc[single_cat, "cat_name"] = sample_dataframe.loc[single_cat, "category"]
sample_dataframe.loc[three_or_more, "cat_name"] = "3 or more"
# continue with however complex you want to get to cover more cases, e.g.
two_cats_no_cat_A = (num_cats == 2) & ~has_cat_A
# then handle only the remaining cases with the apply
not_handled = ~no_cat & ~single_cat & ~three_or_more
sample_dataframe.loc[not_handled, "cat_name"] = sample_dataframe.loc[not_handled, "category"].apply(cat_name)
Running these queries on 3 million rows should be plenty fast, even if you have to do a few of them and combine them. If it's still too slow, you can handle more special cases from the apply in the same vectorized fashion.
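On the two-output part of the question: assuming joint_cat_name is indented the same way as cat_name, its inner elif y.count() == 1 sits inside the y.count() > 1 branch and can never be true. A row with exactly two non-CatA categories would then fall through and return a bare None rather than a tuple, which (rather than the (None, None) rows) would explain the TypeError from zip(*...). Below is a minimal plain-Python sketch that always returns a 2-tuple. The module-level CUSTOM_DICT and the .get fallback for unknown categories are my assumptions, not part of the original code.

```python
# Priority order taken from the question's joint_cat_name (lower rank wins).
CUSTOM_DICT = {"CatB": 0, "CatC": 1, "CatD": 2, "CatE": 3}

def joint_cat_name(value):
    """Return (label, main_category); every code path yields a 2-tuple."""
    if not value:
        return None, None
    parts = value.split(";")
    if len(parts) == 1:
        return parts[0], parts[0]
    others = [p for p in parts if p != "CatA"]
    if not others:
        return None, None
    # Unknown categories sort last (assumption; the question's dict lookup gave NaN).
    main = min(others, key=lambda c: CUSTOM_DICT.get(c, len(CUSTOM_DICT)))
    if len(others) == 1:
        return main, main
    if len(others) == 2:
        return main + "+", main
    return "3 or more", main

categories = ["CatA;CatB", "CatB", None, "CatB;CatC;CatA",
              "CatA;CatB", "CatB;CatD;CatB;CatC;CatA"]
# zip(*...) splits the stream of pairs into two parallel tuples.
cat_names, main_cat_names = zip(*(joint_cat_name(c) for c in categories))
print(cat_names)
print(main_cat_names)
```

With a dataframe, the two tuples can be assigned to two columns in one statement, in the same way the question attempted.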
There is a hotel, e.g. of size 7x5 (7 rooms on each of 5 floors). I need to create a function that
takes as its parameter the number of consecutive empty rooms wanted, and
returns the floor number and the starting room number of such a run.
(depicted below: 0 is empty room and 1 is full)
e.g.:
if the parameter is 1, output will be
"floor no: 5, start from room no: 1"
if the parameter is 2, output will be
"floor no: 5, start from room no: 3"
if the parameter is 3, output will be
"floor no: 5, start from room no: 3"
if the parameter is 4, output will be
"floor no: 4, start from room no: 4"
if the parameter is 5, output will be
"floor no: 2, start from room no: 1"
if the parameter is 6 (or 7), output will be
"floor no: 1, start from room no: 1"
if the parameter is > 7, output will be
"not possible to find in one floor"
preferably without using itertools.groupby.
My try:
def adjacent_rooms (amount):
    nested_list_temp = [[0]*7]*5
    nested_list = [list(i) for i in nested_list_temp]
    nested_list [1][5] = 1
    nested_list [2][3] = 1
    nested_list [2][4] = 1
    nested_list [3][2] = 1
    nested_list [4][1] = 1
    nested_list [4][5] = 1
    # [[0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0], [0, 0, 0, 1, 1, 0, 0], [0, 0, 1, 0, 0, 0, 0], [0, 1, 0, 0, 0, 1, 0]]
    try:
        for i in range(len(nested_list), 0, -1):
            for j in range(len(nested_list[0])):
                if nested_list[i-1][j] == 0:
                    count += 1
                if count == amount:
                    return (i, j-amount+2)
                else:
                    count = 0
    except:
        return "not possible to find in one floor"
Any effective hints or suggestions will be highly appreciated.
Use narrow try-except conditions. The first problem is your all-encompassing
error catching, which hides a wide variety of failures -- including outright
code bugs, such as your failure to initialize count. So the first thing I did
was to delete the try-except. You don't really need it here. And even if you
did, you want to declare one or more specific exceptions in the except clause
rather than leaving it wide open.
Work within Python's list indexing as long as possible. It seems that you
want to return human-oriented floor/room numbers (starting at 1) rather
than computer-oriented numbers (starting at 0). That's fine. But defer the
computer-to-human conversion as long as possible. Within the guts of your
algorithmic code, work with Python's indexing scheme. In your case, your
code straddles both, sometimes using 1-based indexing, sometimes 0-based.
That's confusing.
You are resetting count too often. It should be reset only when the room
is full. But you are resetting it whenever count does not equal amount.
As a result, count is almost always being reset to zero.
You are also resetting count too infrequently. It must be reset at the
start of each new floor.
If we make those changes, we get this:
def adjacent_rooms(nested_list, amount):
    for i in range(len(nested_list), 0, -1):
        count = 0
        for j in range(len(nested_list[0])):
            if nested_list[i-1][j] == 0:
                count += 1
                if count == amount:
                    return (i, j-amount+2)
            else:
                count = 0
Python lists are directly iterable. As a result, you almost never
need to mess around with list indexes and range() to process list data.
Just iterate directly to access the values. And for those cases where
you need both the value and the index, use enumerate().
Use more declarative variable names. Names like hotel, floor, and
room help the reader understand your code.
Return data, not textual messages. If a function returns a tuple of
integers upon success, what should it do upon non-serious failure? It depends
on the context, but you can either raise an exception or return some variant of
None. In your case, I would probably opt for a parallel tuple: (None, None). This allows the caller to interact with the function in a fairly
natural way and then simply check either value for None. But returning a
textual message is quite unhelpful for callers: the returned data bundle has a
different outer structure (string vs tuple), and it has a different inner data
type (string vs int).
Don't depend on global variables. Pass the hotel data into
the function, as a proper argument.
If we make those changes, we get something like this:
def adjacent_rooms(hotel, wanted):
    # Search from the top floor down, as the expected output in the question does.
    for fi, floor in reversed(list(enumerate(hotel))):
        n = 0
        for ri, room in enumerate(floor):
            if room == 0:
                n += 1
                if n == wanted:
                    return (fi + 1, ri - n + 2)
            else:
                n = 0
    return (None, None)
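As one way to illustrate the points about deferring the 0-to-1 index conversion and returning data rather than messages, here is a self-contained sketch checked against the hotel from the question. The helper name find_free_run is hypothetical, and the floors are searched from the top down because that is what the expected output implies.

```python
def find_free_run(floor, wanted):
    """Return the 0-based index where `wanted` consecutive empty rooms start, or None."""
    run = 0
    for ri, room in enumerate(floor):
        if room == 0:
            run += 1
            if run == wanted:
                return ri - wanted + 1
        else:
            run = 0
    return None

def adjacent_rooms(hotel, wanted):
    """Search from the top floor down; return 1-based (floor, room) or (None, None)."""
    for fi in range(len(hotel) - 1, -1, -1):
        start = find_free_run(hotel[fi], wanted)
        if start is not None:
            return (fi + 1, start + 1)   # convert to human numbering only here
    return (None, None)

# Hotel layout from the question: 5 floors of 7 rooms, 1 marks a full room.
hotel = [[0] * 7 for _ in range(5)]
for fi, ri in [(1, 5), (2, 3), (2, 4), (3, 2), (4, 1), (4, 5)]:
    hotel[fi][ri] = 1

for wanted in range(1, 9):
    print(wanted, adjacent_rooms(hotel, wanted))
```

The caller can turn the tuple into a message such as "floor no: 5, start from room no: 1", or print the "not possible" text when it receives (None, None).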
I'm getting the desired output with slightly different indentation:
nested_list_temp = [[0]*7]*5
nested_list = [list(i) for i in nested_list_temp]
nested_list [1][5] = 1
nested_list [2][3] = 1
nested_list [2][4] = 1
nested_list [3][2] = 1
nested_list [4][1] = 1
nested_list [4][5] = 1
def adjacent_rooms(amount):
    for i in range(len(nested_list), 0, -1):
        count = 0
        for j in range(len(nested_list[0])):
            if nested_list[i-1][j] == 0:
                count += 1
                if count == amount:
                    return (i, j-amount+2)
            else:
                count = 0
    return "not possible to find in one floor"
Another solution:
def adjacent_rooms(nested_list, amount):
    to_search = "0" * amount
    for floor in range(len(nested_list) - 1, -1, -1):
        try:
            idx = "".join(map(str, nested_list[floor])).index(to_search)
            return "floor no: {}, start from room no: {}".format(
                floor + 1, idx + 1
            )
        except ValueError:
            continue
    return "not possible to find in one floor"
nested_list = [[0 for _ in range(7)] for _ in range(5)]
nested_list[1][5] = 1
nested_list[2][3] = 1
nested_list[2][4] = 1
nested_list[3][2] = 1
nested_list[4][1] = 1
nested_list[4][5] = 1
for f in range(1, 10):
    print("f={}, result: {}".format(f, adjacent_rooms(nested_list, f)))
Prints:
f=1, result: floor no: 5, start from room no: 1
f=2, result: floor no: 5, start from room no: 3
f=3, result: floor no: 5, start from room no: 3
f=4, result: floor no: 4, start from room no: 4
f=5, result: floor no: 2, start from room no: 1
f=6, result: floor no: 1, start from room no: 1
f=7, result: floor no: 1, start from room no: 1
f=8, result: not possible to find in one floor
f=9, result: not possible to find in one floor
Background
We have a family tradition where my and my siblings' Christmas presents are identified by a code that can be solved using only numbers related to us. For example, the code could be birth month * age + graduation year (This is a simple one). If the numbers were 8 * 22 + 2020 = 2196, the number 2196 would be written on all my Christmas presents.
I've already created a Python class that solves the code with certain constraints, but I'm wondering if it's possible to do it recursively.
Current Code
The first function returns a result set for all possible combinations of numbers and operations that produce a value in target_values
#Master algorithm (Get the result set of all combinations of numbers and cartesian products of operations that reach a target_value, using only the number_of_numbers_in_solution)
#Example: sibling1.results[1] = [(3, 22, 4), (<built-in function add>, <built-in function add>), 29]. This means that 3 + 22 + 4 = 29, and 29 is in target_values
import operator
from itertools import product
from itertools import combinations
NUMBER_OF_OPERATIONS_IN_SOLUTION = 2 #Total numbers involved is this plus 1
NUMBER_OF_NUMBERS_IN_SOLUTION = NUMBER_OF_OPERATIONS_IN_SOLUTION + 1
TARGET_VALUES = {22,27,29,38,39}
def getresults( list ):
    #Add the cartesian product of all possible operations to a variable ops
    ops = []
    opslist = [operator.add, operator.sub, operator.mul, operator.truediv]
    for val in product(opslist, repeat=NUMBER_OF_OPERATIONS_IN_SOLUTION):
        ops.append(val)
    #Get the result set of all combinations of numbers and cartesian products of operations that reach a target_value
    results = []
    for x in combinations(list, NUMBER_OF_NUMBERS_IN_SOLUTION):
        for y in ops:
            result = 0
            for z in range(len(y)):
                #On the first iteration, do the operation on the first two numbers (x[z] and x[z+1])
                if (z == 0):
                    #print(y[z], x[z], x[z+1])
                    result = y[z](x[z], x[z+1])
                #For all other iterations, do the operation on the current result and x[z+1])
                else:
                    #print(y[z], result, x[z+1])
                    result = y[z](result, x[z+1])
            if result in TARGET_VALUES:
                results.append([x, y, result])
            #print (x, y)
    print(len(results))
    return results
Then a class that takes in personal parameters for each person and gets the result set
def getalpha( str, inverse ):
    "Converts string to alphanumeric array of chars"
    array = []
    for i in range(0, len(str)):
        alpha = ord(str[i]) - 96
        if inverse:
            array.append(27 - alpha)
        else:
            array.append(alpha)
    return array;
class Person:
    def __init__(self, name, middlename, birthmonth, birthday, birthyear, age, orderofbirth, gradyear, state, zip, workzip, cityfirst3):
        #final list
        self.listofnums = []
        self.listofnums.extend((birthmonth, birthday, birthyear, birthyear - 1900, age, orderofbirth, gradyear, gradyear - 2000, zip, workzip))
        self.listofnums.extend(getalpha(cityfirst3, False))
        self.results = getresults(self.listofnums)
Finally, a "solve code" method that takes from the result sets and finds any possible combinations that produce the full list of target_values.
#Compares the values of two sets
def compare(l1, l2):
    result = all(map(lambda x, y: x == y, l1, l2))
    return result and len(l1) == len(l2)
#Check every result in sibling2 with a different result target_value and equal operation sets
def comparetwosiblings(current_values, sibling1, sibling2, a, b):
    if sibling2.results[b][2] not in current_values and compare(sibling1.results[a][1], sibling2.results[b][1]):
        okay = True
        #If the indexes aren't alphanumeric, ensure they're the same before adding to new result set
        for c in range(0, NUMBER_OF_NUMBERS_IN_SOLUTION):
            indexintersection = set([index for index, value in enumerate(sibling1.listofnums) if value == sibling1.results[a][0][c]]) & set([index for index, value in enumerate(sibling2.listofnums) if value == sibling2.results[b][0][c]])
            if len(indexintersection) > 0:
                okay = True
            else:
                okay = False
                break
    else:
        okay = False
    return okay
#For every result, we start by adding the result number to the current_values list for sibling1, then cycle through each person and see if a matching operator list leads to a different result number. (Matching indices as well)
#If there's a result set for everyone that leads to five different numbers in the code, the values will be added to the newresult set
def solvecode( sibling1, sibling2, sibling3, sibling4, sibling5 ):
    newresults = []
    current_values = []
    #For every result in sibling1
    for a in range(len(sibling1.results)):
        current_values = []
        current_values.append(sibling1.results[a][2])
        for b in range(len(sibling2.results)):
            if comparetwosiblings(current_values, sibling1, sibling2, a, b):
                current_values.append(sibling2.results[b][2])
                for c in range(len(sibling3.results)):
                    if comparetwosiblings(current_values, sibling1, sibling3, a, c):
                        current_values.append(sibling3.results[c][2])
                        for d in range(len(sibling4.results)):
                            if comparetwosiblings(current_values, sibling1, sibling4, a, d):
                                current_values.append(sibling4.results[d][2])
                                for e in range(len(sibling5.results)):
                                    if comparetwosiblings(current_values, sibling1, sibling5, a, e):
                                        newresults.append([sibling1.results[a][0], sibling2.results[b][0], sibling3.results[c][0], sibling4.results[d][0], sibling5.results[e][0], sibling1.results[a][1]])
                                current_values.remove(sibling4.results[d][2])
                        current_values.remove(sibling3.results[c][2])
                current_values.remove(sibling2.results[b][2])
    print(len(newresults))
    print(newresults)
It's the last "solvecode" method that I'm wondering if I can optimize and make into a recursive algorithm. In some cases it can be helpful to add or remove a sibling, which would look nice recursively (My mom sometimes makes a mistake with one sibling, or we get a new brother/sister-in-law)
Thank you for any and all help! I hope you at least get a laugh out of my weird family tradition.
Edit: In case you want to test the algorithm, here's an example group of siblings that result in exactly one correct solution
#ALL PERSONAL INFO CHANGED FOR STACKOVERFLOW
sibling1 = Person("sibling1", "horatio", 7, 8, 1998, 22, 5, 2020, "ma", 11111, 11111, "red")
sibling2 = Person("sibling2", "liem", 2, 21, 1995, 25, 4, 2018, "ma", 11111, 11111, "pho")
sibling3 = Person("sibling3", "kyle", 4, 21, 1993, 26, 3, 2016, "ma", 11111, 11111, "okl")
sibling4 = Person("sibling4", "jamal", 4, 7, 1991, 29, 2, 2014, "ma", 11111, 11111, "pla")
sibling5 = Person("sibling5", "roberto", 9, 23, 1990, 30, 1, 2012, "ma", 11111, 11111, "boe")
I just spent a while improving the code. A few things I need to mention:
It's not good practice to use Python built-in names (like list, str and zip) as variable names; doing so shadows the built-ins, causes problems, and makes the code harder to debug.
I feel like you should use the permutations function: combinations yields unordered selections, while permutations yields ordered ones, which are more numerous and will give more results. For example, for the sibling info you gave, combinations gives only 1 solution through solvecode() while permutations gives 12.
Because you are working with operators, there can be more cases with brackets. To solve that problem and to make the getresults() function a bit more optimized, I suggest you explore the reverse polish notation. Computerphile has an excellent video on it.
You don't need a compare function. list1==list2 works.
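To illustrate the combinations versus permutations point on a small scale, here is a sketch with three made-up numbers (not anyone's real data). It shows why order matters once subtraction or division is involved.

```python
from itertools import combinations, permutations
import operator

nums = [7, 8, 22]

combos = list(combinations(nums, 3))   # one unordered selection
perms = list(permutations(nums, 3))    # every ordering of that selection
print(len(combos), len(perms))         # 1 6

# Subtraction is order-sensitive, so different orderings reach different values.
values = sorted({operator.sub(operator.sub(a, b), c) for a, b, c in perms})
print(values)                          # [-23, -21, 7]
```

With combinations only the single ordering (7, 8, 22) is ever tried, so a code such as 22 - 7 - 8 would never be found.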
Here's the optimized code:
import operator
from itertools import product
from itertools import permutations
NUMBER_OF_OPERATIONS_IN_SOLUTION = 2 #Total numbers involved is this plus 1
NUMBER_OF_NUMBERS_IN_SOLUTION = NUMBER_OF_OPERATIONS_IN_SOLUTION + 1
TARGET_VALUES = {22,27,29,38,39}
def getresults(listofnums):
    #Add the cartesian product of all possible operations to a variable ops
    ops = []
    opslist = [operator.add, operator.sub, operator.mul, operator.truediv]
    for val in product(opslist, repeat=NUMBER_OF_OPERATIONS_IN_SOLUTION):
        ops.append(val)
    #Get the result set of all combinations of numbers and cartesian products of operations that reach a target_value
    results = []
    for x in permutations(listofnums, NUMBER_OF_NUMBERS_IN_SOLUTION):
        for y in ops:
            result = y[0](x[0], x[1])
            if NUMBER_OF_OPERATIONS_IN_SOLUTION>1:
                for z in range(1, len(y)):
                    result = y[z](result, x[z+1])
            if result in TARGET_VALUES:
                results.append([x, y, result])
    return results
def getalpha(string, inverse):
    "Converts string to alphanumeric array of chars"
    array = []
    for i in range(0, len(string)):
        alpha = ord(string[i]) - 96
        array.append(27-alpha if inverse else alpha)
    return array
class Person:
    def __init__(self, name, middlename, birthmonth, birthday, birthyear, age, orderofbirth, gradyear, state, zipcode, workzip, cityfirst3):
        #final list
        self.listofnums = [birthmonth, birthday, birthyear, birthyear - 1900, age, orderofbirth, gradyear, gradyear - 2000, zipcode, workzip]
        self.listofnums.extend(getalpha(cityfirst3, False))
        self.results = getresults(self.listofnums)
#Check every result in sibling2 with a different result target_value and equal operation sets
def comparetwosiblings(current_values, sibling1, sibling2, a, b):
    if sibling2.results[b][2] not in current_values and sibling1.results[a][1]==sibling2.results[b][1]:
        okay = True
        #If the indexes aren't alphanumeric, ensure they're the same before adding to new result set
        for c in range(0, NUMBER_OF_NUMBERS_IN_SOLUTION):
            indexintersection = set([index for index, value in enumerate(sibling1.listofnums) if value == sibling1.results[a][0][c]]) & set([index for index, value in enumerate(sibling2.listofnums) if value == sibling2.results[b][0][c]])
            if len(indexintersection) > 0:
                okay = True
            else:
                okay = False
                break
    else:
        okay = False
    return okay
And now, the million dollar function, or should I say two functions:
# var contains the loop variables a-e, depth keeps track of sibling number
def rec(arg, var, current_values, newresults, depth):
    for i in range(len(arg[depth].results)):
        if comparetwosiblings(current_values, arg[0], arg[depth], var[0], i):
            if depth<len(arg)-1:
                current_values.append(arg[depth].results[i][2])
                rec(arg, var[:depth]+[i], current_values, newresults, depth+1)
                current_values.remove(arg[depth].results[i][2])
            else:
                # build a fresh index list so one match does not leak into the next,
                # and collect one entry per sibling however many siblings there are
                picks = var + [i]
                newresults.append([arg[k].results[picks[k]][0] for k in range(len(arg))] + [arg[0].results[picks[0]][1]])
def solvecode(*arg):
    newresults = []
    for a in range(len(arg[0].results)):
        current_values = [arg[0].results[a][2]]
        rec(arg, var=[a], current_values=current_values, newresults=newresults, depth=1)
    print(len(newresults))
    print(newresults)
Two functions are needed because the first one is the recursive one and the second one is a wrapper around it. I've also fulfilled your second wish, which was being able to pass a variable number of siblings into the new solvecode function. I've checked the new functions and they work together exactly like the original solvecode function. Note that there is no significant difference in runtime between the versions, although the second one has 8 fewer lines of code. Hope this helped. lmao took me 3 hours.
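The general pattern of replacing one nested for-loop per sibling with a recursive call can be sketched on a toy problem. This is not the family-code solver itself: distinct_picks and the sample groups are made up for illustration, and the only rule enforced is that every chosen value must be different.

```python
def distinct_picks(groups, chosen=None):
    """Yield one value per group such that all chosen values differ.

    Recursive stand-in for writing one nested for-loop per group.
    """
    if chosen is None:
        chosen = []
    depth = len(chosen)
    if depth == len(groups):        # every group has contributed a value
        yield tuple(chosen)
        return
    for value in groups[depth]:
        if value not in chosen:     # same role as the "not in current_values" check
            chosen.append(value)
            yield from distinct_picks(groups, chosen)
            chosen.pop()            # backtrack before trying the next value

groups = [[22, 27], [27, 29], [22, 29]]
solutions = list(distinct_picks(groups))
print(solutions)   # [(22, 27, 29), (27, 29, 22)]

# Adding another "sibling" is just one more list; no new loop has to be written.
print(list(distinct_picks(groups + [[38]])))
```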
So I'm currently working on code which solves simple differentials. For now my code looks something like this:
def diff():
    coeffs = []
    #checking a rank of a function
    lvl = int(raw_input("Tell me a rank of your function: "))
    if lvl == 0:
        print "\nIf the rank is 0, a differential of a function will always be 0"
    #Asking user to write coefficients (like 4x^2 - he writes 4)
    for i in range(0, lvl):
        coeff = int(raw_input("Tell me a coefficient: "))
        coeffs.append(coeff)
    #Printing all coefficients
    print "\nSo your coefficients are: "
    for item in coeffs:
        print item
So what do I want to do next? I have every coefficient in my coeffs list. Now I want to take every single one from there and assign it to a different variable, just to make use of it. How can I do that? I suppose I will have to use a loop, but I tried for hours and nothing helped. So, how can I do this? It would be like: a = coeffs[0], b = coeffs[1], ..., x = coeffs[lvl - 1].
Just access the coefficients directly from the list via their indices.
If you want to use the values in a context that modifies them while keeping the original list unchanged, copy the list to a new list:
import copy
mutableCoeffs = copy.copy(coeffs)
You do not need new variables.
You already have all you need to compute the coefficients for the derivative function.
print "Coefficients for the derivative:"
l = len(coeffs) - 1
for item in coeffs[:-1]:
    print l * item
    l -= 1
Or, if you want to put them in a new list:
deriv_coeffs = []
l = len(coeffs) - 1
for item in coeffs[:-1]:
    deriv_coeffs.append(l * item)
    l -= 1
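The same loop can be wrapped in a function, which also makes it easy to differentiate more than once. This is only a sketch under the same assumption (coefficients listed highest power first); `derivative_coeffs` is a made-up name.

```python
def derivative_coeffs(coeffs):
    """Return the derivative's coefficients, highest power first."""
    degree = len(coeffs) - 1
    # the constant term (last entry) vanishes, so it is skipped
    return [(degree - i) * c for i, c in enumerate(coeffs[:-1])]

first = derivative_coeffs([4, 3, 2])   # d/dx (4x^2 + 3x + 2) = 8x + 3
second = derivative_coeffs(first)      # d/dx (8x + 3) = 8
```

A constant polynomial such as [7] yields an empty list, which you may want to treat as 0.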
I guess from there you want to differentiate, no? So you just multiply each coefficient by its rank and store the result at index - 1:
def diff():
    coeffs = []
    # checking the rank of the function
    lvl = int(raw_input("Tell me a rank of your function: "))
    if lvl == 0:
        print "\nIf the rank is 0, a differential of a function will always be 0"
    # asking the user to enter coefficients (for 4x^2 he writes 4)
    for i in range(0, lvl):
        coeff = int(raw_input("Tell me a coefficient: "))
        coeffs.append(coeff)
    # printing all coefficients
    print "\nSo your coefficients are: "
    for item in coeffs:
        print item
    # here coeffs[i] is taken to be the coefficient of X^i
    answer_coeff = [0]*(lvl-1)
    for i in range(0,lvl-1):
        answer_coeff[i] = coeffs[i+1]*(i+1)
    print "The derivative is:"
    string_answer = "%d" % answer_coeff[0]
    for i in range(1,lvl-1):
        string_answer = string_answer + (" + %d * X^%d" % (answer_coeff[i], i))
    print string_answer
If you REALLY want to assign a list to variables you could do so by accessing the globals() dict. For example:
for j in range(len(coeffs)):
    globals()["elm{0}".format(j)] = coeffs[j]
Then you'll have your coefficients in the global variables elm0, elm1 and so on.
Please note that this is most probably not what you really want (but only what you asked for).
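If named access is really what you are after, two less invasive options are tuple unpacking and a dict. This is just a sketch; the example values are made up and the "elm0", "elm1" keys only mirror the names used above.

```python
coeffs = [4, 3, 2]

# Unpacking works when the number of coefficients is known in advance.
a, b, c = coeffs

# A dict gives each value a name without touching globals().
named = dict(("elm{0}".format(j), value) for j, value in enumerate(coeffs))
```

Unpacking raises a ValueError if the list length does not match the number of names, so the dict is the safer choice when the rank comes from user input.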
I'm getting a sequence of days of the week. Here is Python code for what I want to do:
def week_days_to_string(week_days):
    """
    >>> week_days_to_string(('Sunday', 'Monday', 'Tuesday'))
    'Sunday to Tuesday'
    >>> week_days_to_string(('Monday', 'Wednesday'))
    'Monday and Wednesday'
    >>> week_days_to_string(('Sunday', 'Wednesday', 'Thursday'))
    'Sunday, Wednesday, Thursday'
    """
    if len(week_days) == 2:
        return '%s and %s' % tuple(week_days)
    elif week_days_consecutive(week_days):
        return '%s to %s' % (week_days[0], week_days[-1])
    return ', '.join(week_days)
I just need the week_days_consecutive function (the hard part heh).
Any ideas how I could make this happen?
Clarification:
My wording and examples caused some confusion. I do not want to limit this function to the work week. I want to consider all days of the week (S, M, T, W, T, F, S). My apologies for not being clear about that last night. I have edited the body of the question to make it clearer.
Edit: Throwing some wrenches into it
Wraparound sequence:
>>> week_days_to_string(('Sunday', 'Monday', 'Tuesday', 'Saturday'))
'Saturday to Tuesday'
And, per @user470379 and optional:
>>> week_days_to_string(('Monday', 'Wednesday', 'Thursday', 'Friday'))
'Monday, Wednesday to Friday'
I would approach this problem by:
Creating a dict mapping day names to their sequential index
Converting my input day names to their sequential indices
Looking at the resulting input indices and asking if they are sequential
Here's how you can do that, using calendar.day_name, range and some comprehensions:
import calendar
day_indexes = {name:i for i, name in enumerate(calendar.day_name)}
def weekdays_consecutive(days):
    indexes = [day_indexes[d] for d in days]
    # list() keeps the comparison valid where range() is not a list
    expected = list(range(indexes[0], indexes[-1] + 1))
    return indexes == expected
A few other options, depending on what you need:
If you need Python < 2.7, instead of the dict comprehension, you can use:
day_indexes = dict((name, i) for i, name in enumerate(calendar.day_name))
If you don't want to allow Saturday and Sunday, just trim off the last two days:
day_indexes = ... calendar.day_name[:-2] ...
If you need to wrap around after Sunday, it's probably easiest to just check that each item is one more than the previous item, but working in modulo 7:
def weekdays_consecutive(days):
    indexes = [day_indexes[d] for d in days]
    return all(indexes[i + 1] % 7 == (indexes[i] + 1) % 7
               for i in range(len(indexes) - 1))
Update: For the extended problem, I would still stick with the day-to-index dict, but instead I would:
Find all the indexes where a run of sequential days stops
Wrap the days around if necessary to get the longest possible sequence of days
Group the days into their sequential spans
Here's code to do this:
import calendar
def weekdays_to_string(days):
    # convert days to indexes
    day_indexes = {name:i for i, name in enumerate(calendar.day_name)}
    indexes = [day_indexes[d] for d in days]
    # find the places where sequential days end
    ends = [i + 1
            for i in range(len(indexes))
            if (indexes[(i + 1) % len(indexes)]) % 7 !=
               (indexes[(i) % len(indexes)] + 1) % 7]
    # wrap the days if necessary to get longest possible sequences
    split = ends[-1]
    if split != len(days):
        days = days[split:] + days[:split]
        ends = [len(days) - split + end for end in ends]
    # group the days in sequential spans
    spans = [days[begin:end] for begin, end in zip([0] + ends, ends)]
    # format as requested, with "to", "and", commas, etc.
    words = []
    for span in spans:
        if len(span) < 3:
            words.extend(span)
        else:
            words.append("%s to %s" % (span[0], span[-1]))
    if len(days) == 1:
        return words[0]
    elif len(days) == 2:
        return "%s and %s" % tuple(words)
    else:
        return ", ".join(words)
You might also try the following instead of that last if/elif/else block to get an "and" between the last two items and commas between everything else:
    if len(words) == 1:
        return words[0]
    else:
        return "%s and %s" % (", ".join(words[:-1]), words[-1])
That's a little different from the spec, but prettier in my eyes.
def weekdays_consecutive(inp):
    days = { 'Monday': 0,
             'Tuesday': 1,
             'Wednesday': 2,
             'Thursday': 3,
             'Friday': 4 }
    # list() keeps the comparison valid where range() is not a list
    return [days[x] for x in inp] == list(range(days[inp[0]], days[inp[-1]] + 1))
As you have already checked for the other cases, I think this will be good enough.
Here's my complete solution; you can use it however you want. (The code is placed in the public domain, but I accept no liability if anything happens to you or your computer as a consequence of using it, and there is no warranty.)
week_days = {
    'monday':0,
    'tuesday':1,
    'wednesday':2,
    'thursday':3,
    'friday':4,
    'saturday':5,
    'sunday':6
}
week_days_reverse = dict(zip(week_days.values(), week_days.keys()))
def days_str_to_int(days):
    '''
    Converts a list of days into a list of day numbers.
    It is case ignorant.
    ['Monday', 'tuesday'] -> [0, 1]
    '''
    return map(lambda day: week_days[day.lower()], days)
def day_int_to_str(day):
    '''
    Converts a day number into a string.
    0 -> 'Monday' etc
    '''
    return week_days_reverse[day].capitalize()
def consecutive(days):
    '''
    Returns the number of consecutive days after the first given a sequence of
    day numbers.
    [0, 1, 2, 5] -> 2
    [6, 0, 1] -> 2
    '''
    j = days[0]
    n = 0
    for i in days[1:]:
        j = (j + 1) % 7
        if j != i:
            break
        n += 1
    return n
def days_to_ranges(days):
    '''
    Turns a sequence of day numbers into a list of ranges.
    The days can be in any order
    (n, m) means n to m
    (n,) means just n
    [0, 1, 2] -> [(0, 2)]
    [0, 1, 2, 4, 6] -> [(0, 2), (4,), (6,)]
    '''
    days = sorted(set(days))
    while days:
        n = consecutive(days)
        if n == 0:
            yield (days[0],)
        else:
            assert n < 7
            yield days[0], days[n]
        days = days[n+1:]
def wrap_ranges(ranges):
    '''
    Given a list of ranges in sequential order, this function will modify it in
    place if the first and last range can be put together.
    [(0, 3), (4,), (6,)] -> [(6, 3), (4,)]
    '''
    if len(ranges) > 1:
        if ranges[0][0] == 0 and ranges[-1][-1] == 6:
            ranges[0] = ranges[-1][0], ranges[0][-1]
            del ranges[-1]
def range_to_str(r):
    '''
    Converts a single range into a string.
    (0, 2) -> "Monday to Wednesday"
    '''
    if len(r) == 1:
        return day_int_to_str(r[0])
    if r[1] == (r[0] + 1) % 7:
        return day_int_to_str(r[0]) + ', ' + day_int_to_str(r[1])
    return day_int_to_str(r[0]) + ' to ' + day_int_to_str(r[1])
def ranges_to_str(ranges):
    '''
    Converts a list of ranges into a string.
    [(0, 2), (4, 5)] -> "Monday to Wednesday, Friday, Saturday"
    '''
    if len(ranges) == 1 and ranges[0] == (0, 6):
        return 'all week'
    return ', '.join(map(range_to_str, ranges))
def week_days_to_string(days):
    '''
    Converts a list of days in string form to a stringed list of ranges.
    ['saturday', 'monday', 'wednesday', 'friday', 'sunday'] ->
    'Friday to Monday, Wednesday'
    '''
    ranges = list(days_to_ranges(days_str_to_int(days)))
    wrap_ranges(ranges)
    return ranges_to_str(ranges)
Features:
It supports more than one range.
You can enter the days in any order.
It will wrap around.
Add comments if you find any problems and I'll do my best to fix them.
You would have to check the first day given, then, using a list with all of the weekdays in it, check whether the next given day is at the next index in the list, and repeat.
This can easily be done with a few loops, assuming the given days are in order.
I must say I didn't test this.
def test(days):
    days = list(days)
    if len(days) == 1:
        return days[0]
    elif len(days) == 2:
        return ' to '.join(days)
    else:
        return ''.join(days[:1] + [' to ' + days[-1]])
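The index-based check described in this answer could be sketched roughly as follows. The WEEK ordering (starting on Monday) and the name `days_in_sequence` are my assumptions, and the input is expected to contain only valid day names in the order they should be checked.

```python
# Assumed ordering; the week could just as well start on Sunday.
WEEK = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

def days_in_sequence(days):
    """Return True if every day directly follows the one before it."""
    for previous, current in zip(days, days[1:]):
        # the modulo makes Sunday wrap around to Monday
        expected = WEEK[(WEEK.index(previous) + 1) % len(WEEK)]
        if current != expected:
            return False
    return True
```

With this, ('Saturday', 'Sunday', 'Monday') counts as a sequence while ('Monday', 'Wednesday') does not.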
import itertools
#probably a better way to obtain this like with the datetime library
WEEKDAYS = ('Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday')
def weekdays_consecutive(days):
    #assumes that days only contains valid weekdays
    if len(days) == 0:
        return True #or False?
    day_cycle = itertools.cycle(WEEKDAYS)
    #advance the cycle until it sits on the first given day
    while next(day_cycle) != days[0]:
        pass
    for day in days[1:]:
        if day != next(day_cycle):
            return False
    return True
#...
>>> weekdays_consecutive(('Friday', 'Monday'))
False
>>> weekdays_consecutive(('Friday', 'Monday', 'Tuesday'))
False
>>> weekdays_consecutive(('Friday', 'Monday', 'Tuesday', 'Thursday'))
False
This would either take some intricate case-by-case logic, or a hard-coded storage of all days sequentially. I'd prefer the latter.
def weekdays_consecutive(x):
    allDays = { 'Monday':1, 'Tuesday':2, 'Wednesday':3, 'Thursday':4, 'Friday':5, 'Saturday' : 6, 'Sunday' : 7}
    mostRecent = x[0]
    for i in x[1:]:
        # compare modulo 7 on both sides so Saturday -> Sunday and Sunday -> Monday both pass
        if allDays[i] % 7 != (allDays[mostRecent] + 1) % 7:
            return False
        mostRecent = i
    return True
And this can sort the input: x.sort(key=lambda d: allDays[d]), provided allDays is defined where the sort can see it. I don't know which function you'd prefer to use it in.
>>> x = ['Tuesday', 'Thursday', 'Monday', 'Friday']
>>> x.sort(key=lambda d: allDays[d])
>>> x
['Monday', 'Tuesday', 'Thursday', 'Friday']
This relies on no non-days being present. I imagine you'd want to deal with this in the weekdays_to_string function rather than here in weekdays_consecutive.
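One way to handle non-days up front, sketched here with made-up names (`VALID_DAYS`, `check_days`), is to reject unknown names before any consecutive check runs:

```python
VALID_DAYS = ('Monday', 'Tuesday', 'Wednesday', 'Thursday',
              'Friday', 'Saturday', 'Sunday')

def check_days(days):
    """Return the days as a list, or raise ValueError for unknown names."""
    unknown = [d for d in days if d not in VALID_DAYS]
    if unknown:
        raise ValueError("not a day of the week: %s" % ", ".join(unknown))
    return list(days)
```

The calling function would run this on its input first and pass the result on, so the consecutive check never hits a KeyError.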
I also think you want to change the first case of your other function to use 'and' instead of 'to', and add a case for single-day inputs.
EDIT: I had a pretty dumb mistake that I just fixed; it should work now!