I am trying to generate a plan: a list of classes in which a class can only be added once its required classes have been completed or its co-requisite classes are being taken in the same semester.
My code below almost works, but it keeps reusing classes even though they have already been completed/used. I tried to prevent this with and (class_list[i][0] not in classes_done), hoping the if statement would be skipped, but the check seems to be ignored.
The rest of the if statement seems to work fine. (class_list[i][3] == '' or class_list[i][3] in classes_done): does this class have a required completed class, and if so, has it been completed?
(class_list[i][2] in classes_for_semester or class_list[i][2] == ''): does this class have a co-requisite class, and if so, is it in classes_for_semester or already completed?
The class_list variable is organized like this: ['name', 'credit', 'co-requisite', 'required completed classes', 'empty']. I added the other variables as comments to show what they look like.
class PlanGenerator:
    def generator(max_credit_allowed, min_credit_allowed, classes_done, class_list):
        classes_for_semester = []
        credits_for_semester = 0
        semester = 0
        full_plan = []
        # class_list = [['MA 241 ', '4', '', '', ''], ['PS 150 ', '3', 'MA 241 ', '', ''], ['UNIV 101', '1', '', '', ''], ['COM 122', '3', '', '', ''], ...]
        # max_credit_allowed = 16
        # min_credit_allowed = 12
        # classes_done=['UNIV 101']
        while len(classes_done) != len(class_list):  # keep going until all classes are used
            while int(min_credit_allowed) > credits_for_semester:  # keep going until at least the minimum credits are in the semester
                semester += 1
                for i in range(len(class_list)):  # looping over the class list
                    if int(class_list[i][1]) + credits_for_semester < max_credit_allowed:  # if this class was to be added would it go over the max credit for semester if yes go to next class
                        if (class_list[i][3] == '' or class_list[i][3] in classes_done) and (class_list[i][2] in classes_for_semester or class_list[i][2] in classes_done or class_list[i][2] == '') and (class_list[i][0] not in classes_done):
                            classes_for_semester.append(class_list[i][0])
                            credits_for_semester += int(class_list[i][1])
                print('classes for semester', classes_for_semester)
                print('semester credits', credits_for_semester)
            classes_done.append(classes_for_semester)
            full_plan.append(semester)
            full_plan.append(classes_for_semester)
            print('full plan', full_plan)
            classes_for_semester = []
            credits_for_semester = 0
        print('done')
        print(full_plan)
I hope my explanation makes sense.
Maybe somebody can understand my mistake and help me find a good solution.
Also if you have anything that you see would make this code more simple please let me know.
Much appreciated
First, your while int(min_credit_allowed) > credits_for_semester line is leading to an infinite loop. It needs to be changed to
while len(classes_done) != len(class_list) and int(min_credit_allowed) > credits_for_semester: # Remove the second while loop
Secondly, you're appending a list to a list, so you get a 2-D list for classes_done with
classes_done.append(classes_for_semester)
This should be
classes_done += classes_for_semester
so that you add the items from classes_for_semester into classes_done, rather than adding a list.
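A small sketch of the difference, using made-up sample values, shows why the not in classes_done check appeared to be ignored: after append, the class names are buried inside a nested list, so the membership test never finds them.

```python
# Hypothetical sample values, used only to show the difference.
classes_done = ['UNIV 101']
classes_for_semester = ['MA 241', 'COM 122']

nested = list(classes_done)
nested.append(classes_for_semester)  # adds the whole list as ONE element
print(nested)                        # ['UNIV 101', ['MA 241', 'COM 122']]
print('MA 241' in nested)            # False: the name is hidden in the inner list

flat = list(classes_done)
flat += classes_for_semester         # adds each class name individually
print(flat)                          # ['UNIV 101', 'MA 241', 'COM 122']
print('MA 241' in flat)              # True
```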
Your new code should look like this:
def generator(max_credit_allowed, min_credit_allowed, classes_done, class_list):
    classes_for_semester = []
    credits_for_semester = 0
    semester = 0
    full_plan = []
    # class_list = [['MA 241 ', '4', '', '', ''], ['PS 150 ', '3', 'MA 241 ', '', ''], ['UNIV 101', '1', '', '', ''], ['COM 122', '3', '', '', ''], ...]
    # max_credit_allowed = 16
    # min_credit_allowed = 12
    # classes_done=['UNIV 101']
    while len(classes_done) != len(class_list) and int(min_credit_allowed) > credits_for_semester:  # keep going until at least the minimum credits are in the semester
        semester += 1
        for i in range(len(class_list)):  # looping over the class list
            if int(class_list[i][1]) + credits_for_semester < max_credit_allowed:  # if this class was to be added would it go over the max credit for semester if yes go to next class
                if (class_list[i][3] == '' or class_list[i][3] in classes_done) and (class_list[i][2] in classes_for_semester or class_list[i][2] in classes_done or class_list[i][2] == '') and (class_list[i][0] not in classes_done):
                    classes_for_semester.append(class_list[i][0])
                    credits_for_semester += int(class_list[i][1])
        print('classes for semester', classes_for_semester)
        print('semester credits', credits_for_semester)
        classes_done += classes_for_semester
        full_plan.append(semester)
        full_plan.append(classes_for_semester)
        print('full plan', full_plan)
        classes_for_semester = []
        credits_for_semester = 0
    print('done')
    print(full_plan)
I would highly recommend using None instead of '' for the non-existent values. That way you can do a simple value is None check instead of an equality check against an empty string.
For the lists of class information you're passing in, I would change them to classes, dictionaries, or namedtuples (see the collections.namedtuple documentation for more) so that you can easily refer to the values by name rather than by number.
class_list[i].class_name or class_list[i]['class_name'] is a lot easier to debug in the future than magic indices. You can even change your for loop to iterate over the actual class details instead of i in range(len(class_list)), like so:
for c in class_list:
    if int(c.credits) ....  # Using a class or namedtuple approach as suggested above
And one minor thing that probably isn't a huge issue but could become a concern if these lists were to grow long: consider using sets instead of lists for storing things like classes_done and classes_for_semester. It also prevents duplicates from being stored (assuming you don't want to store the same class more than once).
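As a rough illustration of the set suggestion (the sample values are made up), membership tests stay fast and a class added twice is stored only once:

```python
# Hypothetical sample values; sets replace the lists used in the question.
classes_done = {'UNIV 101'}
classes_for_semester = {'MA 241', 'COM 122'}

classes_done |= classes_for_semester  # in-place union, like += for lists
classes_done.add('MA 241')            # already present, so nothing changes

print(len(classes_done))              # 3
print('MA 241' in classes_done)       # True
```

Sets are unordered, so if the order of classes within a semester matters for the printed plan, a list may still be the better fit for full_plan.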
To provide a concrete example of the namedtuple suggestion, you can do the following:
from collections import namedtuple
ClassList = namedtuple('ClassList', ['class_name', 'credits', 'coreq', 'prereq'])
class_list = [
    ClassList(class_name='MA 241', credits=4, coreq=None, prereq=None),
    ClassList(class_name='PS 150', credits=3, coreq='MA 241', prereq=None),
    # ...
]
So your for loop becomes
for c in class_list:
    if c.credits + credits_for_semester < max_credit_allowed:
        if (c.prereq is None or c.prereq in classes_done) and \
           (c.coreq in classes_for_semester or c.coreq in classes_done or c.coreq is None) and \
           (c.class_name not in classes_done):
            classes_for_semester.append(c.class_name)
            credits_for_semester += c.credits
classes_done += classes_for_semester
full_plan.append(semester)
full_plan.append(classes_for_semester)
classes_for_semester = []
credits_for_semester = 0
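Putting the pieces together, here is a minimal end-to-end sketch rather than a definitive implementation. The names Course and make_plan, the class 'MA 242', and the credit cap of 10 are made up for illustration. It also makes a few simplifying assumptions: the minimum-credit rule is ignored, a class that lands exactly on the credit cap is allowed, and a co-requisite is only recognized if it appears earlier in the list or is already done.

```python
from collections import namedtuple

# Hypothetical record type; field names follow the namedtuple suggestion above.
Course = namedtuple('Course', ['class_name', 'credits', 'coreq', 'prereq'])


def make_plan(courses, classes_done, max_credits):
    """Return a list of semesters, each a list of class names."""
    done = set(classes_done)
    remaining = [c for c in courses if c.class_name not in done]
    plan = []
    while remaining:
        semester, credits = [], 0
        for c in remaining:
            if credits + c.credits > max_credits:
                continue  # would exceed the cap, try the next class
            prereq_ok = c.prereq is None or c.prereq in done
            coreq_ok = c.coreq is None or c.coreq in done or c.coreq in semester
            if prereq_ok and coreq_ok:
                semester.append(c.class_name)
                credits += c.credits
        if not semester:
            # Nothing could be scheduled, so stop instead of looping forever.
            raise ValueError('no remaining class can be scheduled')
        done.update(semester)
        remaining = [c for c in remaining if c.class_name not in done]
        plan.append(semester)
    return plan


courses = [
    Course('MA 241', 4, None, None),
    Course('PS 150', 3, 'MA 241', None),
    Course('UNIV 101', 1, None, None),
    Course('COM 122', 3, None, None),
    Course('MA 242', 4, None, 'MA 241'),  # made-up class with a prerequisite
]
plan = make_plan(courses, ['UNIV 101'], max_credits=10)
print(plan)  # [['MA 241', 'PS 150', 'COM 122'], ['MA 242']]
```

Raising an error when a pass schedules nothing is a deliberate choice here: it turns an unsatisfiable prerequisite chain into a visible failure rather than an infinite loop.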
Below are a few example values from the Test text file, which will become the website's subdomains.
with open("dictionaries/Combined.txt") as i:
    Test = [line.rstrip() for line in i]
Test = ['cl', 'portal', 'find', 'signin', 'payment', 'update', 'group', 'verification',
'confirm', 'claim', 'summary', 'recovery', 'ssl', 'view', 'support']
delimeters = ['','-', '.']
web = 'web'
subdomain_count = 1
suffix = ['co.id', 'info', 'id', 'com']
output = []
In this code I can control how many subdomains and how many domain names I want to generate:
for i in range(100):
    outputs = []
    for j in range(subdomain_count):
        outputs.append(random.choice(Test))
    prefix = random.choice(delimeters).join(outputs)
    out = random.choice(delimeters).join([prefix, web])
    tld = random.choice(suffix)
    addr = '.'.join([out, tld])
    output.append(addr)
    Example.write(addr+'\n')
with open("dictionaries/examples.txt") as f:
    websamples = [line.rstrip() for line in f]
To control how many outputs are generated I used the line for i in range(100):, and to control how long the domain name is I used for j in range(subdomain_count):. Here is what I mean by how long the domain can be: if I set subdomain_count to 1, the output looks like cl.web.info; if I set it to 2, it becomes something like update-cl.web.info. As you can see, the subdomain is now two words long, because update has been added to the domain as another subdomain. Here is another example where I set subdomain_count = 1.
From here I got a bit confused, since I placed the web variable at a fixed position where it always comes right before the suffix. Is there a way, for example with subdomain_count = 3, to give the web variable more flexibility in its position, like cl.web-update.summary.info or web-cl.info.update.info, so that it doesn't always sit right before the suffix or TLD (top-level domain)?
Thank you!
Use random.shuffle to randomize the order of the items in a list, which lets the web variable land in any position within the domain.
import random
Test = ['cl', 'portal', 'find', 'signin', 'payment', 'update', 'group', 'verification',
'confirm', 'claim', 'summary', 'recovery', 'ssl', 'view', 'support']
delimeters = ['', '-', '.']
web = 'web'
subdomain_count = 3
suffix = ['co.id', 'info', 'id', 'com']
output = []
for i in range(5):
    outputs = []
    for j in range(subdomain_count):
        outputs.append(random.choice(Test))
    data = outputs + [web]
    random.shuffle(data)
    prefix = random.choice(delimeters).join(data)
    out = random.choice(delimeters).join([prefix])
    tld = random.choice(suffix)
    addr = '.'.join([out, tld])
    output.append(addr)
print(output)
>>> ['group-summary-web-signin.info', 'web-recovery-signin-verification.co.id', 'paymentupdatewebpayment.co.id', 'update-verification-update-web.id', 'find-group-web-portal.info']
Last time I got some help making a website name generator. I feel bad, but I'm stuck at the moment and need some help again to improve it. In my code there's a .txt file called Combined which includes these lines.
After that I created variables to add to the domain:
web = 'web'
suffix = 'co.id'
Then I write it out so that each output line is printed to Combined.txt:
output_count = 50
subdomain_count = 2
for i in range(output_count):
    out = []
    for j in range(subdomain_count):
        out.append(random.choice(Test))
    out.append(web)
    out.append(suffix)
    Example.write('.'.join(out)+"\n")
with open("dictionaries/examples.txt") as f:
    websamples = [line.rstrip() for line in f]
I want more variety in the output: instead of just login.download.web.co.id, there would also be names like login-download.web.co.id or login.download-web.co.id. In the code I used Example.write('.'.join(out)+"\n") so that . is the separator between the words. I was thinking of adding more by writing a similar line of code and saving the result to a different .txt file, but I feel like that would be too long. Is there a way to vary the separator between words, using - or _ instead of just . in the output?
Thanks!
Sure, just iterate through a list of delimiters and add each variant to the output.
web = 'web'
suffix = 'co.id'
output_count = 50
subdomain_count = 2
delimeters = ['_', '-', '.']
for i in range(output_count):
    out = []
    for j in range(subdomain_count):
        out.append(random.choice(Test))
    for delimeter in delimeters:
        addr = delimeter.join(out)
        addrs = '.'.join([addr, web, suffix])
        print(addrs)
        Example.write(addrs + '\n')
output
my_pay.web.co.id
my-pay.web.co.id
my.pay.web.co.id
pay_download.web.co.id
pay-download.web.co.id
pay.download.web.co.id
group_login.web.co.id
group-login.web.co.id
group.login.web.co.id
install_group.web.co.id
install-group.web.co.id
install.group.web.co.id
...
...
update
import itertools
Test = ['download', 'login', 'my', 'ip', 'site', 'ssl', 'pay', 'install']
delimeters = [ '-', '.']
web = 'web'
suffix = 'co.id'
output_count = 50
subdomain_count = 2
for combo in itertools.combinations(Test, 2):
    out = ''
    for i, d in enumerate(delimeters):
        out = d.join(combo)
        out = delimeters[i-1].join([out, web])
        addr = '.'.join([out, suffix])
        print(addr)
        # Example.write(addr+'\n')
output
download-login.web.co.id
download.login-web.co.id
download-my.web.co.id
download.my-web.co.id
download-ip.web.co.id
download.ip-web.co.id
download-site.web.co.id
download.site-web.co.id
download-ssl.web.co.id
download.ssl-web.co.id
download-pay.web.co.id
download.pay-web.co.id
download-install.web.co.id
download.install-web.co.id
login-my.web.co.id
login.my-web.co.id
login-ip.web.co.id
login.ip-web.co.id
login-site.web.co.id
login.site-web.co.id
login-ssl.web.co.id
login.ssl-web.co.id
login-pay.web.co.id
login.pay-web.co.id
login-install.web.co.id
login.install-web.co.id
my-ip.web.co.id
my.ip-web.co.id
my-site.web.co.id
my.site-web.co.id
my-ssl.web.co.id
my.ssl-web.co.id
my-pay.web.co.id
my.pay-web.co.id
my-install.web.co.id
my.install-web.co.id
ip-site.web.co.id
ip.site-web.co.id
ip-ssl.web.co.id
ip.ssl-web.co.id
ip-pay.web.co.id
ip.pay-web.co.id
ip-install.web.co.id
ip.install-web.co.id
site-ssl.web.co.id
site.ssl-web.co.id
site-pay.web.co.id
site.pay-web.co.id
site-install.web.co.id
site.install-web.co.id
ssl-pay.web.co.id
ssl.pay-web.co.id
ssl-install.web.co.id
ssl.install-web.co.id
pay-install.web.co.id
pay.install-web.co.id
As an alternative to replacing the final output, you could make the separator random:
import random
seperators = ['-', '_', '.']
Example.write(random.choice(seperators).join(out)+"\n")
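Note that the line above picks one separator for the whole name. If each gap should get its own separator (as in login.download-web), a sketch along these lines might help. The helper name join_with_random_separators is made up, the seed is there only to make the demo repeatable, and parts is assumed to be non-empty.

```python
import random


def join_with_random_separators(parts, separators, rng=random):
    """Join parts, picking a separator independently for every gap."""
    pieces = [parts[0]]
    for part in parts[1:]:
        pieces.append(rng.choice(separators))
        pieces.append(part)
    return ''.join(pieces)


rng = random.Random(42)  # seeded only so the demo is repeatable
out = ['login', 'download', 'web']
for _ in range(3):
    # The suffix is attached with a fixed dot, as in the question.
    print(join_with_random_separators(out, ['-', '.'], rng) + '.co.id')
```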
In order to ensure compliance with RFC 1035 I would suggest:
from random import choices as CHOICES, choice as CHOICE
output_count = 50
subdomain_count = 2
web = 'web'
suffix = 'co.id'
dotdash = '.-'
filename = 'output.txt'
Test = [
'auth',
'access',
'account',
'admin'
# etc
]
with open(filename, 'w') as output:
    for _ in range(output_count):
        sd = CHOICE(dotdash).join(CHOICES(Test, k=subdomain_count))
        print('.'.join((sd, web, suffix)), file=output)
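To see why the underscore is left out, here is a rough check against the preferred name syntax in RFC 1035 (section 2.3.1): each dot-separated label starts with a letter, ends with a letter or digit, contains only letters, digits and hyphens in between, and is at most 63 characters long. The helper name labels_ok is made up, and this sketch checks labels only, not the total name length.

```python
import re

# One label: a letter, then optionally letters/digits/hyphens ending in a letter or digit.
LABEL = re.compile(r'[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?')


def labels_ok(name):
    """Return True if every dot-separated label fits the preferred syntax."""
    return all(len(label) <= 63 and LABEL.fullmatch(label) is not None
               for label in name.split('.'))


print(labels_ok('auth-admin.web.co.id'))  # True
print(labels_ok('my_pay.web.co.id'))      # False: underscore is not allowed
print(labels_ok('-pay.web.co.id'))        # False: label starts with a hyphen
```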
I have a CSV file that I've filtered into a list and grouped. Example:
52713
['52713', '', 'Vmax', '', 'Start Value', '', '\n']
['52713', '', 'Vmax', '', 'ECNumber', '1.14.12.17', '\n']
['52713', 'O2', 'Km', 'M', 'Start Value', '3.5E-5', '\n']
['52713', 'O2', 'Km', 'M', 'ECNumber', '1.14.12.17', '\n']
52714
['52714', '', 'Vmax', '', 'Start Value', '', '\n']
['52714', '', 'Vmax', '', 'ECNumber', '1.14.12.17', '\n']
['52714', 'O2', 'Km', 'M', 'Start Value', '1.3E-5', '\n']
['52714', 'O2', 'Km', 'M', 'ECNumber', '1.14.12.17', '\n']
From this, I create a nested dictionary with the structure:
dict = ID number:{Km:n, Kcat:n, ECNumber:n}
...for every ID in the list.
I use the following code to create this dictionary
dict = {}
for key, items in groupby(FilteredTable1[1:], itemgetter(0)):
    #print key
    for subitem in items:
        #print subitem
        dict[subitem[EntryID]] = {}
        dict[subitem[EntryID]]['EC'] = []
        dict[subitem[EntryID]]['Km'] = []
        dict[subitem[EntryID]]['Kcat'] = []
        if 'ECNumber' in subitem:
            dict[subitem[EntryID]]['EC'] = subitem[value]
        if 'Km' in subitem and 'Start Value' in subitem:
            dict[subitem[EntryID]]['Km'] = subitem[value]
            #print subitem
This works for the ECNumber value, but not the Km value. It can print the line, showing that it identifies the Km value as being present, but doesn't put it in the dictionary.
Example output:
{'Km': [], 'EC': '1.14.12.17', 'Kcat': []}
Any ideas?
Ben
The problem is that your inner for loop keeps reinitializing dict[subitem[EntryID]] even though it may already exist. That's fixed in the following by explicitly checking to see if it's already there:
dict = {}
for key, items in groupby(FilteredTable1[1:], itemgetter(0)):
    #print key
    for subitem in items:
        #print '  ', subitem
        if subitem[EntryID] not in dict:
            dict[subitem[EntryID]] = {}
            dict[subitem[EntryID]]['EC'] = []
            dict[subitem[EntryID]]['Km'] = []
            dict[subitem[EntryID]]['Kcat'] = []
        if 'ECNumber' in subitem:
            dict[subitem[EntryID]]['EC'] = subitem[value]
        if 'Km' in subitem and 'Start Value' in subitem:
            dict[subitem[EntryID]]['Km'] = subitem[value]
            #print subitem
However, this code could be made more efficient by using something like the following instead, which avoids recomputing values and double dictionary lookups. It also avoids using the name of a built-in type (dict) as a variable name, which shadows the built-in and is generally discouraged. PEP 8, the Style Guide for Python Code, also suggests using CamelCase only for class names, not for variable names like FilteredTable1, but I didn't change that.
adict = {}
for key, items in groupby(FilteredTable1[1:], itemgetter(0)):
    #print key
    for subitem in items:
        #print '  ', subitem
        entry_id = subitem[EntryID]
        if entry_id not in adict:
            adict[entry_id] = {'EC': [], 'Km': [], 'Kcat': []}
        entry = adict[entry_id]
        if 'ECNumber' in subitem:
            entry['EC'] = subitem[value]
        if 'Km' in subitem and 'Start Value' in subitem:
            entry['Km'] = subitem[value]
            #print subitem
Actually, since you're building a dictionary of dictionaries, it's not clear that there's any advantage to using groupby to do so.
I'm posting this to follow-up and extend on my previous answer.
For starters, you could streamline the code a little further and eliminate the need to check for preexisting entries by simply making the dictionary a collections.defaultdict (a dict subclass) instead of a regular one:
from collections import defaultdict
adict = defaultdict(lambda: {'EC': [], 'Km': [], 'Kcat': []})
for key, items in groupby(FilteredTable1[1:], itemgetter(0)):
    for subitem in items:
        entry = adict[subitem[EntryID]]
        if 'ECNumber' in subitem:
            entry['EC'] = subitem[value]
        if 'Km' in subitem and 'Start Value' in subitem:
            entry['Km'] = subitem[value]
Secondly, as I mentioned in the other answer, I don't think you're gaining anything by using itertools.groupby() to do this, except making the process more complicated than needed. This is because what you're basically doing is making a dictionary of dictionaries whose entries can all be randomly accessed, so there's no benefit in going to the trouble of grouping them first. The code below illustrates this (in conjunction with using a defaultdict as shown above):
adict = defaultdict(lambda: {'EC': [], 'Km': [], 'Kcat': []})
for subitem in FilteredTable1[1:]:
    entry = adict[subitem[EntryID]]
    if 'ECNumber' in subitem:
        entry['EC'] = subitem[value]
    if 'Km' in subitem and 'Start Value' in subitem:
        entry['Km'] = subitem[value]
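For anyone who wants to try this, here is a self-contained version run on a few of the rows from the question. The column positions ENTRY_ID = 0 and VALUE = 5 are assumptions read off the sample rows, since the question doesn't show how EntryID and value are defined, and the sample list has no header row to skip.

```python
from collections import defaultdict

# Assumed column positions, inferred from the sample rows in the question.
ENTRY_ID, VALUE = 0, 5

rows = [
    ['52713', '', 'Vmax', '', 'Start Value', '', '\n'],
    ['52713', '', 'Vmax', '', 'ECNumber', '1.14.12.17', '\n'],
    ['52713', 'O2', 'Km', 'M', 'Start Value', '3.5E-5', '\n'],
    ['52713', 'O2', 'Km', 'M', 'ECNumber', '1.14.12.17', '\n'],
    ['52714', 'O2', 'Km', 'M', 'Start Value', '1.3E-5', '\n'],
]

# Each new ID gets its own fresh inner dictionary on first access.
adict = defaultdict(lambda: {'EC': [], 'Km': [], 'Kcat': []})
for row in rows:
    entry = adict[row[ENTRY_ID]]
    if 'ECNumber' in row:
        entry['EC'] = row[VALUE]
    if 'Km' in row and 'Start Value' in row:
        entry['Km'] = row[VALUE]

print(dict(adict))
```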
Given a list of dictionaries:
data = [{'id':'1234','name':'Jason','pw':'*sss*'},
{'id':'2345','name':'Tom','pw': ''},
{'id':'3456','name':'Art','pw': ''},
{'id':'2345','name':'Tom','pw':'*sss*'}]
I need to check that pw is always either '' or *sss*.
I tried doing this:
for d in data:
    if d['pw'] == ['*sss*' or '']:
        print "pw verified and it is '*sss*' or '' "
    else:
        print "pw is not any of two'*sss*' or ''"
Please help me complete this. I need to check that pw is always either '' or '*sss*'.
If possible, I need to do it in a single line.
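['*sss*' or ''] evaluates to ['*sss*'] because '' is falsy and '*sss*' is truthy.
In effect the expression inside your list reads as True or False, and or returns its first truthy operand (in this case, '*sss*'). Your condition therefore compares the string d['pw'] with the list ['*sss*'], which is never equal.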
You probably meant to do something like:
if d['pw'] in ['*sss*', '']:
Or even:
if d['pw'] == '*sss*' or d['pw'] == '':
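A quick sketch makes the evaluation order visible (the variable names here are made up for the demonstration):

```python
collapsed = ['*sss*' or '']      # `or` is evaluated first, leaving one element
print(collapsed)                 # ['*sss*']

pw = ''
print(pw == collapsed)           # False: a str never equals a list
print(pw in ['*sss*', ''])       # True: membership test against both values
print(pw == '*sss*' or pw == '') # True: two explicit comparisons
```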
As a one liner (kinda):
>>> for res in ("pw verified and it is '*sss*' or '' " if i['pw'] in ['*sss*', ''] else "pw is not any of two'*sss*' or ''" for i in data):
...     print res
...
pw verified and it is '*sss*' or ''
pw verified and it is '*sss*' or ''
pw verified and it is '*sss*' or ''
pw verified and it is '*sss*' or ''
Use set to do it in one single line.
ans = {d['pw'] for d in data}.issubset({'','*sss*'})
ans is True if every d['pw'] is '' or '*sss*', and False otherwise.
If you're looking for a one liner, use the all() function.
>>> data = [{'id':'1234','name':'Jason','pw':'*sss*'},
{'id':'2345','name':'Tom','pw': ''},
{'id':'3456','name':'Art','pw': ''},
{'id':'2345','name':'Tom','pw':'*sss*'}]
>>> all(elem['pw'] in ('', '*sss*') for elem in data)
True
For the if condition.
>>> "pw verified" if all(elem['pw'] in ('', '*sss*') for elem in data) else "pw not verified"
'pw verified'
I am trying to read data from a table in HTML. I read it periodically, and the table's length always changes, so I don't know it in advance. However, the table always has the same format, so I try to recognize a pattern and read the data based on its position.
The HTML is of the form:
<head>
<title>Some webside</title>
</head>
<body
<tr><td> There are some information coming here</td></tr>
<tbody><table>
<tr><td><a href="d?k=101">First</a></td><td>London</td><td>24</td><td>3</td><td>19:00</td><td align="center"></td></tr>
<tr bgcolor="#cccccc"><td><a href="d?k=102">Second</a></td><td>NewYork</td><td>24</td><td>4</td><td>20:13</td><td align="center"></td></tr>
<tr><td><a href="d?k=201">Some surprise</a></td><td>Swindon</td><td>25</td><td>5</td><td>20:29</td><td align="center"></td></tr>
<tr bgcolor="#cccccc"><td><a href="d?k=202">Third</a></td><td>Swindon</td><td>24</td><td>6</td><td>20:45</td><td align="center"></td></tr>
</tbody></table>
<tr><td> There are some information coming here</td></tr>
</body>
I convert html to a string and go over it to read the data but I want to read it only once. My code is:
def ReadTable(m):
    refList = []
    firstId = 1
    nextId = 2
    k = 1
    helper = 1
    while firstId != nextId:
        row = []
        helper = m.find('<td><a href="d?k=', helper) + 17
        end_helper = m.find('">', helper)
        rowId = m[helper : end_helper]
        if k == 1:  # to check if looped again
            firstId = rowId
        else:
            nextId = rowId
        row.append(rowId)
        helper = end_helper + 2
        end_helper = m.find('</a></td><td>', helper)
        rowPlace = m[helper : end_helper]
        row.append(rowPlace)
        helper = m.find('</a></td><td>', end_helper) + 13
        end_helper = m.find('</td><td>', helper)
        rowCity = m[helper : end_helper]
        row.append(rowCity)
        helper = end_helper + 9
        end_helper = m.find('</td><td>', helper)
        rowDay = m[helper : end_helper]
        row.append(rowDay)
        helper = end_helper + 9
        end_helper = m.find('</td><td>', helper)
        rowNumber = m[helper : end_helper]
        row.append(rowNumber)
        helper = end_helper + 9
        end_helper = m.find('</td>', helper)
        rowTime = m[helper : end_helper]
        row.append(rowTime)
        refList.append(row)
        k +=1
    return refList

if __name__ == '__main__':
    filePath = '/home/m/workspace/Tests/mainP.html'
    fileRead = open(filePath)
    myString = fileRead.read()
    print myString
    refList = ReadTable(myString)
    print 'Final List = %s' % refList
I expect the outcome as a list with 4 lists inside like that:
Final List = [['101', 'First', 'London', '24', '3', '19:00'], ['102', 'Second', 'NewYork', '24', '4', '20:13'], ['201', 'Some surprise', 'Swindon', '25', '5', '20:29'], ['202', 'Third', 'Swindon', '24', '6', '20:45']]
I expected that after the first pass the string would be read again, the firstId would be found again, and my while loop would terminate. Instead I get an infinite loop, and my list starts to look like this:
Final List = [['101', 'First', 'London', '24', '3', '19:00'], ['102', 'Second', 'NewYork', '24', '4', '20:13'], ['201', 'Some surprise', 'Swindon', '25', '5', '20:29'], ['202', 'Third', 'Swindon', '24', '6', '20:45'], ['me webside</title>\n</head>\n<body \n<tr><td> There are some information coming here</td></tr>\n<tbody><table>\n<tr><td><a href="d?k=101', 'First', 'London', '24', '3', '19:00'], ['102', 'Second', 'NewYork', '24', '4', '20:13']...
I don't understand why my helper starts to behave this way, and I can't figure out how a program like this should be written. Can you suggest a good/effective way to write it or to fix my loop?
I would suggest you invest some time in looking at lxml. It allows you to look at all of the tables in an HTML file and work with the sub-elements that make up each table (like rows and cells).
lxml is not hard to work with, and it allows you to feed in a string with
html.fromstring(somestring)
Further, there are a lot of lxml questions that have been asked and answered here on SO, so it is not too hard to find good examples to work from.
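The same row-and-cell idea can be sketched with the standard library's html.parser in place of lxml, so that it runs without third-party packages. This is Python 3, unlike the code in the question, and the TableRows class and the sample string are made up for illustration.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag == 'td' and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == 'td' and self._cell is not None:
            self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            self.rows.append(self._row)
            self._row = None


sample = ('<table>'
          '<tr><td>First</td><td>London</td><td>24</td><td>3</td><td>19:00</td></tr>'
          '<tr bgcolor="#cccccc"><td>Second</td><td>NewYork</td><td>24</td><td>4</td><td>20:13</td></tr>'
          '</table>')
parser = TableRows()
parser.feed(sample)
parser.close()
print(parser.rows)
```

Because the parser reports each row as it closes, the number of rows never has to be known in advance.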
You aren't checking the return value of find, which is -1 when it doesn't find a match.
http://docs.python.org/2/library/string.html#string.find
Return -1 on failure
I updated this section of the code and it now returns what you expect. The first and last rows below match what you have above, so you can find where the replacement goes.
row = []
helper = m.find('<td><a href="d?k=', helper)
if helper == -1:
    break
helper += 17
end_helper = m.find('">', helper)
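This also explains the odd output in the question: when find returns -1, adding 17 gives 16, so the next slice starts at index 16 of the file ('me webside</title>...') and the scan begins again from the top. A generic sketch of the same pattern, with a made-up helper name and sample string, checks every find result before using it:

```python
def find_all_between(text, start_marker, end_marker):
    """Return every substring found between start_marker and end_marker."""
    results = []
    pos = 0
    while True:
        start = text.find(start_marker, pos)
        if start == -1:          # no more matches: stop instead of wrapping around
            break
        start += len(start_marker)
        end = text.find(end_marker, start)
        if end == -1:            # opening marker without a closing one
            break
        results.append(text[start:end])
        pos = end + len(end_marker)
    return results


html = '<td><a href="d?k=101">First</a></td><td><a href="d?k=102">Second</a></td>'
print(find_all_between(html, '<td><a href="d?k=', '">'))  # ['101', '102']
```

Using len(start_marker) instead of a hard-coded 17 keeps the offset correct if the marker ever changes.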