Annotation tips - python

Okay, so I've annotated almost everything in my code, but I'm struggling slightly with annotating my for loop. I've got all of it except these two lines; I just don't know how to explain them in a way that makes sense to anyone but me. It would be great if I could get some tips on this!
y = {} #
Positions = [] #
for i, word in enumerate(Sentence): # This starts a for loop to go through every word within the variable 'Sentence', which is a list.
    if word not in y: # This continues to the code below if the word isn't found within the variable 'y'.
        y[word] = i + 1 # This gives the word that wasn't found within the variable 'y' its index plus 1, so that the numbering doesn't confuse those unfamiliar with computer science by starting at 0.
    Positions = Positions + [y[word]] # This sets the variable 'Positions' to the variables 'Positions' and '[y[word]]'.

If you're going to comment a variable, then the comment should explain what the variable contains (or, to be precise, since the purpose of the code is to populate these variables, what we intend the variable to contain) and/or what it's expected to be used for. Since we don't see this data used for anything, I'll stick to the former:
y = {} # dictionary mapping words to the (1-based) index of their first occurrence in the sentence
Positions = [] # list containing, for each position in the sentence, the (1-based) index of the first occurrence of the word at that position.

On the first line you are creating a dictionary:
y = {} #
and on the second a list:
Positions = [] #
Dictionaries store values under keys, so you look a value up by its key. Lists store elements in order, so you look an element up by its position.
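To make the difference concrete, here is a minimal sketch of the loop from the question run on a made-up sentence. The sample sentence and the lowercase names (first_seen in place of y, positions in place of Positions) are assumptions for illustration, not part of the original code.

```python
# Hypothetical input: the question only says 'Sentence' is a list of words.
sentence = "the cat saw the other cat".split()

first_seen = {}   # maps each word to the 1-based index of its first occurrence
positions = []    # for each word in the sentence, the 1-based index of its first occurrence

for i, word in enumerate(sentence):
    if word not in first_seen:
        first_seen[word] = i + 1          # look-up by key
    positions.append(first_seen[word])    # elements kept in order

print(first_seen)  # {'the': 1, 'cat': 2, 'saw': 3, 'other': 5}
print(positions)   # [1, 2, 3, 1, 5, 2]
```

The dictionary answers "where did this word first appear?", while the list records that answer once per position in the sentence.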

Related

Walking through a Directed Graph Python

I know this is not the place to ask for homework answers, but I just want some direction since I'm completely lost. I started a Python course at my college where the professor assumes no one has experience in the language. This is our first assignment:
Build on this code to create a subsequent loop that starts at W and takes a random walk of a particular maximum length specified as a variable. If there are multiple possible next states, choose one uniformly at random. If there are no next states, then you should stop the loop, even if you haven't reached your maximum length.
I understand that this code my professor provided is walking through each character of the string s, but I don't know how it's actually working. I don't get what c in enumerate(s) or next[c]=[] are doing. Any help explaining how this works or how to handle characters and strings in Python would be greatly appreciated. I've been coding in other languages for a few years and I have no idea how to even start this assignment.
next = {} # This will hold the directed graph
s = "Welcome to cs 477"
# This loops through all of the characters in s
# and keeps track of their indices
for i, c in enumerate(s):
    if not c in next:
        # If this is the first time seeing a particular
        # character, we need to make a new key/value pair for it
        next[c] = []
    if i < len(s)-1: # If there is a character after this
        # Record that s[i+1] is one of the following characters
        next[c].append(s[i+1])
enumerate(s) lets you iterate over a sequence (here, the string s), getting both each element and the position it's in. In your case, the variable i holds the index of the current element, starting from 0, and c holds the current character. If you were to print i and c inside your for loop, the result would look like this:
for i, c in enumerate(s):
    print(i, c)
# 0 W
# 1 e
# 2 l
# ...
Regarding next[c] = [], you first need to understand that next is a dictionary, which is basically a hash map from keys to values. What next[c] = [] does is add the character c to the dictionary as a key, with an empty list as its value.
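Once the dictionary is built, the random walk described in the assignment could look roughly like the sketch below. This is only one possible reading of the task: the function names build_graph and random_walk, the rng parameter, and the choice to count max_length in visited states (including the start) are all assumptions. The dictionary is also called graph here to avoid shadowing Python's built-in next.

```python
import random

def build_graph(s):
    """Same loop as the professor's code, wrapped in a function."""
    graph = {}
    for i, c in enumerate(s):
        if c not in graph:
            graph[c] = []
        if i < len(s) - 1:
            graph[c].append(s[i + 1])
    return graph

def random_walk(graph, start, max_length, rng=random):
    """Walk from start, visiting at most max_length states (assumed meaning)."""
    walk = [start]
    current = start
    while len(walk) < max_length:
        successors = graph.get(current, [])
        if not successors:
            break  # no next state: stop early
        # Picks uniformly among the recorded entries. The list may hold
        # duplicates, so use list(set(successors)) instead if the choice
        # should be uniform over distinct next states.
        current = rng.choice(successors)
        walk.append(current)
    return walk

graph = build_graph("Welcome to cs 477")
print("".join(random_walk(graph, "W", 10)))
```

Passing a seeded random.Random instance as rng makes a walk reproducible, which helps when debugging.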

Strange behavior of list append()

I am trying to identify, from a text file, the set of words that appear at least some number of times within any single line of the text file. I have a list to hold the qualifying words. The file is read line by line. For each line, the words that occur in the line and their counts are put into a dictionary. Words with a count higher than the threshold are appended to the list. The code operating on a single line looks as follows (I pseudocoded some parts that don't pertain to the problem):
words = []
candidates = {}
for line in text:
    for word in line:
        if word in candidates:
            candidates[word] += 1
        else:
            candidates[word] = 1
    for word in candidates:
        if candidates[word] > threshold:
            if word not in words:
                words.append(word)
    # candidates.clear()
At the end of each line, I was hoping to empty the dictionary so as not to carry useless content in it. However, the line that is now commented out with #, candidates.clear(), erases the content of the list and leaves only the qualifying words from the final line. When this line is removed, the output is correct.
Can someone please explain why this is happening? Does the append() method of the list class make a local copy of the data, or does it just keep a pointer? Does the dictionary's clear() method not only release the dict's references to the key/value pairs, but also release other objects' references to them?
#EDIT: to address some of the comments, the word extraction in each line is pseudocode. I did not think this step is relevant to the problem. If you guys are interested, here's the original code. https://github.com/muyezhu/python/blob/master/freqword
This code looks for frequently occurring short DNA piece in a long sequence. Sample data can be downloaded at this link: http://rosalind.info/problems/1d/
Trying your linked code with the linked dataset shows that you're only getting one set of updates to kmers because the outermost for loop only runs once.
This is due to the range call you're using: range(0, len(genome) - L + 1, L - k). In the example data, len(genome) is 100, L is 75 and k is 5. That means your range is range(0, 26, 70), which yields only 0 (the next value would be 70, which is much greater than the upper bound of 26).
I'm pretty sure you don't want to give the L - k step argument to range. If you change the loop code to use range(len(genome) - L + 1), you get the expected results in kmers: ['CGACA', 'GAAGA', 'AATGT'].
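Both points can be checked in isolation. The sketch below first reproduces the range arithmetic with the numbers quoted above, then shows that clearing a dictionary does not touch a list that already holds the same strings. The variable names and the single sample entry are made up for the demonstration.

```python
# Values taken from the sample data discussed above.
genome_length, L, k = 100, 75, 5

stepped = list(range(0, genome_length - L + 1, L - k))
every_start = list(range(genome_length - L + 1))
print(stepped)            # [0]
print(len(every_start))   # 26

# append() stores a reference to the string; clear() only empties the dict.
words = []
candidates = {"CGACA": 3}
for word in candidates:
    words.append(word)
candidates.clear()
print(candidates)  # {}
print(words)       # ['CGACA']
```

So the list is unaffected by clear(); the missing results come from the loop running only once.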

Python recursive function seems to lose some variable values

I have a 4x4 table of letters and I want to find all possible paths through it. They are candidates for being words. I have problems with the variable "used". It is a list of all the places the path has already visited, so that the path doesn't go there again. There should be one used list for every path, but it doesn't work correctly. For example, I had a test print that printed the current word and the used list. Sometimes the word had only one letter, but the path had gone through all 16 cells/indices.
The for loop over range(8) covers all possible directions. The main function executes the chase function 16 times, once for every possible starting point. The move function returns the index reached after moving in a specific direction, and is_allowed tests whether it is allowed to move in a certain direction.
sample input: oakaoastsniuttot. (4x4 table, where the first 4 letters are the first row, etc.)
sample output: all the real words in the table that can be found in some dictionary of words
In my case it might output one or two words but not nearly all of them, because it thinks some cells are used even though they are not.
def chase(current_place, used, word):
    used.append(current_place) # used is the list of indices that have been used
    word += letter_list[current_place]
    if len(word) >= 11:
        return 0
    for i in range(3, 9):
        if len(word) == i and word in right_list[i-3]: # right_list is the list of all words
            print(word)
            break
    for i in range(8):
        if is_allowed(current_place, i) and (move(current_place, i) not in used):
            chase(move(current_place, i), used, word)
The problem is that there's only one used list that gets passed around. You have two options for fixing this in chase():
1) Make a copy of used and work with that copy.
2) Before you return from the function, undo the append() that was done at the start.
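Both options can be sketched on a self-contained version of the search. Everything here other than the sample letters is an assumption: the neighbours helper stands in for the question's move and is_allowed, the functions collect every path string in a found list instead of checking a word list, and max_len is kept small so the demo finishes quickly.

```python
GRID_SIZE = 4
letters = "oakaoastsniuttot"  # sample input from the question

# The eight directions as (row, column) offsets.
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]

def neighbours(place):
    """Yield the indices of the cells adjacent to place (hypothetical helper)."""
    row, col = divmod(place, GRID_SIZE)
    for d_row, d_col in DIRECTIONS:
        r, c = row + d_row, col + d_col
        if 0 <= r < GRID_SIZE and 0 <= c < GRID_SIZE:
            yield r * GRID_SIZE + c

def chase_with_copy(place, used, word, max_len, found):
    used = used + [place]        # option 1: new list, caller's list untouched
    word += letters[place]
    found.append(word)
    if len(word) >= max_len:
        return
    for nxt in neighbours(place):
        if nxt not in used:
            chase_with_copy(nxt, used, word, max_len, found)

def chase_with_undo(place, used, word, max_len, found):
    used.append(place)
    word += letters[place]
    found.append(word)
    if len(word) < max_len:
        for nxt in neighbours(place):
            if nxt not in used:
                chase_with_undo(nxt, used, word, max_len, found)
    used.pop()                   # option 2: undo the append before returning

found_copy, found_undo, shared = [], [], []
chase_with_copy(0, [], "", 3, found_copy)
chase_with_undo(0, shared, "", 3, found_undo)
print(found_copy == found_undo, shared)
```

Copying is simpler to reason about; undoing avoids allocating a new list per call, at the cost of having to pop on every exit path.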

Splitting a python list into multiple lists

For example, if I have a list like:
one = [1,2,3]
What function or method can I use to split each element into its own separate list, like:
one = [1]
RANDOM_DYNAMIC_NAME = [2]
RANDOM_DYNAMIC_NAME_AGAIN = [3]
At any given time, the unsplit list called one may have more than one element; it's dynamic. This algorithm is needed for a hangman game I am coding as self-assigned homework.
The algorithm is needed for this example:
pick a word: mississippi
guess a letter: s
['_','_','s','s','_','s','s','_','_','_','_']
Here is my code:
http://pastebin.com/gcCZv67D
Looking at your code, if the part you're trying to solve is the comments in lines 24-26, you definitely don't need dynamically-created variables for that at all, and in fact I can't even imagine how they could help you.
You've got this:
enum = [i for i,x in enumerate(letterlist) if x == word]
The names of your variables are very confusing—something called word is the guessed letter, while you've got a different variable letterguess that's something else, and then a variable called letter that's the whole word… But I think I get what you're aiming for.
enum is a list of all of the indices of word within letterlist. For example, if letterlist is 'letter' and word is t, it will be [2, 3].
Then you do this:
bracketstrip = (str(w) for w in enum)
So now bracketstrip is a generator that yields '2' and '3'. I'm not sure why you want that.
z = int(''.join(bracketstrip))
And ''.join(bracketstrip) is '23', so z is 23.
letterguess[z] = word
And now you get an IndexError, because you're trying to set letterguess[23] instead of setting letterguess[2] and letterguess[3].
Here's what I think you want to replace that with:
enum = [i for i, x in enumerate(letterlist) if x == word]
for i in enum:
    letterguess[i] = word
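Put together with the example from the question, the idea might look like this. The function name reveal and its parameter names are hypothetical and chosen to be less confusing than the originals; this is a sketch, not a drop-in replacement for the pastebin code.

```python
def reveal(secret, guess, display):
    """Fill in every position of display where secret contains guess."""
    for i, letter in enumerate(secret):
        if letter == guess:
            display[i] = guess
    return display

secret = "mississippi"
display = ["_"] * len(secret)
print(reveal(secret, "s", display))
# ['_', '_', 's', 's', '_', 's', 's', '_', '_', '_', '_']
```

Note that no separate list of indices and no dynamically named variables are needed; enumerate supplies each index as the loop goes.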
A few hints about some other parts of your code:
You've got a few places where you do things like this:
letterlist = []
for eachcharacter in letter:
    letterlist.append(eachcharacter)
This is the same as letterlist = list(letter). But really, you don't need that list at all. The only thing you do with that is for i, x in enumerate(letterlist), and you could have done the exact same thing with letter in the first place. You're generally making things much harder for yourself than you have to. Make sure you actually understand why you've written each line of code.
"Because I couldn't get it to work any other way" isn't a reason—what were you trying to get to work? Why did you think you needed a list of letters? Nobody can keep all of those decisions in their head at once. The more skill you have, the more of your code will be so obvious to you that it doesn't need comments, but you'll never get to the point where you don't need any. When you're just starting out, every time you figure out how to do something, add a comment reminding yourself what you were trying to do, and why it works. You can always remove comments later; you can never get back comments that you didn't write.
For question one, a list comprehension is enough. It returns each element as a separate list:
[[x] for x in one]
As for a literal answer to your question, here's how you do it, though I can't imagine why you would want to do this. Generally, dynamic variable names are poor design. You probably just want a single list, or a list of lists.
import random
for x in one:
    name = 'x' + str(random.getrandbits(10))
    globals()[name] = [x]
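If the pieces really do need names, a dictionary gives the same effect without touching globals(). The sketch below shows both the list-of-lists form and a dictionary form; the "part_0"-style keys are an invented naming scheme for illustration only.

```python
one = [1, 2, 3]

# A list of one-element lists, addressed by position.
split = [[x] for x in one]
print(split)            # [[1], [2], [3]]

# A dictionary of one-element lists, addressed by a generated name.
named = {"part_%d" % i: [x] for i, x in enumerate(one)}
print(named["part_1"])  # [2]
```

Either container grows with the input, so the code does not need to know in advance how many elements one holds.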

I have two very simple beginner questions for Python

import random
wordlist = {'Candy', 'Monkey'}
level = 0
while level == 0:
    number = random.randint(1, 2)
    if number == 1:
        print('Candy')
        secword = 'Candy'
        level = 2
    elif number == 2:
        print('Monkey')
        secword = 'Monkey'
        level = 2
for i in secword:
    print(i)
I have a couple of questions about the code I just randomly wrote (I'm a beginner)
1) How do I assign a word in a list to a variable?
ex. assign the word 'Candy' into a variable because I always get the error (List is not callable)
2) How do I assign the variable i (in the for loop) to a separate variable for each letter?
Thanks! Tell me if it's not specific enough.
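It should be pointed out that wordlist is not actually a list, but a set. The difference is that a set is unordered and does not allow duplicate values, whereas a list keeps its elements in order and allows duplicates. A list is created using square brackets, as in ['Candy', 'Monkey'], and a set is created using curly brackets around its elements, as in {'Candy', 'Monkey'} (empty curly brackets, {}, create a dictionary, not a set).
This is important because you can't index a set. In other words, you can't get an element using wordlist[0]; that raises a TypeError ("'set' object does not support indexing" in Python 2, "'set' object is not subscriptable" in Python 3). So, before you try to get an element out of wordlist, make sure you actually declare it as a list: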
wordlist = ['Candy', 'Monkey']
I'm not sure what you're asking in your second question. Can you elaborate?
You are probably getting "List is not callable" because you are using parentheses, (). If you write wordlist(0), the interpreter treats wordlist as a function and 0 as its argument, and a list cannot be called. Use square brackets instead:
s = wordlist[0] # assigns 'Candy' to s
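A short sketch covering both questions, assuming wordlist has been changed to a list as suggested above. Reading the second question as "get each letter as its own item" is a guess, and the names first and letters are made up for the example.

```python
import random

wordlist = ['Candy', 'Monkey']

# Question 1: take a word out of the list and store it in a variable.
first = wordlist[0]                 # indexing with square brackets
secword = random.choice(wordlist)   # or pick one at random
print(first)                        # Candy

# Question 2 (one interpretation): keep the letters in a list
# rather than creating one variable per letter.
letters = list(secword)
for position, letter in enumerate(letters):
    print(position, letter)
```

random.choice replaces the while loop and the randint branches from the original code, since it picks an element of the list directly.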
