I am trying to use random.choice() to select an item from a dictionary, but I would like one of the items to be ignored entirely. For example:
from random import choice
mutationMarkers = {0: "original", 1: "point_mutation", 2: "frameshift_insertion",
                   3: "frameshift_deletion", 4: "copy_number_addition",
                   5: "copy_number_subtraction"}
mutator = choice(list(mutationMarkers)) # output is one of 0, 1, 2, 3, 4, 5
Is it possible to use random.choice and ignore {0: "original"}?
You can use a list comprehension:
mutator = choice([x for x in mutationMarkers if x != 0])
An alternative solution using set:
mutator = choice(tuple(mutationMarkers.keys() - {0}))
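For reference, here is a self-contained sketch of the list-comprehension approach, assuming the goal is to pick a random key other than 0 and then look up its label. The names candidates and label are illustrative only and do not appear in the question.

```python
from random import choice

mutationMarkers = {0: "original", 1: "point_mutation", 2: "frameshift_insertion",
                   3: "frameshift_deletion", 4: "copy_number_addition",
                   5: "copy_number_subtraction"}

# Keys eligible for selection: every key except 0 ("original").
candidates = [key for key in mutationMarkers if key != 0]

# Pick one of the remaining keys and look up its label.
mutator = choice(candidates)
label = mutationMarkers[mutator]
print(mutator, label)
```

Building the candidate list once and reusing it avoids re-filtering the dictionary on every draw.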
Sorry if this is a noob question; I wasn't able to find a solution online (maybe I just don't know what to search for).
How do I return the "found" dictionary from this recursive function?
(I am only able to return the nth number.)
Note: simply returning found at the end does not work, for multiple reasons.
# Nth Fibonacci number generator
def nth_Rfib(n, found = {0:1, 1:1}):
    if n in found:
        return found[n]
    else:
        found[n] = nth_Rfib(n-1, found) + nth_Rfib(n-2, found)
        #print(found)
        return found[n] # return found ** Doesn't Work **
print(nth_Rfib(5)) # 8
# instead, it should return: {0: 1, 1: 1, 2: 2, 3: 3, 4: 5, 5: 8}
Thank you.
In both cases, you need to return found. But since your function then returns a dictionary, you need to index into it to get the value you need when you call it recursively:
def nth_Rfib(n, found = {0:1, 1:1}):
    if n in found:
        return found
    else:
        found[n] = nth_Rfib(n-1, found)[n-1] + nth_Rfib(n-2, found)[n-2]
        return found
print(nth_Rfib(5))
this returns:
{0: 1, 1: 1, 2: 2, 3: 3, 4: 5, 5: 8}
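Note a possible issue with mutable default arguments like your found = {0:1, 1:1}: the default dictionary is created once, when the function is defined, and is shared by every call that does not pass its own. For example: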
>>> print(nth_Rfib(3))
{0: 1, 1: 1, 2: 2, 3: 3}
>>> print(nth_Rfib(5))
{0: 1, 1: 1, 2: 2, 3: 3, 4: 5, 5: 8}
>>> print(nth_Rfib(3))
{0: 1, 1: 1, 2: 2, 3: 3, 4: 5, 5: 8}
nth_Rfib(3) after nth_Rfib(5) returns the same dictionary, because you never reset it to the default {0:1, 1:1}.
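One common way around this is to default the argument to None and build a fresh dictionary inside the function. The following is a minimal sketch of that idiom, not the code from the question; the name nth_rfib_fresh is made up for illustration, and it keeps the question's seed values {0: 1, 1: 1}.

```python
def nth_rfib_fresh(n, found=None):
    # Create a new cache on the outermost call instead of sharing
    # one default dict across all calls.
    if found is None:
        found = {0: 1, 1: 1}
    if n not in found:
        # Fill in the two smaller entries first, then combine them.
        nth_rfib_fresh(n - 1, found)
        nth_rfib_fresh(n - 2, found)
        found[n] = found[n - 1] + found[n - 2]
    return found

print(nth_rfib_fresh(5))  # {0: 1, 1: 1, 2: 2, 3: 3, 4: 5, 5: 8}
print(nth_rfib_fresh(3))  # {0: 1, 1: 1, 2: 2, 3: 3}
```

Because each top-level call starts from its own dictionary, calling with 3 after 5 no longer returns the larger cache.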
You need a function that returns a number, so that the recursive expression
found[n] = fib(n-1) + fib(n-2)
can make sense; and you also need a function that returns a dictionary, since that's what you want to return, ultimately.
Hence it makes sense to define two distinct functions, one that returns a number, and one that returns a dict.
def nth_Rfib(n):
    found = {0: 0, 1: 1}
    def fib(n):
        if n not in found:
            found[n] = fib(n-1) + fib(n-2)
        return found[n]
    fib(n)
    return found
This makes found a variable which is local to nth_Rfib, but acts like a global variable during the recursive calls of fib.
It also completely eliminates any oddities of mutable default arguments.
>>> nth_Rfib(10)
{0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 5, 6: 8, 7: 13, 8: 21, 9: 34, 10: 55}
>>> nth_Rfib(3)
{0: 0, 1: 1, 2: 1, 3: 2}
Problem
Given a sequence (list or numpy array) of 1's and 0's how can I find the number of contiguous sub-sequences of values? I want to return a JSON-like dictionary of dictionaries.
Example
[0, 0, 1, 1, 0, 1, 1, 1, 0, 0] would return
{
    0: {
        1: 1,
        2: 2
    },
    1: {
        2: 1,
        3: 1
    }
}
Tried
This is the function I have so far
def foo(arr):
    prev = arr[0]
    count = 1
    lengths = dict.fromkeys(arr, {})
    for i in arr[1:]:
        if i == prev:
            count += 1
        else:
            if count in lengths[prev].keys():
                lengths[prev][count] += 1
            else:
                lengths[prev][count] = 1
            prev = i
            count = 1
    return lengths
It is outputting identical dictionaries for 0 and 1 even if their appearance in the list is different. And this function isn't picking up the last value. How can I improve and fix it? Also, does numpy offer any quicker ways to solve my problem if my data is in a numpy array? (maybe using np.where(...))
You're suffering from Ye Olde Replication Error. Let's instrument your function to show the problem, adding one line to check the object ID of each dict stored in lengths:
lengths = dict.fromkeys(arr, {})
print(id(lengths[0]), id(lengths[1]))
Output:
140130522360928 140130522360928
{0: {2: 2, 1: 1, 3: 1}, 1: {2: 2, 1: 1, 3: 1}}
The problem is that you gave the same dict as initial value for each key. When you update either of them, you're changing the one object to which they both refer.
Replace it with an explicit loop -- not a mutable function argument -- that will create a new object for each dict entry:
for key in lengths:
    lengths[key] = {}
print(id(lengths[0]), id(lengths[1]))
Output:
139872021765576 139872021765288
{0: {2: 1, 1: 1}, 1: {2: 1, 3: 1}}
Now you have separate objects.
If you want a one-liner, use a dict comprehension:
lengths = {key: {} for key in lengths}
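The question also mentions that the final run is never counted; in the original loop a run is only recorded when the value changes, so the last one is dropped. One possible way to handle both issues at once is itertools.groupby, which yields each run of equal consecutive values. This is a sketch for plain Python lists, and the function name run_length_counts is made up for illustration.

```python
from itertools import groupby

def run_length_counts(arr):
    # Maps each value to a dict of {run length: number of runs of that length}.
    lengths = {}
    for value, run in groupby(arr):
        run_len = sum(1 for _ in run)  # length of this run of equal values
        per_value = lengths.setdefault(value, {})  # a fresh dict per value
        per_value[run_len] = per_value.get(run_len, 0) + 1
    return lengths

print(run_length_counts([0, 0, 1, 1, 0, 1, 1, 1, 0, 0]))
# {0: {2: 2, 1: 1}, 1: {2: 1, 3: 1}}
```

Since groupby emits the last run like any other, no special handling is needed after the loop.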
We have to return the frequency of the length of words in a .txt file.
E.g. "My name is Emily" will be converted to a list: ["My", "name", "is", "Emily"], which I convert to a list of the lengths of each word: [2, 4, 2, 5]. I then use Counter, which outputs a dictionary that looks like:
Counter({2: 2, 4: 1, 5: 1})
But I need it to include lengths with a count of zero:
Counter({1: 0, 2: 2, 3: 0, 4: 1, 5: 1})
Any ideas?
Should I get rid of the Counter function altogether?
Counter only counts the frequency of items, which means that it stores counts only for items that are present.
But if the item you are looking for is not in the Counter object, looking it up returns 0 by default.
For example,
from collections import Counter
print(Counter()[1])
# 0
If you really need the items with zero count in it, then you can create a normal dictionary out of a Counter, like this
c = Counter({2: 2, 4: 1, 5: 1})
print({num: c[num] for num in range(1, max(c) + 1)})
# {1: 0, 2: 2, 3: 0, 4: 1, 5: 1}
With the Counter class from the collections module, the zero counts are implicit:
import collections
txt = "My name is Emily"
d = collections.Counter([len(x) for x in txt.split()])
The variable d contains the information you mentioned, without an entry for 1:
Counter({2: 2, 4: 1, 5: 1})
but d[1] still returns 0.
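Putting the two answers together, an end-to-end sketch might look like the following. It assumes the text has already been read into a string and that words are separated by whitespace; the function name word_length_frequencies is made up for illustration.

```python
from collections import Counter

def word_length_frequencies(text):
    # Count how many words have each length.
    counts = Counter(len(word) for word in text.split())
    if not counts:
        return {}
    # Fill in every length from 1 up to the longest word;
    # lengths that never occur look up as 0 in the Counter.
    return {n: counts[n] for n in range(1, max(counts) + 1)}

print(word_length_frequencies("My name is Emily"))
# {1: 0, 2: 2, 3: 0, 4: 1, 5: 1}
```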
One can create dictionaries using generators (PEP-289):
dict((h,h*2) for h in range(5))
#{0: 0, 1: 2, 2: 4, 3: 6, 4: 8}
Is it syntactically possible to add some extra key-value pairs in the same dict() call? The following syntax is incorrect but better explains my question:
dict((h,h*2) for h in range(5), {'foo':'bar'})
#SyntaxError: Generator expression must be parenthesized if not sole argument
In other words, is it possible to build the following in a single dict() call:
{0: 0, 1: 2, 2: 4, 3: 6, 4: 8, 'foo': 'bar' }
Constructor:
dict(iterableOfKeyValuePairs, **dictOfKeyValuePairs)
Example:
>>> dict(((h,h*2) for h in range(5)), foo='foo', **{'bar':'bar'})
{0: 0, 1: 2, 2: 4, 3: 6, 4: 8, 'foo': 'foo', 'bar': 'bar'}
(Note that you will need to parenthesize generator expressions if not the sole argument.)
dict([(h,h*2) for h in range(5)] + [(h,h2) for h,h2 in {'foo':'bar'}.items()])
You could use itertools.chain (see Concatenate generator and item) to add your extra stuff into your call to dict().
It's probably clearer to do it the easy way, though: one call to dict and then add the extra items in explicitly.
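As a sketch of the itertools.chain suggestion, the generator of pairs and the extra dictionary's items can be chained into a single iterable and passed to one dict() call. The variable names pairs, extra and result are illustrative only.

```python
from itertools import chain

# Generator of (key, value) pairs, as in the question.
pairs = ((h, h * 2) for h in range(5))
# Extra key-value pairs to merge in.
extra = {'foo': 'bar'}

# chain() yields the generated pairs first, then the extra items.
result = dict(chain(pairs, extra.items()))
print(result)
# {0: 0, 1: 2, 2: 4, 3: 6, 4: 8, 'foo': 'bar'}
```

Unlike the keyword-argument form, this also works when the extra keys are not strings.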
I'm working through this thing on pyschools and it has me mystified.
Here's the code:
def convertVector(numbers):
    totes = []
    for i in numbers:
        if i != 0:
            totes.append((numbers.index(i), i))
    return dict((totes))
It's supposed to take a 'sparse vector' as input (ex: [1, 0, 1 , 0, 2, 0, 1, 0, 0, 1, 0])
and return a dict mapping the index of each non-zero entry to its value.
So a dict with 0:1, 2:1, etc., where each key is the index of a non-zero item in the list and each value is that item.
So for the example input it wants this: {0: 1, 9: 1, 2: 1, 4: 2, 6: 1}
but instead it gives me this: {0: 1, 4: 2}. Before it's turned into a dict, the list looks like this:
[(0, 1), (0, 1), (4, 2), (0, 1), (0, 1)]
My plan is for i to iterate through numbers, create a tuple of that number and its index, and then turn that into a dict. The code seems straightforward, but I'm at a loss.
It just looks to me like numbers.index(i) is not returning the index, but instead returning some other, unexpected number.
Is my understanding of index() defective? Are there known index issues?
Any ideas?
index() only returns the first:
>>> a = [1,2,3,3]
>>> help(a.index)
Help on built-in function index:
index(...)
L.index(value, [start, [stop]]) -> integer -- return first index of value.
Raises ValueError if the value is not present.
If you want both the number and the index, you can take advantage of enumerate:
>>> for i, n in enumerate([10,5,30]):
...     print(i, n)
...
0 10
1 5
2 30
and modify your code appropriately:
def convertVector(numbers):
    totes = []
    for i, number in enumerate(numbers):
        if number != 0:
            totes.append((i, number))
    return dict((totes))
which produces
>>> convertVector([1, 0, 1 , 0, 2, 0, 1, 0, 0, 1, 0])
{0: 1, 9: 1, 2: 1, 4: 2, 6: 1}
[Although, as someone pointed out (I can't find the comment now), it would be easier to write totes = {} and assign to it directly using totes[i] = number than to go via a list.]
What you're trying to do can be done in one line:
>>> dict((index,num) for index,num in enumerate(numbers) if num != 0)
{0: 1, 2: 1, 4: 2, 6: 1, 9: 1}
Yes, your understanding of list.index is incorrect. It finds the position of the first item in the list that compares equal to the argument.
To get the index of the current item, iterate with enumerate:
for index, item in enumerate(iterable):
    # blah blah
The problem is that .index() looks for the first occurrence of its argument. So for your example it always returns 0 if you run it with argument 1.
You could make use of the built-in enumerate function like this:
for index, value in enumerate(numbers):
    if value != 0:
        totes.append((index, value))
Check the documentation for index:
Return the index in the list of the first item whose value is x. It is
an error if there is no such item.
According to this definition, the following code appends, for each non-zero value in numbers, a tuple made of the first position of this value in the whole list and the value itself.
totes = []
for i in numbers:
    if i != 0:
        totes.append((numbers.index(i), i))
The result in the totes list is correct: [(0, 1), (0, 1), (4, 2), (0, 1), (0, 1)].
When turning it into a dict, again, the result is correct, since for each possible value, you get the position of its first occurrence in the original list.
You would get the result you want using i as the index instead:
result = {}
for i in range(len(numbers)):
    if numbers[i] != 0:
        result[i] = numbers[i]
index() returns the index of the first occurrence of the item in the list. Your list has duplicates, which is the cause of your confusion. So numbers.index(1) will always return 0. You can't expect it to know which of the many instances of 1 you are looking for.
I would write it like this:
totes = {}
for i, num in enumerate(numbers):
    if num != 0:
        totes[i] = num
and avoid the intermediate list altogether.
Riffing on @DSM:
def convertVector(numbers):
    return dict((i, number) for i, number in enumerate(numbers) if number)
Or, on re-reading, as @Rik Poggi actually suggests.