Why is a list needed for random.choice

This is probably a very straightforward question, but I would love a simple explanation of why.
The code below requires a list in order to obtain a random card.
import random
card = random.choice(["hearts", "clubs", "frogs"])
I am puzzled as to why it requires a list, and why I cannot do this instead:
import random
card = random.choice("hearts", "clubs", "frogs")
I'm fine with not being able to do it; I would just like to know why.

Because of Murphy's law: anything that can be done the wrong way will be done the wrong way by someone, some day. Your suggested API would require
random.choice(*lst)
when the values to choose from are in the list (or other sequence) lst. When someone writes
random.choice(lst)
instead, they would always get lst back instead of an exception. The Python principle that "explicit is better than implicit" then dictates that we have to type a few extra characters.
(Admittedly, the result of random.choice("foobar") pointed out by others may be surprising to a beginner, but once you get used to the language you'll appreciate the way that works.)

The issue is that you're calling random.choice with 3 parameters, not a single parameter with 3 elements. Try random.choice(('one', 'two', 'three')) for instance.
Any sequence with a length and a suitable __getitem__ (for indexing) will do, since it picks a number between 0 and len(something) - 1 to choose the element.
So you could use a tuple instead if you so wanted.
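To illustrate the point about length and __getitem__, here is a minimal sketch with a made-up Deck class (the class name and its contents are assumptions for illustration, not part of the random module). Because it supports len() and integer indexing, random.choice accepts it:

```python
import random

class Deck:
    """Hypothetical container: just enough of the sequence protocol for random.choice."""
    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        return len(self._cards)

    def __getitem__(self, index):
        return self._cards[index]

deck = Deck(["hearts", "clubs", "frogs"])
card = random.choice(deck)   # picks an index below len(deck) and returns deck[index]
print(card)
```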

Because the first snippet,
["hearts", "clubs", "frogs"]
passes only one argument to the function (a list), while the second one passes three strings. The choice function takes only a single argument, so you have to pass a list, or anything else that can be indexed, so that it can pick a random index and return the value at that position.

random.choice will work for any sequence that supports indexing.
>>> random.choice("foobar") #string
'o'
>>> random.choice(("foo","bar","spam")) #tuple
'spam'
>>> random.choice(["foo","bar","spam"]) #list
'spam'
Will not work for sets:
>>> random.choice({"foo","bar","spam"})
Traceback (most recent call last):
File "<ipython-input-313-e97c3088a7ef>", line 1, in <module>
random.choice({"foo","bar","spam"})
File "/usr/lib/python2.7/random.py", line 274, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
TypeError: 'set' object does not support indexing
In random.choice("hearts", "clubs", "frogs") you actually passed three arguments to choice, while random.choice expects only one, and that one must support indexing.
But random.choice can work for a dict if the dict's keys are the integers 0 to len(dict)-1, as internally it does something like this:
dic[int(random() * len(dic))]
Example:
>>> dic = dict(zip([1, 2, 3, 4, 5, 6], "abcdef"))
>>> random.choice(dic)
'b'
>>> random.choice(dic)
'd'
>>> random.choice(dic)
'd'
>>> random.choice(dic) #fails as 0 was not found in dic
Traceback (most recent call last):
File "<ipython-input-366-5cfa0e5f2911>", line 1, in <module>
random.choice(dic)
File "/usr/lib/python2.7/random.py", line 274, in choice
return seq[int(self.random() * len(seq))] # raises IndexError if seq is empty
KeyError: 0
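If the values live in a set or a dict, a common workaround is to build a sequence first. A minimal sketch (the variable names are only for illustration):

```python
import random

suits = {"hearts", "clubs", "frogs"}
# A set has no indexing, so convert it to a tuple (or list) first.
card = random.choice(tuple(suits))

dic = dict(zip([1, 2, 3, 4, 5, 6], "abcdef"))
# Choose among the values (or keys) explicitly instead of relying on integer keys.
letter = random.choice(list(dic.values()))
key = random.choice(list(dic))

print(card, letter, key)
```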

There are several good answers above about why random.choice is implemented as it is, and why it's actually what you probably want.
You can wrap it yourself easily enough, if you want to be able to call choice with arbitrary numbers of arguments:
import random
def random_choice_of_arbitrary_args(*args):
    return random.choice(args)
Of course you would probably name it something more concise.
This does have the following surprising behavior:
>>> random_choice_of_arbitrary_args([1, 2, 3])
[1, 2, 3]
Which is because you're ultimately telling random.choice to give you a random element of a sequence with one element. So a better implementation might be:
import random
def my_choice(*args):
    if len(args) == 1:
        return random.choice(args[0])
    else:
        return random.choice(args)

Have a look at the implementation. It picks a random index from 0 to len(input_sequence) - 1 and then uses that index to choose an item.
Perhaps a better answer is that the documentation says the input has to be a sequence.

Put simply, because that's not how the random.choice function was defined. My guess is that the decision was made because accepting a single sequence argument is a cleaner practice than accepting a variable number of arbitrary arguments.
Therefore defining the function like this:
# This is (approximately) how it's actually written
def choice(seq):
    ...  # choose from seq
is cleaner than this:
# This is kind of ugly
def choice(*args):
    ...  # make sure args is not empty, then choose from among them
The way it is actually written allows you to pass it a single sequence of any kind, which is both clean and convenient. If it were defined the second way, it would be easy to screw up when using it: someone could call random.choice([1, 2, 3]) and they would always get back [1, 2, 3]!

Related

What is the most optimal method for "slicing" path objects, specifically the iterdir() function? [duplicate]

I would like to loop over a "slice" of an iterator. I'm not sure if this is possible as I understand that it is not possible to slice an iterator. What I would like to do is this:
def f():
    for i in range(100):
        yield i
x = f()
for i in x[95:]:
    print(i)
This of course fails with:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-37-15f166d16ed2> in <module>()
4 x = f()
5
----> 6 for i in x[95:]:
7 print(i)
TypeError: 'generator' object is not subscriptable
Is there a pythonic way to loop through a "slice" of a generator?
Basically the generator I'm actually concerned with reads a very large file and performs some operations on it line by line. I would like to test slices of the file to make sure that things are performing as expected, but it is very time consuming to let it run over the entire file.
Edit:
As mentioned, I need to do this on a file. I was hoping that there was a way of specifying this explicitly with the generator, for instance:
import itertools
import skbio
f = 'seqs.fna'
seqs = skbio.io.read(f, format='fasta')
# seqs is a generator object
for seq in itertools.islice(seqs, 30516420, 30516432):
    # do a bunch of stuff here
    pass
The above code does what I need, but it is still very slow because the generator still loops through all of the preceding lines. I was hoping to loop over only the specified slice.

In general, the answer is itertools.islice, but you should note that islice doesn't, and can't, actually skip values. It just grabs and throws away start values before it starts yield-ing values. So it's usually best to avoid islice if possible when you need to skip a lot of values and/or the values being skipped are expensive to acquire/compute. If you can find a way to not generate the values in the first place, do so. In your (obviously contrived) example, you'd just adjust the start index for the range object.
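A small sketch of that caveat: the counter below (a made-up helper, not part of itertools) shows that islice still pulls every skipped value out of the generator, whereas adjusting the range start avoids generating them at all.

```python
import itertools

stats = {"produced": 0}  # counts how many values the generator actually computed

def numbers(n):
    for i in range(n):
        stats["produced"] += 1
        yield i

# islice discards the first 95 values, but they are still generated.
tail = list(itertools.islice(numbers(100), 95, None))
print(tail)                # [95, 96, 97, 98, 99]
print(stats["produced"])   # 100

# When you control the source, start where you need to instead.
direct = list(range(95, 100))
print(direct == tail)      # True
```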
In the specific case of running on a file object, pulling a huge number of lines (particularly when reading from a slow medium) may not be ideal. Assuming you don't need specific lines, one trick you can use to avoid actually reading huge blocks of the file, while still testing some distance into the file, is to seek to a guessed offset, read out to the end of the line (to discard the partial line you probably seeked into the middle of), then islice off however many lines you want from that point. For example:
import itertools
with open('myhugefile') as f:
    # Assuming roughly 80 characters per line, this seeks to somewhere roughly
    # around the 100,000th line without reading in the data preceding it
    f.seek(80 * 100000)
    next(f)  # Throw away the partial line you probably landed in the middle of
    for line in itertools.islice(f, 100):  # Process 100 lines
        pass  # Do stuff with each line
For the specific case of files, you might also want to look at mmap, which can be used in similar ways (and is especially useful if you're processing blocks of data rather than lines of text, possibly randomly jumping around as you go).
Update: From your updated question, you'll need to look at your API docs and/or data format to figure out exactly how to skip around properly. It looks like skbio offers some features for skipping using seq_num, but that's still going to read, if not process, most of the file. If the data was written out with equal sequence lengths, I'd look at the docs on Alignment; aligned data may be loadable without processing the preceding data at all, e.g. by using Alignment.subalignment to create new Alignments that skip the rest of the data for you.

islice is the Pythonic way:
from itertools import islice
g = (i for i in range(100))
for num in islice(g, 95, None):
    print(num)

You can't slice a generator object or iterator using normal slice operations.
Instead you need to use itertools.islice, as @jonrsharpe already mentioned in his comment.
import itertools
for i in itertools.islice(x, 95, None):
    print(i)
Also note that islice returns an iterator and consumes data from the underlying iterator or generator. So you will need to convert your data to a list, or create a new generator object, if you need to go back and do something, or use the little-known itertools.tee to create a copy of your generator.
from itertools import tee
first, second = tee(f())
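A short sketch of how tee behaves, reusing the f() generator from the question. Keep in mind that tee buffers whatever one copy has consumed and the other has not, so it can use a lot of memory on large inputs, and the original generator should not be used directly after calling tee.

```python
from itertools import islice, tee

def f():
    for i in range(100):
        yield i

first, second = tee(f())

# Consuming a slice from one copy does not advance the other copy.
tail = list(islice(first, 95, None))
head = next(second)
print(tail)   # [95, 96, 97, 98, 99]
print(head)   # 0
```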

Let's clarify something first.
Suppose you want to extract the first values from your generator, based on the number of targets you specified on the left of the assignment. From this point we have a problem, because in Python there are two ways to unpack something.
Let's discuss these alternatives using the following example. Imagine you have the following list l = [1, 2, 3]
1) The first alternative is to NOT use the "star" expression
a, b, c = l # a=1, b=2, c=3
This works great if the number of targets on the left of the assignment (in this case, 3) is equal to the number of elements in the list.
But if you try something like this
a, b = l # ValueError: too many values to unpack (expected 2)
it fails, because the list contains more elements than there are targets on the left of the assignment.
2) The second alternative is to use the "star" expression; this solves the previous error
a, b, *c = l # a=1, b=2, c=[3]
The "star" target acts like a buffer list.
The buffer can end up in one of three states:
a, b, *c = [1, 2] # a=1, b=2, c=[]
a, b, *c = [1, 2, 3] # a=1, b=2, c=[3]
a, b, *c = [1, 2, 3, 4, 5] # a=1, b=2, c=[3,4,5]
Note that the list must contain at least 2 values (in the above example). If not, an error will be raised.
Now, on to your problem. If you try something like this:
a, b, c = generator
This will work only if the generator yields exactly three values (the number of values must match the number of targets on the left). Otherwise, an error will be raised.
If you try something like this:
a, b, *c = generator
If the number of values in the generator is lower than 2, an error will be raised because variables "a" and "b" must have a value.
If the number of values in the generator is 3, then a=val_1, b=val_2, c=[val_3]
If the number of values in the generator is greater than 3, then a=val_1, b=val_2, c=[val_3, ... ]
In this case, if the generator is infinite, the program will hang trying to consume the generator.
What I propose for you is the following solution:
# Create a dummy generator for this example
def my_generator():
    i = 0
    while i < 2:
        yield i
        i += 1
# Our Generator Unpacker
class GeneratorUnpacker:
    def __init__(self, generator):
        self.generator = generator
    def __iter__(self):
        return self
    def __next__(self):
        try:
            return next(self.generator)
        except StopIteration:
            return None  # When the generator ends, we return None as the value
if __name__ == '__main__':
    dummy_generator = my_generator()
    g = GeneratorUnpacker(dummy_generator)
    a, b, c = next(g), next(g), next(g)
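An alternative sketch of the same idea using only itertools: pad the generator with None and take exactly as many values as there are targets. The helper name take_padded is made up for this example. Because islice stops after n items, this also works with infinite generators.

```python
from itertools import chain, count, islice, repeat

def take_padded(iterable, n, fill=None):
    """Return the first n items of iterable, padding with fill if it runs short."""
    return list(islice(chain(iterable, repeat(fill)), n))

def my_generator():
    i = 0
    while i < 2:
        yield i
        i += 1

a, b, c = take_padded(my_generator(), 3)
print(a, b, c)   # 0 1 None

x, y = take_padded(count(10), 2)   # count(10) never ends; islice stops after 2 items
print(x, y)      # 10 11
```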

Explain what would be the output

L=[1,2,3]
print(L[L[2]])
What will be the output?
I am a beginner to python and I am confused by this specific thing in List. I am not understanding what this means.

I'm going to explain with this list (because you will get an IndexError with the current list):
L = [1, 2, 3, 4, 5]
print(L[L[2]])
First, Python sees the print() function. In order to call this function, Python has to know its arguments, so it goes on to evaluate the argument L[L[2]].
L is a list, and we can put an index inside the brackets to get the item at that index. So we need to know the index. We go further and evaluate the expression inside the outer []. It's L[2].
Now we can easily evaluate the expression L[2]. The result is 3.
Take that result and put it in place of L[2]. Our full expression is print(L[3]) at the moment.
I think you get the idea now...
From inside to the outside:
step1 -- > print(L[L[2]])
step2 -- > print(L[3])
step3 -- > print(4)

Here you first get the value at index 2, which is 3, because Python indexes start at 0 and go up. Once that inner expression is evaluated, you look up index 3, which is too big, because the indexes only go up to 2. If I had three numbers and you asked me what the fourth one is, that wouldn't make sense, right? This is the same thing, so it results in an error.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
This means that you are trying to get a value that does not exist. I hope this makes it clear.

L[L[2]] becomes L[3], since L[2] is equal to 3. Then, since there are three elements in L, and indexing starts at 0 in Python, this would result in an IndexError.
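The evaluation order can be checked by splitting the expression into steps. This sketch uses the original three-element list and catches the resulting error, then repeats the expression on a longer list:

```python
L = [1, 2, 3]

inner = L[2]          # the inner expression is evaluated first
print(inner)          # 3

try:
    print(L[inner])   # same as L[3]; valid indexes are 0, 1 and 2
    outcome = "ok"
except IndexError as exc:
    outcome = "IndexError"
    print("IndexError:", exc)

# With a longer list the same expression succeeds.
L2 = [1, 2, 3, 4, 5]
print(L2[L2[2]])      # 4
```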

Python set usage, returning a union, tradeoffs for type flexibility

I've been using sets quite a lot.
>>> s1
set(['a', 'b'])
Using the methods allows for type conversion, while the overloaded operators do not.
>>> s1.issubset('abc')
True
>>> s1 <= 'abc'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can only compare to a set
>>> s1 <= set('abc')
True
I want a function that returns the union of a set with some other collection of elements:
>>> s1 | set('bc') # returns the union without modifying either
set(['a', 'c', 'b'])
>>> s1.union('bc') # allows for type conversion.
set(['a', 'c', 'b'])
It seems my best options for a function that does this are:
def add_elements_strict(collector_set):
    do_stuff()
    return collector_set | more_elements()
or like this.
def add_elements_from_any_iterable(collector_set):
    do_stuff()
    return collector_set.union(more_elements())
Which would be the better choice? Clearly the first would give a TypeError if given anything but a set, but the second would give greater flexibility. My questions:
Do I gain anything from ensuring I'm always passing this function a set?
Is the flexibility of being able to pass any iterable worth it?

Do you want to be able to pass an arbitrary iterable? Are you going to pass anything other than a set? It's kind of tough to answer your question the way you've put it. The advantage of ensuring it's a set is you will get a loud warning if it isn't. Whether the flexibility is "worth it" depends on what else you're doing with the other things. If the things you're going to be unioning are sets that you need to manipulate as sets in their own right, might as well leave them as sets. If they're always going to be lists and/or tuples because you're using them as lists/tuples in other contexts, then maybe it makes sense to accept any iterable.
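To make the trade-off concrete, here is a small sketch (the function names are invented for illustration) contrasting the operator, which rejects non-sets, with the method, which accepts any iterable:

```python
def union_strict(collector_set, more):
    # The | operator requires both operands to be sets (or frozensets).
    return collector_set | more

def union_flexible(collector_set, more):
    # The union() method accepts any iterable of hashable items.
    return collector_set.union(more)

s1 = {"a", "b"}

print(union_flexible(s1, "bc") == {"a", "b", "c"})      # True
print(union_strict(s1, set("bc")) == {"a", "b", "c"})   # True

try:
    union_strict(s1, "bc")
    strict_failed = False
except TypeError:
    strict_failed = True   # loud failure when a non-set sneaks in
print(strict_failed)       # True
```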

You can use set.update for an in-place union. Note that it modifies the set and returns None rather than returning the union.
Quoting from the docs:
Update the set, adding elements from all others.
Example
>>> s1 = set('a')
>>> s1.update('ba')
>>> s1
set(['a', 'b'])
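A quick way to see the difference between the in-place update and union (a sketch in Python 3 syntax):

```python
s1 = set("a")

result = s1.update("ba")    # modifies s1 in place
print(result)               # None
print(s1 == {"a", "b"})     # True

s2 = set("a")
combined = s2.union("ba")   # returns a new set; s2 is unchanged
print(combined == {"a", "b"}, s2 == {"a"})   # True True
```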

Python Homework - creating a new list

The assignment:
Write a function called splitList(myList, option) that takes, as input, a list and an option, which is either 0 or 1. If the value of the option is 0, the function returns a list consisting of the elements in myList that are negative, and if the value of the option is 1, the function returns a list consisting of the elements in myList that are even.
I know how to determine if a number is even and if a number is negative. I'm struggling with how to return a new list of negative or even numbers based on "option"
This is what I've gotten so far:
def splitList(myList):
    newlist = []
    for i in range(len(myList)):
        if (myList[i]%2 == 0):
           newlist.append(myList [i])
    return newlist
This program gives the following error:
Traceback (most recent call last): File "<string>", line 1, in <fragment>
builtins.TypeError: splitList() takes exactly 1 positional argument (4 given)

As I mentioned in my comment, you should standardize your indentation: four spaces is Python standard. You can usually set your editor to insert four spaces instead of tabs (don't want to mix tabs with spaces, either).
As to your actual question: try writing three total functions: one that returns all the negative values, one that returns even values, and one that calls the appropriate function based on option.
def splitList(myList, option):
    if option == 1:
        return get_even_elements(myList)
    elif option == 0:
        return get_negative_elements(myList)
def get_even_elements(myList):
    pass  # Implement this function here.
def get_negative_elements(myList):
    pass  # Implement this function here.
# Test it out!
alist = [-1, 2, -8, 5]
print(splitList(alist, 0))
print(splitList(alist, 1))
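For reference, one possible way to fill in the two helpers, as a sketch using list comprehensions. The snake_case names and the ValueError for other option values are my own choices, not part of the assignment.

```python
def get_even_elements(my_list):
    # Keep the elements divisible by 2 (this includes negative even numbers).
    return [x for x in my_list if x % 2 == 0]

def get_negative_elements(my_list):
    # Keep the elements strictly below zero.
    return [x for x in my_list if x < 0]

def split_list(my_list, option):
    if option == 0:
        return get_negative_elements(my_list)
    elif option == 1:
        return get_even_elements(my_list)
    raise ValueError("option must be 0 or 1")

alist = [-1, 2, -8, 5]
print(split_list(alist, 0))   # [-1, -8]
print(split_list(alist, 1))   # [2, -8]
```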

Henry Keiter's comment was correct. Just add one space before newlist.append(myList [i]) and it works just fine.
Alternatively, if your teacher lets you, you could use tabs instead of spaces to avoid this problem altogether (just make sure you don't use both in the same file).

def splitList(myList, option):
    return [i for i in myList if i < 0] if option == 0 else [i for i in myList if i % 2 == 0]
# test
>>> myList = [2, 3, -1, 4, -7]
>>> splitList(myList, 0)
[-1, -7]
>>> splitList(myList, 1)
[2, 4]

I tried your code as-is and did not get any errors when I passed in a list of positive integers, so I don't know why your program is 'crashing'; I suspect something else is interfering with your debugging. (Although, as others have said, you really should use the same number of spaces at every indent level.)
Here's what I entered:
def splitList(myList):
    newlist = []
    for i in range(len(myList)):
        if (myList[i]%2 == 0):
           newlist.append(myList [i])  # Only 3-space indent here
    return newlist
print(splitList([1, 2, 3, 4, 5]))
When I run it:
[2, 4]
Can you show how you're invoking the method and the exact error message?

My previous answer was incorrect, and I really should have tested before I opened my mouth... Doh!!! Live and learn...
edit
Your traceback is giving you the answer. It is reading the values you passed as separate positional arguments. Not sure if there is a super Pythonic way to get around it, but you can use named arguments:
splitList(myList=yourList)
OR
splitList(myList=(1,2,3))
OR
splitList(myList=[1,2,3])
But your problem is really probably
splitList(1,2,3,4) # This passes four separate arguments, not a single list
Which can be solved with:
splitList([1,2,3,4])
OR
splitList((1,2,3,4))
Is this the traceback you get?
Traceback (most recent call last):
File "pyTest.py", line 10, in <module>
print splitList(aList,1)
TypeError: splitList() takes exactly 1 argument (2 given)
The error is most likely telling you the issue: the function only takes one argument, but I tried passing in two arguments, as your requirements state.
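The difference between the two call styles can be reproduced with a stand-in function (a sketch; the exact wording of the TypeError message varies between Python versions):

```python
def splitList(myList):
    return [x for x in myList if x % 2 == 0]

evens = splitList([1, 2, 3, 4])   # one argument, which is a list
print(evens)                      # [2, 4]

try:
    splitList(1, 2, 3, 4)         # four separate positional arguments
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```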

edit
Can you show the code where you are calling the splitList method?
this was incorrect
myList [i]
is your problem, you can't have a space in there. python is VERY VERY VERY strict about syntax
since people seemed to think they should downvote me, let me explain:
by having that space, you effectively said myList (which is a list), and [i], which is a list literal with the value of 'i' at index 0....
so you are passing two variables (one actual variable and one literal) into the append method, NOT separated by a comma, which will not work.

Python error when trying to access list by index - "List indices must be integers, not str"

I have the following Python code :
currentPlayers = query.getPlayers()
for player in currentPlayers:
return str(player['name'])+" "+str(player['score'])
And I'm getting the following error:
TypeError: list indices must be integers, not str
I've been looking for an error close to mine, but I'm not sure how to fix it; I've never gotten this error before. So how can I use integers instead of strings? I guess the problem comes from str(player['score']).

Were you expecting player to be a dict rather than a list?
>>> player=[1,2,3]
>>> player["score"]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not str
>>> player={'score':1, 'age': 2, "foo":3}
>>> player['score']
1

player['score'] is your problem. player is apparently a list which means that there is no 'score' element. Instead you would do something like:
name, score = player[0], player[1]
return name + ' ' + str(score)
Of course, you would have to know the list indices (those are the 0 and 1 in my example).
Something like player['score'] is allowed in python, but player would have to be a dict.
You can read more about both lists and dicts in the python documentation.

player is a list, which needs to be indexed by integers. You seem to be using it like a dictionary. Maybe you could use unpacking, something like:
name, score = player
(if the player list is always a constant length).
There's not much more advice we can give you without knowing what query is and how it works.
It's worth pointing out that the entire code you posted doesn't make a whole lot of sense. There's an IndentationError on the second line. Also, your function is looping over some iterable, but unconditionally returning during the first iteration which isn't usually what you actually want to do.
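As a sketch of what the loop might look like once it no longer returns on the first iteration: the code below assumes each player is a dict with 'name' and 'score' keys, which is only a guess since we don't know what query.getPlayers() returns. The sample data and the format_players name are made up.

```python
def format_players(players):
    # Collect one line per player instead of returning inside the loop.
    lines = []
    for player in players:
        lines.append(str(player["name"]) + " " + str(player["score"]))
    return lines

# Hypothetical data standing in for query.getPlayers()
current_players = [
    {"name": "alice", "score": 10},
    {"name": "bob", "score": 7},
]
print(format_players(current_players))   # ['alice 10', 'bob 7']
```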

A list is an ordered sequence of slots indexed by integers (0, 1, 2, ...). So if player were a list, player[0] or player[1] would have worked. If player were a dictionary, player["name"] would have worked.
