So I want to create functions for the median, mean and mode of a list.
The list must be a user input. How would I go about this? Thanks.
You do not have to write your own functions for the median, mean, and mode: they are already implemented in the NumPy and SciPy libraries for Python. Implementing them yourself would mean "reinventing the wheel", which takes time and invites errors. Feel free to use libraries, because in most cases they are well tested and safe to use. For example:
import numpy as np
from scipy import stats
mylist = [0,1,2,3,3,4,5,6]
median = np.median(mylist)
mean = np.mean(mylist)
mode = int(stats.mode(mylist)[0])
To get user input you should use input(). See https://anh.cs.luc.edu/python/hands-on/3.1/handsonHtml/io.html
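As a rough sketch of the input step: input() returns a single string, so it has to be split and converted before any statistics can be computed. The helper name parse_numbers and the accepted separators (spaces or commas) are my own assumptions, not something the question specifies.

```python
def parse_numbers(text):
    """Convert a string such as '1, 2 3.5' into a list of floats."""
    # Treat commas as whitespace so that both separators are accepted.
    tokens = text.replace(",", " ").split()
    # float() raises ValueError on anything that is not a number.
    return [float(token) for token in tokens]


# Typical interactive use (commented out so the snippet runs unattended):
# numbers = parse_numbers(input("Enter numbers separated by spaces: "))
numbers = parse_numbers("0 1 2 3 3 4 5 6")
print(numbers)
```

The resulting list can then be passed to np.median, np.mean and stats.mode as shown above.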
If this is supposed to be your homework, I'll give you some hints:
mean: iterate through the list, calculate the sum of elements and divide by element count
median: First, sort the list elements in increasing order. Then check whether the list length is even or odd. If odd, return the center element. If even, return the average of the two center elements.
mode: First create a 'helper' list containing the distinct elements of the input list. Then create a function with one parameter, the number to be counted, that returns how many times that number occurs in the input list. Call this function in a for loop over the distinct elements. At each iteration, save the result as a tuple of (element value, element count). In the end you should have a list of tuples. Finally, select the tuple that has the maximum "element count" and return the corresponding "element value".
Please note that these are just fast hints that can be useful in order to create your own implementation based on the right algorithm you prefer. This could be a good exercise to get started with algorithms and data structures, I hope you'll not skip it:) Good luck!
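For reference, here is one possible way the hints above could be turned into code. It is only a sketch: it assumes a non-empty list of numbers, and when several values tie for the mode it returns the one that appears first, which is my choice rather than anything the hints require.

```python
def mean(values):
    # Sum the elements and divide by the element count.
    total = 0
    for value in values:
        total += value
    return total / len(values)


def median(values):
    # Sort a copy, then take the center element (odd length)
    # or the average of the two center elements (even length).
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


def mode(values):
    # Helper list of distinct elements, in order of first appearance.
    distinct = []
    for value in values:
        if value not in distinct:
            distinct.append(value)
    # One (element value, element count) tuple per distinct element.
    counts = [(value, values.count(value)) for value in distinct]
    # Keep the tuple with the largest count; earlier elements win ties.
    best_value, best_count = counts[0]
    for value, count in counts[1:]:
        if count > best_count:
            best_value, best_count = value, count
    return best_value


sample = [0, 1, 2, 3, 3, 4, 5, 6]
print(mean(sample), median(sample), mode(sample))
```

This mode is O(n^2) because of the repeated count() calls, which is fine for an exercise but not for large inputs.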
I have a list of ~30 floats. I want to see if a specific float is in my list. For example:
1 >> # For the example below my list has integers, not floats
2 >> list_a = range(30)
3 >> 5.5 in list_a
False
4 >> 1 in list_a
True
The bottleneck in my code is line 3. I search if an item is in my list numerous times, and I require a faster alternative. This bottleneck takes over 99% of my time.
I was able to speed up my code by making list_a a set instead of a list. Are there any other ways to significantly speed up this line?
If the list is not sorted, the best possible time to check whether an element is in it is O(n), because the element may be anywhere and you need to look at each item to check whether it is the one you are looking for.
If the list were sorted, you could use binary search for O(log n) lookup time. You can also use a hash map for average O(1) lookup time (or the built-in set, which is basically a dictionary without values and accomplishes the same task).
That does not make much sense for a list of length 30, though.
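To make the two alternatives concrete, here is a small sketch using the standard bisect module for the sorted case and a set for the hashed case. The sample data and the helper name in_sorted are made up for illustration. Note that both approaches compare floats exactly, so a value that differs by rounding error will not be found.

```python
import bisect

values = [x * 0.5 for x in range(30)]  # 30 floats: 0.0, 0.5, ..., 14.5

# Hash-based lookup: average O(1) per membership test.
lookup = set(values)

# Binary search on a sorted copy: O(log n) per membership test.
sorted_values = sorted(values)


def in_sorted(sorted_seq, target):
    # bisect_left returns the position where target would be inserted.
    index = bisect.bisect_left(sorted_seq, target)
    return index < len(sorted_seq) and sorted_seq[index] == target


print(5.5 in lookup, in_sorted(sorted_values, 5.5))
print(5.25 in lookup, in_sorted(sorted_values, 5.25))
```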
In my experience, Python indeed slows down when we search something in a long list.
To complement the suggestion above, I suggest splitting the list into subsets. This only works if the list can be partitioned and each query can easily be assigned to the correct subset.
An example is searching for a word in an English dictionary: first split the dictionary into 26 sections based on each word's initial letter. If the query is "apple", you only need to search the "A" section. The advantage is that you greatly limit the search space, hence the speed boost.
For a numerical list, split it either by range or by first digit.
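A minimal sketch of this idea for floats, assuming the integer part of each value is used as the bucket key (the function names build_buckets and contains are hypothetical):

```python
import math
from collections import defaultdict


def build_buckets(values):
    # Group the values by their integer part, e.g. 1.25 and 1.75 share bucket 1.
    buckets = defaultdict(list)
    for value in values:
        buckets[math.floor(value)].append(value)
    return buckets


def contains(buckets, target):
    # Only the bucket that could hold the target is scanned.
    # .get() avoids creating an empty bucket for a missing key.
    return target in buckets.get(math.floor(target), ())


buckets = build_buckets([0.5, 1.25, 1.75, 9.0])
print(contains(buckets, 1.75), contains(buckets, 1.5))
```

In effect this is a hand-rolled hash table, so for plain membership tests the built-in set is usually the simpler choice.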
Hope this helps.
Tuples is n
The birthday problem equation is this:
Question:
For n = 200, write an algorithm (in Python) for enumerating the number of tuples in the sample space that satisfy the condition that at least two people have the same birthday. (Note that your algorithm will need to scan each tuple)
import itertools
print(list(itertools.permutations([0,0,0])))
For this question, I am wondering how to get n into this.
"how to get n in there":
n = 200
space = itertools.permutations(bday_pairs, n)
I've left out a couple parts of your code:
itertools.permutations returns a lazy iterator; you don't need to coerce it to a list.
Printing this result is likely not what you want with n = 200; that's a huge list.
Now, all you need to do is to build bday_pairs, the list of all possible pairs of birthdays. For convenience, I suggest that you use the integers 1-365. Since you haven't attacked that part of the problem at all, I'll leave that step up to you.
You still need to do the processing to count the sets with at least one matching birthday, another part of the problem you haven't attacked. However, I trust that the above code solves your stated problem?
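For what it's worth, here is a sketch of the counting step on a deliberately tiny sample space. It rests on my own assumptions: each person can independently have any birthday, so the tuples are enumerated with itertools.product rather than permutations, and the function names are invented. With 365 days and n = 200 there are 365**200 tuples, so scanning them all is not feasible; the closed form is included only as a cross-check.

```python
import itertools
import math


def count_shared(days, n):
    """Scan every n-tuple of birthdays and count those with a repeat."""
    shared = 0
    for birthdays in itertools.product(range(days), repeat=n):
        # A tuple has a shared birthday if it holds fewer than n distinct days.
        if len(set(birthdays)) < n:
            shared += 1
    return shared


def count_shared_closed_form(days, n):
    # All tuples minus the tuples in which every birthday is distinct.
    return days ** n - math.perm(days, n)


print(count_shared(5, 3), count_shared_closed_form(5, 3))
```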
I'm working on a script which creates random solutions to the traveling salesman problem. I have a set of cities, as well as a set of distances (which I don't need yet, since I'm creating random solutions).
I am attempting to use itertools.permutations on the list of cities in order to create unique routes. My questions are:
Is there any way in which I can make itertools.permutations generate random solutions? Right now, it starts with cities[0]->cities[1]->cities[2] and so on. Can I randomize this?
Can I make itertools generate only n permutations before returning? I have a list of 29 cities, and as you might have guessed that's a whole lot of permutations!
Thanks in advance!
This Python code will generate the permutations based on a random starting point and will stop after 6 permutations:
from itertools import permutations
from random import shuffle

A = list(range(26))  # shuffle() needs a mutable list, not a range object
shuffle(A)
for i, perm in enumerate(permutations(A)):
    print(perm)
    if i >= 5:  # stop after 6 permutations
        break
Note that the permutations still have a lot of structure so the second permutation will be a lot like the first.
You may do better simply using shuffle(A) each time to try a different permutation. (It may regenerate a permutation but with very low probability for 29 cities.) For example,
for i in range(10):
    shuffle(A)
    print(A)
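If repeated routes must be ruled out entirely, one option is to remember the routes already produced. The sketch below is only an illustration: the function name random_routes and the seed parameter are my own, and it assumes cities is a list and that the number of routes requested is small compared with the number of possible permutations.

```python
import math
import random


def random_routes(cities, count, seed=None):
    """Return `count` distinct random orderings of `cities` as tuples."""
    if count > math.factorial(len(cities)):
        raise ValueError("not enough distinct permutations")
    rng = random.Random(seed)
    seen = set()
    routes = []
    while len(routes) < count:
        # sample() with k == len(cities) yields a shuffled copy.
        route = tuple(rng.sample(cities, len(cities)))
        if route not in seen:  # skip the rare duplicate
            seen.add(route)
            routes.append(route)
    return routes


for route in random_routes(list(range(29)), 3, seed=0):
    print(route)
```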
Say I have a list x of unknown length from which I want to randomly pop one element, so that the list does not contain the element afterwards. What is the most Pythonic way to do this?
I can do it using a rather unwieldy combination of pop, random.randint, and len, and would like to see shorter or nicer solutions:
import random
x = [1,2,3,4,5,6]
x.pop(random.randint(0,len(x)-1))
What I am trying to achieve is consecutively pop random elements from a list. (i.e., randomly pop one element and move it to a dictionary, randomly pop another element and move it to another dictionary, ...)
Note that I am using Python 2.6 and did not find any solutions via the search function.
What you seem to be up to doesn't look very Pythonic in the first place. You shouldn't remove stuff from the middle of a list, because lists are implemented as arrays in all Python implementations I know of, so this is an O(n) operation.
If you really need this functionality as part of an algorithm, you should check out a data structure like the blist that supports efficient deletion from the middle.
In pure Python, what you can do if you don't need access to the remaining elements is just shuffle the list first and then iterate over it:
import random

lst = [1, 2, 3]
random.shuffle(lst)
for x in lst:
    ...  # do something with x
If you really need the remainder (which is a bit of a code smell, IMHO), at least you can pop() from the end of the list now (which is fast!):
while lst:
    x = lst.pop()
    # do something with the element
In general, you can often express your programs more elegantly if you use a more functional style, instead of mutating state (like you do with the list).
You won't get much better than that, but here is a slight improvement:
x.pop(random.randrange(len(x)))
Documentation on random.randrange():
random.randrange([start], stop[, step])
Return a randomly selected element from range(start, stop, step). This is equivalent to choice(range(start, stop, step)), but doesn’t actually build a range object.
To remove a single element at a random index from a list, if the order of the remaining elements doesn't matter:
import random
L = [1,2,3,4,5,6]
i = random.randrange(len(L)) # get random index
L[i], L[-1] = L[-1], L[i] # swap with the last element
x = L.pop() # pop last element O(1)
The swap avoids the O(n) cost of deleting from the middle of a list.
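Wrapped in a function, the same swap-and-pop idea might look like the sketch below. The name pop_random is mine, and the function assumes a non-empty list whose order you don't care about.

```python
import random


def pop_random(items):
    """Remove and return a random element in O(1); reorders the list."""
    index = random.randrange(len(items))
    # Move the chosen element to the end so that pop() is cheap.
    items[index], items[-1] = items[-1], items[index]
    return items.pop()


data = list(range(10))
drawn = [pop_random(data) for _ in range(10)]
print(drawn)
```

Each element is drawn exactly once, so draining the list this way yields a random permutation of it.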
Although many answers suggest using random.shuffle(x) followed by x.pop(), this was very slow on large data in my tests: on a list of 10000 elements it took about 6 seconds with the shuffle enabled, versus 0.2 s with the shuffle disabled.
After testing all the methods given above, the fastest turned out to be the one written by @jfs:
import random
L = [1,"2",[3],(4),{5:"6"},'etc'] #you can take mixed or pure list
i = random.randrange(len(L)) # get random index
L[i], L[-1] = L[-1], L[i] # swap with the last element
x = L.pop() # pop last element O(1)
In support of my claim, here is the time complexity chart from this source.
If there are no duplicates in the list, you can also use a set. Converting a list to a set removes any duplicates. Removing by value and popping an element both cost O(1) on average, i.e. they are very efficient. Note that set.pop() returns an arbitrary element, which is not guaranteed to be uniformly random. This is the cleanest method I could come up with.
L = set([1, 2, 3, 4, 5, 6])  # pass the list directly to the built-in set()
while L:  # stop when the set is empty; pop() on an empty set raises KeyError
    r = L.pop()
    # do something with r, an arbitrary element of the initial list
Unlike lists, which only support A + B (concatenation), sets support A - B (difference), A | B (union) and A.intersection(B, C, D). This is very useful when you want to perform logical operations on the data.
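A quick illustration of the set operations mentioned here, using made-up values. Sets spell union as `|` (or `.union()`); the `+` operator is not defined for them.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

difference = a - b                   # elements of a that are not in b
union = a | b                        # elements in either set
common = a.intersection(b, {4, 6})   # elements present in all three sets

print(difference, union, common)
```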
OPTIONAL
If you want speed for operations on the head and tail of a list, use Python's collections.deque (double-ended queue). In support of my claim, here is the image; a picture is worth a thousand words.
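For anyone unfamiliar with it, a short sketch of collections.deque with arbitrary sample values. Appends and pops at either end are O(1), whereas list.pop(0) and list.insert(0, x) are O(n).

```python
from collections import deque

queue = deque([1, 2, 3, 4])

first = queue.popleft()   # remove from the head
last = queue.pop()        # remove from the tail
queue.appendleft(0)       # add at the head
queue.append(5)           # add at the tail

print(first, last, queue)
```

Indexing into the middle of a deque is slower than for a list, so it is not a good fit for popping at random positions.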
Here's another alternative: why don't you shuffle the list first, and then pop elements off it until none remain? Like this:
import random
x = [1,2,3,4,5,6]
random.shuffle(x)
while x:
    p = x.pop()
    # do your stuff with p
I know this is an old question, but just for documentation's sake:
If you (the person googling the same question) are doing what I think you are doing, which is selecting k items randomly from a list (where k<=len(yourlist)) while making sure no item is selected more than once (i.e. sampling without replacement), you could use random.sample as @j-f-sebastian suggests. But without knowing more about the use case, I don't know if this is what you need.
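A minimal example of random.sample, with made-up data. It picks by position, so no list position is used twice, and it leaves the original list untouched, so it does not "pop" anything.

```python
import random

items = ["a", "b", "c", "d", "e"]
k = 3

# Choose k distinct positions without replacement; items is not modified.
picked = random.sample(items, k)

print(picked)
print(items)
```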
One way to do it is:
x.remove(random.choice(x))
Although this isn't popping from the list, I found this question on Google while trying to get X random items from a list without duplicates. Here's what I eventually used:
from random import shuffle

items = [1, 2, 3, 4, 5]
items_needed = 2
shuffle(items)
for item in items[:items_needed]:
    print(item)
This may be slightly inefficient as you're shuffling an entire list but only using a small portion of it, but I'm not an optimisation expert so I could be wrong.
This answer comes courtesy of @niklas-b:
"You probably want to use something like pypi.python.org/pypi/blist "
To quote the PyPI page:
...a list-like type with better asymptotic performance and similar
performance on small lists
The blist is a drop-in replacement for the Python list that provides
better performance when modifying large lists. The blist package also
provides sortedlist, sortedset, weaksortedlist, weaksortedset,
sorteddict, and btuple types.
One would assume lowered performance on the random access/random run end, as it is a "copy on write" data structure. This violates many use case assumptions on Python lists, so use it with care.
HOWEVER, if your main use case is to do something weird and unnatural with a list (as in the forced example given by @OP, or my Python 2.6 FIFO queue-with-pass-over issue), then this will fit the bill nicely.