Finding the float average of a random list in Python

I have looked on several websites, books, and in the documentation, and I can't figure out what I am doing wrong. I try to ask for help as a last resort so that I can learn on my own, but I have spent far too long trying to figure this out, and I am sure it is something really simple that I am doing wrong, but I am learning. The code produces a single, different result every time it is run, and then the following error:
26.8
Traceback (most recent call last):
  File "main.py", line 7, in <module>
    tot = sum(rand)/len(rand)
TypeError: 'float' object is not iterable
import random

for x in range(10000):
    rand = random.uniform(10, 100)
    print(round(rand, 1))

tot = sum(rand)/len(rand)
print(round(tot, 1))

You're not actually generating a list, you're generating individual values.
Do you really want to print out 10000 values along the way to your final result?
If the answer is "no!", then your code can be reduced to:
import random
N = 10000
print(round(sum(random.uniform(10, 100) for _ in range(N)) / N, 1))
or, if you prefer to break it out a little bit more for readability:
import random
N = 10000
total = sum(random.uniform(10, 100) for _ in range(N))
average = total / N
print(round(average, 1))
If this is beyond the scope of what you've learned, you can create total outside the loop initialized to zero, update it with each new value as you iterate through the loop, and then calculate the final answer:
import random

N = 10000
total = 0.0
for _ in range(N):  # use '_' instead of x, since x was unused in your program
    total += random.uniform(10, 100)
average = total / N
print(round(average, 1))
This avoids wasting storage for a list of 10000 values and avoids the append() you're not yet familiar with. Of course, if you need the 10000 values later for some other purpose, you'll need to tuck them away in a list:
import random
N = 10000
l = [random.uniform(10, 100) for _ in range(N)]
total = sum(l)
print(round(total / N, 1))
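As a side note (my addition, not part of the original answer), the standard library's statistics module can compute the average of that list directly, so you don't even need the sum/len division:

```python
import random
import statistics

N = 10000
values = [random.uniform(10, 100) for _ in range(N)]
# statistics.fmean (Python 3.8+) does the sum/len division for you, in float
print(round(statistics.fmean(values), 1))
```

Since random.uniform(10, 100) is uniform over that interval, the printed average should land near 55.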
Addendum
Just for jollies, you can also do this recursively:
import random

def sum_of_rands(n):
    if n > 1:
        half_n = n // 2
        return sum_of_rands(half_n) + sum_of_rands(n - half_n)
    elif n == 1:
        return random.uniform(10, 100)

N = 10000
print(round(sum_of_rands(N) / N, 1))
print(sum_of_rands(0))  # returns None because nothing is being summed
Splitting the problem in half in each recursive call keeps the recursion depth to O(log N).
I'd actually advise you to stick with list comprehension or looping, but wanted to show you there are lots of different ways to get to the same result.

The sum function takes an iterable object, but you're passing it a float.
To avoid this error you should move the last two lines outside the for loop and append each rand to a list. I don't know if it's exactly what you want to do, but it shows you how to use sum:
import random

l = []
for x in range(10000):
    rand = random.uniform(10, 100)
    l.append(rand)
    print(round(rand, 1))

tot = sum(l) / len(l)
print(round(tot, 1))

Related

How to use multiprocessing to get the highest value?

I am writing a program that has to work through roughly 1000 candidates and find the best score. I need to use multiprocessing to work through the list because this will be done roughly 60000 times. How would we use multiprocessing in this situation? Say that the score is calculated like this:
def get_score(a, b):
    return (a * b) / (a + b)
I know a in every case but it changes every time you go through the list of candidates because it adds the best candidate to the list. I want it to iterate through a list of candidates and then find the best score. A non-multiprocessing example would be like this:
s = [random.randint(0, 100)]
candidates = [random.randint(0, 100) for i in range(1000)]
for i in range(60000):
    best_score = 0
    best_candidate = candidates[0]
    for j in candidates:
        if get_score(s[-1], j) > best_score:
            best_candidate = j
            best_score = get_score(s[-1], j)
    s.append(best_candidate)
I know that I could create a function but I feel like there is an easier way to do this. Sorry for the beginner question.:/
Your code has some inconsistencies, like not updating best_score and still comparing against a 0-valued best score.
Your nested-loop design makes the solution hard to parallelize, and you also didn't provide details such as whether order matters.
I'm giving a dummy multiprocessing-based solution, which splits the 60000-iteration loop across n_cpu processes in parallel and writes the partial solutions to numpy arrays. However, it's up to you how you'll merge those solutions.
import random
import numpy as np
import multiprocessing as mp

s = [random.randint(0, 100)]
candidates = [random.randint(0, 100) for i in range(1000)]
n_cpu = mp.cpu_count()

def get_score(a, b):
    return (a * b) / (a + b)

def partial_gen(num_segment):
    part_arr = []
    for i in range(60000 // n_cpu):  # breaking the loop into n_cpu segments
        best_score = 0
        best_candidate = candidates[0]
        for j in candidates:
            new_score = get_score(s[-1], j)
            if new_score > best_score:
                best_candidate = j
                best_score = new_score  # are you sure you don't wanna update this?
        part_arr.append(best_candidate)
    part_arr = np.array(part_arr)
    np.save(f'{num_segment}.npy', part_arr)

p = mp.Pool(n_cpu)
p.map(partial_gen, range(n_cpu))
One easy way to speed things up would be to use vectorization (as a first optimization step, rather than multiprocessing). You can achieve this by using numpy ndarrays.
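As a rough sketch of that vectorization idea (my addition, not the answerer's code; I start the candidate values at 1 rather than 0 here to avoid a 0/0 division): the whole inner loop over candidates collapses into one array expression, with np.argmax replacing the manual best-score tracking.

```python
import numpy as np

rng = np.random.default_rng(0)
s = [int(rng.integers(1, 101))]
candidates = rng.integers(1, 101, size=1000)

for _ in range(1000):  # fewer iterations than the original 60000, just to illustrate
    a = s[-1]
    # get_score(a, b) = a*b/(a+b) applied to every candidate at once
    scores = (a * candidates) / (a + candidates)
    s.append(int(candidates[np.argmax(scores)]))
```

One caveat on this particular score function: a*b/(a+b) is increasing in b for fixed a > 0, so the best candidate is always just the largest one and s quickly becomes constant; the point of the sketch is only the shape of the vectorized loop.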

Searching for the 'p' value in a coin toss simulation

Newbie to coding, attempting to find the 'p' value in a coin toss simulation.
Currently getting the attribute error:
'int' object has no attribute 'sum'.
How could it be? Please help.
import numpy as np
import random

attempts = 0
t = 0
for I in range(10000):
    attempts = random.randint(2, 30)
    if (attempts.sum >= 22):
        t += 1
p = t / 10000
print(p)
If you are just trying to toss a coin 10,000 times and see how many turn up heads (or tails, if you prefer) then this is a simple way to do it. The random.random function returns a number such that 0 <= x < 1, so 50% of the time it should be less than .5.
import random

tosses = 100000
t = 0
for i in range(tosses):
    if random.random() < .5:
        t += 1
p = t / tosses
print(p)
attempts is the most recent random integer you generated. An int has no attribute (data field) sum. Since you haven't adequately described what you think your code does, we can't fix the problem.
Python's sum function adds up a sequence of items; see the documentation for examples.
You try to count something with variable m, but you give it no initial value.
You set t to 0, and later divide it by your loop limit, but you've never changed the value; this will be 0.0.
Update after OP comments
I think I understand now: you want to estimate the probability of getting at least 22 heads (or whatever side you choose) in a set of 30 tosses of a fair coin. I'll do my best to utilize your original code.
First of all, you have to toss a fair coin; the function call you made generates a random integer in the range [2, 30]. Instead, you need a call such as the one below, in groups of 30:
flip = random.randint(0,1)
This gives you a 0 or 1. Let's assume we want to count the 1 results; this allows us to simply add the series:
count = sum(random.randint(0, 1) for _ in range(30))
This will loop 30 times and add up the results; there's your count of desired flips. Now, do 10,000 of those 30-flip groups, checking each for at least 22 heads:
import random

t = 0
for i in range(10000):
    count = sum(random.randint(0, 1) for _ in range(30))
    if count >= 22:
        t += 1
p = t / 10000
print(p)
Now, if you want to tighten this even more, use the fact that a successful comparison (i.e. True) evaluates to 1 and False to 0: make all 10,000 trials in an outer comprehension (an in-line for):
t = sum(
    sum(random.randint(0, 1) for flip in range(30)) >= 22
    for trial in range(10000))
print(t / 10000)
flip and trial are dummy loop variables; you can use whatever two you like.
Finally, it's usually better style to make named variables for your algorithm's parameters, such as
threshold = 22
trial_limit = 10000
flip_limit = 30
and use those names in your code.
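As a sanity check on the simulated p (my addition, not from the original answer), the exact probability of at least 22 heads in 30 fair tosses can be computed from the binomial distribution with math.comb:

```python
import math

def prob_at_least(k, n):
    # P(X >= k) for X ~ Binomial(n, 1/2): sum the binomial coefficients, divide by 2**n
    return sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n

print(prob_at_least(22, 30))  # about 0.008, so the simulated p should land near that
```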

Tossing a fair coin for 100 times and count the number of heads. Repeat this simulation 10**5 times

Write a program to simulate tossing a fair coin for 100 times and count the number of heads. Repeat this simulation 10**5 times to obtain a distribution of the head count.
I wrote below code to count number of heads 100 times, and outer loop should repeat my function 100K times to obtain distribution of the head:
import random

def coinToss():
    return random.randint(0, 1)

recordList = []
for j in range(10**5):
    for i in range(100):
        flip = coinToss()
        if (flip == 0):
            recordList.append(0)
    print(str(recordList.count(0)))
but each time I run my program, instead of getting a list of 100K head counts, I get numbers that just keep growing, like below. Can anyone tell me what I am doing wrong?
42
89
136
....
392
442
491
Here's a version with numpy, which lets you produce random numbers more elegantly since you can specify a size argument.
import numpy as np

n_sim = 10
n_flip = 100
sims = np.empty(n_sim)
for j in range(n_sim):
    flips = np.random.randint(0, 2, n_flip)
    sims[j] = np.sum(flips)
Since the original problem asks for a distribution of head counts, you need to keep track of two lists: one for the number of heads per 100-toss trial, and one for the number of heads in the current 100-toss trial.
import random

def coinToss():
    return random.randint(0, 1)

experiments = []  # Number of heads per 100-toss experiment
for j in range(10**5):
    cnt = []  # Number of heads in current 100-toss experiment
    for i in range(100):
        flip = coinToss()
        if (flip == 0):
            cnt.append(0)
    experiments.append(cnt.count(0))
    print(str(cnt.count(0)))
However, I would strongly suggest doing this in something like numpy, which will greatly improve performance. You can do this in one line with numpy:
import numpy as np
experiments = np.random.binomial(n=100, p=0.5, size=10**5)
You can then analyze/plot the distribution of head counts with whatever tools you want (e.g. numpy, matplotlib).
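For instance, a minimal summary of that distribution using only numpy (a sketch of my own, not from the answer):

```python
import numpy as np

experiments = np.random.binomial(n=100, p=0.5, size=10**5)
counts = np.bincount(experiments, minlength=101)  # counts[k] = how many trials had k heads
print(experiments.mean())  # close to the theoretical mean, 100 * 0.5 = 50
print(experiments.std())   # close to sqrt(100 * 0.5 * 0.5) = 5
print(counts.argmax())     # the most common head count, typically 50
```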
You might notice that your number of heads is ~50 more each time. This is because you don't reset the record counter to [] each time you loop. If you add "recordList = []" straight after your print statement and with the same indentation, it will basically fix your code.
Another nifty way to do this would be to wrap the 100 coin flips experiment in a function and then call the function 10**5 times. You could also use list comprehension to make everything nice and concise:
import random

def hundred_flips():
    result = sum([random.randint(0, 1) for i in range(100)])
    return result

all_results = [hundred_flips() for i in range(10**5)]
You can simulate a matrix with all your coin flips and then do your calculations on the matrix.
from numpy import mean, std
from numpy.random import rand

N_flip = int(1e5)
N_trials = int(1e2)
coin_flips = rand(N_flip, N_trials) > 0.5
p = mean(coin_flips, axis=0)  # Vector of length N_trials with estimated probabilities
print('Mean: %3.2f%%, Std: %3.2f%%' % (mean(p)*100, std(p)*100))

Optimizing the run time of the nested for loop

I am just getting started with competitive programming, and after writing the solution to a certain problem I got a "run time exceeded" error.
max(|a[i] - a[j]| + |i - j|)
where a is a list of elements and i, j are indices; I need to get the max() of the above expression.
Here is a short but complete code snippet.
t = int(input())  # number of test cases
for i in range(t):
    n = int(input())  # size of list
    a = list(map(int, input().split()))  # getting space-separated input
    res = []
    for s in range(n):  # these two loops are increasing the run time
        for d in range(n):
            res.append(abs(a[s] - a[d]) + abs(s - d))
    print(max(res))
Input File This link may expire(Hope it works)
1<=t<=100
1<=n<=10^5
0<=a[i]<=10^5
The run-time on the leaderboard is 5 sec for C and 35 sec for Python, while this code takes 80 sec.
It is an online judge, so the limit is independent of the machine; numpy is not available.
Please keep it simple, I am new to Python.
Thanks for reading.
For a given j<=i, |a[i]-a[j]|+|i-j| = max(a[i]-a[j]+i-j, a[j]-a[i]+i-j).
Thus for a given i, the value of j<=i that maximizes |a[i]-a[j]|+|i-j| is either the j that maximizes a[j]-j or the j that minimizes a[j]+j.
Both these values can be computed as you run along the array, giving a simple O(n) algorithm:
def maxdiff(xs):
    mp = mn = xs[0]
    best = 0
    for i, x in enumerate(xs):
        mp = max(mp, x - i)
        mn = min(mn, x + i)
        best = max(best, x + i - mn, -x + i + mp)
    return best
And here's some simple testing against a naive but obviously correct algorithm:
def maxdiff_naive(xs):
    best = 0
    for i in range(len(xs)):
        for j in range(i + 1):
            best = max(best, abs(xs[i] - xs[j]) + abs(i - j))
    return best

import random
for _ in range(500):
    r = [random.randrange(1000) for _ in range(50)]
    md1 = maxdiff(r)
    md2 = maxdiff_naive(r)
    if md1 != md2:
        print("%d != %d\n%s" % (md1, md2, r))
        break
It takes a fraction of a second to run maxdiff on an array of size 10^5, which is significantly better than your reported leaderboard scores.
"Competitive programming" is not about saving a few milliseconds by using a different kind of loop; it's about being smart about how you approach a problem, and then implementing the solution efficiently.
Still, one thing that jumps out is that you are wasting time building a list only to scan it to find the max. Your double loop can be transformed to the following (ignoring other possible improvements):
print(max(abs(a[s] - a[d]) + abs(s - d) for s in range(n) for d in range(n)))
But that's small fry. Worry about your algorithm first, and then turn to even obvious time-wasters like this. You can cut the number of comparisons in half, as @Brett showed you, but I would first study the problem and ask myself: do I really need to calculate this quantity n^2 times, or even 0.5*n^2 times? That's how you get the times down, not by shaving off milliseconds.
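For reference, the halved-comparisons version mentioned above looks something like this (my sketch, still O(n^2)): since |a[s] - a[d]| + |s - d| is symmetric in s and d, each pair only needs to be checked once.

```python
def maxdiff_half(a):
    best = 0
    n = len(a)
    for i in range(n):
        for j in range(i + 1, n):  # j > i only; the (j, i) pair gives the same value
            best = max(best, abs(a[i] - a[j]) + (j - i))
    return best

print(maxdiff_half([3, 10, 4]))  # 8, from the pair (3, 10) at distance 1
```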

Random prime Number in python

I currently have ↓ set as my randprime(p,q) function. Is there any way to condense this, via something like a genexp or listcomp? Here's my function:
n = randint(p, q)
while not isPrime(n):
    n = randint(p, q)
It's better to just generate the list of primes, and then choose from that list.
As is, your code has a slim chance of hitting an infinite loop: if there are no primes in the interval, or if randint keeps picking non-primes, the while loop will never end.
So this is probably shorter and less troublesome:
import random
primes = [i for i in range(p,q) if isPrime(i)]
n = random.choice(primes)
The other advantage of this is that there is no chance of looping forever if there are no primes in the interval. As stated, this can be slow depending on the range, so it would be quicker if you cached the primes ahead of time:
# initialising primes
minPrime = 0
maxPrime = 1000
cached_primes = [i for i in range(minPrime, maxPrime) if isPrime(i)]

# elsewhere in the code
import random
n = random.choice([i for i in cached_primes if p < i < q])
Again, further optimisations are possible, but are very much dependent on your actual code... and you know what they say about premature optimisation.
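One such optimisation, sketched here under my own assumptions (the sieve and the names are mine, not from the answer above): keep the cache sorted and use bisect to pick only among the primes inside [p, q], instead of rebuilding a filtered list on every call.

```python
import bisect
import random

def primes_up_to(n):
    # simple sieve of Eratosthenes, built once
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

cached_primes = primes_up_to(1000)  # sorted by construction

def rand_prime(p, q):
    lo = bisect.bisect_left(cached_primes, p)   # first cached prime >= p
    hi = bisect.bisect_right(cached_primes, q)  # one past the last cached prime <= q
    if lo == hi:
        raise ValueError("no primes in [%d, %d]" % (p, q))
    return cached_primes[random.randrange(lo, hi)]
```

For example, rand_prime(10, 20) returns one of 11, 13, 17, 19, and an empty range raises instead of looping.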
Here is a script written in Python to generate n random prime integers between two given integers:
import numpy as np

def getRandomPrimeInteger(bounds):
    for i in range(len(bounds) - 1):
        if bounds[i + 1] > bounds[i]:
            x = bounds[i] + np.random.randint(bounds[i + 1] - bounds[i])
            if isPrime(x):
                return x
        else:
            if isPrime(bounds[i]):
                return bounds[i]
            if isPrime(bounds[i + 1]):
                return bounds[i + 1]
    newBounds = [0 for i in range(2 * len(bounds) - 1)]
    newBounds[0] = bounds[0]
    for i in range(1, len(bounds)):
        newBounds[2*i - 1] = int((bounds[i - 1] + bounds[i]) / 2)
        newBounds[2*i] = bounds[i]
    return getRandomPrimeInteger(newBounds)

def isPrime(x):
    count = 0
    for i in range(int(x / 2)):
        if x % (i + 1) == 0:
            count = count + 1
    return count == 1

# ex: get 50 random prime integers between 100 and 10000:
bounds = [100, 10000]
for i in range(50):
    x = getRandomPrimeInteger(bounds)
    print(x)
So it would be great if you could use an iterator that gives the integers from p to q in random order (without replacement). I haven't been able to find a way to do that. The following will give random integers in that range and will skip anything it has already tested.
import random

fail = False
tested = set()
n = random.randint(p, q)
while not isPrime(n):
    tested.add(n)
    if len(tested) == q - p + 1:  # every value in [p, q] has been tried
        fail = True
        break
    while n in tested:
        n = random.randint(p, q)
if fail:
    print('I failed')
else:
    print(n, 'is prime')
The big advantage of this is that if say the range you're testing is just (14,15), your code would run forever. This code is guaranteed to produce an answer if such a prime exists, and tell you there isn't one if such a prime does not exist. You can obviously make this more compact, but I'm trying to show the logic.
next(i for i in map(lambda x: random.randint(p, q) | 1, itertools.count()) if isPrime(i))
This starts with itertools.count() - this gives an infinite sequence.
Each number is mapped to a new random number in the range by map(). In Python 3, map returns an iterator rather than a list - we don't want to generate a list of infinitely many random numbers!
Then, the first matching number is found, and returned.
Works efficiently, even if p and q are very far apart - e.g. 1 and 10**30 - something generating a full list could never do!
By the way, this is not more efficient than your code above, and is a lot more difficult to understand at a glance - please have some consideration for the next programmer to have to read your code, and just do it as you did above. That programmer might be you in six months, when you've forgotten what this code was supposed to do!
P.S. - in practice, you might want to replace count() with range(int((q - p) ** 1.5) + 20) to do no more than that number of attempts (balanced between limited tests for small ranges and large ranges, with no more than a 0.5% chance of failing if it could succeed); otherwise, as was suggested in another post, you might loop forever.
P.P.S. - improvement: replaced random.randint(p, q) with random.randint(p, q) | 1 - this makes the code twice as efficient, but eliminates the possibility that the result will be 2.
