I would like to turn this calculation into a loop using my function.

b = list(data.iloc[1])

def balance(rate, payment, os):
    interest_amount = os * rate / 100 / 12
    principal_amount = payment - interest_amount
    next_balance = os + interest_amount - principal_amount
    return next_balance

c = balance(b[9], b[11], b[8])
d = balance(b[9], b[11], c)
e = balance(b[9], b[11], d)
I have to start with b[8] as the amount for the first calculation. After I get the next amount from the function balance, that amount becomes the input of the next calculation, and so on, until the next amount is equal to or less than 0, at which point the loop should stop.
I need to append the calculated values, starting from b[8], up to the last one (before the value reaches 0 or less).
Any suggestion on this, thank you!
Edit: based on Zaraki Kenpachi
b[8] is the amount of money, given as 17183
b[9] is the rate of interest, given as 3.39
b[11] is the payment per month, given as 5759
The output I am trying to get is:
[17183, 11521, 5826, 99]
Perhaps something like this:

x = b[8]
output = []
output.append(x)
while x > 0:
    x = balance(b[9], b[11], x)
    output.append(x)
Here you go:
def balance(rate, payment, os):
    interest_amount = os * rate / 100 / 12
    principal_amount = payment - interest_amount
    next_balance = os + interest_amount - principal_amount
    return next_balance

next_balance = 17183  # = b[8]
results = []
results.append(next_balance)
while next_balance > 0:
    next_balance = balance(3.39, 5759, next_balance)  # b[9], b[11]
    if next_balance > 0:
        results.append(next_balance)
Output:
[17183, 11521.08395, 5827.1780743175, 101.10163043739522]
Python doesn't have a built-in way to unfold / iterate, but the normal way you'd implement a specific unfold is through a generator, which can keep computational state and yield values:
def balances(balance, rate, payment):
    while True:
        interest_amount = balance * rate / 100 / 12
        principal_amount = payment - interest_amount
        balance = balance + interest_amount - principal_amount
        # TO DO: end the computation when balance <= 0, maybe find a way to
        # note extra in the last payment, or reject the last payment entirely
        # and handle that case separately outside the generator
        yield balance
then you can either call next() on your balances iterator to get successive values
bs = balances(b[8], b[9], b[11])
c = next(bs)
d = next(bs)
e = next(bs)
or iterate the entire thing
for balance in balances(b[8], b[9], b[11]):
    if balance < 0:
        "do something when the last payment was excessive and stop the loop"
        ...
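One possible way to finish that generator (my own sketch, not the original author's code): stop once the computed balance drops to zero or below, so the caller only ever sees positive balances.

```python
def balances_until_paid(balance, rate, payment):
    """Yield successive month-end balances, using the question's
    formula, stopping once the loan is paid off (balance <= 0)."""
    while True:
        interest_amount = balance * rate / 100 / 12
        principal_amount = payment - interest_amount
        balance = balance + interest_amount - principal_amount
        if balance <= 0:
            return  # paid off; the final overpayment is not yielded
        yield balance

# With the sample figures from the question (b[8]=17183, b[9]=3.39, b[11]=5759):
result = [17183] + list(balances_until_paid(17183, 3.39, 5759))
print(result)
```

The final negative balance is simply dropped here; if you need to know the overpayment on the last installment, you would track it outside the generator instead.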
I had an interview with a hedge fund company in New York a few months ago and unfortunately, I did not get the internship offer as a data/software engineer. (They also asked the solution to be in Python.)
I pretty much screwed up on the first interview problem...
Question: Given a string of a million digits (Pi, for example), write
a function/program that returns every repeating 3-digit number together
with its count, for counts greater than 1
For example: if the string was: 123412345123456 then the function/program would return:
123 - 3 times
234 - 3 times
345 - 2 times
They did not give me the solution after I failed the interview, but they did tell me that the time complexity for the solution was constant of 1000 since all the possible outcomes are between:
000 --> 999
Now that I'm thinking about it, I don't think it's possible to come up with a constant time algorithm. Is it?
You got off lightly, you probably don't want to be working for a hedge fund where the quants don't understand basic algorithms :-)
There is no way to process an arbitrarily-sized data structure in O(1) if, as in this case, you need to visit every element at least once. The best you can hope for is O(n) in this case, where n is the length of the string.
Although, as an aside, a nominal O(n) algorithm will be O(1) for a fixed input size so, technically, they may have been correct here. However, that's not usually how people use complexity analysis.
It appears to me you could have impressed them in a number of ways.
First, by informing them that it's not possible to do it in O(1), unless you use the "suspect" reasoning given above.
Second, by showing your elite skills by providing Pythonic code such as:
inpStr = '123412345123456'
# O(1) array creation.
freq = [0] * 1000
# O(n) string processing.
for val in [int(inpStr[pos:pos+3]) for pos in range(len(inpStr) - 2)]:
    freq[val] += 1
# O(1) output of relevant array values.
print ([(num, freq[num]) for num in range(1000) if freq[num] > 1])
This outputs:
[(123, 3), (234, 3), (345, 2)]
though you could, of course, modify the output format to anything you desire.
And, finally, by telling them there's almost certainly no problem with an O(n) solution, since the code above delivers results for a one-million-digit string in well under half a second. It seems to scale quite linearly as well, since a 10,000,000-character string takes 3.5 seconds and a 100,000,000-character one takes 36 seconds.
And, if they need better than that, there are ways to parallelise this sort of stuff that can greatly speed it up.
Not within a single Python interpreter of course, due to the GIL, but you could split the string into overlapping chunks (the two-digit overlap, marked vv, is required to allow proper processing of the boundary areas):

    vv
123412
    123451
        vv
        5123456
You can farm these out to separate workers and combine the results afterwards.
The splitting of input and combining of output are likely to swamp any saving with small strings (and possibly even million-digit strings) but, for much larger data sets, it may well make a difference. My usual mantra of "measure, don't guess" applies here, of course.
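The split-and-combine idea can be sketched as follows (run sequentially here for clarity; each `count_chunk` call is what you would hand to a worker process, and the two-digit overlap keeps boundary triplets from being lost):

```python
from collections import Counter

def count_chunk(chunk):
    """Count all 3-digit windows in one chunk (one worker's share)."""
    return Counter(chunk[i:i+3] for i in range(len(chunk) - 2))

def split_with_overlap(s, n_chunks):
    """Split s into n_chunks pieces that overlap by 2 digits, so no
    3-digit window straddling a boundary is lost or double-counted."""
    size = len(s) // n_chunks  # assumes len(s) is comfortably > n_chunks
    chunks = []
    for k in range(n_chunks):
        start = k * size
        end = len(s) if k == n_chunks - 1 else (k + 1) * size + 2
        chunks.append(s[start:end])
    return chunks

def parallel_style_count(s, n_chunks=3):
    total = Counter()
    for chunk in split_with_overlap(s, n_chunks):  # could be a Pool.map
        total += count_chunk(chunk)
    return total
```

Each chunk contributes exactly the windows that start inside its non-overlapped region, so summing the per-chunk Counters reproduces the single-pass result.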
This mantra also applies to other possibilities, such as bypassing Python altogether and using a different language which may be faster.
For example, the following C code, running on the same hardware as the earlier Python code, handles a hundred million digits in 0.6 seconds, roughly the same amount of time as the Python code processed one million. In other words, much faster:
#include <stdio.h>
#include <string.h>

int main(void) {
    static char inpStr[100000000+1];
    static int freq[1000];

    // Set up test data.
    memset(inpStr, '1', sizeof(inpStr));
    inpStr[sizeof(inpStr)-1] = '\0';

    // Need at least three digits to do anything useful.
    if (strlen(inpStr) <= 2) return 0;

    // Get initial feed from first two digits, process others.
    int val = (inpStr[0] - '0') * 10 + inpStr[1] - '0';
    char *inpPtr = &(inpStr[2]);
    while (*inpPtr != '\0') {
        // Remove hundreds, add next digit as units, adjust table.
        val = (val % 100) * 10 + *inpPtr++ - '0';
        freq[val]++;
    }

    // Output (relevant part of) table.
    for (int i = 0; i < 1000; ++i)
        if (freq[i] > 1)
            printf("%3d -> %d\n", i, freq[i]);

    return 0;
}
Constant time isn't possible. All 1 million digits need to be looked at at least once, so that is a time complexity of O(n), where n = 1 million in this case.
For a simple O(n) solution, create an array of size 1000 that represents the number of occurrences of each possible 3 digit number. Advance 1 digit at a time, first index == 0, last index == 999997, and increment array[3 digit number] to create a histogram (count of occurrences for each possible 3 digit number). Then output the content of the array with counts > 1.
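As a minimal sketch of that histogram approach:

```python
def repeated_triples(digits):
    """One pass over the string; counts[v] is the number of
    occurrences of the 3-digit number v."""
    counts = [0] * 1000
    for i in range(len(digits) - 2):
        counts[int(digits[i:i+3])] += 1
    return {n: c for n, c in enumerate(counts) if c > 1}

print(repeated_triples('123412345123456'))  # {123: 3, 234: 3, 345: 2}
```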
A million digits is small for the answer below. Assuming only that the solution has to run during the interview, without a noticeable pause, the following works in under two seconds and gives the required result:
from collections import Counter

def triple_counter(s):
    # range end is len(s) + 1 so the final triple is included
    c = Counter(s[n-3: n] for n in range(3, len(s) + 1))
    for tri, n in c.most_common():
        if n > 1:
            print('%s - %i times.' % (tri, n))
        else:
            break

if __name__ == '__main__':
    import random
    s = ''.join(random.choice('0123456789') for _ in range(1_000_000))
    triple_counter(s)
Hopefully the interviewer would be looking for use of the standard library's collections.Counter class.
Parallel execution version
I wrote a blog post on this with more explanation.
The simple O(n) solution would be to count each 3-digit number:
for nr in range(1000):
    cnt = text.count('%03d' % nr)
    if cnt > 1:
        print '%03d is found %d times' % (nr, cnt)
This would search through all 1 million digits 1000 times.
Traversing the digits only once:
counts = [0] * 1000
for idx in range(len(text)-2):
    counts[int(text[idx:idx+3])] += 1

for nr, cnt in enumerate(counts):
    if cnt > 1:
        print '%03d is found %d times' % (nr, cnt)
Timing shows that iterating only once over the index is twice as fast as using count.
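A quick harness to reproduce that comparison (exact numbers will vary by machine; note also that `str.count` counts non-overlapping occurrences, so it can undercount a pattern like '111' in '1111', although both methods agree on inputs without such overlaps):

```python
import random
import timeit

def by_count(text):
    # 1000 passes over the string, one str.count per 3-digit pattern.
    # NB: str.count is non-overlapping, e.g. '1111' counts '111' once.
    return [text.count('%03d' % nr) for nr in range(1000)]

def by_single_pass(text):
    # One pass over the string, histogram of every (overlapping) window.
    counts = [0] * 1000
    for idx in range(len(text) - 2):
        counts[int(text[idx:idx + 3])] += 1
    return counts

text = ''.join(random.choice('0123456789') for _ in range(100_000))
t_count = timeit.timeit(lambda: by_count(text), number=1)
t_once = timeit.timeit(lambda: by_single_pass(text), number=1)
print('count(): %.3fs  single pass: %.3fs' % (t_count, t_once))
```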
Here is a NumPy implementation of the "consensus" O(n) algorithm: walk through all triplets and bin as you go. The binning is done by, upon encountering say "385", adding one to bin[3, 8, 5], which is an O(1) operation. Bins are arranged in a 10x10x10 cube. As the binning is fully vectorized, there is no loop in the code.
def setup_data(n):
    import random
    digits = "0123456789"
    return dict(text = ''.join(random.choice(digits) for i in range(n)))

def f_np(text):
    # Get the data into NumPy
    import numpy as np
    a = np.frombuffer(bytes(text, 'utf8'), dtype=np.uint8) - ord('0')
    # Rolling triplets
    a3 = np.lib.stride_tricks.as_strided(a, (3, a.size-2), 2*a.strides)
    bins = np.zeros((10, 10, 10), dtype=int)
    # Next line performs O(n) binning
    np.add.at(bins, tuple(a3), 1)
    # Filtering is left as an exercise
    return bins.ravel()

def f_py(text):
    counts = [0] * 1000
    for idx in range(len(text)-2):
        counts[int(text[idx:idx+3])] += 1
    return counts

import numpy as np
import types
from timeit import timeit

for n in (10, 1000, 1000000):
    data = setup_data(n)
    ref = f_np(**data)
    print(f'n = {n}')
    for name, func in list(globals().items()):
        if not name.startswith('f_') or not isinstance(func, types.FunctionType):
            continue
        try:
            assert np.all(ref == func(**data))
            print("{:16s}{:16.8f} ms".format(name[2:], timeit(
                'f(**data)', globals={'f':func, 'data':data}, number=10)*100))
        except:
            print("{:16s} apparently crashed".format(name[2:]))
Unsurprisingly, NumPy is a bit faster than @Daniel's pure Python solution on large data sets. Sample output:
# n = 10
# np 0.03481400 ms
# py 0.00669330 ms
# n = 1000
# np 0.11215360 ms
# py 0.34836530 ms
# n = 1000000
# np 82.46765980 ms
# py 360.51235450 ms
I would solve the problem as follows:
def find_numbers(str_num):
    final_dict = {}
    buffer = {}
    for idx in range(len(str_num) - 2):
        num = int(str_num[idx:idx + 3])
        if num not in buffer:
            buffer[num] = 0
        buffer[num] += 1
        if buffer[num] > 1:
            final_dict[num] = buffer[num]
    return final_dict
Applied to your example string, this yields:
>>> find_numbers("123412345123456")
{345: 2, 234: 3, 123: 3}
This solution runs in O(n) for n being the length of the provided string, and is, I guess, the best you can get.
As per my understanding, you cannot have the solution in constant time. It will take at least one pass over the million-digit number (assuming it's a string). You can have a 3-digit rolling iteration over the digits of the million-length number and increase the value of a hash key by 1 if it already exists, or create a new hash key (initialized to 1) if it doesn't exist already in the dictionary.
The code will look something like this:
def calc_repeating_digits(number):
    s = str(number)
    counts = {}
    for i in range(len(s) - 2):
        current_three_digits = s[i:i+3]
        if current_three_digits in counts:
            counts[current_three_digits] += 1
        else:
            counts[current_three_digits] = 1
    return counts
You can filter down to the keys which have item value greater than 1.
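That filtering step might look like this (using a small literal dict standing in for the function's return value):

```python
counts = {'123': 3, '341': 1, '234': 3, '345': 2}  # e.g. output of the counting pass
repeated = {k: v for k, v in counts.items() if v > 1}
print(repeated)  # {'123': 3, '234': 3, '345': 2}
```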
As mentioned in another answer, you cannot do this algorithm in constant time, because you must look at at least n digits. Linear time is the fastest you can get.
However, the algorithm can be done in O(1) space. You only need to store the counts of each 3 digit number, so you need an array of 1000 entries. You can then stream the number in.
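A sketch of that streaming idea: digits are consumed one at a time from any iterator (a file, a generator, ...), so only the fixed 1000-entry table and a rolling 3-digit value are ever held in memory.

```python
def count_triples_streaming(digit_stream):
    """O(1) extra space: a rolling 3-digit value plus a 1000-entry table."""
    counts = [0] * 1000
    it = iter(digit_stream)
    try:
        # Seed the rolling value with the first two digits.
        val = int(next(it)) * 10 + int(next(it))
    except StopIteration:
        return counts  # fewer than 3 digits: nothing to count
    for ch in it:
        # Drop the hundreds digit, append the new digit as units.
        val = (val % 100) * 10 + int(ch)
        counts[val] += 1
    return counts

c = count_triples_streaming(iter('123412345123456'))
print([(n, c[n]) for n in range(1000) if c[n] > 1])  # [(123, 3), (234, 3), (345, 2)]
```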
My guess is that either the interviewer misspoke when they gave you the solution, or you misheard "constant time" when they said "constant space."
Here's my answer:
from timeit import timeit
from collections import Counter
import types
import random

def setup_data(n):
    digits = "0123456789"
    return dict(text = ''.join(random.choice(digits) for i in range(n)))

def f_counter(text):
    c = Counter()
    for i in range(len(text)-2):
        ss = text[i:i+3]
        c.update([ss])
    return (i for i in c.items() if i[1] > 1)

def f_dict(text):
    d = {}
    for i in range(len(text)-2):
        ss = text[i:i+3]
        if ss not in d:
            d[ss] = 0
        d[ss] += 1
    return ((i, d[i]) for i in d if d[i] > 1)

def f_array(text):
    a = [[[0 for _ in range(10)] for _ in range(10)] for _ in range(10)]
    for n in range(len(text)-2):
        i, j, k = (int(ss) for ss in text[n:n+3])
        a[i][j][k] += 1
    for i, b in enumerate(a):
        for j, c in enumerate(b):
            for k, d in enumerate(c):
                if d > 1: yield (f'{i}{j}{k}', d)

for n in (1E1, 1E3, 1E6):
    n = int(n)
    data = setup_data(n)
    print(f'n = {n}')
    results = {}
    for name, func in list(globals().items()):
        if not name.startswith('f_') or not isinstance(func, types.FunctionType):
            continue
        print("{:16s}{:16.8f} ms".format(name[2:], timeit(
            'results[name] = f(**data)', globals={'f':func, 'data':data, 'results':results, 'name':name}, number=10)*100))
    for r in results:
        print('{:10}: {}'.format(r, sorted(list(results[r]))[:5]))
The array lookup method is very fast (even faster than @paul-panzer's NumPy method!). Of course, it cheats, since it isn't technically finished after it completes, because it's returning a generator. It also doesn't have to check every iteration whether the value already exists, which is likely to help a lot.
n = 10
counter 0.10595780 ms
dict 0.01070654 ms
array 0.00135370 ms
f_counter : []
f_dict : []
f_array : []
n = 1000
counter 2.89462101 ms
dict 0.40434612 ms
array 0.00073838 ms
f_counter : [('008', 2), ('009', 3), ('010', 2), ('016', 2), ('017', 2)]
f_dict : [('008', 2), ('009', 3), ('010', 2), ('016', 2), ('017', 2)]
f_array : [('008', 2), ('009', 3), ('010', 2), ('016', 2), ('017', 2)]
n = 1000000
counter 2849.00500992 ms
dict 438.44007806 ms
array 0.00135370 ms
f_counter : [('000', 1058), ('001', 943), ('002', 1030), ('003', 982), ('004', 1042)]
f_dict : [('000', 1058), ('001', 943), ('002', 1030), ('003', 982), ('004', 1042)]
f_array : [('000', 1058), ('001', 943), ('002', 1030), ('003', 982), ('004', 1042)]
Looks like a sliding window.
Here is my solution:
from collections import defaultdict

string = "103264685134845354863"
d = defaultdict(int)
for elt in range(len(string)-2):
    d[string[elt:elt+3]] += 1
d = {key: d[key] for key in d.keys() if d[key] > 1}
With a bit of creativity in the for loop (and an additional lookup list with True/False/None values, for example) you should be able to get rid of the last line, since you only want to add a key to the final dict once it has already been seen.
Hope it helps :)
-Telling from the perspective of C.
-You can have an int 3-d array results[10][10][10];
-Go from the 0th location to the (n-3)th location, where n is the size of the string array.
-At each location, check the current digit, the next, and the one after that.
-Increment the counter as results[current][next][next's next]++;
-Print the values of
results[1][2][3]
results[2][3][4]
results[3][4][5]
results[4][5][6]
results[5][6][7]
results[6][7][8]
results[7][8][9]
-It is O(n) time; there are no comparisons involved.
-You can run some parallel stuff here by partitioning the array and calculating the matches around the partitions.
inputStr = '123456123138276237284287434628736482376487234682734682736487263482736487236482634'
count = {}
for i in range(len(inputStr) - 2):
    subNum = int(inputStr[i:i+3])
    if subNum not in count:
        count[subNum] = 1
    else:
        count[subNum] += 1

print count
Given two numbers a and b, we have to find the nth number which is divisible by a or b.
The format looks like below:
Input :
First line consists of an integer T, denoting the number of test cases.
Second line contains three integers a, b and N
Output :
For each test case, print the Nth
number in a new line.
Constraints :
1 ≤ T ≤ 10^5
1 ≤ a, b ≤ 10^4
1 ≤ N ≤ 10^9
Sample Input
1
2 3 10
Sample Output
15
Explanation
The numbers which are divisible by 2
or 3 are: 2,3,4,6,8,9,10,12,14,15 and the 10th number is 15
My code
test_case = input()
if int(test_case) <= 100000 and int(test_case) >= 1:
    for p in range(int(test_case)):
        count = 1
        j = 1
        inp = list(map(int, input().strip('').split()))
        if inp[0] <= 10000 and inp[0] >= 1 and inp[1] <= 10000 and inp[1] >= 1 and inp[2] <= 1000000000 and inp[2] >= 1:
            while True:
                if count <= inp[2]:
                    k = j
                    if j % inp[0] == 0 or j % inp[1] == 0:
                        count = count + 1
                        j = j + 1
                    else:
                        j = j + 1
                else:
                    break
            print(k)
        else:
            break
Problem statement:
For the single test case input 2000 3000 100000 it takes more than one second to complete. I want to get the result in less than one second. Is there a more time-efficient approach to this problem, maybe using some data structure or algorithm?
For any two numbers a and b there is a number k = a*b, and there are only so many multiples of a and b up to k. That set can be written like so:
s = {a*1, b*1, ... a*(b-1), b*(a-1), a*b}
Say we take the values a=2, b=3 then s = (2,3,4,6). These are the possible values of c:
[1 - 4] => (2,3,4,6)
[5 - 8] => 6 + (2,3,4,6)
[9 - 12] => 6*2 + (2,3,4,6)
...
Notice that the values repeat with a predictable pattern. To get the row you can take the value of c and divide by length of the set s (call it n). The set index is the mod of c by n. Subtract 1 for 1 indexing used in the problem.
row = floor((c-1) / n)
column = (c-1) % n
result = (a*b)*row + s[column]
Python impl:
a = 2000
b = 3000
c = 100000
s = sorted(set([a*i for i in range(1, b+1)] + [b*i for i in range(1, a+1)]))
print((((c-1)//len(s)) * (a*b)) + s[(c - 1)%len(s)])
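The same arithmetic wrapped in a reusable function (the name is mine), checked against the question's sample case (a=2, b=3, N=10 gives 15):

```python
def nth_divisible(a, b, n):
    """n-th positive integer divisible by a or b, exploiting the fact
    that the pattern of such numbers repeats with period a*b."""
    s = sorted(set(list(range(a, a * b + 1, a)) + list(range(b, a * b + 1, b))))
    row, col = divmod(n - 1, len(s))
    return a * b * row + s[col]

print(nth_divisible(2, 3, 10))            # 15
print(nth_divisible(2000, 3000, 100000))  # 150000000
```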
I'm not certain I grasp exactly what you're trying to accomplish. But if I understand correctly, isn't the answer simply b*(N/2)? Since you are listing the multiples of both numbers, the Nth will always be the second number you list times N/2.
In your initial example that would be 3*10/2 = 15.
In the code example, it would be 3000*100000/2 = 150,000,000.
Update:
Code to compute the desired values, using sets and lists to speed up the calculation process. I'm still wondering what the recurrence for the odd indexes could be, if anyone happens to stumble upon it...
a = 2000
b = 3000
c = 100000
a_list = [a*x for x in range(1, c)]
b_list = [b*x for x in range(1, c)]
nums = set(a_list)
nums.update(b_list)
nums = sorted(nums)
print(nums[c-1])
This code runs in 0.14s on my laptop, which is significantly below the requested threshold. Nonetheless, these values will depend on the machine the code is run on.
I have a nice planning exercise, that I just can't get my head around (except for brute force, which is too big to handle). We are organizing a golf trip for 12 persons. We play 4 days of golf. With 3 flights every day. (so 12 flights in total).
We want to:
maximize the number of unique players that play each other in a flight,
but also want to minimize the number of double occurrences (2 players playing each other more than once).
Since with 12 players and 4 players per flight I can roughly create 36k player combinations per flight per day, it becomes pretty compute intense. Is there any smarter way to solve this? My gut feeling says fibonacci can help out, but not sure how exactly.
This is the code I have so far:
import random
import itertools
import pandas as pd

def make_player_combi(day):
    player_combis = []
    for flight in day:
        #print flight
        for c in itertools.combinations(flight, 2):
            combi = list(sorted(c))
            player_combis.append('-'.join(combi))
    return player_combis

def make_score(a, b, c):
    df = pd.DataFrame(a + b + c, columns=['player_combi'])['player_combi']
    combi_counts = df.value_counts()
    pairs_playing = len(combi_counts)
    double_plays = combi_counts.value_counts().sort_index()
    return pairs_playing, double_plays

players = ['A','B','C', 'D', 'E', 'F', 'G', 'H', 'I', 'J','K','L']
available_players = players[:]

n = 0
combinations_per_day = []
for players_in_flight_1 in itertools.combinations(players, 4):
    available_players_flight_1 = players[:]
    available_players_flight_2 = players[:]
    available_players_flight_3 = players[:]
    for player in players_in_flight_1:
        # players in flight 1 can no longer be used in flight 2 or 3
        available_players_flight_2.remove(player)
        available_players_flight_3.remove(player)
    for players_in_flight_2 in itertools.combinations(available_players_flight_2, 4):
        players_in_flight_3 = available_players_flight_3[:]
        for player in players_in_flight_2:
            players_in_flight_3.remove(player)
        n = n + 1
        print str(n), players_in_flight_1, players_in_flight_2, tuple(players_in_flight_3)
        combinations_per_day.append([players_in_flight_1, players_in_flight_2, tuple(players_in_flight_3)])

n_max = 100  # limit to 100 entries max per day to save calculations
winning_list = []
max_score = 0
for day_1 in range(0, len(combinations_per_day[0:n_max])):
    print day_1
    for day_2 in range(0, len(combinations_per_day[0:n_max])):
        for day_3 in range(0, len(combinations_per_day[0:n_max])):
            a = make_player_combi(combinations_per_day[day_1])
            b = make_player_combi(combinations_per_day[day_2])
            c = make_player_combi(combinations_per_day[day_3])
            x, y = make_score(a, b, c)
            if x >= max_score:
                max_score = x
                my_result = {'pairs_playing': x,
                             'double_plays': y,
                             'max_day_1': day_1,
                             'max_day_2': day_2,
                             'max_day_3': day_3}
                winning_list.append(my_result)
I chose the suboptimal (but good enough) solution by running 5 million samples and taking the lowest outcome... Thanks all for thinking along.
You can brute-force this by eliminating symmetries. Let's call the 12 players a,b,c,d,e,f,g,h,i,j,k,l, write a flight with 4 players concatenated together: degj, and a day's schedule with the three flights: eg. abcd-efgh-ijkl.
The first day's flights are arbitrary: say the 3 flights are abcd-efgh-ijkl.
On the second and third day, you have fewer than (12 choose 4) * (8 choose 4) possibilities, because that counts each distinct schedule 6 times. For example, abcd-efgh-ijkl, efgh-abcd-ijkl and ijkl-efgh-abcd are all counted as separate, but they are essentially the same. In fact, you have (12 choose 4) * (8 choose 4) / 6 = 5775 different schedules.
Overall, this gives you 5775 * 5775 = 33350625, a manageable 33 million to check.
We can do a little better: we might as well require that the day 2 and day 3 schedules are different, and avoid counting schedules that are the same except that day 2 and day 3 are swapped. This gives us almost another factor of 2.
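Those counts are quick to sanity-check with `math.comb`:

```python
from math import comb

per_day = comb(12, 4) * comb(8, 4) // 6  # distinct one-day schedules
print(per_day)           # 5775
print(per_day ** 2)      # 33350625: all (day 2, day 3) combinations
print(comb(per_day, 2))  # 16672425: day 2 != day 3, order ignored
```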
Here's code that does all that:
import itertools
import collections

# schedules generates all possible schedules for a given day, up
# to symmetry.
def schedules(players):
    for c in itertools.combinations(players, 4):
        for d in itertools.combinations(players, 4):
            if set(c) & set(d):
                continue
            e = set(players) - set(c) - set(d)
            sc = ''.join(sorted(c))
            sd = ''.join(sorted(d))
            se = ''.join(sorted(e))
            if sc < sd < se:
                yield sc, sd, se

# all_schedules generates all possible (different up to symmetry) schedules for
# the three days.
def all_schedules(players):
    s = list(schedules(players))
    d1 = s[0]
    for d2, d3 in itertools.combinations(s, 2):
        yield d1, d2, d3

# pcount returns a Counter that records how often each pair
# of players play each other.
def pcount(*days):
    players = collections.Counter()
    for d in days:
        for flight in d:
            for p1, p2 in itertools.combinations(flight, 2):
                players[p1+p2] += 1
    return players

def score(*days):
    p = pcount(*days)
    return len(p), sum(-1 for v in p.itervalues() if v > 1)

best = None
for days in all_schedules('abcdefghijkl'):
    s = score(*days)
    if s > best:
        best = s
        print days, s
It still takes some time to run (about 10-15 minutes on my computer), and produces this as the last line of output:
abcd-efgh-ijkl abei-cfgj-dhkl abhj-cekl-dfgi (48, -3)
That means there are 48 unique pairings over the three days, and 3 pairs who play each other more than once (ab, fg and kl).
Note that each of the three pairs who play each other more than once play each other on every day. That's unfortunate, and probably means you need to tweak your idea of how to score schedules. For example, excluding schedules where the same pair plays more than twice, and taking the soft-min of the number of players each player sees, gives this solution:
abcd-efgh-ijkl abei-cfgj-dhkl acfk-begl-dhij
This has 45 unique pairings, and 9 pairs that play each other more than once. But every player meets at least 7 different players and might be preferable in practice to the "optimal" solution above.
Hi, I am a newbie to Python and I am having a bit of a hard time understanding this simple while loop. This program is supposed to calculate the time it takes for the bacteria to double.
time = 0
population = 1000   # 1000 bacteria to start with
growth_rate = 0.21  # 21% growth per minute
while population < 2000:
    population = population + growth_rate * population
    print population
    time = time + 1

print "It took %d minutes for the bacteria to double." % time
print "...and the final population was %6.2f bacteria." % population
and the result is:
1210.0
1464.1
1771.561
2143.58881
It took 4 minutes for the bacteria to double.
...and the final population was 2143.59 bacteria.
What I don't get is why the final result is greater than 2000, since the loop is supposed to stop before 2000. Am I getting something wrong?
Your code reads: "As long as the population is less than 2000, calculate the population of the next generation and then check again". Hence, it will always calculate one generation too many.
Try this:
while True:
    nextGen = population + growth_rate * population
    if nextGen > 2000: break
    population = nextGen
    print population
    time = time + 1
EDIT:
Or to get the exact result:
print (math.log (2) / math.log (1 + growth_rate) )
So the whole program could be:
import math
population = 1000
growth_rate = 0.21 # 21% growth per minute
t = math.log (2) / math.log (1 + growth_rate)
print 'It took {} minutes for the bacteria to double.'.format (t)
print '...and the final population was {} bacteria.'.format (2 * population)
Because prior to your last iteration (#4 below) it IS below 2,000.
Iteration #1: 1210.0
Iteration #2: 1464.1
Iteration #3: 1771.561
Iteration #4: 2143.58881
Another way to do this, although perhaps less elegant, would be to add a break in your while loop, like this (assuming all you care about is not printing any number higher than 2,000):
while population < 2000:
    population = population + growth_rate * population
    if population >= 2000:
        break
    else:
        print population
        time = time + 1
At the penultimate iteration of the loop the population was less than 2,000, so there was another iteration. On the final iteration the population became more than 2,000 and so the loop exited.
If the population increased by 1 each time then you're correct; the loop would have exited at 2,000. You can see this behaviour using a simpler version:
i = 0
while i < 10:
    i += 1
    print i
Vary the amount that i increases by in order to see how it changes.
A while loop is an example of an "entry controlled" loop: the condition is checked only before entering the loop body. This means that if the condition becomes violated partway through an iteration, the loop only terminates at the start of the next iteration, after the current body has already run to completion.
Here is a simple example:
>>> a = 1
>>> while a < 5:
...     a = a+3
...     print a
...
4
7
So, if you want that your loop must exit before a is greater than or equal to 5, you must do the check in the loop itself and break out of the loop:
>>> a = 1
>>> while a < 5:
...     a = a+3
...     if a < 5:
...         print a
...     else:
...         break
...
4
So your code right here:
while population < 2000:
    population = population + growth_rate*population
Say we enter that while loop with population = 1800 and growth_rate = .21. This satisfies the condition for the loop to run another iteration. But in the next line you set population = 1800 + (.21)*1800, which equals 2178. So when you print population, it shows 2178 even though it's higher than 2000.
What you could do is something like this:

while population < 2000:
    if population + growth_rate * population < 2000:
        population = population + growth_rate * population
        time = time + 1
    else:
        break
print population
time = 0
population = 1000   # 1000 bacteria to start with
growth_rate = 0.21  # 21% growth per minute
target = 2 * population
while population < target:
    population = population + (growth_rate * population)
    time = time + 1

print "It took %d minutes for the bacteria to double." % time
I think you will be getting the right output now.
I think this is a simple math problem - you're expecting the time to be an integer, although it probably is a floating point number. The loop can't stop when the population reaches exactly 2000, because the way you're calculating it, it never takes the value 2000.