How to efficiently perform addition over large loops in python

I am trying to perform addition efficiently in Python over large loops. I am looping over a range of 100000000.
from datetime import datetime
start_time = datetime.now()
sum = 0
for i in range(100000000):
    sum+=i
end_time = datetime.now()
print('--- %s seconds ---{}'.format(end_time - start_time))
print(sum)
The output from the above code is
--- %s seconds ---0:00:16.662666
4999999950000000
When I try to do it in C, it takes 0.43 seconds.
From what I have read, Python allocates a new object every time you perform addition on a variable. I read some articles and learned how to handle string concatenation in these situations by avoiding the '+' operator, but I can't find anything on how to do the same with integers.

Consider using the built-in sum() function if you can process the sequence as a whole. It loops entirely in C code, which is much faster, and it also avoids the creation of new Python objects.
sum(range(100000000))
On my computer, your code takes 7.189210 seconds, while the above statement takes 2.751251 seconds, which is roughly 2.6 times faster.
Edit: as suggested by mtrw, numpy.sum() can speed up processing even more.

Here is a comparison of three methods: your original way, using sum(range(100000000)) as suggested by Alex Metsai, and using the NumPy numerical library's sum and range functions:
from datetime import datetime
import numpy as np
def orig():
    start_time = datetime.now()
    sum = 0
    for i in range(100000000):
        sum+=i
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(sum)
def pyway():
    start_time = datetime.now()
    mysum = sum(range(100000000))
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(mysum)
def npway():
    start_time = datetime.now()
    sum = np.sum(np.arange(100000000))
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(sum)
On my computer, I get:
>>> orig()
--- %s seconds ---0:00:09.504018
4999999950000000
>>> pyway()
--- %s seconds ---0:00:02.382020
4999999950000000
>>> npway()
--- %s seconds ---0:00:00.683411
4999999950000000
NumPy is the fastest, if you can use it in your application.
But, as suggested by Ethan in a comment, it's worth pointing out that calculating the answer directly is by far the fastest:
def mathway():
    start_time = datetime.now()
    mysum = 99999999*(99999999+1)/2
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(mysum)
>>> mathway()
--- %s seconds ---0:00:00.000013
4999999950000000.0
I assume your actual problem is not so easily solved by pencil and paper :)
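The closed form used in mathway() is the arithmetic-series identity 0 + 1 + ... + (n - 1) = n(n - 1)/2. A minimal sketch, assuming a hypothetical helper name sum_below that does not appear in the answers; it uses floor division (//) so the result stays an exact int instead of the float that / produces:

```python
def sum_below(n):
    """Return 0 + 1 + ... + (n - 1) using the closed form n * (n - 1) / 2."""
    # n * (n - 1) is always even, so floor division loses nothing here.
    return n * (n - 1) // 2

# Cross-check the formula against the built-in sum() on a small range.
assert sum_below(1000) == sum(range(1000))

print(sum_below(100000000))  # 4999999950000000
```

Because Python integers have arbitrary precision, this stays exact for values of n where the float result of / would start to round.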

Related

Is there a better function than 'computeSVD()' that uses MapReduce, in terms of execution time?

I ran computeSVD() on a large matrix, and its execution time is very long compared with what I expected from a function that uses MapReduce, which normally improves execution time.
I compared these two functions:
start_time = time.time()
number_of_documents=200
L,S,R=np.linalg.svd(X)  # does not use MapReduce
exemple_three = time.time() - start_time
print("---Exemple three : %s seconds ---" % (exemple_three))
output:
---Exemple three : 5.322664976119995 seconds ---
and the second one, computeSVD():
start_time = time.time()
number_of_documents=200
svd = mat.computeSVD(5, computeU=True)  # uses MapReduce
exemple_two = time.time() - start_time
print("---Exemple one : %s seconds ---" % (exemple_two))
output:
---Exemple one : 252.04261994361877 seconds ---
My goal is to find a similar function that uses MapReduce but runs faster.

ValueError: time data '6.9141387939453125e-06' does not match format '%H/%M/%S'

I have this call:
start_time = time.time()
It is producing some wild numbers, e.g.:
start time: 1611368981.2445016
That leads to the following error:
ValueError: time data '6.9141387939453125e-06' does not match format '%H/%M/%S'
Why is time.time() producing such wild values, and how do I get it into a normal format?
The lines of code in question:
if plant_warning_flag == 0:
    start_time = time.time() #start time flag for plant temperature warning
    plant_warning_flag = 1
elapsed_time = time.time() - start_time
print("start time: ", start_time)
print('my elapsed time: ', elapsed_time)
newelaptime = time.strptime(str(elapsed_time), "%H/%M/%S")
newmthactime = pd.to_datetime(maxtime_heatac.strip(), format='%H:%M:%S')
if newcread > plant_warning + critical_threshold_ac:
    session = requests.Session()
    session.post(acserver_url,headers=headers,data=payload)
elif newlaptime > newmthactime:
    payload = {'on1':'4000'}
    session = requests.Session()
    print('peek-a-boo')
How do I get the value of time.time() in a regular date format?

Your error seems to be coming from this line:
newelaptime = time.strptime(str(elapsed_time), "%H/%M/%S")
time.time() returns Unix time (the number of seconds since the Unix epoch, 1 January 1970), which is a plain number, and elapsed_time is the difference between two such numbers. You are converting that number into a string and then asking strptime to turn it into a time object. That fails because strptime parses a human-readable string in the given format; it does not interpret a raw number of seconds.
Unix time is great for arithmetic and for calculating time ranges, but it is not easy for a human to read :)
To get a human-readable value, try something like this:
datetime.datetime.utcfromtimestamp(YOUR_UNIX_TIMESTAMP).strftime('%Y-%m-%dT%H:%M:%SZ')
It will give you a string of the form 'YYYY-mm-ddTHH:MM:SSZ'.
A shorter version like this:
datetime.datetime.utcfromtimestamp(YOUR_UNIX_TIMESTAMP)
will give you a datetime object, which is useful when you want to work in calendar units such as months and years rather than raw seconds.
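To make the distinction concrete: a value from time.time() is a point in time, while the difference between two such values is a duration, and the two are formatted differently. A minimal sketch, assuming Python 3 and two hypothetical helper names (format_timestamp, format_elapsed) that are not part of the original code; it uses datetime.fromtimestamp with an explicit UTC timezone as an alternative to utcfromtimestamp:

```python
from datetime import datetime, timedelta, timezone

def format_timestamp(unix_seconds):
    """Format a point in time (seconds since the epoch) as a UTC string."""
    moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return moment.strftime('%Y-%m-%dT%H:%M:%SZ')

def format_elapsed(elapsed_seconds):
    """Format a duration in seconds as H:MM:SS, dropping the fraction."""
    # str(timedelta) prints H:MM:SS for durations shorter than one day.
    return str(timedelta(seconds=int(elapsed_seconds)))

print(format_timestamp(1611368981.2445016))
print(format_elapsed(3725.8))
```

The resulting timedelta or datetime objects can then be compared with other values of the same type, instead of comparing strings or raw floats.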

You can use strftime and pass it the format you want, together with the local time:
from time import localtime, strftime
result = strftime("%H:%M:%S", localtime())
print(result)

Evaluating linear search execution speed

When I learned how to code, someone told me that the "break" instruction was not elegant from an algorithmic perspective and should not be used. However, I have compared the execution speed of two versions of the linear search algorithm, and the version with a for loop and break is always faster.
Any opinions?
import numpy as np
import random
import time
n = 100000;
x = np.arange(0,n)
random.shuffle(x)
k=30 # the number to search for
#--- OPTION 1: LINEAR SEARCH USING WHILE LOOP
start_time = time.time()
i=0
while (x[i]!=k) and (i<n-1):
    i+=1
print(i)
print("V1 --- %s seconds ---" % (time.time() - start_time))
#--- OPTION 2: LINEAR SEARCH USING FOR LOOP w/ BREAK
start_time = time.time()
for i in range(0,n):
    if x[i]==k:
        break
print(i)
print("V2 --- %s seconds ---" % (time.time() - start_time))

Don't listen to "someone". Everyone uses break statements. There is absolutely nothing wrong with using them. They can make your code both easier to understand and simpler: a win-win as far as I'm concerned.
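For reference, the two styles from the question can be written as functions that return the same index, which makes it easy to check that they agree before timing them. A minimal sketch in plain Python; the names search_while and search_for_break are illustrative assumptions, and it searches a plain list rather than the NumPy array used in the question:

```python
import random
import timeit

def search_while(values, target):
    """Linear search with a while loop; returns the index or -1."""
    i = 0
    n = len(values)
    # Check the bound first so values[i] is never read out of range.
    while i < n and values[i] != target:
        i += 1
    return i if i < n else -1

def search_for_break(values, target):
    """Linear search with a for loop that breaks on the first match."""
    found = -1
    for i, value in enumerate(values):
        if value == target:
            found = i
            break  # stop scanning as soon as the target is found
    return found

values = list(range(10000))
random.shuffle(values)
target = 30

# Both versions must agree before their timings are worth comparing.
assert search_while(values, target) == search_for_break(values, target)

# timeit calls each function repeatedly, which smooths out one-off noise.
print("while:    ", timeit.timeit(lambda: search_while(values, target), number=20))
print("for/break:", timeit.timeit(lambda: search_for_break(values, target), number=20))
```

The timings depend on the machine and on where the target lands after shuffling, so no particular numbers are implied here.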

How to properly use time.time()

I am trying to time a running function, and I need to know how many hours/minutes/seconds it takes. I am using time.time(), but I don't understand the output. How can I convert this output into the hours/minutes/seconds the function took? Or is there a more suitable library?
import time
starttime = time.time()
x=0
for i in range(100000):
    x+=i
endtime = time.time()
print('Job took: ', endtime-starttime)

I'd recommend using time.perf_counter instead of time.time, and using timedelta to format the units:
>>> from datetime import timedelta
>>> import time
>>> starttime = time.perf_counter()
>>> x=0
>>> for i in range(100000):
...     x+=i
...
>>> duration = timedelta(seconds=time.perf_counter()-starttime)
>>> print('Job took: ', duration)
Job took: 0:00:00.015017
The benefit of using perf_counter is that it is unaffected by things like the system clock being adjusted while you're measuring, and it uses the clock with the highest available resolution (which may be important if you're timing very quick events).
In either case, the return value is measured in seconds, but you need to know which function it came from to know what the float corresponds to: time.time() counts from the epoch, while perf_counter() has an undefined reference point, so only the difference between two calls is meaningful. timedelta is a nicer way to represent a duration than a bare float, IMO, because it includes the units.
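If the hours, minutes, and seconds are needed as separate numbers rather than as a formatted timedelta, the elapsed seconds can also be split with divmod. A minimal sketch; split_hms is a hypothetical helper name, not a standard-library function:

```python
import time

def split_hms(elapsed_seconds):
    """Split a duration in seconds into (hours, minutes, seconds)."""
    hours, remainder = divmod(elapsed_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    # Hours and minutes are whole numbers; seconds keeps its fraction.
    return int(hours), int(minutes), seconds

start = time.perf_counter()
x = 0
for i in range(100000):
    x += i
elapsed = time.perf_counter() - start

hours, minutes, seconds = split_hms(elapsed)
print('Job took: {}h {}m {:.6f}s'.format(hours, minutes, seconds))
```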

time.time():
The time() function returns the number of seconds that have passed since the epoch.
On Unix systems, the epoch (the point where time begins) is
January 1, 1970, 00:00:00 UTC.
import time
seconds = time.time()
print("Seconds since epoch =", seconds)
This might not be what you want

time.time() gives the number of seconds since the epoch at the moment it is called. Therefore endtime-starttime gives you the number of seconds between the beginning and the end of the loop.
A preferable way to measure elapsed time in Python is to use datetime:
import datetime
starttime = datetime.datetime.now()
x=0
for i in range(100000):
    x+=i
endtime = datetime.datetime.now()
diff = endtime - starttime
print('Job took: ', diff.days, diff.seconds, diff.microseconds)

time a script with less than seconds

I have this script, but it reports the time in seconds while the script finishes in less than a second.
import time
start = time.time()
p=[1,2,3,4,5]
print('It took {0:0.1f} seconds'.format(time.time() - start))
Python 3.7 has a new function that can do that, but I have 3.6.5. How do I do that?

time.perf_counter(), available since Python 3.3, gives you access to a high-resolution clock for measuring short durations.
import time
t0 = time.perf_counter()
time.sleep(.1)
print(time.perf_counter() - t0)

It doesn't count in whole seconds. It counts in fractions of a second; the script simply finishes faster than the precision of the formatted float ({0:0.1f}, one decimal place) can show, i.e. in much less than a second.
Try:
import time
start = time.time()
p=[1,2,3,4,5]
time.sleep(0.5)
print('It took {0:0.1f} seconds'.format(time.time() - start))
Also, for shorter durations you may want to increase the precision of your float formatter (e.g. {0:0.3f}), so that a short sleep (e.g. 0.007) is not printed to the console as 0.0.
import time
start = time.time()
p=[1,2,3,4,5]
time.sleep(0.007)
print('It took {0:0.3f} seconds'.format(time.time() - start))
Or just remove the formatter entirely (As commented by Inder):
import time
start = time.time()
p=[1,2,3,4,5]
time.sleep(0.007)
print ('It took ' + str(time.time()-start) + ' seconds')
See here for more details of timer resolution: https://docs.python.org/2/library/time.html
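For code that finishes in microseconds, a single start/stop measurement is mostly noise. One common alternative, offered here as a suggestion rather than something taken from the answers above, is the standard-library timeit module, which runs the code many times and reports the total. A minimal sketch, where build_list is a hypothetical wrapper around the list creation from the question:

```python
import timeit

def build_list():
    """The work being timed: the list creation from the question."""
    p = [1, 2, 3, 4, 5]
    return p

runs = 100000
# timeit.timeit returns the total time in seconds for all runs combined.
total = timeit.timeit(build_list, number=runs)
per_run = total / runs

print('Total for {} runs: {:.6f} seconds'.format(runs, total))
print('Average per run: {:.9f} seconds'.format(per_run))
```

The average includes the overhead of the function call itself, so it is best read as an upper bound on the cost of the statement.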
