concurrent.futures problems - python

So I have this code that needs to use the concurrent.futures module, and for some reason it tells me the module does not exist. I have looked it up and I cannot find what the problem is. I tried installing the tools I need from it, thinking that was the issue, but I can only get one of them to download.
Error message:
from concurrent.futures import ProcessPoolExecutor
ModuleNotFoundError: No module named 'concurrent.futures';
'concurrent' is not a package
My code:
import requests, time
from concurrent.futures import ProcessPoolExecutor

sites = ["http://www.youtube.com"]

def get_one(site):
    resp = requests.get(site)
    size = len(resp.content)
    print(f"downloaded {size} bytes from {site}")
    return size

def main():
    start = time.perf_counter()
    with ProcessPoolExecutor() as pool:
        total_size = sum(pool.map(get_one, sites))
    end = time.perf_counter()
    print(f"elapsed time: {end - start} seconds")
    print(f"downloaded a total of {total_size} bytes")

if __name__ == "__main__":
    main()
I know that normally there should be a file when I say "from", but everything I look up says concurrent.futures is part of Python, yet for some reason mine will not work properly. If it is out there, do I have to install it?

I found that I had a file named concurrent.py in my folder that was messing everything up!
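A quick way to confirm this kind of shadowing is to check which file Python actually imports for the name; a minimal sketch:

```python
import concurrent
from concurrent.futures import ProcessPoolExecutor

# If a stray concurrent.py (or a concurrent/ folder) sits next to your
# script, __file__ points into your project instead of the stdlib.
print(concurrent.__file__)
print(ProcessPoolExecutor.__name__)
```

If the printed path is inside your project folder, rename or delete that file and the import will resolve to the standard library again.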

Related

Sensor data with Python does not get written to file

I'm currently working on a script for the sensor on my Raspberry Pi. The code below should get the values of my sensor and write them into the data.json file. My problem is: if I run the script in the Thonny editor everything works, but if I add the script to my crontab menu the data does not get written to the data.json file.
The Code:
import time
import board
import adafruit_dht
import psutil
import io
import json
import os
from gpiozero import LED
from datetime import date
from datetime import datetime

# We first check if a libgpiod process is running. If yes, we kill it!
for proc in psutil.process_iter():
    if proc.name() == "libgpiod_pulsein" or proc.name() == "libgpiod_pulsei":
        proc.kill()

sensor = adafruit_dht.DHT11(board.D23)

# init
temp_values = [10]
hum_values = [10]
counter = 0
dataLED = LED(13)
dataList = []

def errSignal():
    for i in range(0, 3):
        dataLED.on()
        time.sleep(0.1)
        dataLED.off()
        time.sleep(0.1)

# on startup
def runSignal():
    for i in range(0, 5):
        dataLED.on()
        time.sleep(0.2)
        dataLED.off()
        time.sleep(0.2)

def getExistingData():
    global dataList  # without this, the loaded data would be discarded
    with open('data.json') as fp:
        dataList = json.load(fp)
    print(dataList)

def startupCheck():
    if os.path.isfile("data.json") and os.access("data.json", os.R_OK):
        # checks if file exists
        print("File exists and is readable.")
        # get json data and push into arr on startup
        getExistingData()
    else:
        print("Either file is missing or is not readable, creating file...")
        # create json file
        with open("data.json", "w") as f:
            print("The json file is created.")

def calc_avgValue(values):
    sum = 0
    for iterator in values:
        sum += iterator
    return sum / len(values)

def onOFF():
    dataLED.on()
    time.sleep(0.7)
    dataLED.off()

# data led blinking on startup
runSignal()
# checks if file exists
startupCheck()

while True:
    try:
        temp_values.insert(counter, sensor.temperature)
        hum_values.insert(counter, sensor.humidity)
        counter += 1
        time.sleep(6)
        if counter >= 10:
            print(
                "Temperature: {}*C Humidity: {}% ".format(
                    round(calc_avgValue(temp_values), 2),
                    round(calc_avgValue(hum_values), 2)
                )
            )
            # get time
            today = date.today()
            now = datetime.now()
            # create json obj
            data = {
                "temperature": round(calc_avgValue(temp_values), 2),
                "humidity": round(calc_avgValue(hum_values), 2),
                "fullDate": str(today),
                "fullDate2": str(today.strftime("%d/%m/%Y")),
                "fullDate3": str(today.strftime("%B %d, %Y")),
                "fullDate4": str(today.strftime("%b-%d-%Y")),
                "date_time": str(now.strftime("%d/%m/%Y %H:%M:%S"))
            }
            # push data into list
            dataList.append(data)
            # writing to data.json
            with open("data.json", "w") as f:
                json.dump(dataList, f, indent=4, separators=(',', ': '))
            # if data is written signal appears
            onOFF()
            print("Data has been written to data.json...")
            counter = 0
    except RuntimeError as error:
        continue
    except Exception as error:
        sensor.exit()
        while True:
            errSignal()
        raise error
    time.sleep(0.2)
Crontab menu (screenshot not included): the line in the center is the script.
Investigation areas:
Do not put & in crontab; it serves no purpose.
Capture the output of your scripts to see what is going on. You do this by adding >/tmp/stats.out 2>/tmp/stats.err (and similar for the other 2 lines). You will then see what output and errors your scripts produce.
cron does not run your scripts in the same environment, or from the same directory, that you run them from. Load what you require in the script.
cron might not have permission to write data.json in the directory it runs from. Specify a full path, and ensure cron can write in that directory.
Look at https://unix.stackexchange.com/questions/109804/crontabs-reboot-only-works-for-root for usage of @reboot. Things that should occur at startup should be configured through systemd or init.d (I do not know what the Raspberry Pi uses vs your distro). Cron is for scheduling jobs, not for running things at startup.
It could be as simple as python3 not being in the PATH cron is configured with.
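One way to make the script immune to cron's different working directory is to build every path from the script's own location. A sketch, assuming the data.json name from the question:

```python
import json
import os

# Resolve data.json relative to this script, not the current working
# directory, so the same file is used whether cron or Thonny runs it.
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
DATA_PATH = os.path.join(BASE_DIR, "data.json")

def append_reading(reading):
    """Load existing readings (if any), append one, and write back."""
    data = []
    if os.path.isfile(DATA_PATH) and os.path.getsize(DATA_PATH) > 0:
        with open(DATA_PATH) as f:
            data = json.load(f)
    data.append(reading)
    with open(DATA_PATH, "w") as f:
        json.dump(data, f, indent=4)
```

The `os.path.getsize` guard also avoids the `json.load` crash on the empty file that `startupCheck()` creates on first run.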

python: monitor updates in /proc/mydev file

I wrote a kernel module that writes in /proc/mydev to notify the python program in userspace. I want to trigger a function in the python program whenever there is an update of data in /proc/mydev from the kernel module. What is the best way to listen for an update here? I am thinking about using "watchdog" (https://pythonhosted.org/watchdog/). Is there a better way for this?
This is an easy and efficient way:

import os
from time import sleep
from datetime import datetime

def my_function(_time):
    print("file modified, time: " + datetime.fromtimestamp(_time).strftime("%H:%M:%S"))

if __name__ == "__main__":
    _time = 0
    while True:
        last_modified_time = os.stat("/proc/mydev").st_mtime
        if last_modified_time > _time:
            my_function(last_modified_time)
            _time = last_modified_time
        sleep(1)  # prevent high cpu usage
result:
file modified, time: 11:44:09
file modified, time: 11:46:15
file modified, time: 11:46:24
The while loop guarantees that the program keeps listening to changes forever.
You can set the interval by changing the sleep time. Low sleep time causes high CPU usage.
import os
import select

# get the file descriptor for the proc file
fd = os.open("/proc/mydev", os.O_RDONLY)
# create a polling object to monitor the file for updates
poller = select.poll()
poller.register(fd, select.POLLIN)
# loop forever, monitoring the file for updates
while True:
    events = poller.poll(10000)
    if len(events) > 0:
        # seek back to the start, then read the updated contents
        os.lseek(fd, 0, os.SEEK_SET)
        print(os.read(fd, 1024))
sudo pip install inotify
Example
Code for monitoring a simple, flat path (see “Recursive Watching” for watching a hierarchical structure):
import inotify.adapters

def _main():
    i = inotify.adapters.Inotify()
    i.add_watch('/tmp')
    with open('/tmp/test_file', 'w'):
        pass
    for event in i.event_gen(yield_nones=False):
        (_, type_names, path, filename) = event
        print("PATH=[{}] FILENAME=[{}] EVENT_TYPES={}".format(
            path, filename, type_names))

if __name__ == '__main__':
    _main()
Expected output:
PATH=[/tmp] FILENAME=[test_file] EVENT_TYPES=['IN_MODIFY']
PATH=[/tmp] FILENAME=[test_file] EVENT_TYPES=['IN_OPEN']
PATH=[/tmp] FILENAME=[test_file] EVENT_TYPES=['IN_CLOSE_WRITE']
I'm not sure if this would work for your situation, since it seems that you're wanting to watch a folder, but this program watches a file at a time until the main() loop repeats:
import os
import time

def main():
    contents = os.listdir("/proc/mydev")
    for file in contents:
        with open("/proc/mydev/" + file, "r") as f:
            init = f.read()
        different = False
        while not different:
            with open("/proc/mydev/" + file, "r") as f:
                check = f.read()
            if init != check:
                different = True
            else:
                time.sleep(1)
    # Write what you would want to happen if a change occurred here...
    main()

main()
You could then write what you would want to happen right before the last usage of main(), as it would then repeat.
Also, this may contain errors, since I rushed this.
Hope this at least helps!
You can't do this efficiently without modifying your kernel driver.
Instead of using procfs, have it register a new character device under /dev, and write that driver to make new content available to read from that device only when new content has in fact come in from the underlying hardware, such that the application layer can issue a blocking read and have it return only when new content exists.
A good example to work from (which also has plenty of native Python clients) is the evdev devices in the input core.
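The read-blocks-until-data behaviour that answer describes can be sketched with a pipe standing in for the character device (the real device path would be something like /dev/mydev, which is hypothetical here):

```python
import os

# A pipe plays the role of the character device: os.read() on the read
# end blocks until the writer (the kernel driver, in the real setup)
# supplies data, so the application never busy-polls.
read_fd, write_fd = os.pipe()

os.write(write_fd, b"new sensor data")  # driver side: new content arrives
chunk = os.read(read_fd, 4096)          # app side: returns only now
print(chunk.decode())

os.close(read_fd)
os.close(write_fd)
```

With a real character device the driver's read handler would sleep on a wait queue until the hardware produces data, giving the same blocking semantics.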

Parallel processing in Python: options and alternatives

I tried joblib, however, I got stuck at setting the processor affinity as explained here (the error is shown below along with my script).
Now I want to know if there are other options or alternatives that would allow me to accomplish the same goal, which is to run the same script in parallel, using my 8 cores (in a fashion that resembles GNU parallel).
Error:
AttributeError: 'Process' object has no attribute 'set_cpu_affinity'
My script:
from datetime import datetime
from subprocess import call
from joblib import Parallel, delayed
import multiprocessing
import psutil
import os

startTime = datetime.now()

pdb_name_list = []
for filename in os.listdir('/home/labusr/Documents/python_scripts/Spyder/Refinement'):
    if filename.endswith(".pdb"):
        pdb_name_list.append(filename)

num_cores = multiprocessing.cpu_count()
p = psutil.Process(os.getpid())
p.set_cpu_affinity(range(num_cores))
print(p.get_cpu_affinity())

inputs = range(2)

def structure_refine(file_name, i):
    print('Refining structure %s round %s......\n' % (file_name, i))
    call(['/home/labusr/rosetta/main/source/bin/rosetta_scripts.linuxgccrelease',
          '-in::file::s', '/home/labusr/Documents/python_scripts/Spyder/_Refinement/%s' % file_name,
          '-parser::protocol', '/home/labusr/Documents/A_asymm_refine.xml',
          '-parser::script_vars', 'denswt=35', 'rms=1.5', 'reso=4.3',
          'map=/home/labusr/Documents/tubulin_exercise/masked_map_center.mrc',
          'testmap=/home/labusr/Documents/tubulin_exercise/mmasked_map_top_centered_resampled.mrc',
          '-in:ignore_unrecognized_res',
          '-edensity::mapreso', '4.3',
          '-default_max_cycles', '200',
          '-edensity::cryoem_scatterers',
          '-beta',
          '-out::suffix', '_%s' % i,
          '-crystal_refine'])
    print('Time for refining %s round %s is: \n' % (file_name, i), datetime.now() - startTime)

for file_name in pdb_name_list:
    Parallel(n_jobs=num_cores)(delayed(structure_refine)(file_name, i) for i in inputs)
The simplest thing to do is to just launch multiple Python processes from e.g. a command line. To make each process handle its own file, you can pass it when invoking Python:
python myscript.py filename
The passed filename is then available in Python via
import sys
filename = sys.argv[1]

What File Descriptor object does Python AsyncIO's loop.add_reader() expect?

I'm trying to understand how to use the new asyncio functionality in Python 3.4, and I'm struggling with how to use event_loop.add_reader(). From the limited discussion I've found, it looks like it's for reading the standard output of a separate process, as opposed to the contents of an open file. Is that true? If so, is it also true that there is no asyncio-specific way to integrate standard file I/O?
I've been playing with the following code. Running it raises PermissionError: [Errno 1] Operation not permitted from line 399 of /python3.4/selectors.py (self._epoll.register(key.fd, epoll_events)), which is triggered by the add_reader() line below.
import asyncio
import urllib.parse
import sys
import pdb
import os

def fileCallback(*args):
    pdb.set_trace()

path = sys.argv[1]
loop = asyncio.get_event_loop()
#fd = os.open(path, os.O_RDONLY)
fd = open(path, 'r')
#data = fd.read()
#print(data)
#fd.close()
pdb.set_trace()
task = loop.add_reader(fd, fileCallback, fd)
loop.run_until_complete(task)
loop.close()
EDIT
For those looking for an example of how to use AsyncIO to read more than one file at a time like I was curious about, here's an example of how it can be accomplished. The secret is in the line yield from asyncio.sleep(0). This essentially pauses the current function, putting it back in the event loop queue, to be called after all other ready functions are executed. Functions are determined to be ready based on how they were scheduled.
import asyncio

@asyncio.coroutine
def read_section(file, length):
    yield from asyncio.sleep(0)
    return file.read(length)

@asyncio.coroutine
def read_file(path):
    fd = open(path, 'r')
    retVal = []
    cnt = 0
    while True:
        cnt = cnt + 1
        data = yield from read_section(fd, 102400)
        print(path + ': ' + str(cnt) + ' - ' + str(len(data)))
        if len(data) == 0:
            break
    fd.close()

paths = ["loadme.txt", "loadme also.txt"]
loop = asyncio.get_event_loop()
tasks = []
for path in paths:
    tasks.append(asyncio.async(read_file(path)))
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
These functions expect a file descriptor, that is, the underlying integers the operating system uses, not Python's file objects. File objects that are based on file descriptors return that descriptor on the fileno() method, so for example:
>>> sys.stderr.fileno()
2
In Unix, file descriptors can be attached to files or a lot of other things, including other processes.
Edit for the OP's edit:
As Max says in the comments, you cannot use epoll on local files (and asyncio uses epoll). Yes, that's kind of weird. You can use it on pipes, though. For example:
import asyncio
import sys

def fileCallback(*args):
    print("Received: " + sys.stdin.readline())

loop = asyncio.get_event_loop()
loop.add_reader(sys.stdin.fileno(), fileCallback)
loop.run_forever()
This will echo stuff you write on stdin.
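The same add_reader mechanism works with any pollable descriptor; here is a self-contained sketch using a pipe instead of stdin:

```python
import asyncio
import os

# A pipe is pollable by epoll, unlike a regular file, so add_reader accepts it.
r, w = os.pipe()
loop = asyncio.new_event_loop()
received = []

def on_readable():
    # Called by the event loop once the read end has data
    received.append(os.read(r, 1024))
    loop.stop()

loop.add_reader(r, on_readable)
os.write(w, b"ping")  # make the read end readable
loop.run_forever()    # returns once on_readable() calls loop.stop()
loop.remove_reader(r)
loop.close()
print(received)
```

This is essentially what the stdin echo above does, minus the terminal.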
you cannot use add_reader on local files, because:
It cannot be done using select/poll/epoll.
It depends on the operating system.
It cannot be fully asynchronous because of OS limitations (Linux does not support async filesystem metadata reads/writes).
But technically, yes, you should be able to do async filesystem reads/writes; (almost) all systems have a DMA mechanism for doing I/O "in the background". And it is not as if local I/O were so fast that nobody would want it asynchronous: the CPU is on the order of millions of times faster than disk I/O.
Look for aiofile or aiofiles if you want to try async I/O.
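Short of pulling in aiofile/aiofiles, the usual stdlib workaround is to push the blocking read onto the loop's executor so it runs in a worker thread. A sketch using the modern asyncio API and a temporary file in place of a real input file:

```python
import asyncio
import tempfile

def blocking_read(path):
    # Ordinary blocking file I/O, kept off the event loop's thread
    with open(path, "r") as f:
        return f.read()

async def main():
    # Create a throwaway file to read back (stands in for a real file)
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("hello from a thread")
        path = tmp.name
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor
    data = await loop.run_in_executor(None, blocking_read, path)
    return data

result = asyncio.run(main())
print(result)
```

The event loop stays responsive while the thread waits on the disk, which is usually all that "async file I/O" needs to mean in practice.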

How to specify 'logger' for apscheduler

I'm trying to learn how to use Python's apscheduler package, but periodically, it throws the following error:
No handlers could be found for logger "apscheduler.scheduler"
This message seems to be associated with errors in the scheduled jobs. For example, with jobTester as the scheduled job, the following code, which uses an undefined variable (nameStr0) in jobTester, gives the above error message:
from apscheduler.scheduler import Scheduler
from apscheduler.jobstores.shelve_store import ShelveJobStore
from datetime import datetime, timedelta
from schedJob import toyJob

def jobTester(nameStr):
    outFileName = nameStr0 + '.txt'
    outFile = open(outFileName, 'w')
    outFile.write(nameStr)
    outFile.close()

def schedTester(jobList):
    scheduler = Scheduler()
    scheduler.add_jobstore(ShelveJobStore('example.db'), 'shelve')
    refTime = datetime.now()
    for index, currJob in enumerate(jobList):
        runTime = refTime + timedelta(seconds=15)
        jobName = currJob.name + '_' + str(index)
        scheduler.add_date_job(jobTester, runTime, name=jobName,
                               jobstore='shelve', args=[jobName])
    scheduler.start()
    stopTime = datetime.now() + timedelta(seconds=45)
    print "Starting wait loop .....",
    while stopTime > datetime.now():
        pass
    print "Done"

def doit():
    names = ['Alan', 'Barbara', 'Charlie', 'Dana']
    jobList = [toyJob(n) for n in names]
    schedTester(jobList)
This may be seen by running this code (stored in the file schedTester.py) as follows:
>>> import schedTester
>>> schedTester.doit()
No handlers could be found for logger "apscheduler.scheduler"
Starting wait loop ..... Done
However, when I replace nameStr0 with nameStr (i.e. proper spelling of variable name), the code runs fine without the error message.
How do I create a logger for apscheduler.scheduler? Am I missing something in the section of the docs dealing with configuring the scheduler?
Am I correct in thinking of this logger as some sort of stderr? If so, where will I look for it (if that is not determined by the way I set it up)?
You can just create a default logger and everything should go to it:
import logging
logging.basicConfig()
The reason you only have a problem when you use a variable that hasn't been defined is that this causes the jobTester function to throw an error, which apscheduler catches and tries to report with logging.error(). Since you haven't set up the logger, it complains.
If you read up on python logging you will see that there are many ways to configure it. You could have it log everything to a file or print it to stdout.
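A slightly fuller sketch of the same idea, naming the apscheduler logger explicitly; the handler setup is ordinary stdlib logging, nothing apscheduler-specific:

```python
import logging

logging.basicConfig()  # installs a default StreamHandler on the root logger

# apscheduler's messages propagate up to the root handler; you can also
# tune this particular logger's level independently of everything else.
sched_logger = logging.getLogger("apscheduler.scheduler")
sched_logger.setLevel(logging.DEBUG)

print(sched_logger.hasHandlers())
```

With this in place, job errors caught by apscheduler are printed (to stderr by default) instead of triggering the "No handlers could be found" warning.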
