I am writing this post because I have not found a solution for my specific case. I am referring to this article, whose approach, however, did not work for me on Windows 10 version 1909.
I wrote a script, "python_code_a.py", whose task is to upload, one at a time, all the images contained in a local folder to a converter server and then to download them, again one at a time, from the server into another folder on my PC. How the script behaves depends on the server, which is public and not owned by me, so roughly every two and a half hours the script crashes due to an unexpected connection error. Obviously I cannot sit all day watching the Python shell and stepping in whenever the script stops.
As described in the article above, I wrote a second file named "python_code_b.py" whose task is to notice when "python_code_a.py" has stopped and to restart it. When I try to run it from the "python.exe" CMD, however, the prompt only replies to the input with "...", nothing else.
I attach a general example of "python_code_a.py":
processnumber = 0
photosindex = 100000
photo = 0
path = 0
while photosindex < "number of photos in folder":
    photo = str('your_path' + str(photosindex) + '.png')
    path = str('your_path' + str(photosindex) + '.jpg')
    print('It\'s converting: ' + photo)
    import requests
    r = requests.post(
        "converter_site",
        files={
            'image': open(photo, 'rb'),
        },
        headers={'api-key': 'your_api_key'}
    )
    file = r.json()
    json_output = file['output_url']
    import urllib.request
    while photosindex < 'number of photos in folder':
        urllib.request.urlretrieve(json_output, path)
        print('Finished process number: ' + str(processnumber))
        break
    photosindex = photosindex + 1
    processnumber = processnumber + 1
print()
print('---------------------------------------------------')
print('Every pending job has been completed.')
print()
How can I solve it?
you can use error capturing:
while photosindex < "number of photos in folder":
    try:
        # Your code
    except:
        print("Something else went wrong")
https://www.w3schools.com/python/python_try_except.asp
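Putting this together with the loop from the question, here is a minimal sketch of how a connection error could trigger a retry instead of a crash. The placeholder values ('your_path', 'converter_site', 'your_api_key') are the ones from the question, and number_of_photos is a stand-in for the real count:
import time
import requests
import urllib.request

def convert_one(photo, path):
    # upload one image to the converter and download the converted result
    r = requests.post(
        'converter_site',
        files={'image': open(photo, 'rb')},
        headers={'api-key': 'your_api_key'},
        timeout=60,
    )
    r.raise_for_status()
    urllib.request.urlretrieve(r.json()['output_url'], path)

photosindex = 100000
number_of_photos = 10   # stand-in for the real number of photos in the folder
while photosindex < 100000 + number_of_photos:
    photo = 'your_path' + str(photosindex) + '.png'
    path = 'your_path' + str(photosindex) + '.jpg'
    try:
        convert_one(photo, path)
    except requests.exceptions.RequestException as err:
        # the connection dropped: wait a bit and retry the same photo
        print('Retrying ' + photo + ' after error: ' + str(err))
        time.sleep(30)
        continue
    photosindex = photosindex + 1
With this structure there is no need for a second watchdog script, because the loop itself never exits on a connection error.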
Hello, I don't stream, but I wanted to make a video about people's reactions when they are suddenly hit with a lot of viewers (this would be accompanied by a chat bot too, and I'll tell them what it was as well as ask for permission to use the footage). So I thought it would be fun to look at view bots for Twitch, and I found one online (code below). I installed streamlink via pip and the Windows executable, and it seems to run ("found matching plugin twitch for URL "Stream link""), but it doesn't actually increase viewership. I can only assume this is because it's not actually opening the VLC instances, so here I am wondering what I need to do. I have the latest version of Python, and git isn't trying to download and install anything, so I'm assuming streamlink is all I need, but I'm kind of confused why it wouldn't be opening the VLC instances. Any help is most appreciated.
Edit: I do have the proxies and am using a small number of them to try to get it working first; I will buy more later, once I get this to work!
import concurrent.futures, time, random, os

# desired channel url
channel_url = 'https://www.twitch.tv/StreamerName'

# number of viewer bots
botcount = 10

# path to proxies.txt file
proxypath = "C:\Proxy\proxy.txt"

# path to vlc
playerpath = r'"C:\Program Files\VideoLAN\VLC\vlc.exe"'

# takes proxies from proxies.txt and returns them as a list
def create_proxy_list(proxyfile, shared_list):
    with open(proxyfile, 'r') as file:
        proxies = [line.strip() for line in file]
        for i in proxies:
            shared_list.append((i))
    return shared_list

# takes random proxies from the proxies list and adds them to another list
def randproxy(proxylist, botcount):
    randomproxylist = list()
    for _ in range(botcount):
        proxy = random.choice(proxylist)
        randomproxylist.append(proxy)
        proxylist.remove(proxy)
    return (randomproxylist)

# launches a viewer bot after a short delay
def launchbots(proxy):
    time.sleep(random.randint(5, 10))
    os.system(f'streamlink --player={playerpath} --player-no-close --player-http --hls-segment-timeout 30 --hls-segment-attempts 3 --retry-open 1 --retry-streams 1 --retry-max 1 --http-stream-timeout 3600 --http-proxy {proxy} {channel_url} worst')

# calls the launchbots function asynchronously
def main(randomproxylist):
    with concurrent.futures.ThreadPoolExecutor() as executer:
        executer.map(launchbots, randomproxylist)

if __name__ == "__main__":
    main(randproxy(create_proxy_list(proxypath, shared_list=list()), botcount))
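To see why VLC never opens, it may help to run a single streamlink command directly and capture its output instead of discarding it with os.system. A minimal debugging sketch, reusing the placeholders above (the proxy value here is purely illustrative):
import subprocess

channel_url = 'https://www.twitch.tv/StreamerName'        # same placeholder as above
playerpath = r'"C:\Program Files\VideoLAN\VLC\vlc.exe"'    # same path as above
proxy = 'http://user:pass@1.2.3.4:8080'                    # illustrative proxy entry

cmd = (f'streamlink --player={playerpath} --player-no-close '
       f'--http-proxy {proxy} {channel_url} worst')
result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
print('stdout:', result.stdout)
print('stderr:', result.stderr)   # streamlink's own error (bad proxy, player not found, etc.) shows up here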
I am downloading a lot of files from a website and want the downloads to run in parallel because the files are large. Unfortunately, I can't really share the website, because accessing the files requires a username and password that I can't share. The code below is mine; I know it can't really be run without the website and my credentials, but I am fairly sure I am not allowed to share that information.
import os
import requests
from multiprocessing import Process
from os.path import join   # join() is used below
from os import makedirs    # makedirs() is used below

dataset = "dataset_name"

################################
def down_file(dspath, file, savepath, ret):
    webfilename = dspath + file
    file_base = os.path.basename(file)
    file = join(savepath, file_base)
    print('...Downloading', file_base)

    req = requests.get(webfilename, cookies=ret.cookies, allow_redirects=True, stream=True)
    filesize = int(req.headers['Content-length'])
    with open(file, 'wb') as outfile:
        chunk_size = 1048576
        for chunk in req.iter_content(chunk_size=chunk_size):
            outfile.write(chunk)
    return None

################################
## Download files
def download_files(filelist, c_DateNow):
    ## Authenticate
    url = 'url'
    values = {'email': 'email', 'passwd': "password", 'action': 'login'}
    ret = requests.post(url, data=values)

    ## Path to files
    dspath = 'datasetwebpath'
    savepath = join(path_script, dataset, c_DateNow)   # path_script is defined elsewhere in the module
    makedirs(savepath, exist_ok=True)

    #"""
    processes = [Process(target=down_file, args=(dspath, file, savepath, ret)) for file in filelist]
    print(["dspath, %s, savepath, ret\n" % (file) for file in filelist])

    # kick them off
    for process in processes:
        print("\n", process)
        process.start()

    # now wait for them to finish
    for process in processes:
        process.join()
    #"""

    ####### This works and it's what I want to parallelize
    """
    ##Download files
    for file in filelist:
        down_file(dspath, file, savepath, ret)
    #"""

################################
def main(c_DateNow, c_DateIni, c_DateFin):
    ## Other code
    files = ["list of web file addresses"]
    print(" ...Files being downloaded\n ", "\n ".join(files), "\n")

    ## Download files
    download_files(files, c_DateNow)
I want to download 25 files. When I run the code, all the print lines that were already printed earlier in the program are printed again, even though the Process execution is nowhere near them. I am also constantly getting the following error:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
I googled the error and don't know how to fix it. Does it have to do with there not being enough cores? Is there a way to limit the number of Processes depending on how many cores I have available? Or is it something else entirely?
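As an aside, if the goal is to cap the number of simultaneous downloads at the number of available cores, a worker pool is one way to do that. A minimal sketch, assuming the down_file function from the code above:
from multiprocessing import Pool
import os

def download_all(filelist, dspath, savepath, ret):
    # run at most one download per CPU core at a time
    with Pool(processes=os.cpu_count()) as pool:
        pool.starmap(down_file, [(dspath, f, savepath, ret) for f in filelist])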
In a question here, I read that the Process has to be started from within the __main__ block, but this code is a module that gets imported by another script, so when I run it, I run it as:
import this_code
import another1_code
import another2_code
#Step1
another1_code.main()
#Step2
c_DateNow, c_DateIni, c_DateFin = another2_code.main()
#Step3
this_code.main(c_DateNow, c_DateIni, c_DateFin)
#step4
## More code
So I need the processes to be created inside a function, not under __main__.
I appreciate any help or suggestions on how to correctly parallelize the above code in a way that lets me keep using it as a module imported by another script.
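For reference, a minimal sketch of what the spawn start method on Windows expects: the if __name__ == "__main__": guard lives in the script that is launched directly, not inside the imported module, so the module's functions can still create Process objects. The module names are the same hypothetical ones as in the snippet above:
# run_pipeline.py -- the script that is launched directly (hypothetical name)
import this_code
import another1_code
import another2_code

if __name__ == "__main__":
    # on Windows, child processes re-import this file, and the guard stops
    # them from re-running the whole pipeline during that import
    another1_code.main()
    c_DateNow, c_DateIni, c_DateFin = another2_code.main()
    this_code.main(c_DateNow, c_DateIni, c_DateFin)   # Process objects are created safely inside this call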
I have a problem: I cannot find the reason why my function in Django's views.py sometimes runs twice. When I go to the URL that calls the create_db function in the Django view, the function reads JSON files from a directory, parses them, and writes the data to the database. Most of the time it works perfectly, but sometimes, for no apparent reason, it runs twice and writes the same data to the database twice. Does anyone know what the reason could be and how I can solve the problem?
Here is my create_db function:
def create_db(request):
    response_data = {}
    try:
        start = time.time()
        files = os.listdir()
        print(files)
        for filename in files:
            if filename.endswith('.json'):
                print(filename)
                with open(f'{filename.strip()}', encoding='utf-8') as f:
                    data = json.load(f)
                    for item in data["CVE_Items"]:
                        import_item(item)
        response_data['result'] = 'Success'
        response_data['message'] = 'The database has been created.'
    except KeyError:
        response_data['result'] = 'Error'
        response_data['message'] = 'An error occurred! The data was not imported!'
    return HttpResponse(json.dumps(response_data), content_type='application/json')
The console output that I expect:
['nvdcve-1.0-2002.json', 'nvdcve-1.0-2003.json', 'nvdcve-1.0-2004.json', 'nvdcve-1.0-2005.json', 'nvdcve-1.0-2006.json', 'nvdcve-1.0-2007.json', 'nvdcve-1.0-2008.json', 'nvdcve-1.0-2009.json', 'nvdcve-1.0-2010.json', 'nvdcve-1.0-2011.json', 'nvdcve-1.0-2012.json', 'nvdcve-1.0-2013.json', 'nvdcve-1.0-2014.json', 'nvdcve-1.0-2015.json', 'nvdcve-1.0-2016.json', 'nvdcve-1.0-2017.json']
nvdcve-1.0-2002.json
nvdcve-1.0-2003.json
nvdcve-1.0-2004.json
nvdcve-1.0-2005.json
nvdcve-1.0-2006.json
nvdcve-1.0-2007.json
nvdcve-1.0-2008.json
nvdcve-1.0-2009.json
nvdcve-1.0-2010.json
nvdcve-1.0-2011.json
nvdcve-1.0-2012.json
nvdcve-1.0-2013.json
nvdcve-1.0-2014.json
nvdcve-1.0-2015.json
nvdcve-1.0-2016.json
nvdcve-1.0-2017.json
Console output when the error happens:
['nvdcve-1.0-2002.json', 'nvdcve-1.0-2003.json', 'nvdcve-1.0-2004.json', 'nvdcve-1.0-2005.json', 'nvdcve-1.0-2006.json', 'nvdcve-1.0-2007.json', 'nvdcve-1.0-2008.json', 'nvdcve-1.0-2009.json', 'nvdcve-1.0-2010.json', 'nvdcve-1.0-2011.json', 'nvdcve-1.0-2012.json', 'nvdcve-1.0-2013.json', 'nvdcve-1.0-2014.json', 'nvdcve-1.0-2015.json', 'nvdcve-1.0-2016.json', 'nvdcve-1.0-2017.json']
nvdcve-1.0-2002.json
['nvdcve-1.0-2002.json', 'nvdcve-1.0-2003.json', 'nvdcve-1.0-2004.json', 'nvdcve-1.0-2005.json', 'nvdcve-1.0-2006.json', 'nvdcve-1.0-2007.json', 'nvdcve-1.0-2008.json', 'nvdcve-1.0-2009.json', 'nvdcve-1.0-2010.json', 'nvdcve-1.0-2011.json', 'nvdcve-1.0-2012.json', 'nvdcve-1.0-2013.json', 'nvdcve-1.0-2014.json', 'nvdcve-1.0-2015.json', 'nvdcve-1.0-2016.json', 'nvdcve-1.0-2017.json']
nvdcve-1.0-2002.json
nvdcve-1.0-2003.json
nvdcve-1.0-2003.json
nvdcve-1.0-2004.json
nvdcve-1.0-2004.json
nvdcve-1.0-2005.json
nvdcve-1.0-2005.json
nvdcve-1.0-2006.json
nvdcve-1.0-2006.json
nvdcve-1.0-2007.json
nvdcve-1.0-2007.json
nvdcve-1.0-2008.json
nvdcve-1.0-2008.json
nvdcve-1.0-2009.json
nvdcve-1.0-2009.json
nvdcve-1.0-2010.json
nvdcve-1.0-2010.json
nvdcve-1.0-2011.json
nvdcve-1.0-2011.json
nvdcve-1.0-2012.json
nvdcve-1.0-2012.json
nvdcve-1.0-2013.json
nvdcve-1.0-2013.json
nvdcve-1.0-2014.json
nvdcve-1.0-2014.json
nvdcve-1.0-2015.json
nvdcve-1.0-2015.json
nvdcve-1.0-2016.json
nvdcve-1.0-2016.json
nvdcve-1.0-2017.json
nvdcve-1.0-2017.json
The problem is not in the code you show us. Enable logging for the HTTP requests your application receives to make sure the browser sends just a single request. If you see two requests, check whether they use the same session (maybe another user is clicking at the same time).
If they come from the same user, maybe you are clicking the button twice. It could even be a hardware problem with the mouse. To prevent this, use JavaScript to disable the button after the first click.
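As a starting point for that logging, here is a minimal sketch of a Django middleware that records every incoming request, so duplicate calls to the view become visible. The class name and module path are made up for the example, and the standard session middleware is assumed to be enabled:
import logging
import time

logger = logging.getLogger(__name__)

class RequestLogMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # one log line per request: method, path, session key and timestamp
        logger.info('%s %s session=%s t=%s',
                    request.method, request.path,
                    request.session.session_key, time.time())
        return self.get_response(request)

# then add 'yourapp.middleware.RequestLogMiddleware' to MIDDLEWARE in settings.py
If the log shows two entries with the same session key a fraction of a second apart, the request really is being sent twice by the client.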
I'm fairly new to Python; I learn by doing, so I thought I'd give this project a shot. I'm trying to create a script that finds the Google Analytics request for a given website, parses the request payload, and does something with it.
Here are the requirements:
Ask the user for 2 URLs (for comparing the payloads from 2 different HAR captures)
Use Selenium to open the two URLs and browsermobproxy/PhantomJS to capture all the HAR entries
Store the HAR as a list
From the list of all HAR entries, find the Google Analytics request, including its payload
If the Google Analytics tag is found, then do things... like parse the payload, compare the payloads, etc.
Issue: Sometimes, for a website that I know has Google Analytics (e.g. nytimes.com), the HAR I get is incomplete: my program says "GA Not found", but only because the complete HAR was not captured, so when the regex ran to find the matching entry it wasn't there. This issue is intermittent and does not happen all the time. Any ideas?
I'm thinking that due to some dependency or latency, the script moved on before the complete HAR was captured. I tried the "wait for traffic to stop" option, but maybe I didn't use it correctly.
Also, as a bonus, I would appreciate any help you can provide on how to make this script run faster; it's fairly slow. As I mentioned, I'm new to Python, so go easy :)
This is what I've got thus far.
import browsermobproxy as mob
from selenium import webdriver
import re
import sys
import urlparse
import time
from datetime import datetime

def cleanup():
    s.stop()
    driver.quit()

proxy_path = '/Users/bob/Downloads/browsermob-proxy-2.1.4-bin/browsermob-proxy-2.1.4/bin/browsermob-proxy'
s = mob.Server(proxy_path)
s.start()
proxy = s.create_proxy()

proxy_address = "--proxy=127.0.0.1:%s" % proxy.port
service_args = [proxy_address, '--ignore-ssl-errors=yes', '--ssl-protocol=any']  # so that i can do https connections
driver = webdriver.PhantomJS(executable_path='/Users/bob/Downloads/phantomjs-2.1.1-windows/phantomjs-2.1.1-windows/bin/phantomjs', service_args=service_args)
driver.set_window_size(1400, 1050)

urlLists = []
collectTags = []
gaCollect = 0
varList = []

for x in range(0, 2):  # I want to ask the user for 2 inputs
    url = raw_input("Enter a website to find GA on: ")
    time.sleep(2.0)
    urlLists.append(url)
    if not url:
        print "You need to type something in...here"
        sys.exit()

# gets the two user urls and stores them in a list
for urlList in urlLists:
    print urlList, 'start 2nd loop'  # printing for debug purposes, no need for this
    if not urlList:
        print 'Your Url list is empty'
        sys.exit()
    proxy.new_har()
    driver.get(urlList)
    # proxy.wait_for_traffic_to_stop(15, 30)  # <-- tried this but it did not do anything
    for ent in proxy.har['log']['entries']:
        gaCollect = (ent['request']['url'])
        print gaCollect
        if re.search(r'google-analytics.com/r\b', gaCollect):
            print 'Found GA'
            collectTags.append(gaCollect)
            time.sleep(2.0)
            break
        else:
            print 'No GA Found - Ending Prog.'
            cleanup()
            sys.exit()

cleanup()
This might be a stale question, but I found an answer that worked for me.
You need to change two things:
1 - Remove sys.exit() -- this causes your programme to stop after the first iteration through the ent list, so if what you want is not the first thing, it won't be found
2 - call new_har with the captureContent option enabled to get the payload of requests:
proxy.new_har(options={'captureHeaders':True, 'captureContent': True})
See if that helps.
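Building on that, here is a minimal sketch of the capture step with captureContent enabled, plus a simple polling loop that re-checks the HAR a few times before concluding GA is absent. The 10 attempts and 2-second pause are arbitrary values, and proxy, driver, urlList, collectTags and the imports are the ones from the script above:
proxy.new_har(options={'captureHeaders': True, 'captureContent': True})
driver.get(urlList)

ga_url = None
for attempt in range(10):                 # re-read the HAR instead of checking it only once
    for ent in proxy.har['log']['entries']:
        if re.search(r'google-analytics.com/r\b', ent['request']['url']):
            ga_url = ent['request']['url']
            break
    if ga_url:
        break
    time.sleep(2.0)                       # give late requests time to arrive

if ga_url:
    print('Found GA')
    collectTags.append(ga_url)
else:
    print('No GA Found')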
EDIT: I think I've figured out a solution using subprocess.Popen with separate .py files for each stream being monitored. It's not pretty, but it works.
I'm working on a script to monitor a streaming site for several different accounts and to record them when they are online. I am using the livestreamer package to download a stream when it comes online, but the problem is that the program will only record one stream at a time. The program loops through a list and, if a stream is online, starts recording it with subprocess.call(["livestreamer", ...]). The problem is that once the program starts recording, it stops going through the loop and doesn't check or record any of the other livestreams. I've tried using Process and Thread, but none of these seem to work. Any ideas?
Code below. Asterisks are not literally part of the code.
import os, urllib.request, time, subprocess, datetime, random

status = {
    "********": False,
    "********": False,
    "********": False
}

def gen_name(tag):
    return stuff  # << Bunch of unimportant code stuff here.

def dl(tag):
    subprocess.call(["livestreamer", "********.com/" + tag, "best", "-o", ".\\tmp\\" + gen_name(tag)])

def loopCheck():
    while True:
        for tag in status:
            data = urllib.request.urlopen("http://*******.com/" + tag + "/").read().decode()
            if data.find(".m3u8") != -1:
                print(tag + " is online!")
                if status[tag] == False:
                    status[tag] = True
                    dl(tag)
            else:
                print(tag + " is offline.")
                status[tag] = False
        time.sleep(15)

loopCheck()
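For comparison, a minimal sketch of the same idea without separate .py files, assuming the only change needed is to launch livestreamer with subprocess.Popen (which returns immediately) instead of subprocess.call (which blocks until the recording finishes). gen_name is the helper from the script above:
import subprocess

recordings = {}   # tag -> Popen handle

def dl(tag):
    # Popen starts livestreamer in the background and returns right away,
    # so loopCheck keeps cycling through the other streams
    recordings[tag] = subprocess.Popen(
        ["livestreamer", "********.com/" + tag, "best",
         "-o", ".\\tmp\\" + gen_name(tag)])
Checking recordings[tag].poll() inside the loop would also make it possible to reset status[tag] once a recording process has ended.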