python break a function nested inside a loop - python

I have the following piece of code:
for x in Listofurls:
    function(urlquery)
    function(htmlmining)
How should the statement in the function be written so that I can continue the loop, moving to the next item, when the query does not match what I am searching for? For example:
def(urlquery):
    url=urlquery
    Urlopen = urllib.request.urlopen(url)
    Url_read = parse(Urlopen)
    if 'text' not in Url_read.read():
        #here is where i want a statement to stop and go to the next
        #item in the loop like 'continue' in a for loop

You can raise StopIteration to exit from all loops until it is caught:
try:
    for i in range(10):
        if i == 5:
            raise StopIteration
        else:
            print(i)
except StopIteration:
    print("Caught")
gives:
0
1
2
3
4
Caught
The StopIteration Exception is exactly that, an exception, not an error;
Raised by an iterator's next() method to signal that there are no
further values. This is derived from Exception rather than
StandardError, since this is not considered an error in its normal
application.
You can raise it as deeply as you want in your nested loops, but you have to catch it at the level you want to break out of (i.e. the level at which you want to stop iterating).
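If the goal is to skip to the next item rather than leave the loop entirely, the same idea can be sketched by catching the exception inside the loop body. This is only a minimal sketch, not the original poster's code: SkipItem, check_page, process_all and the sample pages dictionary are hypothetical names made up for illustration, and a dedicated exception class is used instead of StopIteration so that it cannot be confused with the iterator protocol.

```python
class SkipItem(Exception):
    """Hypothetical exception: signals that the current item should be skipped."""


def check_page(content):
    # Raise from inside the helper when the page lacks the wanted text.
    if 'text' not in content:
        raise SkipItem
    return content


def process_all(pages):
    processed = []
    for name, content in pages.items():
        try:
            check_page(content)
        except SkipItem:
            # Caught inside the loop body, so this behaves like 'continue'.
            continue
        processed.append(name)
    return processed


# Made-up sample data standing in for downloaded pages.
pages = {'a': 'some text here', 'b': 'nothing useful', 'c': 'more text'}
print(process_all(pages))
```

Because the try/except sits inside the for loop, only the current item is abandoned; catching it outside the loop, as in the example above, ends the whole loop instead.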
Looking at your question, and trying to make sense of what you've written, it looks like you want to do something like this (maybe?)
for url in listOfURLs:
    if urlquery(url):
        htmlmining(url)
def urlquery(url):
    page = parse(urllib.request.urlopen(url))
    return 'text' in page.read()
This will then only run htmlmining(url) when 'text' is in the page you're parsing. If it's not, it will skip that entry and move on to the next one.

Have the inner function return True if you want to continue:
def urlquery(url):
    urlopen = urllib.request.urlopen(url)
    url_read = parse(urlopen)
    if 'text' not in url_read.read():
        # here is where I want a statement to stop and go to the next
        return True
Then, the outer loop can be:
for x in list_of_urls:
    if urlquery(x):
        continue
    htmlmining(x)
Note that the code you posted was not valid Python. The above is my best guess as to what you meant.
Also, please read the Python style guide.

I finally found a solution to the question:
def urlquery(url):
    urlopen = urllib.request.urlopen(url)
    url_read = parse(urlopen)
    content = url_read.read()  # read once and reuse the result
    if 'text' not in content:
        return
    else:
        myurl = 'text' in content
        return myurl
and the for loop as follows:
for x in Listofurls:
    TextAvailable = urlquery(x)
    if not TextAvailable:
        continue
    htmlmining(x)
I am not sure this is the cleanest way to proceed, but it works.

Related

How to store a number as a variable

I have this code below that runs on a page, finds the element input-optionXXX, where XXX is a 3-digit number between 300 and 400, and clicks on it. I would like to store the numeric value that it finds on the page so that I can use it straight away in my other lines of code. Right now, print(i) shows the correct value. I just need some way of storing that value.
for i in range(300, 400):
    try:
        driver.find_element_by_id(f'input-option{i}').click()
        print(i)
    except NoSuchElementException:
        continue
If I understand the intention of your code correctly, assigning the value to a new variable other than i and then breaking out of the for loop should do the trick. As long as you do not need to keep looking for further "input-optionXXX" elements, this should work: break is only reached if the try: block succeeded, which only happens when the element was found.
for i in range(300, 400):
    try:
        driver.find_element_by_id(f'input-option{i}').click()
        number = i
        break
    except NoSuchElementException:
        continue
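One gap in the loop above is that number is never assigned if no element in the range exists. A for ... else clause is one way to cover that case. The sketch below is an illustration under assumptions, not Selenium code: find_first_option and the exists callback are hypothetical, and a plain set of ids stands in for the page lookup.

```python
def find_first_option(exists, start=300, stop=400):
    """Return the first number in [start, stop) for which exists(n) is true, else None."""
    for i in range(start, stop):
        if exists(i):
            number = i
            break
    else:
        # The else branch of a for loop runs only if the loop never hit 'break'.
        number = None
    return number


# Stand-in for the page: pretend only 'input-option342' is present.
present_ids = {'input-option342'}
print(find_first_option(lambda n: f'input-option{n}' in present_ids))
```

With a real driver, the exists callback would wrap the element lookup and return False when the lookup raises NoSuchElementException.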

try/except while true in a for loop

The following code gathers data from an API, and the try/except clause helps to handle several errors (authentication, index, anything).
There is only one error (an authentication error) for which I use while True to repeat the API call until I get the data, which usually happens after a try or two. However, if I get any other error, it loops forever and I can't break out so that it moves on to the next iteration. I tried to create a counter and, once the counter reaches a certain number, pass, continue or break, but it's not working.
## Create an array to loop over:
data_array_query = pd.date_range(start_date,end_date,freq='6H')
#This is my idea but is not working
#Create a counter
counter = 0
#Loop through the just created array
for idx in range(len(data_array_query)-1):
    ## If counter reaches move on to next for loop element
    while True:
        if counter>=5:
            break
        else:
            try:
                start_date = data_array_query[idx]
                end_date = data_array_query[idx+1]
                print('from',start_date,'to',end_date)
                df = api.query(domain, site_slug, resolution, data_series_collection, start_date=str(start_date), end_date=str(end_date), env='prod', from_archive=True, phase='production').sort_index()
                print(df.info())
                break
            except Exception as e:
                print(e)
                counter +=1
                print(counter)
The output of running this code for a couple of days shows that when it fails 5 times (the counter maximum I set), it does break, but it breaks out of the whole loop, whereas I only want it to move on to the next date.
Any help will be appreciated.
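You need to use a break statement to get out of a while True loop. break only leaves the innermost loop, so the surrounding for loop carries on with its next element. continue skips to the next iteration of the loop it is in, and pass does nothing at all, so neither of them will end a while True loop.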
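In the posted code the counter is created once, before the for loop, and is never reset. After five failures every later date window hits counter>=5 and breaks immediately, which looks like the whole loop stopping. A minimal sketch of a per-window retry budget follows; fetch_with_retries, fake_query and the integer windows are hypothetical stand-ins for the real API call and date range.

```python
def fetch_with_retries(fetch, windows, max_attempts=5):
    """Call fetch(start, end) for each window, retrying up to max_attempts times."""
    results = {}
    for start, end in windows:
        attempts = 0  # reset for every window, so one bad window cannot block the rest
        while attempts < max_attempts:
            try:
                results[(start, end)] = fetch(start, end)
                break  # success: leave the while loop only
            except Exception as e:
                attempts += 1
                print('attempt', attempts, 'failed for', start, end, ':', e)
        # Falling out of the while loop moves on to the next window either way.
    return results


# Hypothetical stand-in for the API call: the second window always fails.
def fake_query(start, end):
    if start == 6:
        raise RuntimeError('simulated API error')
    return 'data {}-{}'.format(start, end)


windows = [(0, 6), (6, 12), (12, 18)]
print(fetch_with_retries(fake_query, windows))
```

Using the loop condition attempts < max_attempts instead of while True also removes the need for a separate counter check at the top of the loop.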

Python - If json.object is empty, repeat the function until new value?

I have been trying to find a cleaner way to handle faults and errors in a script I am working on.
Basically I have json_resp = resp.json(), which gives me either values or [], meaning there either is something or there isn't.
What I'm unsure about is the best way to repeat a function if the result is empty: should I repeat the whole function, or is there a better way to solve it?
What I have done is take the len of the objects in the JSON response. If it is 0, repeat; otherwise do other stuff:
#json_resp['objects'] is sometimes an empty [] and sometimes not.
json_resp = resp.json()
if len(json_resp['objects']) == 0:
    print('Sleeping in 2 sec')
    time.sleep(2)
    run_method() #Shall I call the function to start over?
else:
    print(len(json_resp['objects']))
    # continue with the rest of the code
As you can see, right now I compare the len of json_resp['objects'], but what I'm unsure about is whether calling the function again is a good approach. Wouldn't it hit a limit, or delay the whole process? I'm not sure, so what are your thoughts on making this function "better, smarter, faster"?
My thought was to use either a try/except or a while loop. Let me know what you think.
Empty Python lists are falsy, so you can test the list directly, e.g. if json_resp['objects']:
You can use recursion. Just make sure there is a condition that stops it.
I'd like to revise your code into:
import time

max_iteration = 5
current_iteration = 0

def run_method():
    global current_iteration  # needed to rebind the module-level counter
    current_iteration += 1
    # Do other stuff. Requests I guess?
    response = resp.json()
    if response:
        # do something with the response
        return response
    else:
        if current_iteration == max_iteration:
            return 'Maximum iterations reached: {}'.format(max_iteration)
        timer = 2
        print('Sleeping in {} seconds'.format(timer))
        time.sleep(timer)
        return run_method()
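A plain loop is an alternative to recursion here and avoids both the global counter and any recursion-depth limit. The following is a minimal sketch under assumptions: poll_until_nonempty and the fetch callback are hypothetical names, and the canned responses stand in for whatever resp.json()['objects'] would return on each request.

```python
import time


def poll_until_nonempty(fetch, max_attempts=5, delay=2.0):
    """Call fetch() until it returns a non-empty value or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        objects = fetch()
        if objects:  # an empty list is falsy
            return objects
        if attempt < max_attempts:
            print('Sleeping for {} seconds'.format(delay))
            time.sleep(delay)
    return None  # gave up after max_attempts empty responses


# Hypothetical stand-in for the API: empty twice, then data.
responses = iter([[], [], [{'id': 1}]])
print(poll_until_nonempty(lambda: next(responses), delay=0))
```

Returning None on failure leaves it to the caller to decide whether an empty result is an error.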

Looping through a list of proxies

I am currently working on a function that loops through a list of proxies and then restarts at the top once it reaches the bottom. So far this is the code that I have:
import time
createLimit = 100
proxyFile = 'proxies.txt'
def getProxies():
    proxyList = []
    with open(proxyFile, 'r') as f:
        for line in f:
            proxyList.append(line)
    return proxyList
proxyList = getProxies()
def loopProxySwitch():
    print("running")
    current_run = 0
    while current_run <= createLimit:
        if current_run >= len(proxyList):
            lengthOfList = len(proxyList)
            useProxy = proxyList[current_run%lengthOfList]
            print("Current Ip: "+useProxy)
            print("Current Run: "+current_run)
            print("Using modulus")
            return useProxy
        else:
            useProxy = proxyList[current_run]
            print("Current Ip: "+useProxy)
            print("Current Run: "+current_run)
            return useProxy
        time.sleep(2)
    print("Script ran")
loopProxySwitch()
The problem that I am having is that the loopProxySwitch function does not return or print anything within the while loop, however I don't see how it would be false. Here is the format of the text file with fake proxies:
111.111.111.111:2222
333.333.333.333:4444
444.444.444.444:5555
777.777.777.777:8888
919.919.919.919:0000
Any advice on this situation? I intend to incorporate this into a program that I am working on. However, instead of cycling through the file on a timed interval, it would only advance on a certain returned condition (such as another function letting the loop function know that some function has run and that it is time to switch to the next proxy). If this is a bit confusing, I will be happy to elaborate. Any suggestions, ideas, or fixes are appreciated. Thanks!
EDIT: Thanks to the comments below, I fixed the printing issue. However, the function does not loop through all the proxies... Any suggestions?
Nothing is printed because you return a value before printing.
The loop ends the first time the condition is met, because return exits the function without reaching the print calls or the next iteration.
By the way, if you want to print the returned value, you can print the result of the call:
print(loopProxySwitch())
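For the follow-up about stepping through every proxy and wrapping around, one option is to keep the position outside the function, for example with itertools.cycle. This is a minimal sketch and not the original code: load_proxies, next_proxy and sample_lines are hypothetical names, and the sample entries simply reuse the fake addresses from the question instead of reading proxies.txt.

```python
import itertools


def load_proxies(lines):
    """Strip whitespace and drop blank lines from an iterable of proxy strings."""
    return [line.strip() for line in lines if line.strip()]


# Sample entries in the same format as the file shown in the question.
sample_lines = ['111.111.111.111:2222\n', '333.333.333.333:4444\n', '444.444.444.444:5555\n']
proxy_cycle = itertools.cycle(load_proxies(sample_lines))


def next_proxy():
    # Each call hands out the next proxy and wraps around at the end of the list.
    return next(proxy_cycle)


print([next_proxy() for _ in range(4)])
```

Since next_proxy() advances only when it is called, another function can call it whenever it decides it is time to switch, rather than on a timer.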

Python - better solution for loops - Re-run after getting a error & ignore that error after 3 attempts

I created the for loop below to run a function that gets price data with pandas for a list of tickers. Basically, the loop re-runs the function if it gets a RemoteDataError and ignores that error after 3 attempts.
Even though the for loop below works fine for this purpose, I think there is a better solution, since I cannot configure the number of attempts, for example by putting a while loop that counts attempts around the call. I tried defining a variable attempts = 0 and adding one on every re-run (attempts += 1); if attempts reached 3, continue would ignore the error. However, it didn't work. I probably set something up wrongly.
for ticker in tickers:
    print(ticker)
    try:
        get_price_for_ticker()
    except RemoteDataError:
        print('No information for {}'.format(ticker))
        try:
            get_price_for_ticker()
            print('Got data')
        except RemoteDataError:
            print('1st with no data')
            try:
                get_price_for_ticker()
                print('Got data')
            except RemoteDataError:
                print('2nd with no data')
                try:
                    get_price_for_ticker()
                    print('Got data')
                except RemoteDataError:
                    print('3rd with no data (should have no data in the database)')
                    continue
Is there a better method for this purpose?
Yes, there is. Use a while loop and a counter.
count = 0
while count < 3:
    try:
        get_price_for_ticker()
        break  # reached on success
    except RemoteDataError:
        print('Retrying {}'.format(count + 1))
        count += 1  # increment number of failed attempts
if count == 3:
    ...  # if count equals 3, the read was not successful
This code should go inside your outer for loop. Alternatively, you could define a function containing the while loop and error handling that accepts a ticker parameter, and call that function at each iteration of the for loop. It's a matter of style, and up to you.
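The function-based variant could look roughly like the sketch below. It is an illustration under assumptions: get_with_retries and fake_fetch are hypothetical names, and RemoteDataError is defined locally as a stand-in so the example runs without the data library that normally provides it.

```python
class RemoteDataError(Exception):
    """Local stand-in for the RemoteDataError named in the question."""


def get_with_retries(fetch, ticker, max_attempts=3):
    """Try fetch(ticker) up to max_attempts times; return None if every attempt fails."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(ticker)
        except RemoteDataError:
            print('Attempt {} failed for {}'.format(attempt, ticker))
    return None


# Hypothetical fetcher: 'BAD' never has data, everything else succeeds.
def fake_fetch(ticker):
    if ticker == 'BAD':
        raise RemoteDataError(ticker)
    return 100.0


for ticker in ['AAA', 'BAD', 'CCC']:
    print(ticker, get_with_retries(fake_fetch, ticker))
```

Passing max_attempts as a parameter makes the number of attempts configurable, which the nested try/except version could not do.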
