Apply a Function to a Dictionary in Python - python

I have an if/elif chain that checks whether or not a website exists. It works, but it is incredibly slow. I'd like to use a dictionary to apply the same check to the list of URLs I have. Ideally, I'm looking to get the outputs into a new table listing each URL and its result.
I'm creating a dictionary for the potential outputs outlined in the elif statement posted below:
check = {401: 'web site exists, permission needed', 404: 'web site does not exist'}
for row in df['sp_online']:
    r = requests.head(row)
    if r.status_code == 401:
        print('web site exists, permission needed')
    elif r.status_code == 404:
        print('web site does not exist')
    else:
        print('other')
How can I get the results of the confirmation function to show each url's result as a new column in the dataframe?

I think you should try a threading or multiprocessing approach. Instead of requesting one site at a time, you can pool n websites and wait for their responses. With ThreadPool you can achieve this with a few extra lines. Hope this is of use to you!
import requests
from multiprocessing.pool import ThreadPool
list_sites = ['https://www.wikipedia.org/', 'https://youtube.com', 'https://my-site-that-does-not-exist.com.does.not']
def get_site_status(site):
    try:
        response = requests.get(site)
    except requests.exceptions.ConnectionError:
        print("Connection refused")
        return 1
    if response.status_code == 401:
        print('web site exists, permission needed')
    elif response.status_code == 404:
        print('web site does not exist')
    else:
        print('other')
    return 0
# More than one worker thread is needed for the requests to overlap
pool = ThreadPool(processes=len(list_sites))
results = pool.map_async(get_site_status, list_sites)
print('Results: {}'.format(results.get()))
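To keep each result paired with its URL, which is what the question asks for, the worker can return the message instead of printing it. Here is a minimal sketch, assuming concurrent.futures.ThreadPoolExecutor in place of multiprocessing.pool.ThreadPool, and a hypothetical fetch_status stub (with made-up URLs and status codes) standing in for requests.head(url).status_code so that it runs offline:

```python
from concurrent.futures import ThreadPoolExecutor

CHECK = {401: 'web site exists, permission needed',
         404: 'web site does not exist'}

# Hypothetical stand-in for requests.head(url).status_code so the sketch
# runs without network access; the URLs and codes are made up.
FAKE_STATUS = {
    'https://example.com/private': 401,
    'https://example.com/missing': 404,
    'https://example.com/': 200,
}

def fetch_status(url):
    return FAKE_STATUS[url]

def describe(url):
    """Return (url, message) so each result stays paired with its URL."""
    return url, CHECK.get(fetch_status(url), 'other')

urls = list(FAKE_STATUS)
with ThreadPoolExecutor(max_workers=8) as executor:
    # executor.map yields results in the same order as the input URLs
    results = list(executor.map(describe, urls))

for url, message in results:
    print(url, '->', message)
```

With real requests, fetch_status would be the only part to swap out; the list of (url, message) pairs can then be turned into a table.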

I think you are looking for Series.map
import pandas as pd
df = pd.DataFrame({'status': [401, 404, 500]})
check = {401: 'web site exists, permission needed', 404: 'web site does not exist'}
print(df['status'].map(check))
prints
0    web site exists, permission needed
1               web site does not exist
2                                   NaN
Name: status, dtype: object
Assign to a new column in the normal way
df['new_col'] = df['status'].map(check)
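Codes that are missing from the dictionary come back as NaN, so the else branch of the original if/elif chain needs one extra step. Here is a small sketch, assuming made-up URLs and status codes in place of the real request results:

```python
import pandas as pd

check = {401: 'web site exists, permission needed',
         404: 'web site does not exist'}

# Made-up URLs and status codes standing in for the real request results.
df = pd.DataFrame({
    'sp_online': ['https://example.com/a',
                  'https://example.com/b',
                  'https://example.com/c'],
    'status': [401, 404, 500],
})

# Codes missing from the dictionary map to NaN; fillna supplies the
# 'other' label that the else branch printed.
df['result'] = df['status'].map(check).fillna('other')
print(df[['sp_online', 'result']])
```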

Related

Iterating over links using Requests and adding to Pandas DF

I have a problem that I haven't been able to sort out for a day, and I'm kindly asking the community for help.
I have prepared a Pandas DataFrame which looks like this:
What I need to do:
Create URL links using the 'name' and 'namespace' columns of df
Request each URL
Save parameters from the page if the status code is 200 and there is data; otherwise, find the error code and save it to a new column of df called 's2s_status'
What I have for now:
start_slice = 1
total_end = 20
while start_slice <= total_end:
    end_slice = start_slice + 1
    if end_slice > total_end:
        end_slice = total_end
    var_name = df.loc[df.index == start_slice, 'name'].values[0]
    var_namespace = df.loc[df.index == start_slice, 'namespace'].values[0]
    url = f"http://{var_name}.{var_namespace}.prod.s.o3.ru:84/config"
    r = requests.get(url, timeout=(12600, 1))
    data = r.json()['Values']['s2s_auth_requests_sign_grpc']
    if r.status_code == 404:
        df['s2s_status'] = "404 error"
    elif r.status_code == 500:
        df['s2s_status'] = "500 Error"
    elif r.status_code == 502:
        df['s2s_status'] = "502 Error"
    elif r.status_code == 503:
        df['s2s_status'] = "503 Error"
    else:
        data = r.json()['Values']['s2s_auth_requests_sign_grpc']
        df['s2s_status'] = "sign"
    if end_slice == total_end:
        break
    else:
        start_slice = end_slice
    print(r)
    print(url)
print(df)
This code iterates over the first 20 records, but:
It reports wrong errors, e.g. a page like 'http://exteca-auth.eea.prod.s.o3.ru:84/config' is not found at all, yet it gives me a 404 error.
Less important, but still: I haven't figured out how to handle cases where the page returns nothing (no data with a 200 code, and not a 404/500/502/503/etc. error).
Thank you in advance.
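One way to separate the cases listed above is to decide the status text in a small helper that looks at the status code before touching the JSON. This is only a sketch: the helper name s2s_status and the "no data" label are assumptions, not part of the original code.

```python
def s2s_status(status_code, body):
    """Return the text to store in 's2s_status' for one response.

    body is the decoded JSON (a dict), or None when the page had no JSON.
    The "no data" label is a hypothetical choice for the empty case.
    """
    if status_code != 200:
        return f"{status_code} Error"
    try:
        body['Values']['s2s_auth_requests_sign_grpc']
    except (KeyError, TypeError):
        # 200 response, but the expected key (or the whole body) is missing
        return "no data"
    return "sign"

print(s2s_status(404, None))
print(s2s_status(200, {'Values': {'s2s_auth_requests_sign_grpc': True}}))
print(s2s_status(200, {}))
```

Inside the loop, the result would be written to a single row with df.loc[start_slice, 's2s_status'] = ..., because df['s2s_status'] = "404 error" assigns the same value to every row of the column. A host that cannot be reached at all raises requests.exceptions.ConnectionError instead of returning a status code, so that case needs its own try/except around requests.get.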

keyerror when adding key in dict

I cleaned some keys of a dictionary and tried to add them to a new dict so that I can work only with them. But when I try to encode and decode keys such as Radaufhängung or Zündanlage and add them to the new dict, I get an error. My question is whether there is a way to get around this, or whether there is a better solution to handle this (line: 49)?
my code:
import requests
import json
import time
from requests.exceptions import HTTPError
attempts = 0
def get_data_from_url(url):
    try:
        response = requests.get(url)
        # If the response was successful, no Exception will be raised
        response.raise_for_status()
    except HTTPError:
        return "HTTPError"
    else:
        response_dict = json.loads(response.text)
        return response_dict
url = get_data_from_url("http://160.85.252.148/")
#(1) upon no/invalid response, the query is resubmitted with exponential backoff waiting time in between requests to avoid server overload;
while url == "HTTPError":
    attempts += 1
    time.sleep(attempts * 1.5)
    url = get_data_from_url("http://160.85.252.148/")
print(url)
#(2) material records with missing or invalid cost are ignored;
valid_values = {}
for key in url:
    if type(url[key]) == int or type(url[key]) == float and url[key].isdigit() == True:
        #(3) wrongly encoded umlauts are repaired.
        key = key.encode('latin1').decode('utf8')
        #key = key.replace('é', 'Oe').replace('ä', 'ae').replace('ü', 'ue')
        valid_values[key]=abs(url[key])
print(valid_values)
You're trying to access the dictionary with a modified key. Store the original key and use it when accessing the dictionary:
okey = key
key = key.encode('latin1').decode('utf8')
...
valid_values[key]=abs(url[okey])
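Iterating over items() avoids the second lookup altogether, since the value arrives together with the original key. Here is a minimal sketch with made-up sample data; the helper name repair_key is an assumption, and the isinstance check is swapped in for the type()/isdigit() test, because int and float values have no isdigit method:

```python
def repair_key(key):
    """Undo UTF-8 text that was wrongly decoded as Latin-1; leave other keys alone."""
    try:
        return key.encode('latin1').decode('utf8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        return key  # already correct, or not repairable this way

# Made-up sample data. The escapes spell the wrongly decoded forms of
# "Radaufhängung" and "Zündanlage" (UTF-8 bytes read as Latin-1).
raw = {
    'Radaufh\u00c3\u00a4ngung': 120,
    'Z\u00c3\u00bcndanlage': -45.5,
    'Motor': 'n/a',      # invalid cost, skipped
    'Bremse': None,      # missing cost, skipped
}

valid_values = {}
for original_key, cost in raw.items():
    # bool is a subclass of int, so exclude it explicitly
    if isinstance(cost, (int, float)) and not isinstance(cost, bool):
        valid_values[repair_key(original_key)] = abs(cost)

print(valid_values)
```

The try/except in repair_key matters for keys that are already correct: a real "ü" encodes to a single Latin-1 byte that is not valid UTF-8, so decoding it would otherwise raise UnicodeDecodeError.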

AWS chalice local works but not chalice deploy

I am pretty new to coding and AWS Chalice. I tried writing code that gets messages from TradingView and executes orders depending on the signals.
I tested the code locally and everything worked fine, but when I test the Rest API I get the following error:
{"message":"Missing Authentication Token"}
I set up my credentials via "aws configure" as explained here: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html
I also created a config.txt file in my aws folder and checked my settings via "aws configure get" and they were fine.
The index function in the beginning worked too, so there should be a problem within my code?
I changed some values and cut some functions and the strategy part out, but the code looks somewhat like this:
from chalice import Chalice
from datetime import datetime
from binance.client import Client
from binance.enums import *
import ccxt
exchange = ccxt.binance({
    'apiKey': 'KEY',
    'secret': 'SECRET',
    'enableRateLimit': True,
    'options': {
        'defaultType': 'future',
    },
})
def buy_order(quantity, symbol, order_type = ORDER_TYPE_MARKET,side=SIDE_BUY,recvWindow=5000):
    try:
        print("sending order")
        order = client.futures_create_order(symbol = symbol, type = order_type, side = side, quantity = quantity,recvWindow=recvWindow)
        print(order)
    except Exception as e:
        print("an exception occured - {}".format(e))
        return False
    return True
app = Chalice(app_name='tradingview-webhook-alert')
indicator1 = "x"
indicator2 = "y"
TRADE_SYMBOL = "Test123"
in_position = False
def diff_time(time1, time2):
    fmt = '%Y-%m-%dT%H:%M:%SZ'
    tstamp1 = datetime.strptime(time1, fmt)
    tstamp2 = datetime.strptime(time2, fmt)
    if tstamp1 > tstamp2:
        td = tstamp1 - tstamp2
    else:
        td = tstamp2 - tstamp1
    td_mins = int(round(td.total_seconds() / 60))
    return td_mins
@app.route('/test123', methods=['POST'])
def test123():
    global indicator1, indicator2
    request = app.current_request
    message = request.json_body
    indicator = message["indicator"]
    price = message["price"]
    value = message["value"]
    if indicator == "indicator1":
        indicator1 = value
    if indicator == "indicator2":
        indicator2 = value
    if in_position == False:
        if (indicator1 >123) & (indicator2 < 321):
            balance = exchange.fetch_free_balance()
            usd = float(balance['USDT'])
            TRADE_QUANTITY = (usd / price)*0.1
            order_succeeded = buy_order(TRADE_QUANTITY, TRADE_SYMBOL)
            if order_succeeded:
                in_position = True
    return {"test": "123"}
I tested it locally with Insomnia and tried the Rest API link there and in my browser, both with the same error message. Is my testing method wrong or is it the code? But even then, why isn't the Rest API link working, when I include the index function from the beginning again? If I try the index function from the beginning, I get the {"message": "Internal server error"} .
This is probably a very very basic question but I couldn't find an answer online.
Any help would be appreciated!
I'm not sure whether this helps, because I don't fully understand your question, but:
Your route only accepts POST requests, and opening a URL in a browser sends a GET request, so the route is not executed.
Try something like @app.route('/test123', methods=['POST', 'GET']) so that simply opening the URL executes a GET request.
Some more information:
https://www.w3schools.com/tags/ref_httpmethods.asp

Mocking requests.post [duplicate]

This question already has answers here:
How can I mock requests and the response?
(20 answers)
Closed 2 years ago.
This is my first time writing unit tests, apologies for the annoyances inevitably present, despite my best efforts. I am trying to mock requests.post but my test function is not having the desired effect, to induce a 404 status code so that I can test error handling.
mymodule.py
def scrape(data):
    logger.debug(f'\nBeginning scrape function')
    result = {}
    exceptions = {}
    for id, receipts in data.items():
        logger.debug(f'Looking up Id # {id} and receipts: \n{receipts}')
        dispositions = []
        for receipt in receipts:
            logger.debug(f'The length of receipts is:' + str(len(receipts)))
            attempts = 1
            while attempts < 6:
                logger.debug(
                    f'This is attempt number {attempts} to search for {receipt}')
                payload = {'receipt': 'receipt',
                           'button': 'CHECK+STATUS', }
                try:
                    NOW = datetime.today().strftime('%c')
                    logger.debug(NOW)
                    logger.debug(f'Making post request for: {receipt}')
                    response = requests.post(URL, data=payload, headers=HEADERS, timeout=10)
                except Exception as e:
                    logger.debug(f'There was an exception: {e}')
                    exceptions[id] = receipt + f': {e}'
                    time.sleep(3)
                    attempts += 1
                else:
                    logger.debug(f'It worked {response.status_code}')
                    attempts = 6
                    disp = parse(response)
                    dispositions.append(f'{receipt}: {disp}')
        result[id] = dispositions
    logger.debug(f'Here is the result: {result}')
    return result
test_mymodule.py
def test_scrape(self):
    print(f'\ntest_scrape running')
    # mock a 404 in scrape() here
    with patch("mymodule.requests") as patched_post:
        # mock a request response
        patched_post.return_value.status_code = 404
        print('404 mocked')
        # verify the function returns nothing due to 404
        result = scrape(test_data)
        print(f'\n{result}')
        mock_requests.post.assert_called_once()
        self.assertEqual(result, {})

Super-performatic comparison

I have Python code which retrieves information from an HTTP API using the requests module. This code runs over and over again, with an interval of a few milliseconds between calls.
The HTTP API which I'm calling can send me 3 different responses, which can be:
text 'EMPTYFRAME' with HTTP status 200
text 'CAMERAUNAVAILABLE' with HTTP status 200
JPEG image with HTTP status 200
This is part of the code which handles this situation:
try:
    r = requests.get(url,
                     auth=(username, pwd),
                     params={
                         'camera': camera_id,
                         'ds': int((datetime.now() - datetime.utcfromtimestamp(0)).total_seconds())
                     }
                     )
    if r.text == 'CAMERAUNAVAILABLE':
        raise CameraManager.CameraUnavailableException()
    elif r.text == 'EMPTYFRAME':
        raise CameraManager.EmptyFrameException()
    else:
        return r.content
except ConnectionError:
    # handles the error - not important here
The critical part is the if/elif/else section: this comparison takes far too long to complete. If I remove it completely and simply replace it with return r.content, I get the performance I want, but checking for the two non-image responses is important for the application flow.
I also tried like:
if len(r.text) == len('CAMERAUNAVAILABLE'):
    raise CameraManager.CameraUnavailableException()
elif len(r.text) == len('EMPTYFRAME'):
    raise CameraManager.EmptyFrameException()
else:
    return r.content
And:
if r.text[:17] == 'CAMERAUNAVAILABLE':
    raise CameraManager.CameraUnavailableException()
elif r.text[:10] == 'EMPTYFRAME':
    raise CameraManager.EmptyFrameException()
else:
    return r.content
Which made it faster but still not as fast as I think this can get.
So is there a way to optimize this comparison?
EDIT
With the accepted answer, the final code is like this:
if r.headers['content-type'] == 'image/jpeg':
    return r.content
elif len(r.text) == len('CAMERAUNAVAILABLE'):
    raise CameraManager.CameraUnavailableException()
elif len(r.text) == len('EMPTYFRAME'):
    raise CameraManager.EmptyFrameException()
Checking the response's Content-Type provided a much faster way to assure an image was received.
Comparing the whole r.text (which may contain the JPEG bytes) is probably slow.
You could compare the Content-Type header the server should set:
ct = r.headers['content-type']
if ct == "text/plain":
    # check for CAMERAUNAVAILABLE or EMPTYFRAME
    pass
else:
    # this is a JPEG
    pass
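A likely reason r.text is slow here is that requests has to decode the body to text, and when the response declares no charset it first tries to guess the encoding from the bytes, which is wasted work on a JPEG. Comparing r.content as bytes avoids the decoding step. Here is a minimal sketch, assuming a stub response object and hypothetical exception names in place of the real requests response and the CameraManager exceptions:

```python
from types import SimpleNamespace

class CameraUnavailableError(Exception):
    """Hypothetical stand-in for CameraManager.CameraUnavailableException."""

class EmptyFrameError(Exception):
    """Hypothetical stand-in for CameraManager.EmptyFrameException."""

def extract_frame(response):
    """Return the JPEG bytes, or raise for the two text responses."""
    content_type = response.headers.get('content-type', '')
    if content_type.startswith('image/jpeg'):
        return response.content  # no text decoding for the image
    body = response.content      # a short text body, compared as raw bytes
    if body == b'CAMERAUNAVAILABLE':
        raise CameraUnavailableError()
    if body == b'EMPTYFRAME':
        raise EmptyFrameError()
    raise ValueError(f'unexpected response body: {body[:20]!r}')

# Stub responses; a real requests response exposes the same two attributes,
# and its headers mapping is case-insensitive.
jpeg = SimpleNamespace(headers={'content-type': 'image/jpeg'},
                       content=b'\xff\xd8\xff\xe0')
empty = SimpleNamespace(headers={'content-type': 'text/plain'},
                        content=b'EMPTYFRAME')

print(len(extract_frame(jpeg)))
try:
    extract_frame(empty)
except EmptyFrameError:
    print('empty frame')
```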
