Script to check status code of URLs using Python

Script to check status code of URLs using Python - python

I want to write script that accept multiple URLs through list or text file and append some string at the end of each URL and check https status code (200, 401 and 403)of each URL and save in separate files.
Here's my code so far:
lst = {'back.sql',
'backup.sql',
'accounts.sql',
'backups.sql',
'clients.sql',
'customers.sql',
'data.sql',
'database.sql',
'database.sqlite',
'users.sql',
'db.sql',
'db.sqlite',
'db_backup.sql',
'dbase.sql',
'dbdump.sql',
'setup.sql',
'sqldump.sql',
'dump.sql',
'mysql.sql',
'sql.sql',
'temp.sql'
}
url_test = 'http://www.Holiday.com/%s/' #This can be modified to accept multiple URLs
for i in lst:
url = url_test %i
print(url) #This can be modified to save results for each http status code

If you want, check status code you have to request each page one by one
from requests import get
lst = {'back.sql',
'backup.sql',
'accounts.sql',
'backups.sql',
'clients.sql',
'customers.sql',
'data.sql',
'database.sql',
'database.sqlite',
'users.sql',
'db.sql',
'db.sqlite',
'db_backup.sql',
'dbase.sql',
'dbdump.sql',
'setup.sql',
'sqldump.sql',
'dump.sql',
'mysql.sql',
'sql.sql',
'temp.sql'
}
url_test = ['http://www.Holiday.com/%s/'] #Create list of url
result_dict = dict()
for i in lst:
for url_from_list in url_test:
url = url_from_list %i
# request and get status code from each page one by one
result_dict[url] = get(url).status_code
result_dict will be a dictionary which will contain url as key and response code as value
Then save it to file
with open('filename.txt', 'w') as file:
for url, status_code in result_dict.items():
line = url+" "+str(status_code)+"\n"
file.write(line)

Related

How i can make a request for every token from a txt file in python

I have a text file called tokens.txt.
Ex: 12463,4126,6343,6345.
And i want to send a post request with each tokens and use multi threading.
For some reasons my code only gets the last token from the txt file and only uses that.
import requests
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import time
url_list = [
"https://www.google.com/api/"
]
file_lines = open("tokens.txt", "r").readlines()
for line in file_lines:
tokens = {
'Token':line.replace('/n','')
}
def makerequest(url):
while True:
html = requests.post(url,stream=True, data=tokens)
print(tokens)
return html.content
start = time()
processes = []
with ThreadPoolExecutor(max_workers=200) as executor:
for url in url_list:
processes.append(executor.submit(makerequest, url))
for task in as_completed(processes):
print(task.result())
print(f'Time taken: {time() - start}')
How can i send for each token a request?

In your case tokens = {"Token": <last_token>}
Modify your code like this so that for each token one request can be sent.
tokens = set()
'''
<- You can use list also but in this case set is better as it will ensure only
one request for one token even if your tokens file contains duplicate line.
'''
url_list = [
"https://www.google.com/api/"
]
tokens = set()
with open("tokens.txt", "r") as f:
file_lines = f.readlines()
for line in file_lines:
tokens.add(line.strip())
token_data = {"Token": None}
def makerequest(url):
for token in tokens:
token_data["Token"] = token
html = requests.post(url,stream=True, data=token_data)
print(token)
# do something with html here
# don't return or break

You are doing
data = tokens
and at that point tokens is the assignment from the last line. If you want all tokens, you need to do something likej:
tokens = set()
for line file_lines:
tokens.add(......)

The problem with your code is the creation of the tokens dictionary - you loop ofer the tokens but you alway overwrite the value mapped to the "Token" key.
Moreover there are a few bad practices in you code.
please be careful with the inline opening of files like you did
file_lines = open("tokens.txt", "r").readlines()
Rather use it as a context manager
with open("tokens.txt", "r") as file:
file_lines = file.readlines()
This makes sure that the file gets closed again after you read it - in your case you would need to make sure that the file gets closed (even in a crash etc.)
Secondly, avoid using global variables in functions. According to you code I assume that you want to query the different urls with each token - so the fucntion should accept both as arguments. Respectively i would then create a list of combinations like
url_token_combs = [(url, token.strip()) for url in url_list for token in file_lines]
And finally, change your function to use the arguements handed to it rather than global ones like:
def makerequest(url_token ):
url , token = url_token
html = requests.post(url,stream=True, data=token)
return html.content
That allows you now to loop over your code with thread like:
import requests
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import time
def makerequest(url_token):
url , token = url_token
html = requests.post(url,stream=True, data=tokens)
print(tokens)
return html.content
if __name__ == "__main__":
start = time()
url_list = [
"https://www.google.com/api/"
]
with open("tokens.txt", "r") as file:
file_lines = file.readlines()
tokens = [{'Token':line.replace('/n','') }for line in file_lines ]
url_tokens = [(url, token.strip()) for url in url_list for token in tokens]
processes = []
with ThreadPoolExecutor(max_workers=200) as executor:
for url_token in url_tokens:
processes.append(executor.submit(makerequest, url_token))
for task in as_completed(processes):
print(task.result())
print(f'Time taken: {time() - start}')

Request Status Code 500 when running Python Script

This is what i am suppose to do:
List all files in data/feedback folder
Scan all the files, and make a nested dictionary with Title, Name, Date & Feedback (All the files are in Title,Name, Date & Feedback format with each in a different line of file, that’s why using rstrip function)
Post the dictionary in The given url
Following is my code:
#!/usr/bin/env python3
import os
import os.path
import requests
import json
src = '/data/feedback/'
entries = os.listdir(src)
Title, Name, Date, Feedback = 'Title', 'Name', 'Date', 'Feedback'
inputDict = {}
for i in range(len(entries)):
fileName = entries[i]
completeName = os.path.join(src, fileName)
with open(completeName, 'r') as f:
line = f.readlines ()
line tuple = (line[0],line[1],line[2],line[3])
inputDict[fileName] = {}
inputDict[fileName][Title] = line_tuple[0].rstrip()
inputDict[fileName][Name] = line_tuple[1].rstrip()
inputDict[fileName][Date] = line_tuple[2].rstrip()
inputDict[fileName][Feedback] = line_tuple[3].rstrip()
x = requests.get ("http://website.com/feedback")
print (x.status_code)
r = requests.post ("http://Website.com/feedback” , data=inputDict)
print (r.status_code)
After i run it, get gives 200 code but post gives 500 code.
I just want to know if my script is causing the error or not ?

r = requests.post ("http://Website.com/feedback” , data=inputDict)
If your rest api endpoint is expecting json data then the line above is not doing that; it is sending the dictionary inputDict as form-encoded, as though you were submitting a form on an HTML page.
You can either use the json parameter in the post function, which sets the content-type in the headers to application/json:
r = requests.post ("http://Website.com/feedback", json=inputDict)
or set the header manually:
headers = {'Content-type': 'application/json'}
r = requests.post("http://Website.com/feedback", data=json.dumps(inputDict), headers=headers)

passing value from panda dataframe to http request

I'm not sure how I should ask this question. I'm looping through a csv file using panda (at least I think so). As I'm looping through rows, I want to pass a value from a specific column to run an http request for each row.
Here is my code so far:
def api_request(request):
fs = gcsfs.GCSFileSystem(project=PROJECT)
with fs.open('gs://project.appspot.com/file.csv') as f:
df = pd.read_csv(f,)
value = df[['ID']].to_string(index=False)
print(value)
response = requests.get(REQUEST_URL + value,headers={'accept': 'application/json','ClientToken':TOKEN }
)
json_response = response.json()
print(json_response)
As you can see, I'm looping through the csv file to get the ID to pass it to my request url.
I'm not sure I understand the issue but looking at the console log it seems that print(value) is in the loop when the response request is not. In other words, in the console log I'm seeing all the ID printed but I'm seeing only one http request which is empty (probably because the ID is not correctly passed to it).
I'm running my script with cloud functions.

Actually, forgo the use of the Pandas library and simply iterate through csv
import csv
def api_request(request):
fs = gcsfs.GCSFileSystem(project=PROJECT)
with fs.open('gs://project.appspot.com/file.csv') as f:
reader = csv.reader(f)
next(reader, None) # SKIP HEADERS
for row in reader: # LOOP THROUGH GENERATOR (NOT PANDAS SERIES)
value = row[0] # SELECT FIRST COLUMN (ASSUMED ID)
response = requests.get(
REQUEST_URL + value,
headers={'accept': 'application/json', 'ClientToken': TOKEN }
)
json_response = response.json()
print(json_response)

Give this a try instead:
def api_request(request):
fs = gcsfs.GCSFileSystem(project=PROJECT)
with fs.open('gs://project.appspot.com/file.csv') as f:
df = pd.read_csv(f)
for value in df['ID']:
response = requests.get(
REQUEST_URL + value,
headers = {'accept': 'application/json', 'ClientToken': TOKEN }
)
json_response = response.json()
print(json_response)
As mentioned in my comment, you haven't iterated through the data. What you are seeing is just the string representation of it with linebreaks (which might be why you mistakenly thought to be looping).

Working with Tenor's API

My problem is that I don't know how to work with the result of the search of a gif. I used an example, I know how to modify some parameters but I don't know how to build the gifs of the result. Code:
import requests
import json
# set the apikey and limit
apikey = "MYKEY" # test value
lmt = 8
# load the user's anonymous ID from cookies or some other disk storage
# anon_id = <from db/cookies>
# ELSE - first time user, grab and store their the anonymous ID
r = requests.get("https://api.tenor.com/v1/anonid?key=%s" % apikey)
if r.status_code == 200:
anon_id = json.loads(r.content)["anon_id"]
# store in db/cookies for re-use later
else:
anon_id = ""
# our test search
search_term = "love"
# get the top 8 GIFs for the search term
r = requests.get(
"https://api.tenor.com/v1/search?q=%s&key=%s&limit=%s&anon_id=%s" %
(search_term, apikey, lmt, anon_id))
if r.status_code == 200:
# load the GIFs using the urls for the smaller GIF sizes
top_8gifs = json.loads(r.content)
print (top_8gifs)
else:
top_8gifs = None
I would like to download the file. I know I can do it with urllib and request, but the problem is that I don't even know what is top_8gifs.
I hope someone could help me. I'm waiting you answer, thanks for your attention!!

First of all you have to use a legitimate key instead of MYKEY. Once you have done that you'll observe this code will print the output of the GET request that you have sent. It is a json file which is similar to a dictionary in python. So now you can exploit this dictionary and obtain the urls. The best strategy is to simply print out the output of json and observe the structure of dictionary carefully and extract the url from it. If you want more clarity we can use pprint module in python. It is pretty awesome and will show you how a json file looks properly. Here is the modified version of your code which pretty prints the json file, prints the gif urls and downloads the gif files. You can improve upon it and play with it if you want.
import requests
import json
import urllib.request,urllib.parse,urllib.error
import pprint
# set the apikey and limit
apikey = "YOURKEY" # test value
lmt = 8
# load the user's anonymous ID from cookies or some other disk storage
# anon_id = <from db/cookies>
# ELSE - first time user, grab and store their the anonymous ID
r = requests.get("https://api.tenor.com/v1/anonid?key=%s" % apikey)
if r.status_code == 200:
anon_id = json.loads(r.content)["anon_id"]
# store in db/cookies for re-use later
else:
anon_id = ""
# our test search
search_term = "love"
# get the top 8 GIFs for the search term
r = requests.get(
"https://api.tenor.com/v1/search?q=%s&key=%s&limit=%s&anon_id=%s" %
(search_term, apikey, lmt, anon_id))
if r.status_code == 200:
# load the GIFs using the urls for the smaller GIF sizes
pp = pprint.PrettyPrinter(indent=4)
top_8gifs = json.loads(r.content)
pp.pprint(top_8gifs) #pretty prints the json file.
for i in range(len(top_8gifs['results'])):
url = top_8gifs['results'][i]['media'][0]['gif']['url'] #This is the url from json.
print (url)
urllib.request.urlretrieve(url, str(i)+'.gif') #Downloads the gif file.
else:
top_8gifs = None

Python read html addresses from a file and post all results

target = open("addresses.txt", 'r+')
for line in target:
number = requests.get(line)
print (number)
this is clearly wrong but I'm stuck it should extract the addresses from .txt check on the net via api and print the result
So that it should read the content of each address i.e
0
0
my addresses.txt contain
http://chainz.cryptoid.info/cure/api.dws?Key=3972cc3ec73f&q=getbalance&a=BPzWE91tLTGRAqTByE4AvbX79vgYGGc9ye
http://chainz.cryptoid.info/cure/api.dws?Key=3972cc3ec73f&q=getbalance&a=BPzWE91tLTGRAqTByE4AvbX79vgYGGc9pt

Do this:
with open("addresses.txt") as addresses:
for address in addresses.readlines():
response = requests.get(address)
print(response)

Try something like:
import requests
with open("addresses.txt", 'r') as target:
for line in target:
r = requests.get(line)
print(r.text)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Script to check status code of URLs using Python - python

Related

How i can make a request for every token from a txt file in python

Request Status Code 500 when running Python Script

passing value from panda dataframe to http request

Working with Tenor's API

Python read html addresses from a file and post all results

Categories

Resources