How to get all results in an HTTP request in Python

I am trying to get all results from https://www.ncl.com/. I found that the request must be a GET sent to this link: https://www.ncl.com/search_vacations
So far I have the first 12 results, and there is no problem parsing them. The problem is that I cannot find a way to "change" the page of the results: I get 12 of 499 and I need them all. I've tried https://www.ncl.com/search_vacations?current_page=1 and incremented it every time, but I get the same (first) results each time. I also tried adding a JSON body to the request, json = {"current_page": '1'}, again with no success.
This is my code so far:
import math
import requests

session = requests.session()
proxies = {'https': 'https://97.77.104.22:3128'}
headers = {
    "authority": "www.ncl.com",
    "method": "GET",
    "path": "/search_vacations",
    "scheme": "https",
    "accept": "application/json, text/plain, */*",
    "connection": "keep-alive",
    "referer": "https://www.ncl.com",
    "cookie": "AkaUTrackingID=5D33489F106C004C18DFF0A6C79B44FD; AkaSTrackingID=F942E1903C8B5868628CF829225B6C0F; UrCapture=1d20f804-718a-e8ee-b1d8-d4f01150843f; BIGipServerpreprod2_www2.ncl.com_http=61515968.20480.0000; _gat_tealium_0=1; BIGipServerpreprod2_www.ncl.com_r4=1957341376.10275.0000; MP_COUNTRY=us; MP_LANG=en; mp__utma=35125182.281213660.1481488771.1481488771.1481488771.1; mp__utmc=35125182; mp__utmz=35125182.1481488771.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); utag_main=_st:1481490575797$ses_id:1481489633989%3Bexp-session; s_pers=%20s_fid%3D37513E254394AD66-1292924EC7FC34CB%7C1544560775848%3B%20s_nr%3D1481488775855-New%7C1484080775855%3B; s_sess=%20s_cc%3Dtrue%3B%20c%3DundefinedDirect%2520LoadDirect%2520Load%3B%20s_sq%3D%3B; _ga=GA1.2.969979116.1481488770; mp__utmb=35125182; NCL_LOCALE=en-US; SESS93afff5e686ba2a15ce72484c3a65b42=5ecffd6d110c231744267ee50e4eeb79; ak_location=US,NY,NEWYORK,501; Ncl_region=NY; optimizelyEndUserId=oeu1481488768465r0.23231006365903206",
    "Proxy-Authorization": "Basic QFRLLTVmZjIwN2YzLTlmOGUtNDk0MS05MjY2LTkxMjdiMTZlZTI5ZDpAVEstNWZmMjA3ZjMtOWY4ZS00OTQxLTkyNjYtOTEyN2IxNmVlMjlk"
}

def get_count():
    response = requests.get(
        "https://www.ncl.com/search_vacations?cruise=1&cruiseTour=0&cruiseHotel=0&cruiseHotelAir=0&flyCruise=0&numberOfGuests=4294953449&state=undefined&pageSize=10&currentPage=",
        proxies=proxies)
    tmpcruise_results = response.json()
    tmpline = tmpcruise_results['meta']
    total_record_count = tmpline['aggregate_record_count']
    return total_record_count

total_cruise_count = get_count()
total_page_count = math.ceil(int(total_cruise_count) / 10)
session.headers.update(headers)
cruises = []
page_counter = 1
while page_counter <= total_page_count:
    url = "https://www.ncl.com/search_vacations?current_page=" + str(page_counter)
    page = requests.get(url, headers=headers, proxies=proxies)
    cruise_results = page.json()
    for line in cruise_results['results']:
        cruises.append(line)
        print(line)
    page_counter += 1
    print(cruise_results['pagination']["current_page"])
    print("----------")
print(len(cruises))
I'm using requests with a proxy. Any ideas how to do that?

The website claims to have 12264 search results (for a blank search), organised in pages of 12.
The search URL takes a parameter Nao which seems to define the offset from which your page of results will start.
So fetching https://www.ncl.com/uk/en/search_vacations?Nao=45
should get a "page" of 12 search results, starting with result 46.
And sure enough:
"pagination": {
"starting_record": "46",
"ending_record": "57",
"current_page": "4",
"start_page": "1",
...
So to page through all results, start with Nao = 0 and add 12 for each fetch.
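That loop can be sketched as follows. This is a minimal, untested illustration: the helper names are mine, and it assumes the endpoint keeps returning a results list with the same JSON shape shown above.

```python
PAGE_SIZE = 12
BASE_URL = "https://www.ncl.com/uk/en/search_vacations"  # endpoint from the answer above

def page_offsets(total_results, page_size=PAGE_SIZE):
    """Zero-based Nao offsets needed to cover total_results results."""
    return list(range(0, total_results, page_size))

def fetch_all(total_results):
    import requests  # imported here so the pure helper above stays dependency-free
    results = []
    for offset in page_offsets(total_results):
        # Nao is the offset of the first result on the requested page
        resp = requests.get(BASE_URL, params={"Nao": offset})
        results.extend(resp.json().get("results", []))
    return results

print(page_offsets(40))  # [0, 12, 24, 36]
```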

Related

Pinterest API 'bookmark' always returning the same string and data

I am trying to use pagination the way the Pinterest API documentation instructs, by passing 'bookmark' as a parameter to the next GET request in order to get the next batch of data.
However, the data returned is exactly the same as the initial data I received (without passing 'bookmark'), and the value of 'bookmark' is also the same!
Because of this I keep receiving the same data over and over and can't get all of it. In my case I'm trying to list all campaigns.
Here is my python code:
url = f'https://api.pinterest.com/v5/ad_accounts/{ad_account_id}/campaigns'
payload = f"page_size=25"
headers = {
    "Accept": "text/plain",
    "Content-Type": "application/x-www-form-urlencoded",
    "Authorization": f"Bearer {access_token}"
}
response = requests.request("GET", url, data=payload, headers=headers)
print(response)
feed = response.json()
print(feed)

bookmark = ''
if 'bookmark' in feed:
    bookmark = feed['bookmark']
    print(bookmark)

while bookmark != '' and bookmark != None and bookmark != 'null':
    url = f'https://api.pinterest.com/v5/ad_accounts/{ad_account_id}/{level}s'
    payload = f"page_size=25&bookmark={bookmark}"
    headers = {
        "Accept": "text/plain",
        "Content-Type": "application/x-www-form-urlencoded",
        "Authorization": f"Bearer {access_token}"
    }
    response = requests.request("GET", url, data=payload, headers=headers)
    print(response)
    feed = response.json()
    print(feed)
    bookmark = feed['bookmark']
    print(bookmark)
I think your condition in the while loop is wrong, which is why you keep getting the same page. I'm currently also working with the Pinterest API, and below is my modified implementation for getting a list of ad accounts.
Basically, you test whether the bookmark is None. If it is, you can return the result; otherwise you put the bookmark into the query parameters and call the endpoint again.
from app.api.http_client import HttpClient

class PinterestAccountsClient():
    def get_accounts(self, credentials) -> list[dict]:
        headers = {
            'Authorization': f"Bearer {credentials('access_token')}"
        }
        params = {
            'page_size': 25
        }
        accounts = []
        found_last_page = False

        while not found_last_page:
            response = HttpClient.get(self.listing_url, headers=headers, params=params)
            items = response.get('items', [])
            bookmark = response.get('bookmark')

            if bookmark is None:
                found_last_page = True
            else:
                params['bookmark'] = bookmark

            accounts.extend([{
                'id': account['id'],
                'name': account['name']
            } for account in items])

        return accounts
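The same cursor-following pattern can be sketched with plain requests. The key point for the original question is that the bookmark must go into the query string (params=), not into a request body: a body on a GET is ignored by many servers, which is why the original code kept receiving the first page. The endpoint and field names below come from the question; the session parameter is my addition so a stub can be injected for testing.

```python
def list_campaigns(ad_account_id, access_token, session=None, page_size=25):
    """Collect all items by following the 'bookmark' cursor (Pinterest v5 style)."""
    if session is None:
        import requests  # real HTTP client when no session/stub is supplied
        session = requests
    url = f"https://api.pinterest.com/v5/ad_accounts/{ad_account_id}/campaigns"
    headers = {"Authorization": f"Bearer {access_token}"}
    params = {"page_size": page_size}

    items = []
    while True:
        # Paging values go in the query string (params=), never in a GET body
        feed = session.get(url, headers=headers, params=params).json()
        items.extend(feed.get("items", []))
        bookmark = feed.get("bookmark")
        if not bookmark:  # None or '' means there are no more pages
            return items
        params["bookmark"] = bookmark
```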

How to exclude particular websites in Bing Web Search API result?

I'm using the Bing Web Search API to search for some info on Bing.com. However, for some search queries I get websites that are not relevant for my purposes.
For example, if the search query is books to read this summer, Bing sometimes returns youtube.com or amazon.com as a result. I don't want these kinds of websites in my search results, so I want to block them before the search starts.
Here is my sample code:
params = {
    "mkt": "en-US",
    "setLang": "en-US",
    "textDecorations": False,
    "textFormat": "raw",
    "responseFilter": "Webpages",
    "q": "books to read this summer",
    "count": 20
}
headers = {
    "Ocp-Apim-Subscription-Key": MY_API_KEY_HERE,
    "Accept": "application/json",
    "Retry-After": "1",
}
response = requests.get(WEB_SEARCH, headers=headers, params=params, timeout=15)
response.raise_for_status()
search_data = response.json()
The search_data variable contains the links I want to block. I'd like to exclude them without iterating over the response object; in particular, I want Bing not to include YouTube and Amazon in the result.
Any help will be appreciated.
Edit: WEB_SEARCH is the web search endpoint.
After digging through the Bing Web Search documentation I found an answer. Here it is:
LINKS_TO_EXCLUDE = ["-site:youtube.com", "-site:amazon.com"]

def bing_data(user_query):
    params = {
        "mkt": "en-US",
        "setLang": "en-US",
        "textDecorations": False,
        "textFormat": "raw",
        "responseFilter": "Webpages",
        "count": 30
    }
    headers = {
        "Ocp-Apim-Subscription-Key": MY_API_KEY_HERE,
        "Accept": "application/json",
        "Retry-After": "1",
    }
    # This is the line I needed
    params["q"] = user_query + " " + " ".join(LINKS_TO_EXCLUDE)
    response = requests.get(WEB_SEARCH_ENDPOINT, headers=headers, params=params)
    response.raise_for_status()
    search_data = response.json()
    return search_data
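The fix relies on Bing's query operators: prefixing site: with - excludes a domain, and the exclusions are simply appended to the user's query text before it is sent as the q parameter. A quick illustration of the string being built (build_query is a hypothetical helper; the names come from the answer above):

```python
LINKS_TO_EXCLUDE = ["-site:youtube.com", "-site:amazon.com"]

def build_query(user_query):
    # Append one "-site:domain" term per excluded domain to the raw query text
    return user_query + " " + " ".join(LINKS_TO_EXCLUDE)

print(build_query("books to read this summer"))
# books to read this summer -site:youtube.com -site:amazon.com
```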

How to unlist(Remove) number from truecaller with using script?

Hey guys, I am trying to unlist (remove) a number from Truecaller using the following link:
https://www.truecaller.com/unlisting
I want to automate this process, but because of Google's reCAPTCHA the requests are limited and it isn't possible that way. Is there any way to do this using a library, such as the unofficial Truecaller libraries like Trunofficial?
https://www.truecaller.com/unlisting
Unlisting your phone number requires reCAPTCHA; you can bypass it.
This may be helpful:
import requests
from requests.structures import CaseInsensitiveDict

url = "https://asia-south1-truecaller-web.cloudfunctions.net/api/noneu/unlist/v1"

headers = CaseInsensitiveDict()
headers["Authorization"] = "Bearer null"
headers["Content-Type"] = "application/json"
headers["accept"] = "*/*"
headers["sec-gpc"] = "1"
headers["origin"] = "https://www.truecaller.com"
headers["sec-fetch-site"] = "cross-site"
headers["sec-fetch-mode"] = "cors"
headers["sec-fetch-dest"] = "empty"
headers["referer"] = "https://www.truecaller.com/"
headers["accept-encoding"] = "gzip, deflate, br"
headers["accept-language"] = "en-IN,en-GB;q=0.9,en-US;q=0.8,en;q=0.7"

data = {
    "phoneNumber": "+919912345678",
    "recaptcha": "03ANYolqtbEiFqaQ8wBrDF3kKqkCzIaH4r79oA2hCNd80gZGENvff9fPKocccytf6QXpPvQfQ12WMvgfdP1IKggff6lTY_0ucZxFB7r6A_dbNjfp_NSYtrkU4NX1h_LBQgnCO0ALkWS8CMjaIEjhxclfeClFv4EmFNEQis1OvrSVgvB8nJipuUxGakpa0eB8yWrEQCUfy0Gs7VA2hO4VaeLRTwr6BaxYsJsCP_3-vaMP2crZDDrIm8on_0H0vqh-1S44y69b0rSM6_ornuVZxeNQkpe_3NvPjQQxqQtdyQld55OQkK67PH7OH_A7s3GVgMa0VCOuX_UdBsPkd8mKf708GgutggfggvVrbe3DrBsUnpXMYchsv_revkhknej0G_SxAtqtwQoGPtt5iKSKHRmlelDJpYuQs6Lwi-4Umn_EclRPT2iaohxZ3r8O_4jaGP9yhRiMyVkgTm6mutJn50nPFbyabjSqgC2ShlMEI7IoOqWp9g90b2bl4qw4h6k9vP4AVy36sCx2z_gksBEgxT1zsM3P77PQ_guo12k7rtFlUAmvdqhqgwowaKQFMMBfjWDo40"
}

# Send the body as JSON (json=...) to match the Content-Type header;
# requests computes Content-Length automatically, so it is not set by hand.
resp = requests.post(url, headers=headers, json=data)
print(resp.json())
# {
# "lastUpdated": "2022-07-11T03:27:30.306713Z",
# "phoneNumber": "919912345678",
# "status": "unlisted"
# }

Python API request

Source = ["SGD"]
Destination = ["USD"]
Amount = [5000]

import requests

url = "https://api.currencyfair.com/comparisonQuotes"
payload = "{\"currencyFrom\":\"SGD\",\"currencyTo\":\"EUR\",\"type\":\"SELL\",\"amountInfo\":{\"amount\":50000,\"scale\":2}}"
headers = {
    'user-agent': "vscode-restclient",
    'content-type': "application/json",
    'accept': "application/json"
}
response = requests.request("POST", url, data=payload, headers=headers)
print(response.text)

I need to pass values into the payload string:
payload = "{"currencyFrom":"SGD","currencyTo":"EUR","type":"SELL","amountInfo":{"amount":50000,"scale":2}}"
I need to fill the payload using the three lists created above.
Could you please try to explain a bit more what you're after?
I suspect you'd like to dynamically update the values in the text payload every time you call the function that posts the data.
I'd usually do this by creating a placeholder string and replacing the placeholder values at runtime:
payload = "{\"currencyFrom\":\"#currencyFrom\",\"currencyTo\":\"#currencyTo\",\"type\":\"SELL\",\"amountInfo\": {\"amount\":#amountInfo,\"scale\":2}}"
currencyFrom = 'USD'
currencyTo = 'EUR'
amountInfo = 50000
# str() is needed because replace() expects strings, not ints
payload = payload.replace('#currencyFrom', currencyFrom).replace('#currencyTo', currencyTo).replace('#amountInfo', str(amountInfo))
Looking at the API you're trying to interact with, this is a sample of what it expects:
{
    "currencyFrom": "EUR",
    "currencyTo": "GBP",
    "type": "SELL",
    "amountInfo": {"amount": 100},
    "ignoreFee": false
}
This is a JSON object that follows a specific format; if you try to pass a list rather than a string in the "currencyFrom", "currencyTo" fields, etc., you'll get an error.
To get multiple quotes, simply make multiple requests to the API, for example:
payload_template = "{\"currencyFrom\":\"#currencyFrom\",\"currencyTo\":\"#currencyTo\",\"type\":\"SELL\",\"amountInfo\": {\"amount\":#amountInfo,\"scale\":2}}"
currencyFrom = ['USD', 'GBP']
currencyTo = ['EUR', 'CHF']
amountInfo = [50000]

for currFrom in currencyFrom:
    for currTo in currencyTo:
        for amount in amountInfo:
            # Start from the template each time so earlier replacements don't persist
            payload = payload_template.replace('#currencyFrom', currFrom).replace('#currencyTo', currTo).replace('#amountInfo', str(amount))
            response = requests.request("POST", url, data=payload, headers=headers)
Hope this makes sense!
Edit: Updating code as per your comments.
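As an alternative sketch, the escaped placeholder string can be avoided entirely: build the body as a Python dict and serialize it with json.dumps (or pass the dict via requests' json= argument). The helper name is mine; the field names follow the sample request above.

```python
import json

def build_quote_payload(currency_from, currency_to, amount, scale=2):
    # Build the request body as a dict and let json.dumps handle quoting/escaping
    body = {
        "currencyFrom": currency_from,
        "currencyTo": currency_to,
        "type": "SELL",
        "amountInfo": {"amount": amount, "scale": scale},
    }
    return json.dumps(body)

print(build_quote_payload("SGD", "EUR", 50000))
# {"currencyFrom": "SGD", "currencyTo": "EUR", "type": "SELL", "amountInfo": {"amount": 50000, "scale": 2}}
```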

Update Sharepoint 2013 using Python3

I am currently trying to update a SharePoint 2013 list.
This is the module that I am using to accomplish that task. However, when I run the post task I receive the following error:
b'{"error":{"code":"-1, Microsoft.SharePoint.Client.InvalidClientQueryException","message":{"lang":"en-US","value":"Invalid JSON. A token was not recognized in the JSON content."}}}'
Any idea what I am doing wrong?
import requests
from requests_ntlm import HttpNtlmAuth

def update_item(sharepoint_user, sharepoint_password, ad_domain, site_url, sharepoint_listname):
    login_user = ad_domain + '\\' + sharepoint_user
    auth = HttpNtlmAuth(login_user, sharepoint_password)
    sharepoint_url = site_url + '/_api/web/'
    sharepoint_contextinfo_url = site_url + '/_api/contextinfo'
    headers = {
        'accept': 'application/json;odata=verbose',
        'content-type': 'application/json;odata=verbose',
        'odata': 'verbose',
        'X-RequestForceAuthentication': 'true'
    }
    r = requests.post(sharepoint_contextinfo_url, auth=auth, headers=headers, verify=False)
    form_digest_value = r.json()['d']['GetContextWebInformation']['FormDigestValue']
    item_id = 4991  # This id is one of the ids returned by the code above
    api_page = sharepoint_url + "lists/GetByTitle('%s')/items(%d)" % (sharepoint_listname, item_id)
    update_headers = {
        "Accept": "application/json; odata=verbose",
        "Content-Type": "application/json; odata=verbose",
        "odata": "verbose",
        "X-RequestForceAuthentication": "true",
        "X-RequestDigest": form_digest_value,
        "IF-MATCH": "*",
        "X-HTTP-Method": "MERGE"
    }
    r = requests.post(api_page, {'__metadata': {'type': 'SP.Data.TestListItem'}, 'Title': 'TestUpdated'}, auth=auth, headers=update_headers, verify=False)
    if r.status_code == 204:
        print('Updated')
    else:
        print(r.status_code)
I used your code for my scenario and fixed the problem; I had faced the same issue. I think the way the data is passed for the update is not correct.
Pass it like below:
json_data = {
    "__metadata": {"type": "SP.Data.TasksListItem"},
    "Title": "updated title from Python"
}
and pass json_data to requests like below:
r = requests.post(api_page, json.dumps(json_data), auth=auth, headers=update_headers, verify=False).text
Note: I used SP.Data.TasksListItem because that is my list's type. Use http://SharePointurl/_api/web/lists/getbytitle('name')/ListItemEntityTypeFullName to find the type.
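The underlying cause, sketched briefly: when requests is given a plain dict as the body (data=...), it form-encodes it, which is not JSON, hence SharePoint's "Invalid JSON" error. Serializing with json.dumps first sends the JSON text SharePoint expects. urlencode below approximates what requests does with a dict body:

```python
import json
from urllib.parse import urlencode

body = {'__metadata': {'type': 'SP.Data.TestListItem'}, 'Title': 'TestUpdated'}

# Roughly what requests sends for data=body (form-encoded; the nested dict is str()-ed)
print(urlencode(body))

# What SharePoint expects: valid JSON text
print(json.dumps(body))
```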
