I'm trying to scrape brand suggestions from https://keepa.com/#!finder
For example, when you type 'ni' in the "brand" field, a list of suggestions appears. It includes all brands starting with 'ni', like 'nike'.
I know that a websocket session is responsible for this. When you press a key, the page sends a compressed JSON string over that session and receives a response of the same form; in the browser's developer tools, these frames show up as binary (compressed) data.
When I try to recreate this in Python, it fails.
The following code
from websocket import create_connection
import json
import zlib

websocket_resource_url = 'wss://dyn.keepa.com/apps/cloud/?app=keepaWebsite&version=1.6'
ws = create_connection(websocket_resource_url)
msg = {"path": "pro/autocomplete", "content": "nike", "domainId": 1, "maxResults": 25,
       "type": "brand", "version": 3, "id": 111111, "user": "user"}
# compress the JSON message (plus a trailing newline) before sending,
# then decompress the binary frame that comes back
ws.send(zlib.compress((json.dumps(msg) + '\n').encode('utf-8')))
print('Result: {}'.format(zlib.decompress(ws.recv())))
returns {"status":108, "n":73}
But the expected output is something like:
{"startTimeStamp":1637576348774,"id":7596,"timeStamp":1637576348774,"status":200,"version":3,"lastUpdate":0,"suggestions":[{"value":"nike","count":1100713},{"value":"nike golf","count":7497},{"value":"nikeelando","count":358},{"value":"nikea","count":195},{"value":"nike sb","count":87}]}
What am I doing wrong?
I am not able to understand how to fetch the status of a pull request via the Python jira API.
I have gone through https://jira.readthedocs.io/en/latest/examples.html and searched the internet, but I was not able to link the jira issue with the pull request. I saw that the pull request is linked to the jira issue id, but I could not understand how to implement it.
I am using Python 3.7:
from jira import JIRA

# auth_jira is an authenticated JIRA client
issue = auth_jira.issue('XYZ-000')
pull_request = issue.id.pullrequest
I am getting this error:
AttributeError: 'str' object has no attribute 'pullrequest'
I am not sure how to access pull request data in jira.
Any leads would help.
I did something similar with another Python wrapper for the Jira API: atlassian-python-api.
See if it works in your case:
from atlassian import Jira
from pprint import pprint
import json

jira = Jira(
    url='https://your.jira.url',
    username=user,
    password=pwd)

issue = jira.get_issue(issue_key)

# get the custom field ref of the "Development" field (I don't know if it's always the same):
dev_field_string = issue["fields"]["customfield_13900"]
# the value of this field is a huge string containing a JSON document, which we must parse ourselves:
json_str = dev_field_string.split("devSummaryJson=")[1][:-1]
# load it with the json module (this ensures the JSON is converted to a dict, i.e. 'true' is interpreted as 'True'...):
devSummaryJson = json.loads(json_str)
# the values of interest are under cachedValue/summary:
dev_field_dic = devSummaryJson["cachedValue"]["summary"]
pprint(dev_field_dic)
# you can now access the status of your pull requests (actually only the last one):
print(dev_field_dic['pullrequest']['overall']['state'])
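A side note on the customfield_13900 assumption: the id of the "Development" field can differ between Jira instances. A minimal sketch for looking it up via Jira's standard /rest/api/2/field endpoint, assuming basic auth and the same url/user/pwd as above:
import requests

resp = requests.get('https://your.jira.url/rest/api/2/field', auth=(user, pwd))
# each entry has an 'id' (e.g. 'customfield_13900') and a display 'name'
for field in resp.json():
    if field['name'] == 'Development':
        print(field['id'])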
I am a beginner with the Django framework, and I am building a Django app that uses the Slack RTM API.
I have coded a program in Python that performs the OAuth authentication process like so:
import json
import requests
from websocket import create_connection  # pip install websocket-client

def initialize():
    url = "https://slack.com/api/rtm.connect"
    payload = {"token": "xxx"}
    r = requests.post(url, payload)
    res = json.loads(r.text)
    url1 = res['url']
    ws = create_connection(url1)
    return ws
My Requirement:
The stream of events I receive (from the Slack channel that my Slack app is added to) should be processed to filter out events of type 'message', match each message against a regex pattern, and store the matched string in a database.
As a standalone Python program, I am already receiving the stream of events from my channel.
My questions:
How do I integrate this code into Django so that I can fulfill this requirement?
Do I put the code in templates/views? What is the recommended way to process this stream of data?
import requests
from websocket import create_connection

def initialize():
    url = "https://slack.com/api/rtm.connect"
    r = requests.get(url, params={'token': '<YOUR TOKEN>'})
    res = r.json()
    url1 = res['url']
    # Note: create_connection comes from the websocket package
    # (pip install websocket-client); without that import this line
    # would raise a NameError.
    ws = create_connection(url1)
    return ws
If you read the API web methods, you see:
Preferred HTTP method: GET
See here: Slack rtm.connect method
Look at the comment; that is the corrected code. Note the differences between this code and yours.
Basically, to get JSON out of a response you don't need json.loads(r.text): call r.json() instead, which parses the JSON body of the response directly.
Note that r.text returns the raw text of the response, while r.json() gives you a dict from which you can read the 'url' key, as used above.
Hope this helps.
Also, could you tell us more about what you want to do with this in a view? templates is a directory that contains all the HTML files, which you don't need to work with here. But why views.py?
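As for integrating it into Django: a long-running listener like this does not belong in a view (views handle single HTTP requests). One common pattern is a custom management command that runs alongside the web process. A minimal sketch, assuming a hypothetical app named yourapp with a hypothetical Match model that has a text field:
# yourapp/management/commands/slack_listener.py  (hypothetical path)
import json
import re

import requests
from django.core.management.base import BaseCommand
from websocket import create_connection

from yourapp.models import Match  # hypothetical model


class Command(BaseCommand):
    help = "Listen to Slack RTM events and store messages matching a pattern"

    def handle(self, *args, **options):
        r = requests.get("https://slack.com/api/rtm.connect",
                         params={"token": "<YOUR TOKEN>"})
        ws = create_connection(r.json()["url"])
        pattern = re.compile(r"<your regex here>")
        while True:
            event = json.loads(ws.recv())
            # keep only 'message' events, then apply the regex
            if event.get("type") == "message":
                match = pattern.search(event.get("text", ""))
                if match:
                    Match.objects.create(text=match.group(0))
You would start it with python manage.py slack_listener, separate from the web server; the Django ORM is available there just as in views.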
I have a working app using the Imgur API with Python.
from imgurpython import ImgurClient

client_id = '5b786edf274c63c'
client_secret = 'e44e529f76cb43769d7dc15eb47d45fc3836ff2f'

client = ImgurClient(client_id, client_secret)
items = client.subreddit_gallery('rarepuppers')

for item in items:
    print(item.link)
It outputs a set of Imgur links. I want to display all of those pictures on a webpage.
Does anyone know how I can use Python to generate an HTML page that does that?
OK, I haven't tested this, but it should work. The HTML is really basic (but you never mentioned anything about formatting), so give this a shot:
from imgurpython import ImgurClient

client_id = '5b786edf274c63c'
client_secret = 'e44e529f76cb43769d7dc15eb47d45fc3836ff2f'

client = ImgurClient(client_id, client_secret)
items = client.subreddit_gallery('rarepuppers')

htmloutput = open('somename.html', 'w')
htmloutput.write("[html headers here]")
htmloutput.write("<body>")
for item in items:
    print(item.link)
    htmloutput.write('<img src="' + item.link + '">SOME DESCRIPTION<br>')
htmloutput.write("</body></html>")
htmloutput.close()
You can remove the print(item.link) statement in the code above - it's there to give you some comfort that stuff is happening.
I've made a few assumptions:
The item.link returns a string with only the unformatted link. If it's already in HTML format, then remove the HTML around item.link in the example above.
You know how to add an HTML header. Let me know if you do not.
This script will create the html file in the folder it's run from. This is probably not the greatest of ideas. So adjust the path of the created file appropriately.
And if that is a real client secret, you probably want to change it ASAP.
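On the header point above: a minimal stand-in for the "[html headers here]" placeholder could be something like this (the title is arbitrary):
htmloutput.write('<!DOCTYPE html><html><head><meta charset="utf-8">')
htmloutput.write('<title>rarepuppers</title></head>')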
I am very new to the Graph API and am basically trying to write a Python (v2.7) script which takes the userID of a Facebook user as input and returns the names/IDs of all groups/pages that have been liked by the user.
I have managed to acquire an access token with the following permissions: user_likes and user_groups. Do I need anything else?
I have used the following code so far to get a JSON dump of all the output for this access token:
import urllib
import json
import sys

accessToken = 'ACCESS_TOKEN_HERE'  # INSERT YOUR ACCESS TOKEN
userId = sys.argv[1]
limit = 100

# Request the user's feed as a JSON object
url = 'https://graph.facebook.com/' + userId + '/feed?access_token=' + accessToken + '&limit=' + str(limit)
data = json.load(urllib.urlopen(url))

print str(data)
I did get some JSON data, but I couldn't find any page/group related info in it, nor did it seem to be the most recently updated data. Why is this?
Also, what are the field names that must be tracked to detect a page or a group in the likes data? Please help!
You are using the wrong API: /feed fetches the feeds/posts of the user, not the pages/groups.
To get the groups the user has joined:
API: /{user-id}/groups
Permissions required: user_groups
To get the pages the user has liked:
API: /{user-id}/likes
Permissions required: user_likes
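For illustration, a minimal sketch of both calls using the requests library. The endpoints and permissions are the ones listed above; the 'data'/'name'/'id' keys follow the standard Graph API response envelope, and the token is a placeholder:
import requests

ACCESS_TOKEN = 'ACCESS_TOKEN_HERE'
BASE = 'https://graph.facebook.com'

def get_connection(user_id, connection):
    # connection is 'groups' or 'likes'
    url = '%s/%s/%s' % (BASE, user_id, connection)
    resp = requests.get(url, params={'access_token': ACCESS_TOKEN, 'limit': 100})
    # the Graph API wraps results in a 'data' list
    return resp.json().get('data', [])

for group in get_connection('me', 'groups'):
    print('%s %s' % (group.get('id'), group.get('name')))

for page in get_connection('me', 'likes'):
    print('%s %s' % (page.get('id'), page.get('name')))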
I am retrieving data from Facebook; the retrieved data gives me information about my Facebook friends' likes.
I have written this:
import requests  # pip install requests
import json

ACCESS_TOKEN = ''  # my Facebook access token here

base_url = 'https://graph.facebook.com/me'

# Get 10 likes for 10 friends
fields = 'id,name,friends.limit(10).fields(likes.limit(10))'

url = '%s?fields=%s&access_token=%s' % (base_url, fields, ACCESS_TOKEN,)

# This API is HTTP-based and could be requested in the browser,
# with a command line utility like curl, or using just about
# any programming language by making a request to the URL.
print url

# Interpret the response as JSON and convert back
# to Python data structures
content = requests.get(url).json()

# Pretty-print the JSON and display it
print json.dumps(content, indent=1)

############################### More Options ###############################

# Get all likes for 10 friends
fields = 'id,name,friends.limit(10).fields(likes)'

# Get all likes for 10 more friends
fields = 'id,name,friends.offset(10).limit(10).fields(likes)'

# Get 10 likes for all friends
fields = 'id,name,friends.fields(likes.limit(10))'
I have used the execfile('Filename.py') command in order to run this code in the Python interactive shell.
Now what I want to do is store the retrieved data in a text file called "likes.txt".
Does anyone have an idea how to do this?
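Since content is already a Python dict at that point, one straightforward way (a sketch matching the Python 2 code above) is to serialize it back to JSON and write it out:
# after the requests.get(url).json() call above:
with open('likes.txt', 'w') as f:
    f.write(json.dumps(content, indent=1))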