This is my second time posting this problem.
I need to check whether a URL is flagged as unsafe by Google using pysafebrowsing, a Python package, but I get a different result compared with the original source:
https://transparencyreport.google.com/safe-browsing/search
My script is:
from pysafebrowsing import SafeBrowsing

s = SafeBrowsing(KEY)  # KEY holds my Safe Browsing API key
r = s.lookup_urls([domain])  # domain holds the URL to check
print(r)
If you have any idea, please help me!
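For reference, a minimal sketch of how the result can be read, assuming lookup_urls returns a dict keyed by URL with a boolean 'malicious' field plus threat details, as shown in the package's README (the key and URL below are placeholders):

from pysafebrowsing import SafeBrowsing

s = SafeBrowsing('YOUR_API_KEY')  # placeholder key
r = s.lookup_urls(['http://example.com'])  # placeholder URL
for checked_url, verdict in r.items():
    # A clean URL is expected to come back as {'malicious': False};
    # flagged URLs carry extra 'threats'/'platforms' details.
    print(checked_url, verdict.get('malicious'))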
I'm trying to code a little "GPS", but I couldn't use the Google API because of the daily request limit.
I decided to use the site "viamichelin", which gives me the distance between two addresses. I wrote a little script to build all the URLs I needed, like this:
import pandas
import numpy as np

df = pandas.read_excel(r'C:\Users\Bibi\Downloads\memoire\memoire.xlsx', sheet_name='Clients')
df2 = pandas.read_excel(r'C:\Users\Bibi\Downloads\memoire\memoire.xlsx', sheet_name='Agences')
matrix = df.as_matrix(columns=None)
clients = np.squeeze(np.asarray(matrix))
matrix2 = df2.as_matrix(columns=None)
agences = np.squeeze(np.asarray(matrix2))
compteagences = 0
comptetotal = 0
for j in agences:
    compteclients = 0
    for i in clients:
        print agences[compteagences]
        print clients[compteclients]
        url = 'https://fr.viamichelin.be/web/Itineraires?departure=' + agences[compteagences] + '&arrival=' + clients[compteclients] + '&arrivalId=34MTE1MnJ5ZmQwMDMzb3YxMDU1ZDFvbGNOVEF1TlRVNU5UUT1jTlM0M01qa3lOZz09Y05UQXVOVFl4TlE9PWNOUzQzTXpFNU5nPT1jTlRBdU5UVTVOVFE9Y05TNDNNamt5Tmc9PTBqUnVlIEZvbmQgZGVzIEhhbGxlcw==&index=0&vehicle=0&type=0&distance=km&currency=EUR&highway=false&toll=false&vignette=false&orc=false&crossing=true&caravan=false&shouldUseTraffic=false&withBreaks=false&break_frequency=7200&coffee_duration=1200&lunch_duration=3600&diner_duration=3600&night_duration=32400&car=hatchback&fuel=petrol&fuelCost=1.393&allowance=0&corridor=&departureDate=&arrivalDate=&fuelConsumption='
        print url
        compteclients += 1
        comptetotal += 1
    compteagences += 1
All my data is in Excel, which is why I used the pandas library. I now have all the URLs needed for my project.
However, I would like to extract the number of kilometres, and there's a little problem: the information I need is not in the page's source code, so I can't extract it with Python... The site is presented like this:
[screenshot of the Michelin site]
When I click "Inspect" I can find the information I need (on the left), but not in the page source (on the right)... Can someone help me?
[screenshot: Itinerary]
I have already tried this, without success:
import os
import csv
import requests
from bs4 import BeautifulSoup
requete = requests.get("https://fr.viamichelin.be/web/Itineraires?departure=Rue%20Lebeau%2C%20Liege%2C%20Belgique&departureId=34MTE1Mmc2NzQwMDM0NHoxMDU1ZW44d2NOVEF1TmpNek5ERT1jTlM0MU5qazJPQT09Y05UQXVOak16TkRFPWNOUzQxTnpBM01nPT1jTlRBdU5qTXpOREU9Y05TNDFOekEzTWc9PTBhUnVlIExlYmVhdQ==&arrival=Rue%20Rys%20De%20Mosbeux%2C%20Trooz%2C%20Belgique&arrivalId=34MTE1MnJ5ZmQwMDMzb3YxMDU1ZDFvbGNOVEF1TlRVNU5UUT1jTlM0M01qa3lOZz09Y05UQXVOVFl4TlE9PWNOUzQzTXpFNU5nPT1jTlRBdU5UVTVOVFE9Y05TNDNNamt5Tmc9PTBqUnVlIEZvbmQgZGVzIEhhbGxlcw==&index=0&vehicle=0&type=0&distance=km&currency=EUR&highway=false&toll=false&vignette=false&orc=false&crossing=true&caravan=false&shouldUseTraffic=false&withBreaks=false&break_frequency=7200&coffee_duration=1200&lunch_duration=3600&diner_duration=3600&night_duration=32400&car=hatchback&fuel=petrol&fuelCost=1.393&allowance=0&corridor=&departureDate=&arrivalDate=&fuelConsumption=")
page = requete.content
soup = BeautifulSoup(page, "html.parser")
print soup
Looking at the inspector for the page, the actual routing is done via a JavaScript invocation to this rather long URL.
The data you need seems to be in that response, starting from _scriptLoaded(. (Since it's a JavaScript object literal, you can use Python's built-in JSON library to load the data into a dict.)
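A minimal sketch of that parsing step, assuming the response body is a JSONP-style wrapper _scriptLoaded({...}) around a JSON object (the routing URL below is a placeholder; copy the real one from the network inspector):

import json
import requests

# Placeholder: paste the long routing URL from the network inspector here.
routing_url = 'https://fr.viamichelin.be/...'

body = requests.get(routing_url).text
# Strip the _scriptLoaded( ... ) wrapper and parse the JSON payload inside.
payload = body[body.index('(') + 1:body.rindex(')')]
data = json.loads(payload)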
import json
from os import path, makedirs

_default_dir = path.expanduser('~/.config/gspread_pandas')
_default_file = 'google_secret.json'

def ensure_path(pth):
    if not path.exists(pth):
        makedirs(pth)
Hi, I'm currently collecting data via Selenium, parsing and editing it with pandas, and sending it to a Google spreadsheet.
However, the gspread-pandas module needs the google_secret.json file to be placed in '~/.config/gspread_pandas', which is a fixed location (see the conf.py snippet above), as described in the link below:
https://pypi.python.org/pypi/gspread-pandas/0.15.1
I want to write a function that sets a custom location, so the app environment is self-contained.
For example, I want to put the file here:
default_folder = os.getcwd()
so that default_folder is where my project is located (the same folder).
What can I do to achieve this?
If you look at the source https://github.com/aiguofer/gspread-pandas/blob/master/gspread_pandas/conf.py you can see that you can create your own config and pass it to the Spread object's constructor.
But yes, this part is really badly documented.
So, this code works well for me:
from gspread_pandas import Spread, conf
c = conf.get_config('[Your Path]', '[Your filename]')
spread = Spread('username', 'spreadname', config=c)
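Applied to the question above, a sketch that reads the secret file from the project directory instead (assuming google_secret.json sits next to the script):

import os
from gspread_pandas import Spread, conf

# Load google_secret.json from the current working directory
# rather than the fixed ~/.config/gspread_pandas location.
c = conf.get_config(os.getcwd(), 'google_secret.json')
spread = Spread('username', 'spreadname', config=c)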
Thank you for this. It really should be documented better. I was getting so frustrated trying to get this to work on Heroku, but it worked perfectly. I had to change it to the following:
import gspread_pandas

c = gspread_pandas.conf.get_config('/app/', 'google_secret.json')
spread = gspread_pandas.Spread('google_sheet_key_here_that_is_a_long_string', config=c)
I'm trying to retrieve an image that is returned from a given URL using Python, for example this one:
http://fundamentus.com.br/graficos3.php?codcvm=2453&tipo=108
I am trying to do this with urllib's urlretrieve method:
import urllib
urlStr = "http://fundamentus.com.br/graficos3.php?codcvm=2453&tipo=108"
filename = "image.png"
urllib.urlretrieve(urlStr,filename)
I have already used this for other URLs (such as http://chart.finance.yahoo.com/z?s=CMIG4.SA&t=9m), but for the first one it's not working.
Does anyone have an idea how to make this work for the given URL?
Note: I'm using Python 2.7
You need to use a session, which you can do with requests:
import requests

with requests.Session() as s:
    # Hit the page that sets the session cookies first.
    s.get("http://fundamentus.com.br/graficos.php?papel=CMIG4&tipo=2")
    with open("out.png", "wb") as f:
        # The image request now carries those cookies.
        f.write(s.get("http://fundamentus.com.br/graficos3.php?codcvm=2453&tipo=108").content)
It works in your browser because you had already visited the initial page where the image appears, so any necessary cookies were set.
While more verbose than @PadraicCunningham's response, this should also do the trick. I'd run into a similar problem (the host would only support certain browsers), so I had to start using urllib2 instead of just urllib. It's pretty powerful, and it's a module that comes with Python.
Basically, you capture all the information you need during your initial request and add it to your next and subsequent requests. The requests module seems to do pretty much all of this for you behind the scenes. If only I'd known about that all these years...
import urllib2

urlForCookie = 'http://fundamentus.com.br/graficos.php?papel=CMIG4&tipo=2'
urlForImage = 'http://fundamentus.com.br/graficos3.php?codcvm=2453&tipo=108'

# First request: capture the session cookie the site sets.
initialRequest = urllib2.Request(urlForCookie)
siteCookie = urllib2.urlopen(initialRequest).headers.get('Set-Cookie')

# Second request: send that cookie back when fetching the image.
imageReq = urllib2.Request(urlForImage)
imageReq.add_header('Cookie', siteCookie)

with open("image2.png", 'wb') as f:
    f.write(urllib2.urlopen(imageReq).read())
I am writing a Python script using python-bugzilla 1.1.0 from PyPI. I can get all the bug IDs, but I want to know if there is a way to access each bug's XML page. Here is the code I have so far:
import sys
import bugzilla

bz = bugzilla.Bugzilla(url='https://bugzilla.mycompany.com/xmlrpc.cgi')
try:
    bz.login('name@email.com', 'password')
    print 'Authorization cookie received.'
except bugzilla.BugzillaError:
    print(str(sys.exc_info()[1]))
    sys.exit(1)

# Get all the bug IDs and display them
bugs = bz.query(bz.build_query(assigned_to="your-bugzilla-account"))
for bug in bugs:
    print bug.id
I don't know how to access each bug's XML page, and I'm not sure it is even possible. Can anyone help me with this? Thanks.
bz.getbugs() will get all the bugs; bz.getbugssimple is also worth a look.
#!/usr/bin/env python
import bugzilla

bz = bugzilla.Bugzilla(url='https://bugzilla.company.com/xmlrpc.cgi')
bz.login('username@company.com', 'password')

# queryUrl is a saved Bugzilla query URL copied from the web UI.
results = bz.query(bz.url_to_query(queryUrl))
bids = []
for b in results:
    bids.append(b.id)
print bids
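As for the XML pages: classic Bugzilla serves each bug as XML from show_bug.cgi?ctype=xml&id=<bug id>. A minimal sketch, assuming that endpoint is enabled on your instance and reusing the bids list built above:

import requests

base = 'https://bugzilla.company.com'
for bid in bids:
    # show_bug.cgi with ctype=xml returns the bug rendered as XML.
    resp = requests.get(base + '/show_bug.cgi', params={'ctype': 'xml', 'id': bid})
    print(resp.text)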
I have been trying to figure this out for a while now and just don't seem to be able to break through, so hopefully someone out there has done this before.
My issue is that I am trying to do a batch update of a Google spreadsheet using the gdata Python client libraries, authenticating via OAuth2. I found an example of how to do the batch update using the gdata.spreadsheet.service module here: https://code.google.com/p/gdata-python-client/wiki/UsingBatchOperations
However, that does not seem to work when authenticating via OAuth2, so I am having to use the gdata.spreadsheets.client module instead, as discussed in this post: https://code.google.com/p/gdata-python-client/issues/detail?id=549
Using the gdata.spreadsheets.client module works for authentication and for updating the sheet, but batch commands do not seem to work. Below is my latest variation of the code, which is about the closest I have got. It seems to run, but the sheet is not updated, and the batch_status returned is: 'Insert not supported on batch.' (Note: I did try modifying the batch_operation and batch_id parameters of the CellEntries in the commented-out code, but this did not work either.)
Thanks for any help you can provide.
import gdata
import gdata.gauth
import gdata.service
import gdata.spreadsheets
import gdata.spreadsheets.client
import gdata.spreadsheets.data

token = gdata.gauth.OAuth2Token(client_id=Client_id, client_secret=Client_secret, scope=Scope,
                                access_token=ACCESS_TOKEN, refresh_token=REFRESH_TOKEN,
                                user_agent=User_agent)
client = gdata.spreadsheets.client.SpreadsheetsClient()
token.authorize(client)

range = "D6:D13"
cellq = gdata.spreadsheets.client.CellQuery(range=range, return_empty='true')
cells = client.GetCells(file_id, 'od6', q=cellq)

objData = gdata.spreadsheets.data
batch = objData.BuildBatchCellsUpdate(file_id, 'od6')
n = 1
for cell in cells.entry:
    cell.cell.input_value = str(n)
    batch.add_batch_entry(cell, cell.id.text, batch_id_string=cell.title.text,
                          operation_string='update')
    n = n + 1
client.batch(batch, force=True)