I'm trying to read this Soap API http://telemetriaws1.ana.gov.br/ServiceANA.asmx?WSDL.
When I try to call DadosHidrometeorologicosGerais function I getting this error:
File "/home/1234/.local/share/virtualenvs/data_getter-1W9NAele/lib/python3.6/site-packages/zeep/xsd/schema.py", line 570, in _get_component
return items[qname]
KeyError: <lxml.etree.QName object at 0x7f2e93fa6d00>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
zeep.exceptions.LookupError: No element 'DocumentElement' in namespace None. Available elements are: NewDataSet
All other function working ok.
Here my code:
from zeep import Client
client = Client('http://telemetriaws1.ana.gov.br/ServiceANA.asmx?WSDL')
client.service.DadosHidrometeorologicosGerais(codEstacao='86450500', dataInicio='05/05/2018', dataFim='05/05/2018')
I'm not sure, but the xml file seems generated from some data set. The "id" attribute does not match the identification of the element.
A possible solution would be to omit the zeep parsing of the XML and returning the raw data. The use the module xml.etree.ElementTree to parse it.
from zeep import Client, Settings
import xml.etree.ElementTree as ET
settings = Settings(force_https=False, raw_response=True)
WSDL = 'http://telemetriaws1.ana.gov.br/ServiceANA.asmx?WSDL'
client = Client(WSDL, settings=settings)
response = client.service.DadosHidrometeorologicosGerais(codEstacao='86450500',
dataInicio='05/05/2018',
dataFim='05/05/2018')
root = ET.fromstring(response.content)
Related
So I'm currently working on a script using Shodan in Python using import shodan.
The documentation around isn't the best so id thought I'd bring my issue here.
Using the Shodan CLI there is a command known as Shodan Info which displays the number of search credits that your account has remaining.
When looking at the Shodan Documentation i can see that there is an info() tag but whenever I use it there has been an error.
Does anyone know the correct syntax for using info in a python script to display the number of credits an account has?
Errors i have received:
ipinfo = shodan.info()
print (ipinfo)
Error
Traceback (most recent call last):
File "C:/Users/XXXX/OneDrive - XXXX/Documents/XX Documents/XXXX/Working
Segments/ShodanScraper.py", line 8, in <module>
ipinfo = shodan.info()
AttributeError: module 'shodan' has no attribute 'info'
And
ipinfo = shodan.Shodan(info())
print (ipinfo)
Error
Traceback (most recent call last):
File "C:/Users/XXXX/OneDrive - XXXX/Documents/XXXXDocuments/XXXXXX/Working
Segments/ShodanScraper.py", line 8, in <module>
ipinfo = shodan.Shodan(info())
NameError: name 'info' is not defined
You need to instantiate the Shodan class and then you will be able to use its Shodan.info() method. Here is how it looks in the official command-line interface:
https://github.com/achillean/shodan-python/blob/master/shodan/__main__.py#L364
Here's a short script that prints the current account usage information:
import pprint
import shodan
key = 'YOUR API KEY'
api = shodan.Shodan(key)
results = api.info()
pprint.pprint(results)
shodan does not express how to use info. heres a breif example of how to set up shodan info
import shodan
api = shodan.Shodan('YOUR API KEY')
info = api.host('8.8.8.8')
shodan info
Good morning, I am looking to convert html to xml using chilkat library. but it throws this error at me.
import chilkat
a = "asd asd asd asd"
xml = a.toXml()
print(xml)
Traceback (most recent call last):
File "C:\Users\acalobish\Desktop\iaa.py", line 4, in <module>
xml = a.toXml()
AttributeError: 'str' object has no attribute 'toXml'
import chilkat
htmlToXml = chilkat.CkHtmlToXml()
# Indicate the charset of the output XML we'll want.
htmlToXml.put_XmlCharset("utf-8")
success = htmlToXml.ConvertFile("test.html","out.xml")
if (success != True):
print(htmlToXml.lastErrorText())
else:
print("Success")
I am learning how to scrape web information. Below is a snippet of the actual code solution + output from datacamp.
On datacamp, this works perfectly fine, but when I try to run it on Spyder (my own macbook), it doesn't work...
This is because on datacamp, the URL has already been pre-loaded into a variable named 'response'.. however on Spyder, the URL needs to be defined again.
So, I first defined the response variable as response = requests.get('https://www.datacamp.com/courses/all') so that the code will point to datacamp's website..
My code looks like:
from scrapy.selector import Selector
import requests
response = requests.get('https://www.datacamp.com/courses/all')
this_url = response.url
this_title = response.xpath('/html/head/title/text()').extract_first()
print_url_title( this_url, this_title )
When I run this on Spyder, I got an error message
Traceback (most recent call last):
File "<ipython-input-30-6a8340fd3a71>", line 11, in <module>
this_title = response.xpath('/html/head/title/text()').extract_first()
AttributeError: 'Response' object has no attribute 'xpath'
Could someone please guide me? I would really like to know how to get this code working on Spyder.. thank you very much.
The value returned by requests.get('https://www.datacamp.com/courses/all') is a Response object, and this object has no attribute xpath, hence the error: AttributeError: 'Response' object has no attribute 'xpath'
I assume response from your tutorial source, probably has been assigned to another object (most likely the object returned by etree.HTML) and not the value returned by requests.get(url).
You can however do this:
from lxml import etree #import etree
response = requests.get('https://www.datacamp.com/courses/all') #get the Response object
tree = etree.HTML(response.text) #pass the page's source using the Response object
result = tree.xpath('/html/head/title/text()') #extract the value
print(response.url) #url
print(result) #findings
I am using sightengine API to detect child sexual abuse in images. In order to do that I am trying to detect the nudity level in an image using sightengine API. The following example is provided in the documentation itself.
from sightengine.client import SightengineClient
client = SightengineClient('api_user', 'api_key')
output = client.check('nudity').image('https://d3m9459r9kwism.cloudfront.net/img/examples/example7.jpg')
Apart from copying the same code I am getting the following error.
Traceback (most recent call last):
File "driver.py", line 3, in <module>
output = client.check('nudity').image('https://d3m9459r9kwism.cloudfront.net/img/examples/example7.jpg')
AttributeError: 'Check' object has no attribute 'image'
I have used both Python 2 and Python 3 for the same code but both raise the same error.
Try This:
from sightengine.client import SightengineClient
client = SightengineClient('API user', 'API secret')
checkNudity = client.check('nudity')
output1 = checkNudity.set_url('https://d3m9459r9kwism.cloudfront.net/img/examples/example7.jpg')
print(output1)
there. I'm building a simple scraping tool. Here's the code that I have for it.
from bs4 import BeautifulSoup
import requests
from lxml import html
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import datetime
scope = ['https://spreadsheets.google.com/feeds']
credentials = ServiceAccountCredentials.from_json_keyfile_name('Programming
4 Marketers-File-goes-here.json', scope)
site = 'http://nathanbarry.com/authority/'
hdr = {'User-Agent':'Mozilla/5.0'}
req = requests.get(site, headers=hdr)
soup = BeautifulSoup(req.content)
def getFullPrice(soup):
divs = soup.find_all('div', id='complete-package')
price = ""
for i in divs:
price = i.a
completePrice = (str(price).split('$',1)[1]).split('<', 1)[0]
return completePrice
def getVideoPrice(soup):
divs = soup.find_all('div', id='video-package')
price = ""
for i in divs:
price = i.a
videoPrice = (str(price).split('$',1)[1]).split('<', 1)[0]
return videoPrice
fullPrice = getFullPrice(soup)
videoPrice = getVideoPrice(soup)
date = datetime.date.today()
gc = gspread.authorize(credentials)
wks = gc.open("Authority Tracking").sheet1
row = len(wks.col_values(1))+1
wks.update_cell(row, 1, date)
wks.update_cell(row, 2, fullPrice)
wks.update_cell(row, 3, videoPrice)
This script runs on my local machine. But, when I deploy it as a part of an app to Heroku and try to run it, I get the following error:
Traceback (most recent call last):
File "/app/.heroku/python/lib/python3.6/site-packages/gspread/client.py", line 219, in put_feed
r = self.session.put(url, data, headers=headers)
File "/app/.heroku/python/lib/python3.6/site-packages/gspread/httpsession.py", line 82, in put
return self.request('PUT', url, params=params, data=data, **kwargs)
File "/app/.heroku/python/lib/python3.6/site-packages/gspread/httpsession.py", line 69, in request
response.status_code, response.content))
gspread.exceptions.RequestError: (400, "400: b'Invalid query parameter value for cell_id.'")
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "AuthorityScraper.py", line 44, in
wks.update_cell(row, 1, date)
File "/app/.heroku/python/lib/python3.6/site-packages/gspread/models.py", line 517, in update_cell
self.client.put_feed(uri, ElementTree.tostring(feed))
File "/app/.heroku/python/lib/python3.6/site-packages/gspread/client.py", line 221, in put_feed
if ex[0] == 403:
TypeError: 'RequestError' object does not support indexing
What do you think might be causing this error? Do you have any suggestions for how I can fix it?
There are a couple of things going on:
1) The Google Sheets API returned an error: "Invalid query parameter value for cell_id":
gspread.exceptions.RequestError: (400, "400: b'Invalid query parameter value for cell_id.'")
2) A bug in gspread caused an exception upon receipt of the error:
TypeError: 'RequestError' object does not support indexing
Python 3 removed __getitem__ from BaseException, which this gspread error handling relies on. This doesn't matter too much because it would have raised an UpdateCellError exception anyways.
My guess is that you are passing an invalid row number to update_cell. It would be helpful to add some debug logging to your script to show, for example, which row it is trying to update.
It may be better to start with a worksheet with zero rows and use append_row instead. However there does seem to be an outstanding issue in gspread with append_row, and it may actually be the same issue you are running into.
I encountered the same problem. BS4 works fine at a local machine. However, for some reason, it is way too slow in the Heroku server resulting into giving error.
I switched to lxml and it is working fine now.
Install it by command:
pip install lxml
A sample code snippet is given below:
from lxml import html
import requests
getpage = requests.get("https://url_here")
gethtmlcontent = html.fromstring(getpage.content)
data = gethtmlcontent.xpath('//div[#class = "class-name"]/text()')
#this is a sample for fetching data from the dummy div
data = data[0:n] # as per your requirement
#now inject the data into django tmeplate.