Fusion Tables importRows - Python

Has anyone used the importRows() function from the Fusion Tables API?
According to the API reference below,
https://developers.google.com/fusiontables/docs/v1/reference/table/importRows
I have to supply CSV data in the request body.
But what exactly should I put in the request body?
My code:
http = getAuthorizedHttp()
DISCOVERYURL = 'https://www.googleapis.com/discovery/v1/apis/{api}/{apiVersion}/rest'
ftable = build('fusiontables', 'v1', discoveryServiceUrl=DISCOVERYURL, http=http)
body = create_ft(CSVFILE, "title here")  # helper that loads the CSV file and builds the table definition from its columns
result = ftable.table().insert(body=body).execute()
print result["tableId"]  # good, I have the id of the newly created table
# I have no idea how to go on from here...
f = ftable.table().importRows(tableId=result["tableId"])
f.body = ?????????????
f.execute()

I finally fixed my problem; my code can be found at the following link:
https://github.com/childnotfound/parser/blob/master/uploader.py

I fixed the problem like this:
media = http.MediaFileUpload('example.csv', mimetype='application/octet-stream', resumable=True)
request = service.table().importRows(media_body=media, tableId='1cowubQ0vj_H9q3owo1vLM_gMyavvbuoNmRQaYiZV').execute()
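For completeness, a minimal sketch of the full upload flow (the table id is a placeholder and getAuthorizedHttp() stands in for your own OAuth helper; it assumes the google-api-python-client library):
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

http = getAuthorizedHttp()  # your existing OAuth helper
service = build('fusiontables', 'v1', http=http)

# The CSV file itself is the request body; MediaFileUpload streams it for you.
media = MediaFileUpload('example.csv',
                        mimetype='application/octet-stream',
                        resumable=True)
result = service.table().importRows(
    tableId='YOUR_TABLE_ID',
    media_body=media,
).execute()
print result  # the response describes how many rows were imported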

Python API call to BigQuery using cloud functions

I'm trying to build my first cloud function. It's a function that should get data from an API, transform it into a DataFrame and push it to BigQuery. I've set the cloud function up with an HTTP trigger, using validate_http as the entry point. The problem is that it states the function is working, but it doesn't actually write anything. It's a similar problem to the one discussed here: Passing data from http api to bigquery using google cloud function python
import pandas as pd
import json
import requests
from pandas.io import gbq
import pandas_gbq
import gcsfs

# function 1: Responding to and validating any HTTP request
def validate_http(request):
    request_json = request.get_json()
    if request.args:
        get_api_data()
        return f'Data pull complete'
    elif request_json:
        get_api_data()
        return f'Data pull complete'
    else:
        get_api_data()
        return f'Data pull complete'
# function 2: Get data and transform
def get_api_data():
    import pandas as pd
    import requests
    import json

    # Setting up variables with tokens
    base_url = "https://"
    token = "&token="
    token2 = "&token="
    fields = "&fields=date,id,shippingAddress,items"
    date_filter = "&filter=date in '2022-01-22'"
    data_limit = "&limit=99999999"

    # Performing API call on request with variables
    def main_requests(base_url, token, fields, date_filter, data_limit):
        req = requests.get(base_url + token + fields + date_filter + data_limit)
        return req.json()

    # Making API call and storing in data
    data = main_requests(base_url, token, fields, date_filter, data_limit)

    # Transforming the data
    df = pd.json_normalize(data['orders']).explode('items').reset_index(drop=True)
    items = df['items'].agg(pd.Series)[['id', 'itemNumber', 'colorNumber', 'amount', 'size', 'quantity', 'quantityReturned']]
    df = df.drop(columns=['items', 'shippingAddress.id', 'shippingAddress.housenumber', 'shippingAddress.housenumberExtension', 'shippingAddress.address2', 'shippingAddress.name', 'shippingAddress.companyName', 'shippingAddress.street', 'shippingAddress.postalcode', 'shippingAddress.city', 'shippingAddress.county', 'shippingAddress.countryId', 'shippingAddress.email', 'shippingAddress.phone'])
    df = df.rename(columns={
        'date': 'Date',
        'shippingAddress.countryIso': 'Country',
        'id': 'order_id'})
    df = pd.concat([df, items], axis=1, join='inner')

    # Push data function
    bq_load('Return_data_api', df)

# function 3: Convert to BigQuery table
def bq_load(key, value):
    project_name = '375215'
    dataset_name = 'Returns'
    table_name = key
    value.to_gbq(destination_table='{}.{}'.format(dataset_name, table_name), project_id=project_name, if_exists='replace')
The problem is that the script doesn't write to BigQuery and doesn't return any error. I know that the get_api_data() function works, since I tested it locally and it does write to BigQuery. Using Cloud Functions I can't seem to trigger this function and make it write data to BigQuery.
There are a couple of things wrong with the code that would set you right.
You have list data, so store it as a CSV file (in preference to JSON).
This would mean updating (and probably renaming) the JsonArrayStore class and its methods to work with CSV.
Once you have completed the above and written well-formed CSV, you can proceed to this:
Reading the CSV in the del_btn method would then look like this:
import csv
import tkinter as tk

class ToDoGUI(tk.Tk):
    ...
    # methods
    ...
    def del_btn(self):
        a = JsonArrayStore('test1.csv')
        # read to list
        with open('test1.csv') as csvfile:
            reader = csv.reader(csvfile)
            data = list(reader)
        print(data)
Good work. You have a lot to do; if you get stuck further, please post again.
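As a generic illustration of the first step (writing the list data out as well-formed CSV), here is a minimal sketch using the standard csv module; the rows and filename are placeholders, not the JsonArrayStore implementation:
import csv

# Placeholder data: each inner list is one row of the to-do list.
rows = [['task', 'done'],
        ['buy milk', 'no'],
        ['write report', 'yes']]

# newline='' avoids blank lines between rows on Windows.
with open('test1.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(rows)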

Extract table data and put it into a dictionary with Azure Form Recognizer

I have searched for questions related to mine but found none.
Below is the working code I have tried:
import json
from azure.core.exceptions import ResourceNotFoundError
from azure.ai.formrecognizer import FormRecognizerClient, FormTrainingClient
from azure.core.credentials import AzureKeyCredential

credentials = json.load(open("creds.json"))
API_KEY = credentials["API_KEY"]
ENDPOINT = credentials["ENDPOINT"]

url = "https://some_pdf_url_which_contains_tables.pdf"  # or an image url which contains a table

form_recognizer_client = FormRecognizerClient(ENDPOINT, AzureKeyCredential(API_KEY))
poller = form_recognizer_client.begin_recognize_content_from_url(url)
form_data = poller.result()

for page in form_data:
    for table in page.tables:
        for cell in table.cells:
            print(cell.text)

## But I need the table in dictionary format, with header names as keys and
## the cell values as values.
I hope I get some help. Thank you.
According to the Python Azure Form Recognizer documentation,
you can use the to_dict method:
result_table = form_data[0].tables[0].to_dict()
And then you can loop over the dictionary.
I hope it helps you!
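If you want the header-to-values mapping directly, here is a minimal sketch, assuming the cells expose row_index, column_index and text as in the v3 SDK and that the first row of the table contains the headers:
# Build {header: [column values]} for the first table on the first page.
table = form_data[0].tables[0]

# Index the cells by (row, column) position.
grid = {(cell.row_index, cell.column_index): cell.text for cell in table.cells}

headers = [grid.get((0, col), '') for col in range(table.column_count)]
table_dict = {
    header: [grid.get((row, col), '') for row in range(1, table.row_count)]
    for col, header in enumerate(headers)
}
print(table_dict)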

Handling JSON in Python 3.7

I am using Python 3.7 and I am trying to handle some JSON data that I receive back from a website. A sample of the JSON response is below, but it can vary in length. In essence, it returns details about 'officers'; in the example below, there is data for two officers. This uses the OpenCorporates API.
{"api_version":"0.4","results":{"page":1,"per_page":30,"total_pages":1,"total_count":2,"officers":[{"officer":{"id":212927580,"uid":null,"name":"NEIL KIDMAN","jurisdiction_code":"gb","position":"director","retrieved_at":"2015-12-04T00:00:00+00:00","opencorporates_url":"https://opencorporates.com/officers/212927580","start_date":"2015-01-28","end_date":null,"occupation":"SERVICE MANAGER","current_status":null,"inactive":false,"company":{"name":"GRSS LIMITED","jurisdiction_code":"gb","company_number":"09411531","opencorporates_url":"https://opencorporates.com/companies/gb/09411531"}}},{"officer":{"id":190031476,"uid":null,"name":"NEIL KIDMAN","jurisdiction_code":"gb","position":"director","retrieved_at":"2015-12-04T00:00:00+00:00","opencorporates_url":"https://opencorporates.com/officers/190031476","start_date":"2002-05-17","end_date":null,"occupation":"COMPANY DIRECTOR","current_status":null,"inactive":false,"company":{"name":"GILBERT ROAD SERVICE STATION LIMITED","jurisdiction_code":"gb","company_number":"04441363","opencorporates_url":"https://opencorporates.com/companies/gb/04441363"}}}]}}
My code so far is:
response = requests.get(url)
response.raise_for_status()
jsonResponse = response.json()
officerDetails = jsonResponse['results']['officers']
This works well, but my ultimate goal is to create variables and write them to a .csv. So I'd like to write something like:
name = jsonResponse['results']['officers']['name']
position = jsonResponse['results']['officers']['position']
companyName = jsonResponse['results']['officers']['company']['name']
Any suggestions on how I could do this? As said, I'd like to loop through each 'officer' in the JSON response, capture these values and write them to a .csv (I will tackle the .csv part once I have them assigned to variables).
officers = jsonResponse['results']['officers']
res = []
for officer in officers:
    data = {}
    data['name'] = officer['officer']['name']
    data['position'] = officer['officer']['position']
    data['company_name'] = officer['officer']['company']['name']
    res.append(data)
You can then go ahead and write res, which is a list of dictionaries, to a CSV file.
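For the CSV step afterwards, a minimal sketch using csv.DictWriter (the output filename is a placeholder):
import csv

with open('officers.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['name', 'position', 'company_name'])
    writer.writeheader()
    writer.writerows(res)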

From local Python function to Google Cloud Function

I created a Python function for an API call so I no longer have to do that in Power BI. It creates 5 XML files that are then combined into a single CSV file. I would like the function to run on Google Cloud (correct me if this is not a good idea).
I don't think it's possible to create XML files in the function (maybe it's possible to write to a bucket), but ideally I would like to skip the XML file creation and go straight to creating the CSV.
Please find the code for generating the XML files and combining them into a CSV below:
offices = ['NL001', 'NL002', 'NL003', 'NL004', 'NL005']
# Log in for each office, switch office and create a separate xml
for office in offices:
    xmlfilename = office + '.xml'
    session.service.SelectCompany(office, _soapheaders={'Header': auth_header})
    proces_url = cluster + r'/webservices/processxml.asmx?wsdl'
    proces = Client(proces_url)
    response = proces.service.ProcessXmlString(query.XML_String, _soapheaders={'Header': auth_header})
    f = open(xmlfilename, 'w')
    f.write(response)
    f.close()
To CSV:
if os.path.exists('CombinedFinance.csv'):
    os.remove('CombinedFinance.csv')
else:
    print("The file does not exist")

xmlfiles = ['NL001.xml', 'NL002.xml', 'NL003.xml', 'NL004.xml', 'NL005.xml']

for xmlfile in xmlfiles:
    with open(xmlfile, encoding='windows-1252') as xml_toparse:
        tree = ET.parse(xml_toparse)
        root = tree.getroot()
        columns = [element.attrib['label'] for element in root[0]]
        columns.append('?')
        data = [[field.text for field in row] for row in root[1::]]
        df = pd.DataFrame(data, columns=columns)
        df = df.drop('?', axis=1)
        df.to_csv('CombinedFinance.csv', mode='a', header=not os.path.exists('CombinedFinance.csv'))
Any ideas?
N.B. If I can improve my code, please let me know; I'm just learning all of this.
EDIT: In response to some comments, the code now looks like this. When deploying to the cloud I get the following error:
ERROR: (gcloud.functions.deploy) OperationError: code=13, message=Function deployment failed due to a health check failure. This usually indicates that your code was built successfully but failed during a test execution. Examine the logs to determine the cause. Try deploying again in a few minutes if it appears to be transient.
My requirements.txt looks like this:
zeep==3.4.0
pandas
Any ideas?
import pandas as pd
import xml.etree.ElementTree as ET
from zeep import Client
import query
import authentication
import os

sessionlogin = r'https://login.twinfield.com/webservices/session.asmx?wsdl'
login = Client(sessionlogin)
auth = login.service.Logon(authentication.username, authentication.password, authentication.organisation)
auth_header = auth['header']['Header']
cluster = auth['body']['cluster']

# Use cluster to create a session:
url_session = cluster + r'/webservices/session.asmx?wsdl'
session = Client(url_session)

# Select a company for the session:
offices = ['NL001', 'NL002', 'NL003', 'NL004', 'NL005']

# Log in for each office, switch office and create a separate xml
for office in offices:
    session.service.SelectCompany(office, _soapheaders={'Header': auth_header})
    proces_url = cluster + r'/webservices/processxml.asmx?wsdl'
    proces = Client(proces_url)
    response = proces.service.ProcessXmlString(query.XML_String, _soapheaders={'Header': auth_header})
    treetje = ET.ElementTree(ET.fromstring(response))
    root = treetje.getroot()
    columns = [element.attrib['label'] for element in root[0]]
    columns.append('?')
    data = [[field.text for field in row] for row in root[1::]]
    df = pd.DataFrame(data, columns=columns)
    df = df.drop('?', axis=1)
    df.to_csv('/tmp/CombinedFinance.csv', mode='a', header=not os.path.exists('/tmp/CombinedFinance.csv'))
A few things to consider about turning a regular Python script (what you have here) into a Cloud Function:
Cloud Functions respond to events -- either an HTTP request or some other background trigger. You should think about the question "what is going to trigger my function?"
HTTP functions take in a request argument that corresponds to the incoming request, and must return some sort of HTTP response
The only available part of the filesystem that you can write to is /tmp. You'll have to write all files there during the execution of your function
The filesystem is ephemeral. You can't expect files to stick around between invocations. Any file you create must either be stored elsewhere (like in a GCS bucket) or returned in the HTTP response (if it's an HTTP function)
A Cloud Function has a very specific signature that you'll need to wrap your existing business logic in:
def my_http_function(request):
    # business logic here
    ...
    return "This is the response", 200

def my_background_function(event, context):
    # business logic here
    ...
    # No return necessary
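Putting those points together, a minimal sketch of an HTTP-triggered wrapper; the bucket name is a placeholder, build_csv() is a hypothetical helper wrapping your XML-to-CSV loop above, and it assumes google-cloud-storage is added to requirements.txt:
from google.cloud import storage

def export_finance(request):
    # Run the existing XML-to-CSV logic, writing only under /tmp (the writable filesystem).
    build_csv('/tmp/CombinedFinance.csv')  # hypothetical helper wrapping the loop above

    # Persist the result outside the ephemeral filesystem, e.g. in a GCS bucket.
    client = storage.Client()
    bucket = client.bucket('my-finance-exports')  # placeholder bucket name
    blob = bucket.blob('CombinedFinance.csv')
    blob.upload_from_filename('/tmp/CombinedFinance.csv')

    return 'CSV written to gs://my-finance-exports/CombinedFinance.csv', 200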

Search through JSON query from Valve API in Python

I am looking to find various statistics about players in games such as CS:GO from the Steam Web API, but cannot work out how to search through the JSON returned from the query (e.g. here) in Python.
I just need to be able to get a specific part of the list that is provided, e.g. finding total_kills from the link above. If I had a way to sort through all of the information provided and filter it down to just that specific thing (in this case total_kills), that would help a load!
The code I have at the moment to turn it into something Python can read is:
url = "http://api.steampowered.com/IPlayerService/GetOwnedGames/v0001/?key=FE3C600EB76959F47F80C707467108F2&steamid=76561198185148697&include_appinfo=1"
data = requests.get(url).text
data = json.loads(data)
If you are looking for a way to search through the stats list then try this:
import requests
import json

def findstat(data, stat_name):
    for stat in data['playerstats']['stats']:
        if stat['name'] == stat_name:
            return stat['value']

url = "http://api.steampowered.com/ISteamUserStats/GetUserStatsForGame/v0002/?appid=730&key=FE3C600EB76959F47F80C707467108F2&steamid=76561198185148697"
data = requests.get(url).text
data = json.loads(data)

total_kills = findstat(data, 'total_kills')  # change 'total_kills' to your desired stat name
print(total_kills)
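If you need several stats at once, a small variant that builds a name-to-value dictionary from the same response (stat names vary by game, so 'total_deaths' here is just an example):
stats = {stat['name']: stat['value'] for stat in data['playerstats']['stats']}
print(stats.get('total_kills'))
print(stats.get('total_deaths'))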
