I need to do an exploratory analysis using Python over two tables that are in a Google BigQuery database.
The only thing I was provided is a JSON file containing some credentials.
How can I access the data using this JSON file?
This is the first time I've tried something like this, so I have no idea how to do it.
I tried reading various tutorials and documentation, but nothing worked.
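A minimal sketch of one way to do this, assuming the JSON file is a service-account key and that google-cloud-bigquery (with its pandas extra) is installed; the file path and table name below are placeholders:

from google.cloud import bigquery
from google.oauth2 import service_account

# Load the service-account key you were given (path is a placeholder).
credentials = service_account.Credentials.from_service_account_file("my-credentials.json")

# The key file normally contains the project ID as well.
client = bigquery.Client(credentials=credentials, project=credentials.project_id)

# Pull a sample of one table into a pandas DataFrame for exploration.
query = "SELECT * FROM `my_project.my_dataset.my_table` LIMIT 1000"  # placeholder table
df = client.query(query).to_dataframe()
print(df.head())

From there you can explore the DataFrame with the usual pandas tools.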
Related
Suppose I have a list of APIs like the following...
https://developer.genesys.cloud/devapps/api-explorer#get-api-v2-alerting-alerts-active
https://developer.genesys.cloud/devapps/api-explorer#get-api-v2-alerting-interactionstats-rules
https://developer.genesys.cloud/devapps/api-explorer#get-api-v2-analytics-conversations-details
I want to read these APIs one by one and store the data in Snowflake using pandas and SQLAlchemy.
Do you have any ideas for reading the APIs one by one in my Python script?
- Read the APIs one by one from a file.
- Load the data into a Snowflake table directly.
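One possible sketch, assuming the actual REST endpoint URLs (not the API Explorer links) are listed one per line in a file, that each response is JSON with an "entities" list as Genesys Cloud responses typically are, and that snowflake-sqlalchemy is installed; the file name, connection string, access token, and table name are placeholders:

import pandas as pd
import requests
from sqlalchemy import create_engine

# Placeholder Snowflake connection string; fill in your own account details.
engine = create_engine(
    "snowflake://<user>:<password>@<account>/<database>/<schema>?warehouse=<warehouse>"
)

# Read the endpoint URLs one by one from a file.
with open("api_urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

for url in urls:
    resp = requests.get(url, headers={"Authorization": "Bearer <access_token>"})
    resp.raise_for_status()
    payload = resp.json()
    # Assumes the records of interest sit under an "entities" key; adjust per endpoint.
    df = pd.json_normalize(payload.get("entities", payload))
    # Appends each endpoint's rows to one table; use separate tables if the schemas differ.
    df.to_sql("genesys_raw", engine, if_exists="append", index=False)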
Does anyone know if it's possible to take the contents of a BigQuery table and append that data to a Google Sheet using Airflow? I have looked through the docs and can only find this: https://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/transfer/sql_to_sheets.html
Firstly, I'm not sure what a database_conn_id is. And I'm not sure whether this supports any Google Sheets validations, such as drop-downs via data validation, or append operations.
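For what it's worth, a rough sketch of that operator in a DAG might look like the following. The assumption here is that the database connection ID is simply the ID of an Airflow Connection pointing at the source database (e.g. a BigQuery connection); the connection IDs, query, and spreadsheet ID are placeholders, and parameter names can vary between provider versions, so check the docs for the version you have installed:

from datetime import datetime

from airflow import DAG
from airflow.providers.google.suite.transfers.sql_to_sheets import SQLToGoogleSheetsOperator

# Airflow 2.x style DAG definition; names and IDs below are placeholders.
with DAG("bq_to_sheets", start_date=datetime(2024, 1, 1), schedule_interval=None) as dag:
    upload = SQLToGoogleSheetsOperator(
        task_id="upload_bq_to_sheet",
        sql="SELECT * FROM `my_project.my_dataset.my_table`",  # placeholder query
        sql_conn_id="my_bigquery_conn",      # Airflow Connection for the source database
        spreadsheet_id="my_spreadsheet_id",  # taken from the sheet's URL
        gcp_conn_id="google_cloud_default",
    )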
I'm trying to piece together the code required to run a query on a Hive/HDFS database (i.e. the same query I could run in Hive or Impala, using Zeppelin or Hue), then upload the results of that query to a REST API URL. I'm a very experienced developer but new to Python, dataframes, Spark, HDFS, etc.
I've got my SQL query that returns the correct data (e.g. using Impala or Hive).
I've got Python code that will connect to a REST API endpoint for upload:
import requests
x = requests.post(url, data=my_data)
I know that the Python pandas library can save out CSV: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html#pandas.DataFrame.to_csv
I'm not sure how to get Python to run the query though, and what else I might be missing here...
The execution environment is Python or PySpark running in Apache Zeppelin; the table is in Hadoop/HDFS.
Apologies if I'm misusing terms here, just trying to get my head around this :)
Thanks
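In case it helps, here is a rough sketch of how those pieces could fit together in a %pyspark paragraph in Zeppelin, where the spark session is already provided; the query, table name, and URL are placeholders:

import requests

# Run the same SQL you would run in Hive/Impala; Spark reads the table via the Hive metastore.
df = spark.sql("SELECT * FROM my_db.my_table WHERE load_date = '2024-01-01'")  # placeholder query

# Bring the result back to the driver as pandas and serialize it to CSV in memory.
# Fine for modest result sets; very large results would need a different approach.
csv_payload = df.toPandas().to_csv(index=False)

# Upload the CSV text to the REST endpoint.
url = "https://example.com/api/upload"  # placeholder endpoint
response = requests.post(url, data=csv_payload, headers={"Content-Type": "text/csv"})
response.raise_for_status()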
I am trying to find a way to automatically update a BigQuery table using this link: https://www6.sos.state.oh.us/ords/f?p=VOTERFTP:DOWNLOAD::FILE:NO:2:P2_PRODUCT_NUMBER:1
This link is updated with new data every week, and I want to be able to replace the BigQuery table with that new data. I have researched that you can export spreadsheets to BigQuery, but that is not a streamlined approach.
How would I go about setting up a script that imports the data and feeds it to BigQuery on a schedule?
I assume you already have a working script that parses the content of the URL and places the contents in BigQuery. Based on that I would recommend the following workflow:
Upload the script as a Google Cloud Function. If your script isn't written in a compatible language (e.g. Python, Node, Go), you can use Google Cloud Run instead. Set the Cloud Function to be triggered by a Pub/Sub message. In this scenario, the content of your Pub/Sub message doesn't matter.
Set up a Google Cloud Scheduler job to (a) run at 12am every Saturday (or whatever time you wish) and (b) send a dummy message to the Pub/Sub topic that your Cloud Function is subscribed to.
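For reference, a Pub/Sub-triggered Cloud Function (first-generation Python runtime) only needs an entry point like the sketch below; the function and helper names are placeholders, and the message payload is ignored as noted above:

# main.py -- entry point for a Pub/Sub-triggered Cloud Function (placeholder names).
def weekly_import(event, context):
    """Triggered by the Cloud Scheduler -> Pub/Sub message; the payload is ignored."""
    run_import()  # call your existing parse-and-load-to-BigQuery logic here

def run_import():
    # Your existing script goes here (download the file, load it into BigQuery).
    ...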
You can try making an HTTP request to the page using a language like Python with the Requests library, saving the data into a pandas DataFrame or a CSV file, and then using the BigQuery client library to push that data into a BigQuery table.
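A sketch of that approach, assuming the link serves a CSV-style file and the destination table should be fully replaced each week; the dataset and table names are placeholders, and pandas plus pyarrow need to be installed for the DataFrame load:

import io

import pandas as pd
import requests
from google.cloud import bigquery

URL = "https://www6.sos.state.oh.us/ords/f?p=VOTERFTP:DOWNLOAD::FILE:NO:2:P2_PRODUCT_NUMBER:1"

# Download the weekly file; assumes it parses as CSV.
resp = requests.get(URL)
resp.raise_for_status()
df = pd.read_csv(io.StringIO(resp.text))

# Replace the existing table contents with the fresh data.
client = bigquery.Client()
job_config = bigquery.LoadJobConfig(write_disposition="WRITE_TRUNCATE")
client.load_table_from_dataframe(df, "my_dataset.voter_file", job_config=job_config).result()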
I need to be able to upload an Excel or CSV file to App Engine so that the server can process the rows and create objects. Can anyone provide or point me to an example of how this is done? Thanks for your help.
Uploading to the Blobstore is probably what you are after; then read the data back and process it with the csv module.
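A minimal sketch of the reading side, assuming you already have the BlobKey from a Blobstore upload handler (this is the old Python 2 App Engine SDK):

import csv

from google.appengine.ext import blobstore

def process_csv(blob_key):
    # BlobReader exposes the uploaded blob as a file-like object, so the csv
    # module can stream it row by row without loading the whole file into memory.
    for row in csv.reader(blobstore.BlobReader(blob_key)):
        # Create your datastore entities from each row here.
        pass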
You might want to look into sending your file to Google Docs in the case of Excel (and other) formats, then reading the rows back via the Spreadsheets API.
If you mean a one-off (or a few) transfers, you're probably looking for the bulk upload system: http://code.google.com/appengine/docs/python/tools/uploadingdata.html
If you're talking about regular uploads during use, you'll need to handle them as POST requests to the application.
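If it's the latter case, a rough Python 2-era webapp sketch (the route and the "csvfile" field name are hypothetical):

import csv
import StringIO

from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class CsvUploadHandler(webapp.RequestHandler):
    def post(self):
        # webapp returns the uploaded file's contents as a string for the named form field.
        data = self.request.get("csvfile")
        for row in csv.reader(StringIO.StringIO(data)):
            pass  # create your objects from each row here
        self.response.out.write("done")

application = webapp.WSGIApplication([("/upload", CsvUploadHandler)])

def main():
    run_wsgi_app(application)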