I created a dataframe in a Jupyter notebook and I would like to export this dataframe directly to SharePoint as an Excel file. Is there a way to do that?
As per the other answers, there doesn't seem to be a way to write a dataframe directly to SharePoint at the moment, but rather than saving it to a file locally, you can create a BytesIO object in memory and write that to SharePoint using the Office365-REST-Python-Client.
from io import BytesIO
buffer = BytesIO() # Create a buffer object
df.to_excel(buffer, index=False) # Write the dataframe to the buffer
buffer.seek(0) # Rewind the buffer to the start
file_content = buffer.read() # Read the Excel bytes to upload
Referencing the example script upload_file.py, you would just replace the lines reading in the file with the code above.
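As a rough, untested sketch of that upload side with Office365-REST-Python-Client (the site URL, credentials and document library path below are placeholders you would replace with your own):
from office365.runtime.auth.user_credential import UserCredential
from office365.sharepoint.client_context import ClientContext

# Placeholders: replace with your own site, credentials and library folder
ctx = ClientContext("https://yourtenant.sharepoint.com/sites/yoursite").with_credentials(
    UserCredential("user@yourtenant.com", "password")
)
target_folder = ctx.web.get_folder_by_server_relative_url("Shared Documents")
target_folder.upload_file("file_name.xlsx", file_content).execute_query()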
You can add the SharePoint site as a network drive on your local computer and then use a file path that way. Here's a link that shows how to map the SharePoint site to a network drive. From there, just use the file path for the drive letter you mapped.
df.to_excel(r'Y:\Shared Documents\file_name.xlsx')
I think there is no way to export a pandas dataframe directly to SharePoint, but you can export your pandas data to an Excel file and then upload that Excel file to SharePoint.
Export Covid data to Excel file
df.to_excel(r'/home/noh/Desktop/covid19.xls', index=False)
I have a csv file (data.csv) in my Google drive.
I want to append new data (df) to that csv file (stack the new rows below the old ones, as both have the same column headers) and save it to the existing data.csv without using Google Sheets.
How can I do this?
Open Google Colab and mount your Google Drive in Colab using:
from google.colab import drive
drive.mount('/content/drive')
Provide your authorization and the file will be available in your current session inside your drive directory. You can then edit the file by importing pandas or the csv module, then:
with open('data.csv', 'a') as fd:
    fd.write(myCsvRow)  # myCsvRow is a string containing the new row, ending with '\n'
Opening a file with the 'a' parameter allows you to append to the end of the file instead of simply overwriting the existing content. Any changes you make will be saved to your original file.
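Since the new data is already a dataframe, a minimal sketch of the append using pandas itself (assuming the mounted path is /content/drive/MyDrive/data.csv and df has the same columns in the same order as the existing file):
import pandas as pd

# Append df below the existing rows; skip the header so it isn't written twice
df.to_csv('/content/drive/MyDrive/data.csv', mode='a', header=False, index=False)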
I am using the code below in a Jupyter notebook to read a csv from a URL.
df_train = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
Any idea where the csv file gets saved on drive?
I doubt it gets saved anywhere unless you download it, or you can write df_train to a .csv yourself with:
df_train.to_csv('train.csv')
This will be saved in your present working directory.
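To see exactly where that file will land, you can print the working directory first; a small sketch:
import os

print(os.getcwd())  # directory where train.csv will be written
df_train.to_csv('train.csv', index=False)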
I'm using Pyspark to try to read a zip file from blob storage. I want to unzip the file once loaded, and then write the unzipped CSVs back to blob storage.
I'm following this guidance which explains how to unzip the file once read: https://docs.databricks.com/_static/notebooks/zip-files-python.html
But it doesn't explain how to read the zip from blob storage in the first place. I have the following code:
file_location = "path_to_my.zip"
df = sqlContext.read.format("file_location").load
I expected this to load the zip to databricks as df, and from there I could follow the advice from the article to unzip, load the csvs to a dataframe and then write the dataframes back to blob.
Any ideas on how to initially read the zip file from blob using pyspark?
Thanks,
As shown in the first cell of the Databricks notebook you linked, you need to download the zip file and decompress it somehow. Your case is different because you are using Azure Blob Storage and you want to do everything in Python (without shelling out to other applications).
This page documents the process for accessing files in Azure Blob Storage. You need to follow these steps:
Install the package azure-storage-blob.
Import the SDK modules and set the necessary credentials (reference).
Create an instance of BlobServiceClient using a connection string:
# Create the BlobServiceClient object which will be used to create a container client
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
Create an instance of BlobClient for the file you want:
blob_client = blob_service_client.get_blob_client(container="container", blob="path_to_my.zip")
Download the blob (the zip file) and extract it. Since it is a .zip archive rather than a .gz stream, zipfile is the module to use here (gzip.decompress would fail on a zip). I would write something like this:
import io
import zipfile
from pathlib import Path

# Download the archive into memory and extract its first member
# (loop over zf.namelist() instead if the archive holds several CSVs)
data = blob_client.download_blob().readall()
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    Path("./my/local/filepath.csv").write_bytes(zf.read(zf.namelist()[0]))
Use "./my/local/filepath.csv" to create the DataFrame.
I have a JSON object which needs to be uploaded to a Google Spreadsheet. I have searched and read various resources but could not find a solution. Is there a way to upload an object or CSV from my local machine to a Google spreadsheet using gspread? I would prefer not to use the Google client API. Thanks
You can import CSV data using gspread.Client.import_csv method. Here's a quick example from the docs:
import gspread
# Check how to get `credentials`:
# https://github.com/burnash/gspread
gc = gspread.authorize(credentials)
# Read CSV file contents
content = open('file_to_import.csv', 'r').read()
gc.import_csv('<SPREADSHEET_ID>', content)
This will upload your CSV file into the first sheet of a spreadsheet with <SPREADSHEET_ID>.
NOTE
This method removes all other worksheets and then entirely replaces the contents of the first worksheet.
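Since the data starts out as a JSON object rather than a CSV file on disk, one possible sketch is to build the CSV text in memory first, here with pandas and assuming the object is a flat list of records (records below is placeholder data, and credentials is the same object as in the snippet above):
import gspread
import pandas as pd

records = [{"name": "alpha", "value": 1}, {"name": "beta", "value": 2}]  # placeholder JSON data

# Build the CSV content in memory instead of reading it from a file
content = pd.DataFrame(records).to_csv(index=False)

gc = gspread.authorize(credentials)
gc.import_csv('<SPREADSHEET_ID>', content)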
Is there any way we can load an Excel file directly into BigQuery, instead of converting it to CSV?
I get the files every day in Excel format and need to load them into BigQuery. Right now I convert them to CSV manually and load that into BigQuery.
I am planning to schedule the job.
If it's not possible to load the Excel files directly into BigQuery, then I need to write a process (Python) to convert them to CSV before loading into BigQuery.
Please let me know if there are any better options.
Thanks,
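A rough sketch of the Python fallback mentioned in the question above, using pandas plus the google-cloud-bigquery client, which can load a dataframe without an intermediate CSV (openpyxl and pyarrow need to be installed; the file name and table ID are placeholders):
import pandas as pd
from google.cloud import bigquery

df = pd.read_excel("daily_report.xlsx")  # placeholder file name

client = bigquery.Client()
# Load the dataframe straight into BigQuery, so no intermediate CSV is needed
job = client.load_table_from_dataframe(df, "my_dataset.my_table")  # placeholder table ID
job.result()  # wait for the load job to finish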
I think you could achieve the above in a few clicks, without any code.
You need to use Google Drive and external (federated) tables.
1) You could upload your Excel files to Google Drive manually or synchronise them.
2) In the Google Drive settings, find the option "Convert uploads: Convert uploaded files to Google Docs editor format" and check it.
To access this option, go to https://drive.google.com/drive/my-drive, click the gear icon and then choose Settings.
Now your Excel files will be accessible to BigQuery as converted Google Sheets.
3) The last part is documented here: https://cloud.google.com/bigquery/external-data-drive
You can reference your Excel file by its Drive URI (https://cloud.google.com/bigquery/external-data-drive#drive-uri) and then create the external table manually using that URI.
You could also do this last step via the API.
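For the API route, a minimal sketch with the google-cloud-bigquery client (the table ID and Drive URL are placeholders, and the client's credentials need the Drive scope in addition to the BigQuery scope):
from google.cloud import bigquery

client = bigquery.Client()

# External (federated) table backed by the converted Google Sheet on Drive
external_config = bigquery.ExternalConfig("GOOGLE_SHEETS")
external_config.source_uris = ["https://docs.google.com/spreadsheets/d/<FILE_ID>"]  # placeholder
external_config.autodetect = True

table = bigquery.Table("my_project.my_dataset.my_table")  # placeholder table ID
table.external_data_configuration = external_config
client.create_table(table)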