pandas.read_csv() using url - file location - python

I am using the below code in a Jupyter notebook to read a CSV from a URL.
df_train = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
Any idea where the CSV file gets saved on disk?

I doubt it gets saved unless you download it. Alternatively, you can write df_train to a .csv with:
df_train.to_csv('train.csv')
This will save it in your present working directory.
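A quick sketch of the point above: read_csv keeps the data purely in memory, and nothing reaches disk until you call to_csv yourself. A small frame stands in for the Titanic download here so the snippet runs offline:

```python
import os
import pandas as pd

# Stand-in for pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv'),
# which streams the remote file straight into memory without writing anything to disk.
df_train = pd.DataFrame({"survived": [0, 1], "age": [22.0, 38.0]})

df_train.to_csv("train.csv", index=False)

# The file lands in the present working directory:
print(os.path.join(os.getcwd(), "train.csv"))
```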

Related

How to export a dataframe to csv on local desktop

I have created a dataframe from an existing file. Now I am trying to download it onto my local desktop with the code shown below:
data.to_csv(r'C:\Users\pmishr50\Desktop\Skills\python\new.csv')
The code doesn't show any error, but I can't find the file at the given path.
I have found ways to download the data, including downloading it to Google Drive, but I wish to save the data to the path mentioned here.
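One possible explanation (an assumption, since the question doesn't say where the kernel runs): if the notebook executes on a remote Linux machine, such as Colab, the Windows path never reaches a C: drive. Backslashes are ordinary filename characters on Linux, so the "missing" file is a single oddly named file in the kernel's working directory. A sketch for checking where relative paths actually land:

```python
import os
import pandas as pd

data = pd.DataFrame({"x": [1, 2]})  # stand-in for the questioner's DataFrame

print(os.getcwd())      # this is where relative paths are resolved
data.to_csv("new.csv")  # saves next to the notebook's working directory
print(os.path.exists("new.csv"))
```

If the printed working directory is not on your local machine, the file has to be downloaded from there rather than written straight to your desktop.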

How to update an existing csv in google drive without using google sheets using python

I have a csv file (data.csv) in my Google drive.
I want to append new data (df) to that csv file (stacking the new rows below the old, as both csvs have the same column headers) and save it back to the existing data.csv without using Google Sheets.
How can I do this?
Open Google Colab and mount your Google Drive in Colab using:
from google.colab import drive
drive.mount('/content/drive')
Provide your authorization, and the file will be available in your current session inside your drive directory. You can then edit the file by importing pandas or the csv module, then:
with open('data.csv', 'a') as fd:
    fd.write(myCsvRow)  # myCsvRow: a CSV-formatted string ending in '\n'
Opening a file with the 'a' mode allows you to append to the end of the file instead of simply overwriting the existing content. Any changes you make are saved to the original file.
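Since both files share the same column headers, the append can also be done with pandas itself via `to_csv(mode='a', header=False)`. A sketch (in Colab the path would be the mounted location, e.g. '/content/drive/My Drive/data.csv'; here a local file is created so the snippet is self-contained):

```python
import pandas as pd

path = "data.csv"

# Existing file (stands in for the one already on Drive):
pd.DataFrame({"a": [1], "b": [2]}).to_csv(path, index=False)

# New data with the same column headers:
df = pd.DataFrame({"a": [3], "b": [4]})

# mode='a' appends below the old rows; header=False avoids repeating the header.
df.to_csv(path, mode="a", header=False, index=False)
```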

export dataframe excel directly to sharepoint (or a web page)

I created a dataframe in a Jupyter notebook and I would like to export this dataframe directly to SharePoint as an Excel file. Is there a way to do that?
As per the other answers, there doesn't seem to be a way to write a dataframe directly to SharePoint at the moment. But rather than saving it to a file locally, you can create a BytesIO object in memory and write that to SharePoint using the Office365-REST-Python-Client.
from io import BytesIO
buffer = BytesIO() # Create a buffer object
df.to_excel(buffer, index=False) # Write the dataframe to the buffer
buffer.seek(0)
file_content = buffer.read()
Referencing the example script upload_file.py, you would just replace the lines reading in the file with the code above.
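Putting that together, a sketch of the in-memory step; the upload call is shown only as a comment, since the site URL, credentials, and folder are placeholders (and the exact call shape, modelled on upload_file.py, is an assumption):

```python
from io import BytesIO

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})  # example dataframe

buffer = BytesIO()                # in-memory file; nothing written to local disk
df.to_excel(buffer, index=False)  # requires an Excel engine such as openpyxl
buffer.seek(0)
file_content = buffer.read()      # bytes of a valid .xlsx file

# Hypothetical upload, modelled on upload_file.py (placeholders throughout):
# ctx = ClientContext(site_url).with_credentials(credentials)
# folder = ctx.web.get_folder_by_server_relative_url("Shared Documents")
# folder.upload_file("report.xlsx", file_content).execute_query()
```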
You can add the SharePoint site as a network drive on your local computer and then use a file path that way. Here's a link to show you how to map the SharePoint to a network drive. From there, just select the file path for the drive you selected.
df.to_excel(r'Y:\Shared Documents\file_name.xlsx')
I think there is no way to directly export a pandas dataframe to SharePoint, but you can export your pandas data to Excel and then upload the Excel file to SharePoint.
Export Covid data to an Excel file:
df.to_excel(r'/home/noh/Desktop/covid19.xlsx', index=False)

how to get the file path to my csv file in data assets in watson studio?

I have been trying to get the file path of my CSV file in Watson Studio. The file is saved in my project's data assets, and all I need is the file path so I can read its contents in a Jupyter notebook. I'm trying to use a simple Python file reader that reads a file at a specified path. I have tried using Watson Studio's insert-file-credentials option, but can't get it to work.
This works fine when I run the same file on the IBM cognitiveclass.ai platform, but I can't get it to work in IBM Watson Studio. Please help.
The file name is enrollments.csv:
import unicodecsv
with open('enrollments.csv', 'rb') as f:
    reader = unicodecsv.DictReader(f)
    enrollments = list(reader)
I assume you mean you uploaded the "enrollments.csv" file to the Files section.
This uploads the file to the bucket of the Cloud Object Storage service that provides storage for your project.
You can use project-lib to fetch the file URL.
# Import the lib
from project_lib import Project
project = Project(sc,"<ProjectId>", "<ProjectToken>")
# Get the url
url = project.get_file_url("myFile.csv")
For more, refer to:
https://dataplatform.cloud.ibm.com/docs/content/analyze-data/project-lib-python.html
https://dataplatform.cloud.ibm.com/analytics/notebooks/v2/a972effc-394f-4825-af91-874cb165dcfc/view?access_token=ee2bd90bee679afc278cdb23453946a3922c454a6a7037e4bd3c4b0f90eb0924
For the sake of future readers, try this one.
Upload your CSV file as an asset in your Watson Studio project (you can also do this step later).
Open your notebook. On the top ribbon, in the upper-right corner of the page (below your name), click the 1010 icon.
Make sure you're on the Files tab; below it you will see the list of your uploaded datasets (you can also upload your files here).
Click the drop-down and choose "pandas DataFrame" to add a block of code that loads the uploaded data into your notebook. Note that you should select a blank cell first so it doesn't overwrite an existing cell that already contains code.
I struggled to define the path in Watson as well.
Here is what worked for me:
Within a project, select the "Settings" tab (the default view is the "Assets" tab).
Create a token: scroll down to "Access tokens", then click "New token".
Go back to "Assets" and open your notebook.
Click the three vertical dots in the header. One option is "Insert project token". This creates a new code block that defines the correct parameters for the Project method.
I think you are really asking how you can read a file from the assets in your Watson Studio project. This is documented here: https://www.ibm.com/docs/en/cloud-paks/cp-data/4.0?topic=lib-watson-studio-python
# Import the lib
from ibm_watson_studio_lib import access_project_or_space
wslib = access_project_or_space()
# Fetch the data from a file
my_file = wslib.load_data("my_asset_name.csv")
# Read the CSV data file into a pandas DataFrame
my_file.seek(0)
import pandas as pd
pd.read_csv(my_file, nrows=10)
The old project-lib has been deprecated; see the deprecation announcement: https://www.ibm.com/docs/en/cloud-paks/cp-data/4.0?topic=notebook-using-project-lib-python-deprecated

How to access csv file in the root folder of the Google Colaboratory?

I am trying to access a CSV file uploaded to the same folder as my Python notebook in Colaboratory.
When I simply do:
pd.read_csv("train.csv")
It throws a "File b'train.csv' does not exist" error.
Any idea is highly appreciated.
When you upload the file, it is not saved to the current folder:
from google.colab import files
uploaded = files.upload()
Instead, the file's contents are stored as the value in the uploaded dictionary, with the filename as the key.
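To make pd.read_csv('train.csv') work after the upload, write the dictionary's bytes to disk first. In Colab, `uploaded` would come from files.upload(); here it is simulated with an in-memory dict so the sketch is self-contained:

```python
import pandas as pd

# In Colab you would instead do:
#   from google.colab import files
#   uploaded = files.upload()
uploaded = {"train.csv": b"a,b\n1,2\n3,4\n"}  # simulated upload result

# Persist each uploaded file into the current working directory:
for name, data in uploaded.items():
    with open(name, "wb") as f:
        f.write(data)

df = pd.read_csv("train.csv")  # now the file exists on disk
```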
