How to import a data file into Google Colab? - python

I'm writing code in Google Colab and at some point I need to read a data file. The file is called "apr.dat" and sits inside a folder called "Eos_table". I hosted this folder on my Drive and used the following structure to load it:
import os
import numpy as np

def set_eos(file):
    directory = os.getcwd()
    data_eos = np.loadtxt(directory+str("\\")+str("\\")+str("Eos_table")+str("\\")+str("\\")+str(file), skiprows=1)
    ...
    return ...

file_eos = "apr.dat"
ep_eos = set_eos(file_eos)[0]
But Google Colab returns an error, and I don't know whether it's due to the way I built the directory path or to something else. The error is:
OSError: /content\\Eos_table\\apr.dat not found.
What am I doing wrong? How can I fix this error?
Thanks
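For context, Colab runs on Linux, where backslashes are not path separators, so "/content\\Eos_table\\apr.dat" is treated as one literal file name rather than a nested path. A minimal portable sketch of the same lookup using os.path.join, assuming Eos_table sits in the current working directory:

import os
import numpy as np

def set_eos(file):
    # os.path.join picks the correct separator for the current OS
    path = os.path.join(os.getcwd(), "Eos_table", file)
    return np.loadtxt(path, skiprows=1)

ep_eos = set_eos("apr.dat")[0]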

Related

How to transfer a csv file from notebook folder to a datastore

I want to transfer a generated csv file test_df.csv from my Azure ML notebook folder, which has the path /Users/Ankit19.Gupta/test_df.csv, to a datastore whose web path is https://abc.blob.core.windows.net/azureml/LocalUpload/f3db18b6. I have written the Python code as
from azureml.core import Workspace

ws = Workspace.from_config()
datastore = ws.get_default_datastore()
datastore.upload_files('/Users/Ankit19.Gupta/test_df.csv',
                       target_path='https://abc.blob.core.windows.net/azureml/LocalUpload/f3db18b6',
                       overwrite=True)
But it is showing the following error message:
UserErrorException: UserErrorException:
    Message: '/' does not point to a file. Please upload the file to cloud first if running in a cloud notebook.
    InnerException None
    ErrorResponse
{
    "error": {
        "code": "UserError",
        "message": "'/' does not point to a file. Please upload the file to cloud first if running in a cloud notebook."
    }
}
I have tried this, but it is not working for me. Can anyone please help me resolve this issue? Any help would be appreciated.
The way the path is specified is not accurate; a datastore path is referenced differently. Replace your code with the version below, noting the small change in the path.
from azureml.core import Workspace

ws = Workspace.from_config()
datastore = ws.get_default_datastore()
# upload_files expects a list of local paths and a target path relative
# to the datastore root, not a blob URL
datastore.upload_files(['./Users/foldername/filename.csv'],
                       target_path='your_target_folder',
                       overwrite=True)
You need to include all the parent folders in the path, and the "./" prefix is how you reference files relative to the notebook's working directory.
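As a side note, if a whole folder ever needs to be pushed rather than a single file, the datastore also exposes an upload method for directories; a minimal sketch, assuming the same workspace setup and a hypothetical local folder ./outputs:

from azureml.core import Workspace

ws = Workspace.from_config()
datastore = ws.get_default_datastore()
# upload() pushes every file under src_dir into target_path on the datastore
datastore.upload(src_dir='./outputs',
                 target_path='your_target_folder',
                 overwrite=True)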

Reading Json file within a folder on Mac vs. Windows

Currently, I am trying to write an application that can run on both Mac and Windows. We have a folder with PATH = "[folder1]\configurations\globals.json". The following function works on Windows:
import json

def grab_api_credentials(resource: str) -> dict:
    """
    :param resource: database, fmp, fred, polygon
    """
    with open(PATH, 'r') as file:
        data = json.load(file)
    if resource is None:
        return data
    return data[resource]
How would you change the PATH variable to accommodate a Mac?
To be sure, I have looked at many resources online, yet none of them showed how to read a JSON file inside a folder. I greatly appreciate your help!
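A portable sketch using pathlib, which renders the correct separator on both macOS and Windows (assuming the configurations folder sits under folder1 relative to the working directory; adjust the base path as needed):

import json
from pathlib import Path

# Path objects are joined with "/" and render the right separator per OS
PATH = Path("folder1") / "configurations" / "globals.json"

def grab_api_credentials(resource: str = None) -> dict:
    with PATH.open("r") as file:
        data = json.load(file)
    if resource is None:
        return data
    return data[resource]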

Error downloading a file from Google Drive

I exported some images from Google Earth Engine to Google Drive, and I need to download those images to a local drive using a Python script. I tried to use oauth2client and apiclient, as I saw here:
I got a list of the files in Drive with their corresponding IDs, and then I use an ID to try to download the file with the gdown library:

gdown.download(f'https://drive.google.com/uc?id={file_data["id"]}',
               f'{download_path}{os.sep}{filename_to_download}.tif')
I got the following error message:
Access denied with the following error:
Cannot retrieve the public link of the file. You may need to change
the permission to 'Anyone with the link', or have had many accesses.
You may still be able to access the file from the browser:
https://drive.google.com/uc?id=<id>
Since I can get the Drive file list, I suppose the Drive authentication is OK. If I open the link suggested in the error message in the browser, I can download the file. If I check the file properties in Drive, I see:
Who can access: not shared.
What should I do to download the files?
This is the complete code:
# https://medium.com/swlh/google-drive-api-with-python-part-i-set-up-credentials-1f729cb0372b
# https://levelup.gitconnected.com/google-drive-api-with-python-part-ii-connect-to-google-drive-and-search-for-file-7138422e0563
# https://stackoverflow.com/questions/38511444/python-download-files-from-google-drive-using-url
import os

from apiclient import discovery
from httplib2 import Http
from oauth2client import client, file, tools
import gdown


class GoogleDrive(object):
    def __init__(self, secret_credentials_file_path='./credentials'):
        self.DriveFiles = None
        # define API scope
        SCOPE = 'https://www.googleapis.com/auth/drive'
        self.store = file.Storage(f'{secret_credentials_file_path}{os.sep}credentials.json')
        self.credentials = self.store.get()
        if not self.credentials or self.credentials.invalid:
            flow = client.flow_from_clientsecrets(
                f'{secret_credentials_file_path}{os.sep}client_secret.json', SCOPE)
            self.credentials = tools.run_flow(flow, self.store)
        oauth_http = self.credentials.authorize(Http())
        self.drive = discovery.build('drive', 'v3', http=oauth_http)

    def RetrieveAllFiles(self):
        results = []
        page_token = None
        while True:
            try:
                param = {}
                if page_token:
                    param['pageToken'] = page_token
                files = self.drive.files().list(**param).execute()
                # append the files from the current result page to our list
                results.extend(files.get('files'))
                # Google Drive API shows our files in multiple pages when
                # the number of files exceeds 100
                page_token = files.get('nextPageToken')
                if not page_token:
                    break
            except Exception as error:
                print(f'An error has occurred: {error}')
                break
        self.DriveFiles = results

    def GetFileData(self, filename_to_search):
        for file_data in self.DriveFiles:
            if file_data.get('name') == filename_to_search:
                return file_data
        # no match found
        return None

    def DownloadFile(self, filename_to_download, download_path):
        file_data = self.GetFileData(f'{filename_to_download}.tif')
        gdown.download(f'https://drive.google.com/uc?id={file_data["id"]}',
                       f'{download_path}{os.sep}{filename_to_download}.tif')
Google Drive may not be the best tool for this. You may want to upload the images to a raw file hosting service like Imgur and download them using requests; you can then read the file from disk, or skip writing to a file entirely and use image.content directly. Here's an example:
import requests

image = requests.get("https://i.imgur.com/5SMNGtv.png")
with open("image.png", 'wb') as file:
    file.write(image.content)
You can specify where the file should be saved by adding the path before the file name, like this:
image = requests.get("https://i.imgur.com/5SMNGtv.png")
with open("C://Users//Admin//Desktop//image.png", 'wb') as file:
    file.write(image.content)
Solution 1.
Access denied with the following error:
Cannot retrieve the public link of the file. You may need to change
the permission to 'Anyone with the link', or have had many accesses.
You may still be able to access the file from the browser:
https://drive.google.com/uc?id=<id>
In the sharing tab on Google Drive (right-click the image, then open Share or Get link), change the privacy to 'Anyone with the link'. Your code should then work.
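If changing the sharing settings is not an option, note that gdown downloads through the public link, which is why it fails on a file that is "not shared" even though the authenticated listing works. A sketch of downloading through the authenticated Drive service instead, reusing the self.drive object from the question (the method name DownloadFileViaApi is made up for illustration):

import io
import os
from apiclient.http import MediaIoBaseDownload

def DownloadFileViaApi(self, filename_to_download, download_path):
    # Downloads via the authenticated API, so no public link is required
    file_data = self.GetFileData(f'{filename_to_download}.tif')
    request = self.drive.files().get_media(fileId=file_data['id'])
    out_path = f'{download_path}{os.sep}{filename_to_download}.tif'
    with io.FileIO(out_path, 'wb') as fh:
        downloader = MediaIoBaseDownload(fh, request)
        done = False
        while not done:
            _, done = downloader.next_chunk()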
Solution 2.
If you can use Google Colab, then you can mount gdrive easily and access files there using
from google.colab import drive
drive.mount('/content/gdrive')
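Once mounted, the exported images are ordinary files under /content/gdrive and can be copied with the standard library; a small sketch, where the folder name GEE_exports is a hypothetical placeholder:

import shutil

# 'GEE_exports' stands in for whatever folder Earth Engine exported to
src = '/content/gdrive/My Drive/GEE_exports/image1.tif'
shutil.copy(src, '/content/image1.tif')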
Separately, note that Google does not accept your regular Google/Gmail password from third-party apps; you need to create a so-called "App Password" for your Google account in order to authenticate.

Upload a file to Dropbox using Python

How do I find out the right Dropbox path and upload a file there?
I want to upload a daily csv file to a Dropbox account, but I'm getting a ValidationError, among other errors.
My code:

# finding the path
import os
import pathlib

import dropbox

# Automation is the name of my folder at Dropbox
pathlib.Path.home() / "Automation"
Out[37]: WindowsPath('C:/Users/pb/Automation')

dbx = dropbox.Dropbox('My-token here')
dbx.users_get_current_account()
Out[38]: FullAccount(account_id='accid', name=Name(given_name='pb', surname='manager', familiar_name='pb', display_name='pb', abbreviated_name='pb'), email='example@example.com', email_verified=True, disabled=False, locale='en', referral_link='https://www.dropbox.com/referrals/codigo', is_paired=False, account_type=AccountType('basic', None), root_info=UserRootInfo(root_namespace_id='1111111111', home_namespace_id='11111111'), profile_photo_url='https://dl-web.dropbox.com/account_photo/get/sssssssssssssssssss', country='US', team=None, team_member_id=None)

# Now trying to see something in the folder; I just want to upload a file there
response = dbx.files_list_folder(path='user:/pb/automation')
print(response)

for entry in dbx.files_list_folder('https://www.dropbox.com/home/automation').entries:
    print(entry.name)
ValidationError: 'user:/pb/automation' did not match pattern '(/(.|[\r\n])*)?|id:.*|(ns:[0-9]+(/.*)?)'
That error happens because the path parameter that the API is expecting needs to start with a '/'. It could be called out better in the docs.
Is the Automation folder in the root of your Dropbox directory? If so, then '/automation' should be sufficient for path. Try tinkering with the /files/list_folder endpoint in the Dropbox API explorer until you find the correct path.
Your for loop is likely to throw an error too, though. Are you just trying to loop over the results of the list_folder call? I'd suggest changing to
for entry in response.entries:
    print(entry.name)
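To then perform the upload the question is actually after, a minimal sketch using files_upload (API paths always start with '/' relative to the Dropbox root; the local file daily.csv is a placeholder):

import dropbox

dbx = dropbox.Dropbox('My-token here')

# 'daily.csv' is a hypothetical local file inside the Automation folder
with open(r'C:\Users\pb\Automation\daily.csv', 'rb') as f:
    dbx.files_upload(f.read(), '/Automation/daily.csv',
                     mode=dropbox.files.WriteMode.overwrite)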

How do I automatically generate files into the same Google Drive folder as my Colab notebook?

I am performing LDA on a Simple Wikipedia dump file, and the code I am following needs to output the articles to files. I need some guidance, as Python and Colab are really broad topics and I can't seem to find an answer to this specific problem. Here's my code for mounting Google Drive:
!pip install -U -q PyDrive

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate the user
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Get your file
fileId = 'xxxx'
fileName = 'simplewiki-20170820-pages-meta-current-reduced.xml'
downloaded = drive.CreateFile({'id': fileId})
downloaded.GetContentFile(fileName)
and here's the culprit; this code tries to create a file from each article:
import codecs

if article_txt is not None and article_txt != "" and len(article_txt) > 150 and is_ascii(article_txt):
    outfile = dir_path + str(i+1) + "_article.txt"
    f = codecs.open(outfile, "w", "utf-8")
    f.write(article_txt)
    f.close()
    print(article_txt)
I have tried so many things already that I can't recall them all. Basically, what I need to know is how to convert this code so that it works with Google Drive. I've been trying solutions for hours now. One thing I recall doing is converting the code into this:
file_obj = drive.CreateFile()
file_obj['title'] = "file name"
But then I got the error 'expected str, bytes or os.PathLike object, not GoogleDriveFile'. It's not a question of how to upload a file and open it in Colab, as I already do that with the XML file; what I need to know is how to generate files from my Colab script and place them in the same folder as the script. Any help would be appreciated. Thanks!
I am not sure whether the problem is with generating the files or with copying them to Google Drive. If it is the latter, a simpler approach is to mount your Drive directly on the instance as follows:
from google.colab import drive
drive.mount('drive')
You can then access any item in your drive as if it were a hard disk and copy your files using bash commands:
!cp filename 'drive/My Drive/folder1/'
Another alternative is to use shutil:
import shutil
shutil.copy(filename, 'drive/My Drive/folder1/')
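Putting the two answers together, the article-writing loop can target the mounted Drive directly by pointing dir_path at a Drive folder; a sketch, where folder1 stands in for wherever the notebook lives:

from google.colab import drive
import codecs

drive.mount('drive')

# Write generated articles straight into the mounted Drive folder;
# 'folder1' is a hypothetical folder name
dir_path = 'drive/My Drive/folder1/'
outfile = dir_path + '1_article.txt'
with codecs.open(outfile, 'w', 'utf-8') as f:
    f.write('article text here')  # article_txt in the original loop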
