Can't create file into specific folder using gspread google colab

Yesterday I could still create Google Sheets files in a specific folder with gspread, using this code:
ss = gc.create(fileName,"my folder destination")
But today the same code raises an error.
Here is my full code:
from google.colab import auth
auth.authenticate_user()
import pandas as pd
import gspread
from google.auth import default
creds, _ = default()
gc = gspread.authorize(creds)
wb = gc.open('file name')
ws = wb.worksheet('sheet name')
# get_all_values gives a list of rows.
rows = ws.get_all_values()
df = pd.DataFrame.from_records(rows[1:],columns=rows[0])
modisseries = df["column name"]
uniquemodis = modisseries.drop_duplicates().tolist()
def createSpreadsheet(columnName):
    ndf = df[df["Modis"] == columnName]
    nlist = [ndf.columns.tolist()] + ndf.to_numpy().tolist()
    ss = gc.create(columnName,"Folder_id")
    nws = ss.sheet1
    nws.update_title(columnName)
    nws.update("A1",nlist,value_input_option="USER_ENTERED")
for modis in uniquemodis:
    createSpreadsheet(modis)
And here is the error message:
TypeError Traceback (most recent call last)
<ipython-input-1-5c297289782b> in <module>()
30
31 for column_value in uniquecolumn:
---> 32 createSpreadsheet(column_value)
<ipython-input-1-5c297289782b> in createSpreadsheet(column_value)
24 nlist = [ndf.columns.tolist()] + ndf.to_numpy().tolist()
25
---> 26 ss = gc.create(column_value,"folder_id")
27 nws = ss.sheet1
28 nws.update_title(column_value)
TypeError: create() takes 2 positional arguments but 3 were given

Hi, which version of gspread are you using?
The folder_id parameter was introduced in version 3.5.0.
From the error message, it seems you have an older version that does not accept the folder_id parameter.
You can check the version you are currently using with: print(gspread.__version__)
If it is below 3.5.0, run the following command to upgrade (in a Colab cell, prefix it with !):
python3 -m pip install -U gspread
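If you want the notebook to check this for itself, the comparison could be sketched roughly as follows. The helper names parse_version and supports_folder_id are made up for this illustration and are not part of gspread; the 3.5.0 threshold is taken from the answer above.

```python
import re

# Release that introduced folder_id, according to the answer above.
MIN_VERSION = (3, 5, 0)

def parse_version(text):
    """Turn a version string such as '3.4.2' into a tuple of three ints."""
    numbers = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        # Non-numeric pieces (unlikely, but possible) count as 0.
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)

def supports_folder_id(version_text):
    """Hypothetical helper: True if this version is at least MIN_VERSION."""
    return parse_version(version_text) >= MIN_VERSION

print(supports_folder_id("3.4.2"))
print(supports_folder_id("5.7.0"))
```

In the notebook you would call supports_folder_id(gspread.__version__) after importing gspread, and upgrade if it returns False.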

Related

python gdal not processing grib file properly on CENTOS linux

I am trying to use Python GDAL to convert a GRIB2 file into a GeoTIFF file, based on the following example:
https://geoexamples.com/d3-raster-tools-docs/code_samples/vardah.html
Based on this, I have the following code:
import gdal
import osr
ds = gdal.Open(r"/home/test/gfs.t12z.sfluxgrbf000.grib2",1)
gph = ds.GetRasterBand(84).ReadAsArray()
press = ds.GetRasterBand(54).ReadAsArray() / 100
temp = ds.GetRasterBand(52).ReadAsArray()
u = ds.GetRasterBand(50).ReadAsArray()
v = ds.GetRasterBand(51).ReadAsArray()
corr_press = press * (1 - (0.0065*gph/(0.0065*gph + temp + 273.15)))**-5.257
driver = gdal.GetDriverByName('GTiff')
outRaster = driver.Create("/home/test/vardah2.tiff", ds.RasterXSize, ds.RasterYSize, 4, gdal.GDT_Float32)
outRaster.SetGeoTransform(ds.GetGeoTransform())
outband = outRaster.GetRasterBand(1)
outband.WriteArray(corr_press)
outband.SetMetadata({'name': 'press'})
outRasterSRS = osr.SpatialReference()
outRasterSRS.ImportFromEPSG(4326)
outRaster.SetProjection(outRasterSRS.ExportToWkt())
outband.FlushCache()
I am running this on CentOS 7, and when I run the program I get the following error:
ERROR 6: The GRIB driver does not support update access to existing datasets.
Traceback (most recent call last):
File "ficky.py", line 4, in <module>
gph = ds.GetRasterBand(84).ReadAsArray()
AttributeError: 'NoneType' object has no attribute 'GetRasterBand'
How do I resolve this error so that the script runs successfully on CentOS?

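The second argument, 1, is GA_Update, and the GRIB driver can only open existing files read-only. So gdal.Open returns None, and the next call fails with the AttributeError. Change
ds = gdal.Open(r"/home/test/gfs.t12z.sfluxgrbf000.grib2",1)
to
ds = gdal.Open("/home/test/gfs.t12z.sfluxgrbf000.grib2")
or, even better, make the read-only mode explicit:
ds = gdal.Open("/home/test/gfs.t12z.sfluxgrbf000.grib2", gdal.GA_ReadOnly)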
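By default gdal.Open returns None on failure instead of raising (unless gdal.UseExceptions() has been called), which is why the real problem only showed up one line later. A small wrapper along these lines would make the failure explicit. This is only a sketch: open_readonly and its gdal_module parameter are names invented here, and the parameter exists purely so the function can be tried without GDAL installed.

```python
def open_readonly(path, gdal_module=None):
    """Open `path` read-only and raise instead of silently returning None."""
    if gdal_module is None:
        # Imported lazily so the sketch can be read without GDAL installed.
        from osgeo import gdal as gdal_module
    dataset = gdal_module.Open(path, gdal_module.GA_ReadOnly)
    if dataset is None:
        raise RuntimeError("GDAL could not open {!r}".format(path))
    return dataset
```

With GDAL installed, ds = open_readonly("/home/test/gfs.t12z.sfluxgrbf000.grib2") would replace the gdal.Open line in the script.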

missing variable to complete cell execution

I hope someone can help me; I've been stuck on this error for a while. I have two .py files that I import in a Jupyter notebook. When I run the last cell (see code below), I get a traceback I can't fix.
I think there is an error in my ch_data_prep.py file: the variable df_ch is not passed correctly between files. Is this possible? Any suggestions on how to solve this problem?
Thanks!
ch_data_prep.py
def seg_data(self):
    seg_startdate = input('Enter start date (yyyy-mm-dd): ')
    seg_finishdate = input('Enter end date (yyyy-mm-dd): ')
    df_ch_seg = df_ch[(df_ch['event_datetime'] > seg_startdate)
                      & (df_ch['event_datetime'] < seg_finishdate)
                      ]
    return df_ch_seg
df_ch_seg = seg_data(df_ch)
ch_db.py
def get_data():
    # Some omitted code here to connect to database and get data...
    df_ch = pd.DataFrame(result)
    return df_ch
df_ch = get_data()
Jupyter Notebook data_analysis.ipynb
In[1]: import ch_db
df_ch = ch_db.get_data()
In[2]: import ch_data_prep
When I run cell 2, I get this error
--------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-3-fbe2d4fceba6> in <module>
----> 1 import ch_data_prep
~/clickstream/ch_data_prep.py in <module>
34 return df_ch_seg
35
---> 36 df_ch_seg = seg_data()
TypeError: seg_data() missing 1 required positional argument: 'df_ch'
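One possible reading of the traceback: the last line of ch_data_prep.py runs as soon as the module is imported, and a module has its own namespace, so it cannot see the df_ch defined in the notebook. A common way around this is to keep only the function in the module and pass the data in from the notebook. Below is a minimal sketch of that shape. It uses a plain list of dicts in place of the DataFrame so that it runs without pandas, and the argument names records, start and finish are assumptions, not taken from the original files.

```python
from datetime import date

def seg_data(records, start, finish):
    """Return the records whose event_datetime lies strictly between start and finish."""
    return [r for r in records if start < r["event_datetime"] < finish]

if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    sample = [
        {"event_datetime": date(2021, 1, 15)},
        {"event_datetime": date(2021, 6, 1)},
    ]
    print(seg_data(sample, date(2021, 1, 1), date(2021, 3, 1)))
```

In the notebook, the second cell would then import ch_data_prep and call ch_data_prep.seg_data(df_ch, ...) explicitly. With a real DataFrame, the function body would be the boolean mask from the question.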

NameError: name 'pycrfsuite' is not defined

The following error appears when I run a cell on Google Colab:
NameError Traceback (most recent call last)
<ipython-input-36-5f325bc0550d> in <module>()
4
5 TAGGER_PATH = "crf_nlu.tagger" # path to the tagger- it will save/access the model from here
----> 6 ct = CRFTagger(feature_func=get_features) # initialize tagger with get_features function
7
8 print("training tagger...")
/usr/local/lib/python3.6/dist-packages/nltk/tag/crf.py in __init__(self, feature_func, verbose, training_opt)
81
82 self._model_file = ''
---> 83 self._tagger = pycrfsuite.Tagger()
84
85 if feature_func is None:
NameError: name 'pycrfsuite' is not defined
This is the code from the cell:
# Train the CRF BIO-tag tagger
import pycrfsuite
TAGGER_PATH = "crf_nlu.tagger" # path to the tagger- it will save/access the model from here
ct = CRFTagger(feature_func=get_features) # initialize tagger with get_features function
print("training tagger...")
ct.train(training_data, TAGGER_PATH)
print("done")
What causes this issue?
I have an import for the CRFTagger which is:
from nltk.tag import CRFTagger

I just sorted it out. On Google Colab, I had to add the following line:
pip install sklearn-pycrfsuite

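Use:
pip install python-crfsuite
python-crfsuite is the package that provides the pycrfsuite module that NLTK's CRFTagger relies on. sklearn-crfsuite is a separate wrapper around it that provides an API similar to the scikit-learn library.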
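The traceback points inside NLTK rather than at your own import line, which suggests NLTK could not import pycrfsuite when it was loaded. A quick way to check which modules the runtime can actually see, without importing them, might look like this. The helper name is_installed is made up for the example.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None

for name in ("pycrfsuite", "nltk"):
    status = "available" if is_installed(name) else "missing"
    print("{}: {}".format(name, status))
```

If pycrfsuite is reported as missing, install python-crfsuite and restart the runtime so that NLTK is imported again with the module available.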

xlwings loop causing OS CommandErrors while line-by-line doesn't

I am trying to loop through Excel files on a Mac (OS X 10.10.5) with xlwings in Python in order to save only the values (i.e. not the formulae that create them). Many of the files were output by automated software, so the usual pandas.read_excel function produces null values for any cells that start with =.
If I step through the body of the for loop one statement at a time in IPython, the code works fine. However, when the script runs as a loop, it raises an error whenever it reaches a workbook that hasn't been stepped through manually. Here's the script (reduced version in the update below):
wkbks = [fl for fl in os.listdir(originals_folder) if any([fl.endswith(ext) for ext in ['.xls', 'xlsx']])]
for w_idx, wkbk in enumerate(wkbks):
    if w_idx % 10 == 0:
        print "\t{}/{}".format(w_idx, len(wkbks))
    wb = xw.Book( os.path.join(originals_folder, wkbk) )
    new_wb = xw.Book()
    for sht_idx, sht in enumerate(wb.sheets):
        # look for a pretty wide range (disclaimer: not super robust for really large sheets)
        values = sht.range('A1:HZ1000').value
        df = pd.DataFrame(values)
        rows_to_keep = df.isnull().sum(axis=1) < df.shape[1]
        cols_to_keep = df.isnull().sum(axis=0) < df.shape[0]
        df = df.ix[rows_to_keep, cols_to_keep]
        # add values to new sheet
        if not any([sht.name == sh.name for sh in new_wb.sheets]):
            if sht_idx == 0:
                new_wb.sheets.add(sht.name)
            else:
                new_wb.sheets.add(sht.name, after=wb.sheets[sht_idx-1].name)
        new_wb.sheets[sht.name].range('A1').value = df.values
    new_name = new_wb.name
    new_wb.save() # close to generic name in current location, as filepaths produced an error
    # close newly made workbook
    close_workbook_by_name(xw, new_name)
    # copy newly made workbook to base_folder
    new_fname = [f for f in os.listdir('./') if f.startswith(new_name)][0]
    shutil.copy2(os.path.join('./', new_fname), os.path.join(base_folder, wkbk))
    # remove newly made file from current directory
    os.remove( os.path.join('./', new_fname) )
    # close original workbook
    close_workbook_by_name(xw, wkbk)
    del sht, new_wb, wb, wkbk, shts
And here's the error (again, this only happens when it loops to a wkbk that hasn't yet been manually 'walked through' in ipython):
---------------------------------------------------------------------------
CommandError Traceback (most recent call last)
<ipython-input-319-2240e28ae169> in <module>()
14 shts = wb.sheets
15 sht_idx = 0
---> 16 for sht in shts:
17 sht = wb.sheets[sht.name]
18
...
CommandError: Command failed:
OSERROR: -1728
MESSAGE: The object you are trying to access does not exist
COMMAND: app(pid=8732).workbooks['cwa balance sheet 9.30.2017-5688630b3.xlsx'].count(each=k.worksheet)
Any suggestions on the specifics or general strategy here would be greatly appreciated. Thanks!
UPDATE
Reduced script
wkbks = [fl for fl in os.listdir(originals_folder) if any([fl.endswith(ext) for ext in ['.xls', 'xlsx']])]
for w_idx, wkbk in enumerate(wkbks):
    wb = xw.Book( os.path.join(originals_folder, wkbk) )
    new_wb = xw.Book()
    for sht_idx, sht in enumerate(wb.sheets):
        pass
    new_wb.save() # close to generic name in current location, as filepaths produced an error
    # close newly made workbook
    book_idx_to_close = [b_idx for b_idx, b in enumerate(xw.books) if b.name.startswith(new_name)]
    if len(book_idx_to_close) > 0:
        book_to_close = xw.books[book_idx_to_close[0]]
        book_to_close.close()
        print "Closed", new_name
    # close original workbook
    book_idx_to_close = [b_idx for b_idx, b in enumerate(xw.books) if b.name.startswith(wkbk)]
    if len(book_idx_to_close) > 0:
        book_to_close = xw.books[book_idx_to_close[0]]
        book_to_close.close()
        print "Closed", wkbk
New error:
<ipython-input-14-68ed128e9078> in <module>()
34 if len(book_idx_to_close) > 0:
35 book_to_close = xw.books[book_idx_to_close[0]]
---> 36 book_to_close.close()
37 print "Closed", new_name
38
...
CommandError: Command failed:
OSERROR: -50
MESSAGE: Parameter error.
COMMAND: app(pid=10100).workbooks[2].close(saving=k.no)
The error occurs when a file is opened that requires 'Read Only' access because some of its features are not compatible with the current version of Excel. However, after receiving the error, if I retype book_to_close.close(), the file closes with no error. There is also no error when opening/saving/closing a file with the same read-only access if I had previously 'walked through' it manually.
I realize this is a different error from the one above, but I suspect they may be related (which is why I am leaving the original post above as is).
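Since retyping the same call succeeds, one thing that might be worth trying is to retry the failing call after a short pause. This is a guess at a workaround, not a confirmed fix for either error. The helper call_with_retry and its defaults are invented for this sketch, which is written for Python 3.

```python
import time

def call_with_retry(action, attempts=3, delay=0.5, exceptions=(Exception,)):
    """Call action(); if it raises, wait `delay` seconds and try again.

    The last failure is re-raised once `attempts` calls have been made.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise
            time.sleep(delay)
```

In the loop above this would be used as call_with_retry(book_to_close.close). If Excel simply needs a moment to finish opening the workbook, the second or third attempt should go through.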

py2exe SyntaxError: invalid syntax (asyncsupport.py, line 22) [duplicate]

This command works fine on my personal computer but keeps giving me this error on my work PC. What could be going on? I can run the Char_Limits.py script directly in Powershell without a problem.
error: compiling 'C:\ProgramData\Anaconda2\lib\site-packages\jinja2\asyncsupport.py' failed
SyntaxError: invalid syntax (asyncsupport.py, line 22)
My setup.py file looks like:
from distutils.core import setup
import py2exe
setup (console=['Char_Limits.py'])
My file looks like:
import xlwings as xw
from win32com.client import constants as c
import win32api
"""
Important Notes: Header row has to be the first row. No columns without a header row. If you need/want a blank column, just place a random placeholder
header value in the first row.
Product_Article_Number column is used to determine the number of rows. It must be populated for every row.
"""
#functions, hooray!
def setRange(columnDict, columnHeader):
    column = columnDict[columnHeader]
    rngForFormatting = xw.Range((2,column), (bttm, column))
    cellReference = xw.Range((2,column)).get_address(False, False)
    return rngForFormatting, cellReference
def msg_box(message):
    win32api.MessageBox(wb.app.hwnd, message)
#Character limits for fields in Hybris
CharLimits_Fields = {"alerts":500, "certifications":255, "productTitle":300,
"teaserText":450 , "includes":1000, "compliance":255, "disclaimers":9000,
"ecommDescription100":100, "ecommDescription240":240,
"internalKeyword":1000, "metaKeywords":1000, "metaDescription":1000,
"productFeatures":7500, "productLongDescription":1500,"requires":500,
"servicePlan":255, "skuDifferentiatorText":255, "storage":255,
"techDetailsAndRefs":12000, "warranty":1000}
# Fields for which a break tag is problematic.
BreakTagNotAllowed = ["ecommDescription100", "ecommDescription240", "productTitle",
"skuDifferentiatorText"]
app = xw.apps.active
wb = xw.Book(r'C:\Users\XXXX\Documents\Import File.xlsx')
#identifies the blanket range of interest
firstCell = xw.Range('A1')
lstcolumn = firstCell.end("right").column
headers_Row = xw.Range((1,1), (1, lstcolumn)).value
columnDict = {}
for column in range(1, len(headers_Row) + 1):
    header = headers_Row[column - 1]
    columnDict[header] = column
try:
    articleColumn = columnDict["Product_Article_Number"]
except:
    articleColumn = columnDict["Family_Article_Number"]
firstCell = xw.Range((1,articleColumn))
bttm = firstCell.end("down").row
wholeRange = xw.Range((1,1),(bttm, lstcolumn))
wholeRangeVal = wholeRange.value
#Sets the font and deletes previous conditional formatting
wholeRange.api.Font.Name = "Arial Unicode MS"
wholeRange.api.FormatConditions.Delete()
for columnHeader in columnDict.keys():
    if columnHeader in CharLimits_Fields.keys():
        rng, cellRef = setRange(columnDict, columnHeader)
        rng.api.FormatConditions.Add(2,3, "=len(" + cellRef + ") >=" + str(CharLimits_Fields[columnHeader]))
        rng.api.FormatConditions(1).Interior.ColorIndex = 3
    if columnHeader in BreakTagNotAllowed:
        rng, cellRef = setRange(columnDict, columnHeader)
        rng.api.FormatConditions.Add(2,3, '=OR(ISNUMBER(SEARCH("<br>",' + cellRef + ')), ISNUMBER(SEARCH("<br/>",' + cellRef + ")))")
        rng.api.FormatConditions(2).Interior.ColorIndex = 6
searchResults = wholeRange.api.Find("~\"")
if searchResults is not None:
    msg_box("There's a double quote in this spreadsheet")
else:
    msg_box("There are no double quotes in this spreadsheet")
# app.api.FindFormat.Clear
# app.api.FindFormat.Interior.ColorIndex = 3
# foundRed = wholeRange.api.Find("*", SearchFormat=True)
# if foundRed is None:
#     msg_box("There are no values exceeding character limits")
# else:
#     msg_box("There are values exceeding character limits")
# app.api.FindFormat.Clear
# app.api.FindFormat.Interior.ColorIndex = 6
# foundYellow = wholeRange.api.Find("*", SearchFormat=True)
# if foundYellow is None:
#     msg_box("There are no break tags in this spreadsheet")
# else:
#     msg_box("There are break tags in this spreadsheet")

Note:
If you are reading this, I would try Santiago's solution first.
The issue:
Looking at what is likely at line 22 in the package on GitHub:
async def concat_async(async_gen):
This uses the async keyword, which was added in Python 3.5; py2exe, however, only supports up to Python 3.4. Jinja2 itself only imports its async modules when the running interpreter supports the syntax, so they are harmless in normal use. But py2exe tries to byte-compile every file in the package, including this one, and fails on it.
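To see whether a given interpreter can compile that line at all, a check along these lines can be run with the same Python that py2exe uses. This is only an illustration: supports_async_def is a name made up here, and the function body is reduced to pass.

```python
import sys

# The signature from line 22, with a placeholder body.
ASYNC_SOURCE = "async def concat_async(async_gen):\n    pass\n"

def supports_async_def():
    """Return True if this interpreter can compile an `async def` statement."""
    try:
        compile(ASYNC_SOURCE, "<async-check>", "exec")
    except SyntaxError:
        return False
    return True

print(sys.version_info[:2], supports_async_def())
```

On an interpreter older than 3.5 (including the Python 2 that ships with Anaconda2) this reports False, which is the same SyntaxError py2exe runs into when it compiles asyncsupport.py.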
The Fix:
Async support was added in Jinja2 version 2.9 according to the documentation, so I installed an earlier version (2.8), which I downloaded as a tar.gz source archive.
I made a backup of my current Jinja2 installation by moving the contents of %PYTHONHOME%\Lib\site-packages\jinja2 somewhere else.
Then I extracted the downloaded tar.gz file and installed the package from its setup.py:
cd .\Downloads\dist\Jinja2-2.8 # or wherever you extracted jinja2.8
python setup.py install
As a side note, I also had to increase my recursion limit because py2exe was reaching the default limit.
from distutils.core import setup
import py2exe
import sys
sys.setrecursionlimit(5000)
setup (console=['test.py'])
Warning:
If whatever it is you are using relies on the latest version of jinja2, then this might fail or have unintended side effects when actually running your code. I was compiling a very simple script.

I had the same trouble with Python 3.7. I fixed it by adding the excludes argument to the Analysis call in my spec file (note that Analysis belongs to PyInstaller rather than py2exe):
a = Analysis(['pyinst_test.py'],
             #...
             excludes=['jinja2.asyncsupport','jinja2.asyncfilters'],
             #...
             )
I took that from: https://github.com/pyinstaller/pyinstaller/issues/2393
