Unable to insert 2d array within gspread - python

I'm trying to insert a 2D array in order to get two columns inserted into a sheet via gspread. I'm able to insert the individual lists fine, but inserting the array causes an error. Here's my code.
def megaDepotScrape():
    listings = 0
    priceList = []
    skuList = []
    # Iterate through the listings on the page, printing the price per entry
    for listings in range(0, 12):
        # Connect to the site to be scraped
        siteURL = "https://megadepot.com/catalog/lab-equipment/multiwell-plates/brand:brandtech/"
        response = requests.get(siteURL, headers=headers)
        # with open('brandtech.html', 'wb') as fp:
        #     fp.write(response.content)
        # Cook the soup
        html_soup = BeautifulSoup(response.text, 'html.parser')
        # Find all containers with the appropriate class name
        # The 'strong' class 'hot' contains the price information
        price_containers = html_soup.find_all("strong", class_="hot")
        price = price_containers[listings]
        priceStr = list(price)
        priceList.append(priceStr)
        # Find all containers for the appropriate class name
        # The 'div' class 'product-wrapper' contains the SKU
        sku_containers = html_soup.find_all("div", class_="product-wrapper")
        sku = sku_containers[listings]
        # The sku is stored in the 'data-variant' of the 'article' tag
        for data in sku.find_all("article"):
            skuData = data["data-variant"]
            skuList.append(skuData)
        # Iterate through the loop
        listings += 1
    # Write both lists to the sheets document
    # Reference update() in docs
    rows = [priceList, skuList]
    print(rows)
    #sheet.update('A1', [list(e) for e in zip(*rows)])
    sheet.insert_row(skuList)
Here's the value of rows
[[['$81.57'], ['$80.91'], ['$91.63'], ['$91.63'], ['$455.20'], ['$196.90'], ['$282.60'], ['$146.10'], ['$97.22'], ['$166.70'], ['$287.30'], ['$237.50']], ['781411', '781415', '781412', '781416', '701355', '701330', '701346', '701352', '782153', '701354', '781347', '781345']]
And here's the error I get
sheet.update('A1', [list(e) for e in zip(*rows)])
  File "C:\Users\Jacob\PythonTestProject\venv\lib\site-packages\gspread\utils.py", line 592, in wrapper
    return f(*args, **kwargs)
  File "C:\Users\Jacob\PythonTestProject\venv\lib\site-packages\gspread\models.py", line 1127, in update
    {'values': values, 'majorDimension': kwargs['major_dimension']}
  File "C:\Users\Jacob\PythonTestProject\venv\lib\site-packages\gspread\models.py", line 236, in values_update
    r = self.client.request('put', url, params=params, json=body)
  File "C:\Users\Jacob\PythonTestProject\venv\lib\site-packages\gspread\client.py", line 76, in request
    raise APIError(response)
gspread.exceptions.APIError: {'code': 400, 'message': 'Invalid values[0][0]: list_value {\n  values {\n    string_value: "$81.57"\n  }\n}\n', 'status': 'INVALID_ARGUMENT'}
I'm not sure whether there's some kind of limit I'm hitting from uploading so much, or whether I'm making some other error. Please let me know. Thank you.

I believe your goal is as follows. From the sample value of rows you posted:
[[['$81.57'], ['$80.91'], ['$91.63'], ['$91.63'], ['$455.20'], ['$196.90'], ['$282.60'], ['$146.10'], ['$97.22'], ['$166.70'], ['$287.30'], ['$237.50']], ['781411', '781415', '781412', '781416', '701355', '701330', '701346', '701352', '782153', '701354', '781347', '781345']]
I understand that priceList and skuList hold the following values:
priceList = [['$81.57'], ['$80.91'], ['$91.63'], ['$91.63'], ['$455.20'], ['$196.90'], ['$282.60'], ['$146.10'], ['$97.22'], ['$166.70'], ['$287.30'], ['$237.50']]
skuList = ['781411', '781415', '781412', '781416', '701355', '701330', '701346', '701352', '782153', '701354', '781347', '781345']
You want to put the values of priceList and skuList into two columns.
Modification points:
In this case, the values need to be an array with one inner list per row:
[["a1", "b1"], ["a2", "b2"], ...]
When you want to insert several rows with 2 columns, you can use insert_rows().
When this is reflected in your script, it becomes as follows.
Sample script:
client = gspread.authorize(credentials)
spreadsheetId = "###" # Please set the Spreadsheet ID.
sheetName = "Sheet1" # Please set the sheet name you want to put the values.
spreadsheet = client.open_by_key(spreadsheetId)
sheet = spreadsheet.worksheet(sheetName)
# These values are from your question.
priceList = [['$81.57'], ['$80.91'], ['$91.63'], ['$91.63'], ['$455.20'], ['$196.90'], ['$282.60'], ['$146.10'], ['$97.22'], ['$166.70'], ['$287.30'], ['$237.50']]
skuList = ['781411', '781415', '781412', '781416', '701355', '701330', '701346', '701352', '782153', '701354', '781347', '781345']
# The modified part of the script:
row = [[e1[0], e2] for e1, e2 in zip(priceList, skuList)]
print(row) # You can confirm the value of "row".
sheet.insert_rows(row)
When you run this script, the values of priceList and skuList are written to columns "A" and "B" of "Sheet1".
References:
insert_rows(values, row=1, value_input_option='RAW')
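As a side note, a minimal sketch of the update() route from the commented-out line in the question, assuming the same authorized sheet object and the update(range_name, values=None) signature documented below: once the rows are reshaped the same way, update() works as well.
# Sketch: the same 2-column write via update() instead of insert_rows()
# (assumes the update(range_name, values) signature referenced below).
row = [[e1[0], e2] for e1, e2 in zip(priceList, skuList)]
sheet.update("A1", row)
update(range_name, values=None, **kwargs)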

Why are you using insert_row as if it were insert_cols? :D
If you want to insert values into one row just do
wks.insert_row(priceList, index=10, value_input_option='RAW')
If you want to insert values into one column just do
wks.insert_cols(values=[priceList], col=2, value_input_option='RAW')
Note that
priceList = ['781411', '781415', '781412', '781416']
is shaped like
priceList = ['row1', 'row2', 'row3', 'row4']
and
priceList = [['781411'], ['781415'], ['781412'], ['781416']]
is shaped like
priceList = [['col1'], ['col2'], ['col3'], ['col4']]
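To make the shape difference concrete, a quick sketch assuming an authorized worksheet wks (the values here are hypothetical):
wks.insert_row(['a', 'b', 'c'], index=1)   # one row: values land in A1, B1, C1
wks.insert_cols([['a', 'b', 'c']], col=2)  # one column: values land in B1, B2, B3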

Related

Push a python dataframe to Smartsheet using Smartsheet API

I have a Python script that fetches data from the Meraki dashboard through its API. The data is stored in a dataframe, which needs to be pushed to a Smartsheet using the Smartsheet API integration. I've searched the Smartsheet API documentation but couldn't find a solution to the problem. Has anyone worked on this kind of use case before, or does anyone know a script to push a simple dataframe to Smartsheet?
The code is something like this:
for device in list_of_devices:
    try:
        dict1 = {'Name': [device['name']],
                 'Serial_No': [device['serial']],
                 'MAC': [device['mac']],
                 'Network_Id': [device['networkId']],
                 'Product_Type': [device['productType']],
                 'Model': [device['model']],
                 'Tags': [device['tags']],
                 'Lan_Ip': [device['lanIp']],
                 'Configuration_Updated_At': [device['configurationUpdatedAt']],
                 'Firmware': [device['firmware']],
                 'URL': [device['url']]
                 }
    except KeyError:
        dict1['Lan_Ip'] = "NA"
    temp = pd.DataFrame.from_dict(dict1)
    alldata = alldata.append(temp)
alldata.reset_index(drop=True, inplace=True)
The dataframe ("alldata") looks something like this:
Name Serial_No MAC \
0 xxxxxxxxxxxxxxxx xxxxxxxxxxxxxx xxxxxxxxxxxxxxxxx
1 xxxxxxxxxxxxxxxx xxxxxxxxxxxxxx xxxxxxxxxxxxxxxxx
2 xxxxxxxxxxxxxxxx xxxxxxxxxxxxxx xxxxxxxxxxxxxxxxx
The dataframe has around 1000 rows and 11 columns.
I've tried pushing this dataframe with code similar to what's mentioned in the comments, but I'm getting a "Bad Request" error.
smart = smartsheet.Smartsheet(access_token='xxxxxxxx')
sheet_id = xxxxxxxxxxxxx
sheet = smart.Sheets.get_sheet(sheet_id)
column_map = {}
for column in sheet.columns:
    column_map[column.title] = column.id
data_dict = alldata.to_dict('index')
rowsToAdd = []
for i, row in data_dict.items():
    new_row = smart.models.Row()
    new_row.to_top = True
    for k, v in row.items():
        new_cell = smart.models.Cell()
        new_cell.column_id = column_map[k]
        new_cell.value = v
        new_row.cells.append(new_cell)
    rowsToAdd.append(new_row)
result = smart.Sheets.add_rows(sheet_id, rowsToAdd)
{"response": {"statusCode": 400, "reason": "Bad Request", "content": {"detail": {"index": 0}, "errorCode": 1012, "message": "Required object attribute(s) are missing from your request: cell.value.", "refId": "1ob56acvz5nzv"}}}
[Screenshot: the Smartsheet sheet the data must be pushed to]
The following code adds data from a dataframe to a sheet in Smartsheet; this should be enough to at least get you started. If you still can't get the desired result using this code, please update your original post to include the code you're using, the outcome you want, and a detailed description of the issue you encountered. (Add a comment to this answer if you update your original post, so I'll be notified and know to look.)
# target sheet
sheet_id = 3932034054809476
sheet = smartsheet_client.Sheets.get_sheet(sheet_id)
# translate column names to column id
column_map = {}
for column in sheet.columns:
    column_map[column.title] = column.id
df = pd.DataFrame({'item_id': [111111, 222222],
                   'item_color': ['red', 'yellow'],
                   'item_location': ['office', 'kitchen']})
data_dict = df.to_dict('index')
rowsToAdd = []
# each object in data_dict represents 1 row of data
for i, row in data_dict.items():
    # create a new row object
    new_row = smartsheet_client.models.Row()
    new_row.to_top = True
    # for each key-value pair, create & add a cell to the row object
    for k, v in row.items():
        # create the cell object and populate with value
        new_cell = smartsheet_client.models.Cell()
        new_cell.column_id = column_map[k]
        new_cell.value = v
        # add the cell object to the row object
        new_row.cells.append(new_cell)
    # add the row object to the collection of rows
    rowsToAdd.append(new_row)
# add the collection of rows to the sheet in Smartsheet
result = smartsheet_client.Sheets.add_rows(sheet_id, rowsToAdd)
UPDATE #1 - re Bad Request error
It seems like the error you've described in your first comment below is perhaps caused by some of the cells in your dataframe not having a value. When you add a new row using the Smartsheet API, each cell that's specified for the row must specify a value; otherwise you'll get the Bad Request error you've described. Maybe try adding an if statement inside the for loop to skip adding the cell if the value of v is None?
for k, v in row.items():
    # skip adding this cell if there's no value
    if v is None:
        continue
    ...
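A fuller sketch of the same idea. Note the added assumption (not confirmed by the original question) that pandas may have converted missing values to NaN, which an `is None` check misses, so pd.isna() is used as well:
import pandas as pd

for k, v in row.items():
    # skip cells with no usable value (None, or a pandas NaN)
    if v is None or pd.isna(v):
        continue
    new_cell = smartsheet_client.models.Cell()
    new_cell.column_id = column_map[k]
    new_cell.value = v
    new_row.cells.append(new_cell)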
UPDATE #2 - re further troubleshooting
In response to your second comment below: you'll need to debug further using the data in your dataframe, as I'm unable to repro the issue you describe using other data.
To simplify things, I'd suggest starting the debugging with just one item in the dataframe. You can do so by adding a break statement at the end of the for loop that builds the dict; that way, only the first device will be added.
for device in list_of_devices:
    try:
        ...
    except KeyError:
        dict1['Lan_Ip'] = "NA"
    temp = pd.DataFrame.from_dict(dict1)
    alldata = alldata.append(temp)
    # break out of loop after one item is added
    break
alldata.reset_index(drop=True, inplace=True)
# print dataframe contents
print(alldata)
If you get the same error when testing with just one item, and can't see what it is about that data (or the way it's stored in your dataframe) that's causing the Smartsheet error, then add a print(alldata) statement after the for loop (as shown in the snippet above) and update your original post again to include the output of that statement (changing any sensitive data values, of course) -- then I can try to repro and troubleshoot using that data.
UPDATE #3 - repro'd issue
Okay, so I've reproduced the error you've described -- by specifying None as the value of a field in the dict.
The following code successfully inserts two new rows into Smartsheet -- because every field in each dict it builds contains a (non-None) value. (For simplicity, I'm manually constructing two dicts in the same manner as you do in your for loop.)
# target sheet
sheet_id = 37558492129156
sheet = smartsheet_client.Sheets.get_sheet(sheet_id)
# translate column names to column id
column_map = {}
for column in sheet.columns:
    column_map[column.title] = column.id
#----
# start: repro SO question's building of dataframe
#----
alldata = pd.DataFrame()
dict1 = {'Name': ['name1'],
         'Serial_No': ['serial_no1'],
         'MAC': ['mac1'],
         'Network_Id': ['networkId1'],
         'Product_Type': ['productType1'],
         'Model': ['model1'],
         'Tags': ['tags1'],
         'Lan_Ip': ['lanIp1'],
         'Configuration_Updated_At': ['configurationUpdatedAt1'],
         'Firmware': ['firmware1'],
         'URL': ['url1']
         }
temp = pd.DataFrame.from_dict(dict1)
alldata = alldata.append(temp)
dict2 = {'Name': ['name2'],
         'Serial_No': ['serial_no2'],
         'MAC': ['mac2'],
         'Network_Id': ['networkId2'],
         'Product_Type': ['productType2'],
         'Model': ['model2'],
         'Tags': ['tags2'],
         'Lan_Ip': ['lanIp2'],
         'Configuration_Updated_At': ['configurationUpdatedAt2'],
         'Firmware': ['firmware2'],
         'URL': ['URL2']
         }
temp = pd.DataFrame.from_dict(dict2)
alldata = alldata.append(temp)
alldata.reset_index(drop=True, inplace=True)
#----
# end: repro SO question's building of dataframe
#----
data_dict = alldata.to_dict('index')
rowsToAdd = []
# each object in data_dict represents 1 row of data
for i, row in data_dict.items():
    # create a new row object
    new_row = smartsheet_client.models.Row()
    new_row.to_top = True
    # for each key-value pair, create & add a cell to the row object
    for k, v in row.items():
        # create the cell object and populate with value
        new_cell = smartsheet_client.models.Cell()
        new_cell.column_id = column_map[k]
        new_cell.value = v
        # add the cell object to the row object
        new_row.cells.append(new_cell)
    # add the row object to the collection of rows
    rowsToAdd.append(new_row)
result = smartsheet_client.Sheets.add_rows(sheet_id, rowsToAdd)
However, running the following code (where the value of the URL field in the second dict is set to None) results in the same error you've described:
{"response": {"statusCode": 400, "reason": "Bad Request", "content": {"detail": {"index": 1}, "errorCode": 1012, "message": "Required object attribute(s) are missing from your request: cell.value.", "refId": "dw1id3oj1bv0"}}}
Code that causes this error (identical to the successful code above, except that the URL field in the second dict is None):
dict2 = {'Name': ['name2'],
         'Serial_No': ['serial_no2'],
         'MAC': ['mac2'],
         'Network_Id': ['networkId2'],
         'Product_Type': ['productType2'],
         'Model': ['model2'],
         'Tags': ['tags2'],
         'Lan_Ip': ['lanIp2'],
         'Configuration_Updated_At': ['configurationUpdatedAt2'],
         'Firmware': ['firmware2'],
         'URL': [None]
         }
Everything else in the script is unchanged from the successful version above.
Finally, note that the error message I received contains {"index": 1}. This implies that the value of index in the error message indicates the (zero-based) index of the problematic row. The fact that your error message contains {"index": 0} implies that there's a problem with the data in the first row you're trying to add to Smartsheet (i.e., the first item in the dataframe). Therefore, following the troubleshooting guidance in Update #2 above should let you closely examine the data for that first item/row and hopefully spot the problematic (missing) value.
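As a hypothetical debugging aid (not part of the original answer), you could print the dataframe row that the error's index points at:
bad_index = 0                          # the "index" value from the error's "detail" object
print(alldata.iloc[bad_index])         # raw values of the offending row
print(alldata.iloc[bad_index].isna())  # True flags cells with missing values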

How to check whether an item exists and update only its value in Python gspread

How do I check whether an item already exists and, if so, update only its value; if it doesn't exist, how do I add a new one?
For example:
I already have item 1 and value 1 in my sheet. When I get a new value for item 1, I want to update value 1 only. When I get a new item 2 and value 2, I want to add them as a new entry.
I don't know how to write this; I've searched for a long time but couldn't find anything. Could anyone help me? Many thanks!
The script below does the following steps:
first, check my Gmail inbox for keyword 1
second, use the keyword to search for data on a website (BeautifulSoup module)
last, upload the data to a Google Sheet (gspread module)
def Check_emailbox(box='Inbox', lab='SUBJECT', title='[PASS]'):
    global email_content, report_info1, my_msg, report_info
    dirpath = 'XXX'
    with open(dirpath) as act:
        content = act.read()
    my_act = yaml.load(content, Loader=yaml.FullLoader)
    user, password = my_act['user'], my_act['password']
    imapUrl = 'imap.gmail.com'
    my_mail = imaplib.IMAP4_SSL(imapUrl)
    my_mail.login(user, password)
    print('Login gmail account success.')
    my_mail.select(box)
    key = lab
    value = title
    _, data = my_mail.search(None, key, value)
    mail_id_list = data[0].split()
    msg_id = mail_id_list[-1]
    res, data = my_mail.fetch(msg_id, '(RFC822)')
    report_info = []
    if res == 'OK':
        raw_msg_txt = data[0][1]
        try:
            my_msg = email.message_from_bytes(raw_msg_txt)
            print('Subject: ', my_msg['subject'])
            print('From: ', my_msg['from'])
            print('Time: ', my_msg['date'])
            print('------------------------------------------------------------------------------------')
            print('Content:')
            for part in my_msg.walk():
                email_content = part.get_payload()
                report_info.append(email_content)
            report_info1 = ''.join('%s' % id for id in report_info)
            print(report_info1, type(report_info1))
            # print('Hide info, if want to see detail, unmark previous code')
            print('------------------------------------------------------------------------------------')
            # my_mail.store(msg_id, '-FLAGS', '\SEEN')
        except AttributeError:
            my_msg = email.message_from_string(raw_msg_txt)
            print('AttributeError: ', my_msg)
    return email_content, my_msg, report_info, report_info1

Check_emailbox()

keyName = re.findall(r'Daily Report : (.*?)$', report_info1)
fwName = ''.join(keyName)
print(fwName)
# ↑ This data will be uploaded to the sheet, and is the main item to check:
# if "fwName" exists, renew only the data below; if not, add a new entry in the next row.
fwVersion = ''.join(re.findall(r'\d-(.*?)-', fwName)).rsplit('.', 1)[0]
print(fwVersion)
# connect to the website and use beautifulsoup
ele = requests.get('XXXXXX')
felement = BeautifulSoup(ele.text, 'html.parser')
# print(felement.prettify())
fwinfo = felement.find(['a'], text=fwName)
fwhref = fwinfo.get('href')
print('Info: ', fwinfo)
print(fwhref)
rowid = ''.join(re.findall(r'data/(.*?)$', fwhref))
print('Download id is: ', rowid)
fwlink = 'XXXXXXXXX' + rowid
print('Download link: ', fwlink)
json_key = "XXXXXXX"
spread_url = ['https://spreadsheets.google.com/feeds']
connect_auth = SAC.from_json_keyfile_name(json_key, spread_url)
google_sheets = gspread.authorize(connect_auth)
sheet = google_sheets.open_by_key('XXXXXXXXX').worksheet('Pass Data')
upload = [fwName, fwVersion, rowid, fwlink]
sheet.append_row(upload)
print('==== Upload to Google Sheet Done. ====')
In your situation, how about the following modification?
Modified script:
In this case, please use your google_sheets.
# Please set your values here.
fwName = "###"
fwVersion = "###"
rowid = "###"
fwlink = "###"

sheet = google_sheets.open_by_key('XXXXXXXXX').worksheet("Pass Data")
values = sheet.get_all_values()[2:]
obj = {}
for i, r in enumerate(values):
    obj[r[0]] = i + 3
if obj.get(fwName):
    sheet.update("B" + str(obj.get(fwName)), [[fwVersion, rowid, fwlink]], value_input_option="USER_ENTERED")
When this script is run, the existing values are first retrieved from the sheet. Then, by searching column "A" for fwName, the new values are written to the matching row.
Note:
I prepared this modified script using your sample image. In your sample image, the first 2 rows are header rows, and the search column is column "A"; the script relies on both. So if you change your Spreadsheet layout, this script might not work as-is. Please be careful about this.
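As a sketch of the "add if not found" branch (an assumption; the modified script above only shows the update path), append_row() could handle the case where fwName is not yet in column "A":
if obj.get(fwName):
    sheet.update("B" + str(obj.get(fwName)), [[fwVersion, rowid, fwlink]], value_input_option="USER_ENTERED")
else:
    # fwName was not found in column "A": add a whole new row instead
    sheet.append_row([fwName, fwVersion, rowid, fwlink], value_input_option="USER_ENTERED")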
References:
update(range_name, values=None, **kwargs)
get_all_values(**kwargs)

How to create pandas dataframe from Twitter Search API?

I am working with the Twitter Search API which returns a dictionary of dictionaries. My goal is to create a dataframe from a list of keys in the response dictionary.
Example of API response here: Example Response
I have a list of keys within the Statuses dictionary
keys = ["created_at", "text", "in_reply_to_screen_name", "source"]
I would like to loop through each key value returned in the Statuses dictionary and put them in a dataframe with the keys as the columns.
I currently have code that loops through a single key, assigns the values to a list, and appends the list to a dataframe, but I want a way to handle more than one key at a time. Current code below:
# w is the word to be queried
w = 'keyword'
# count of tweets to return
count = 1000
# API call
query = twitter.search.tweets(q=w, count=count)

def data_l2(q, k1, k2):
    data = []
    for results in q[k1]:
        data.append(results[k2])
    return data

screen_names = data_l3(query, "statuses", "user", "screen_name")
data = {'screen_names': screen_names,
        'tweets': tweets}
frame = pd.DataFrame(data)
frame
I will share a more generic solution that I came up with while working with the Twitter API. Let's say you have the IDs of the tweets that you want to fetch in a list called my_ids:
# Fetch tweets from the Twitter API using the following loop:
list_of_tweets = []
# Tweets that can't be found are saved in the list below:
cant_find_tweets_for_those_ids = []
for each_id in my_ids:
    try:
        list_of_tweets.append(api.get_status(each_id))
    except Exception as e:
        cant_find_tweets_for_those_ids.append(each_id)
Then in this code block we isolate the json part of each tweepy status object that we have downloaded and we add them all into a list....
my_list_of_dicts = []
for each_json_tweet in list_of_tweets:
    my_list_of_dicts.append(each_json_tweet._json)
...and we write this list into a txt file:
with open('tweet_json.txt', 'w') as file:
    file.write(json.dumps(my_list_of_dicts, indent=4))
Now we are going to create a DataFrame from the tweet_json.txt file (I have added some keys that were relevant to my use case that I was working on, but you can add your specific keys instead):
my_demo_list = []
with open('tweet_json.txt', encoding='utf-8') as json_file:
    all_data = json.load(json_file)
    for each_dictionary in all_data:
        tweet_id = each_dictionary['id']
        whole_tweet = each_dictionary['text']
        only_url = whole_tweet[whole_tweet.find('https'):]
        favorite_count = each_dictionary['favorite_count']
        retweet_count = each_dictionary['retweet_count']
        created_at = each_dictionary['created_at']
        whole_source = each_dictionary['source']
        only_device = whole_source[whole_source.find('rel="nofollow">') + 15:-4]
        source = only_device
        retweeted_status = each_dictionary.get('retweeted_status', 'Original tweet')
        if retweeted_status == 'Original tweet':
            url = only_url
        else:
            retweeted_status = 'This is a retweet'
            url = 'This is a retweet'
        my_demo_list.append({'tweet_id': str(tweet_id),
                             'favorite_count': int(favorite_count),
                             'retweet_count': int(retweet_count),
                             'url': url,
                             'created_at': created_at,
                             'source': source,
                             'retweeted_status': retweeted_status,
                             })

tweet_json = pd.DataFrame(my_demo_list, columns=['tweet_id', 'favorite_count',
                                                 'retweet_count', 'created_at',
                                                 'source', 'retweeted_status', 'url'])
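For the original question's narrower goal (several keys from the statuses dictionary at once), here is a minimal sketch, assuming query is the dict returned by twitter.search.tweets:
import pandas as pd

keys = ["created_at", "text", "in_reply_to_screen_name", "source"]
frame = pd.DataFrame([{k: status.get(k) for k in keys}
                      for status in query["statuses"]])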

Get info from a table on a website where XPATH varies on each site, Python

If you take this website as an example:
http://gbgfotboll.se/information/?scr=table&ftid=51168
I am using this code to get information from the second table:
for url in urlList:
    request = net.Request(url)
    response = net.urlopen(request)
    data = response.read()
    dom = lxml.html.parse(BytesIO(data))
    xpatheval = etree.XPathDocumentEvaluator(dom)
    # all table rows
    rows = xpatheval('//div[@id="content-primary"]/table[2]/tbody/tr')
    divName = xpatheval('//*[@id="content-primary"]/h1//text()')[0]
    trash, divisionName = divName.rsplit("- ")
    dict[divisionName] = {}
    for id, row in enumerate(rows):
        columns = row.findall("td")
        teamName = columns[0].find("a").text  # Lag
        print teamName
        playedGames = columns[1].text  # S
        wins = columns[2].text
        draw = columns[3].text
        lost = columns[4].text
        dif = columns[6].text  # GM-IM
        points = columns[7].text  # P - last column
        dict[divisionName].update({id: {"teamName": teamName, "playedGames": playedGames, "wins": wins, "draw": draw, "lost": lost, "dif": dif, "points": points}})
For that website, the rows XPath uses table[2].
For this website:
http://gbgfotboll.se/serier/?scr=table&ftid=57108
the rows XPath would instead need to look like this:
rows = xpatheval('//div[@id="content-primary"]/table[1]/tbody/tr')
So what I am asking is: is there a way to get the information I need regardless of which table index the table appears at?
One way to do it would be to select the table by its class attribute (all 3 classes are required):
xpatheval('//div[@id="content-primary"]/table[@class="clCommonGrid clTblStandings clTblWithFullToggle"]/tbody/tr')
An alternative would be to select a child element in that table that you know is only present in that specific type of table. For example, the GM-IM header is quite specific to that type of table, so I navigate to it and then work my way up the tree to end up with the same rows as you:
xpatheval('//div[@id="content-primary"]//tr[th="GM-IM"]/../../tbody/tr')
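A self-contained sketch of the class-based approach, assuming lxml is installed and the page is still reachable:
import lxml.html

dom = lxml.html.parse("http://gbgfotboll.se/information/?scr=table&ftid=51168")
rows = dom.xpath('//div[@id="content-primary"]'
                 '/table[@class="clCommonGrid clTblStandings clTblWithFullToggle"]'
                 '/tbody/tr')
print(len(rows))  # number of standings rows found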

export list to csv and present to user via browser

Want to prompt browser to save csv
^^ Working off the above question: the file is exporting correctly, but the data is not displaying correctly.
@view_config(route_name='csvfile', renderer='csv')
def csv(self):
    name = DBSession.query(table).join(othertable).filter(othertable.id == 9701).all()
    header = ['name']
    rows = []
    for item in name:
        rows = [item.id]
    return {
        'header': header,
        'rows': rows
    }
I'm getting _csv.Error: sequence expected. But if I change writer.writerows(value['rows']) to writer.writerow(value['rows']) in my renderer, the file downloads via the browser just fine. The problem is that the data isn't displayed one record per row: the entire result set ends up in one row, so each entry is in its own column rather than its own row.
First, I wonder if having a return statement inside your for loop isn't also causing problems; from the linked example, it looks like their loop sat in the prior statement.
I think what your query is doing is building a collection of rows based on "table" having columns with the same name as the headers. What are the fields in your table table?
name = DBSession.query(table).join(othertable).filter(othertable.id == 9701).all()
This is going to give you back essentially a collection of rows from table, as if you did a SELECT query on it.
Something like
name = DBSession.query(table).join(othertable).filter(othertable.id == 9701).all()
header = ['name']
rows = []
for item in name:
    rows.append(item.name)
return {
    'header': header,
    'rows': rows
}
Figured it out. I kept getting Error: sequence expected, so I looked at the output and decided to try putting each result inside another list.
@view_config(route_name='csv', renderer='csv')
def csv(self):
    d = datetime.now()
    query = DBSession.query(table, othertable).join(othertable).join(thirdtable).filter(
        thirdtable.sid == 9701)
    header = ['First Name', 'Last Name']
    rows = []
    filename = "csvreport" + d.strftime(" %m/%d").replace(' 0', '')
    for i in query:
        items = [i.table.first_name, i.table.last_name, i.othertable.login_time.strftime("%m/%d/%Y"),
                 ]
        rows.append(items)
    return {
        'header': header,
        'rows': rows,
        'filename': filename
    }
This accomplishes 3 things: it fills out the header, fills the rows, and passes through a filename.
The renderer should look like this:
class CSVRenderer(object):
    def __init__(self, info):
        pass

    def __call__(self, value, system):
        fout = StringIO.StringIO()
        writer = csv.writer(fout, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        writer.writerow(value['header'])
        writer.writerows(value['rows'])
        resp = system['request'].response
        resp.content_type = 'text/csv'
        resp.content_disposition = 'attachment;filename=' + value['filename'] + '.csv'
        return fout.getvalue()
This way, you can use the same CSV renderer anywhere else and pass through your own filename. It's also the only way I could figure out to get the data from one column in the database to iterate through one column in the renderer. It feels a bit hacky, but it works, and works well.
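For completeness, a sketch of wiring the renderer into Pyramid (assuming a standard Configurator-based setup; the names here are illustrative):
from pyramid.config import Configurator

def main(global_config, **settings):
    config = Configurator(settings=settings)
    config.add_renderer('csv', CSVRenderer)  # makes renderer='csv' available to views
    config.scan()
    return config.make_wsgi_app()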
