After running the program:
results = smart.Search.search("2244113312180")
print(results)
I get the following data:
{"results":
[{"contextData": ["2244113312180"],
"objectId": 778251154810756,
"objectType": "row",
"parentObjectId": 3648397300262788,
"parentObjectName": "Sample Sheet",
"parentObjectType": "sheet",
"text": "2244113312180"},
{"contextData": ["2244113312180"],
"objectId": 7803446734415748,
"objectType": "row",
"parentObjectId": 3648397300262788,
"parentObjectName": "Sample Sheet",
"parentObjectType": "sheet",
"text": "2244113312180"}],
"totalCount": 2}
How do I use them correctly in my program?
Please provide a correct usage example.
And how can I find out the column ID in which the value "2244113312180" was found?
new_row = smartsheet.models.Row()
new_row.id = results.objectId
Sorry, I didn't include the error right away. I can't access the properties of the results. The line:
new_row.id = results.objectId
causes an error:
AttributeError: 'SearchResult' object has no attribute 'objectId'
Thank you for any help!
P.S. I found a way to do it.
results = smart.Search.search("2244113312180")
text = str(results)
json_op = json.loads(text)
for i in json_op["results"]:
    new_row = smartsheet.models.Row()
    new_row.id = i["objectId"]
I don't know if this is a good solution or not.
According to the SearchResultItem Object definition in the Smartsheet API docs, a search result item will never contain information about the column where a value exists. As the result JSON you've posted shows, if the specified value is found within the row of a sheet (i.e., in any of the cells that row contains), the corresponding search result item will identify the sheet ID (parentObjectId) and the row ID (objectId).
You can then use those two values to retrieve the row, as described in the Get Row section of the docs:
row = smartsheet_client.Sheets.get_row(
    4583173393803140,       # sheet_id
    2361756178769796        # row_id
)
Then you can iterate through the row.cells array, checking the value property of each cell to determine if it matches the value you searched for previously. When you find a cell object that contains that value, the column_id property of that cell object will give you the column ID where the matching value exists.
UPDATE:
Thanks for the clarifying info in your original post. I'm updating this answer to provide a complete code sample that implements the approach I described previously. Hope this is helpful!
This code sample does the following:
searches everything in Smartsheet (that the holder of the API token being used has access to) for a string value
iterates through search result items to process any "row" results (i.e., anywhere that the string appears within the cells of a sheet)
replaces any occurrences within (the cells of) a sheet with the string "new value"
# set search criteria
query = '2244113312180'

# search everything
search_results = smart.Search.search(query)

# loop through results
# (acting upon only search results that appear within a row of a sheet)
for item in search_results.results:
    if item.object_type == 'row':
        # get row
        row = smart.Sheets.get_row(
            item.parent_object_id,  # sheet_id
            item.object_id          # row_id
        )
        # find the cell that contains the value and update that cell value
        for cell in row.cells:
            if cell.value == query:
                # build new cell value
                new_cell = smartsheet.models.Cell()
                new_cell.column_id = cell.column_id
                new_cell.value = "new value"
                new_cell.strict = False

                # build the row to update
                new_row = smartsheet.models.Row()
                new_row.id = item.object_id
                new_row.cells.append(new_cell)

                # update row
                result = smart.Sheets.update_rows(
                    item.parent_object_id,  # sheet_id
                    [new_row])
I'm looking for tips on how to get the data of the last row of a sheet. I've seen a solution that fetches all the data and then takes its length.
But that is of course a waste of all that fetching. I'm wondering if there is a smarter way to do it, since you can already append data to the last row+1 with worksheet.append_rows([some_data]).
I used the solution @buran mentioned. If you init the worksheet with
add_worksheet(title="title", rows=1, cols=10)
and only append new data via
worksheet.append_rows([some_array])
then @buran's suggestion is brilliant: simply use
worksheet.row_count
I found this code in another question; it appends a dummy row to the sheet.
After that, you can inspect the response to find the row's location:
def get_last_row_with_data(service, value_input_option="USER_ENTERED"):
    last_row_with_data = '1'
    try:
        # creates a dummy row
        dummy_request_append = service.spreadsheets().values().append(
            spreadsheetId='<spreadsheet id>',
            range="{0}!A:{1}".format('Tab Name', 'ZZZ'),
            valueInputOption='USER_ENTERED',
            includeValuesInResponse=True,
            responseValueRenderOption='UNFORMATTED_VALUE',
            body={
                "values": [['']]
            }
        ).execute()

        # search the dummy row
        a1_range = dummy_request_append.get('updates', {}).get('updatedRange', 'dummy_tab!a1')
        bottom_right_range = a1_range.split('!')[1]
        number_chars = [i for i in list(bottom_right_range) if i.isdigit()]
        last_row_with_data = ''.join(number_chars)
    except Exception as e:
        last_row_with_data = '1'
    return last_row_with_data
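The row-number extraction at the end of that function can be checked offline. This sketch assumes the dummy append wrote a single empty cell, so the returned updatedRange looks like `'Tab Name'!A23` (a hypothetical value for illustration):

```python
# Offline check of the row-number extraction; assumes the dummy append
# wrote a single cell, so updatedRange looks like "'Tab Name'!A23".
def row_from_updated_range(a1_range):
    bottom_right_range = a1_range.split('!')[1]  # e.g. "A23"
    number_chars = [c for c in bottom_right_range if c.isdigit()]
    return ''.join(number_chars)

print(row_from_updated_range("'Tab Name'!A23"))  # -> 23
```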
You can see a sample of Append in this documentation.
However, for me it is just easier to use:
# The ID of the sheet you are working with.
Google_sheets_ID = 'ID_of_your_Google_Sheet'

# define the start row that has data;
# it will later be replaced with the last row.
# in my test sheet, the data starts in row 2
last_row = 2

# code to get the last row
# range will be the column where the information is located
# remember to change "sheet1" to the name of your worksheet.
response = service.spreadsheets().values().get(
    spreadsheetId=Google_sheets_ID,
    range='sheet1!A1:A'
).execute()

# add the initial value where the range started to the last row with values
last_row += len(response['values']) - 1

# if you print last_row, you should see the last row with values in the sheet.
print(last_row)
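As a sanity check without calling the API: if the requested range is anchored at A1 and the column has no blank cells, the last populated row is simply the number of rows returned. The response dict below is simulated, not real API output:

```python
# Simulated values().get response for range 'sheet1!A1:A' (made-up data).
response = {"values": [["header"], ["alpha"], ["beta"], ["gamma"]]}

# With the range anchored at A1 and no gaps in the column, the last
# populated row equals the number of returned rows.
last_row = len(response["values"])
print(last_row)  # -> 4
```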
I am using the following code to search for and extract research documents on chemical compounds from PubMed. I am interested in the author, document title, abstract, etc. When I run the code I only get results for the last item in my list (see the example data in the code below). Yet when I do a manual search (i.e., one at a time), I get results from all of them.
#example data list
data = {'IUPACName': ['ethenyl(trimethoxy)silane', 'sodium;prop-2-enoate',
                      '2-methyloxirane;oxirane', '2-methylprop-1-ene;styrene',
                      'terephthalic acid', 'styrene']}
df = pd.DataFrame(data)
df_list = []

import time
from pymed import PubMed

pubmed = PubMed(tool="PubMedSearcher", email="thomas.heiman#fda.hhs.gov")
data = []

for index, row in df.iterrows():
    ## PUT YOUR SEARCH TERM HERE ##
    search_term = row['IUPACName']
    time.sleep(3)  # because I dont want to slam them with requests
    #search_term = '3-hydroxy-2-(hydroxymethyl)-2-methylpropanoic '
    results = pubmed.query(search_term, max_results=500)
    articleList = []
    articleInfo = []

    for article in results:
        # The object can be either PubMedBookArticle or PubMedArticle.
        # We need to convert it to a dictionary with the available function
        articleDict = article.toDict()
        articleList.append(articleDict)

    # Generate list of dict records which will hold all article details
    # that could be fetched from the PubMed API
    for article in articleList:
        # Sometimes article['pubmed_id'] contains a list separated with commas -
        # take the first pubmedId in that list - that's the article's pubmedId
        pubmedId = article['pubmed_id'].partition('\n')[0]
        # Append article info to dictionary
        try:
            articleInfo.append({u'pubmed_id': pubmedId,
                                u'title': article['title'],
                                u'keywords': article['keywords'],
                                u'journal': article['journal'],
                                u'abstract': article['abstract'],
                                u'conclusions': article['conclusions'],
                                u'methods': article['methods'],
                                u'results': article['results'],
                                u'copyrights': article['copyrights'],
                                u'doi': article['doi'],
                                u'publication_date': article['publication_date'],
                                u'authors': article['authors']})
        except KeyError as e:
            continue

    # Generate Pandas DataFrame from list of dictionaries
    articlesPD = pd.DataFrame.from_dict(articleInfo)
    # Add the query to the first column
    articlesPD.insert(loc=0, column='Query', value=search_term)
    df_list.append(articlesPD)

data = pd.concat(df_list, axis=1)
all_export_csv = data.to_csv(r'C:\Users\Thomas.Heiman\Documents\pubmed_output\all_export_dataframe.csv', index=None, header=True)

#Print first 10 rows of dataframe
#print(all_export_csv.head(10))
Any ideas on what I am doing wrong? Thank you!
I've just started using gspread and I'm looking for some advice on searching a sheet. I want to search for multiple strings and get the specific row where both strings exist. Both strings must match for a result (logical AND).
An example would be to search for an IP address AND a hostname. In the sheet, the IP address would be in cell A1 and the hostname in B1.
I'm using the code example below from their documentation and have tried various iterations, but I'm not having much luck.
amount_re = re.compile(r'(192.168.0.1|Gi0/0.100)')
cell = worksheet.find(amount_re)
Gspread documentation
Here is the format of the data:
192.168.0.1,Gi0/0.100
192.168.0.1,Gi0/0.200
192.168.0.1,Gi0/0.300
192.168.0.2,Gi0/0.100
As you can see there are duplicates in columns A and B, so the only way to get a unique result is to search for both, e.g.
192.168.0.1,Gi0/0.100
It needs to be in the Gspread search format though. I can't just search for the string '192.168.0.1,Gi0/0.100'
I believe your goal is as follows.
You want to search for the 2 values 192.168.0.1 and Gi0/0.100 in a sheet of a Google Spreadsheet.
The 2 values are in columns "A" and "B".
When the 2 values are found in the same row, you want to retrieve that row's values.
You want to achieve this using gspread with Python.
You have already been able to get and put values in the Google Spreadsheet using the Sheets API.
To achieve your goal, how about this answer?
I think that, unfortunately, re.compile(r'(192.168.0.1|Gi0/0.100)') cannot be used to achieve your goal. So here I would like to propose the following 2 patterns.
Pattern 1:
In this pattern, the values are searched using the Query Language. The access token obtained from the gspread authorization can be reused.
Sample script:
import csv
import io
import urllib.parse

import gspread
import requests

searchValues = ["192.168.0.1", "Gi0/0.100"]  # Please set the search values.
spreadsheet_id = "###"  # Please set the Spreadsheet ID.
sheetName = "Sheet1"  # Please set the sheet name.

client = gspread.authorize(credentials)
ss = client.open_by_key(spreadsheet_id)
ws = ss.worksheet(sheetName)
sheet_id = ws._properties['sheetId']
access_token = client.auth.token_response['access_token']

query = "select * where A='" + \
    searchValues[0] + "' and B='" + searchValues[1] + "'"
url = 'https://docs.google.com/spreadsheets/d/' + \
    spreadsheet_id + '/gviz/tq?tqx=out:csv&gid=' + \
    str(sheet_id) + '&tq=' + urllib.parse.quote(query)
res = requests.get(url, headers={'Authorization': 'Bearer ' + access_token})
ar = [row for row in csv.reader(io.StringIO(res.text), delimiter=',')]
print(ar)
In this case, when the search values are found, ar has the searched rows. When the search values are NOT found, the length of ar is 0.
In this case, the row index cannot be retrieved.
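The CSV handling at the end of Pattern 1 can be exercised without hitting the endpoint. The response text below is a made-up example of what the `tqx=out:csv` export might return for one matching row:

```python
import csv
import io

# Hypothetical body of the gviz CSV export for a single matching row.
sample_response_text = '"192.168.0.1","Gi0/0.100"\n'

ar = [row for row in csv.reader(io.StringIO(sample_response_text), delimiter=',')]
print(ar)  # -> [['192.168.0.1', 'Gi0/0.100']]
```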
Pattern 2:
In this pattern, at first, all values are retrieved from the worksheet, and the values are searched.
Sample script:
searchValues = ["192.168.0.1", "Gi0/0.100"] # Please set the search values.
spreadsheet_id = "###" # Please set the Spreadsheet ID.
sheetName = "Sheet1" # Please set the sheet name.
client = gspread.authorize(credentials)
ss = client.open_by_key(spreadsheet_id)
ws = ss.worksheet(sheetName)
values = ws.get_all_values()
ar = [{"rowIndex": i, "value": e} for i, e in enumerate(
values) if e[0] == searchValues[0] and e[1] == searchValues[1]]
print(ar)
In this case, when the search values are found, ar has the row index and values of the searched rows. When the search values are NOT found, the length of ar is 0.
In this case, the row index can be retrieved.
References:
Query Language
get_all_values()
I am trying to add a new row to an existing sheet with date columns. Some of the cells in the new row hyperlink to other sheets; this works fine. But there are other cells whose values I want to link in from cells on other sheets, and I keep getting an "attributes are not allowed for this operation" error.
There was a comment on an old Smartsheet community post indicating that cell links don't work in the 1.1 API, but we are well past that, and the 2.0 documentation implies that it should be possible.
Has anyone else seen this or solved it?
row_a.cells.append({
    'column_id': status_columns['Exp Start'],
    'value': None,
    'linkInFromCell': {
        'columnID': project_columns['Start'],
        'rowID': project_rows[1],
        'sheetID': map_of_sheets[this_project]},
})
The value property must be set to an ExplicitNull (so that it is serialized as null in the JSON body), like this:
# build the cell link first; the source_* IDs are placeholders for the
# sheet/row/column you want to link in from
cell_link = smart.models.CellLink()
cell_link.sheet_id = source_sheet_id
cell_link.row_id = source_row_id
cell_link.column_id = source_column_id
cell = smart.models.Cell()
cell.column_id = col_id
cell.link_in_from_cell = cell_link
cell.value = smart.models.ExplicitNull()
row = smart.models.Row()
row.id = added_row.id
row.cells.append(cell)
action = smart.Sheets.update_rows(sheet.id, [row])
Check out test_regression.py in the tests/integration folder, test case test_link_in_from_cell shows the technique.
I read from a file and stored the data into artists_tags with column names.
Now this file has multiple columns, and I need to generate a new data structure which has 2 columns from artists_tags as-is and the most frequent value from the 'Tag' column as the 3rd column value.
Here is what I have written so far:
import pandas as pd
from collections import Counter

def parse_artists_tags(filename):
    df = pd.read_csv(filename, sep="|", names=["ArtistID", "ArtistName", "Tag", "Count"])
    return df

def parse_user_artists_matrix(filename):
    df = pd.read_csv(filename)
    return df

# artists_tags = parse_artists_tags(DATA_PATH + "\\artists-tags.txt")
artists_tags = parse_artists_tags("C:\\Users\\15-J001TX\\Documents\\ml_task\\artists-tags.txt")
#print(artists_tags)
user_art_mat = parse_user_artists_matrix("C:\\Users\\15-J001TX\\Documents\\ml_task\\userart-mat-training.csv")
#print ("Number of tags {0}".format(len(artists_tags))) # Change this line. Should be 952803
#print ("Number of artists {0}".format(len(user_art_mat))) # Change this line. Should be 17119

# TODO Implement this. You can change the function arguments if necessary
# Return a data structure that contains (artist id, artist name, top tag) for every artist
def calculate_top_tag(all_tags):
    temp = all_tags.Tag
    a = Counter(temp)
    a = a.most_common()
    print(a)
    top_tags = all_tags.ArtistID, all_tags.ArtistName, a
    return top_tags

top_tags = calculate_top_tag(artists_tags)

# Print the top tag for Nirvana
# Artist ID for Nirvana is 5b11f4ce-a62d-471e-81fc-a69a8278c7da
# Should be 'Grunge'
print("Top tag for Nirvana is {0}".format(top_tags)) # Complete this line
In the last method, calculate_top_tag, I don't understand how to choose the most frequent value from the 'Tag' column and put it as the third column of top_tags before returning it.
I am new to Python, and my knowledge of syntax and data structures is limited. I did try the various solutions mentioned for finding the most frequent value in a list, but they seem to return the entire column and not one particular value. I know this is some trivial syntax issue, but after having searched for a long time I still cannot figure it out.
edit 1 :
I need to find the most common tag for a particular artist and not the most common overall.
But again, I don't know how to.
edit 2 :
here is the link to the data files:
https://github.com/amplab/datascience-sp14/raw/master/hw2/hw2data.tar.gz
I'm sure there is a more succinct way of doing it, but this should get you started:
# group by ArtistID and Tag, summing the counts per pair
# (sort() and .ix are gone from modern pandas; use sort_values() and .loc)
tag_counts = artists_tags.groupby(['ArtistID', 'Tag'], as_index=False)['Count'].sum()

# sort in descending order of count
tag_counts = tag_counts.sort_values('Count', ascending=False)

# keep only the top-ranking tag per artist
top_tags = tag_counts.groupby('ArtistID').first()

# top_tags is now a dataframe which contains the top tag for every artist.
# We can simply look up the top tag for Nirvana via its index:
top_tags.loc['5b11f4ce-a62d-471e-81fc-a69a8278c7da', 'Tag']
# 'Grunge'
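An alternative that avoids the global sort is to pick each artist's top row with idxmax. The miniature DataFrame below is made-up sample data, not the real artists-tags file:

```python
import pandas as pd

# Made-up miniature of artists_tags (ArtistID, ArtistName, Tag, Count).
artists_tags = pd.DataFrame({
    "ArtistID":   ["a1", "a1", "a2", "a2"],
    "ArtistName": ["Nirvana", "Nirvana", "Cher", "Cher"],
    "Tag":        ["Grunge", "rock", "pop", "dance"],
    "Count":      [10, 4, 7, 2],
})

# Total count per (artist, tag) pair, then keep the row with the
# highest total for each artist.
totals = artists_tags.groupby(["ArtistID", "ArtistName", "Tag"], as_index=False)["Count"].sum()
top = totals.loc[totals.groupby("ArtistID")["Count"].idxmax()]
print(top[["ArtistID", "ArtistName", "Tag"]])
```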