I'm pretty new to coding, so apologies if this is easy.
I have two columns in Google Sheets and I want to add a formula to a third column, something like this:
=(E3*90)+(F3*10) - the values in the columns are grades, and the 90 and 10 are fixed weightings.
I created a for loop to iterate through range(3, 90), updating each cell in the column.
It puts a formula in every cell, but it's always the last iteration's: '=(E89*90)+(F89*10)'.
I managed to get this working by adding report.update_acell('G'+str(i), '=(E'+str(i)+'*90)+(F'+str(i)+'*10)') inside the for loop, but that makes one API call per cell, which is too many and causes problems.
sh = client.open("grading")
report = sh.worksheet("Report")
weighted = report.range('G3:G89')
for cell in weighted:
    for i in range(3, 90):
        cell.value = '=(E'+str(i)+'*90)+(F'+str(i)+'*10)'
report.update_cells(weighted, value_input_option='USER_ENTERED')
What I'd like is for every cell in the 'weighted' range to be updated with a formula that references the two cells next to it, so that a result is visible in the weighted column.
eg.
row 3 should be =(E3*90)+(F3*10)
row 4 should be =(E4*90)+(F4*10) and so on until the range is completed.
I fixed this after a lot of trial and error. For anyone trying to do the same, here is my solution:
sh = client.open("grading")
report = sh.worksheet("Report")
weighted = report.range('G3:G89')
for i, cell in enumerate(weighted, 3):
    cell.value = '=(E'+str(i)+'*90)+(F'+str(i)+'*10)'
report.update_cells(weighted, value_input_option='USER_ENTERED')
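As a quick sanity check, the formula strings the enumerate loop produces can be generated and inspected without any API calls (plain Python, no gspread needed):

```python
# Build one formula string per row of G3:G89, exactly as the loop above does.
formulas = ['=(E' + str(i) + '*90)+(F' + str(i) + '*10)' for i in range(3, 90)]

print(formulas[0])   # -> =(E3*90)+(F3*10)
print(formulas[-1])  # -> =(E89*90)+(F89*10)
```

Because all 87 cells are written back in a single update_cells call, this stays well inside the API quota.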
Hello guys! I am struggling to calculate the mean of certain rows from an Excel sheet using Python. In particular, I would like to calculate the mean of every three rows, starting with the first three, then the next three, and so on. My Excel sheet contains 156 rows of data.
My data sheet looks like this:
And this is my code:
import numpy
import pandas as pd
df = pd.read_excel("My Excel.xlsx")
x = df.iloc[[0,1,2], [9,10,11]].mean()
print(x)
To sum up, I am trying to calculate the mean of Part 1 Measurement 1 (rows 1,2,3) and the mean of Part 2 Measurement 1 (rows 9,10,11) using one line of code, or some kind of index. I am expecting two lists of numbers: one for the mean of Part 1 Measurement 1 and the other for the mean of Part 2 Measurement 1. I am aware that Python counts the first row as 0. The index should have the form n+1.
Thank you in advance.
You could (e.g.) generate a list for each mean you want to calculate:
x1, x2 = list(df.iloc[[0,1,2]].mean()), list(df.iloc[[9,10,11]].mean())
Or you could also generate a list of lists:
x = [list(df.iloc[[0,1,2]].mean()), list(df.iloc[[9,10,11]].mean())]
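If the real goal is a mean for every block of three rows across all 156 rows, the index triples don't need to be listed by hand: grouping on the positional index divided by three gives one mean per block. A minimal sketch with made-up numbers (the original sheet isn't shown):

```python
import pandas as pd

# Toy frame standing in for the real sheet: six rows, one measurement column.
df = pd.DataFrame({'Measurement': [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]})

# Rows 0-2 form group 0, rows 3-5 form group 1, and so on.
means = df.groupby(df.index // 3).mean()
print(means)
```

On the real 156-row sheet this yields 52 group means in one line, one per block of three rows.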
I have a Python script running on AWS Lambda (serverless). When triggered it needs to add a new row to a Google Spreadsheet via the API and that involves copying some of the formulas from the row above to the new row. But for the formulas to be valid the cell row numbers need to be incremented by +1 before copying to the new row.
For example, say I have a cell with the formula =if(countif(G198:G201,"<=3")<=3,0,1). I need to change this to =if(countif(G199:G202,"<=3")<=3,0,1) and add it to the new row.
What's the easiest way to identify the row numbers in the cell references and increment them to create the new formula?
import re

def update_formula(formula):
    # Bump the row number in every cell reference (e.g. G198 -> G199).
    # A single re.sub pass avoids the pitfalls of str.replace, which would
    # also rewrite substrings of longer references (e.g. G19 inside G198)
    # and would re-replace references that occur more than once.
    def bump(match):
        column, row_number = match.group(1), int(match.group(2))
        return column + str(row_number + 1)
    return re.sub(r'([A-Z]+)([0-9]+)', bump, formula)
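Checked against the example from the question (a self-contained sketch; re.sub increments every row number in one pass, and since the pattern only matches uppercase letters followed by digits, lowercase function names like countif are left alone):

```python
import re

def update_formula(formula):
    # Replace every cell reference like G198 with the same column and row + 1.
    return re.sub(r'([A-Z]+)([0-9]+)',
                  lambda m: m.group(1) + str(int(m.group(2)) + 1),
                  formula)

print(update_formula('=if(countif(G198:G201,"<=3")<=3,0,1)'))
# -> =if(countif(G199:G202,"<=3")<=3,0,1)
```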
I am working on a project using python to select certain values from an excel file. I am using the xlrd library and openpyxl library to do this.
The way the Python program should be working is:
Grouping all the data point entries that are in a certain card task. These are marked in column E. For example, all of the entries between row 26 and row 28 are in Card Task A, and hence they should be grouped together. All entries without a "Card Task" value in column E should not be considered at all.
Next…
Looking at the value in column N (lastExecTime) for a row and comparing that time with the next row's value in column M.
If the times overlap (the column M value is less than the previous column N value), increment a variable called "count". Count stores the number of times a procedure overlaps.
Finally…
As for the output, the goal is to create a separate text file that displays which tasks are overlapping, and how many tasks overlap in a certain Card Task.
The problem that I am running into is that I cannot pair the data from a card task
Here is a sample of the Excel data (two screenshots in the original post; the second is probably more helpful).
And here is the code that I have written that tells me if there are multiple procedures going on:
from openpyxl import load_workbook

book = load_workbook('LearnerSummaryNoFormat.xlsx')
sheet = book['Sheet1']
for row in sheet.rows:
    if (row[4].value[:9]) != 'Card Task':
        print("Is not a card task: " + str(row[1].value))
Essentially my problem is that I am not able to compare all the values from one card task with each other.
I would read through the data once, as you already do, but store all rows containing 'Card Task' in a separate list. Once you have a list of only card-task items you can compare them.
card_task_row_object_list = []
count = 0
for row in sheet.rows:
    if row[4].value and 'Card Task' in row[4].value:
        card_task_row_object_list.append(row)
From here you would want to compare the time values. What do you need to check - whether two different card task times overlap?
(index 12: column M, the start; index 13: column N, the end)
def compare_times(card_task_row_object_list):
    count = 0
    for row in card_task_row_object_list:
        for comparison_row in card_task_row_object_list:
            if comparison_row is row:
                continue  # don't compare a row with itself
            # Overlap: each range starts before the other one ends.
            if (comparison_row[12].value <= row[13].value
                    and comparison_row[13].value >= row[12].value):
                count += 1
    return count
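The overlap test itself can be sanity-checked with plain (start, end) tuples standing in for the openpyxl rows (hypothetical numbers, just to exercise the logic; each overlapping pair is counted twice because the double loop visits it in both orders):

```python
def count_overlaps(intervals):
    # Count ordered pairs of distinct intervals whose ranges overlap:
    # two ranges overlap when each one starts before the other ends.
    count = 0
    for i, (start_a, end_a) in enumerate(intervals):
        for j, (start_b, end_b) in enumerate(intervals):
            if i != j and start_b <= end_a and end_b >= start_a:
                count += 1
    return count

# Three tasks: the first two overlap, the third stands alone.
print(count_overlaps([(1, 5), (4, 8), (10, 12)]))  # -> 2
```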
I have a DataFrame df and I would like to keep a running total of the names that occur in one of its columns. I am trying to calculate the 'running total' column:
name running total
a 1
a 2
b 1
a 3
c 1
b 2
There are two ways I thought to do this:
Loop through the dataframe and use a separate dictionary containing name and current count. The current count for the relevant name would increase by 1 each time the loop is carried out, and that value would be copied into my dataframe.
Use a COUNTIF-style count for each value in the dataframe. In Excel I would use COUNTIF combined with a drag-down range like A$1:A1, fixing the first reference but leaving the second relative so that the range grows with the row.
The problem is I am not sure how to implement these. Does anyone have any ideas on which is preferable and how these could be implemented?
@bunji is right. I'm assuming you're using pandas and that your data is in a DataFrame called df. To add the running totals to your dataframe, you could do something like this:
df['running total'] = df.groupby(['name']).cumcount() + 1
The + 1 gives you a 1 for your first occurrence instead of 0, which is what you would get otherwise.
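A minimal reproduction with the sample data from the question:

```python
import pandas as pd

df = pd.DataFrame({'name': ['a', 'a', 'b', 'a', 'c', 'b']})
# cumcount numbers each name's occurrences from 0, so add 1.
df['running total'] = df.groupby(['name']).cumcount() + 1
print(df['running total'].tolist())  # -> [1, 2, 1, 3, 1, 2]
```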
I have a general question about pandas. I have a DataFrame named d with a lot of info on parks. All unique park names are stored in an array called parks. There's another column with a location ID and I want to iterate through the parks array and print unique location ID counts associated with that park name.
d[d['Park']=='AKRO']
len(d['Location'].unique())
gives me a count of 24824.
x = d[d['Park']=='AKRO']
print(len(x['Location'].unique()))
gives me a location count of 1. Why? I thought these were the same, except that I am storing the result in a variable.
So naturally the loop I was trying doesn't work. Does anyone have any tips?
counts = []
for p in parks:
    x = d[d['Park'] == p]
    y = len(x['Location'].unique())
    counts.append([p, y])
You can try something like:
d.groupby('Park')['Location'].nunique()
When you subset the first time, you're not assigning d[d['Park'] == 'AKRO'] to anything, so nothing changes: the filtered frame is displayed and discarded, and len(d['Location'].unique()) still runs against the full DataFrame d.
When you assign x = d[d['Park'] == 'AKRO'], x is only that filtered section, which is why you see the difference.
Your for loop is actually only looping through the columns of d. If you wish to loop through the rows, you can use the following.
for idx, row in d.iterrows():
    print(idx, row)
However, if you want to count the number of unique locations per park with a for loop, you have to loop through each park. Something like the following (note that Series.size is a property, not a method, and what you want here is the number of unique locations, i.e. nunique()).
for park in d['Park'].unique():
    print(park, d.loc[d['Park'] == park, 'Location'].nunique())
You can accomplish your goal without iteration, however. This sort of approach is preferred.
d.groupby('Park')['Location'].nunique()
Be careful about which pandas DataFrame operations change your data and which merely produce a new object. For example, d[d['Park'] == 'AKRO'] on its own doesn't change the DataFrame d, while x = d[d['Park'] == 'AKRO'] stores the filtered result in x, so x only has 1 location.
Have you manually checked how many unique Location IDs exist for 'AKRO'? The for loop looks correct, apart from the extra brackets around y = len(x['Location'].unique()).
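A small frame with hypothetical park data shows the groupby one-liner and the loop from the question agreeing:

```python
import pandas as pd

d = pd.DataFrame({
    'Park': ['AKRO', 'AKRO', 'YOSE', 'YOSE', 'YOSE'],
    'Location': [1, 1, 2, 3, 2],
})

# One unique-location count per park, no explicit loop.
by_park = d.groupby('Park')['Location'].nunique()

# The same numbers via the loop from the question.
counts = []
for p in d['Park'].unique():
    x = d[d['Park'] == p]
    counts.append([p, len(x['Location'].unique())])

print(by_park.to_dict())  # -> {'AKRO': 1, 'YOSE': 2}
print(counts)             # -> [['AKRO', 1], ['YOSE', 2]]
```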