TL;DR - I'm trying to clean this up, but I'm unsure of the best practice for compiling a list of variables while still writing them to individual lines in the .txt file they're copied to.
This is my first post here.
I recently created a script to automate an extremely tedious process at work: modifying an Excel document, copying outputs from specific cells (which cells depends on the type of configuration we're generating), and pasting them into 3 separate .txt files to send out via email.
I've got the script functioning, but I hate how my code looks, and honestly it's quite a pain to make additions to.
I'm using openpyxl & pycel for this: the cells I copy are formula outputs, and I couldn't get anything except #N/A when using openpyxl alone, so I integrated pycel for that piece.
My code is referenced below, and I appreciate any input.
F62 = format(excel.evaluate('Config!F62'))
F63 = format(excel.evaluate('Config!F63'))
F64 = format(excel.evaluate('Config!F64'))
F65 = format(excel.evaluate('Config!F65'))
F66 = format(excel.evaluate('Config!F66'))
F67 = format(excel.evaluate('Config!F67'))
F68 = format(excel.evaluate('Config!F68'))
F69 = format(excel.evaluate('Config!F69'))
F70 = format(excel.evaluate('Config!F70'))
F71 = format(excel.evaluate('Config!F71'))
F72 = format(excel.evaluate('Config!F72'))
F73 = format(excel.evaluate('Config!F73'))
F74 = format(excel.evaluate('Config!F74'))
F75 = format(excel.evaluate('Config!F75'))
F76 = format(excel.evaluate('Config!F76'))
F77 = format(excel.evaluate('Config!F77'))
#so on and so forth to put into:
with open(f'./GRAK-R-{KIT}/3_GRAK-R-{KIT}_FULL.txt', 'r') as basedone:
    linetest = f"{F62}\n{F63}\n{F64}\n{F65}\n{F66}\n{F67}\n{F68}\n{F69}\n{F70}\n{F71}\n{F72}\n{F73}\n{F74}\n{F75}\n{F76}\n{F77}\n{F78}\n{F79}\n{F80}\n{F81}\n{F82}\n{F83}\n{F84}\n{F85}\n{F86}\n{F87}\n{F88}\n{F89}\n{F90}\n{F91}\n{F92}\n{F93}\n{F94}\n{F95}\n{F96}\n{F97}\n{F98}\n{F99}\n{F100}\n{F101}\n{F102}\n{F103}\n{F104}\n{F105}\n{F106}\n{F107}\n{F108}\n{F109}\n{F110}\n{F111}\n{F112}\n{F113}\n{F114}\n{F115}\n{F116}\n{F117}\n{F118}\n{F119}\n{F120}\n{F121}\n{F122}\n{F123}\n{F124}\n{F125}\n{F126}\n{F127}\n{F128}\n{F129}\n{F130}\n{F131}\n{F132}\n{F133}\n{F134}\n{F135}\n{F136}\n{F137}\n{F138}\n{F139}\n{F140}\n{F141}\n{F142}\n{F143}\n{F144}\n{F145}\n{F146}\n{F147}\n{F148}\n{F149}\n{F150}\n{F151}\n{F152}\n{F153}\n{F154}\n{F155}\n{F156}\n{F157}\n{F158}\n{F159}\n{F160}\n{F161}\n{F162}\n{F163}\n{F164}\n{F165}\n{F166}\n{F167}\n{F168}\n{F169}\n{F170}\n{F171}\n{F172}\n{F173}\n{F174}\n{F175}\n{F176}\n{F177}\n{F178}\n{F179}\n{F180}\n{F181}\n{F182}\n{F183}\n{F184}\n{F185}\n{F186}\n{F187}\n{F188}\n{F189}\n{F190}\n{F191}\n{F192}\n{F193}\n{F194}\n{F195}\n{F196}\n{F197}\n{F198}\n{F199}\n{F200}\n{F201}\n{F202}\n{F203}\n{F204}\n{F205}\n{F206}\n{F207}\n{F208}\n{F209}\n{F210}\n{F211}\n{F212}\n{F213}\n{F214}\n{F215}\n{F216}\n{F217}\n{F218}\n{F219}\n{F220}\n{F221}\n{F222}\n{F223}\n{F224}\n{F225}\n{F226}\n{F227}\n{F228}\n{F229}\n{F230}\n{F231}\n{F232}\n{F233}\n{F234}\n{F235}\n{F236}\n{F237}\n{F238}\n{F239}\n{F240}\n{F241}\n{F242}\n{F243}\n{F244}\n{F245}\n{F246}\n{F247}\n{F248}\n{F249}\n{F250}\n{F251}\n{F252}\n{F253}\n{F254}\n{F255}\n{F256}\n{F257}\n{F258}\n{F259}\n{F260}\n{F261}\n{F262}\n{F263}\n{F264}\n{F265}\n{F266}\n{F267}\n{F268}\n{F269}\n{F270}\n{F271}\n{F272}\n{F273}\n{F274}\n"
    oline = basedone.readlines()
    oline.insert(9, linetest)
with open(f'./GRAK-R-{KIT}/3_GRAK-R-{KIT}_FULL.txt', 'w') as basedone:
    basedone.writelines(oline)
I don't think you need to name every single variable. You can use f-strings and list comprehensions to keep your code flexible.
min_cell = 62
max_cell = 77
column_name = 'F'
sheet_name = 'Config'
cell_names = [f'{sheet_name}!{column_name}{i}' for i in range(min_cell, max_cell + 1)]
vals = [format(excel.evaluate(cn)) for cn in cell_names]
linetest = '\n'.join(vals)
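The insertion step from the question can be generalized the same way. Below is a minimal sketch with a hypothetical `insert_block` helper; dummy values stand in for the evaluated cells, and in the real script `values` would be the `vals` list built above and `position` the `9` from `oline.insert(9, linetest)`:

```python
import os
import tempfile


def insert_block(path, values, position):
    """Insert one line per value into the file at the given line index."""
    with open(path, 'r') as f:
        lines = f.readlines()
    block = [f"{v}\n" for v in values]  # one value per line
    lines[position:position] = block    # splice the block in place
    with open(path, 'w') as f:
        f.writelines(lines)


# Demo with a throwaway file and dummy cell values:
tmp = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(tmp, 'w') as f:
    f.write('header\nfooter\n')
insert_block(tmp, ['a', 'b', 'c'], 1)
with open(tmp) as f:
    print(f.read())  # header, then a/b/c, then footer
```

This keeps the cell range, the formatting, and the insert position as three small parameters instead of 200+ named variables.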
import openpyxl
from openpyxl import load_workbook

filename = r"C:\Users\EXCEL_1.xlsx"
book = openpyxl.load_workbook(filename)
# extract the first sheet
sheet = book.worksheets[0]
# collect the rows of EXCEL_1
data = []
for row in sheet.rows:
    data.append([
        row[0].value,
        ' ',
        row[3].value,
        row[4].value,
        row[5].value,
        row[6].value,
        row[7].value,
        row[8].value,
    ])
data = data[1:]  # drop the header row
# EXCEL_2
wb = load_workbook(r"C:\Users\EXCEL_2.xlsx")
ws = wb.active
# write the collected data, starting at row 3
for n, datalist in enumerate(data, 3):
    for n2, i in enumerate(datalist, 1):
        ws.cell(row=n, column=n2).value = i
# save
wb.save("Final Product.xlsx")
wb.close()
As above, I wrote code to copy the contents of one workbook (EXCEL_1) into another (EXCEL_2).
I will eventually wrap this in a GUI and finally build it into an EXE file.
When the program runs, how do I write the code so that the user can select different Excel files for "EXCEL_1" and "EXCEL_2", regardless of their paths?
I ask for your help.
Sorry if I've misunderstood your question. I will try to add a couple of pieces of information and hope they are useful.
You may be asking how the file name can differ in your program.
wb = load_workbook(r'C:\Users\EXCEL_2.xlsx')
is the same as:
name = r'C:\Users\EXCEL_2.xlsx'
wb = load_workbook(name)
You may be asking how to make the GUI. You might want to start looking at the 'cookbook' of simple programs. The FileBrowse and FolderBrowse controls might be useful.
To make the question more answerable, try cutting the code down to the smallest portion that you want to make better. Programming looks daunting at first, but it gets better.
Keep on coding. Keep notes.
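For letting the user pick the files at run time, one standard-library option is tkinter's file dialog. This is a minimal sketch (the function name and dialog options are my own, not from any of the code above); it returns whatever path the user selects, which you can then pass straight to `load_workbook`:

```python
from tkinter import Tk, filedialog


def pick_excel_file(title):
    """Open a native file-chooser and return the selected path ('' if cancelled)."""
    root = Tk()
    root.withdraw()  # hide the empty main window
    path = filedialog.askopenfilename(
        title=title,
        filetypes=[("Excel files", "*.xlsx")],
    )
    root.destroy()
    return path


# Usage sketch (requires a display):
# src = pick_excel_file("Choose EXCEL_1")
# dst = pick_excel_file("Choose EXCEL_2")
# book = load_workbook(src)
```

Because the path is just a string variable, nothing else in the copy code has to change.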
This question is a follow-up to the question I asked here, which in summary was:
"In python how do I read in parameters from the text file params.txt, creating the variables and assigning them the values that are in the file? The contents of the file are (please ignore the auto syntax highlighting, params.txt is actually a plain text file):
Lx = 512 Ly = 512
g = 400
================ Dissipation =====================
nupower = 8 nu = 0
...[etc]
and I want my python script to read the file so that I have Lx, Ly, g, nupower, nu etc available as variables (not keys in a dictionary) with the appropriate values given in params.txt. By the way I'm a python novice."
With help, I have come up with the following solution that uses exec():
with open('params.txt', 'r') as infile:
    for line in infile:
        splitline = line.strip().split(' ')
        for i, word in enumerate(splitline):
            if word == '=':
                exec(splitline[i-1] + splitline[i] + splitline[i+1])
This works, e.g. print(Lx) returns 512 as expected.
My questions are:
(1) Is this approach safe? Most questions mentioning the exec() function have answers that contain dire warnings about its use, and imply that you shouldn't use it unless you really know what you're doing. As mentioned, I'm a novice so I really don't know what I'm doing, so I want to check that I won't be making problems for myself with this solution. The rest of the script does some basic analysis and plotting using the variables read in from this file, and data from other files.
(2) If I want to wrap up the code above in a function, e.g. read_params(), is it just a matter of changing the last line to exec(splitline[i-1] + splitline[i] + splitline[i+1], globals())? I understand that this causes exec() to make the assignments in the global namespace. What I don't understand is whether this is safe, and if not why not. (See above about being a novice!)
(1) Is this approach safe?
No, it is not safe. If someone can edit/control/replace params.txt, they can craft it in such a way to allow arbitrary code execution on the machine running the script.
It really depends where and who will run your Python script, and whether they can modify params.txt. If it's just a script run directly on a normal computer by a user, then there's not much to worry about, because they already have access to the machine and can do whatever malicious things they want, without having to do it using your Python script.
(2) If I want to wrap up the code above in a function, e.g. read_params(), is it just a matter of changing the last line to exec(splitline[i-1] + splitline[i] + splitline[i+1], globals())?
Correct. It doesn't change the fact you can execute arbitrary code.
Suppose this is params.txt:
Lx = 512 Ly = 512
g = 400
_ = print("""Holy\u0020calamity,\u0020scream\u0020insanity\nAll\u0020you\u0020ever\u0020gonna\u0020be's\nAnother\u0020great\u0020fan\u0020of\u0020me,\u0020break\n""")
_ = exec(f"import\u0020ctypes")
_ = ctypes.windll.user32.MessageBoxW(None,"Releasing\u0020your\u0020uranium\u0020hexaflouride\u0020in\u00203...\u00202...\u00201...","Warning!",0)
================ Dissipation =====================
nupower = 8 nu = 0
And this is your script:
def read_params():
    with open('params.txt', 'r') as infile:
        for line in infile:
            splitline = line.strip().split(' ')
            for i, word in enumerate(splitline):
                if word == '=':
                    exec(splitline[i-1] + splitline[i] + splitline[i+1], globals())

read_params()
As you can see, it has correctly assigned your variables, but it has also called print, imported the ctypes library, and has then presented you with a dialog box letting you know that your little backyard enrichment facility has been thwarted.
As martineau suggested, you can use configparser. You'd have to modify params.txt so there is only one variable per line.
tl;dr: Using exec is unsafe, and not best practice, but that doesn't matter if your Python script will only be run on a normal computer by users you trust. They can already do malicious things, simply by having access to the computer as a normal user.
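As a sketch of the configparser route: it requires reshaping params.txt into INI form, with one assignment per line and the `====` dividers turned into `[section]` headers (the section names below are my own invention):

```python
import configparser

# INI-style version of params.txt (reshaped by hand)
ini_text = """
[domain]
Lx = 512
Ly = 512
g = 400

[dissipation]
nupower = 8
nu = 0
"""

parser = configparser.ConfigParser()
parser.read_string(ini_text)  # in practice: parser.read('params.ini')

# Values come back typed via getint/getfloat/getboolean
Lx = parser.getint('domain', 'Lx')
nupower = parser.getint('dissipation', 'nupower')
print(Lx, nupower)
```

Nothing in the file is ever executed, so a hostile params file can at worst produce bad values, not run code.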
Is there an alternative to configparser?
I'm not sure. With your use-case, I don't think you have much to worry about. Just roll your own.
This is similar to some of the answers in your other question, but it uses literal_eval and updates the globals dictionary so you can directly use the variables as you want to.
params.txt:
Lx = 512 Ly = 512
g = 400
================ Dissipation =====================
nupower = 8 nu = 0
alphapower = -0 alpha = 0
================ Timestepping =========================
SOMEFLAG = 1
SOMEOTHERFLAG = 4
dt = 2e-05
some_dict = {"key":[1,2,3]}
print = "builtins_can't_be_rebound"
Script:
import ast

def read_params():
    '''Reads the params file and updates the globals dict.'''
    _globals = globals()
    reserved = dir(_globals['__builtins__'])
    with open('params.txt', 'r') as infile:
        for line in infile:
            tokens = line.strip().split(' ')
            zipped_tokens = zip(tokens, tokens[1:], tokens[2:])
            for prev_token, curr_token, next_token in zipped_tokens:
                if curr_token == '=' and prev_token not in reserved:
                    #print(prev_token, curr_token, next_token)
                    try:
                        _globals[prev_token] = ast.literal_eval(next_token)
                    except (SyntaxError, ValueError) as e:
                        print(f'Cannot eval "{next_token}". {e}. Continuing...')

read_params()

# We can now use the variables as expected
Lx += Ly
print(Lx, Ly, SOMEFLAG, some_dict)
Output:
1024 512 1 {'key': [1, 2, 3]}
I am trying to obtain the actual A1 values using the Sheetfu library's get_data_range().
When I use the code below, it works perfectly, and I get what I would expect.
invoice_sheet = spreadsheet.get_sheet_by_name('Invoice')
invoice_data_range = invoice_sheet.get_data_range()
invoice_values = invoice_data_range.get_values()
print(invoice_data_range)
print(invoice_values)
From the print() statements I get:
<Range object Invoice!A1:Q42>
[['2019-001', '01/01/2019', 'Services']...] #cut for brevity
What is the best way to get that "A1:Q42" value? I really only want the end of the range (Q42), because I need to build the get_range_from_a1() argument "A4:Q14". My sheet has known headers (rows 1-3), and the get_values() includes 3 rows that I don't want in the get_values() list.
I guess I could do some string manipulation to pull out the text between the ":" and ">" in
<Range object Invoice!A1:Q42>
...but that seems a bit sloppy.
As a quick aside, it would be fantastic to be able to call get_data_range() like so:
invoice_sheet = spreadsheet.get_sheet_by_name('Invoice')
invoice_data_range = invoice_sheet.get_data_range(start="A4", end="")
invoice_values = invoice_data_range.get_values()
...but that's more like a feature request. (Which I'm happy to do BTW).
Author here. Alan answers it well.
I added some methods at Range level to the library, that are simply shortcuts to the coordinates properties.
from sheetfu import SpreadsheetApp
spreadsheet = SpreadsheetApp("....access_file.json").open_by_id('long_string_id')
sheet = spreadsheet.get_sheet_by_name('test')
data_range = sheet.get_data_range()
starting_row = data_range.get_row()
starting_column = data_range.get_column()
max_row = data_range.get_max_row()
max_column = data_range.get_max_column()
This will effectively tell you the max row and max column that contains data in your sheet.
If you use the get_data_range method, the first row and first column are typically 1.
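To assemble the "A4:..." string the question asks about, a small helper can convert the max column number back to a letter and build the A1 notation. This is my own sketch, not part of Sheetfu; the default `first_data_row=4` assumes the three header rows from the question:

```python
def col_letter(n):
    """Convert a 1-based column number to a letter: 1 -> 'A', 27 -> 'AA'."""
    letters = ''
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord('A') + rem) + letters
    return letters


def data_body_range(max_row, max_column, first_data_row=4):
    """A1 notation for the data below the header rows, e.g. 'A4:Q42'."""
    return f'A{first_data_row}:{col_letter(max_column)}{max_row}'


print(data_body_range(42, 17))  # → A4:Q42
```

The returned string can then be passed to `get_range_from_a1()` to skip the header rows.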
I received a response from the owner of Sheetfu, and the following code provides the information that I'm looking for.
Example code:
from sheetfu import SpreadsheetApp
spreadsheet = SpreadsheetApp("....access_file.json").open_by_id('long_string_id')
sheet = spreadsheet.get_sheet_by_name('test')
data_range = sheet.get_data_range()
range_max_row = data_range.coordinates.row + data_range.coordinates.number_of_rows - 1
range_max_column = data_range.coordinates.column + data_range.coordinates.number_of_columns - 1
As of this writing, the .coordinates properties are not currently documented, but they are usable, and should be officially documented within the next couple of weeks.
So I already checked other questions here about (almost) the same topic, but I did not find anything that solves my problem.
Basically, I have a piece of code in Python that tries to open the file as a data frame and execute some eye tracking functions (PyGaze). I have 1000 files that I need to analyse and wanted to create a for-loop to execute my code on all the files automatically.
The code is the following:
import os
import glob

import pandas as pd
import matplotlib.pyplot as plt

from detectors import fixation_detection  # PyGaze

os.chdir("/Users/Documents//Analyse/Eye movements/Python - Eye Analyse")
directory = '/Users/Documents/Analyse/Eye movements/R - Filtering Data/Filtered_data/Filtered_data_test'

for files in glob.glob(os.path.join(directory, "*.csv")):
    # Load csv
    df = pd.read_csv(files, parse_dates=True)

    # Plot raw data
    plt.plot(df['eye_x'], df['eye_y'], 'ro', c="red")
    plt.ylim([0, 1080])
    plt.xlim([0, 1920])

    # Fixation analysis
    fixations_data = fixation_detection(df['eye_x'], df['eye_y'], df['time'], maxdist=25, mindur=100)
    Efix_data = fixations_data[1]
    numb_fixations = len(Efix_data)  # number of fixations

    fixation_start = [i[0] for i in Efix_data]
    fixation_stop = [i[1] for i in Efix_data]
    fixation = {'start': fixation_start, 'stop': fixation_stop}
    fixation_frame = pd.DataFrame(data=fixation)
    fixation_frame['difference'] = fixation_frame['stop'] - fixation_frame['start']
    mean_fixation_time = fixation_frame['difference'].mean()  # mean fixation time

    final = {'number_fixations': [numb_fixations], 'mean_fixation_time': [mean_fixation_time]}
    final_frame = pd.DataFrame(data=final)

    # write everything in one document
    final_frame.to_csv("/Users/Documents/Analyse/Eye movements/final_data.csv")
The code runs with no errors, but the output only reflects one file; the other files in the directory never seem to be processed.
I do not see where my mistake is.
Your output file name is constant, so it gets overwritten with each iteration of the for loop. Try the following instead of your final line, which opens the file in "append" mode instead:
#write everything in one document
with open("/Users/Documents/Analyse/Eye movements/final_data.csv", "a") as f:
    final_frame.to_csv(f, header=False)
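Another way to avoid the overwrite, sketched here as my own variant rather than a fix to the code above: collect one result row per file inside the loop and write everything once at the end. Shown with the standard-library csv module and dummy rows standing in for the real per-file results:

```python
import csv
import os
import tempfile

results = []  # one (filename, n_fixations, mean_fixation_time) row per input file

# Inside your loop you would append real values; dummy rows shown here:
results.append(("file1.csv", 12, 250.0))
results.append(("file2.csv", 9, 310.5))

out = os.path.join(tempfile.mkdtemp(), "final_data.csv")  # your real output path here
with open(out, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["file", "number_fixations", "mean_fixation_time"])
    writer.writerows(results)
```

Writing once at the end also keeps a single header row, which append mode cannot easily do.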
I have been working with some code that exports layers, each filled with important data, into a folder. Next, I want to bring each of those layers into a different program so that I can combine them and run some tests. The current way I know how to do this is by importing them one by one (as seen below).
fn0 = 'layer0'
f0 = np.genfromtxt(fn0 + '.csv', delimiter=",")
fn1 = 'layer1'
f1 = np.genfromtxt(fn1 + '.csv', delimiter=",")
The issue with continuing this way is that I may have to deal with up to 100 layers at a time, and it would be very inconvenient to have to import each layer individually.
Is there a way I can change my code to do this iteratively so that I can have a code similar to such:
N = 100
for i in range(N)
fn(i) = 'layer(i)'
f(i) = np.genfromtxt(fn(i) + '.csv', delimiter=",")
Please let me know if you know of any ways!
You can use string formatting as follows:
N = 100
f = []  # create an empty list
for i in range(N):
    fn_i = 'layer%d' % i  # no parentheses!
    f.append(np.genfromtxt(fn_i + '.csv', delimiter=","))  # add to f
What I mean by "no parentheses!" is that they are 'special' characters: they indicate function calls and tuples, so you shouldn't use them in variable names (ever!), and your files are named layer0.csv, layer1.csv, ... without them anyway.
The answer of Mohammad Athar is correct. However, you should not use %-style formatting any longer. According to PEP 3101 (https://www.python.org/dev/peps/pep-3101/), it is superseded by format(). Moreover, as you have more than 100 files, a zero-padded name like layer_007.csv is probably appreciated.
Try something like:
dataDict = dict()
for counter in range(214):
    fileName = 'layer_{number:03d}.csv'.format(number=counter)
    dataDict[fileName] = np.genfromtxt(fileName, delimiter=",")
When using a dictionary, like here, you can directly access your data later by using the file name; note that dictionaries only guarantee insertion order on Python 3.7+, so you might prefer the list version of Mohammad Athar.
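If the layer numbers are not known in advance, a related sketch (standard library only; the numpy loading step is left out, and the helper name is my own) is to discover the files on disk and sort them by their embedded number, so layer10 does not sort before layer2:

```python
import re
from pathlib import Path


def sorted_layers(directory):
    """Return layer*.csv paths sorted by their numeric suffix."""
    def layer_number(p):
        m = re.search(r'(\d+)', p.stem)  # pull the digits out of e.g. 'layer10'
        return int(m.group(1)) if m else -1
    return sorted(Path(directory).glob('layer*.csv'), key=layer_number)


# Each returned path could then be passed to np.genfromtxt in order.
```

A plain lexicographic sort would give layer1, layer10, layer2; the numeric key avoids that without requiring zero-padded names.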