I have this:
dic_sheets = {}
for y in xl_files:
    dic_sheets.update({y: []})
I want to populate the list stored under each key (y) in the dictionary (dic_sheets) with the individual sheets inside that Excel document.
I do not know how many sheets each Excel document contains, so I don't have an index number at which to stop a range(x, y, z) loop.
Another way to put it: I want to drop any number of Excel files into the active directory and have each file's sheets populate a dictionary when I run the .py in CMD.
Can anyone help me achieve this goal?
xl_files contains ExcelFile objects, e.g. <pandas.io.excel.ExcelFile object at 0x0FF6B0D0>
Edit: y represents individual excel files
Edit2: I need only the sheet names (or their unique index numbers) to populate, (i.e. 'sheet1', 'pivot2'). I'm not yet concerned with cells in the sheets.
Edit3: I already have the table ‘xl_files’ generated to contain every excel file in the cwd
I figured it out!
I had to use a for loop and the return function as an object, then combine it with another object of the array.append function and return function with a new array.
I'll try to word my questions better in the future, as I did not get a bite this round.
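For readers who land here with the same problem, a minimal sketch of one way to do it. It relies on the `sheet_names` attribute that pandas `ExcelFile` objects expose; the helper name `collect_sheet_names` is made up for illustration and is not necessarily what the asker ended up writing.

```python
def collect_sheet_names(xl_files):
    """Map each workbook object to the list of its sheet names.

    Assumes every item exposes a `sheet_names` attribute, as
    pandas.ExcelFile does. No sheet count is needed up front.
    """
    return {xl: list(xl.sheet_names) for xl in xl_files}
```

With pandas installed, `xl_files` could be built with something like `[pd.ExcelFile(p) for p in glob.glob('*.xlsx')]` and then passed to the helper as `dic_sheets = collect_sheet_names(xl_files)`.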
Related
I have a list (named df_split) that stores 576 data frames, and each data frame has 50 rows.
I want to iterate through each data frame and save it as a separate CSV file inside a folder.
I tried the following code but it only saved the last data frame as a CSV file inside the location that I specified.
In this case, I assume I should have also coded the file name for each data frame to be something like file1.csv, file2.csv, etc., but my skills aren't enough.
Can somebody kindly suggest some example solutions?
Here is the code that I tried:
for i in df_split: i.to_csv('./file.csv')
Use enumerate for counter for new file names:
for i, df in enumerate(df_split, 1):
    df.to_csv(f'file_{i}.csv')
You could also pass the index i into the filename. Note that i has to be the position in the list, not the data frame itself:
for i in range(len(df_split)):
    df_split[i].to_csv(f'file_{i}.csv')
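Since the question asks for the files to end up inside a folder, here is a slightly fuller sketch of the enumerate approach. The function name `save_frames`, the folder name `csv_out` and the zero-padded numbering are assumptions added for illustration; the only thing it requires of each item is a `to_csv(path)` method, which pandas data frames have.

```python
from pathlib import Path


def save_frames(frames, out_dir="csv_out", prefix="file"):
    """Write each frame to its own numbered CSV file inside out_dir."""
    frames = list(frames)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    width = len(str(len(frames)))  # pad numbers so the files sort in order
    paths = []
    for i, df in enumerate(frames, start=1):
        path = out / f"{prefix}_{i:0{width}d}.csv"
        df.to_csv(path)  # a distinct name per frame, so nothing is overwritten
        paths.append(path)
    return paths
```

Called as `save_frames(df_split)`, this would produce `csv_out/file_001.csv` through `csv_out/file_576.csv` for a list of 576 frames.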
In my code,
Samp_size
MSI
MRI
M_ASRS
a_h
d_h
a_v
d_v
max_hor_vel
max_ver_vel
The parameters above are randomly generated. Each of them can take a different number of values; let's say each has 2 different values.
I write them as dataframes to an Excel file, each in a different sheet (sheet1, sheet2, etc.).
So I have 2^10 = 1024 different parameter sets. How can I write the solutions for all of these parameter sets, in order, to an Excel file?
It seems like you want to combine sheets into one single Excel file. If so, the following post should solve your problem: Combine Multiple Excel sheets within Workbook into one sheet Python
In that post, the code that follows the comment "# And then append all the Workbooks into single Excel Workbook sheet" is the relevant part.
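If the goal is instead to walk through every combination of parameter values in a fixed order, `itertools.product` may be a simpler starting point. This is only a sketch: the values `[0, 1]` are placeholders standing in for the two values each parameter can take, and `parameter_sets` is a made-up helper name.

```python
from itertools import product

PARAM_NAMES = [
    "Samp_size", "MSI", "MRI", "M_ASRS", "a_h",
    "d_h", "a_v", "d_v", "max_hor_vel", "max_ver_vel",
]

# Placeholder values: swap in the two real values of each parameter.
PARAM_VALUES = {name: [0, 1] for name in PARAM_NAMES}


def parameter_sets(param_values):
    """Yield one dict per combination, in a fixed, repeatable order."""
    names = list(param_values)
    for combo in product(*(param_values[n] for n in names)):
        yield dict(zip(names, combo))
```

Each yielded dict is one parameter set, so the solution for it can be computed and appended as one row (or one sheet) of the output file. The last parameter varies fastest, like the innermost of ten nested loops.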
Edit: I found a solution to my question. In short, follow the openpyxl user manual rather than online tutorials: the tutorials I tried (more than one) raised errors, and their approach differed significantly from the manual's. I also ended up using pandas much less than I expected.
I am trying to append certain values in an Excel file with multiple sheets based on user inputs and then rewrite it to the Excel file (without deleting the rest of the sheets). So far I have tried this which seems to combine the data but I didn't quite see how it applied to what I am doing since I want to append a part of a sheet instead of rewrite the whole excel file. I have also tried a few other things with ExcelWriter but I don't quite understand it since it usually wipes all the data in the file (I may be using it wrong).
episode_dataframe = pd.read_excel (r'All_excerpts (Siena Copy)_test.xlsx', sheet_name=episode)
#episode is a specified string inputted by user, this line makes a data frame for the specified sheet
episode_dataframe.loc[(int(pass_num) - 1), 'Resources'] = resources
#resources is also a user inputted string, it's what I am trying to append the spreadsheet cell value to, this appends to corresponding data frame
path_R = open("All_excerpts (Siena Copy)_test.xlsx", "rb")
with pd.ExcelWriter(path_R) as writer:
    writer.book = openpyxl.load_workbook(path_R)
    #I copied this from [here][3], i think it should make the writer for the to_excel? I don't fully know
    episode_dataframe.to_excel(writer, sheet_name=episode, engine=openpyxl, if_sheet_exsits ='replace')
    #this should write the sheet data frame onto the file, but I don't want it to delete the other sheets
Additionally, I have been running into a bunch of other smaller errors. A big one was 'Workbook' object has no attribute 'add_worksheet', even though I'm not trying to add a worksheet. I also could not get their solution to work.
I am a bit of a novice at python, so my code might be a bit of a mess.
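In line with the edit at the top of this question, here is a rough sketch of the openpyxl-only route: load the whole workbook, change one cell, and save it back, which leaves the other sheets untouched. It assumes the column headers sit in row 1 and that pass number 1 is the first data row (worksheet row 2). The function name `set_cell_by_header` and its parameters are hypothetical, not taken from the asker's final code.

```python
from openpyxl import load_workbook


def set_cell_by_header(path, sheet_name, header, pass_num, value):
    """Overwrite one cell, found by its column header, and save in place."""
    wb = load_workbook(path)  # loads every sheet, not only the target one
    ws = wb[sheet_name]
    # Map the header text in row 1 to 1-based column numbers.
    columns = {cell.value: idx for idx, cell in enumerate(ws[1], start=1)}
    # Assumption: row 1 holds headers, so pass 1 lives in worksheet row 2.
    ws.cell(row=pass_num + 1, column=columns[header], value=value)
    wb.save(path)  # writes all sheets back, including the untouched ones
```

For the example above the call would look like `set_cell_by_header('All_excerpts (Siena Copy)_test.xlsx', episode, 'Resources', int(pass_num), resources)`.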
I'm working with openpyxl on a .xlsx file that has around 10K products, of which some are "regular items" and some are products that need to be ordered when required. For the project I'm doing I would like to delete all of the rows containing the items that need to be ordered.
I tested this with a small sample of the actual workbook and the code worked the way I wanted. However, when I tried it on the actual workbook with 10K rows, deleting those rows seems to take forever (it has been running for nearly an hour now).
Here's the code that I used:
wb = openpyxl.load_workbook('prod.xlsx')
sheet = wb.get_sheet_by_name('Sheet1')
def clean_workbook():
    for row in sheet:
        for cell in row:
            if cell.value == 'ordered':
                sheet.delete_rows(cell.row)
Is there a faster way of doing this with some tweaks to my code? Or is there a better way to read only the regular stock from the workbook without deleting the unwanted items?
Deleting rows in loops can be slow because openpyxl has to update all the cells below the row being deleted. Therefore, you should do this as little as possible. One way is to collect a list of row numbers, check for contiguous groups and then delete using this list from the bottom.
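A minimal sketch of that first idea, assuming the marker value `'ordered'` from the question. Both helper names are made up for illustration; the worksheet only needs `iter_rows(values_only=True)` and `delete_rows(idx, amount)`, which openpyxl worksheets provide.

```python
def contiguous_groups(row_numbers):
    """Collapse row numbers into (first_row, count) runs, lowest run first."""
    groups = []
    for row in sorted(set(row_numbers)):
        if groups and row == groups[-1][0] + groups[-1][1]:
            # Row directly follows the current run: extend it by one.
            groups[-1] = (groups[-1][0], groups[-1][1] + 1)
        else:
            groups.append((row, 1))
    return groups


def delete_matching_rows(sheet, marker="ordered"):
    """Delete every row that contains `marker`; return how many were removed."""
    # Pass 1: only record row numbers, so the sheet is not changed mid-iteration.
    doomed = [
        idx
        for idx, row in enumerate(sheet.iter_rows(values_only=True), start=1)
        if marker in row
    ]
    # Pass 2: delete bottom-up, so earlier deletions never shift later targets.
    for first_row, count in reversed(contiguous_groups(doomed)):
        sheet.delete_rows(first_row, count)
    return len(doomed)
```

This issues one `delete_rows` call per run of adjacent rows instead of one per row, and it also avoids the skipped-row problem that comes from deleting while iterating.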
A better approach might be to loop through ws.values and write to a new worksheet filtering out the relevant rows. Copy any other relevant data such as formatting, etc. Then you can delete the original worksheet and rename the new one.
ws1 = wb['My Sheet']
ws2 = wb.create_sheet('My Sheet New')
for row in ws1.values:
    if row[x] == "ordered": # we can assume this is always the same column
        continue
    ws2.append(row)
del wb["My Sheet"]
ws2.title = "My Sheet"
For more sophisticated filtering you will probably want to load the values into a Pandas dataframe, make the changes and then write to a new sheet.
You can open the workbook in read-only mode and import all of its content into a list; modifying a list is always a lot faster than working on the Excel sheet directly. After you have modified the list, create a new worksheet and write the list back to Excel. I did it this way with my 100k-item Excel file.
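A rough sketch of that read-only approach with openpyxl, assuming the marker value `'ordered'` from the question and that only cell values matter (formatting is not carried over). The function name `copy_without_marker` and the output sheet title `Sheet1` are placeholders.

```python
from openpyxl import Workbook, load_workbook


def copy_without_marker(src_path, dst_path, marker="ordered", sheet_title="Sheet1"):
    """Copy the active sheet to a new file, skipping rows that contain marker."""
    src = load_workbook(src_path, read_only=True)  # streams rows, low memory
    rows = [
        row
        for row in src.active.iter_rows(values_only=True)
        if marker not in row
    ]
    src.close()  # read-only workbooks keep the file open until closed

    out = Workbook(write_only=True)  # write-only mode starts with no sheets
    ws = out.create_sheet(sheet_title)
    for row in rows:
        ws.append(row)
    out.save(dst_path)
    return len(rows)
```

Writing to a separate destination file keeps the original workbook intact until the result has been checked.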
I'm doing some testing using python-excel modules. I can't seem to find a way to delete a row in an excel sheet using these modules and the internet hasn't offered up a solution. Is there a way to delete a row using one of the python-excel modules?
In my case, I want to open an excel sheet, read the first row, determine if it contains some valid data, if not, then delete it.
Any suggestions are welcome.
xlwt provides, as the module name suggests, Excel writer functionality (creation rather than modification).
xlrd, on the other hand, provides Excel reader functionality.
If your source Excel file is rather simple (no fancy graphs, pivot tables, etc.), you should proceed this way:
read the contents of the targeted Excel file with the xlrd module, and then create a new Excel file containing only the necessary rows with the xlwt module.
If, however, you are running this on the Windows platform, you might be able to manipulate Excel directly through Microsoft COM objects, see old book reference.
I was having the same issue but found a workaround:
Use a custom filter process (Reader>Filter1>Filter2>...>Writer) to generate a copy of the source excel file but with a blank column inserted at the front. Let's call this file augmented.xls.
Then, read augmented.xls into a xlrd.Workbook object, rb, using xlrd.open_workbook().
Use xlutils.copy.copy() to convert rb into a xlwt.Workbook object, wb.
Set the value of the first column of each of the to-be-deleted rows as "x" (or other values as a marker) in wb.
Save wb back to augmented.xls.
Use another custom filter process to generate a resulting excel file from augmented.xls by omitting those rows with "x" in the first column and shifting all columns one column left (equivalent to deleting the first column of markers).
Information and examples of defining a filter process can be found in http://www.simplistix.co.uk/presentations/python-excel.pdf
Hope this helps in some way.
You can use the openpyxl library. A workbook opened with it can be both read and modified. With a few lines you can achieve what you want:
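from openpyxl import load_workbook

wb = load_workbook(filename)
ws = wb.active  # 'active' is a property, not a method
first_row = ws[1]
# Replace this check with your own validity test; as a placeholder,
# the row is treated as invalid when every cell in it is empty
if all(cell.value is None for cell in first_row):
    ws.delete_rows(1, amount=1)
wb.save(filename)  # the change is only written to disk on save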