Python: get Excel sheet names into an array and filter them by a condition

I want to load an Excel workbook ("gx_projectid.xlsx" in my example), get its sheet names, and put them in an array. Then I want to pick out the sheet names that end with "_ID". For this workbook those are the sheets at indexes [0], [1], [2], [3], [4], and [11], and I want to access and work on them later with "wb_obj.worksheets[x]".
import openpyxl
from openpyxl.styles import Font
wb = openpyxl.load_workbook("gx_projectid.xlsx")  # load the Excel workbook
sheet = wb.active
Sheet_Names = [wb.sheetnames]
print("original sheet names:", Sheet_Names)
sheets = []
for row in Sheet_Names:
    for cell in row:
        sheets.append(cell.split())
print("Put it in an array : ",sheets)
my current output:
original sheet names: [['Reserved_ID', 'PowerLED_ID', 'RC_ID', 'Brand_ID', 'Product_ID', 'Panel', 'EDID', 'Cabinet', 'DEC', 'EnergyClass', 'CompatibleConfig', 'Project_ID', 'Project-id']]
Put it in an array : [['Reserved_ID'], ['PowerLED_ID'], ['RC_ID'], ['Brand_ID'], ['Product_ID'], ['Panel'], ['EDID'], ['Cabinet'], ['DEC'], ['EnergyClass'], ['CompatibleConfig'], ['Project_ID'], ['Project-id']]
I don't know how to check whether a sheet name ends with "_ID". I tried:
for i in range(len(sheets)):
    print("sheet names",[i],": ",sheets[i])
    # if sheets[i].endswith("_ID']"):
and I got an error, because each element is a list, not a string.

First, a small tip: name your variables and functions in snake_case. In Python, CamelCase is mainly used for class names. I recommend reading PEP 8.
Now to the main problem. You are calling a str method on a list. One way around it is to convert each element to a str first:
for i in range(len(sheets)):
    str_sheet_name = str(sheets[i])  # converting to str, e.g. "['Reserved_ID']"
    if str_sheet_name.endswith("_ID']"):
        print(str_sheet_name)
This should work. Please let me know.
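A sketch of an alternative that avoids the string conversion: the sheet names are already plain strings, so `str.endswith` can be applied to each one directly. The list below is copied from the output in the question so the snippet runs without the workbook; `id_indexes` and `id_names` are illustrative names, not part of the original code.

```python
# Sheet names copied from the question's output; with openpyxl this
# list would come from wb.sheetnames (assumption: same workbook).
sheet_names = [
    "Reserved_ID", "PowerLED_ID", "RC_ID", "Brand_ID", "Product_ID",
    "Panel", "EDID", "Cabinet", "DEC", "EnergyClass",
    "CompatibleConfig", "Project_ID", "Project-id",
]

# Keep the position of every name that ends with "_ID" (case-sensitive).
id_indexes = [i for i, name in enumerate(sheet_names) if name.endswith("_ID")]
id_names = [sheet_names[i] for i in id_indexes]

print(id_indexes)
print(id_names)
```

Each index can then be used as `wb.worksheets[i]`, or each name as `wb[name]`.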

Related

Is there a way to duplicate a worksheet using gspread in python?

So I've been trying to create a data store using google sheets (It's easier for me to navigate). The way I'm trying to do this is by creating a new worksheet with the user's ID and putting the information I have saved in a separate worksheet called 'template' into the new worksheet.
This is my current code:
newsheet = sh.add_worksheet(title = f"{author}", rows = "152", cols = "2")
for index in range(1, len(savefiletemplate.col_values(1))):
    newsheet.update_cell(index, 1, savefiletemplate.cell(index, 1).value)
Here author is the user's ID, sh is my spreadsheet, and savefiletemplate is my template worksheet.
After copying 90-120 cells it gives me a very long error that I don't understand. I was wondering whether this is my fault or my IDE's fault, and whether anyone knows how to fix it.
I would try something like this:
rows = 152
cols = 2
template_data = savefiletemplate.get_all_values()  # this will be a list of lists
newsheet = sh.add_worksheet(title=author, rows=rows, cols=cols)
a1_range = f"A1:{chr(ord('A') + cols - 1)}{rows}"  # A1:B152 (valid for up to 26 columns)
newsheet.update(a1_range, template_data)
This reads the whole template into a list of lists, uses the rows and cols values to build the A1 notation for the target range, and then writes template_data to that range in the new sheet with a single call instead of one call per cell. Give it a shot and see if it does what you're looking for.
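The single-character arithmetic for the column letter only covers columns A through Z. A minimal sketch of a more general range builder follows; `column_letter` and `a1_range` are hypothetical helper names introduced here, not gspread functions.

```python
def column_letter(col: int) -> str:
    """Convert a 1-based column number to spreadsheet letters (1 -> A, 27 -> AA)."""
    if col < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while col > 0:
        # Shift to 0-based before dividing so that 26 maps to Z, not to A0.
        col, remainder = divmod(col - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def a1_range(rows: int, cols: int) -> str:
    """Build the A1 notation for a block that starts at cell A1."""
    return f"A1:{column_letter(cols)}{rows}"


print(a1_range(152, 2))
print(a1_range(10, 28))
```

For the template in the question, `a1_range(152, 2)` yields the same `A1:B152` range as the one-line version.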

Python Xlsx writing format advice

I've created a list and a for loop that iterates over each item and writes it to a cell in Excel. I'm using openpyxl. When I first started, a simple statement like:
sheet["A1"] = "hello"
resulted in cell A1 showing the value hello, without quotation marks.
I have this code:
from openpyxl import Workbook
workbook = Workbook()
sheet = workbook.active
text = ["Whistle", "Groot", "Numbers", "Mulan", "Buddy Holly"]
other = [5, 8, 100, 120]
for i in range(1,len(text)+1):
    cell_letter = "A"
    cell_number = str(i)
    sheet[str((cell_letter + cell_number))] = str(text[i-1:i])
and it writes to the corresponding cell locations as it iterates over the variable "text". But when I open the file, the cells contain ['Whistle'] and ['Groot'].
What am I missing? Should I be passing each iteration to another variable to convert it from a list to a tuple for it to be written in then?
Sorry if my code seems a bit messy, I've literally just learned this over the past few hours and it's (kind of) doing what I need it to do, with the exception of the writing format.
openpyxl lets you write a list of lists, where each inner list represents one row of the xlsx file.
So, you can store what you want as:
data_to_write = [["Whistle", "Groot", "Numbers", "Mulan", "Buddy Holly"]]
or, if you want some of the data on the next row:
data_to_write = [["Whistle", "Groot", "Numbers"], ["Mulan", "Buddy Holly"]]
Then add each row to your worksheet:
for line in data_to_write:
    sheet.append(line)
and, finally, save it:
workbook.save("filename.xlsx")
The full code could be something like:
from openpyxl import Workbook
workbook = Workbook()
sheet = workbook.active
data_to_write = [["Whistle", "Groot", "Numbers", "Mulan", "Buddy Holly"]]
for line in data_to_write:
    sheet.append(line)
workbook.save('example.xlsx')
Give it a try and let me know how it goes.
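As for why the brackets appear in the loop from the question: `text[i-1:i]` is a slice, and a slice of a list is itself a list, so `str()` of it keeps the brackets and quotes. Indexing with `text[i-1]` returns the item itself. A small sketch of the difference, which needs no workbook to run (the `cells` dict is only an illustration of the cell-to-value pairing):

```python
text = ["Whistle", "Groot", "Numbers", "Mulan", "Buddy Holly"]

# A slice always returns a list, even when it holds a single item,
# so str() of it keeps the brackets and quotes.
as_slice = str(text[0:1])
# Indexing returns the item itself.
as_item = text[0]

# Pair each value with its target cell in column A (A1, A2, ...).
cells = {f"A{row}": value for row, value in enumerate(text, start=1)}

print(as_slice)
print(as_item)
print(cells["A5"])
```

With openpyxl, each pair could then be written as `sheet[ref] = value`, which keeps one value per row in column A as in the original loop.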

Convert an Excel file with many sheets (with spaces in the sheet names) into a pandas DataFrame

I would like to convert an Excel file to a pandas DataFrame. All the sheet names contain spaces, for instance 'part 1 of 22', 'part 2 of 22', and so on. In addition, the first column is the same in all the sheets.
I would like to convert this Excel file into a single DataFrame. However, I don't know what happens to the names in Python. I was able to import the sheets, but I do not know the names of the resulting DataFrames.
After this I would like to use another for loop with pd.merge() to create a single DataFrame.
for sheet_name in Matrix.sheet_names:
    sheet_name = pd.read_excel(Matrix, sheet_name)
    print(sheet_name.info())
Using only the code snippet you have shown, each sheet (each DataFrame) will be assigned to the variable sheet_name. Thus, this variable is overwritten on each iteration and you will only have the last sheet as a DataFrame assigned to that variable.
To achieve what you want to do you have to store each sheet, loaded as a DataFrame, somewhere, a list for example. You can then merge or concatenate them, depending on your needs.
Try this:
all_my_sheets = []
for sheet_name in Matrix.sheet_names:
    sheet_name = pd.read_excel(Matrix, sheet_name)
    all_my_sheets.append(sheet_name)
Or, even better, using list comprehension:
all_my_sheets = [pd.read_excel(Matrix, sheet_name) for sheet_name in Matrix.sheet_names]
You can then concatenate them into one DataFrame like this:
final_df = pd.concat(all_my_sheets, sort=False)
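If the sheet names are needed later, for example to tell the DataFrames apart, the same loop can fill a dict keyed by sheet name instead of a list. The sketch below shows only the pattern: `load_sheet` is a hypothetical stand-in for `pd.read_excel(Matrix, sheet_name)` and returns plain lists so that the snippet runs without pandas or the workbook.

```python
def load_sheet(sheet_name):
    """Hypothetical stand-in for pd.read_excel(Matrix, sheet_name)."""
    return [f"rows of {sheet_name}"]


# Names with spaces are fine as dict keys.
sheet_names = ["part 1 of 22", "part 2 of 22", "part 3 of 22"]

# One entry per sheet, so nothing is overwritten between iterations.
sheets_by_name = {name: load_sheet(name) for name in sheet_names}

print(list(sheets_by_name))
print(sheets_by_name["part 2 of 22"])
```

pd.read_excel also accepts sheet_name=None, which returns a dict of DataFrames keyed by sheet name in one call.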
You might consider using the openpyxl package:
from openpyxl import load_workbook
import pandas as pd
# file_path is the path to your workbook
wb = load_workbook(filename=file_path, read_only=True)
# Assuming all your sheets have the same header row
records = []
first_row = 1
for ws in wb.worksheets:
    for row in ws.iter_rows(min_row=first_row, values_only=True):
        records.append(list(row))
    # Make sure you don't duplicate the header
    first_row = 2
# ------------------------------
# Set the column names
header = records.pop(0)
# Create your df
df = pd.DataFrame(records, columns=header)
It may be easiest to call read_excel() once and get all the sheets back together.
So, the first step would look like this:
dfs = pd.read_excel(Matrix, sheet_name=["Sheet 1", "Sheet 2", "Sheet 3"])
Note that the sheet names you use in the list should be the same as those in the Excel file, spaces included. The result is a dict of DataFrames keyed by sheet name. Then, if you wanted to vertically concatenate these sheets, you would just call:
final_df = pd.concat(list(dfs.values()), axis=0)
Note that this solution would result in a final_df whose columns are the union of the column headers from all three sheets, so ideally they would be the same. It sounds like you want to merge the information, which would be done differently; we can't help you with the merge without more information.
I hope this helps!

How do I get data from multiple excel files and use them to create a new sheet

I have two sheets.
Sheet 1 has a comprehensive list of id numbers and their categories.
Sheet 2 has a shorter list of id numbers and associated values.
Goals:
1. look at each id and value in sheet 1
2. use that information to get the category of each value from sheet 2
3. sort the values by category into different columns in a new sheet
I'm trying to teach myself how to automate checking excel files, but I'm having a hard time. I've tried using openpyxl and xlrd. I don't know which is better for this situation.
My last attempt was using xlrd. I tried to use a for loop to get the contents of each row in the form of a list. I got stuck trying to iterate over every row except the first one.
import xlrd
# Opening workbooks
wb1 = xlrd.open_workbook('DummySheet1.xlsx')  # id and values
wb2 = xlrd.open_workbook('DummySheet2.xlsx')  # id and category
# Opening specific sheets
wb1sheet1 = wb1.sheet_by_index(0)
print(wb1sheet1.name)
wb2sheet1 = wb2.sheet_by_index(0)
print(wb2sheet1.name)
# Function for checking cells
def check():
    for i in range(wb1sheet1.nrows):
        row = wb1sheet1.row_values(i)
        print(row)
check()
1. Read the sheets using the following code:
import pandas as pd
wb1 = pd.read_excel('DummySheet1.xlsx')
wb2 = pd.read_excel('DummySheet2.xlsx')
2. Merge the data using the merge function, joining on the shared id column (assumed here to be named 'id'):
result = pd.merge(wb1, wb2, on='id')
3. Sort the values by category:
result = result.sort_values(by='category')
4. Write the data to a new sheet:
result.to_excel(newFileName)
Check if this works for you.
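The lookup-and-group logic from the goals can also be sketched without any Excel library. The rows below are made-up sample data standing in for the two sheets, with a header in the first row; the column layout and the category names are assumptions.

```python
from collections import defaultdict

# Made-up rows standing in for the two sheets (first row is the header).
category_rows = [["id", "category"], [1, "fruit"], [2, "tool"], [3, "fruit"]]
value_rows = [["id", "value"], [3, 7.5], [1, 2.0], [2, 9.0]]

# Slicing with [1:] iterates over every row except the header.
category_by_id = {row[0]: row[1] for row in category_rows[1:]}

# Collect the values under their category; each key becomes one column.
values_by_category = defaultdict(list)
for row_id, value in value_rows[1:]:
    values_by_category[category_by_id[row_id]].append(value)

print(dict(values_by_category))
```

With xlrd, the same header skip would be `range(1, wb1sheet1.nrows)` in the loop from the question.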

Xlsxwriter: format three cell ranges in same worksheet

I would like to format A1:E14 as US Dollars, F1:K14 as percentages and A15:Z1000 as US Dollars. Is there a way to do this in XlsxWriter?
I know how to format full columns as Dollars/Percentages, but I don't know how to format parts of columns -- whatever I do last will overwrite Columns F:K.
The data starts in pandas, so I'm happy to solve the problem there. The following does not seem to work:
sheet.set_column('A1:E14', None, money_format)
More Code:
with pd.ExcelWriter(write_path) as writer:
    book = writer.book
    money_fmt = book.add_format({'num_format': '$#,##0'})
    pct_fmt = book.add_format({'num_format': '0.00%'})
    # call func that creates a worksheet named total with no format
    df.to_excel(writer, sheet_name='Total', startrow=0)
    other_df.to_excel(writer, sheet_name='Total', startrow=15)
    writer.sheets['Total'].set_column('A1:E14',20, money_fmt)
    writer.sheets['Total'].set_column('F1:K14',20, pct_fmt)
    writer.sheets['Total'].set_column('F15:Z1000', 20, money_fmt)
I cannot see a way to achieve per-cell formatting using just XlsxWriter with pandas, but it would be possible to apply the formatting in a separate step using openpyxl as follows:
import openpyxl
import pandas as pd
def write_format(ws, cell_range, number_format):
    for row in ws[cell_range]:
        for cell in row:
            cell.number_format = number_format
sheet_name = "Total"
with pd.ExcelWriter(write_path) as writer:
    write_worksheet(df, writer, sheet_name=sheet_name)  # the asker's function that writes the sheet
# The file is saved when the with block exits; reopen it with openpyxl
wb = openpyxl.load_workbook(write_path)
ws = wb[sheet_name]
money_fmt = '$#,##0_-'
pct_fmt = '0.00%'
write_format(ws, 'A1:G1', money_fmt)
write_format(ws, 'A1:E14', money_fmt)
write_format(ws, 'F1:K14', pct_fmt)
write_format(ws, 'F15:Z1000', money_fmt)
wb.save(write_path)
When attempted with XlsxWriter, it always overwrites the existing data from pandas. But if pandas is then made to rewrite the data, it overwrites any applied formatting. There does not appear to be any method to apply formatting to an existing cell without overwriting the contents. For example, the documentation for the write_blank() function states:
"This method is used to add formatting to a cell which doesn’t contain a string or number value."
