nested drop downs with xlwings - python

I'm trying to generate nested drop-downs with xlwings, a Python module that links Python scripts to VBA functions. I was able to do this with the xlsxwriter module using Excel's =INDIRECT(cell) formula, but I can't seem to find an equivalent in xlwings.

Well, while trying to refine my question I found the answer.
Here is how to do nested drop downs with xlsxwriter.
import xlsxwriter
# open workbook
workbook = xlsxwriter.Workbook('nested_drop_downs_with_xlsxwriter.xlsx')
# data
countries = ['Mexico', 'USA', 'Canada']
mexican_cities = ['Morelia', 'Cancun', 'Puebla']
usa_cities = ['Chicago', 'Florida', 'Boston']
canada_cities = ['Montreal', 'Toronto', 'Vancouver']
# add data to workbook
worksheet = workbook.add_worksheet("sheet1")
worksheet.write_column(0, 0, countries)
worksheet.write(0, 1, countries[0])
worksheet.write_column(1, 1, mexican_cities)
worksheet.write(0, 2, countries[1])
worksheet.write_column(1, 2, usa_cities)
worksheet.write(0, 3, countries[2])
worksheet.write_column(1, 3, canada_cities)
# name regions (absolute references, so the names do not shift with the cell that uses them)
workbook.define_name('Mexico', '=sheet1!$B$2:$B$4')
workbook.define_name('USA', '=sheet1!$C$2:$C$4')
workbook.define_name('Canada', '=sheet1!$D$2:$D$4')
# data validation: country list in A10, dependent city list in B10
worksheet.data_validation('A10', {'validate': 'list', 'source': '=sheet1!$A$1:$A$3'})
worksheet.data_validation('B10', {'validate': 'list', 'source': '=INDIRECT($A$10)'})
workbook.close()
And here is how to do nested drop downs with xlwings:
import xlwings as xw
def main():
    wb = xw.Book.caller()
    sheet = wb.sheets('Sheet1')
    # data
    countries = ['Mexico', 'USA', 'Canada']
    mexican_cities = ['Morelia', 'Cancun', 'Puebla']
    usa_cities = ['Chicago', 'Florida', 'Boston']
    canada_cities = ['Montreal', 'Toronto', 'Vancouver']
    # add data to workbook
    sheet.range('A1:A3').options(transpose=True).value = countries
    sheet.range('B1').value = countries[0]
    sheet.range('B2').options(transpose=True).value = mexican_cities
    sheet.range('C1').value = countries[1]
    sheet.range('C2').options(transpose=True).value = usa_cities
    sheet.range('D1').value = countries[2]
    sheet.range('D2').options(transpose=True).value = canada_cities
    # name regions <-------- naming regions with a dollar sign was the fix!
    # note: .api exposes the platform object; these calls follow the macOS (appscript) style
    sheet.range('$B$2:$B$4').api.name.set('Mexico')
    sheet.range('$C$2:$C$4').api.name.set('USA')
    sheet.range('$D$2:$D$4').api.name.set('Canada')
    # data validation (type=3 is a list validation)
    sheet.range('A10').api.validation.delete()
    sheet.range('A10').api.validation.add_data_validation(type=3, formula1='=Sheet1!$A$1:$A$3')
    sheet.range('B10').api.validation.delete()
    sheet.range('B10').api.validation.add_data_validation(type=3, formula1='=INDIRECT($A$10)')
if __name__ == "__main__":
    xw.Book("demo1.xlsm").set_mock_caller()
    main()
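One caveat with the INDIRECT approach, in case the first-level entries are not single words: Excel defined names cannot contain spaces, so a label such as "United States" could not be used directly as a range name. Below is a minimal sketch of a workaround. `to_defined_name` is a hypothetical helper (not part of xlwings or xlsxwriter), and it does not cover every Excel naming rule, for example names that look like cell references.

```python
import re


def to_defined_name(label):
    """Turn a drop-down label into a string usable as an Excel defined name.

    Hypothetical helper: replaces every character that is not a letter,
    digit or underscore with an underscore, and prefixes an underscore
    if the result does not start with a letter or underscore.
    """
    name = re.sub(r'\W', '_', label.strip())
    if not re.match(r'[A-Za-z_]', name):
        name = '_' + name
    return name


for country in ['Mexico', 'United States', '3rd Country']:
    print(country, '->', to_defined_name(country))
```

The dependent validation formula would then need the same substitution on the Excel side, for example =INDIRECT(SUBSTITUTE($A$10," ","_")).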

Related

How to extract text and save as excel file using python or JavaScript

How do I extract text from this PDF file, where some of the data is in a table and some is key-value data?
eg:
https://drive.internxt.com/s/file/78f2d73478b832b2ab55/3edb275967deeca6ad33e7d53f2337c50d5dfb50e0aa525bb7f10d49dff1e2b4
This is what I have tried :
import PyPDF2
import openpyxl
from openpyxl import Workbook
pdfFileObj = open('sample.pdf', 'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
pdfReader.numPages
pageObj = pdfReader.getPage(0)
mytext = pageObj.extractText()
wb = Workbook()
sheet = wb.active
sheet.title = 'MyPDF'
sheet['A1'] = mytext
wb.save('sample.xlsx')
print('Save')
However I'd like the data to be stored in the following format.
This PDF does not have well-defined tables, so no tool can extract the entire data set in one table format. What we can do is read the entire PDF as text and process the data fields line by line, using regex to extract the data.
Before you move ahead, please install the pdfplumber package for Python:
pip install pdfplumber
Assumptions
Here are some assumptions that I made for your pdf and accordingly I have written the code.
First line will always contain the title Account History Report.
Second line will contain the names IMAGE All Notes
Third line will contain only the data Date Created in the form of key:value.
Fourth line will contain only the data Number of Pages in the form of key:value.
Fifth line will only contain the data Client Code, Client Name
Starting at line 6, a PDF can contain multiple data entities. This PDF has 2, but there can be any number.
Each data entity will contain the following fields:
First line in the data entity will contain only the data Our Ref, Name, Ref 1, Ref 2
Second line will contain only the data Amount, Total Paid, Balance, Date of A/C, Date Received, in the form present in the PDF
Third line in the data entity will contain the data Last Paid, Amt Last Paid, Status, Collector.
Fourth line will contain the column name Date Notes
The subsequent lines will contain data in the form of table until the next data entity is started.
I also assume that each data entity starts with the key Our Ref :.
I assume that data entities are separated by their first line, which follows the key-value pattern Our Ref :Value Name: Value Ref 1 :Value Ref 2:value
pattern = r'Our Ref.*?Name.*?Ref 1.*?Ref 2.*?'
Note that each thick black rectangle I drew in the image above is what I call a data entity.
The final data will be stored in a dictionary (JSON) where the data entities have the keys data_entity1, data_entity2, data_entity3 and so on, based on the number of entities in your PDF.
The header details are stored in the json as key:value and I assume that each key will be present in header only once.
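The splitting assumption above can be checked on a small string before running it against a real PDF. This is only a sketch: the sample text is made up to imitate the described layout, and it relies on `re.split` accepting zero-width matches, which needs Python 3.7 or later.

```python
import re

# Made-up text imitating the layout described above (not the real PDF).
sample_text = (
    "Account History Report\n"
    "Date Created: 18/04/2022\n"
    "Our Ref :111 Name: Sky Blue Ref 1 :A-1 Ref 2:F001\n"
    "Date Notes\n"
    "04/03/2022 Letter sent\n"
    "Our Ref :222 Name: Green Field Ref 1 :B-2 Ref 2:F002\n"
    "Date Notes\n"
)

# Zero-width lookahead: split *before* each entity's first line, keeping that line.
entity_start = r'(?=Our Ref.*?Name.*?Ref 1.*?Ref 2)'
chunks = re.split(entity_start, sample_text)

# The first chunk is the header; every later chunk is one data entity.
header, entities = chunks[0], chunks[1:]
first_refs = [re.search(r'Our Ref :(.*?)Name', e).group(1).strip() for e in entities]
print(len(entities), first_refs)
```

Because the pattern is a lookahead, nothing is consumed by the split, so each chunk still begins with its own Our Ref line and can be parsed with the per-line patterns.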
CODE
Here is the code, which returns the information from the PDF as JSON. In the output, the first few fields contain information from the header; the data entities follow as data_entity1 and data_entity2.
In the code below, all you need to change is pdf_path.
import pdfplumber
import re
# regex pattern for keys in line1 of data entity
my_regex_dict_line1 = {
    'Our Ref' : r'Our Ref :(.*?)Name',
    'Name' : r'Name:(.*?)Ref 1',
    'Ref 1' : r'Ref 1 :(.*?)Ref 2',
    'Ref 2' : r'Ref 2:(.*?)$'
}
# regex pattern for keys in line2 of data entity
my_regex_dict_line2 = {
    'Amount' : r'Amount:(.*?)Total Paid',
    'Total Paid' : r'Total Paid:(.*?)Balance',
    'Balance' : r'Balance:(.*?)Date of A/C',
    'Date of A/C' : r'Date of A/C:(.*?)Date Received',
    'Date Received' : r'Date Received:(.*?)$'
}
# regex pattern for keys in line3 of data entity
my_regex_dict_line3 = {
    'Last Paid' : r'Last Paid:(.*?)Amt Last Paid',
    'Amt Last Paid' : r'Amt Last Paid:(.*?)A/C\s+Status',
    'A/C Status': r'A/C\s+Status:(.*?)Collector',
    'Collector' : r'Collector :(.*?)$'
}
def preprocess_data(data):
    ''' Split text into stripped, non-empty lines '''
    return [el.strip() for el in data.splitlines() if el.strip()]
def get_header_data(text, json_data):
    ''' Read the header fields from the text before the first data entity into json_data '''
    header_data_list = preprocess_data(text)
    # third line in text of header contains Date Created field
    json_data['Date Created'] = re.search(r'Date Created:(.*?)$', header_data_list[2]).group(1).strip()
    # fourth line in text contains Number of Pages
    json_data['Number of Pages'] = re.search(r'Number of Pages:(.*?)$', header_data_list[3]).group(1).strip()
    # fifth line in text contains Client Code and Client Name
    json_data['Client Code'] = re.search(r'Client Code - (.*?)Client Name', header_data_list[4]).group(1).strip()
    json_data['ClientName'] = re.search(r'Client Name - (.*?)$', header_data_list[4]).group(1).strip()
def iterate_through_regex_and_populate_dictionaries(data_dict, regex_dict, text):
    ''' For the given regex_dict, this function searches text with each regex pattern and adds the matched key and value to the data_dict dictionary '''
    for key, regex in regex_dict.items():
        matched_value = re.search(regex, text)
        if matched_value is not None:
            data_dict[key] = matched_value.group(1).strip()
def populate_date_notes(data_dict, text):
    ''' This function populates Date and Notes from the lines of a data chunk, as lists, into the data_dict dictionary '''
    data_dict['Date'] = []
    data_dict['Notes'] = []
    # the first four lines are the three key:value lines and the Date Notes column names
    for line in text[4:]:
        row_match = re.search(r'(\d{2}/\d{2}/\d{4})\s*(.*?)$', line)
        # skip lines without a date instead of failing on them
        if row_match is None:
            continue
        data_dict['Date'].append(row_match.group(1).strip())
        data_dict['Notes'].append(row_match.group(2).strip())
data_index = 1
json_data = {}
pdf_path = r'C:\Users\hpoddar\Desktop\Temp\sample3.pdf' # ENTER YOUR PDF PATH HERE
pdf_text = ''
# zero-width lookahead, so re.split keeps the first line of every data entity
data_entity_sep_pattern = r'(?=Our Ref.*?Name.*?Ref 1.*?Ref 2)'
if __name__ == '__main__':
    with pdfplumber.open(pdf_path) as pdf:
        # join the text of all pages; extract_text() may return None for a page without text
        for page in pdf.pages:
            pdf_text += '\n' + (page.extract_text() or '')
    split_on_data_entity = re.split(data_entity_sep_pattern, pdf_text.strip())
    # first data in the split_on_data_entity list will contain the header information
    get_header_data(split_on_data_entity[0], json_data)
    while data_index < len(split_on_data_entity):
        data_entity = {}
        data_processed = preprocess_data(split_on_data_entity[data_index])
        iterate_through_regex_and_populate_dictionaries(data_entity, my_regex_dict_line1, data_processed[0])
        iterate_through_regex_and_populate_dictionaries(data_entity, my_regex_dict_line2, data_processed[1])
        iterate_through_regex_and_populate_dictionaries(data_entity, my_regex_dict_line3, data_processed[2])
        # the fourth line holds the Date Notes column names when the entity has a notes table
        if len(data_processed) > 3 and 'Date' in data_processed[3] and 'Notes' in data_processed[3]:
            populate_date_notes(data_entity, data_processed)
        json_data['data_entity' + str(data_index)] = data_entity
        data_index += 1
    print(json_data)
Output :
Result string :
{'Date Created': '18/04/2022', 'Number of Pages': '4', 'Client Code': '110203', 'ClientName': 'AWS PTE. LTD.', 'data_entity1': {'Our Ref': '2118881115', 'Name': 'Sky Blue', 'Ref 1': '12-34-56789-2021/2', 'Ref 2': 'F2021004444', 'Amount': '$100.11', 'Total Paid': '$0.00', 'Balance': '$100.11', 'Date of A/C': '01/08/2021', 'Date Received': '10/12/2021', 'Last Paid': '', 'Amt Last Paid': '', 'A/C Status': 'CLOSED', 'Collector': 'Sunny Jane', 'Date': ['04/03/2022'], 'Notes': ['Letter Dated 04 Mar 2022.']}, 'data_entity2': {'Our Ref': '2112221119', 'Name': 'Green Field', 'Ref 1': '98-76-54321-2021/1', 'Ref 2': 'F2021001111', 'Amount': '$233.88', 'Total Paid': '$0.00', 'Balance': '$233.88', 'Date of A/C': '01/08/2021', 'Date Received': '10/12/2021', 'Last Paid': '', 'Amt Last Paid': '', 'A/C Status': 'CURRENT', 'Collector': 'Sam Jason', 'Date': ['11/03/2022', '11/03/2022', '08/03/2022', '08/03/2022', '21/02/2022', '18/02/2022', '18/02/2022'], 'Notes': ['Email for payment', 'Case Status', 'to send a Letter', '845***Ringing, No reply', 'Letter printed - LET: LETTER 2', 'Letter sent - LET: LETTER 2', '845***Line busy']}}
Once you have the data in JSON format, you can load it into a CSV file, a data frame, or whatever format you need.
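As one possible way to get from the nested dictionary to CSV, here is a sketch that flattens it into one row per Date/Notes pair using only the standard library. `entities_to_rows` is a hypothetical helper, the sample dictionary is a shortened, made-up version of the output above, and the row layout is an assumption rather than the format requested in the question.

```python
import csv
import io


def entities_to_rows(json_data):
    """Flatten the nested dictionary into one row per Date/Notes pair."""
    header = {k: v for k, v in json_data.items() if not k.startswith('data_entity')}
    rows = []
    for key, entity in json_data.items():
        if not key.startswith('data_entity'):
            continue
        scalars = {k: v for k, v in entity.items() if not isinstance(v, list)}
        dates = entity.get('Date', [])
        notes = entity.get('Notes', [])
        # An entity without notes still gets one row.
        pairs = list(zip(dates, notes)) or [('', '')]
        for date, note in pairs:
            rows.append({**header, **scalars, 'Date': date, 'Notes': note})
    return rows


# Shortened, made-up sample in the same shape as the output above.
sample = {
    'Date Created': '18/04/2022',
    'data_entity1': {'Our Ref': '111', 'Date': ['04/03/2022'], 'Notes': ['Letter sent']},
    'data_entity2': {'Our Ref': '222', 'Date': ['11/03/2022', '08/03/2022'],
                     'Notes': ['Email for payment', 'Case Status']},
}

rows = entities_to_rows(sample)
buffer = io.StringIO()  # swap for open('report.csv', 'w', newline='') to write a file
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Repeating the header and entity fields on every row makes the file easy to filter in a spreadsheet, at the cost of some redundancy.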
Save as xlsx
To save the data in an xlsx file in the format shown in the image in the question above, we can use xlsxwriter.
Please install the package using pip
pip install xlsxwriter
From the previous code, we have all the data in the variable json_data. We iterate through the data entities and write each value to the cell given by row and col in the code.
import xlsxwriter
workbook = xlsxwriter.Workbook('Sample.xlsx')
worksheet = workbook.add_worksheet("Sheet 1")
row = 0
col = 0
# write columns
columns = ['Account History Report', 'All Notes'] + [ key for key in json_data.keys() if 'data_entity' not in key ] + list(json_data['data_entity1'].keys())
worksheet.write_row(row, col, tuple(columns))
row += 1
# map each column name to its column index
column_index_map = {}
for index, column_name in enumerate(columns):
    column_index_map[column_name] = index
# write the header
worksheet.write(row, column_index_map['Date Created'], json_data['Date Created'])
worksheet.write(row, column_index_map['Number of Pages'], json_data['Number of Pages'])
worksheet.write(row, column_index_map['Client Code'], json_data['Client Code'])
worksheet.write(row, column_index_map['ClientName'], json_data['ClientName'])
data_entity_index = 1
# iterate through each data entity and for each key insert the values in the sheet
while True:
    data_entity_key = 'data_entity' + str(data_entity_index)
    row_size = 1
    if json_data.get(data_entity_key) is not None:
        for key, value in json_data.get(data_entity_key).items():
            if isinstance(value, list):
                worksheet.write_column(row, column_index_map[key], tuple(value))
                # advance by the longest list so the next entity does not overwrite this one
                row_size = max(row_size, len(value))
            else:
                worksheet.write(row, column_index_map[key], value)
    else:
        break
    data_entity_index += 1
    row += row_size
workbook.close()
Result :
The above code creates a file Sample.xlsx in the working directory.

I am unable to use iter_rows() to iterate excel and add cell values as a list of dictionaries

I am new to Python, but I have been trying to create a list of dictionaries from an Excel file read with openpyxl. I use iter_rows() to read all the rows in the file and build a dictionary from each row. The script then appends that dictionary to a list, but when I view the list, the last row (dictionary) appears several times. I am not sure why it only appends the last row.
Input Excel file
import openpyxl
# Give the location of the file
path = 'C:\\Users\\.....\\pythonExcelDemo.xlsx'
# workbook object is created
wb_obj = openpyxl.load_workbook(path)
thisList = []
inner_dict = {}
sheet_obj = wb_obj.active
for row in sheet_obj.iter_rows(2, 6, 1, 3):
    for cell in row:
        if cell.column == 1:
            inner_dict.update({'Students Name': cell.value})
        if cell.column == 2:
            inner_dict.update({'Department': cell.value})
        if cell.column == 3:
            inner_dict.update({'Fund': cell.value})
    thisList.append(inner_dict)
print(thisList)
Output-----
[{'Students Name': 'Keli', 'Department': 'Branch', 'Fund': 160}, {'Students Name': 'Keli', 'Department': 'Branch', 'Fund': 160}, {'Students Name': 'Keli', 'Department': 'Branch', 'Fund': 160}, {'Students Name': 'Keli', 'Department': 'Branch', 'Fund': 160}, {'Students Name': 'Keli', 'Department': 'Branch', 'Fund': 160}]
What you're missing here is a key point about Python. You're not creating a set of dictionaries. You're creating exactly ONE dictionary, modifying it during each loop, and creating a list with many references to that one dictionary. When you change one, you change them all. You need to create a new dict on every loop iteration. Do this:
for row in sheet_obj.iter_rows(2, 6, 1, 3):
    inner_dict = {}
    for cell in row:
        if cell.column == 1:
            inner_dict['Students Name'] = cell.value
        elif cell.column == 2:
            inner_dict['Department'] = cell.value
        elif cell.column == 3:
            inner_dict['Fund'] = cell.value
    thisList.append(inner_dict)
print(thisList)
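The shared-reference behaviour can be reproduced without Excel. This is a small sketch with made-up rows standing in for the worksheet; the `zip`-based version is just one alternative way to build a fresh dictionary per row.

```python
rows = [('Ann', 'Math', 100), ('Bob', 'Art', 120), ('Keli', 'Branch', 160)]  # made-up data

# Buggy: one dictionary created once, appended three times.
shared = {}
buggy = []
for name, dept, fund in rows:
    shared.update({'Students Name': name, 'Department': dept, 'Fund': fund})
    buggy.append(shared)

# Fixed: a fresh dictionary per row.
keys = ('Students Name', 'Department', 'Fund')
fixed = [dict(zip(keys, row)) for row in rows]

print(all(d is buggy[0] for d in buggy))  # are all entries the same object?
print([d['Students Name'] for d in buggy])
print([d['Students Name'] for d in fixed])
```

The `is` check shows that every element of the buggy list is the very same object, which is why the last update shows up in all of them.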

Nested dictionary groups from excel

I'm new to Python and openpyxl. I started learning them to make my everyday tasks at work easier and faster.
Task:
There is an Excel file with a lot of rows. It looks like this:
excel file
I want to create a daily report based on this Excel file. In my example, today is 2019/05/08.
Expected result:
Only show the rows where the date matches today's date.
Expected structure:
required outcome
My solution
In my solution I create a list of the rows that contain today's date. After that I read only those rows and create dictionaries, but I get no result. I also have trouble working with multiple keys, because the same issue number appears several times in the list.
from datetime import datetime
import openpyxl
from openpyxl import load_workbook
from openpyxl.utils import get_column_letter
from openpyxl.utils import column_index_from_string
#Open excel file
excel_path = "\\REE.xlsx"
wb = openpyxl.load_workbook(excel_path, data_only=True)
ws_1 = wb.worksheets[1]
#Today's date. It needs some formatting due to Excel date handling
today = datetime.today()
today = today.replace(hour=00, minute=00, second=00, microsecond=00)
#Create a list of the lines where only today's values are present
issue_line_list = []
for cell in ws_1["B"]:
    if cell.value == today:
        issue_line = cell.row
        issue_line_list.append(issue_line)
#Create a txt file for output
file = open("daily_report.txt", "w")
#The dict that I want to use
dict = []
issue_numbers_list = []
issue = []
#Create a dict for the issues
for line in issue_line_list:
    issue_number_value = ws_1.cell(row = line, column = 3).value
    issue_numbers_list.append(issue_number_value)
#Create a dict for other information
for line in issue_line_list:
    issue_number_value = ws_1.cell(row = line, column = 3).value
    by_value = ws_1.cell(row = line, column = 2 ).value
    group_value = ws_1.cell(row = line, column = 4).value
    events_value = ws_1.cell(row = line, column = 5).value
    deadline_value = ws_1.cell(row = line, column = 6).value
    try:
        deadline_value = deadline_value.strftime('%Y.%m.%d')
    except:
        deadline_value = ""
    issue.append(issue_number_value)
    issue.append(by_value)
    issue.append(group_value)
    issue.append(events_value)
    issue.append(deadline_value)
    issue.append(deadline_value)
#Append the two dicts
dict.append(issue_numbers_list)
dict.append(issue)
#Save it to the txt file.
file.write(dict)
file.close()
Questions
- How do I solve the issue of multiple identical keys?
- How do I create nested groups?
- What should I add to or delete from my code to get the expected result?
Remark
openpyxl is not the only option. If you have a better/easier/faster way, I am open to any idea.
Thank you in advance for your support!
Can you try the following:
import pandas as pd
cols = ['date', 'by', 'issue_number', 'group', 'events', 'deadline']
req_cols = ['events', 'deadline']
data = [
['2019-05-07', 'john', '113140', '#issue_closed', 'something different', ''],
['2019-05-08', 'david', '113140', '#task', 'something different', ''],
['2019-05-08', 'victor', '114761', '#task_result', 'something different', ''],
['2019-05-08', 'john', '114761', '#task', 'something different', '2019-05-10'],
['2019-05-08', 'david', '114761', '#task',
'something different', '2019-05-08'],
['2019-05-08', 'victor', '113140', '#task_result', 'something different', ''],
['2019-05-07', 'john', '113140', '#issue_created',
'something different', '2019-05-09'],
['2019-05-07', 'david', '113140', '#location', 'something different', ''],
['2019-05-07', 'victor', '113140', '#issue_closed', 'something different', 'done'],
['2019-05-07', 'john', '113140', '#task_result', 'something different', ''],
['2019-05-07', 'david', '113140', '#task',
'something different', '2019-05-10'],
]
df = pd.DataFrame(data, columns=cols)
df1 = df.groupby(['issue_number', 'group']).describe()[req_cols].droplevel(0, axis=1)['top']
df1.columns = req_cols
print(df1)
Output:
                              events               deadline
issue_number  group
113140        #issue_closed   something different  done
              #issue_created  something different  2019-05-09
              #location       something different
              #task           something different  2019-05-10
              #task_result    something different
114761        #task           something different  2019-05-08
              #task_result    something different
To open an excel file, you can do the following:
df = pd.read_excel(excel_path, sheet_name=my_sheet)
req_cols = ['EVENTS', 'DEADLINE']
df1 = df.groupby(['ISSUE NUMBER', 'GROUP']).describe()[req_cols].droplevel(0, axis=1)['top']
df1.columns = req_cols
print(df1)
The task is almost solved, but I ran into a new issue.
The code:
excel_path = "\\REE.xlsx"
my_sheet = 'Events'
cols = ['DATE', 'BY', 'ISSUE NUMBER', 'GROUP', 'EVENTS', 'DEADLINE']
req_cols = ['EVENTS', 'DEADLINE']
df = pd.read_excel(excel_path, sheet_name = my_sheet, columns=cols)
today = datetime.today().strftime('%Y-%m-%d')
today_filter = (df[(df['DATE'] == today)])
df = pd.DataFrame(today_filter, columns=cols)
df1 = df.groupby(['ISSUE NUMBER', 'GROUP']).describe()[req_cols].droplevel(0, axis=1)['top']
df1.columns = req_cols
print(df1)
In the 'BY' column there are repeated values, e.g. '#task', but the script prints them only once.
In this case:
Required result:
114761
#task Jane another words 2019-05-10
#task result John something
#task John something else 2019-05-08
...
...
...
...
My code result:
114761
#task Jane another words 2019-05-10
#task result John something
...
...
...
John #task something else 2019-05-08 is not printed. Why?
The same happens in other cases: if a value occurs more than once in the 'BY' column, the script prints only the first occurrence and skips the rest.
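A likely reason for the missing rows is that describe() summarises each group, and its 'top' entry holds only the most frequent value per group, so several rows with the same issue number and group collapse into one line. As a pandas-free sketch of the nested grouping asked about in this thread, the following keeps every row by collecting them in lists. The rows are made up and `group_rows` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up rows: (date, by, issue_number, group, events, deadline)
rows = [
    ('2019-05-08', 'Jane', '114761', '#task', 'another words', '2019-05-10'),
    ('2019-05-08', 'John', '114761', '#task result', 'something', ''),
    ('2019-05-08', 'John', '114761', '#task', 'something else', '2019-05-08'),
    ('2019-05-07', 'John', '113140', '#task', 'old entry', ''),
]


def group_rows(rows, day):
    """Return {issue_number: {group: [(by, events, deadline), ...]}} for one day."""
    report = defaultdict(lambda: defaultdict(list))
    for date, by, issue, group, events, deadline in rows:
        if date == day:
            # Appending to a list keeps every row, even when issue and group repeat.
            report[issue][group].append((by, events, deadline))
    return report


report = group_rows(rows, '2019-05-08')
for issue, groups in report.items():
    print(issue)
    for group, entries in groups.items():
        for by, events, deadline in entries:
            print('   ', group, by, events, deadline)
```

A list as the innermost value is what resolves the "multiple same key" problem: the key stays unique, and the repeated rows live side by side under it.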

Connecting data in python to spreadsheets

I have a dictionary in Python 2.7.9. I want to present the data in my dictionary in a spreadsheet. How can I accomplish this? Note, the dictionary has over 15 different items inside.
Dictionary:
{'Leda Doggslife': '$13.99', 'Carson Busses': '$29.95', 'Derri Anne Connecticut': '$19.25', 'Bobbi Soks': '$5.68', 'Ben D. Rules': '$7.50', 'Patty Cakes': '$15.26', 'Ira Pent': '$16.27', 'Moe Tell': '$10.09', 'Ido Hoe': '$14.47', 'Ave Sectomy': '$50.85', 'Phil Meup': '$15.98', 'Al Fresco': '$8.49', 'Moe Dess': '$19.25', 'Sheila Takya': '$15.00', 'Earl E. Byrd': '$8.37', 'Rose Tattoo': '$114.07', 'Gary Shattire': '$14.26', 'Len Lease': '$11.11', 'Howie Kisses': '$15.86', 'Dan Druff': '$31.57'}
Are you trying to write your dictionary to an Excel spreadsheet?
In that case, you could use the win32com library:
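import win32com.client
# my_filename, my_sheet and my_dict are placeholders for your workbook path, sheet name and dictionary
xlApp = win32com.client.DispatchEx('Excel.Application')
xlApp.Visible = 0
xlBook = xlApp.Workbooks.Open(my_filename)
sht = xlBook.Worksheets(my_sheet)
row = 1
# one row per item: key in column 1, value in column 2
for element in my_dict.keys():
    sht.Cells(row, 1).Value = element
    sht.Cells(row, 2).Value = my_dict[element]
    row += 1
xlBook.Save()
xlBook.Close()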
Note that this code works only if the workbook already exists.
Otherwise:
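import win32com.client
# my_filename, my_sheet and my_dict are placeholders for your workbook path, sheet name and dictionary
xlApp = win32com.client.DispatchEx('Excel.Application')
xlApp.Visible = 0
xlBook = xlApp.Workbooks.Add()
# in a new workbook, my_sheet must refer to a sheet that exists, e.g. the index 1
sht = xlBook.Worksheets(my_sheet)
row = 1
# one row per item: key in column 1, value in column 2
for element in my_dict.keys():
    sht.Cells(row, 1).Value = element
    sht.Cells(row, 2).Value = my_dict[element]
    row += 1
xlBook.SaveAs(my_filename)
xlBook.Close()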
I hope it will be the right answer to your question.
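If Excel automation is not an option (win32com needs Windows with Excel installed), a CSV file is a simpler route, since any spreadsheet program can open it. Here is a minimal sketch using only the standard library. It is written for Python 3 rather than the 2.7.9 mentioned in the question, `write_dict_as_csv` is a hypothetical helper, and the dictionary is a three-item excerpt of the one above.

```python
import csv
import io


def write_dict_as_csv(data, fileobj):
    """Write one 'name, amount' row per dictionary item, sorted by name."""
    writer = csv.writer(fileobj)
    writer.writerow(['Name', 'Amount'])
    for name in sorted(data):
        writer.writerow([name, data[name]])


prices = {'Leda Doggslife': '$13.99', 'Carson Busses': '$29.95', 'Bobbi Soks': '$5.68'}

buffer = io.StringIO()  # for a real file: open('prices.csv', 'w', newline='')
write_dict_as_csv(prices, buffer)
print(buffer.getvalue())
```

Passing a file object instead of a path keeps the helper easy to try in memory first and then point at a real file.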

Creating multiple dictionary variables with loop commands?

This is my first time working with Python. I'm trying to create a dictionary for each county (23 in total), with the year as the key for population and income values. Brute-forcing the code seems to work, but I'm sure there is an easier way to do it using loops or classes. Any suggestions? Thanks!
import xlrd
wb = xlrd.open_workbook(r'C:\Python27\Forecast_test.xls')
popdata=wb.sheet_by_name(u'Sheet1')
incomedata=wb.sheet_by_name(u'Sheet2')
WyomingCnty =('Albany', 'Big Horn',
'Campbell', 'Carbon', 'Converse',
'Crook', 'Fremont', 'Goshen',
'Hot Springs','Johnson', 'Laramie',
'Lincoln', 'Natrona','Niobrara',
'Park', 'Platte', 'Sheridan', 'Sublette',
'Sweetwater', 'Teton', 'Uinta', 'Washakie', 'Weston','Wyoming')
Years = ('y0','y1','y2','y3','y4','y5','y6','y7','y8','y9','y10',
'y11','y12', 'y13', 'y14', 'y15', 'y16', 'y17', 'y18','y19',
'y20','y21','y22','y23','y24','y25','y26','y27','y28','y29','y30')
AlbanyPop = popdata.col_values(colx=1,start_rowx=1,end_rowx=None)
AlbanyIncome= incomedata.col_values(colx=1,start_rowx=1,end_rowx=None)
AlbanyDict1=dict(zip(Years,AlbanyPop))
AlbanyDict2=dict(zip(Years,AlbanyIncome))
BigHornPop = popdata.col_values(colx=2,start_rowx=1,end_rowx=None)
BigHornIncome= incomedata.col_values(colx=2,start_rowx=1,end_rowx=None)
BigHornDict1=dict(zip(Years,BigHornPop))
BigHornDict2=dict(zip(Years,BigHornIncome))
popdict = {}
incdict = {}
for ix, city in enumerate(WyomingCnty):
    popdict[city] = dict(zip(Years, popdata.col_values(colx=ix + 1, start_rowx=1, end_rowx=None)))
    incdict[city] = dict(zip(Years, incomedata.col_values(colx=ix + 1, start_rowx=1, end_rowx=None)))
I would just use another dictionary. As in:
import xlrd
wb = xlrd.open_workbook(r'C:\Python27\Forecast_test.xls')
popdata=wb.sheet_by_name(u'Sheet1') #Import population data
incomedata=wb.sheet_by_name(u'Sheet2') #Import income data
WyomingCnty =('Albany', 'Big Horn',
'Campbell', 'Carbon', 'Converse',
'Crook', 'Fremont', 'Goshen',
'Hot Springs','Johnson', 'Laramie',
'Lincoln', 'Natrona','Niobrara',
'Park', 'Platte', 'Sheridan', 'Sublette',
'Sweetwater', 'Teton', 'Uinta', 'Washakie', 'Weston','Wyoming')
Years = ('y0','y1','y2','y3','y4','y5','y6','y7','y8','y9','y10',
'y11','y12', 'y13', 'y14', 'y15', 'y16', 'y17', 'y18','y19',
'y20','y21','y22','y23','y24','y25','y26','y27','y28','y29','y30')
county_dict = {}
# column 0 is skipped: the question reads Albany from colx=1, Big Horn from colx=2, ...
for col, county in enumerate(WyomingCnty, start=1):
    county_dict[county] = {}
    county_popdata = popdata.col_values(colx=col, start_rowx=1, end_rowx=None)
    county_incdata = incomedata.col_values(colx=col, start_rowx=1, end_rowx=None)
    county_dict[county]['population'] = county_popdata
    county_dict[county]['income'] = county_incdata
    county_dict[county]['pop_by_year'] = dict(zip(Years, county_popdata))
    county_dict[county]['inc_by_year'] = dict(zip(Years, county_incdata))
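To try the nested-dictionary idea without the workbook, here is a sketch that builds the same kind of structure from plain lists. `build_county_dict` is a hypothetical helper and the numbers are made up; in the real script the columns would come from col_values().

```python
def build_county_dict(counties, years, pop_columns, inc_columns):
    """Build {county: {'pop_by_year': {...}, 'inc_by_year': {...}}}.

    pop_columns and inc_columns are lists of columns, one per county,
    in the same order as counties.
    """
    result = {}
    for county, pops, incomes in zip(counties, pop_columns, inc_columns):
        result[county] = {
            'pop_by_year': dict(zip(years, pops)),
            'inc_by_year': dict(zip(years, incomes)),
        }
    return result


# Made-up numbers standing in for sheet.col_values(...) results.
counties = ('Albany', 'Big Horn')
years = ('y0', 'y1', 'y2')
pop_columns = [[100, 110, 120], [50, 55, 60]]
inc_columns = [[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]]

county_dict = build_county_dict(counties, years, pop_columns, inc_columns)
print(county_dict['Big Horn']['pop_by_year']['y1'])
```

A single dictionary keyed by county name replaces the 23 pairs of separately named variables, and a lookup such as county_dict['Albany']['inc_by_year']['y2'] reads the same for every county.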
