Two columns of data need to be saved as an excel file - python

I have two columns of data, x and y values, and need to save them as an Excel file that can be opened in Excel. Are there any modules that can help me with this?
The format needs to be xls.
The data looks as follows:
4.20985 17.1047
4.82755 16.4046
3.17238 12.1246
4.50796 18.0955
6.04241 21.1016
4.62863 16.4974
4.32245 14.6536
6.48382 19.7664
5.66514 20.1288
6.11072 22.6859
5.55167 15.7504

It looks like a CSV file would be enough for your needs, but since you asked about xls, here's an example using the xlwt module:
import xlwt
data = """
4.20985 17.1047
4.82755 16.4046
3.17238 12.1246
4.50796 18.0955
6.04241 21.1016
4.62863 16.4974
4.32245 14.6536
6.48382 19.7664
5.66514 20.1288
6.11072 22.6859
5.55167 15.7504
"""
# prepare two-dimensional list (a list comprehension yields real lists on both Python 2 and 3)
data = [[float(value) for value in item.split()] for item in data.split('\n') if item.strip()]
# create workbook and add sheet
workbook = xlwt.Workbook()
sheet = workbook.add_sheet('Test')
# loop over two-dimensional list and write data
for index, (value1, value2) in enumerate(data):
    sheet.write(index, 0, value1)
    sheet.write(index, 1, value2)
# save a workbook
workbook.save('test.xls')
Hope that helps.
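The CSV route mentioned at the start of this answer needs only the standard library, and Excel opens the result directly. Here is a minimal sketch using the first three rows of the question's data. The header names x and y and the helper name write_csv are assumptions, not something from the question.

```python
import csv
import io

raw = """
4.20985 17.1047
4.82755 16.4046
3.17238 12.1246
"""

# Split each non-empty line on whitespace and convert both fields to float
rows = [[float(v) for v in line.split()] for line in raw.splitlines() if line.strip()]


def write_csv(rows, handle):
    """Write a header row (assumed names x and y) followed by the data rows."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["x", "y"])
    writer.writerows(rows)


# An in-memory buffer keeps the sketch self-contained
buffer = io.StringIO()
write_csv(rows, buffer)
csv_text = buffer.getvalue()
```

To write a real file instead, pass open('test.csv', 'w', newline='') as the handle.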

Related

How to sort data using XLSXWRITER in Python?

I want to convert the content of a txt file to an Excel file using XlsxWriter in Python. The txt file contains some information with a specific datetime, but it is not sorted from the latest date to the oldest. Is there any way to sort it and convert it to an Excel file?
You can sort the data in your txt file by datetime and then convert it to an Excel file using XlsxWriter. Here's some example code to do this:
import datetime
import xlsxwriter
# Read data from the txt file and sort it based on datetime
with open('data.txt', 'r') as f:
    # reverse=True puts the latest datetime first, as the question asks
    data = sorted(f.readlines(),
                  key=lambda x: datetime.datetime.strptime(x.split(',')[0],
                                                           '%Y-%m-%d %H:%M:%S'),
                  reverse=True)
# Create a new Excel file and worksheet
workbook = xlsxwriter.Workbook('data.xlsx')
worksheet = workbook.add_worksheet()
# Write the sorted data to the worksheet
for row, line in enumerate(data):
    values = line.strip().split(',')
    for col, value in enumerate(values):
        worksheet.write(row, col, value)
# Close the workbook
workbook.close()
In this code, we first read the data from the txt file and sort it based on datetime using the sorted() function and datetime.datetime.strptime() method. Then we create a new Excel file using XlsxWriter and add a new worksheet. Finally, we write the sorted data to the worksheet using a nested for loop and the worksheet.write() method.
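The sorting step can be tried on its own, without a file. The sketch below assumes the line layout the code above expects (a YYYY-MM-DD HH:MM:SS timestamp before the first comma), skips blank lines that would otherwise make strptime fail, and orders the newest entry first as the question asks. The function name sort_lines_newest_first and the sample rows are made up for illustration.

```python
import datetime


def sort_lines_newest_first(lines, fmt="%Y-%m-%d %H:%M:%S"):
    """Return non-blank lines sorted by the datetime in their first comma-separated field."""
    cleaned = [line.strip() for line in lines if line.strip()]
    return sorted(
        cleaned,
        key=lambda line: datetime.datetime.strptime(line.split(",")[0], fmt),
        reverse=True,  # latest date first
    )


# Hypothetical sample rows standing in for the contents of data.txt
sample = [
    "2023-01-05 08:30:00,login\n",
    "\n",
    "2023-03-01 12:00:00,logout\n",
    "2022-12-31 23:59:59,startup\n",
]
ordered = sort_lines_newest_first(sample)
```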

how to use element as dataframe name when looping over a list

I need to read data from several sheets in an xlsx file and save each sheet's data as a dataframe with the same name as the sheet. Here is the code I use. It can read data from the different sheets; however, all the dataframes are named temp. How should I change it? Thanks.
import pandas as pd
sheet_name_list = ['sheet1','sheet2','sheet3']
for temp in sheet_name_list:
    temp = pd.read_excel("data_spreadsheet.xlsx", sheet_name = temp)
You can use a dictionary keyed by sheet name:
pd_dict = {}
for temp in sheet_name_list:
    pd_dict[temp] = pd.read_excel("data_spreadsheet.xlsx", sheet_name=temp)
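pandas can also build the same dictionary in one call, because read_excel accepts a list of sheet names and returns a dict of DataFrames keyed by name. The sketch below first writes a small throwaway workbook so that it runs on its own. The column name value and the sample numbers are invented, and it assumes pandas with openpyxl installed.

```python
import os
import tempfile

import pandas as pd

sheet_name_list = ["sheet1", "sheet2", "sheet3"]

# Create a small sample workbook in a temporary directory (hypothetical data)
path = os.path.join(tempfile.mkdtemp(), "data_spreadsheet.xlsx")
with pd.ExcelWriter(path) as writer:
    for n, name in enumerate(sheet_name_list):
        pd.DataFrame({"value": [n, n + 1]}).to_excel(writer, sheet_name=name, index=False)

# Passing a list of names returns {sheet name: DataFrame}
pd_dict = pd.read_excel(path, sheet_name=sheet_name_list)
loaded_names = list(pd_dict)
```

Each sheet is then available as pd_dict["sheet1"], pd_dict["sheet2"], and so on.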

Read CSV sheet data and created new one

I have a CSV file which has multiple sheets in it. I want to read it sheet by sheet, filter some data, and create a CSV file in the same format. How can I do that? Please suggest. I was trying it through pandas.ExcelReader, but it's not working for a CSV file.
You can use the following code; it may help:
import pandas as pd
def read_excel_sheets(xls_path):
    """Read all sheets of an Excel workbook and return a single DataFrame"""
    print(f'Loading {xls_path} into pandas')
    xl = pd.ExcelFile(xls_path)
    frames = []
    columns = None
    for idx, name in enumerate(xl.sheet_names):
        print(f'Reading sheet #{idx}: {name}')
        sheet = xl.parse(name)
        if idx == 0:
            # Save column names from the first sheet so later sheets line up
            columns = sheet.columns
        sheet.columns = columns
        frames.append(sheet)
    # DataFrame.append was removed in pandas 2.0, so concatenate once at the end
    return pd.concat(frames, ignore_index=True)
This code comes from a linked resource; a second linked post covers converting the result back to CSV.
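A plain CSV file holds a single table, so if the input really is CSV rather than an Excel workbook, the filtering can be done with the standard csv module alone. This is a minimal sketch under that assumption. The function name filter_csv, the column names and the score threshold are hypothetical.

```python
import csv
import io


def filter_csv(src, dst, keep):
    """Copy the rows for which keep(row) is true from src to dst, preserving the header."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        if keep(row):
            writer.writerow(row)
            kept += 1
    return kept


# Hypothetical sample data standing in for a real file
source = io.StringIO("name,score\nalice,72\nbob,41\ncarol,90\n")
target = io.StringIO()
kept_rows = filter_csv(source, target, lambda row: int(row["score"]) >= 50)
filtered_text = target.getvalue()
```

With files on disk, open both with newline='' and pass the file objects in place of the StringIO buffers.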

How to import data from .txt file to a specific excel sheet with Python?

I am trying to automate a process that basically reads in values from text files into certain excel cells. I have a template in excel that will read data from various sheets under certain names. For example, the template will read in data from "Video scores". Video scores is a .txt file that I copy and paste into excel. There are 5 different text files used in each project so it gets tedious after a while and when there are a lot of projects to complete.
How can I import or copy and paste these .txt files into Excel to a specified sheet? I have been using openpyxl for the other parts of this project, but I am open to using another library if it can't be done with openpyxl.
I've also tried opening and reading a file, but I couldn't figure out how to do what I want with that either. I have found a list of all the files I need; it's just a matter of getting them into Excel.
Thanks in advance for anyone who helps.
First, import the TXT file into a list in Python. I'm assuming the TXT file looks like this:
1
2
3
4
....
with open(path_txt, "r") as e:
    # strip the trailing newline from each line and skip blank lines
    list1 = [i.strip() for i in e if i.strip()]
Then paste the values of the list into the worksheet you need:
from openpyxl import load_workbook
wb = load_workbook(path_xlsx)
ws = wb[sheet_name]
ws["A1"] = "values"  # just a header
row = 2  # start at the second row of the sheet
column = 1  # column "A" of the sheet
for i in list1:
    ws.cell(row=row, column=column).value = i  # write the current list value to the current cell
    row += 1  # move to the next row
wb.save(path_xlsx)
Hope this works for you.
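Values read from a text file are strings, so Excel stores them as text unless they are converted first. If the file holds numbers, a small helper can do that before the values are written to cells. This is only a sketch: parse_values is a hypothetical name and the sample lines are invented.

```python
def parse_values(lines):
    """Strip each line and convert numeric text to float; keep other text as-is."""
    values = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        try:
            values.append(float(text))
        except ValueError:
            values.append(text)  # not a number, keep the original text
    return values


# Hypothetical lines as they would come from iterating over an open text file
sample_lines = ["1\n", "2\n", "\n", "3.5\n", "n/a\n"]
parsed = parse_values(sample_lines)
```

The returned list can be used in place of list1 in the loop above.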
Pandas would do the trick!
Approach:
Have a sheet containing the path to each of your files, its separator, and the corresponding target sheet name.
Now read this Excel sheet using pandas, iterate over the rows to get each file's details, read the data, and write it to a new sheet of the same workbook.
import pandas as pd
file_details_path = r"/Users/path for xl sheet/file details/File2XlDetails.xlsx"
target_sheet_path = r"/Users/path to target xl sheet/File samples/FiletoXl.xlsx"
# create a writer to save the file content in excel
writer = pd.ExcelWriter(target_sheet_path, engine='xlsxwriter')
file_details = pd.read_excel(file_details_path,
                             dtype=str,
                             index_col=False,
                             )
def write_to_excel(file, trg_sheet_name):
    # writes it to excel
    file.to_excel(writer,
                  sheet_name=trg_sheet_name,
                  index=False,
                  )
# loop through each file record
for index, file_dtl in file_details.iterrows():
    # you can print and check the row content for reference
    print(file_dtl['File_path'])
    print(file_dtl['Separator'])
    print(file_dtl['Target_sheet_name'])
    # reads file
    file = pd.read_csv(file_dtl['File_path'],
                       sep=file_dtl['Separator'],
                       dtype=str,
                       index_col=False,
                       )
    write_to_excel(file, file_dtl['Target_sheet_name'])
# close() writes the workbook to disk; writer.save() was removed in pandas 2.0
writer.close()
Hope this helps! Let me know if you run into any issues...

Can I modify specific sheet from Excel file and write back to the same without modifying other sheets using Pandas | openpyxl

I'll try to explain my problem with an example:
Let's say I have an Excel file test.xlsx which has five tabs (aka worksheets): Sheet1, Sheet2, Sheet3, Sheet4 and sheet5. I am interested in reading and modifying the data in sheet2.
My sheet2 has some columns whose cells are dropdowns and those dropdown values are defined in sheet4 and sheet5. I don't want to touch sheet4 and sheet5. (I mean sheet4 & sheet5 have some references to cells on Sheet2).
I know that I can read all the sheets in the Excel file using pd.read_excel('test.xlsx', sheet_name=None), which gives all sheets as a dictionary (OrderedDict) of DataFrames.
Now I want to modify my sheet2 and save it without disturbing the others. Is it possible to do this using the Python pandas library?
[UPDATE - 4/1/2019]
I am using pandas read_excel to read whatever sheet I need from my Excel file, validating the data against the data in the database, and updating the status column in the Excel file.
So for writing back the status column in excel I am using openpyxl as shown in the below pseudo code.
import pandas as pd
import openpyxl
df = pd.read_excel(input_file, sheet_name=my_sheet_name)
df = df.where((pd.notnull(df)), None)
write_data = {}
# Doing some validations with the data and building my write_data with key
# as (row_number, column_number) and value as actual value to put in that
# cell.
At the end, my write_data looks something like this:
{(2,1): 'Hi', (2,2): 'Hello'}
Now I have defined a separate class named WriteData for writing data using openpyxl:
# WriteData(input_file, sheet_name, write_data)
book = openpyxl.load_workbook(input_file, data_only=True, keep_vba=True)
sheet = book.get_sheet_by_name(sheet_name)
for k, v in write_data.items():
    row_num, col_num = k
    sheet.cell(row=row_num, column=col_num).value = v
book.save(input_file)
Now when I do this operation, it removes all the formulas and diagrams. I am using openpyxl 2.6.2.
Please correct me if I am doing anything wrong! Is there a better way to do this?
Any help on this will be greatly appreciated :)
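To modify a single sheet and leave the others in place, open the pandas ExcelWriter in append mode (a plain pd.ExcelWriter("test.xlsx") would overwrite the whole workbook):
sheet2 = pd.read_excel("test.xlsx", sheet_name="sheet2")
# modify sheet2 as needed, then save it back:
# mode="a" needs the openpyxl engine; if_sheet_exists needs pandas 1.3 or later
# note: the replaced sheet is rewritten from the DataFrame, so cell formatting and dropdowns on it are not kept
with pd.ExcelWriter("test.xlsx", mode="a", engine="openpyxl", if_sheet_exists="replace") as writer:
    sheet2.to_excel(writer, sheet_name="sheet2", index=False)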
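In the question's code, load_workbook is called with data_only=True. That flag makes openpyxl read the cached results instead of the formulas, so saving writes those values back and the formulas are gone. Below is a minimal sketch of the same cell-by-cell update without that flag. The function name update_cells, the file name demo.xlsx and the sample sheets are hypothetical. Charts and images are a separate limitation: the openpyxl documentation notes that it does not read them, so they are lost when a workbook is loaded and saved again.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def update_cells(path, sheet_name, write_data):
    """Write {(row, column): value} into one sheet and leave the other sheets alone."""
    # data_only is left at its default (False) so formulas are kept as formulas
    book = load_workbook(path)
    sheet = book[sheet_name]
    for (row_num, col_num), value in write_data.items():
        sheet.cell(row=row_num, column=col_num, value=value)
    book.save(path)


# Build a small throwaway workbook: sheet2 holds data, sheet4 holds a formula
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
wb = Workbook()
ws2 = wb.active
ws2.title = "sheet2"
ws2["A1"] = "status"
ws4 = wb.create_sheet("sheet4")
ws4["A1"] = 2
ws4["A2"] = "=A1*10"
wb.save(path)

update_cells(path, "sheet2", {(2, 1): "Hi", (2, 2): "Hello"})

# Reload to confirm the formula on the untouched sheet is still a formula
check = load_workbook(path)
formula_after = check["sheet4"]["A2"].value
greeting_after = check["sheet2"].cell(row=2, column=1).value
```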
