How to delete rows in Excel with Python

I would like to delete some rows in my Excel file with Python.
The part I'm interested in begins at row 6, with "30". I'm looking for a good way to delete the rows above it.
How should I do that?
Thanks.

If you're not opposed to using pandas, try this:
import pandas as pd
table = pd.read_excel("file.xlsx", skiprows=list(range(5)))
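Here skiprows makes pandas skip the first 5 rows of the sheet. Note that the first row that is read (row 6) is then treated as the column header; pass header=None if that row is data. If you then want to save the result as an Excel file: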
table.to_excel("new_file.xlsx")
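If the goal is to edit the workbook itself rather than go through a DataFrame, openpyxl's delete_rows is another option. The following is only a sketch: the helper name drop_leading_rows and the demo values are made up for illustration, and the in-memory workbook stands in for the real file.

```python
from openpyxl import Workbook


def drop_leading_rows(worksheet, count):
    """Delete the first `count` rows of an openpyxl worksheet in place."""
    # delete_rows(idx, amount) removes `amount` rows starting at 1-based row `idx`.
    worksheet.delete_rows(1, count)


# In-memory demo with made-up values; the data of interest starts at row 6.
demo_wb = Workbook()
demo_ws = demo_wb.active
for value in ["title", "notes", "", "", "header", 30, 31]:
    demo_ws.append([value])

drop_leading_rows(demo_ws, 5)
first_value = demo_ws.cell(row=1, column=1).value   # the former row 6
second_value = demo_ws.cell(row=2, column=1).value  # the former row 7
```

For a real file, one would load it with openpyxl.load_workbook("file.xlsx"), call the helper on the relevant sheet, and save under a new name with the workbook's save method.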

Related

How do you read rows from a csv file and store it in an array using Python codes?

I have a CSV file, diseases_matrix_KNN.csv, which contains an Excel table.
Now, I would like to store all the numbers from a row, like this:
Hypothermia = [0,-1,0,0,0,0,0,0,0,0,0,0,0,0]
I have looked but have been unable to find a solution. Please let me know whether I can read this type of data in the chosen form using Python.
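The most common way to work with tabular data like this is pandas.
Here is an example:
import pandas as pd
df = pd.read_csv("diseases_matrix_KNN.csv", index_col=0)  # assumes the first column holds the disease names
print(df.loc['Hypothermia'].tolist())  # the row's values as a plain list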
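For a small file, the standard library's csv module is also enough. This is a minimal sketch under a few assumptions: the file has a header row, the first cell of each row is the label, and the remaining cells are integers. The function name read_labelled_rows and the sample data are made up, since the real column names were not shown.

```python
import csv
import io


def read_labelled_rows(csv_file):
    """Map each row's first cell to a list of the remaining cells as ints."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row (assumed to be present)
    return {row[0]: [int(cell) for cell in row[1:]] for row in reader if row}


# Made-up stand-in for diseases_matrix_KNN.csv.
sample = io.StringIO(
    "disease,s1,s2,s3\n"
    "Hypothermia,0,-1,0\n"
    "Fever,1,0,0\n"
)
matrix = read_labelled_rows(sample)
hypothermia = matrix["Hypothermia"]
```

With the real file, the same function can be given the result of open("diseases_matrix_KNN.csv", newline="") instead of the StringIO object.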

Delete rows in CSV file after being read by pandas

So I want to have 1 script writing continually to a CSV file, and another script reading periodically from that same CSV file.
What I'm looking for is a way to delete the rows I've just read in from the CSV file (not from my pandas dataframe).
Can anybody help?
# Read data in to dataframe
deviceInfo = pd.read_csv("sampleData.csv", nrows = 100)
# Somehow delete those 100 rows from the CSV file
@JoseAngelSanchez is correct that you might want to read the whole CSV into a dataframe, but I think this way lets you get a dataframe with the first 100 rows and still delete them from the CSV file.
import pandas as pd
df = pd.read_csv("sampleData.csv")
deviceInfo = df.iloc[:100]
df.iloc[100:].to_csv("sampleData.csv")
Note: if you're doing this repeatedly, you'll probably want to write to_csv(..., index=False); otherwise a new index column is added to the .csv file on each iteration.
You should read the whole document and then delete the rows you don't want:
import pandas as pd
df = pd.read_csv("sampleData.csv")
df = df.iloc[100:]
df.to_csv("sampleData.csv")
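The two answers can be combined into one helper that returns the rows it removed. This is a sketch, not a complete solution: the name pop_rows is made up, and the temporary-file swap is an added precaution rather than something either answer proposed. It also does not coordinate with the script that keeps writing, so rows appended between the read and the swap would be lost.

```python
import os
import tempfile

import pandas as pd


def pop_rows(csv_path, count):
    """Return the first `count` data rows and rewrite the file without them."""
    df = pd.read_csv(csv_path)
    taken = df.iloc[:count]
    remaining = df.iloc[count:]

    # Write the remainder to a temporary file in the same directory, then
    # swap it in, so a reader never sees a half-written file.
    directory = os.path.dirname(os.path.abspath(csv_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".csv")
    os.close(fd)
    remaining.to_csv(tmp_path, index=False)  # index=False avoids an extra column
    os.replace(tmp_path, csv_path)
    return taken
```

A call such as deviceInfo = pop_rows("sampleData.csv", 100) would then match the snippet in the question.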

Pandas excel to python for long column

I'm very new to Python and I'm using pandas to read an Excel file. One column has 197 values, but when I read the file with pandas and print it, I don't get all of the values, as shown in the picture:
The full Excel sheet does not appear.
import pandas as pd
xl =pd.ExcelFile('test.xlsx')
sheet1 = xl.parse()
z=str(sheet1)
z=z.replace('212/',"")
z=z.replace('/1',"")
print(z)
Thanks for helping.
Is your question how to show those values? What you see is normal behavior: pandas truncates long output when printing. If you want to see specific rows, try loc or iloc.
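If the aim is to print every row rather than a selection, there are two common ways to turn the truncation off. The sketch below uses a made-up 197-row frame in place of the sheet from the question.

```python
import pandas as pd

# Made-up stand-in for the 197-value column read from test.xlsx.
df = pd.DataFrame({"value": range(197)})

# Option 1: to_string() always renders every row, regardless of display settings.
full_text = df.to_string()

# Option 2: lift the row limit temporarily; str(df) and print(df) then show all rows.
with pd.option_context("display.max_rows", None):
    untruncated = str(df)
```

To change the limit for the whole session, pd.set_option("display.max_rows", None) has the same effect as the context manager.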

Creating a dataframe from a csv file in pandas: column issue

I have a messy text file that I need to sort into columns in a dataframe so I
can do the data analysis I need to do.
I can read it in as a CSV file, which looks a bit nicer, using:
import pandas as pd
data = pd.read_csv('phx_30kV_indepth_0_0_outfile.txt')
print(data)
And this prints out the data aligned, but the issue is that the output is [640 rows x 1 column]. And I need to separate it into multiple columns and manipulate it as a dataframe.
I have tried a number of solutions using StringIO that have worked here before, but nothing seems to be doing the trick.
However, when I do this, there is the issue that the
Try delim_whitespace=True (see the pandas docs for this option):
df = pd.read_csv('phx_30kV_indepth_0_0_outfile.txt', delim_whitespace=True)
Your input file is not actually in CSV format.
Because you provided only a .png picture, it is not even clear whether the file
is divided into rows.
If it is not, you first have to cut the content into individual lines and
then read from the file that results from this cutting.
I think this is the first step before you can use either read_csv or read_table (with delim_whitespace=True, of course).
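Assuming the file does contain one record per line with runs of spaces between the values, whitespace splitting can be sketched as follows. The column names and numbers here are made up, because the real layout was only shown as a picture. Recent pandas releases deprecate delim_whitespace=True in favor of the equivalent sep=r"\s+", which is what this example uses.

```python
import io

import pandas as pd

# Made-up whitespace-separated sample; the real file's columns are unknown.
sample = """\
depth   counts   energy
0.0     12       1.5
0.5     15       1.7
1.0     9        1.2
"""

# sep=r"\s+" treats any run of spaces or tabs as a single delimiter.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+")
```

If the file starts with free-text lines before the table, skiprows can be added to the same call to jump over them.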

Using Python to load template excel file, insert a DataFrame to specific lines and save as a new file

I'm having trouble writing something that I believe should be relatively easy.
I have a template Excel file with a few sheets and some visualizations on it. I want to write a script that loads the template, inserts rows from an existing DataFrame into specific cells on each sheet, and saves the result as a new Excel file.
The template already has all the cells and the visualizations designed, so I want to insert only the data without changing the design.
I tried several packages and none of them seemed to work for me.
Thanks for your help! :-)
I have written a package for inserting pandas DataFrames into Excel sheets (specific rows/cells/columns); it's called pyxcelframe:
https://pypi.org/project/pyxcelframe/
It has very simple and short documentation, and the method you need is insert_frame.
Say we have a pandas DataFrame called df that we need to insert into the sheet named "MySheet" of the Excel file "MyWorkbook.xlsx", starting from cell B5. We can use the insert_frame function as follows:
from pyxcelframe import insert_frame
from openpyxl import load_workbook
workbook = load_workbook("MyWorkbook.xlsx")
worksheet = workbook["MySheet"]
insert_frame(worksheet=worksheet,
             dataframe=df,
             row_range=(5, 0),
             col_range=(2, 0))
A 0 as the second element of row_range or col_range means that no ending row or column is specified; if you need a specific ending row or column, replace the 0 with it.
Sounds like a job for xlwings. You didn't post any test data, but modifying the code below to suit your needs should be quite straightforward.
import xlwings as xw
wb = xw.Book('your_excel_template.xlsx')
wb.sheets['Sheet1'].range('A1').value = df[your_selected_rows]
wb.save('new_file.xlsx')
wb.close()
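The same idea can also be sketched with openpyxl alone by writing the DataFrame's values cell by cell. The helper name write_frame and the demo data are made up, and a fresh in-memory workbook stands in for the template. One caveat to check before relying on this: openpyxl may not retain charts or images when it re-saves an existing workbook, so a template with visualizations should be verified after saving.

```python
import pandas as pd
from openpyxl import Workbook


def write_frame(worksheet, frame, start_row, start_col):
    """Write the values of `frame` with its top-left cell at (start_row, start_col)."""
    for r_offset, row in enumerate(frame.itertuples(index=False)):
        for c_offset, value in enumerate(row):
            worksheet.cell(
                row=start_row + r_offset,
                column=start_col + c_offset,
                value=value,
            )


# In-memory demo with made-up data; only the targeted cells are written.
demo_wb = Workbook()
demo_ws = demo_wb.active
demo_ws.title = "MySheet"
demo_df = pd.DataFrame({"name": ["a", "b"], "score": [1.5, 2.5]})
write_frame(demo_ws, demo_df, start_row=5, start_col=2)  # top-left cell is B5
```

With a real template, the worksheet would come from openpyxl.load_workbook on the template file, and the workbook would be saved under a new name so the template stays unchanged.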
