How to make pandas recognise my xlsx file as a multiple-columned dataframe - python

While making my bot set permissions automatically when it joins a guild, the code for this was getting too long. So I decided to have the bot load an xlsx file as a dataframe and set permissions from the data inside.
I want pandas to read this xlsx file as a multiple-columned dataframe, but I don't think my program recognises it as one. Is there an error in my code below, or do I have to change my Excel file for it to be recognised the way I want?
from pandas import read_excel
perm_data = read_excel('E:/Discord bot/Grail-Relique/data/xlsx/TextPermission.xlsx', header=[0,1], engine='openpyxl')
print(perm_data)
print(perm_data.loc[0,(0,0)])

This should do the work:
import pandas as pd
df = pd.read_excel('your/path/to/file.xlsx',
                   header=[0,1],
                   index_col=0)
print(df.head())
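Once the file is read with `header=[0,1]`, the columns form a MultiIndex, so a cell is addressed by labels rather than by `(0,0)`. The sketch below illustrates this with an in-memory table standing in for the workbook; the row labels and the `text`/`voice` column names are invented for illustration and are not taken from the asker's file.

```python
import pandas as pd

# Hypothetical permission table standing in for the real workbook.
columns = pd.MultiIndex.from_tuples(
    [("text", "read"), ("text", "send"), ("voice", "connect"), ("voice", "speak")]
)
perm_data = pd.DataFrame(
    [[True, True, True, True], [True, False, False, False]],
    index=["admin", "guest"],
    columns=columns,
)

# Label-based lookup: row label first, then a (top level, sub level) column tuple.
guest_can_send = perm_data.loc["guest", ("text", "send")]

# Position-based lookup uses iloc with plain integers instead.
first_cell = perm_data.iloc[0, 0]

# Selecting a top-level label returns only the sub-columns beneath it.
text_perms = perm_data["text"]
```

If positions are really what is wanted, `iloc` is the accessor to use; `loc` with a tuple only works when the tuple matches the actual header labels.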

Related

How to read an .xlsx file on sharepoint into a pandas dataframe?

I have a Python script which loads an .xlsx file into a pandas dataframe using read_excel:
import os
import pandas as pd
V_file = "My_file.xlsx"
V_path = r"C:\My_folder"
os.chdir(V_path)
V_df = pd.read_excel(V_file, sheet_name = "Sheet1")
This works for files saved locally. However, I want to read in a file that is saved in Sharepoint. Does anyone know how I can adapt the code above to do this please? And also, if it's not too much trouble, an explanation of what the adapted code is doing exactly please?

Inserting Data into an Excel file using Pandas - Python

I have an excel file that contains the names of 60 datasets.
I'm trying to write a piece of code that "enters" the Excel file, accesses a specific dataset (whose name is in the Excel file), gathers and analyses some data and finally, creates a new column in the Excel file and inserts the information gathered beforehand.
I can do most of it, except for the part of adding a new column and entering the data.
I was trying to do something like this:
import os
import pandas as pd
path_data = r'Path'  # placeholder: the path to the folder holding the Excel file
recap = pd.read_excel(os.path.join(path_data, 'My_Excel.xlsx'))  # where I access the Excel file
recap['New information Column'] = 'Some Value'  # placeholder for the gathered information
Is this a correct way of doing this? If not, can someone suggest a better way (one that works, ehehe)?
Thank you a lot!
You can import the excel file into python using pandas.
import pandas as pd
df = pd.read_excel(r'Path\Filename.xlsx')
print(df)
If you have many sheets, then you could do this:
import pandas as pd
df = pd.read_excel(r'Path\Filename.xlsx', sheet_name='sheetname')
print(df)
To add a new column you could do the following:
df['name of the new column'] = 'things to add'
Then when you're ready, you can export it as xlsx:
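import openpyxl  # pandas uses openpyxl to write .xlsx files, so it must be installed
# to excel; index=False keeps the row index from being written as an extra column
df.to_excel(r'Path\filename.xlsx', index=False)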
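Putting the steps together, a minimal sketch might look like the following. It assumes the dataset names live in a column called `Dataset` and uses a stand-in `summarise` function for the real analysis; both names are hypothetical. The file round trip is kept in its own function so the column logic can be tried on an in-memory frame first.

```python
import pandas as pd


def summarise(dataset_name: str) -> int:
    """Hypothetical analysis step: here it only returns the length of the name."""
    return len(dataset_name)


def add_summary_column(recap: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of recap with one new column computed per dataset name."""
    out = recap.copy()
    # "Dataset" is an assumed column name for the 60 dataset names.
    out["New information Column"] = out["Dataset"].map(summarise)
    return out


def update_workbook(path: str) -> None:
    """Read the workbook, add the column, and overwrite the same file."""
    recap = pd.read_excel(path)
    add_summary_column(recap).to_excel(path, index=False)


# Try the column logic without touching any file.
recap = pd.DataFrame({"Dataset": ["alpha", "beta", "gamma"]})
updated = add_summary_column(recap)
```

Calling `update_workbook` with the real path would then rewrite the file with the extra column. Note that `to_excel` writes a fresh workbook, so cell formatting in the original file is not preserved.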

Reading .ASC format file in python

I am dealing with a certain .asc file format which contains some data regarding weight and height. I just want to find BMI indexes of people with this data. I am not able to make sense of the dataframe formed after reading the data.
import pandas as pd
df = pd.read_table("data.asc")
I am not able to make sense of the result that I get. Please help me out
I recently had to work with a file with the extension "asc". My solution was the following:
I opened the file with a text editor and checked which separator the file uses, then transformed it into a spreadsheet. In my case, I turned the document into a "csv" file.
After that I ran:
import pandas as pd
df = pd.read_csv('path to your file')
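If the separator turns out to be something simple, the conversion step can possibly be skipped by passing it to `read_csv` directly. The sketch below assumes a whitespace-separated file with a header row and columns named `weight` (kg) and `height` (m); the sample data and column names are made up, and a real .asc file may need a different `sep` or `skiprows`.

```python
import io

import pandas as pd

# Made-up sample standing in for the contents of data.asc.
sample = io.StringIO("weight height\n70 1.75\n80 1.60\n")

# sep=r"\s+" treats any run of spaces or tabs as one separator.
df = pd.read_csv(sample, sep=r"\s+")

# BMI = weight in kilograms divided by height in metres squared.
df["bmi"] = df["weight"] / df["height"] ** 2
```

With a real file, the `io.StringIO` object would be replaced by the file path.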

Reading XLSB (binary) file with Pandas read_excel using pyxlsb reads empty rows for some xlsb file

I'm trying to read binary Excel files using read_excel method in pandas with pyxlsb engine as below:
import pandas as pd
df = pd.read_excel('test.xlsb', engine='pyxlsb')
If the xlsb file is like this file (Right now, I'm sharing this file via WeTransfer, but if there is a better way to share files on StackOverflow, let me know), the returned dataframe is filled with NaN's. I suspected that it might be because the file was saved with active cell pointing at the empty cells after the data originally. So I tried this:
import pandas as pd
with open('test.xlsb', 'rb') as data:
    data.seek(0,0)
    df = pd.read_excel(data, engine='pyxlsb')
but it still doesn't seem to work. I also tried reading the data from byte number 0 (from the beginning), writing it into a new file, 'test_1.xlsb', and finally reading it with pandas, but that doesn't work.
with open('test.xlsb','rb') as data:
    data.seek(0,0)
    with open('test_1.xlsb','wb') as outfile:
        outfile.write(data.read())
df = pd.read_excel('test_1.xlsb', engine='pyxlsb')
If anyone has suggestion as to what might be going on and how to resolve it, I'd greatly appreciate the help.

saving a dataframe to csv file (python)

I am trying to restructure the way my precipitation data is organized in an Excel file. To do this, I've written the following code:
import pandas as pd
df = pd.read_excel('El Jem_Souassi.xlsx', sheet_name=None, header=None)  # sheet_name=None loads every sheet into a dict
data=df["El Jem"]
T=[]
for column in range(1,56):
    liste=data[column].tolist()
    for row in range(1,len(liste)):
        liste[row]=str(liste[row])
        if liste[row]!='nan':
            T.append(liste[row])
result=pd.DataFrame(T)
result
This code works fine and through Jupyter I can see that the result is good
However, I am facing a problem when attempting to save this dataframe to a csv file.
result.to_csv("output.csv")
The resulting file contains the vertical index column, and I seem unable to access a specific cell.
(Hopefully, someone can help me with this problem)
Many thanks !!
It's all in the docs.
You are interested in skipping the index column, so do:
result.to_csv("output.csv", index=False)
If you also want to skip the header add:
result.to_csv("output.csv", index=False, header=False)
I don't know what your input data looks like (it is a good idea to make it available in your question). But note that currently you can obtain the same results just by doing:
import pandas as pd
df = pd.DataFrame([0]*16)
df.to_csv('results.csv', index=False, header=False)
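To see what the `index` and `header` arguments change, here is a small sketch that writes to a string instead of a file (when no path is given, `to_csv` returns the CSV text). The three values are invented sample data, not the asker's precipitation figures.

```python
import io

import pandas as pd

# Invented sample values standing in for the collected list T.
result = pd.DataFrame(["1.5", "0.0", "2.3"])

# Default: the row index and the column header are both written.
with_index = result.to_csv()

# Only the values are written, one per line.
without_index = result.to_csv(index=False, header=False)

# Reading the clean file back: header=None because there is no header row.
back = pd.read_csv(io.StringIO(without_index), header=None)

# A specific cell can then be picked by position (second row, first column).
cell = back.iloc[1, 0]
```

With a real file, `result.to_csv("output.csv", index=False, header=False)` followed by `pd.read_csv("output.csv", header=None)` should behave the same way.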
