Using Pandas with XLSB File

I'm trying to read an .xlsb file into a pandas DataFrame:
import pandas as pd
a_data = pd.ExcelFile(
r'C:\\Desktop\\a.xlsb')
df_data = pd.read_excel(a_data, 'Sheet1', engine='pyxlsb')
print(df.head())
When I run the script I keep getting this error.
OSError: File contains no valid workbook part

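You can use pyxlsb; pandas has supported it as an Excel engine since version 1.0.
In the question's code the engine is not passed to pd.ExcelFile, so pandas most likely opens the file with its default engine and fails there. Set the engine on pd.ExcelFile and pass the resulting object (not the string 'a_data') to read_excel:
import pandas as pd
a_data = pd.ExcelFile(r'C:\Desktop\a.xlsb', engine='pyxlsb')
df = pd.read_excel(a_data, sheet_name='Sheet1')
print(df.head())
You will have to install pyxlsb first: pip install pyxlsb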
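
If the error persists, it can help to confirm that the file really is an .xlsb workbook and not a renamed file of another type. The sketch below is only a rough diagnostic, not part of pandas or pyxlsb: the helper name guess_workbook_kind is hypothetical, and it assumes that .xlsb and .xlsx files are ZIP containers holding xl/workbook.bin and xl/workbook.xml respectively.

```python
import io
import zipfile


def guess_workbook_kind(source):
    """Return 'xlsb', 'xlsx', or None for a path or binary file object.

    Hypothetical helper: it only looks at the ZIP member names.
    """
    if not zipfile.is_zipfile(source):
        return None  # not a ZIP container (e.g. legacy .xls or a renamed file)
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
    if "xl/workbook.bin" in names:
        return "xlsb"
    if "xl/workbook.xml" in names:
        return "xlsx"
    return None


# Build a tiny in-memory stand-in for an .xlsb container to try the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("xl/workbook.bin", b"")

print(guess_workbook_kind(buffer))
```

With a real file you would pass its path instead of the in-memory buffer.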

Related

ModuleNotFoundError: No module named 'xlxswriter'

Language: Python 3.8.3
I get this error when I try to open my .xlsx file: ModuleNotFoundError: No module named 'xlxswriter'
import xlxswriter
import pandas as pd
from pandas import DataFrame
path = ('mypath.xlxs')
xl = pd.ExcelFile(path)
print(xl.sheet_names)
How can I fix this?

Instead of typing xlxs, type xlsx (in both the import and the file extension), like this:
import xlsxwriter
import pandas as pd
from pandas import DataFrame
path = ('mypath.xlsx')
xl = pd.ExcelFile(path)
print(xl.sheet_names)
It'll work.

The module name is xlsxwriter not xlxswriter, so replace that line with:
import xlsxwriter
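
If the correctly spelled import still fails, the package may simply be missing from the active environment (pip install xlsxwriter). A minimal sketch for checking which module names Python can locate, using only the standard library; the helper name is_installed is made up for this example:

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# Compare the correct spelling with the misspelled one from the question.
for name in ("xlsxwriter", "xlxswriter"):
    status = "found" if is_installed(name) else "not found"
    print(f"{name}: {status}")
```

Note that xlsxwriter only writes files, so listing sheet names with pd.ExcelFile does not need that import at all.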

PDF Table object list to csv format in Python

I am trying to build a panel dataset by appending, row by row, the tables on pp. 149-157 of this PDF file (the tables share the same column names):
https://www.uv.mx/personal/clelanda/files/2013/02/Garber-2000-Famous-first-bubbles.pdf
Here is the code I am currently using:
!pip install tabula-py
!pip install pandas
import pandas as pd
import tabula
from google.colab import files
def getLocalFiles():
    _files = files.upload()
    if len(_files) > 0:
        for k, v in _files.items():
            open(k, 'wb').write(v)
getLocalFiles()
#directory path
!ls
#Reading pdf tables
file = "bubbles.pdf"
path = 'bubbles.pdf'
tables = tabula.read_pdf(path, pages = [149,150,151,152,153,154,155,156,157], columns= (1,2,3,4,5,6,7))
print(tables)
#passing to csv format
from pandas import DataFrame
df=pd.DataFrame(page_1)
print(df)
df.to_csv('test.csv', index= False)
This is the output data:
How can I append all of the PDF tables into one?
Thanks in advance.
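
One possible approach, sketched under assumptions: recent versions of tabula-py return a list of DataFrames from read_pdf (one per detected table), and a list like that can be stacked by rows with pd.concat. The frames below are placeholder stand-ins with made-up column names and values, not data from the PDF.

```python
import pandas as pd

# Stand-ins for the list returned by tabula.read_pdf (one DataFrame per table).
# Column names and values are placeholders for illustration only.
tables = [
    pd.DataFrame({"item": ["a", "b"], "value": [1.0, 2.0]}),
    pd.DataFrame({"item": ["c"], "value": [3.0]}),
]

# Stack the tables by rows; ignore_index renumbers the rows from 0.
panel = pd.concat(tables, ignore_index=True)

# to_csv without a path returns the CSV text; pass 'test.csv' to write a file.
csv_text = panel.to_csv(index=False)
print(csv_text)
```

This only works cleanly when every extracted table has the same column names; tables extracted with different headers would need renaming first.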

Read_csv from URL into Jupyter

I am unable to read a CSV file from a URL using:
import pandas as pd
import numpy as np
data_url = 'https://data.baltimorecity.gov/Financial/Real-Property-Taxes/27w9-urtv.csv'
df = pd.read_csv(data_url)
df.head()
I got an error: "not acceptable".
I also tried different approaches using requests, but none of them worked. How do I fix this?

Your URL wasn't correct; the CSV is served under /resource/ instead of /Financial/Real-Property-Taxes/. This should work:
import pandas as pd
data_url = 'https://data.baltimorecity.gov/resource/27w9-urtv.csv'
df = pd.read_csv(data_url)
df.head()

trying to import excel into python

I'm still new to Python. I'm trying to import an Excel document into Python, but I get a FileNotFoundError.
This is what I'm running:
import pandas as pd
practiceset = (r'C:\Users\michael\Desktop\Work\Transpo\'Transportation2016.xlsx')
df = pd.read_excel(practiceset)
print (df)
The python file is in the same folder as the doc, so I'm confused.

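Remove the stray quote before the file name. Inside a raw string, \' does not end the string, so the quote became part of the path. Try this: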
import pandas as pd
practiceset = (r'C:\Users\michael\Desktop\Work\Transpo\Transportation2016.xlsx')
df = pd.read_excel(practiceset)
print (df)
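
A small illustration of why the stray quote matters. The paths here are shortened, hypothetical versions of the one in the question:

```python
# Hypothetical, shortened paths used only to illustrate the quoting issue.
with_stray_quote = r'C:\Work\'Transportation2016.xlsx'
corrected = r'C:\Work\Transportation2016.xlsx'

# In a raw string, \' keeps both the backslash and the quote,
# so the first path contains a literal ' that no real file name has.
print(with_stray_quote)
print(corrected)
print("'" in with_stray_quote)
```

Printing the path before calling read_excel is a quick way to spot this kind of problem.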

Python read issue in excel with pandas

I am new to open source tools. I am using Anaconda3-5.1.0-Windows-x86_64 and Microsoft Excel 2016. Reading an Excel file with pandas raises a FileNotFoundError for the code below.
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
path= "D:\sample.xlsx"
print(path)
df = pd.read_excel(path, sheet_name = 'Sheet1')
print('Column headings:')
print(df.columns)
The error message is FileNotFoundError: [Errno 2] No such file or directory: 'D:\\sample.xlsx'
I was trying to read 'D:\sample.xlsx', but the function appears to open the file as 'D:\\sample.xlsx'.
Can anyone please advise on this issue or shall let me know any more details required.

Change
path= "D:\sample.xlsx"
to
path= r"D:\sample.xlsx" or path= "D:\\sample.xlsx"
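
A short sketch of how the different spellings of a Windows path compare. The path is hypothetical and chosen so that it contains \t and \n, which Python does treat as escape sequences; in 'D:\sample.xlsx' itself, \s is not a recognized escape, so raw strings matter most for paths like the one below.

```python
from pathlib import PureWindowsPath

# Hypothetical path chosen because \t and \n are real escape sequences.
plain = "D:\temp\new.xlsx"       # contains a tab and a newline, not backslashes
raw = r"D:\temp\new.xlsx"        # raw string: backslashes kept as written
escaped = "D:\\temp\\new.xlsx"   # each backslash escaped explicitly
forward = "D:/temp/new.xlsx"     # forward slashes are also accepted on Windows

print(repr(plain))
print(raw == escaped)
print(PureWindowsPath(raw) == PureWindowsPath(forward))
```

The doubled backslashes that appear in an error message are only how Python displays a single backslash inside quotes; they do not mean the path itself was changed.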
