Problem reading CSV file from URL in pandas python - python

I'm trying to read a CSV file into pandas from this URL:
https://www.dropbox.com/s/uh7o7uyeghqkhoy/diabetes.csv
By doing this:
url = "https://www.dropbox.com/s/uh7o7uyeghqkhoy/diabetes.csv"
c = pd.read_csv(url)
Or by doing this:
import pandas as pd
import io
import requests
url="https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv"
s=requests.get(url).content
c=pd.read_csv(io.StringIO(s.decode('utf-8')))
And I still get the same error message:
ParserError: Error tokenizing data. C error: Expected 1 fields in line 3, saw 2

Simply:
import pandas as pd
df = pd.read_csv("https://www.dropbox.com/s/uh7o7uyeghqkhoy/diabetes.csv?dl=1")
print(df)
You require a ?dl=1 at the end of your link.

Add ?dl=1 to the end of the URL
import pandas as pd
url = "https://www.dropbox.com/s/uh7o7uyeghqkhoy/diabetes.csv?dl=1"
c = pd.read_csv(url)
print(c)
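
A likely explanation, stated here as an assumption rather than something confirmed above: without dl=1 the share link returns Dropbox's HTML preview page instead of the raw file, and pandas then fails to tokenize it. A minimal sketch of a helper that forces dl=1 on a share link using only the standard library; the name to_direct_download is hypothetical:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_direct_download(share_url):
    """Return share_url with its dl query parameter set to 1."""
    parts = urlsplit(share_url)
    # Keep any other query parameters, but drop an existing dl value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "dl"
    ]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


url = to_direct_download("https://www.dropbox.com/s/uh7o7uyeghqkhoy/diabetes.csv")
print(url)  # https://www.dropbox.com/s/uh7o7uyeghqkhoy/diabetes.csv?dl=1
```

The resulting string can be passed to pd.read_csv exactly as in the answers above.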

Related

Read Excel file with pandas from URL response

I am looking for a way to read an xlsx file with pandas when the file is hosted on SharePoint. The contents, when shown through reponse.text, look like a string but are actually the binary content of the file:
PK ... [Content_Types].xml ... (binary ZIP content, truncated)
I would like to know how to read this format into memory so that I can call pd.read_excel with it.
I've tried to use urllib and openpyxl, in this manner:
import openpyxl as excel
import pandas as pd
from io import BytesIO
import urllib
req = urllib.request.Request(url=url, data=payload, headers=headers)
with urllib.request.urlopen(url=req) as reponse:
    rsp = reponse.read()
    excel.load_workbook(filename=rsp)
But I am getting error 400 Bad Request, from the urllib request module.
The url looks something like this:
https://company.sharepoint.com/sites/test-department/_api/Web/GetFileByServerRelativeUrl('/sites/test-department/DepartmentDocuments/test/Book1.xlsx')/$value?binaryStringResponseBody=true
I found a way to do it. The key was to seek back to the start of the buffer before passing the file to pandas.
file_ext = self.file_name.split('.')[-1]
if file_ext == 'xlsx':
    import pandas as pd
    from io import BytesIO
    # Copy the response body into an in-memory file
    xl = bytes(memoryview(response.content))
    memfile = BytesIO()
    memfile.write(xl)
    # write() leaves the cursor at the end, so rewind before reading
    memfile.seek(0)
    df = pd.read_excel(memfile, engine='openpyxl')
    print(df.head(10))
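
The same idea can be written more compactly. As a sketch: a BytesIO built directly from the bytes already starts at position 0, so no explicit seek(0) is needed. The function names are hypothetical, and content stands in for response.content.

```python
from io import BytesIO


def as_excel_buffer(content):
    """Wrap raw response bytes in a seekable in-memory file."""
    # A BytesIO created from bytes starts at offset 0,
    # unlike an empty one that was filled with write().
    return BytesIO(content)


def read_xlsx_bytes(content):
    """Parse .xlsx bytes into a DataFrame (needs pandas and openpyxl installed)."""
    import pandas as pd  # imported here so the buffer helper works without pandas

    return pd.read_excel(as_excel_buffer(content), engine="openpyxl")


buffer = as_excel_buffer(b"PK\x03\x04 example bytes")
print(buffer.tell())   # position of the read cursor
print(buffer.read(2))  # an .xlsx file is a ZIP archive, so it starts with b"PK"
```

With a real response, read_xlsx_bytes(response.content) would take the place of the write-and-seek steps.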

Load a csv file python jupyter

I'm new to Python and I got an error I couldn't solve:
import pandas as pd
import numpy as np
url = 'http://localhost:8888/edit/Downloads/untitled.cvs'
food2014_recalls = pd.read_csv(url)
This is my csv file:
animal,uniq_id,water_need
elephant,1001,500
elephant,1002,600
elephant,1003,550
I got this error:
import pandas as pd
import numpy as np
import io
import requests
url = 'http://localhost:8888/edit/Downloads/untitled.cvs'
res = requests.get(url).content
# on_bad_lines='skip' (pandas 1.3+) replaces error_bad_lines=False, which pandas 2.0 removed
food2014_recalls = pd.read_csv(io.StringIO(res.decode('utf-8')), on_bad_lines='skip', comment='#', sep=',')
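
If the file lives on the same machine as the notebook, passing a filesystem path avoids HTTP entirely. This is a sketch under assumptions: in the classic Jupyter Notebook the /edit/ URL opens the editor page rather than serving the raw file, and the temporary file below only stands in for the real file in Downloads.

```python
import os
import tempfile

import pandas as pd

sample = (
    "animal,uniq_id,water_need\n"
    "elephant,1001,500\n"
    "elephant,1002,600\n"
    "elephant,1003,550\n"
)

# Write the sample rows to a temporary file that stands in for the real one.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "untitled.csv")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    # Read by filesystem path rather than by notebook URL.
    df = pd.read_csv(path)

print(df.shape)
print(list(df.columns))
```

In the notebook itself, the path would be wherever the file was actually saved, for example a path relative to the notebook's working directory.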

Read_csv from URL into Jupyter

Hi, I am unable to read a CSV file from a URL using:
import pandas as pd
import numpy as np
data_url = 'https://data.baltimorecity.gov/Financial/Real-Property-Taxes/27w9-urtv.csv'
df = pd.read_csv(data_url)
df.head()
I got an error: "not acceptable"
I also tried different codes importing "requests" but none of them worked. How do I fix this?
Your URL wasn't correct. This should work:
import pandas as pd
data_url = 'https://data.baltimorecity.gov/resource/27w9-urtv.csv'
df = pd.read_csv(data_url)
df.head()

trying to import excel into python

I'm still new to Python. I'm trying to import an Excel document into Python, but I get a FileNotFoundError.
This is what I'm running:
import pandas as pd
practiceset = (r'C:\Users\michael\Desktop\Work\Transpo\'Transportation2016.xlsx')
df = pd.read_excel(practiceset)
print (df)
The python file is in the same folder as the doc, so I'm confused.
Try this, with the stray quote before the file name removed:
import pandas as pd
practiceset = (r'C:\Users\michael\Desktop\Work\Transpo\Transportation2016.xlsx')
df = pd.read_excel(practiceset)
print (df)
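
Since the script sits in the same folder as the workbook, one way to avoid hand-typed absolute paths is to build the path from a folder and a file name, and fail early with a clear message. A sketch only; workbook_path is a hypothetical helper, and in a real script the folder would typically be Path(__file__).parent.

```python
from pathlib import Path


def workbook_path(folder, name):
    """Return folder/name as a Path, raising if the file is missing."""
    path = Path(folder) / name
    if not path.is_file():
        # Show the exact path that was tried, which exposes typos such as a stray quote.
        raise FileNotFoundError(f"No such file: {path}")
    return path
```

The result can be passed straight to pd.read_excel, for example pd.read_excel(workbook_path(Path(__file__).parent, "Transportation2016.xlsx")).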

Unable to read a text file into jupyter notebook on mac

I downloaded this dataset and stored it in a folder called AutomobileDataset.
I cross checked the working directory using:
import pandas as pd
import numpy as np
import os
os.chdir("/Users/madan/Desktop/ML/Datasets/AutomobileDataset")
os.getcwd()
Output:
'/Users/madan/Desktop/ML/Datasets/AutomobileDataset'
Then I tried reading the file using pandas:
import pandas as pd
import numpy as np
import os
os.chdir("/Users/madan/Desktop/ML/Datasets/AutomobileDataset")
os.getcwd()
automobile_data = pd.read_csv("AutomobileDataset.txt", sep=',',
                              header=None, na_values='?')
automobile_data.head()
Output:
---------------------------------------------------------------------------
ParserError: Error tokenizing data. C error: Expected 1 fields in line 2, saw 26
Someone please help me with this, I don't know where I am making a mistake.
You can try this to see what the file actually contains:
import os
# Read in a plain text file
with open(os.path.join("c:user/xxx/xx", "xxx.txt"), "r") as f:
    text = f.read()
print(text)
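
Printing only the first few raw lines is often enough to see why the parser expected a single field, because pandas takes the expected field count from the first line it reads. A sketch; peek_lines is a hypothetical helper, and the sample file below is invented purely to show a first line that is not comma-separated.

```python
import itertools
import os
import tempfile


def peek_lines(path, n=3):
    """Return the first n lines of a text file, without trailing newlines."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        return [line.rstrip("\n") for line in itertools.islice(f, n)]


# Invented sample: a title line followed by comma-separated rows.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("Automobile data\n3,?,alfa-romero\n1,?,audi\n")
    first_lines = peek_lines(path, 2)

print(first_lines)
```

If the first line of the real file turns out to be a title or other non-data text, passing skiprows=1 to pd.read_csv may be enough; that is a guess about this particular file, not a confirmed diagnosis.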
