Downloading a Web File directly into Pandas [duplicate] - python

This question already has answers here:
Pandas read_csv from url
(6 answers)
Closed 3 years ago.
I am trying to download a web file from this link directly into pandas. The issue is that most tutorials use a URL that ends in a file extension, which makes the file easy to download directly.
This link triggers the download of a text file, but because the URL has no extension, the conventional methods don't obviously apply. How can I load this file directly into pandas from the link?

Data Science Acolyte, all you have to do is this!
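import pandas as pd
# Reads a local copy, i.e. the text file after it has been downloaded and saved as ADAMS.txt
df = pd.read_csv('ADAMS.txt')
Or, to read straight from the URL, you can try:
df = pd.read_csv('https://www6.sos.state.oh.us/ords/f?p=VOTERFTP:DOWNLOAD::FILE:NO:2:P2_PRODUCT_NUMBER:1')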

Please try the code below and see if it works for you:
import io
import pandas as pd
import requests
url = "https://www6.sos.state.oh.us/ords/f?p=VOTERFTP:DOWNLOAD::FILE:NO:2:P2_PRODUCT_NUMBER:1"
# Fetch the raw bytes of the response
s = requests.get(url).content
# Decode the bytes to text and let pandas parse the text as CSV
c = pd.read_csv(io.StringIO(s.decode('utf-8')))
More details here.
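The download-then-parse approach above can be sketched with the parsing step separated from the download, so the decoding can be checked without network access. This is only a minimal sketch: the helper names csv_bytes_to_frame and read_csv_from_url are hypothetical, the standard library's urllib.request is swapped in for requests, and it assumes the file is UTF-8 encoded and comma-separated, which the question does not confirm.

```python
import io
from urllib.request import urlopen

import pandas as pd


def csv_bytes_to_frame(raw: bytes, encoding: str = "utf-8") -> pd.DataFrame:
    """Decode downloaded bytes and parse the resulting text as CSV."""
    return pd.read_csv(io.StringIO(raw.decode(encoding)))


def read_csv_from_url(url: str, encoding: str = "utf-8") -> pd.DataFrame:
    """Download a URL (with or without a file extension) into a DataFrame."""
    with urlopen(url) as response:  # performs the HTTP GET
        return csv_bytes_to_frame(response.read(), encoding)


# Offline check: made-up sample bytes standing in for a real download
sample = b"id,name\n1,Ada\n2,Grace\n"
frame = csv_bytes_to_frame(sample)
print(frame.shape)
```

Calling read_csv_from_url(url) with the link from the question would then return the DataFrame, provided the server responds with plain CSV text.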

Related

Can I save a table/dataframe to a file (like png/jpg) in python? [duplicate]

This question already has answers here:
How to save a pandas DataFrame table as a png
(13 answers)
Closed 1 year ago.
I found several ways to PRINT tables in nicer formatting but can I also SAVE those outputs to a file (not csv, excel etc.)? They don't even need to be changeable, an image-like representation would be great. I get presentation-ready dataframes that I have to reformat since I'm saving them in excel files at the moment.
Assuming this table is a pandas DataFrame, this library might help:
www.dexplo.org/dataframe_image/
This library exports pandas DataFrames as images, styled the way a Jupyter notebook renders them.
Example usage:
import pandas as pd
import dataframe_image as dfi
df = pd.DataFrame({'key':[1,2,3],'val':['a','b','c']})
dfi.export(df, 'dataframe.png')

How to convert nested JSON data from a URL into a Pandas dataframe

import json
import pandas
import requests
# Convert to Pandas
I know what you're going to say: this has been asked before. But I've gone through a number of posts already, and they all require the JSON file to be saved locally before the code loads it.
With this code I've been trying to import the JSON data through a URL, so there is no need to save any files beforehand.
Is it even possible?
Please help.
Pandas json_normalize can do just that. Here is an example, in which packages_json is assumed to hold the already-parsed JSON response; you will have to modify it to meet your specific needs:
import pandas as pd
df = pd.json_normalize(packages_json, record_path='results')
(I omitted the output from the DataFrame because it is unwieldy.)
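To make the json_normalize call concrete, here is a small offline sketch. The payload is made up for illustration (the real API response is not shown in the question), and only the 'results' key is taken from the answer above. With a live URL, the parsed dictionary would typically come from requests.get(url).json() instead of json.loads.

```python
import json

import pandas as pd

# Made-up payload standing in for the parsed response of a JSON API
payload_text = """
{
  "count": 2,
  "results": [
    {"name": "alpha", "owner": {"login": "ann", "id": 1}},
    {"name": "beta", "owner": {"login": "bob", "id": 2}}
  ]
}
"""
packages_json = json.loads(payload_text)

# record_path selects the list of records; nested dicts become dotted columns
df = pd.json_normalize(packages_json, record_path="results")
print(sorted(df.columns))
```

Each entry under "results" becomes one row, and the nested "owner" dictionary is flattened into columns such as owner.login.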

How to download xlsx file from URL and save in data frame via python

I would like the following code to download the xlsx files from the URL and save in drive.
I receive this error:
AttributeError: 'str' object has no attribute 'content'
Below is the code:
import requests
import xlrd
import pandas as pd
filed = 'https://www.icicipruamc.com/downloads/others/monthly-portfolio-disclosures/monthly-portfolio-disclosure-november19/Arbitrage.xlsx'
resp = requests.get(filed)
workbook = xlrd.open_workbook(file_contents = filed.content)
worksheet = workbook.sheet_by_index(0)
first_row = worksheet.row(0)
df = pd.DataFrame(first_row)
pandas already has a function that converts Excel files directly into a pandas DataFrame (using xlrd for .xls files; recent pandas versions use openpyxl for .xlsx):
import pandas as pd
MY_EXCEL_URL = "https://www.yes.com/xl.xlsx"  # placeholder; the URL needs a scheme such as https://
xl_df = pd.read_excel(MY_EXCEL_URL,
                      sheet_name='my_sheet',
                      skiprows=range(5),
                      skipfooter=0)
Then you can handle or save the file using pd.DataFrame.to_excel.
This function works; I tested the individual components. The ICICI URL in your question returns a 404 for me, so make sure the URL works and points to an Excel file before trying this out.
import os
import requests
import pandas as pd
def excel_to_pandas(URL, local_path):
    # Download the file and write the raw bytes to disk
    resp = requests.get(URL)
    local_path = os.path.expanduser(local_path)  # open() does not expand '~'
    with open(local_path, 'wb') as output:
        output.write(resp.content)
    # Load the saved workbook into a DataFrame
    df = pd.read_excel(local_path)
    return df
print(excel_to_pandas("https://www.websiteforxls.com", '~/Desktop/my_downloaded.xls'))
As a footnote, this was super simple, and I'm disappointed you couldn't do this on your own. I might not have been able to do this 5 years ago, and that's why I decided to help.
If you want to code, learn the basics, literally the basics: classes, functions, variables, types, OOP principles. That's all you need to start. Then you need to learn how to search, and how to make different components work together the way you require them to. And with SO, if you show some effort, we are happy to help. We are a community, not a place to solve your homework. Try harder next time.
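For reference, the AttributeError in the question arises because .content is read from filed, which is the URL string, instead of from resp, the response returned by requests.get. The sketch below reproduces that difference offline. The URL is a placeholder, and SimpleNamespace is only a stand-in for a real response object, not part of requests.

```python
from types import SimpleNamespace

url = "https://example.com/report.xlsx"  # placeholder URL, not from the question
# Stand-in for the object that requests.get(url) would return (demo assumption)
resp = SimpleNamespace(content=b"PK\x03\x04 fake workbook bytes")

try:
    url.content  # same mistake as filed.content: a str has no such attribute
except AttributeError as exc:
    message = str(exc)

payload = resp.content  # the bytes live on the response, not on the URL string
print(message)
```

With a real response, resp.content could be wrapped in io.BytesIO and handed to pd.read_excel, which accepts file-like objects, so no intermediate file is needed.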

Import CSV file into Python [duplicate]

This question already has answers here:
How do I read and write CSV files with Python?
(7 answers)
Closed 4 years ago.
I tried several times to import a CSV file into Python 2.7.15, but it fails.
Please suggest how to import a CSV file in Python.
Thank you
There are two common ways to import a CSV file in Python.
First: using the standard Python csv module
import csv
with open('filename.csv') as csv_file:
    csv_read = csv.reader(csv_file, delimiter=',')
    for row in csv_read:
        print(row)  # each row is a list of strings
Second: using pandas
import pandas as pd
data = pd.read_csv('filename.csv')
data.head() # to display the first 5 lines of loaded data
I would suggest importing with pandas, since that's the more convenient way to do so.
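As a small illustration of the first option, the sketch below reads CSV text with the standard csv module. The sample data is made up, and io.StringIO stands in for an opened file so the example runs without filename.csv on disk.

```python
import csv
import io

# Made-up CSV text standing in for the contents of filename.csv
text = "name,score\nAda,90\nGrace,95\n"

# csv.reader yields each row as a list of strings
rows = list(csv.reader(io.StringIO(text), delimiter=","))
header, records = rows[0], rows[1:]

# csv.DictReader uses the header row as the keys of each record
dict_rows = list(csv.DictReader(io.StringIO(text)))
print(header, records, dict_rows[0]["score"])
```

Note that the csv module returns every value as a string, whereas pandas infers numeric types, which is one reason the pandas route tends to be more convenient.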

Using local data in Python Seaborn returns file does not exist [duplicate]

This question already has answers here:
Seaborn load_dataset
(4 answers)
why is jupyter notebook not accepting my csv file link? [duplicate]
(1 answer)
Closed last month.
I am trying to utilize Seaborn to create a visualization.
Here is what I have thus far:
import os.path
directory = os.path.dirname(os.path.abspath(__file__))
import pandas as pd
import seaborn as sns
sns.set(style="whitegrid", color_codes=True)
tel = pd.read_csv('nyc.csv')
nyctel = sns.load_dataset(tel)
sns.stripplot(x="installation_id", y="mounting", hue="mounting", data=nyctel)
The official documentation for load_dataset is completely useless, so I found that someone had already asked a question about how it works here: https://stackoverflow.com/a/30337377/6110631
I followed the format listed in the answer and imported pandas so I could use a local file (saved in the same folder). When I run the program however, I get
IOError: File nyc.csv does not exist
If I use an absolute path I get
IOError: ('http protocol error', 0, 'got a bad status line', None)
It seems the problem is with this line:
nyctel = sns.load_dataset(tel)
because if I omit this line and the line beneath it and add print tel beneath the pd.read_csv line then the program works and it prints out the contents of the file. Somehow load_dataset is not letting me use that file though!
I am using the exact same code as in the answer linked above. Why would this not work for this local file?
load_dataset() is only needed to create a pandas DataFrame from one of seaborn's example datasets: it expects the name of an example dataset, which it downloads from seaborn's online repository. In your case, you already created a DataFrame with pd.read_csv('nyc.csv'), so sns.load_dataset(tel) is unnecessary and will not work.
Here is a quote from https://seaborn.pydata.org/introduction.html
Most code in the docs will use the load_dataset() function to get
quick access to an example dataset. There’s nothing special about
these datasets: they are just pandas dataframes, and we could have
loaded them with pandas.read_csv() or built them by hand. Most of the
examples in the documentation will specify data using pandas
dataframes, but seaborn is very flexible about the data structures
that it accepts.
I'm posting this from mobile, so it's not tested:
import os.path
import pandas as pd
import seaborn as sns
# Build the path relative to this script so the CSV is found regardless of the working directory
directory = os.path.dirname(os.path.abspath(__file__))
filename = 'nyc.csv'
file_path = os.path.join(directory, filename)
tel = pd.read_csv(file_path)
sns.set(style="whitegrid", color_codes=True)
# tel is already a DataFrame, so pass it to the plot directly instead of calling load_dataset
sns.stripplot(x="installation_id", y="mounting", hue="mounting", data=tel)
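The path handling in the answer can also be sketched with pathlib. This is an illustrative variant, not the answerer's code: the helper name data_file_next_to and the script name plot.py are hypothetical, and a temporary folder stands in for the project directory so the example runs anywhere.

```python
import tempfile
from pathlib import Path


def data_file_next_to(script_path: str, filename: str) -> Path:
    """Return the path of `filename` in the same folder as `script_path`."""
    return Path(script_path).resolve().parent / filename


# Offline demonstration with a temporary folder standing in for the project
with tempfile.TemporaryDirectory() as folder:
    script = Path(folder) / "plot.py"  # hypothetical script name
    script.write_text("# placeholder script\n")
    (Path(folder) / "nyc.csv").write_text("installation_id,mounting\n1,pole\n")

    csv_path = data_file_next_to(str(script), "nyc.csv")
    found = csv_path.exists()
    same_folder = csv_path.parent == script.resolve().parent

print(found, same_folder)
```

Inside a real script, data_file_next_to(__file__, 'nyc.csv') would give a path that can be passed to pd.read_csv, independent of the directory the script is launched from.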
