Opening data using pandas in Python 3.7

Hi, I tried to open some data that I downloaded to my Documents folder using pandas with Python 3.7, but it doesn't work.
This is my code:
import pandas as pd
users = pd.read_csv("ml-100k/u.user", sep="|",
                    names=["User ID", "Age", "Gender", "Occupation", "zipcode"])
users.head()
The error is:
FileNotFoundError: File b'ml-100k/u.user' does not exist
How can it be that the file doesn't exist if I downloaded it?
Thanks :)

It seems your issue is that the data file is not in the working directory of your Python session, so the relative path "ml-100k/u.user" points nowhere. There are a few ways to fix this.
First, check the file name. Your file has a .user extension; pd.read_csv() does not care about the extension, so renaming it to .csv is optional, but the name and extension in your code must match the file on disk exactly. I also advise making file names code friendly: replace whitespace with _ or - and remove non-alphanumeric characters such as #*/().
One solution is to provide the full path to the pd.read_csv() function:
pd.read_csv("/home/user/folder/file_name.csv",
            sep="|", names=["User ID", "Age", "Gender", "Occupation", "zipcode"])
If you are using IPython or a Jupyter notebook, you can navigate to the folder that holds your file with the cd path_to_file_folder command and then pass just the file name:
pd.read_csv("file_name.csv", sep="|",
            names=["User ID", "Age", "Gender", "Occupation", "zipcode"])
For more robust solutions check this discussion.
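One way to see which file pandas is actually being asked for is to resolve the relative path against the working directory before reading. This is a minimal sketch, assuming a hypothetical helper named resolve_data_file (it is not part of pandas):

```python
from pathlib import Path


def resolve_data_file(relative_path, base_dir=None):
    """Return the absolute path of relative_path, or raise a descriptive error.

    resolve_data_file is a hypothetical helper, not a pandas function.
    base_dir defaults to the current working directory, which is where
    relative paths passed to pd.read_csv() are looked up.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    candidate = (base / relative_path).resolve()
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{candidate} does not exist (working directory: {Path.cwd()})"
        )
    return candidate
```

The returned path can be passed straight to pandas, for example pd.read_csv(resolve_data_file("ml-100k/u.user"), sep="|"). If the error message names a folder other than the one holding ml-100k, either move the data there or pass base_dir explicitly.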

Related

Read Excel file that is located outside the folder containing the module into Pandas DataFrame

I want to read an Excel file into a pandas DataFrame. The module from which I want to read the file is inputs.py, and the Excel file (schoolsData.xlsx) is outside the folder containing the module.
I'm doing it like this in my code:
def read_data_excel(path):
    df_file = pd.read_excel(path)
    return df_file
school_data = read_data_excel('../schoolsData.xlsx')
Error: No such file or directory: '../schoolsData.xlsx'
The strange thing is that it works fine when I run the function containing this code locally, but I get an error when I run it after installing my published package from PyPI.
What is the right way to do it? Also, is it possible to read the file normally from the installed distributable, which is a compressed folder?

The error is probably caused by the current working directory being different when you run the code locally than when you run it after installing the package. Take a look at this to generalize the path without hardcoding it.

Your code should work. Try this to check which folder your module is in:
import os
print(os.path.dirname(os.path.realpath(__file__)))
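Building on the __file__ idea, the path can be anchored to the module's own location rather than to the working directory. This is a minimal sketch, assuming a hypothetical helper named path_relative_to_module. Note that after installing from PyPI the Excel file is only present if it was shipped with the package, which this sketch does not handle:

```python
from pathlib import Path


def path_relative_to_module(module_file, *parts):
    """Join parts onto the folder that contains module_file.

    Hypothetical helper: pass __file__ from the calling module. A ".."
    part climbs one folder up, as in the question.
    """
    module_dir = Path(module_file).resolve().parent
    return module_dir.joinpath(*parts).resolve()
```

Inside inputs.py this would be called as path_relative_to_module(__file__, "..", "schoolsData.xlsx"), and the result handed to pd.read_excel().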

You can always do
df_file = pd.read_excel("../schoolsData.xlsx")
".." goes one level up from the current working directory, so this is a relative reference.
You can also define an absolute path to that folder (one that starts from C:/whatever).

I am trying to read a csv file using pandas but it fails to recognize it - what could be the issue? (Tkinter)

I have been trying to read this CSV file using pandas and then convert it to a dictionary using:
pandas.read_csv("/Users/vijayaswani/Downloads/England1\ postcodes.csv ", index_col=1).T.to_dict()
but each time I get the error No such file or directory.
Neither the file name nor its full path works, even though the file has not been deleted or moved.
What could be the issue?

Looks like you have an extra space at the end of the file path.
Have you tried:
pandas.read_csv("/Users/vijayaswani/Downloads/England1\ postcodes.csv", index_col=1).T.to_dict()
Or, without the backslash, which is a shell escape and is kept as a literal character inside a Python string:
pandas.read_csv("/Users/vijayaswani/Downloads/England1 postcodes.csv", index_col=1).T.to_dict()
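The difference between these strings can be inspected without touching the file system. This is a minimal sketch, assuming the file on disk is named "England1 postcodes.csv" as the question implies:

```python
# The path as typed in the question: a shell-style "\ " escape and a
# trailing space. The backslash is doubled so Python stores it literally.
raw_path = "/Users/vijayaswani/Downloads/England1\\ postcodes.csv "

# Drop surrounding whitespace and turn the shell escape into a plain space.
clean_path = raw_path.strip().replace("\\ ", " ")

# repr() makes stray spaces and backslashes visible.
print(repr(raw_path))
print(repr(clean_path))
```

Printing with repr() is a quick way to spot characters that are easy to miss in an error message.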

Entering a file path using input()

I'm trying to extract information from an Excel file and eventually put the values into a docx file. When I wrote the code I entered a specific path using the syntax (r"file path"), and I had no problem. Because the program was created for a friend of mine and will run on a different computer, I am looking for a way for my friend to open the exact Excel file he wants. Below you can see some code. Thanks in advance to anybody who spends some time trying to solve it!
This one worked for me:
loc = (r"C:\Users\dddor\Desktop\python24\report_example.xlsx")
This one raises an error:
loc=(input('enter file location: '))
I also tried this, but the same error popped up:
loc=("r"+input('enter file location: '))
Even when I paste the file path that worked for me, it doesn't work (both with and without the "r").

You can use pathlib.
This module offers classes representing filesystem paths with semantics appropriate for different operating systems.
from pathlib import Path
file_path = Path(input('enter file location: '))
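input() already returns the text exactly as typed, so no r prefix is needed; the prefix only affects string literals written in source code. The question does not show the error, but one common cause is a pasted path that still carries surrounding quotes or spaces. This is a minimal sketch under that assumption, with a hypothetical helper named ask_for_file:

```python
from pathlib import Path


def ask_for_file(prompt="enter file location: ", input_fn=input):
    """Keep asking until the answer points at an existing file.

    Hypothetical helper; input_fn is injectable so the function can be
    exercised without a keyboard.
    """
    while True:
        # Remove whitespace and any quotes left over from copy and paste.
        text = input_fn(prompt).strip().strip('"').strip("'")
        path = Path(text)
        if path.is_file():
            return path
        print(f"No file found at {text!r}, please try again.")
```

The returned Path can be passed to whatever opens the workbook, and the loop gives the friend another chance instead of crashing on a mistyped path.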

Import Excel File Using Pandas

I'm trying to import an Excel file that I have stored in a folder within a GitHub repository. Based on that, the file path should be
"C:\\Users\\'username'\\Documents\\GitHub\\'repository'\\'folder'\\'filename'.xlsx"
But when I enter the code
import pandas as pd
xlsfile="C:\\Users\\'username'\\Documents\\GitHub\\'repository'\\'folder'\\'filename'.xlsx"
xl1=pd.read_excel(xlsfile,sheet_name='sheet',skiprows=21)
I get an error that says the file path I entered doesn't exist. I know that the entire path to the file exists because my working directory also contains the file, so what could I be doing wrong?
I have no experience coding. Thanks.

Remove the "'" characters in your file path? Is your sheet really named 'sheet'? I think the default is 'Sheet1', etc.

There can be multiple causes. As Joe stated, you probably don't have ' ' around your file names; I'm assuming they were included so that you fill in your local file path there (i.e. replace 'username' with Jack.Donaghue and so on). An example would look something like: "C:/Users/Jack_Donague/Documents/GitHub/YourRepoName/data/datafilename.xlsx"
Also, as colbster pointed out, confirm what your sheet is named. I've also experienced some issues with \ vs / in file names since I'm working on Windows 10.
I would recommend trying the following, with each quoted placeholder replaced by the real name and the ' characters removed:
import pandas as pd
xlsfile="C:/Users/'username'/Documents/GitHub/'repository'/'folder'/'filename'.xlsx"
xl1=pd.read_excel(xlsfile,sheet_name='sheet',skiprows=21)
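When a long path is rejected, it helps to find out which part of it is wrong. This is a minimal sketch, assuming a hypothetical helper named first_missing_part that checks the path from the root downwards:

```python
from pathlib import Path


def first_missing_part(path):
    """Return the shortest prefix of path that does not exist, or None.

    Hypothetical diagnostic helper, not part of pandas.
    """
    path = Path(path)
    # path.parents runs from the nearest folder up to the root, so it is
    # reversed here to check the outermost folders first.
    for candidate in list(reversed(path.parents)) + [path]:
        if not candidate.exists():
            return candidate
    return None
```

If this returns None, the path itself is fine and the sheet name is the next thing to check; pd.ExcelFile(xlsfile).sheet_names lists the sheets that are actually in the workbook.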

Is it possible to download just part of a ZIP file using python zipfile library

I was wondering whether there is any way to download only part of a .rar or .zip file without downloading the whole file. There is a zip file containing files A, B, C and D. I only need A. Can I somehow use the zipfile module so that I download only one file?
I am trying the code below:
r = c.get(file)
z = ZipFile.ZipFile(BytesIO(r.content))
for file1 in z.namelist():
    if 'time' not in file1:
        print("hi")
        z.extract(file1, download_path + filename)
This code downloads the whole zip file and only extracts the specific one. Can I somehow download only the file I need?
There is a similar question here, but it only shows an approach using the command line in Linux. That question doesn't address how it can be done using Python libraries.

The question @Juggernaut mentioned in a comment is actually very helpful, as it points you in the direction of the solution.
You need to create a replacement for BytesIO that returns the necessary information to ZipFile. You will need to get the length of the file, and then fetch whatever sections ZipFile asks for.
How large are those files? Is it really worth the trouble?
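The replacement for BytesIO described here can be sketched as a small seekable file object that asks for byte ranges on demand. This is a minimal sketch, not a tested download client: RangeReader, http_range_fetcher and open_remote_zip are hypothetical names, and it assumes the server honours the Range header and that the total file size is known (for example from Content-Length).

```python
import io
import urllib.request
import zipfile


class RangeReader(io.RawIOBase):
    """Read-only, seekable file object that fetches bytes on demand.

    Hypothetical class for illustration. fetch(start, end) must return
    the bytes in the half-open range [start, end) of the remote file.
    """

    def __init__(self, size, fetch):
        super().__init__()
        self._size = size
        self._fetch = fetch
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = self._size + offset
        else:
            raise ValueError(f"unsupported whence: {whence}")
        if new_pos < 0:
            raise OSError("negative seek position")
        self._pos = new_pos
        return self._pos

    def readinto(self, buffer):
        # Fetch only what the caller asked for, capped at end of file.
        end = min(self._pos + len(buffer), self._size)
        if end <= self._pos:
            return 0
        data = self._fetch(self._pos, end)
        buffer[:len(data)] = data
        self._pos += len(data)
        return len(data)


def http_range_fetcher(url):
    """Return a fetch(start, end) function backed by HTTP Range requests.

    Assumes the server supports the Range header.
    """
    def fetch(start, end):
        # HTTP byte ranges are inclusive, hence end - 1.
        request = urllib.request.Request(
            url, headers={"Range": f"bytes={start}-{end - 1}"}
        )
        with urllib.request.urlopen(request) as response:
            return response.read()
    return fetch


def open_remote_zip(size, fetch):
    """Open a ZipFile that reads through RangeReader instead of BytesIO."""
    return zipfile.ZipFile(RangeReader(size, fetch))
```

With a real server this would be used as open_remote_zip(size, http_range_fetcher(url)), after which read("A") or extract("A") only pulls the end-of-archive records, the central directory and the bytes of that one member. Each read is a separate request in this sketch, so a real implementation would likely add buffering.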

Use remotezip: https://github.com/gtsystem/python-remotezip. You can install it using pip:
pip install remotezip
Usage example:
from remotezip import RemoteZip
with RemoteZip("https://path/to/zip/file.zip") as zip_file:
    for file in zip_file.namelist():
        if 'time' not in file:
            print("hi")
            zip_file.extract(file, path="/path/to/extract")
Note that to use this approach, the web server from which you receive the file needs to support the Range header.
