For a current project, I am planning to remove the null values from a pandas DataFrame. For this purpose, I want to use pandas.DataFrame.fillna, which is apparently a solid solution for data cleanups.
When running the code below, however, I receive the following error: AttributeError: module 'pandas' has no attribute 'df'. I tried several ways to rewrite the line df = pd.df().fillna, none of which changed the outcome.
Is there any smart tweak to get this running?
import string
import json
import pandas as pd
# Loading and normalising the input file
file = open("sp500.json", "r")
data = json.load(file)
df = pd.json_normalize(data)
df = pd.df().fillna
After pd.json_normalize(data), df is already a DataFrame instance, so there is no pd.df() to call — which is exactly what the AttributeError is telling you. Call fillna on the frame itself, and give it a fill value (or a fill method):
df = pd.json_normalize(data)
df = df.fillna(0)  # or whatever fill value suits your data
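A minimal sketch of the fix, using a small hypothetical frame standing in for the normalised sp500 data (the column names here are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the output of pd.json_normalize(data)
df = pd.DataFrame({"symbol": ["AAPL", "MSFT"], "pe": [28.5, np.nan]})

# fillna is a DataFrame method: call it on the frame, with a fill value
cleaned = df.fillna(0)
```

After this, cleaned contains no NaN values; the missing "pe" entry becomes 0.0.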
I'm using a Jupyter notebook and I'm trying to open a data file, but I keep getting the error AttributeError: 'pandas._libs.properties.AxisProperty' object has no attribute 'unique'. This is my first time using Jupyter, so I am not familiar with errors like this.
import pandas as pd
df = pd.DataFrame
df - pd.read_csv("C:/Users/yusra/OneDrive/Documents/vgsales.csv")
df
You are not using pd.DataFrame correctly: df = pd.DataFrame binds the class itself rather than creating an instance, and the next line uses - where you meant =. Since read_csv already returns a DataFrame, you don't need pd.DataFrame at all. See the corrected code below:
import pandas as pd
df = pd.read_csv("C:/Users/yusra/OneDrive/Documents/vgsales.csv")
df
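A short sketch of the class-versus-instance distinction behind that error (the column name here is made up for illustration):

```python
import pandas as pd

# Without parentheses, pd.DataFrame is the class itself, not a frame;
# attribute lookups on it hit class-level descriptors such as AxisProperty,
# which is where the confusing error message comes from.
df_class = pd.DataFrame

# An actual instance, on which methods like unique() work as expected
df = pd.DataFrame({"a": [1, 1, 2]})
unique_values = df["a"].unique()
```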
I'm trying to write the data in an existing zip file to HDFS in parquet format, but I encountered the error below. I would be glad if you could help. (By the way, I'm open to your ideas on how to make this code serve the same purpose in a different way.)
import pandas as pd
import pyarrow.parquet as pq
file = "c:/okay.log.gz"
df = pd.read_csv(file, compression="gzip", low_memory=False, sep="|", error_bad_lines=False)
pq.write_table(df, "target_path")
AttributeError: 'DataFrame' object has no attribute 'schema'
I've just run into the same issue, but I assume you've resolved yours. In case you haven't or someone else comes across this with a similar issue, try creating a pyarrow table from the dataframe first.
import pyarrow as pa
import pyarrow.parquet as pq
df = {some dataframe}
table = pa.Table.from_pandas(df)
pq.write_table(table, '{path}')
I am making a GUI applet that needs to analyze data from many csv files (and also update them).
Right now all that I want is to read the data, update it, and then run pd.to_csv() on it.
I did this (first line of the code):
from pandas import read_csv, to_csv # because all that I want from pandas are these two things (for now)
Getting this error:
ImportError: cannot import name 'to_csv' from 'pandas' (C:\Users\<Your good username>\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\__init__.py)
Any advice?
to_csv is a method of the DataFrame class, so you can't import it the way you import read_csv: read_csv is a function in the pandas module, but to_csv is not.
to_csv is part of the DataFrame class. The example below should clear your doubts:
# importing pandas as pd
import pandas as pd
# lists of name, location, pin (they must all be the same length)
nme = ["John", "Jacky", "Victor"]
location = ["USA", "INDIA", "UK"]
pin = [1120, 10, 770]
# dictionary of lists
df = pd.DataFrame({'name': nme, 'location': location, 'pin': pin})
# saving the dataframe
df.to_csv('file2.csv', header=False, index=False)
It will create a csv file.
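The import pattern from the question can then be sketched like this — read_csv comes straight from the module, while to_csv is reached through a frame (the file name and data are made up for illustration):

```python
import os
import tempfile

# read_csv can be imported directly; to_csv cannot, because it lives on DataFrame
from pandas import DataFrame, read_csv

path = os.path.join(tempfile.mkdtemp(), "file2.csv")

# to_csv is called on a DataFrame instance, not imported from pandas
DataFrame({"name": ["John"], "pin": [1120]}).to_csv(path, index=False)

df = read_csv(path)
```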
I'm trying to take a dictionary object in python, write it out to a csv file, and then read it back in from that csv file.
But it's not working. When I try to read it back in, it gives me the following error:
EmptyDataError: No columns to parse from file
I don't understand this for two reasons. Firstly, if I used pandas' very own to_csv method, it should
be giving me the correct format for a csv. Secondly, when I print out the header values of the dataframe that I'm trying to save (by doing print(df.columns.values)), it says I do in fact have headers ("one" and "two"). So if the object I was sending out had column names, I don't know why they wouldn't be found when I'm trying to read it back.
import pandas as pd
testing = {"one":1,"two":2 }
df = pd.DataFrame(testing, index=[0])
file = open('testing.csv','w')
df.to_csv(file)
new_df = pd.read_csv("testing.csv")
What am I doing wrong?
Thanks in advance for the help!
pandas.DataFrame.to_csv accepts either a path or a writable buffer, but in your code the open file handle is never flushed or closed before read_csv runs, so read_csv sees an empty file and raises EmptyDataError. The simplest fix is to drop the open() call and pass the path directly; pass index=False to skip the index column.
import pandas as pd
testing = {"one":1,"two":2 }
df = pd.DataFrame(testing, index=[0])
df.to_csv('testing.csv', index=False)
new_df = pd.read_csv("testing.csv")
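For completeness, a handle does work too, provided it is closed (and therefore flushed) before reading back — a context manager takes care of that. This is a sketch with a temporary path made up for illustration:

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"one": [1], "two": [2]})
path = os.path.join(tempfile.mkdtemp(), "testing.csv")

# The with-block closes the handle before read_csv opens the file
with open(path, "w", newline="") as f:
    df.to_csv(f, index=False)

new_df = pd.read_csv(path)
```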
I am doing a data analysis project, and while importing the csv file into Spyder I am facing this error. Please help me debug it, as I am new to programming.
# import library
import pandas as pd
# read the data from the csv as a pandas dataframe
df = pd.read.csv('/Documents/Melbourne_housing_FULL.csv')
This is the error shown when I use the pd.read.csv command:
File "C:/Users/mylaptop/.spyder-py3/temp.py", line 4, in <module>
df = pd.read.csv('/Documents/Melbourne_housing_FULL.csv')
AttributeError: module 'pandas' has no attribute 'read'
You should use:
df = pd.read_csv('/Documents/Melbourne_housing_FULL.csv')
See the pandas documentation for read_csv.
You need to use pandas.read_csv() instead of pandas.read.csv(); the error is literally telling you that this attribute doesn't exist.
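A two-line sketch makes the point: read_csv is a module-level function on pandas, while read is not an attribute at all.

```python
import pandas as pd

# pandas exposes read_csv directly; there is no pd.read to chain .csv onto
has_read_csv = hasattr(pd, "read_csv")
has_read = hasattr(pd, "read")
```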