CSV Load Error with Pandas - python

Can someone help me figure out what this error is telling me? I don't understand why this csv won't load.
Code:
import pandas as pd
import numpy as np
energy = pd.read_csv('Energy Indicators.csv')
GDP = pd.read_csv('world_bank_new.csv')
ScimEn = pd.read_csv('scimagojr-3.csv')
Error:
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-2-65661166aab4> in <module>()
10
11
---> 12 answer_one()
<ipython-input-2-65661166aab4> in answer_one()
4 energy = pd.read_csv('Energy Indicators.csv')
5 GDP = pd.read_csv('world_bank_new.csv')
----> 6 ScimEn = pd.read_csv('scimagojr-3.csv')
7
8

The read_csv function takes an encoding option. The UnicodeDecodeError means the file is not valid in the default encoding (UTF-8), so you need to tell pandas what the file's encoding actually is. Try encoding="ISO-8859-1".
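A minimal sketch of that suggestion, assuming the file really is ISO-8859-1 (the original file is not available, so this builds a small hypothetical sample file in a temporary directory to stand in for 'scimagojr-3.csv'):

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data saved as ISO-8859-1, standing in for the real CSV.
path = os.path.join(tempfile.mkdtemp(), "sample_latin1.csv")
with open(path, "wb") as f:
    f.write("Rank,Country\n1,Côte d'Ivoire\n".encode("iso-8859-1"))

# With the default UTF-8 codec, the non-ASCII byte cannot be decoded.
try:
    pd.read_csv(path)
    default_ok = True
except UnicodeDecodeError:
    default_ok = False

# Naming the encoding explicitly lets pandas decode the file.
ScimEn = pd.read_csv(path, encoding="ISO-8859-1")
print(default_ok)
print(ScimEn.loc[0, "Country"])
```

If ISO-8859-1 produces odd-looking characters in the real data, the file may use a different encoding (for example "cp1252"), so it is worth checking a few non-ASCII values after loading.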

Related

How to import Missingpy in Python

I have tried all the answers that have been posted on the forum, but I keep getting this error.
How do I correct it?
ImportError Traceback (most recent call last)
c:\Users\Godfred King\Desktop\Python\titanic\mlingopractice.ipynb Cell 6 in <cell line: 4>()
2 import sklearn.neighbors._base
3 import sys
----> 4 from missingpy import MissForest
5 sys.modules['sklearn.neighbors.base'] = sklearn.neighbors._base
File c:\Users\Godfred King\AppData\Local\Programs\Python\Python39\lib\site-packages\missingpy\__init__.py:1, in <module>
----> 1 from .knnimpute import KNNImputer
2 from .missforest import MissForest
4 __all__ = ['KNNImputer', 'MissForest']
File c:\Users\Godfred King\AppData\Local\Programs\Python\Python39\lib\site-packages\missingpy\knnimpute.py:13, in <module>
11 from sklearn.utils.validation import check_is_fitted
12 from sklearn.utils.validation import FLOAT_DTYPES
---> 13 from sklearn.neighbors.base import _check_weights
14 from sklearn.neighbors.base import _get_weights
16 from .pairwise_external import pairwise_distances
ImportError: cannot import name '_check_weights' from 'sklearn.neighbors._base' (c:\Users\Godfred King\AppData\Local\Programs\Python\Python39\lib\site-packages\sklearn\neighbors\_base.py)
I have tried the responses posted under "No module named 'sklearn.neighbors.base'", but even after applying all of the suggestions I still get the same error.
The code seems to be correct. Try updating your packages.
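Before updating anything, it can help to see which versions are actually installed. A minimal sketch using only the standard library (Python 3.8+); the helper name installed_version is hypothetical, and the package names are the PyPI distribution names assumed for the modules in the traceback:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Assumed distribution names for the modules imported above.
for name in ("scikit-learn", "missingpy"):
    print(name, installed_version(name))
```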

How to collect a list of latitude and longitude data for a location dataframe?

I recently started coding in Python. I'm importing a csv file containing around 848 unique locations across India, and would like to use the geopy module to add in latitude and longitude for each location. After importing the data, I used this code:
from geopy.geocoders import Nominatim
import pandas as pd
df = pd.read_csv("../project/LocationIndia.csv")
lat = []
long = []
for location in df["Location"]:
    # Initialize Nominatim API
    address = location + ', India'
    geolocator = Nominatim(user_agent="MyApp")
    location = geolocator.geocode(address)
    lat.append(location.latitude)
    long.append(location.longitude)
df['latitude']=latitude
df['longitude']=longitude
df
Output:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [14], in <cell line: 4>()
7 geolocator = Nominatim(user_agent="MyApp")
8 location = geolocator.geocode(address)
----> 9 lat.append(location.latitude)
10 long.append(location.longitude)
11 df['latitude']=latitude
AttributeError: 'NoneType' object has no attribute 'latitude'
Not sure why this is happening. If anyone can help me out, would be great, thanks!
This is the table btw:
Location
0 Kharghar
1 Sector-13 Kharghar
2 Sector 18 Kharghar
3 Sector 20 Kharghar
4 Sector 15 Kharghar
... ...
843 BTM Layout
844 Kuvempu Layout on Hennur Main Road
845 Marathahalli
846 Rajajinagar
847 RMV
I was expecting values to be inputted as columns in the dataframe.
geolocator.geocode() returns None if no results are found, so location.latitude fails for any address it cannot resolve. Check for that before reading the coordinates:
if location is not None:
    # your code here
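One way that check might look in full is sketched below. The helper geocode_all is hypothetical, and the stub geocoder exists only so the example runs without network access; with geopy you would create Nominatim(user_agent="MyApp") once, outside the loop, and pass its geocode method instead. Appending None for misses keeps both lists the same length as the dataframe.

```python
from types import SimpleNamespace


def geocode_all(addresses, geocode):
    """Return (lats, lons); entries are None where geocode() found nothing."""
    lats, lons = [], []
    for address in addresses:
        result = geocode(address)
        if result is not None:
            lats.append(result.latitude)
            lons.append(result.longitude)
        else:
            # Keep a placeholder so the lists stay aligned with the input rows.
            lats.append(None)
            lons.append(None)
    return lats, lons


# Stub geocoder with made-up coordinates, used here instead of a live service.
def fake_geocode(address):
    known = {"Kharghar, India": SimpleNamespace(latitude=19.0, longitude=73.0)}
    return known.get(address)


lats, lons = geocode_all(["Kharghar, India", "RMV, India"], fake_geocode)
print(lats, lons)
```

The results can then be assigned with df['latitude'] = lats and df['longitude'] = lons; note that the original snippet assigns the undefined names latitude and longitude rather than the lists lat and long.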

TypeError while formatting pandas.df.pct_change() output to percentage

I'm trying to calculate the daily returns of stock in percentage format from a CSV file by defining a function.
Here's my code:
def daily_ret(ticker):
    return f"{df[ticker].pct_change()*100:.2f}%"
When I call the function, I get this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-40-7122588f1289> in <module>()
----> 1 daily_ret('AAPL')
<ipython-input-39-7dd6285eb14d> in daily_ret(ticker)
1 def daily_ret(ticker):
----> 2 return f"{df[ticker].pct_change()*100:.2f}%"
TypeError: unsupported format string passed to Series.__format__
Where am I going wrong?
A format spec such as :.2f can't be applied to a whole Series inside an f-string; it has to be applied to each element. Use map or apply instead:
def daily_ret(ticker):
    return (df[ticker].pct_change() * 100).map("{:.2f}%".format)
or:
def daily_ret(ticker):
    return (df[ticker].pct_change() * 100).apply("{:.2f}%".format)
Example:
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': np.arange(1, 6)})
print(daily_ret('A'))
0       nan%
1    100.00%
2     50.00%
3     33.33%
4     25.00%
Name: A, dtype: object

Error on loading CSV using fread in pydatatable

I have a CSV containing about 600K observations, and I'm importing it using fread:
DT = dt.fread('C:\\Users\\myamulla\\Desktop\\proyectos_de_py\\7726_analysis\\datasets\\7726export_Jan_23.csv')
It throws this error:
--------------------------------------------------------------------------
IOError Traceback (most recent call last)
<ipython-input-3-01684fbecd91> in <module>
----> 1 dt.fread('C:\\Users\\myamulla\\Desktop\\proyectos_de_py\\7726_analysis\\datasets\\7726export_Jan_23.csv')
IOError: Too few fields on line 432815: expected 14 but found only 4 (with sep=','). Set fill=True to ignore this error. <<19731764,2021-01-23 23:30:15,2021-01-23 23:42:20,"Vote for David Borrero, your Republican in HD 105. Potestad betrayed Prez Trump. Borrero is for our values & POTUS Trump.>>
As the error message suggests, I passed the argument fill=True in the fread call:
DT = dt.fread('C:\\Users\\myamulla\\Desktop\\proyectos_de_py\\7726_analysis\\datasets\\7726export_Jan_23.csv',fill=True)
It executes, but DT is created empty.
How can I resolve this?

Unicode ImportError using graphlab with Enthought iPython

I am trying to use GraphLab Create with Enthought Canopy iPython but I'm getting an ImportError that seems to be related to unicode. The line is:
ImportError: /home/aaron/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/graphlab/cython/cy_ipc.so: undefined symbol: PyUnicodeUCS4_DecodeUTF8
and this is preceded by:
In [1]: import graphlab
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-4b66ad388e97> in <module>()
----> 1 import graphlab
/home/aaron/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/graphlab/__init__.py in <module>()
5 """
6
----> 7 import graphlab.connect.aws as aws
8
9 import graphlab.deploy
/home/aaron/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/graphlab/connect/aws/__init__.py in <module>()
3 This module defines classes and global functions for interacting with Amazon Web Services.
4 """
----> 5 from _ec2 import get_credentials, launch_EC2, list_instances, set_credentials, status, terminate_EC2
/home/aaron/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/graphlab/connect/aws/_ec2.py in <module>()
15
16 import graphlab.product_key
---> 17 import graphlab.connect.server as glserver
18 import graphlab.connect.main as glconnect
19 from graphlab.connect.main import __catch_and_log__
/home/aaron/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/graphlab/connect/server.py in <module>()
4 """
5
----> 6 from graphlab.cython.cy_ipc import PyCommClient as Client
7 from graphlab.cython.cy_ipc import get_public_secret_key_pair
8 from graphlab_util.config import DEFAULT_CONFIG as default_local_conf
The GraphLab forum http://forum.graphlab.com/discussion/84/importerror-undefined-symbol-pyunicodeucs4-decodeutf8 suggests that this is due to Enthought Python being compiled with 2-byte-wide unicode chars. Is there a way to get Enthought to use 4-byte chars since I can't recompile?
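To confirm which kind of build an interpreter is, sys.maxunicode can be checked: on Python 2 it is 65535 for a narrow (UCS2) build and 1114111 for a wide (UCS4) build. A minimal diagnostic sketch follows; the helper name unicode_build is hypothetical, and on Python 3.3 or later the result is always the wide value.

```python
import sys


def unicode_build(maxunicode=sys.maxunicode):
    """Classify an interpreter as a narrow or wide unicode build."""
    return "UCS4 (wide)" if maxunicode > 0xFFFF else "UCS2 (narrow)"


print(unicode_build())
```

A narrow result here would be consistent with the undefined PyUnicodeUCS4_DecodeUTF8 symbol, since the extension module expects a wide build.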
