KeyError: 0 when changing time format of data - python

I have a column of dates stored in the format "%d%m%Y", for example "15022016".
I need to convert them to "%Y-%m-%d", for example "2016-02-15".
The data frame has 911,462 rows, and the code is as below:
for i in range(0,911462):
    df['Date'][i]=datetime.datetime.strftime(datetime.datetime.strptime(df['Date'][i],"%d%m%Y"),"%Y-%m-%d")
Then I got the error below:
Traceback (most recent call last):
File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\indexes\base.py", line 2393, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5239)
File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5085)
File "pandas\_libs\hashtable_class_helper.pxi", line 1207, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20405)
File "pandas\_libs\hashtable_class_helper.pxi", line 1215, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20359)
KeyError: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<input>", line 2, in <module>
File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2062, in __getitem__
return self._getitem_column(key)
File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2074, in _getitem_column
result = result[key]
File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2062, in __getitem__
return self._getitem_column(key)
File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2069, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\generic.py", line 1534, in _get_item_cache
values = self._data.get(item)
File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\internals.py", line 3590, in get
loc = self.items.get_loc(item)
File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\indexes\base.py", line 2395, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5239)
File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5085)
File "pandas\_libs\hashtable_class_helper.pxi", line 1207, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20405)
File "pandas\_libs\hashtable_class_helper.pxi", line 1215, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20359)
KeyError: 0
I checked the raw data in Excel and it all looks fine, so there should be no problem with the raw data.
It's quite weird that the KeyError is 0. I have no idea what's wrong or how to deal with it.
Thanks for reading, and I look forward to your help! :)

You need pandas.to_datetime with parameter format:
df = pd.DataFrame({'Date':[15022016,15022016]})
print (df)
Date
0 15022016
1 15022016
df['Date'] = pd.to_datetime(df['Date'], format='%d%m%Y')
print (df)
Date
0 2016-02-15
1 2016-02-15
print (df['Date'].dtype)
datetime64[ns]
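As a side note on the original error, here is a minimal sketch of why `df['Date'][i]` can raise `KeyError: 0` and how a vectorized conversion avoids the loop. It assumes the dates are stored as strings and that the frame's index does not contain the label 0 (for example after filtering rows); both are assumptions, since the question does not show the index.

```python
import pandas as pd

# Hypothetical data: index labels start at 5, not 0 (e.g. after filtering rows).
df = pd.DataFrame({"Date": ["15022016", "01032016"]}, index=[5, 6])

# df["Date"][0] looks up the *label* 0, which does not exist in this index.
try:
    df["Date"][0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Vectorized conversion: parse once, then format back to strings if needed.
parsed = pd.to_datetime(df["Date"], format="%d%m%Y")
df["Date"] = parsed.dt.strftime("%Y-%m-%d")

print(label_lookup_failed)
print(df["Date"].tolist())
```

If the column was read as integers, leading zeros are lost (01032016 becomes 1032016); converting with `astype(str).str.zfill(8)` before parsing is one way to restore them.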

Related

Pandas read_csv() parses dates fine but can't index by date

This is strange.
Data (csv):
Date, Hr 1,Hr 2,Hr 3,..
20070701,1128,1072,1173,..
20070702,1131,1092,1287,..
Pretty vanilla use of pd.read_csv():
df = pd.read_csv(filename,
                 parse_dates=['Date'],
                 index_col=['Date'])
Date seems to parse fine into the index:
print(df.index[:2])
Output:
DatetimeIndex(['2007-07-01', '2007-07-02'], dtype='datetime64[ns]', name='Date', freq=None)
Now if I try to index a single day?
print(df['2007-7-1']) # or any variation on "2007-07-01" etc
Output:
Traceback (most recent call last):
File "/Users/mjw/opt/anaconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 2646, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '2007-7-1'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "my_file.py", line 108, in <module>
print(df['2007-7-1'])
File "/Users/mjw/opt/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py", line 2800, in __getitem__
indexer = self.columns.get_loc(key)
File "/Users/mjw/opt/anaconda3/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 2648, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '2007-7-1'
I've also tried to make sure the DatetimeIndex freq is set right
df = df.asfreq('d')
And I get the same junk.
But indexing by year and month works fine, or indexing by year-month-day after selecting a column:
print(df['2007-7']) # works
print(df['Hr 1']['2007-7-1']) # works
But this does not:
print(df['2007-7-1']['Hr 1'])
I can make a custom date parser but the point is that I shouldn't have to do that. "yyyymmdd" isn't exactly hard or unusual. Come on pandas.
Please and thank you!
Use .loc:
print(df.loc["2007-07-01"])
Prints:
Hr 1 1128
Hr 2 1072
Hr 3 1173
Name: 2007-07-01 00:00:00, dtype: int64
For just value of "Hr 2" column:
print(df.loc["2007-07-01", "Hr 2"])
Prints:
1072
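The difference between `df[...]` and `df.loc[...]` can be sketched on a small frame. This is only an illustration: the CSV text below is modelled on the rows in the question with the header spacing simplified, and the date column is parsed explicitly with `pd.to_datetime` rather than through `parse_dates`, which is a choice made here, not something the question does.

```python
import io
import pandas as pd

# Sample rows modelled on the question's CSV (header spacing simplified).
csv_text = (
    "Date,Hr 1,Hr 2,Hr 3\n"
    "20070701,1128,1072,1173\n"
    "20070702,1131,1092,1287\n"
)

# Read the date column as text and parse it explicitly.
df = pd.read_csv(io.StringIO(csv_text), dtype={"Date": str})
df["Date"] = pd.to_datetime(df["Date"], format="%Y%m%d")
df = df.set_index("Date")

# df["2007-07-01"] would search the column labels: 'Hr 1', 'Hr 2', 'Hr 3'.
is_column = "2007-07-01" in df.columns

# .loc selects by row label; a second argument picks the column.
row = df.loc["2007-07-01"]
value = df.loc["2007-07-01", "Hr 2"]
july = df.loc["2007-07"]  # partial string: every row in July 2007

print(is_column, value, len(july))
```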

Read csv; replace values and save on csv

I have a csv file and I want to modify the first column by removing all "-".
After that, I want to save the changes in that same first column.
import pandas as pd
clean_order = pd.read_csv('C:/Users/(...)/Page_Clean_test.csv', 'w+', delimiter=';', skiprows=0, low_memory=False)
clean_order.loc[clean_order['web_scraper_order'].fillna('').str.replace('-', ''), 'web_scraper_order']
clean_order.to_csv('C:/Users/(...)/Page_Clean_test.csv', index=False)
Error:
File "C:\Users\suiso\PycharmProjects\Teste_SA\venv\lib\site-packages\pandas\core\indexes\base.py", line 2889, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 97, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'web_scraper_order'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:/Users/suiso/PycharmProjects/Teste_SA/Clean Data/Dataframe_comments.py", line 21, in <module>
clean_order.loc[clean_order['web_scraper_order'].fillna('').str.replace('-', ''), 'web_scraper_order']
File "C:\Users\suiso\PycharmProjects\Teste_SA\venv\lib\site-packages\pandas\core\frame.py", line 2899, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Users\suiso\PycharmProjects\Teste_SA\venv\lib\site-packages\pandas\core\indexes\base.py", line 2891, in get_loc
raise KeyError(key) from err
KeyError: 'web_scraper_order'
Try changing:
clean_order.loc[clean_order['web_scraper_order'].fillna('').str.replace('-', ''), 'web_scraper_order']
To:
clean_order = clean_order[clean_order.loc['web_scraper_order'].fillna('').str.replace('-', '')]
You may use:
clean_order['web_scraper_order']=clean_order['web_scraper_order'].str.replace('-','')
clean_order.to_csv('filename.csv',index=False)
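A self-contained sketch of the read, clean and write round trip might look like the following. The file contents and the `comment` column are made up for illustration, and an in-memory buffer stands in for the real path. Printing the column names first is a cheap way to check that `web_scraper_order` was actually parsed as a column, which the `KeyError` in the question suggests it was not.

```python
import io
import pandas as pd

# Hypothetical semicolon-separated contents standing in for the real CSV.
csv_text = "web_scraper_order;comment\n1587-1;good\n1587-2;bad\n"

# dtype=str keeps the order IDs as text so the string methods apply.
clean_order = pd.read_csv(io.StringIO(csv_text), sep=";", dtype=str)

# Confirm the expected column exists before using it.
print(clean_order.columns.tolist())

# Assign the cleaned values back; regex=False treats '-' as a literal character.
clean_order["web_scraper_order"] = (
    clean_order["web_scraper_order"].str.replace("-", "", regex=False)
)

# Write to an in-memory buffer here; pass a file path to overwrite a real file.
out = io.StringIO()
clean_order.to_csv(out, sep=";", index=False)
print(out.getvalue())
```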

Python Pandas: Why can't I convert 'Time' to to_datetime? Will not recognize time

Time data looks like this: Time
20:15:00.0
20:16:00.0
20:17:00.0
20:18:00.0
20:19:00.0
20:20:00.0
20:21:00.0
20:22:00.0
20:23:00.0
20:24:00.0
(data: https://imgur.com/a/LQIjHGt)
Python recognizes these as:
Date object
**Time** **object**
Open float64
High float64
Low float64
Last float64
I've tried to import data like this:
hour
df = pd.read_csv('ES_1min_2012_vwap_va.txt', sep=",", nrows=1000, parse_dates=True);
df['Time'] = pd.to_datetime(df['Time'])
**ERROR**:
runfile('C:/Users/user/Desktop/Trading/Main/historical data/Index/ES/Intraday Volatility by VIX.py', wdir='C:/Users/user/Desktop/Trading/Main/historical data/Index/ES')
Traceback (most recent call last):
File "C:\Users\user\miniconda3\lib\site-packages\pandas\core\indexes\base.py", line 2646, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Time'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\user\Desktop\Trading\Main\historical data\Index\ES\Intraday Volatility by VIX.py", line 18, in <module>
df['Time'] = pd.to_datetime(df['Time'], errors='ignore')
File "C:\Users\user\miniconda3\lib\site-packages\pandas\core\frame.py", line 2800, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Users\user\miniconda3\lib\site-packages\pandas\core\indexes\base.py", line 2648, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Time'
I solved this error a month ago but have completely forgotten how. Please help.
I think there's a space in front of the column name (' Time'). You can use skipinitialspace=True:
df = pd.read_csv('test.csv', sep=',', nrows=1000, parse_dates=True, skipinitialspace=True)
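To see the effect, here is a small sketch using made-up file contents with a space after each comma, which is what the answer suspects but the question does not confirm. It also passes an explicit `format` to `pd.to_datetime`, an addition made here so the time strings are not guessed.

```python
import io
import pandas as pd

# Hypothetical file contents with a space after each comma.
csv_text = (
    "Date, Time, Open\n"
    "01/03/2012, 20:15:00.0, 1270.5\n"
    "01/03/2012, 20:16:00.0, 1270.75\n"
)

# Without skipinitialspace the column names keep their leading space.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw.columns.tolist())

# With skipinitialspace=True the space after each delimiter is dropped.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(df.columns.tolist())

# An explicit format avoids guessing; the date part defaults to 1900-01-01.
df["Time"] = pd.to_datetime(df["Time"], format="%H:%M:%S.%f")
print(df["Time"].dt.time.tolist())
```

If the file cannot be re-read, `df.columns = df.columns.str.strip()` removes the stray whitespace from the existing labels.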

During handling of the above exception, another exception occurred

Basically, what I'm trying to do is read a column from a CSV file into an array and then do calculations with that array. I have successfully read the data from the CSV file into 'rawSunlightData', but every time I try to select a value from 'rawSunlightData' I get the error [During handling of the above exception, another exception occurred]. I can print the whole of rawSunlightData, but I can't print individual values like rawSunlightData[0].
cleanSunlightData = []
rawSunlightData = pd.read_csv('Average daily sunlight per month.csv', header = None)
rawSunlightData = rawSunlightData.drop(rawSunlightData.columns[[0]], axis=1)
print(rawSunlightData[0])
i = 0
while i <= len(rawSunlightData):
    arrayDivider = []
    m = 0
    while m < 12:
        x = i + m
        print(x)
        m += 1
    i += 12
The error message is:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3078, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/kennethwong/Desktop/Singapore crop yield /Downloaded data/Data cleaner.py", line 67, in <module>
cleanSunlightData()
File "/Users/kennethwong/Desktop/Singapore crop yield /Downloaded data/Data cleaner.py", line 46, in cleanSunlightData
print(rawSunlightData[0])
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 2688, in __getitem__
return self._getitem_column(key)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 2695, in _getitem_column
return self._get_item_cache(key)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/generic.py", line 2489, in _get_item_cache
values = self._data.get(item)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3080, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0
It's okay guys, I found out why. I'm still new to coding, so I make mistakes. When you pull data from a CSV file and store it in a DataFrame, it is NOT an array! You have to convert it to an array first, for example by calling .to_records() on the DataFrame.
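For readers hitting the same `KeyError: 0`, a likely explanation (an assumption, since the file itself is not shown) is that `rawSunlightData[0]` looks for a column labelled 0, and that column was just dropped. The sketch below uses made-up CSV contents and shows positional access with `.iloc` and conversion with `to_numpy()`, which are alternatives to the `to_records()` route mentioned above.

```python
import io
import pandas as pd

# Hypothetical file contents: first column is a year, the rest are values.
csv_text = "2019,5.1,6.2,5.8\n2020,4.9,6.0,6.1\n"

raw = pd.read_csv(io.StringIO(csv_text), header=None)
raw = raw.drop(raw.columns[[0]], axis=1)

# The remaining column labels are 1, 2, 3, so raw[0] has nothing to find.
print(raw.columns.tolist())

first_column = raw.iloc[:, 0]  # by position: first remaining column
first_row = raw.iloc[0]        # by position: first row
values = raw.to_numpy()        # plain NumPy array, indexable as values[0][0]

print(first_column.tolist(), values[0].tolist())
```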

Pandas reading html table

import pandas as pd
import pandas_datareader.data as web
coins = pd.read_html('https://coinmarketcap.com/')
for name in coins[0][1][1:]:
    print(name)
Results in the error message below. When I print coins, I get the complete table, but when I try and get specific info it gives me this error message. I know this format works as I have copied it exactly from other exercises I have been learning from, and have just changed the website. Many thanks.
C:\Users\AppData\Local\Programs\Python\Python36-32\python.exe C:/Users/Desktop/python_work/crypto/crypto_corr.py
Traceback (most recent call last):
File "C:\Users\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\indexes\base.py", line 2525, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Desktop/python_work/crypto/crypto_corr.py", line 6, in <module>
for name in coins[0][1][1:]:
File "C:\Users\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2139, in __getitem__
return self._getitem_column(key)
File "C:\Users\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2146, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\generic.py", line 1842, in _get_item_cache
values = self._data.get(item)
File "C:\Users\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\internals.py", line 3843, in get
loc = self.items.get_loc(item)
File "C:\Users\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\indexes\base.py", line 2527, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 1
Process finished with exit code 1
If df is a dataframe, indexing like df[column] looks for columns called column. In your case, coins[0] is a dataframe, which does not have a column 1. However, it does have a column Name, so to print all names do the following:
df = coins[0]
for name in df['Name']:
    print(name)
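When it is unclear which labels a scraped table ended up with, printing `df.columns` first helps. The sketch below uses a hand-built frame as a hypothetical stand-in for `coins[0]` (the real table's columns and values may differ) and contrasts selection by label with selection by position.

```python
import pandas as pd

# Hypothetical stand-in for coins[0]; the real table's columns may differ.
df = pd.DataFrame(
    {"#": [1, 2], "Name": ["Bitcoin", "Ethereum"], "Price": [1.0, 2.0]}
)

# Inspect the labels first: df[1] fails because no column is labelled 1.
print(df.columns.tolist())

names_by_label = df["Name"].tolist()        # select by column label
names_by_position = df.iloc[:, 1].tolist()  # select the second column by position

print(names_by_label)
```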
