I was using a CSV, removed a few dates, and resaved it. Suddenly the code that had worked before stopped working. I tried pointing the code at the old CSV and got the same error, even though it had run fine before.
Here is what I tried:
import numpy as np
import pandas as pd
Q = pd.read_csv("Data_V3.csv")
Q['date'] = pd.to_datetime(Q['date'], format='%m-%d-%Y')
The CSV looks like this:
date: flow:
1/1/1930 1300
I also tried:
Q['date'] = pd.to_datetime(Q['date'], format='%Y-%m-%d')
The full error is: "ValueError: time data '1/1/1930' does not match format '%m-%d-%Y' (match)"
Thank you!
You need to use / instead of - in your format string so that it matches the dates in the file.
Q['date'] = pd.to_datetime(Q['date'], format='%m/%d/%Y')
The error message contains information that is easy to skip over:
"ValueError: time data '1/1/1930' does not match format '%m-%d-%Y' (match)"
It shows the exact value that broke your code, followed by the format you supplied. The only difference between the two is that the data uses / where the format uses -.
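To make the separator mismatch concrete, here is a minimal sketch. It assumes a comma-separated file with the same two columns as in the question; the inline CSV text is a made-up stand-in for Data_V3.csv, so adjust the delimiter and column names to the real file.

```python
import io

import pandas as pd

# Made-up stand-in for Data_V3.csv (assumed comma-separated)
csv_text = "date,flow\n1/1/1930,1300\n1/2/1930,1250\n"
Q = pd.read_csv(io.StringIO(csv_text))

# The separators in the format string must match the ones in the data:
# the file uses slashes, so the format uses slashes too.
Q["date"] = pd.to_datetime(Q["date"], format="%m/%d/%Y")
```

Swapping the format back to '%m-%d-%Y' should reproduce the ValueError from the question.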
Did you try changing your .csv to:
date: flow:
01/01/1930 1300
In my Databricks notebook, I am getting a ParseException on the last line of the code below when converting a string to the Date data type. The hiring_date column in the csv file is correctly in a date format.
Question: What might I be doing wrong here, and how can I fix the error?
Remark: I am using Python and NOT Scala. I do not know Scala.
from pyspark.sql.functions import *
df = spark.read.csv(".../Test/MyFile.csv", header="true", inferSchema="true")
df2 = df.withColumn("hiring_date",df["hiring_date"].cast('DateType'))
If the error is on the last line of your code, then according to the documentation that line should be modified as follows:
from pyspark.sql.types import DateType
df2 = df.withColumn("hiring_date", df.hiring_date.cast(DateType()))
It seems you passed an invalid value to the cast function.
The following code would work as well:
df2 = df.withColumn("hiring_date", df["hiring_date"].cast('Date'))
I am trying to read a date column from a csv file. This column contains dates in just one format. Please see data below:
The problem arises when I try to read it using a date parser.
dateparse=lambda x:datetime.strptime(x, '%m/%d/%Y').date()
df = pd.read_csv('products.csv', parse_dates=['DateOfRun'], date_parser=dateparse)
The logic above works fine in most cases, but sometimes, seemingly at random, I get an error saying that the format does not match. Example below:
ValueError: time data '2020-02-23' does not match format '%m/%d/%Y'
Does anyone know how this is possible? That yyyy-mm-dd format is not in my data. Any tips would be useful.
Thanks
The problem happens when you open the csv file in Excel. By default, and depending on your OS settings, Excel automatically changes the date format. For instance, in the USA the default format is MM/DD/YYYY, so if a csv file contains a date such as YYYY-MM-DD, Excel will automatically change it to MM/DD/YYYY.
The solution is to NOT open the csv file in Excel before manipulating it in Python. If you must open it to inspect it, look at it in Python, in Notepad, or in some other text editor.
I always assume that dates are going to be messed up because someone might have opened the file in Excel, so I test for the proper format and convert the value if the test fails.
As an example, if you want to change dates from YYYY-MM-DD to MM/DD/YYYY, try this:
from datetime import datetime

def change_dates(date_string):
    try:
        # Already MM/DD/YYYY: strptime raises ValueError if it is not
        assert datetime.strptime(date_string, '%m/%d/%Y'), 'format error'
        return date_string
    except (AssertionError, ValueError):
        # Otherwise assume YYYY-MM-DD and convert it to MM/DD/YYYY
        dt = datetime.strptime(date_string, '%Y-%m-%d')
        return dt.strftime('%m/%d/%Y')
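The same try-one-format-then-the-next idea can be generalized to any number of formats. The sketch below uses only the standard library; the helper name parse_date and the KNOWN_FORMATS tuple are hypothetical, and the two formats listed are simply the ones mentioned in this thread.

```python
from datetime import date, datetime

# Formats assumed for illustration; list whichever ones your files may contain.
KNOWN_FORMATS = ("%m/%d/%Y", "%Y-%m-%d")


def parse_date(date_string, formats=KNOWN_FORMATS):
    """Return a date parsed with the first format that matches."""
    for fmt in formats:
        try:
            return datetime.strptime(date_string, fmt).date()
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"no known format matches {date_string!r}")
```

A function like this could be passed to read_csv through its converters argument, for example converters={'DateOfRun': parse_date}, so that each raw string is parsed regardless of which of the listed formats it arrived in.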
I use:
Python 3.7
SAS v7.1 Enterprise
I want to export some data (from a library) from SAS to CSV. After that, I want to import this CSV into a Pandas DataFrame and use it.
I have a problem when I export the data from SAS with this code:
proc export data=LIB.NAME
outfile='path\to\export\file.csv'
dbms=csv
replace;
run;
Every column was exported correctly except the date column. In SAS I see something like:
06NOV2018
16APR2018
and so on. In the CSV it looks the same. But when I import this CSV into a DataFrame, Python unfortunately sees the date column as object/string instead of a date type.
So here is my question: how can I export a whole library from SAS to CSV with the correct column types (especially the date column)? Maybe I should convert something before exporting? Please help me with this. I'm new to SAS; I just want to import data from it and use it in Python.
Before you answer, keep in mind that I have tried the pandas read_sas function, but this command raised the following exception:
df1 = pd.read_sas(path)
ValueError: Unexpected non-zero end_of_first_byte Exception ignored
in: 'pandas.io.sas._sas.Parser.process_byte_array_with_data' Traceback
(most recent call last): File "pandas\io\sas\sas.pyx", line 31, in
pandas.io.sas._sas.rle_decompress
I added the fillna function and got the same error :/
df = pd.DataFrame.fillna((pd.read_sas(path)), value="")
I tried the sas7bdat module in Python, but I got the same error.
Then I tried the sas7bdat_converter module. But the CSV has the same values in the date column, so the dtype problem will come back once the csv is converted to a DataFrame.
Do you have any suggestions? I've spent 2 days trying to figure this out, without any positive results :/
Regarding the read_sas error, a GitHub issue has been reported but was closed for lack of a reproducible example. However, I can easily import SAS data files with Pandas using .sas7bdat files generated from SAS 9.4 base (possibly the v7.1 Enterprise is the issue).
However, consider using the parse_dates argument of read_csv, as it can convert your DDMMMYYYY dates to datetime during import. No change is needed to your SAS exported dataset.
sas_df = pd.read_csv(r"path\to\export\file.csv", parse_dates = ['DATE_COLUMN'])
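If automatic parsing does not pick up the format, spelling it out explicitly is an alternative. This is a minimal sketch, assuming the dates always look like 06NOV2018 (day, English month abbreviation, four-digit year); the inline CSV text and the ID column are made-up stand-ins for the exported file, and DATE_COLUMN is the placeholder name used above.

```python
import io

import pandas as pd

# Made-up stand-in for the file exported by proc export
csv_text = "ID,DATE_COLUMN\n1,06NOV2018\n2,16APR2018\n"
sas_df = pd.read_csv(io.StringIO(csv_text))

# %d = day, %b = abbreviated month name, %Y = four-digit year
sas_df["DATE_COLUMN"] = pd.to_datetime(sas_df["DATE_COLUMN"], format="%d%b%Y")
```

Note that %b depends on the locale, so this assumes English month abbreviations on the machine doing the parsing.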
I'm trying to stream live quotes using the IexFinance API. Keep in mind this is my first coding attempt. I've managed to get the stock quote prices through Python, but I'm unsure how I would then get that data into Excel.
From my understanding, I would need to get this data into a csv file in order to export it to Excel. I've tried adding the code df.to_csv('stock.csv'), but I get the error 'StockReader' object has no attribute 'to_csv'
import pandas as pd
from iexfinance.stocks import Stock
batch=Stock(['amd', 'tsla'], output_format='pandas')
batch.get_price
df.to_csv('stock.csv')
General pointer: it looks like you need to assign the result to a df variable first.
This line:
Stock(['amd', 'tsla'], output_format='pandas')
According to the documentation, this returns a DataFrame.
So:
df = Stock(['amd', 'tsla'], output_format='pandas')
Or, as you have now discovered, call the method (note the parentheses) and assign its result:
df = batch.get_price()
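The underlying point is that to_csv is a method of a pandas DataFrame, so it only works once df really is one. The sketch below shows that step in isolation; the DataFrame is hand-built with made-up placeholder prices, since the actual shape of what iexfinance returns is not shown in this thread, and it writes to an in-memory buffer instead of stock.csv.

```python
import io

import pandas as pd

# Hand-built stand-in for a price lookup result; the numbers are placeholders
df = pd.DataFrame({"price": [1.0, 2.0]}, index=["AMD", "TSLA"])

# to_csv accepts a file path such as 'stock.csv' or any writable buffer
buffer = io.StringIO()
df.to_csv(buffer)
csv_text = buffer.getvalue()
```

Replacing the buffer with 'stock.csv' writes a file that Excel can open directly.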
So I have a bit of code in Python which tries to get home prices from Zillow. I am following the documentation exactly, but I still get errors. The code:
import quandl
quandl.ApiConfig.api_key = "I have a key here in the code"
data = quandl.get("http://www.quandl.com/api/v3/datasets/ZILL/S00022_A.csv", returns="numpy")
This, however, returns:
raise ValueError(Message.ERROR_COLUMN_INDEX_TYPE % dataset)
ValueError: The column index must be expressed as an integer for http://www.quandl.com/api/v3/datasets/ZILL/S00022_A.csv.
What does this mean and how do I fix it? Thanks in advance.
quandl.get() expects a dataset code, not a URL. So pass the dataset code directly in your code, for example:
quandl.get('WIKI/GOOGL')
Here, I have requested a dataset of Google stock prices.
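If all you have is the URL, the dataset code can be read off its path. The helper below is a hypothetical sketch using only the standard library; it assumes the URL follows the .../datasets/DATABASE/DATASET.csv layout seen in the question and does not contact Quandl.

```python
from urllib.parse import urlparse


def dataset_code_from_url(url):
    """Return 'DATABASE/DATASET' from a dataset URL (hypothetical helper)."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expect a path like api/v3/datasets/<DATABASE>/<DATASET>.csv
    idx = parts.index("datasets")
    database, dataset = parts[idx + 1], parts[idx + 2]
    # Drop a trailing file extension such as .csv or .json
    dataset = dataset.rsplit(".", 1)[0]
    return f"{database}/{dataset}"


code = dataset_code_from_url(
    "http://www.quandl.com/api/v3/datasets/ZILL/S00022_A.csv"
)
# The resulting code could then be passed on, e.g. quandl.get(code, returns="numpy")
```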