I am trying to parse a JSON object with the following code in Python 3.
import json
str = '{"created_at":"Sun Aug 30 13:59:15 +0000 2015","id":637987951842951168,"id_str":"637987951842951168","text":"The Truth About the Iran Vatican False Prophet Anglo-American Western Alliance for Antichrist Israel: Palestin... http:\/\/t.co\/G79X164K9g","source":"\u003ca href=\"http:\/\/twitterfeed.com\" rel=\"nofollow\"\u003etwitterfeed\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":311859117,"id_str":"311859117","name":"Miko Furura","screen_name":"MikoFurura","location":"","url":null,"description":null,"protected":false,"verified":false,"followers_count":10,"friends_count":3,"listed_count":2,"favourites_count":4,"statuses_count":1264,"created_at":"Mon Jun 06 05:32:44 +0000 2011","utc_offset":32400,"time_zone":"Osaka","geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"EBEBEB","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme7\/bg.gif","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme7\/bg.gif","profile_background_tile":false,"profile_link_color":"990000","profile_sidebar_border_color":"DFDFDF","profile_sidebar_fill_color":"F3F3F3","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/abs.twimg.com\/sticky\/default_profile_images\/default_profile_3_normal.png","profile_image_url_https":"https:\/\/abs.twimg.com\/sticky\/default_profile_images\/default_profile_3_normal.png","default_profile":false,"default_profile_image":true,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"trends":[],"urls":[{"url":"http:\/\/t.co\/G79X164K9g","expanded_url":"http:\/\/bit.ly\/1KvlIEu","display_url":"bit.ly\/1KvlIEu","indices":[114
,136]}],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en","timestamp_ms":"1440943155619"}'
c = json.loads(str)
print(c['id'])
When I execute the script, I get an error:
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 270 (char 269)
I have parsed many json objects with this code and can't understand what is wrong with it now, or what is wrong with this particular json object.
Regards.
The solution is to put r in front of your string:
str = r'{"created_at":"Sun Aug 30 13:59:15 ...}'
This makes Python treat your str variable as a raw string, so the backslashes inside the JSON text reach json.loads unchanged instead of being consumed as Python escape sequences.
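To see the difference in isolation, here is a minimal sketch. The payload is a shortened, hypothetical stand-in for the tweet in the question (the URL and field value are made up), and the variable names are only for illustration.

```python
import json

# Regular literal: Python turns each \" into a bare " before json sees it,
# so the JSON string value for "source" ends right after href=
plain = '{"source": "<a href=\"http://example.com\">feed</a>"}'

# Raw literal: the backslashes are kept, so json sees \" and treats it
# as an escaped quote inside the string value.
raw = r'{"source": "<a href=\"http://example.com\">feed</a>"}'

try:
    json.loads(plain)
    plain_ok = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    plain_ok = False

parsed = json.loads(raw)

print(plain_ok)          # False
print(parsed["source"])  # <a href="http://example.com">feed</a>
```

This only matters for JSON pasted into source code as a literal. Text read from a file or a network response is not processed for Python escapes, so it can be passed to json.loads directly.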
Alternatively, in this part you could remove the escaped double quotes (\") from the HTML.
"source":"\u003ca href=\"http:\/\/twitterfeed.com\" rel=\"nofollow\"\u003etwitterfeed\u003c\/a\u003e"
to
"source":"\u003ca href=http:\/\/twitterfeed.com rel=nofollow\u003etwitterfeed\u003c\/a\u003e"
In a regular (non-raw) Python string each \" becomes a bare ", which ends the JSON string value early and causes the delimiter error. The HTML is still fine without double quotes around these attribute values.
Try putting r before the string in str. I just tried it and it worked for me. Check out Lexical Analysis for more info.
str = r'{"created_at":"Sun Aug 30 13:59:15 +0000 2015","id":637987951842951168,"id_str":"637987951842951168","text":"The Truth About the Iran Vatican False Prophet Anglo-American Western Alliance for Antichrist Israel: Palestin... http:\/\/t.co\/G79X164K9g","source":"\u003ca href=\"http:\/\/twitterfeed.com\" rel=\"nofollow\"\u003etwitterfeed\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":311859117,"id_str":"311859117","name":"Miko Furura","screen_name":"MikoFurura","location":"","url":null,"description":null,"protected":false,"verified":false,"followers_count":10,"friends_count":3,"listed_count":2,"favourites_count":4,"statuses_count":1264,"created_at":"Mon Jun 06 05:32:44 +0000 2011","utc_offset":32400,"time_zone":"Osaka","geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"EBEBEB","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme7\/bg.gif","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme7\/bg.gif","profile_background_tile":false,"profile_link_color":"990000","profile_sidebar_border_color":"DFDFDF","profile_sidebar_fill_color":"F3F3F3","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/abs.twimg.com\/sticky\/default_profile_images\/default_profile_3_normal.png","profile_image_url_https":"https:\/\/abs.twimg.com\/sticky\/default_profile_images\/default_profile_3_normal.png","default_profile":false,"default_profile_image":true,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"trends":[],"urls":[{"url":"http:\/\/t.co\/G79X164K9g","expanded_url":"http:\/\/bit.ly\/1KvlIEu","display_url":"bit.ly\/1KvlIEu","indices":[11
4,136]}],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en","timestamp_ms":"1440943155619"}'
I have a time String like this:
07/01/2015-14:31:58.520
I use this command line to convert it:
import time
timeStr = "07/01/2015-14:31:58.520"
time.strptime(timeStr,'%d/%m/%y-%H:%M:%S.%f')
But this returns:
ValueError: time data '07/01/2015-14:31:58.520' does not match format
'%d/%m/%y-%H:%M:S.%f'
My python version is 2.7.7
%y denotes a 2 digit year, but your string has a 4 digit year. Use %Y (capital Y) to denote a 4 digit year. See the docs for more information.
time.strptime(timeStr, '%d/%m/%Y-%H:%M:%S.%f')
Note that datetime.datetime.strptime may be more useful, as it returns a full datetime object rather than a time.struct_time tuple, which has no field for the fractional seconds. The format syntax is essentially the same.
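As a small sketch of the datetime variant, using the string from the question (the variable names here are illustrative):

```python
from datetime import datetime

time_str = "07/01/2015-14:31:58.520"

# %Y matches the 4-digit year; %f accepts 1 to 6 digits of fractional seconds.
parsed = datetime.strptime(time_str, "%d/%m/%Y-%H:%M:%S.%f")

print(parsed.isoformat())   # 2015-01-07T14:31:58.520000
print(parsed.microsecond)   # 520000
```

Because the format is %d/%m, the result is 7 January 2015, not 1 July.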
It should have been capital Y for year (%Y in place of %y)
time.strptime(timeStr,'%d/%m/%Y-%H:%M:%S.%f')
You need to use %Y instead of %y
time.strptime(timeStr,'%d/%m/%Y-%H:%M:%S.%f')
To get a datetime object, use python-dateutil
To install
pip install python-dateutil
Then
>>> from dateutil import parser
>>> t = "07/01/2015-14:31:58.520"
>>> parser.parse(t)
datetime.datetime(2015, 7, 1, 14, 31, 58, 520000)
>>> tim = parser.parse(t)
>>> str(tim.date())
'2015-07-01'
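All operations on datetime objects are then possible. Note that by default dateutil reads 07/01 as month first (1 July); if the string is day first, as the %d/%m format in the question implies, pass dayfirst=True to parser.parse.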
The time.strptime format %d/%m/%y-%H:%M:%S.%f is incorrect; it should be
"%d/%m/%Y-%H:%M:%S.%f"
where the only difference is that %y has become %Y. According to the docs, %y is the year without the century ([00,99]), whereas %Y is the year with the century, which is what your string contains ("2015").
Tested and functional in Python 2.7.5 and 3.4.1.
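A short sketch of the %y versus %Y distinction, formatting one date both ways and then showing that parsing rejects the mismatched directive (the date and names are chosen only for illustration):

```python
from datetime import datetime

moment = datetime(2015, 1, 7, 14, 31, 58)

two_digit = moment.strftime("%d/%m/%y")   # year without century
four_digit = moment.strftime("%d/%m/%Y")  # year with century

print(two_digit)   # 07/01/15
print(four_digit)  # 07/01/2015

# Parsing must use the directive that matches the text: %y consumes only
# two digits, so a 4-digit year leaves unconverted data behind.
try:
    datetime.strptime(four_digit, "%d/%m/%y")
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(mismatch_rejected)  # True
```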
Edit: Zero answers when I started typing this, 6 answers by time of post, sorry about that!
Edit #2: datetime.strptime functions similarly, so if you want to use that as well, you can!
There is a datetime string that I would like to convert back into a date. The time zone is giving me trouble and I don't know how to solve it.
datetime.datetime.strptime(json_event['date_time'], '%a, %d %b %Y %H:%M:%S %Z')
I get the error message:
ValueError: time data 'Tue, 08 Apr 2014 17:57:34 -0000' does not match
format '%a, %d %b %Y %H:%M:%S %Z'
If I leave %Z out, I get this error message:
ValueError: unconverted data remains: -0000
The date was originally created in UTC:
current_date = datetime.datetime.utcnow()
UPDATE:
I would like to solve this natively without any external libraries such as dateutil.parser, hence the solution in the duplicate doesn't help me.
import dateutil.parser
date = dateutil.parser.parse(json_event['date_time'])
If you don't have dateutil, get it.
pip install python-dateutil
If you are always getting UTC times: ignore the last 6 characters (space, sign, 4 digits) and then convert to datetime as you've done, without the %Z.
One issue you'll have is that the resulting datetime is naive, so your system will assume it is in your local timezone, and converting it to any other timezone will give the wrong result. In that case, the next step is to use this answer from another question.
If you get non-UTC times as well:
1. Crop out the last 6 chars.
2. Do the strptime on the last 4 digits, with the format HHMM (%H%M) --> Y
3. Get the sign, to be reversed in step 5 below.
4. Then get the rest of the datetime as you have above (leaving out those last 6 chars and with no %Z in the format) --> X
5. Then X-Y (or X+Y, inverting the sign from step 3) will give you a datetime object. Then follow the steps in the linked answer to make the datetime object timezone aware.
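The steps above can be sketched with the standard library only. This is one possible reading of them, not a definitive implementation: parse_with_offset is a hypothetical helper name, the offset is built as a timedelta rather than parsed with strptime, and the sketch assumes the input always ends in a space, a sign and four digits, with English day and month names.

```python
from datetime import datetime, timedelta


def parse_with_offset(text):
    """Return a naive datetime in UTC for text like
    'Tue, 08 Apr 2014 17:57:34 -0000' (hypothetical helper)."""
    # Steps 1 and 4: split off the trailing ' +HHMM' / ' -HHMM'.
    stamp, offset = text[:-6], text[-5:]
    local = datetime.strptime(stamp, "%a, %d %b %Y %H:%M:%S")

    # Steps 2 and 3: read the sign and the HHMM digits.
    sign = -1 if offset[0] == "-" else 1
    delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))

    # Step 5: a positive offset means local time is ahead of UTC,
    # so it is subtracted; a negative offset is added.
    return local - sign * delta


print(parse_with_offset("Tue, 08 Apr 2014 17:57:34 -0000"))  # 2014-04-08 17:57:34
print(parse_with_offset("Tue, 08 Apr 2014 17:57:34 +0530"))  # 2014-04-08 12:27:34
```

The result is still naive; making it timezone aware is a separate step, as the answer notes.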
In Python, how can I convert a string like this:
Thu, 16 Dec 2010 12:14:05 +0000
to ISO 8601 format, while keeping the timezone?
Please note that the original date is a string, and the output should be a string too, not a datetime or something like that.
I have no problem using third-party libraries, though.
Using dateutil:
import dateutil.parser as parser
text = 'Thu, 16 Dec 2010 12:14:05 +0000'
date = parser.parse(text)
print(date.isoformat())
# 2010-12-16T12:14:05+00:00
Python's built-in datetime module has a method to convert a datetime object to ISO format. Here is an example:
>>> from datetime import datetime
>>> date = datetime.strptime('Thu, 16 Dec 2010 12:14:05', '%a, %d %b %Y %H:%M:%S')
>>> date.isoformat()
output is
'2010-12-16T12:14:05'
I wrote this answer primarily for people who work in UTC and don't need to worry about time zones. You can strip off the last 6 characters of the input to get that string.
Python 2 doesn't have very good standard-library support for time zones; for more details and a solution, you can refer to this answer on Stack Overflow, which mentions using third-party libraries similar to the accepted answer.
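On Python 3.3 or later (an assumption; it does not apply to Python 2), the standard library can keep the offset without third-party packages. A minimal sketch using email.utils.parsedate_to_datetime, which parses this RFC 2822 style of date:

```python
from email.utils import parsedate_to_datetime

text = "Thu, 16 Dec 2010 12:14:05 +0000"

# The result is a timezone-aware datetime, so isoformat() keeps the offset.
iso_text = parsedate_to_datetime(text).isoformat()

print(iso_text)  # 2010-12-16T12:14:05+00:00
```

Both the input and the output are plain strings, as the question asks.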