Decoding JSON file from Twitter in Python using simplejson - python

A small part of my JSON file looks like the following. It passed a JSON validator. (I added cl
{
"next_page": "?page=2&max_id=210389654296469504&q=cocaine&geocode=40.665572%2C-73.923557%2C10mi&rpp=100",
"completed_in": 0.289,
"max_id_str": "210389654296469504",
"since_id_str": "0",
"refresh_url": "?since_id=210389654296469504&q=cocaine&geocode=40.665572%2C-73.923557%2C10mi",
"results": [
{
"iso_language_code": "en",
"to_user_id": 486935435,
"to_user_id_str": "486935435",
"profile_image_url_https": "https://si0.twimg.com/profile_images/1561856049/Zak_W_Photo_normal.jpg",
"from_user_id_str": "82389940",
"text": "#Bill__Murray cocaine > productivity! Last night I solved the euro crisis and designed a new cat. If I could only find that napkin.",
"from_user_name": "Zak Williams",
"in_reply_to_status_id_str": "210319741322133504",
"profile_image_url": "http://a0.twimg.com/profile_images/1561856049/Zak_W_Photo_normal.jpg",
"id": 210389654296469500,
"to_user": "Bill__Murray",
"source": "<a href="http://twitter.com/#!/download/iphone" rel="nofollow">Twitter for iPhone</a>",
"in_reply_to_status_id": 210319741322133500,
"to_user_name": "Bill Murray",
"location": "Brooklyn",
"from_user": "zakwilliams",
"from_user_id": 82389940,
"metadata": {
"result_type": "recent"
},
"geo": "null",
"created_at": "Wed, 06 Jun 2012 15:16:17 +0000",
"id_str": "210389654296469504"
}
]
}
When I try to load this in Python by typing the following code I get the following error.
Code
import simplejson as json
testname = 'test.txt'
record = json.loads(testname)
Error
raise JSONDecodeError("No JSON object could be decoded", s, idx)
simplejson.decoder.JSONDecodeError: No JSON object could be decoded: line 1 column 0 (char 0)
What am I doing wrong? In fact, I generated the file by using simplejson.dump

The json.loads() function loads JSON data from a string, and you are just giving it a file name. The string test.txt is not a valid JSON string. Try the following to load JSON data from a file:
with open(testname) as f:
    record = json.load(f)
(If you're using an old version of Python that doesn't support the with statement (as possibly indicated by your use of the old simplejson), then you'll have to open and close the file yourself.)
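A minimal sketch of the difference, assuming the standard-library json module (simplejson exposes the same load and loads functions) and a throwaway temporary file standing in for test.txt:

```python
import json
import os
import tempfile

# Write a small JSON document to a temporary file (a stand-in for test.txt).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    json.dump({"max_id_str": "210389654296469504", "results": []}, tmp)
    testname = tmp.name

# json.loads() parses the string it is given; a file name is not JSON.
try:
    json.loads(testname)
    loads_failed = False
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    loads_failed = True

# json.load() reads from an open file object instead.
with open(testname) as f:
    record = json.load(f)

os.remove(testname)  # clean up the temporary file
print(loads_failed, record["max_id_str"])
```

The only change needed in the question's code is to open the file and hand the file object to load, rather than handing the name to loads.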

Related

JSON.parse cause unexpected end of json input but my json is correct

I have a Python script that creates a JSON file and a Node.js script that reads it:
python script
with open("music.json", "w") as write_file:
    json.dump(music_found, write_file, indent=4)
music_found is an array of objects.
nodejs
import fs from 'fs'
import { cwd } from 'process';
let rawdata = fs.readFileSync(`${cwd()}/music.json`);
let music = JSON.parse(rawdata)
console.log(music);
I get the message "Unexpected end of JSON input".
json example
[
{
"user": "some_user1",
"file": "##enlbq\\_Music\\Infected Mushroom\\Return to the Sauce [2017] [HMCD94]\\09 - Infected Mushroom - Liquid Smoke.flac",
"size": 42084572,
"slots": true,
"speed": 1003176
},
{
"user": "some_user2",
"file": "##xfjpb\\Musiikkia\\Infected Mushroom\\Return to the Sauce\\09 Infected Mushroom - Liquid Smoke.flac",
"size": 24617421,
"slots": true,
"speed": 541950
},
{
"user": "some_user3",
"file": "##rxjpv\\MyMusic\\Infected Mushroom\\Infected Mushroom - Return To The Sauce (2017) [CD FLAC]\\09 - Liquid Smoke.flac",
"size": 41769608,
"slots": true,
"speed": 451671
}
]
My JSON is well formatted, isn't it? I've been on this for 4 hours and I'm stuck... very annoying ^^ I hope someone can help.
readFileSync
If the encoding option is specified then this function returns a string. Otherwise it returns a buffer.
You need to either set the encoding or use the buffer's .toString() method.
Can you make sure the path is correct?
I can read your data normally in JS, so I think the problem is just the path.

how to read json object in python [duplicate]

This question already has answers here:
can't read json file with python. getting type error: json object is 'TextIOWrapper'
(3 answers)
Closed 5 years ago.
I have a JSON file named "panamaleaks50k.json". I want to get the ['text'] field from the file, but I get the following error:
the JSON object must be str, bytes or bytearray, not 'TextIOWrapper'
This is my code:
with open('C:/Users/bilal butt/Desktop/PanamalEakJson.json','r') as lst:
    b = json.loads(lst)
print(b['text'])
My JSON file looks like this:
[
{
"fullname": "Mohammad Fayyaz",
"id": "885800668862263296",
"likes": "0",
"replies": "0",
"retweets": "0",
"text": "Love of NS has been shown in PanamaLeaks scandal verified by JIT...",
"timestamp": "2017-07-14T09:58:31",
"url": "/mohammadfayyaz/status/885800668862263296",
"user": "mohammadfayyaz"
},
{
"fullname": "TeamPakistanPTI \u00ae",
"id": "885800910357749761",
"likes": "0",
"replies": "0",
"retweets": "0",
"text": "RT ArsalanISF: #PanamaLeaks is just a start. U won't believe whr...",
"timestamp": "2017-07-14T09:59:29",
"url": "/PtiTeampakistan/status/885800910357749761",
"user": "PtiTeampakistan"
}
]
How can I read all the ['text'] fields, or just a single ['text'] field?
You should pass the file contents (i.e. a string) to json.loads(), not the file object itself. Try this:
with open(file_path) as f:
    data = json.loads(f.read())
print(data[0]['text'])
There's also the json.load() function which accepts a file object and does the f.read() part for you under the hood.
Use json.load(), not json.loads(), if your input is a file-like object (such as a TextIOWrapper).
Given the following complete reproducer:
import json, tempfile
with tempfile.NamedTemporaryFile() as f:
    f.write(b'{"text": "success"}'); f.flush()
    with open(f.name,'r') as lst:
        b = json.load(lst)
    print(b['text'])
...the output is success.
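The file in the question holds a list of records rather than a single object, so a field is reached by position first and by key second. A small sketch of both lookups the question asks about, using two shortened, made-up records in place of the real file:

```python
import json

# Two records shaped like the question's sample (text values are placeholders).
raw = '''[
  {"id": "885800668862263296", "text": "first tweet", "user": "mohammadfayyaz"},
  {"id": "885800910357749761", "text": "second tweet", "user": "PtiTeampakistan"}
]'''

data = json.loads(raw)  # a list of dicts

first_text = data[0]["text"]                 # the text of a single record
all_texts = [item["text"] for item in data]  # the text of every record

print(first_text)
print(all_texts)
```

With a real file, data would come from json.load(f) on the open file object instead of json.loads(raw).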

How to read JSON objects from Tweet.py results

I am trying to read the JSON file created by Tweet.py. However, whatever I try, I consistently get a ValueError.
ValueError: Expecting property name: line 1 column 3 (char 2)
JSON results are in the format of:
{ 'Twitter Data' : [ {
"contributors": null,
"coordinates": null,
"created_at": "Tue Oct 24 15:55:21 +0000 2017",
"entities": {
"hashtags": ["#football"]
}
} , {
"contributors": johnny,
"coordinates": null,
"created_at": "Tue Oct 24 15:55:21 +0000 2017",
"entities": {
"hashtags": ["#football" , "#FCB"]
}
} , ... ] }
There are at least 50 of these JSON objects in the file, which are separated by commas.
My Python script to read this json file is:
twitter_data=[]
with open('#account.json' , 'r') as json_data:
    for line in json_data:
        twitter_data.append(json.loads(line))
print twitter_data
Tweet.py writes these Json objects by using:
json.dump(status._json,file,sort_keys = True,indent = 4)
I would appreciate any help and guidance on how to read this file!
Thank you.
The { 'Twitter Data' bit should be { "Twitter Data", and johnny should be "johnny".
That is to say, keys and string values must be enclosed in double quotes.
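with open("#account.json","r") as json_data:
    # read() returns the whole file as one string; readlines() would return a list, which json.loads() rejects
    data = json_data.read()
    twitter_data.append(json.loads(data))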
Also, I haven't used this myself, but this might be of help as well: https://jsonlint.com
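The quoting rule can be checked directly. This sketch assumes Python 3's standard json module and a cut-down version of the question's data; the exact wording of the error message varies between Python versions, so it is printed rather than asserted:

```python
import json

single_quoted = "{ 'Twitter Data' : [] }"   # invalid: JSON strings need double quotes
double_quoted = '{ "Twitter Data" : [] }'   # valid

message = ""
try:
    json.loads(single_quoted)
    single_ok = True
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    single_ok = False
    message = str(exc)

parsed = json.loads(double_quoted)
print(single_ok, message)
print(parsed)
```

The same applies to a bare word such as johnny: it has to be written as "johnny" to count as a JSON string.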
First off, as both @Rob and @silent have noted, 'Twitter Data' should be "Twitter Data". JSON needs double quotes, not single quotes, to delimit a string.
Secondly, json.load() expects a file object, so just pass in json_data and it will read the whole JSON file into memory:
with open('#account.json' , 'r') as json_data:
    contents = json.load(json_data)
EDIT:
for handling multiple objects at once:
def get_objs(f):
    content = f.read()
    # Get each object in the contents of the file object.
    # This is kinda clunky and inelegant, but it should work
    objs = ['{}{}'.format(i, '}') for i in content.split('},')]
    # Last json_obj probably got an unnecessary "}" at the end, so trim the
    # last character from it
    objs[-1] = objs[-1][0:-1]
    json_objs = [json.loads(i) for i in objs]
    return json_objs
and then just go:
with open('#account.json', 'r') as json_data:
    json_objs = get_objs(json_data)
Hopefully this will work for you. It did for me when I tested it on a similarly formatted JSON file.
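Splitting on '},' breaks as soon as an object contains a nested object. A different technique, not the one used in the answer above, is to let the parser find where each object ends with json.JSONDecoder.raw_decode. The sketch below assumes the file holds objects written back to back by repeated json.dump calls, with only whitespace between them; iter_json_objects is a hypothetical helper name and the sample data is made up:

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode() does not skip leading whitespace, so skip it by hand.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the decoded value and the index just past its end.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Made-up sample: two objects, one of them with nested structure.
sample = '{"id": 1,\n "entities": {"hashtags": ["#football"]}}\n{"id": 2, "entities": {}}\n'
objects = list(iter_json_objects(sample))
print(objects)
```

For a real file, the text would come from f.read(). If the objects are separated by commas instead, the file is easier to handle by wrapping its contents in [ and ] and parsing it as one list.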

TypeError: string indices must be integers // working with JSON as dict in python

Okay, so I've been banging my head on this for the last 2 days, with no real progress. I am a beginner with python and coding in general, but this is the first issue I haven't been able to solve myself.
So I have this long JSON-formatted file with about 7000 entries from the YouTube API.
Right now I want a short script that prints certain info ('videoId') for a certain dictionary key (referred to as 'key'):
My script:
import json
f = open ('path file.txt', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['key']['Items']['id']['videoId'])
# print(trailers['key']['videoId']) gives same response
Error:
print(trailers['key']['Items']['id']['videoId'])
TypeError: string indices must be integers
It does work when I want to print all the information for the dictionary key:
This script works
import json
f = open ('path file.txt', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['key'])
Also print(type(trailers)) results in class 'dict', as it's supposed to.
My JSON File is formatted like this and is from the youtube API, youtube#searchListResponse.
{
"kind": "youtube#searchListResponse",
"etag": "",
"nextPageToken": "",
"regionCode": "",
"pageInfo": {
"totalResults": 1000000,
"resultsPerPage": 1
},
"items": [
{
"kind": "youtube#searchResult",
"etag": "",
"id": {
"kind": "youtube#video",
"videoId": ""
},
"snippet": {
"publishedAt": "",
"channelId": "",
"title": "",
"description": "",
"thumbnails": {
"default": {
"url": "",
"width": 120,
"height": 90
},
"medium": {
"url": "",
"width": 320,
"height": 180
},
"high": {
"url": "",
"width": 480,
"height": 360
}
},
"channelTitle": "",
"liveBroadcastContent": "none"
}
}
]
}
What other information is needed to be given for you to understand the problem?
The following code gives me all the videoIds from the provided sample data (which in fact contains no IDs at all):
import json
with open('sampledata', 'r') as datafile:
    data = json.loads(datafile.read())
print([item['id']['videoId'] for item in data['items']])
Perhaps you can try this with more data.
Hope this helps.
I didn't really look into the YouTube API, but looking at the code and the sample you gave, it seems you missed a [0]. In the structure of the JSON, the key items holds a list.
import json
f = open ('json1.json', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['items'][0]['id']['videoId'])
I've not used JSON before at all, but as I understand it, it is basically imported as dicts containing more dicts, lists, and so on, where applicable.
So when you do type(trailers) you get dict. Then you index it with trailers['key']. If you take the type of that, it should also be a dict if things work correctly. Working through the items in each dict this way should eventually locate your error.
Python's error says you are indexing a string, which only accepts integers, with a key as if it were a dict. So you need to find out why you are getting a string and not a dict at each step.
Edit to add an example: if your dict contains a string under the key 'item', then you get a string back, not a new dict that you can index further. items in the JSON, for example, seems to be a list with dicts in it, not a dict itself.
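Both points from the answers above can be seen in a few lines. The sketch uses a cut-down response with two made-up videoId values ("abc123" and "def456"); the real file has more fields, but the nesting is the same:

```python
import json

# Cut-down searchListResponse; the videoId values are placeholders.
raw = '''{
  "kind": "youtube#searchListResponse",
  "items": [
    {"id": {"kind": "youtube#video", "videoId": "abc123"}},
    {"id": {"kind": "youtube#video", "videoId": "def456"}}
  ]
}'''
trailers = json.loads(raw)

# "kind" holds a string, so indexing it with another key raises TypeError.
try:
    trailers["kind"]["videoId"]
    error = None
except TypeError as exc:
    error = str(exc)

# "items" holds a list, so pick an element by position (or loop) first.
first_id = trailers["items"][0]["id"]["videoId"]
all_ids = [item["id"]["videoId"] for item in trailers["items"]]

print(error)
print(first_id, all_ids)
```

Printing type(...) at each level, as suggested above, shows where a str or list turns up in place of the dict the code expects.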

Unable to convert returned JSON data into Dict in Scrapy

I am using Scrapy to get a certain piece of data from here. As was suggested to me here, I used the following code in my script:
pattern = re.compile(r"qubit_product_list = (.*?);", re.M)
script = hxs.select("//script[contains(., 'qubit_product_list')]/text()").extract()[0]
data = pattern.search(script).group(1)
j_data = json.loads(data)
self.log('After calling LOAD Begins')
self.log(j_data) #It is not printing ANYTHING!!!!
self.log('After calling LOAD Ends')
self.log('\n---------------------------------\n')
Which outputs following from variable data:
{
"9102-DBL-sprung slat base": {
"id": "9102",
"name": "Imperial Bedstead",
"url": "/p/Imperial_Bedstead.htm",
"description": "Double - Sprung Slat Base",
"unit_price": 429.99,
"unit_sale_price": 429.99,
"currency": "GBP",
"sku_code": "BENT:1320B-Beech",
"category": "Bed Frames",
"stock": 100
},
"9102-KS-sprung slat base": {
"id": "9102",
"name": "Imperial Bedstead",
"url": "/p/Imperial_Bedstead.htm",
"description": "Kingsize - Sprung Slat Base",
"unit_price": 439.98996,
"unit_sale_price": 439.98996,
"currency": "GBP",
"sku_code": "BENT:1326B-Beech",
"category": "Bed Frames",
"stock": 100
}
}
Now I want to convert this JSON-like structure to a Python dict. I tried the following, but it returns a unicode object.
j_data = json.loads(data)
So how do I get a list or dict in Python 2.7? Oddly, the same loads method returns a dict when I use the scrapy shell.
Try this:
#typecasting the JSON to string for json.loads to work
data = str(data)
#returning type dict from json
j_data = json.loads(data)
#typecasting the dict to string before writing to log
self.log(str(j_data))
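Outside Scrapy, the extract-then-parse step can be tried on its own. This is only a sketch in Python 3, assuming the script text assigns an object literal to qubit_product_list and ends it with a semicolon. The script string here is a hypothetical stand-in for the real page, and the pattern differs from the one in the question: it uses re.S so that "." also matches newlines, and it captures from the opening brace to a closing brace that is followed by the semicolon.

```python
import json
import re

# Hypothetical script text; the real page content may differ.
script = '''
var qubit_product_list = {
    "9102-DBL-sprung slat base": {"id": "9102", "unit_price": 429.99},
    "9102-KS-sprung slat base": {"id": "9102", "unit_price": 439.98996}
};
'''

# re.S lets "." match newlines so a multi-line object literal is captured.
pattern = re.compile(r"qubit_product_list\s*=\s*(\{.*?\})\s*;", re.S)
match = pattern.search(script)
data = match.group(1)

j_data = json.loads(data)  # a dict keyed by product name
print(type(j_data).__name__, sorted(j_data))
```

If json.loads returns a string rather than a dict, the captured text was itself a quoted JSON string, so checking type(data) and the first few characters of data before parsing is a quick way to see what was actually captured.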
