How to make a json string by reading file line by line? - python

I have a file like this:
machineA=1.2.3.4.5
machineB=2.5.7.8.9
I need to read above file and make a string like this for each line:
{"test":"hello world.","server":"machineA","ip":"1.2.3.4.5","reason":"data"}
{"test":"hello world.","server":"machineB","ip":"2.5.7.8.9","reason":"data"}
As you can see in the JSON strings above, only the values of server and ip change; the other two keys always keep the same values. How can I generate a string like this for each line?
I am not able to figure out how to build the corresponding JSON for each line and print it to the console.
f = open('hosts.txt')
line = f.readline()
while line:
    print(line)
    machine=line.split("=")[0]
    ip=line.split("=")[1]
    # now how to make a json string for each line
    line = f.readline()
f.close()

I'd suggest building a dictionary and then using the built-in json module to convert it to a JSON string, because I find that cleaner to read.
import json
mydict = {
    "test": "hello world.",
    "server": machine,
    "ip": ip,
    "reason": "data"
}
print(json.dumps(mydict))
See Robᵩ's answer for how to do it without the json module
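Putting the loop and the dictionary together, a minimal sketch might look like the following. The helper name lines_to_json is made up for illustration, and an in-memory io.StringIO stands in for hosts.txt; separators=(",", ":") is used only to reproduce the compact spacing shown in the question.

```python
import io
import json


def lines_to_json(lines):
    """Yield one JSON string per 'name=ip' line (hypothetical helper)."""
    for line in lines:
        line = line.strip()          # drop the trailing newline
        if not line:
            continue                 # skip blank lines
        machine, ip = line.split("=", 1)
        record = {
            "test": "hello world.",
            "server": machine,
            "ip": ip,
            "reason": "data",
        }
        # Compact separators match the spacing used in the question.
        yield json.dumps(record, separators=(",", ":"))


# Stand-in for open('hosts.txt'); any iterable of lines works.
sample = io.StringIO("machineA=1.2.3.4.5\nmachineB=2.5.7.8.9\n")
results = list(lines_to_json(sample))
for json_line in results:
    print(json_line)
```

Stripping each line first matters here: without it, the ip value would carry the trailing newline into the JSON string.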

You don't have to use the JSON library just because the output format is JSON. Try one of these:
output_line = '{"test":"hello world.","server":"%s","ip":"%s","reason":"data"}' % (
    machine, ip)
output_line = '{{"test":"hello world.","server":"{}","ip":"{}","reason":"data"}}'.format(
    machine, ip)
output_line = f'{{"test":"hello world.","server":"{machine}","ip":"{ip}","reason":"data"}}'
Note that the final line only works on Python 3.6 or later.
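One caveat with the string-formatting variants, shown here as a small sketch: they do not escape special characters, so a value containing a double quote yields invalid JSON, whereas json.dumps escapes it. The value machine"A is a contrived example, not taken from the question.

```python
import json

machine = 'machine"A'   # contrived value containing a double quote
ip = "1.2.3.4.5"

# Plain string formatting inserts the value verbatim.
formatted = '{"server":"%s","ip":"%s"}' % (machine, ip)
# json.dumps escapes the embedded quote.
dumped = json.dumps({"server": machine, "ip": ip})

try:
    json.loads(formatted)
    formatted_is_valid = True
except json.JSONDecodeError:
    formatted_is_valid = False

print("formatted string parses:", formatted_is_valid)
print("dumped string round-trips:", json.loads(dumped)["server"] == machine)
```

For host names and IP addresses like those in the question this never comes up, so either approach works there.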

Related

Convert json with data from each id in different lines into one line per id with python

I have a json file with the following format:
{
"responses":[
{
"id":"123",
"cid":"01A",
"response":{nested lists and dictionaries}
},
{
"id":"456",
"cid":"54G",
"response":{nested lists and dictionaries}
}
]}
And so on.
And I want to convert it into a json file like this:
{"id":"123", "cid":"01A", "response":{nested lists and dictionaries}},
{"id":"456", "cid":"54G", "response":{nested lists and dictionaries}}
or
{responses:[
{"id":"123", "cid":"01A", "response":{nested lists and dictionaries}},
{"id":"456", "cid":"54G", "response":{nested lists and dictionaries}}
]}
I don't care about the surrounding format as long as I have the information for each ID in just one line.
I have to do this while reading it because things like pd.read_json don't read this kind of file.
Thanks!
Maybe just dump it line by line? Though I may not have understood your question correctly.
import json
input_lines = {"responses": ...}
with open("output.json", "w") as f:
    for line in input_lines["responses"]:
        f.write(json.dumps(line) + "\n")
You can use the built-in json library to print each response on a separate line. json.dumps() has an indent option if you want pretty-printing, but by default it puts everything on one line, which is what you want.
Here's an example that works for the input you showed in your post.
#!/usr/bin/env python3
import json
import sys
with open(sys.argv[1]) as json_file:
    obj = json.load(json_file)
print("{responses:[")
for response in obj['responses']:
    print(json.dumps(response))
print("]}")
Usage (assuming you named the program format_json.py):
$ chmod +x format_json.py
$ ./format_json.py my_json_input.json > my_json_output.json
Or, if you're not in a command-line environment, you can also hardcode the input and output filenames:
#!/usr/bin/env python3
import json
infile = 'my_json_input.json'
outfile = 'my_json_output.json'
with open(infile) as json_file:
    obj = json.load(json_file)
# print()'s file argument needs an open file object, not a filename
with open(outfile, 'w') as out:
    print("{responses:[", file=out)
    for response in obj['responses']:
        print(json.dumps(response), file=out)
    print("]}", file=out)
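If the output should remain valid JSON (so that json.load can read it back), one option is to quote the responses key and separate the lines with commas. A rough sketch, where format_responses is a hypothetical helper name and the nested response values are placeholders:

```python
import json


def format_responses(obj):
    """Return JSON text with one element of obj['responses'] per line."""
    lines = [json.dumps(response) for response in obj["responses"]]
    return '{"responses":[\n' + ",\n".join(lines) + "\n]}"


# Placeholder data shaped like the input in the question.
data = {
    "responses": [
        {"id": "123", "cid": "01A", "response": {"a": [1, 2]}},
        {"id": "456", "cid": "54G", "response": {"b": {"c": 3}}},
    ]
}
text = format_responses(data)
print(text)
```

Because the result is still a single JSON document, it can be parsed again with json.loads while keeping each id on its own line.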

How to load cookies from a text file in python requests

I've been trying to load raw cookies (not JSON) from a txt file instead of declaring them directly in a variable, but without success. Here's my code:
cookies.txt
{"PHPSESSID": "ibd1biktq4tfm3k4j790juf19d", "security": "impossible"}
my python script
file = open("cookies.txt", "r")
file = file.readlines()
str1 = " "
cookies = str1.join(file) # for converting list into string
requests.get("http://localhost/dvwa", cookies=cookies)
output : TypeError: string indices must be integers
Also, if I do print(cookies) it outputs {"PHPSESSID": "ibd1biktq4tfm3k4j790juf19d", "security": "impossible"}, and assigning the same text directly to the variable "cookies" works...
Can anyone please clarify what I am doing wrong?
requests expects cookies to be a dictionary (or a CookieJar), but you are passing a string. Parse the string into a Python dictionary using ast.literal_eval:
from ast import literal_eval
import requests
with open("cookies.txt", "r") as f_in:
    cookies = literal_eval(f_in.read())
requests.get("http://localhost/dvwa", cookies=cookies)
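Since the contents of cookies.txt shown in the question are also valid JSON, the standard json module is another option. A minimal sketch: the string below stands in for the file contents, and the requests call is left as a comment because it needs the local server.

```python
import json

# Stand-in for: open("cookies.txt").read()
raw = '{"PHPSESSID": "ibd1biktq4tfm3k4j790juf19d", "security": "impossible"}'

cookies = json.loads(raw)   # a dict, which requests accepts for cookies=
print(type(cookies).__name__, cookies["security"])

# With the file on disk this would be:
#     with open("cookies.txt") as f:
#         cookies = json.load(f)
#     requests.get("http://localhost/dvwa", cookies=cookies)
```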

Reading a text file of dictionaries stored in one line

Question
I have a text file that records metadata of research papers requested from the Semantic Scholar API. However, when I wrote the requested data, I forgot to add "\n" after each individual record. This results in something that looks like
{<metadata1>}{<metadata2>}{<metadata3>}...
whereas it would look like this if I had added "\n":
{<metadata1>}
{<metadata2>}
{<metadata3>}
...
Now I would like to read the data. As all the metadata is stored in one line, I need to do some hacks:
First, I split the cluttered dicts using "{".
Then I try to convert each piece back to a dict. Note that I allow for the line not being in proper JSON format.
import json
with open("metadata.json", "r") as f:
    for line in f.readline().split("{"):
        print(json.loads("{" + line.replace("\'", "\"")))
However, there is still an error message:
JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
What should I do to recover all the metadata I collected?
MWE
Note: to get the metadata.json file I use, run the following code; it should work out of the box.
import json
import urllib.parse
import requests
baseURL = "https://api.semanticscholar.org/v1/paper/"
paperIDList = ["200794f9b353c1fe3b45c6b57e8ad954944b1e69",
               "b407a81019650fe8b0acf7e4f8f18451f9c803d5",
               "ff118a6a74d1e522f147a9aaf0df5877fd66e377"]
for paperID in paperIDList:
    response = requests.get(urllib.parse.urljoin(baseURL, paperID))
    metadata = response.json()
    record = dict()
    record["title"] = metadata["title"]
    record["abstract"] = metadata["abstract"]
    record["paperId"] = metadata["paperId"]
    record["year"] = metadata["year"]
    record["citations"] = [item["paperId"] for item in metadata["citations"] if item["paperId"]]
    record["references"] = [item["paperId"] for item in metadata["references"] if item["paperId"]]
    with open("metadata.json", "a") as fileObject:
        fileObject.write(json.dumps(record))
The problem is that when you do the split("{") you get a first item that is empty, corresponding to the opening {. Just ignore the first element and everything works fine (I added an r to your quote replacements so Python treats them as raw string literals and replaces them properly):
import json
with open("metadata.json", "r") as f:
    for line in f.readline().split("{")[1:]:
        print(json.loads("{" + line.replace(r"\'", r"\"")))
As suggested in the comments, I would actually recommend recreating the file or saving a new version where you replace }{ by }\n{:
with open("metadata.json", "r") as f:
    data = f.read()
data_lines = data.replace("}{","}\n{")
with open("metadata_mod.json", "w") as f:
    f.write(data_lines)
That way you will have the metadata of one paper per line, as you wanted.
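Splitting on { or replacing }{ can misfire if a title or abstract happens to contain those characters. A sketch of an alternative that avoids this uses json.JSONDecoder.raw_decode from the standard library, which parses one value and reports the index where it stopped. The function name iter_concatenated_json and the sample string are made up for illustration:

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value from a string like '{...}{...}{...}'."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between values so that each value starts exactly at pos.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Made-up sample with braces inside string values, which would defeat split("{").
sample = '{"title": "a {braced} title", "year": 2019}{"title": "b}{c", "year": 2020}'
records = list(iter_concatenated_json(sample))
print(records)
```

With the real file, the text argument would be the result of reading metadata.json in full.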

How to remove comment lines from a JSON file in python

I am getting a JSON file with the following format:
// 20170407
// http://info.employeeportal.org
{
"EmployeeDataList": [
{
"EmployeeCode": "200005ABH9",
"Skill": CT70,
"Sales": 0.0,
"LostSales": 1010.4
}
]
}
I need to remove the extra comment lines present in the file.
I tried the following code:
import json
import commentjson
with open('EmployeeDataList.json') as json_data:
    employee_data = json.load(json_data)
    '''employee_data = json.dump(json.load(json_data))'''
    '''employee_data = commentjson.load(json_data)'''
    print(employee_data)
I am still not able to remove the comments from the file and bring the JSON file into the correct format.
I can't see where things are going wrong. Any direction in this regard is highly appreciated. Thanks in advance.
You're not using commentjson correctly. It has the same interface as the json module:
import commentjson
with open('EmployeeDataList.json', 'r') as handle:
    employee_data = commentjson.load(handle)
print(employee_data)
Although in this case, your comments are simple enough that you probably don't need to install an extra module to remove them:
import json
with open('EmployeeDataList.json', 'r') as handle:
    fixed_json = ''.join(line for line in handle if not line.startswith('//'))
    employee_data = json.loads(fixed_json)
print(employee_data)
Note the difference here between the two code snippets is that json.loads is used instead of json.load, since you're parsing a string instead of a file object.
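The same idea as a small reusable function, as a sketch: load_json_with_line_comments is a hypothetical name, and it also tolerates indented // lines. Note that the sample below quotes "CT70"; the unquoted CT70 in the file as posted would still be rejected by json.loads once the comments are gone.

```python
import io
import json


def load_json_with_line_comments(handle):
    """Parse JSON from a file-like object, ignoring lines that start with //."""
    kept = [line for line in handle if not line.lstrip().startswith("//")]
    return json.loads("".join(kept))


# In-memory stand-in for EmployeeDataList.json (with "CT70" quoted).
sample = io.StringIO(
    "// 20170407\n"
    "// http://info.employeeportal.org\n"
    "{\n"
    '  "EmployeeDataList": [\n'
    '    {"EmployeeCode": "200005ABH9", "Skill": "CT70", "Sales": 0.0, "LostSales": 1010.4}\n'
    "  ]\n"
    "}\n"
)
employee_data = load_json_with_line_comments(sample)
print(employee_data["EmployeeDataList"][0]["EmployeeCode"])
```

This only handles whole-line comments; a // that follows a value on the same line is left in place and would still break parsing.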
Try JSON-minify:
JSON-minify minifies blocks of JSON-like content into valid JSON by removing all whitespace and JS-style comments (single-line // and multiline /* .. */).
I usually read the JSON as a normal file, delete the comments and then parse it as a JSON string. It can be done in one line with the following snippet:
with open(path,'r') as f: jsonDict = json.loads('\n'.join(row for row in f if not row.lstrip().startswith("//")))
IMHO it is very convenient because it does not need CommentJSON or any other non-standard library.
Well, that's not valid JSON, so just open it like you would a text document and delete everything from // to \n.
with open("EmployeeDataList.json", "r") as rf:
    with open("output.json", "w") as wf:
        for line in rf.readlines():
            if line[0:2] == "//":
                continue
            wf.write(line)
Your file is parsable using HOCON.
pip install pyhocon
>>> from pyhocon import ConfigFactory
>>> conf = ConfigFactory.parse_file('data.txt')
>>> conf
ConfigTree([('EmployeeDataList',
[ConfigTree([('EmployeeCode', '200005ABH9'),
('Skill', 'CT70'),
('Sales', 0.0),
('LostSales', 1010.4)])])])
If it is the same number of comment lines every time, you can just do:
with open('EmployeeDataList.NOTjson', "r") as fh:
    rawText = fh.read()
# split off the first two lines (the comments) and keep the rest
json_data = rawText.split("\n", 2)[2]
This way json_data is the text of the file without the first 2 lines.

How to load json data from serialised file and process them in python 3?

I am trying to create a program in Python 3 (Mac OS X) with tkinter. It takes an incremental id, the value of datetime.now and a third string as variables. For example,
a window opens displaying: id / date time / "hello world". The user makes a choice and presses a save button. The inputs are serialised as JSON and saved in a file.
mytest = dict([('testId',testId), ('testDate',testDate), ('testStyle',testStyle)])
with open('data/test.txt', mode = 'a', encoding = 'utf-8') as myfile:
    json.dump(mytest, myfile, indent = 2)
    myfile.close()
The result in the file is:
{
"testStyle": "blabla",
"testId": "8",
"testDate": "2013-05-09 13:32"
}{
"testDate": "2013-05-09 13:41",
"testId": "9",
"testStyle": "blabla"
}
As a Python newbie, I want to load the file data and make some checks, like "If the user made another entry on 2013-05-09, display a message saying that you already entered data for today." What is the proper way to load all this JSON data? The list will expand each day and will contain lots of data.
Instead of directly storing the dictionary, you can store a list of dictionaries, which can be loaded back into a list that can be modified and appended to:
import json
mytest1 = dict([('testId','testId1'), ('testDate','testDate1'), ('testStyle','testStyle1')])
json_values = []
json_values.append(mytest1)
s = json.dumps(json_values)
print(s)
json_values = None
mytest2 = dict([('testId','testId2'), ('testDate','testDate2'), ('testStyle','testStyle2')])
json_values = json.loads(s)
json_values.append(mytest2)
s = json.dumps(json_values)
print(s)
If the file holds a single JSON document (such as one list of entries), you can simply load the file and parse it:
import json
with open(path, mode="r", encoding="utf-8") as myfile:
    data = json.loads(myfile.read())
Now you can do whatever you want with data.
If your file is really big, then I suppose you should use a proper database instead.
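To tie the two answers together, here is a rough sketch of keeping the whole file as one JSON list and checking for an existing entry on a given day. The helper names (load_entries, save_entries, has_entry_for) and the temporary file path are illustrative only; the field names follow the question.

```python
import json
import os
import tempfile


def load_entries(path):
    """Return the list stored in path, or an empty list if the file is missing."""
    if not os.path.exists(path):
        return []
    with open(path, mode="r", encoding="utf-8") as f:
        return json.load(f)


def save_entries(path, entries):
    """Rewrite the whole file as a single JSON list."""
    with open(path, mode="w", encoding="utf-8") as f:
        json.dump(entries, f, indent=2)


def has_entry_for(entries, day):
    """True if any entry's testDate starts with day, e.g. '2013-05-09'."""
    return any(entry["testDate"].startswith(day) for entry in entries)


# A temporary file stands in for data/test.txt.
path = os.path.join(tempfile.mkdtemp(), "test.json")
entries = load_entries(path)
entries.append({"testId": "8", "testDate": "2013-05-09 13:32", "testStyle": "blabla"})
save_entries(path, entries)

entries = load_entries(path)
if has_entry_for(entries, "2013-05-09"):
    print("You already entered data for today.")
```

The file is opened with mode "w" rather than "a" so that it always contains exactly one list; appending would bring back the concatenated-objects problem from the question.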
