Passing a json file to a method in Python

I have a json file on my computer that I need to pass to the following function:
def read_json(file):
    try:
        logging.debug('Reading from input')
        return read_json_from_string(json.load(file))
    finally:
        logging.debug('Done reading')
How can I pass a file from my computer to this method in Python? I have tried the following:
file = os.path.abspath('theFile.json')
Then attempting to run the method as
read_json(file)
But I get the following error:
TypeError: expected file
I also tried:
file = open('theFile.json', 'r')
But I always get an error related to the 'file' not being a file.

Try something like:
import json

def read_json(file):
    try:
        print('Reading from input')
        with open(file, 'r') as f:
            return json.load(f)
    finally:
        print('Done reading')

return_dict = read_json("my_file.json")
print(return_dict)

json.load takes a file-like object, and you're passing it a str containing the path. Try this instead:
import os

path = os.path.abspath('theFile.json')
with open(path) as f:
    read_json(f)
Note that json.load returns a dictionary, not a string. Also, the finally: block is executed even if an exception is raised in the try:, so you'll always log "Done reading", even if an error occurred and the read was aborted.
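To see that behaviour concretely, here is a minimal sketch (the missing broken.json path is hypothetical) showing that the finally: block runs whether json.load succeeds or not:
import json
import logging

logging.basicConfig(level=logging.DEBUG)

def read_json(path):
    try:
        logging.debug('Reading from input')
        with open(path) as f:
            return json.load(f)
    finally:
        # runs on success AND on failure, so 'Done reading' is always logged
        logging.debug('Done reading')

try:
    read_json('broken.json')  # hypothetical file that does not exist
except FileNotFoundError:
    pass  # 'Done reading' was still logged before the exception propagated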

Related

Reading a binary file in Python (pickle)

I created some data and stored it several times like this:
with open('filename', 'a') as f:
    pickle.dump(data, f)
Every time the size of the file increased, but when I open the file
with open('filename', 'rb') as f:
    x = pickle.load(f)
I can see only data from the last time.
How can I read the file correctly?
Pickle serializes a single object at a time, and reads back a single object -
the pickled data is recorded in sequence in the file.
If you simply call pickle.load once, you will read the first object serialized into the file (not the last one, as you've written).
After unserializing the first object, the file pointer is at the beginning
of the next object - if you simply call pickle.load again, it will read that next object - do that until the end of the file.
import pickle

objects = []
with open("myfile", "rb") as openfile:
    while True:
        try:
            objects.append(pickle.load(openfile))
        except EOFError:
            break
There is a read_pickle function as part of pandas 0.22+
import pandas as pd
obj = pd.read_pickle(r'filepath')
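As a minimal round-trip sketch (frame.pkl is just an example name): note that read_pickle loads a single pickled object per file, so it is not a substitute for the read-until-EOFError loop above when several objects were appended:
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
df.to_pickle('frame.pkl')          # writes one pickled object

obj = pd.read_pickle('frame.pkl')  # reads that single object back
print(obj)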
The following is an example of how you might write and read a pickle file. Note that if you keep appending pickle data to the file, you will need to continue reading from the file until you find what you want or an exception is generated by reaching the end of the file. That is what the last function does.
import os
import pickle

PICKLE_FILE = 'pickle.dat'

def main():
    # append data to the pickle file
    add_to_pickle(PICKLE_FILE, 123)
    add_to_pickle(PICKLE_FILE, 'Hello')
    add_to_pickle(PICKLE_FILE, None)
    add_to_pickle(PICKLE_FILE, b'World')
    add_to_pickle(PICKLE_FILE, 456.789)
    # load & show all stored objects
    for item in read_from_pickle(PICKLE_FILE):
        print(repr(item))
    os.remove(PICKLE_FILE)

def add_to_pickle(path, item):
    with open(path, 'ab') as file:
        pickle.dump(item, file, pickle.HIGHEST_PROTOCOL)

def read_from_pickle(path):
    with open(path, 'rb') as file:
        try:
            while True:
                yield pickle.load(file)
        except EOFError:
            pass

if __name__ == '__main__':
    main()
I developed a software tool that opens (most) pickle files directly in your browser (nothing is transferred, so it's 100% private):
https://pickleviewer.com/ (formerly)
Now it's hosted here: https://fire-6dcaa-273213.web.app/
The source is available if you want to host it yourself: https://github.com/ch-hristov/Pickle-viewer
Feel free to host this somewhere.

What is an exception handler for?

I have a script which wants to load integers from a text file. If the file does not exist, I want the user to be able to browse for a different file (or the same file in a different location; I have a UI implementation for that).
What I don't get is the purpose of exception handling, or catching exceptions. From what I have read, it seems to be something you can use to log errors, but if an input is needed, catching the exception won't fix that. I am wondering if a while loop in the except block is the approach to use (or should I not use try/except for loading a file)?
try:
    with open(myfile, 'r') as f:
        contents = f.read()
        print("From text file : ", contents)
except FileNotFoundError as Ex:
    print(Ex)
You need to use a while loop and a variable to track whether the file was found; if it is not found, read a new file name from input and try again, and so on:
filenotfound = True
file_path = myfile
while filenotfound:
    try:
        with open(file_path, 'r') as f:
            contents = f.read()
            print("From text file : ", contents)
        filenotfound = False
    except FileNotFoundError as Ex:
        file_path = str(input())
        filenotfound = True
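Since the goal is to load integers, the retry loop can be wrapped in a function like the sketch below (the prompt text, the one-integer-per-line parsing, and the numbers.txt name are assumptions, not part of the original code):
def load_integers(path):
    # keep asking for a path until the file can be opened and parsed
    while True:
        try:
            with open(path, 'r') as f:
                # int() raises ValueError if a line isn't an integer
                return [int(line) for line in f]
        except (FileNotFoundError, ValueError) as ex:
            print(ex)
            path = input("Enter the path to a file of integers: ")

numbers = load_integers('numbers.txt')
print(numbers)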

File handling with functions?

So I got this code that is supposed to sort a dictionary within a json file alphabetically by key:
import json
def values(infile,outfile):
with open(infile):
data=json.load(infile)
data=sorted(data)
with open(outfile,"w"):
json.dump(outfile,data)
values("values.json","values_out.json")
And when I run it I get this error:
AttributeError: 'str' object has no attribute 'read'
I'm pretty sure I messed something up when I made the function but I don't know what.
EDIT: This is what the json file contains:
{"two": 2,"one": 1,"three": 3}
You are passing the strings infile and outfile to your json calls; you need to use the file object instance, which you get with the as keyword:
def values(infile, outfile):
    with open(infile) as fic_in:
        data = json.load(fic_in)
    data = sorted(data)
    with open(outfile, "w") as fic_out:
        json.dump(data, fic_out)
You can group the with statements:
def values(infile, outfile):
    with open(infile) as fic_in, open(outfile, "w") as fic_out:
        json.dump(sorted(json.load(fic_in)), fic_out)
You forgot to assign the file you opened to a variable. In your current code you open a file, but then try to load the filename rather than the actual file. This code should run because you assign the file object reference to my_file (and, for the output file, my_out):
import json

def values(infile, outfile):
    with open(infile) as my_file:
        data = json.load(my_file)
    data = sorted(data)
    with open(outfile, "w") as my_out:
        json.dump(data, my_out)

values("values.json", "values_out.json")

Remove a JSON file if an exception occurs

I am writing a program which stores some JSON-encoded data in a file, but sometimes the resulting file is blank (because no new data was found). When the program finds data and stores it, I do this:
with open('data.tmp') as f:
    data = json.load(f)
os.remove('data.tmp')
Of course, if the file is blank this will raise an exception, which I can catch, but it does not let me remove the file. I have tried:
try:
    with open('data.tmp') as f:
        data = json.load(f)
except:
    os.remove('data.tmp')
And I get this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "MyScript.py", line 50, in run
    os.remove('data.tmp')
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process
How could I delete the file when the exception occurs?
How about separating out the file reading and the json loading? json.loads behaves exactly the same as json.load but takes a string.
with open('data.tmp') as f:
    dataread = f.read()
os.remove('data.tmp')
# handle exceptions as needed here...
data = json.loads(dataread)
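Put together, the pattern might look like this (treating a blank file as invalid JSON via json.JSONDecodeError is an assumption about the failure mode):
import json
import os

with open('data.tmp') as f:
    dataread = f.read()
os.remove('data.tmp')  # the file is already closed here, so removal succeeds

try:
    data = json.loads(dataread)
except json.JSONDecodeError:
    data = None  # the file was blank or held invalid JSON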
I am late to the party, but the json dump and load functions seem to keep using files even after writing or reading data from them. What you can do is use dumps or loads to get the string representation and then use the normal file.write() or file.read() on the result.
For example:
with open('file_path.json', 'w') as file:
    file.write(json.dumps(json_data))
os.remove('file_path.json')
Not the best alternative, but it saves me a lot, especially when using a temp dir.
You need to edit the remove part so it handles the non-existing case gracefully.
import errno
import json
import os

try:
    fn = 'data.tmp'
    with open(fn) as f:
        data = json.load(f)
except:
    try:
        if os.stat(fn).st_size > 0 and os.path.exists(fn):
            os.remove(fn)
    except OSError as e:  # this would be "except OSError, e:" before Python 2.6
        if e.errno != errno.ENOENT:
            raise
see also Most pythonic way to delete a file which may not exist
You could extract the silent removal into a separate function.
Also, from the same SO question:
# python 3.4 and above
import contextlib
import json
import os

try:
    fn = 'data.tmp'
    with open(fn) as f:
        data = json.load(f)
except:
    with contextlib.suppress(FileNotFoundError):
        if os.stat(fn).st_size > 0:
            os.remove(fn)
I personally like the latter approach better - it's explicit.

pickle.dump dumps nothing when appending to file

The user may give a bunch of URLs as command line args. All URLs given in the past are serialized with pickle. The script checks all given URLs; if they are unique, they are serialized and appended to a file. At least that's what should be happening. Nothing is being appended. However, when I open the file in write mode, the new, unique URL is written. So what gives? Code is:
def get_new_urls():
    if(len(urls.URLs) != 0): # check if empty
        with open(urlFile, 'rb') as f:
            try:
                cereal = pickle.load(f)
                print(cereal)
                toDump = []
                for arg in urls.URLs:
                    if (arg in cereal):
                        print("Duplicate URL {0} given, ignoring it.".format(arg))
                    else:
                        toDump.append(arg)
            except Exception as e:
                print("Holy bleep something went wrong: {0}".format(e))
    return(toDump)

urlsToDump = get_new_urls()
print(urlsToDump)

# TODO: append new URLs
if(urlsToDump):
    with open(urlFile, 'ab') as f:
        pickle.dump(urlsToDump, f)

# TODO check HTML of each page against the serialized copy
with open(urlFile, 'rb') as f:
    try:
        cereal = pickle.load(f)
        print(cereal)
    except EOFError: # your URL file is empty, bruh
        pass
Pickle writes out the data you give it in a special format, e.g. it will write some header/metadata/etc. to the file you give it.
It is not intended to work this way; concatenating two pickle files doesn't really make sense. To achieve a concatenation of your data, you'd need to first read whatever is in the file into your urlsToDump, then update urlsToDump with any new data, and then finally dump it out again (overwriting the whole file, not appending).
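A sketch of that read-update-rewrite approach (the urlFile name follows the question; the helper itself is hypothetical):
import pickle

def update_stored_urls(url_file, new_urls):
    # read the previously stored list, if any
    try:
        with open(url_file, 'rb') as f:
            stored = pickle.load(f)
    except (FileNotFoundError, EOFError):
        stored = []
    # keep only URLs we haven't seen before
    fresh = [u for u in new_urls if u not in stored]
    stored.extend(fresh)
    # rewrite the whole file ('wb', not 'ab')
    with open(url_file, 'wb') as f:
        pickle.dump(stored, f)
    return fresh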
After
with open(urlFile, 'rb') as f:
you need a while loop to repeatedly unpickle (repeatedly read) from the file until hitting EOF.
