Basically I have created a dictionary in Python, say
dictionary = {'test': [1, 2, 3], 'other': [100]}
but I now want to write a program that generates a dict file (say file1.dict) containing the dictionary and an idx file (say file2.idx) containing its inverted index posting.
I would suggest using the .json format for any objects you want to store. For that you can import the json library, then use json.dump to store the object and json.load to get it back.
If you want to adjust the format a bit, you can use json.dumps, which returns a string. Change whatever you want in that string and then write it to a file.
I would not suggest using the file formats you are describing because, as someone else mentioned, they are not standard.
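As a rough sketch of the json.dump / json.load / json.dumps round trip described above (the temporary directory and the file name dictionary.json are only placeholders for illustration):

```python
import json
import os
import tempfile

dictionary = {'test': [1, 2, 3], 'other': [100]}

# Placeholder location: a temporary directory, so the example leaves your own files alone
path = os.path.join(tempfile.mkdtemp(), "dictionary.json")

# json.dump writes the object straight to an open file
with open(path, "w") as fp:
    json.dump(dictionary, fp)

# json.load reads it back as a regular dict
with open(path) as fp:
    restored = json.load(fp)

# json.dumps returns a string, which you can adjust before writing it yourself
text = json.dumps(dictionary, indent=2, sort_keys=True)

print(restored == dictionary)
print(text)
```

Note that JSON object keys are always strings, so a dict with integer keys would come back with string keys.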
First, create the files with the following code.
Note: as .dict and .idx are not standard extensions, I have used .txt
dict_file = open("dictionary.txt", "w")
idx_file = open("index.txt", "w")  # a separate file, so the two outputs don't overwrite each other
Now write to them. write() expects a string, so convert each object first:
dict_file.write(str(dictionary))  # the dictionary
idx_file.write(str(inverted_index))  # inverted_index stands for your inverted index object
Finally, close the files:
dict_file.close()
idx_file.close()
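The same steps can also be sketched with `with` blocks, which close the files automatically. The question does not say exactly what the inverted index should look like, so the mapping built here (each number pointing back to the keys whose list contains it) is only an assumption, and the temporary directory is a placeholder:

```python
import os
import tempfile

dictionary = {'test': [1, 2, 3], 'other': [100]}

# Assumed meaning of "inverted index": value -> list of keys that contain it
inverted_index = {}
for key, values in dictionary.items():
    for value in values:
        inverted_index.setdefault(value, []).append(key)

# Placeholder output directory; the file names follow the question
out_dir = tempfile.mkdtemp()
dict_path = os.path.join(out_dir, "file1.dict")
idx_path = os.path.join(out_dir, "file2.idx")

# Each file is closed as soon as its with block ends
with open(dict_path, "w") as dict_file:
    dict_file.write(repr(dictionary))
with open(idx_path, "w") as idx_file:
    idx_file.write(repr(inverted_index))

print(inverted_index)
```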
I've a text input file with different inputs as below stored in dictionary and list format respectively. A portion of the text file is below:
#Drive-Size-Mapping
{'root' : 50 , '\usr' : 20, 'swap' : 1}
#OS-Version
[7.1, 7.2, 7.3, 7.4, 7.5, 7.6]
As you can see the first line is in Dictionary format and the 2nd line is in List format.
Now, when I'm reading and storing the lines in variables, Python fails to recognize the format. This is my code I'm trying:
f=open("C:\\Users\\xyz\\Desktop\\Inputs.txt")
lines=f.readlines()
i = lines.index("#Drive-Size-Mapping\n")
di=dict()
di = lines[i+1]
print(di[0])
j = lines.index("#OS-Version\n")
list = lines[j+1]
print(list[0])
The output of the above code is:
{
{
i.e. it just prints the item at the first index of each line.
I want Python to interpret these variables as a dictionary and a list respectively. So, if I do print(di["root"])
my code should recognize it as a dictionary and print 50. And
print(list[0]) should recognize it as a list and output 7.1.
Please let me know how the file lines should be manipulated so that Python interprets them in the format they are actually written in.
Thank You
Just adding to the above answers:
Use:
ast.literal_eval()
But remove or replace the "\" in the dictionary keys first, as it can cause a unicode error in literal_eval(): '\usr' is read as the start of a \u escape sequence.
One more error in the code: after converting to a dictionary, you cannot do di[0], because dictionaries are looked up by key, not by position.
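A small sketch of that backslash problem and one possible workaround. Replacing the backslash with "/" is just one choice made for illustration; removing it would work as well:

```python
from ast import literal_eval

# The dictionary line exactly as it would be read from the file
raw = r"{'root' : 50 , '\usr' : 20, 'swap' : 1}"

try:
    literal_eval(raw)
    failed = False
except SyntaxError:
    # '\u' is parsed as an incomplete unicode escape
    failed = True

# Assumed cleanup: swap the backslash for a forward slash before parsing
cleaned = raw.replace("\\", "/")
drive_sizes = literal_eval(cleaned)

print(failed)               # True
print(drive_sizes["root"])  # 50
```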
The function readlines does what it says: it reads lines. You and I can recognize that the two lines are a Python dict and a Python list, but Python cannot and also should not.
It seems to me that you are reading configuration settings in from a file with Python source in it. A more standard way to do that is to use an .ini file. Look at the module configparser.
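For what it's worth, here is a sketch of how the same settings might look with configparser. The section and option names are made up for this example, and the .ini text is kept in a string so the snippet is self-contained:

```python
import configparser

# Hypothetical .ini version of the settings from the question
ini_text = """
[drive-size-mapping]
root = 50
usr = 20
swap = 1

[os]
versions = 7.1, 7.2, 7.3, 7.4, 7.5, 7.6
"""

config = configparser.ConfigParser()
config.read_string(ini_text)  # config.read(path) would read an actual file instead

# configparser returns every value as a string, so convert explicitly
drive_sizes = {name: int(size) for name, size in config["drive-size-mapping"].items()}
os_versions = [float(v) for v in config["os"]["versions"].split(",")]

print(drive_sizes["root"])  # 50
print(os_versions[0])       # 7.1
```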
The lines you are getting from the file are strings. If you want to convert them to dict or list objects, you can use literal_eval:
from ast import literal_eval
...
d = literal_eval(lines[i+1])
...
l = literal_eval(lines[j+1])
If you can change the format of the input file, I would look into modifying it to a json file to store and load your objects conveniently with the json module.
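If the file can be changed, a JSON layout could look roughly like the sketch below. The key names drive_size_mapping and os_versions are invented for the example, and the text is kept in a string here rather than a file:

```python
import json

# Hypothetical JSON version of Inputs.txt; a backslash must be doubled inside JSON strings
json_text = r'''
{
  "drive_size_mapping": {"root": 50, "\\usr": 20, "swap": 1},
  "os_versions": [7.1, 7.2, 7.3, 7.4, 7.5, 7.6]
}
'''

# json.loads parses a string; json.load(file_object) does the same for an open file
settings = json.loads(json_text)

di = settings["drive_size_mapping"]
versions = settings["os_versions"]

print(di["root"])    # 50
print(versions[0])   # 7.1
```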
What is the best way to store a dictionary of strings in a file (as it is big) and load it partially in Python? Dictionary of strings here means the key is a string and the value is a list of strings.
The dictionary would be stored in appended form: check whether a key is already present, and update only if it is not. The keys are then used for post-processing.
Usually a dictionary is stored in JSON.
I'll leave here a link:
Convert Python dictionary to JSON array
You could simply write the dictionary to a text file, and then create a new dictionary that only pulls certain keys and values from that text file.
But you're probably best off exploring the json module.
Here's a straightforward way to write a dict called "sample" to a file with the json module:
import json
with open('result.json', 'w') as fp:
    json.dump(sample, fp)
On the loading side, we'd need to know more about how you want to choose which keys to load from the JSON file.
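Without knowing the selection rule, one simple sketch is to load the file and keep only the keys you care about. The sample data, the wanted set and the temporary path are all made up for illustration. Note that json.load still parses the whole file, so this filters after loading rather than reading partially:

```python
import json
import os
import tempfile

# Made-up sample data: string keys, lists of strings as values
sample = {"apple": ["red", "green"], "banana": ["yellow"], "cherry": ["dark red"]}

path = os.path.join(tempfile.mkdtemp(), "result.json")
with open(path, "w") as fp:
    json.dump(sample, fp)

# Hypothetical selection of keys to keep
wanted = {"apple", "cherry"}

with open(path) as fp:
    everything = json.load(fp)  # the whole file is parsed here

subset = {key: value for key, value in everything.items() if key in wanted}
print(subset)
```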
The above answers are great, but I hate using JSON, and I have had issues with pickle before that corrupted my data, so what I do is use NumPy's save and load (data below is the dictionary).
To save: np.save(filename, data)
To load: data = np.load(filename, allow_pickle=True).item()
Recent NumPy versions need allow_pickle=True to load a saved dict.
It is really simple and works well. As far as loading partially goes, you could always split the dictionary into multiple smaller dictionaries and save them as individual files. Maybe not a very concrete solution, but it could work.
To split the dictionary you could do something like this:
import numpy as np
temp_dict = {}
for i, k in enumerate(data, start=1):
    temp_dict[k] = data[k]
    if i % 1000 == 0:  # every 1000 keys, write a chunk and start a new one
        np.save("records-" + str(i - 1000) + "-" + str(i) + ".npy", temp_dict)
        temp_dict = {}
if temp_dict:  # save whatever is left over
    np.save("records-rest.npy", temp_dict)
Then for loading just do something like:
import glob
my_dict = {}
all_files = glob.glob("*.npy")
for f in all_files:
    chunk = np.load(f, allow_pickle=True).item()
    my_dict.update(chunk)
If this is for some sort of database-type use, then save yourself the headache and use TinyDB. It uses JSON format when saving to disk and will provide you the "partial" loading that you're looking for.
I only recommend TinyDB as it seems to be the closest to what you're looking to achieve. If it isn't to your liking, try searching for other databases; there are tons of them out there!
I am using
csvFile=open("yelloUSA.csv",'w+')
to make a new csv file.
Now I want to create files named 1 to 10, and I want to use string formatting to build the names. How can I do that?
I used the code below, but it shows an error:
for i in range(0,10):
    csvFile=open("{0}.csv",'w+').format(i)
    writer=csv.writer(csvFile)
format is a method of string objects. You are applying it to the result of the open function, which is a file object. You need to apply it to the filename string instead:
filename = "{0}.csv".format(i)
csvFile=open(filename ,'w+')
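Put together, the whole loop might look like the sketch below. It uses range(1, 11) to get the names 1.csv to 10.csv; the temporary directory and the row written to each file are placeholders:

```python
import csv
import os
import tempfile

out_dir = tempfile.mkdtemp()  # placeholder output directory
created = []

for i in range(1, 11):  # 1 through 10
    # Format the name first, then pass the finished string to open()
    filename = os.path.join(out_dir, "{0}.csv".format(i))
    with open(filename, "w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["file_number", i])  # placeholder row
    created.append(os.path.basename(filename))

print(created)
```

Opening with newline="" is what the csv module documentation recommends, so that the writer controls line endings itself.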
I have a list of lists which I want to write to a text file. Then, I have to read that text file, edit that list of lists and again write it back to the text file.
Now, I am not able to write it without string format. So, when I read it, I can not make the required changes because it becomes a string.
For my case, list of lists looks like
A = [[0,[0.0,1010.0,10.0],[10.0,1110.0,10.0]],[0,[10.0,1011.0,15.0],[15.0,1111.0,19.0]]]
I didn't find any solution for this problem. Any help is highly appreciated.
Note: When I read A and try converting it to float, it was trying to convert [ into float. Hence, it didn't work.
You should dump it into the file as-is. There are many ways of doing this; one is to use json.
import json
A=[[0,[0.0,1010.0,10.0],[10.0,1110.0,10.0]],[0,[10.0,1011.0,15.0],[15.0,1111.0,19.0]]]
with open("file.txt", "w") as f:  # your file path
    json.dump(A, f)
To load it you can do:
with open("file.txt") as f:
    A = json.load(f)
Another way is to dump the list as a string and retrieve it using ast.literal_eval, as suggested by Jean-François.
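A sketch of that second approach, including the read, edit and write-back cycle from the question. The temporary path and the particular value changed are arbitrary choices for the example:

```python
import os
import tempfile
from ast import literal_eval

A = [[0, [0.0, 1010.0, 10.0], [10.0, 1110.0, 10.0]],
     [0, [10.0, 1011.0, 15.0], [15.0, 1111.0, 19.0]]]

path = os.path.join(tempfile.mkdtemp(), "file.txt")  # placeholder path

# Write the list as its Python literal text
with open(path, "w") as f:
    f.write(repr(A))

# Read it back as a real list of lists, not a string
with open(path) as f:
    B = literal_eval(f.read())

# Edit the nested list (arbitrary change for illustration) and write it back
B[0][1][0] = 5.0
with open(path, "w") as f:
    f.write(repr(B))

print(B[0][1])  # [5.0, 1010.0, 10.0]
```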
I have a very big list of lists. One of my programs does this:
power_time_array = [[1,2,3],[1,2,3]] # In a short form
with open (file_name,'w') as out:
    out.write (str(power_time_array))
Now another independent script needs to read this list of lists back.
How do I do this?
What I have tried:
with open (file_name,'r') as app_trc_file :
    power_trace_of_application.append (app_trc_file.read())
Note: power_trace_of_application is a list of list of lists.
This stores it as a list with one element as a huge string.
How does one efficiently store and retrieve big lists or list of lists from files in python?
You can serialize your list to JSON and deserialize it back. This doesn't really change anything in the representation, since your list is already valid JSON:
import json
power_time_array = [[1,2,3],[1,2,3]] # In a short form
with open (file_name,'w') as out:
    json.dump(power_time_array, out)
and then just read it back:
with open (file_name,'r') as app_trc_file :
    power_trace_of_application = json.load(app_trc_file)
For speed, you can use a json library with a C backend (like ujson). Custom objects can be handled too, by passing a default function to json.dump.
Use the json library to efficiently read and write structured information (in the form of JSON) to a text file.
To write data to the file, use json.dump(), and
to retrieve JSON data from the file, use json.load().
It will be faster:
from ast import literal_eval
power_time_array = [[1,2,3],[1,2,3]]
with open(file_name, 'w') as out:
    out.write(repr(power_time_array))
with open(file_name,'r') as app_trc_file:
    power_trace_of_application.append(literal_eval(app_trc_file.read()))
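Which approach is actually faster probably depends on your data and Python version, so it may be worth measuring. Here is a rough timing sketch; the list size and repeat count are arbitrary, and it compares parsing from strings in memory rather than from files:

```python
import json
import timeit
from ast import literal_eval

# Made-up test data: a larger list of lists
power_time_array = [[i, i + 1, i + 2] for i in range(2000)]

as_json = json.dumps(power_time_array)  # text as json.dump would write it
as_repr = repr(power_time_array)        # text as out.write(repr(...)) would write it

# Time 20 parses of each representation
json_seconds = timeit.timeit(lambda: json.loads(as_json), number=20)
eval_seconds = timeit.timeit(lambda: literal_eval(as_repr), number=20)

print("json.loads:   %.4f s" % json_seconds)
print("literal_eval: %.4f s" % eval_seconds)
```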