My dictionary is as follows:
my_d ={'2010-01-02': [0.696083, 0.617865], '2010-01-01': [0.697253, 0.618224], '2010-01-03': [0.696083, 0.617865]}
and i wish to output like the following:
DATE,EUR,GBP
2010-01-02,0.696083,0.617865
2010-01-03,0.697253,0.618224
2010-01-03,0.696083,0.617865
i have tried this:
my_list = ["date", "EUR", "GBP"]
with open(fname, 'w',newline = '') as my_csv:
csv_writer = csv.writer(my_csv, delimiter=',')
csv_writer.writerow(my_list)
csv_writer.writerows(my_d)
but the csv file looks nothing like that. Please also be aware further up in my code the user has the ability to choose how many values each key has, so in this example it has two, in another it may have three this would also mean my_list would increase in size, but i believe the first row will already cater for that, i just need help with the rest of the rows, i need it to work for any amount of keys.
You need to loop over your dictionary items and then write to your file,also its better to use csv module for dealing with csv files :
my_list = ["date", "EUR", "GBP"]
import csv
my_d ={'2010-01-02': [0.696083, 0.617865], '2010-01-01': [0.697253, 0.618224], '2010-01-03': [0.696083, 0.617865]}
with open('fname.csv', 'wb') as csvfile:
spamwriter = csv.writer(csvfile, delimiter=',')
spamwriter.writerow(my_list)
for mydate in sorted(my_d.keys()):
spamwriter.writerow([mydate]+my_d[mydate])
Note the absence of newline='' when opening the file. This allows the file to have "normal" newline characters (usually '\n' or '\r\n')
Note that as you are in python 3.x you can open the file with w mode!
Related
I want to create a word dictionary. The dictionary looks like
words_meanings= {
"rekindle": "relight",
"pesky":"annoying",
"verge": "border",
"maneuver": "activity",
"accountability":"responsibility",
}
keys_letter=[]
for x in words_meanings:
keys_letter.append(x)
print(keys_letter)
Output: rekindle , pesky, verge, maneuver, accountability
Here rekindle , pesky, verge, maneuver, accountability they are the keys and relight, annoying, border, activity, responsibility they are the values.
Now I want to create a csv file and my code will take input from the file.
The file looks like
rekindle | pesky | verge | maneuver | accountability
relight | annoying| border| activity | responsibility
So far I use this code to load the file and read data from it.
from google.colab import files
uploaded = files.upload()
import pandas as pd
data = pd.read_csv("words.csv")
data.head()
import csv
reader = csv.DictReader(open("words.csv", 'r'))
words_meanings = []
for line in reader:
words_meanings.append(line)
print(words_meanings)
This is the output of print(words_meanings)
[OrderedDict([('\ufeffrekindle', 'relight'), ('pesky', 'annoying')])]
It looks very odd to me.
keys_letter=[]
for x in words_meanings:
keys_letter.append(x)
print(keys_letter)
Now I create an empty list and want to append only key values. But the output is [OrderedDict([('\ufeffrekindle', 'relight'), ('pesky', 'annoying')])]
I am confused. As per the first code block it only included keys but now it includes both keys and their values. How can I overcome this situation?
I would suggest that you format your csv with your key and value on the same row. Like this
rekindle,relight
pesky,annoying
verge,border
This way the following code will work.
words_meanings = {}
with open(file_name, 'r') as file:
for line in file.readlines():
key, value = line.split(",")
word_meanings[key] = value.rstrip("\n")
if you want a list of the keys:
list_of_keys = list(word_meanings.keys())
To add keys and values to the file:
def add_values(key:str, value:str, file_name:str):
with open(file_name, 'a') as file:
file.writelines(f"\n{key},{value}")
key = input("Input the key you want to save: ")
value = input(f"Input the value you want to save to {key}:")
add_values(key, value, file_name)```
You run the same block of code but you use it with different objects and this gives different results.
First you use normal dictionary (check type(words_meanings))
words_meanings = {
"rekindle": "relight",
"pesky":"annoying",
"verge": "border",
"maneuver": "activity",
"accountability":"responsibility",
}
and for-loop gives you keys from this dictionary
You could get the same with
keys_letter = list(words_meanings.keys())
or even
keys_letter = list(words_meanings)
Later you use list with single dictionary inside this list (check type(words_meanings))
words_meanings = [OrderedDict([('\ufeffrekindle', 'relight'), ('pesky', 'annoying')])]
and for-loop gives you elements from this list, not keys from dictionary which is inside this list. So you move full dictionary from one list to another.
You could get the same with
keys_letter = words_meanings.copy()
or even the same
keys_letter = list(words_meanings)
from collections import OrderedDict
words_meanings = {
"rekindle": "relight",
"pesky":"annoying",
"verge": "border",
"maneuver": "activity",
"accountability":"responsibility",
}
print(type(words_meanings))
keys_letter = []
for x in words_meanings:
keys_letter.append(x)
print(keys_letter)
#keys_letter = list(words_meanings.keys())
keys_letter = list(words_meanings)
print(keys_letter)
words_meanings = [OrderedDict([('\ufeffrekindle', 'relight'), ('pesky', 'annoying')])]
print(type(words_meanings))
keys_letter = []
for x in words_meanings:
keys_letter.append(x)
print(keys_letter)
#keys_letter = words_meanings.copy()
keys_letter = list(words_meanings)
print(keys_letter)
The default field separator for the csv module is a comma. Your CSV file uses the pipe or bar symbol |, and the fields also seem to be fixed width. So, you need to specify | as the delimiter to use when creating the CSV reader.
Also, your CSV file is encoded as Big-endian UTF-16 Unicode text (UTF-16-BE). The file contains a byte-order-mark (BOM) but Python is not stripping it off, so you will notice the string '\ufeffrekindle' contains the FEFF UTF-16-BE BOM. That can be dealt with by specifying encoding='utf16' when you open the file.
import csv
with open('words.csv', newline='', encoding='utf-16') as f:
reader = csv.DictReader(f, delimiter='|', skipinitialspace=True)
for row in reader:
print(row)
Running this on your CSV file produces this:
{'rekindle ': 'relight ', 'pesky ': 'annoying', 'verge ': 'border', 'maneuver ': 'activity ', 'accountability': 'responsibility'}
Notice that there is trailing whitespace in the key and values. skipinitialspace=True removed the leading whitespace, but there is no option to remove the trailing whitespace. That can be fixed by exporting the CSV file from Excel without specifying a field width. If that can't be done, then it can be fixed by preprocessing the file using a generator:
import csv
def preprocess_csv(f, delimiter=','):
# assumes that fields can not contain embedded new lines
for line in f:
yield delimiter.join(field.strip() for field in line.split(delimiter))
with open('words.csv', newline='', encoding='utf-16') as f:
reader = csv.DictReader(preprocess_csv(f, '|'), delimiter='|', skipinitialspace=True)
for row in reader:
print(row)
which now outputs the stripped keys and values:
{'rekindle': 'relight', 'pesky': 'annoying', 'verge': 'border', 'maneuver': 'activity', 'accountability': 'responsibility'}
As I found that no one able to help me with the answer. Finally, I post the answer here. Hope this will help other.
import csv
file_name="words.csv"
words_meanings = {}
with open(file_name, newline='', encoding='utf-8-sig') as file:
for line in file.readlines():
key, value = line.split(",")
words_meanings[key] = value.rstrip("\n")
print(words_meanings)
This is the code to transfer a csv to a dictionary. Enjoy!!!
I'm trying to create a function that would add entries to a json file. Eventually, I want a file that looks like
[{"name" = "name1", "url" = "url1"}, {"name" = "name2", "url" = "url2"}]
etc. This is what I have:
def add(args):
with open(DATA_FILENAME, mode='r', encoding='utf-8') as feedsjson:
feeds = json.load(feedsjson)
with open(DATA_FILENAME, mode='w', encoding='utf-8') as feedsjson:
entry = {}
entry['name'] = args.name
entry['url'] = args.url
json.dump(entry, feedsjson)
This does create an entry such as {"name"="some name", "url"="some url"}. But, if I use this add function again, with different name and url, the first one gets overwritten. What do I need to do to get a second (third...) entry appended to the first one?
EDIT: The first answers and comments to this question have pointed out the obvious fact that I am not using feeds in the write block. I don't see how to do that, though. For example, the following apparently will not do:
with open(DATA_FILENAME, mode='a+', encoding='utf-8') as feedsjson:
feeds = json.load(feedsjson)
entry = {}
entry['name'] = args.name
entry['url'] = args.url
json.dump(entry, feeds)
json might not be the best choice for on-disk formats; The trouble it has with appending data is a good example of why this might be. Specifically, json objects have a syntax that means the whole object must be read and parsed in order to understand any part of it.
Fortunately, there are lots of other options. A particularly simple one is CSV; which is supported well by python's standard library. The biggest downside is that it only works well for text; it requires additional action on the part of the programmer to convert the values to numbers or other formats, if needed.
Another option which does not have this limitation is to use a sqlite database, which also has built-in support in python. This would probably be a bigger departure from the code you already have, but it more naturally supports the 'modify a little bit' model you are apparently trying to build.
You probably want to use a JSON list instead of a dictionary as the toplevel element.
So, initialize the file with an empty list:
with open(DATA_FILENAME, mode='w', encoding='utf-8') as f:
json.dump([], f)
Then, you can append new entries to this list:
with open(DATA_FILENAME, mode='w', encoding='utf-8') as feedsjson:
entry = {'name': args.name, 'url': args.url}
feeds.append(entry)
json.dump(feeds, feedsjson)
Note that this will be slow to execute because you will rewrite the full contents of the file every time you call add. If you are calling it in a loop, consider adding all the feeds to a list in advance, then writing the list out in one go.
Append entry to the file contents if file exists, otherwise append the entry to an empty list and write in in the file:
a = []
if not os.path.isfile(fname):
a.append(entry)
with open(fname, mode='w') as f:
f.write(json.dumps(a, indent=2))
else:
with open(fname) as feedsjson:
feeds = json.load(feedsjson)
feeds.append(entry)
with open(fname, mode='w') as f:
f.write(json.dumps(feeds, indent=2))
Using a instead of w should let you update the file instead of creating a new one/overwriting everything in the existing file.
See this answer for a difference in the modes.
One possible solution is do the concatenation manually, here is some useful
code:
import json
def append_to_json(_dict,path):
with open(path, 'ab+') as f:
f.seek(0,2) #Go to the end of file
if f.tell() == 0 : #Check if file is empty
f.write(json.dumps([_dict]).encode()) #If empty, write an array
else :
f.seek(-1,2)
f.truncate() #Remove the last character, open the array
f.write(' , '.encode()) #Write the separator
f.write(json.dumps(_dict).encode()) #Dump the dictionary
f.write(']'.encode()) #Close the array
You should be careful when editing the file outside the script not add any spacing at the end.
this, work for me :
with open('file.json', 'a') as outfile:
outfile.write(json.dumps(data))
outfile.write(",")
outfile.close()
I have some code which is similar, but does not rewrite the entire contents each time. This is meant to run periodically and append a JSON entry at the end of an array.
If the file doesn't exist yet, it creates it and dumps the JSON into an array. If the file has already been created, it goes to the end, replaces the ] with a , drops the new JSON object in, and then closes it up again with another ]
# Append JSON object to output file JSON array
fname = "somefile.txt"
if os.path.isfile(fname):
# File exists
with open(fname, 'a+') as outfile:
outfile.seek(-1, os.SEEK_END)
outfile.truncate()
outfile.write(',')
json.dump(data_dict, outfile)
outfile.write(']')
else:
# Create file
with open(fname, 'w') as outfile:
array = []
array.append(data_dict)
json.dump(array, outfile)
You aren't ever writing anything to do with the data you read in. Do you want to be adding the data structure in feeds to the new one you're creating?
Or perhaps you want to open the file in append mode open(filename, 'a') and then add your string, by writing the string produced by json.dumps instead of using json.dump - but nneonneo points out that this would be invalid json.
import jsonlines
object1 = {
"name": "name1",
"url": "url1"
}
object2 = {
"name": "name2",
"url": "url2"
}
# filename.jsonl is the name of the file
with jsonlines.open("filename.jsonl", "a") as writer: # for writing
writer.write(object1)
writer.write(object2)
with jsonlines.open('filename.jsonl') as reader: # for reading
for obj in reader:
print(obj)
visit for more info https://jsonlines.readthedocs.io/en/latest/
You can simply import the data from the source file, read it, and save what you want to append to a variable. Then open the destination file, assign the list data inside to a new variable (presumably this will all be valid JSON), then use the 'append' function on this list variable and append the first variable to it. Viola, you have appended to the JSON list. Now just overwrite your destination file with the newly appended list (as JSON).
The 'a' mode in your 'open' function will not work here because it will just tack everything on to the end of the file, which will make it non-valid JSON format.
let's say you have the following dicts
d1 = {'a': 'apple'}
d2 = {'b': 'banana'}
d3 = {'c': 'carrot'}
you can turn this into a combined json like this:
master_json = str(json.dumps(d1))[:-1]+', '+str(json.dumps(d2))[1:-1]+', '+str(json.dumps(d3))[1:]
therefore, code to append to a json file will look like below:
dict_list = [d1, d2, d3]
for i, d in enumerate(d_list):
if i == 0:
#first dict
start = str(json.dumps(d))[:-1]
with open(str_file_name, mode='w') as f:
f.write(start)
else:
with open(str_file_name, mode='a') as f:
if i != (len(dict_list) - 1):
#middle dicts
mid = ','+str(json.dumps(d))[1:-1]
f.write(mid)
else:
#last dict
end = ','+str(json.dumps(d))[1:]
f.write(end)
I have developed a script that produces a CSV file. On inspection of the file, some cell's are being interpreted not the way I want..
E.g In my list in python, values that are '02e4' are being automatically formatted to be 2.00E+04.
table = [['aa02', 'fb4a82', '0a0009'], ['02e4, '452ca2', '0b0004']]
ofile = open('test.csv', 'wb')
for i in range(0, len(table)):
for j in range(0, len(table[i]):
ofile.write(table[i][j] + ",")
ofile.write("\n")
This gives me:
aa02 fb4a82 0a0009
2.00E+04 452ca2 0b0004
I've tried using the csv.writer instead where writer = csv.writer(ofile, ...)
and giving attributes from the lib (e.g csv.QUOTE_ALL)... but its the same output as before..
Is there a way using the CSV lib to automatically format all my values as strings before it's written?
Or is this not possible?
Thanks
Try setting the quoting parameter in your csv writer to csv.QUOTE_ALL.
See the doc for more info:
import csv
with open('myfile.csv', 'wb') as csvfile:
wtr = csv.writer(csvfile, quoting=csv.QUOTE_ALL)
wtr.writerow(...)
Although it sounds like the problem might lie with your csv viewer. Excel has a rather annoying habit of auto-formatting data like you describe.
If you want the '02e4' to show up in excel as "02e4" then annoyingly you have to write a csv with triple-double quotes: """02e4""". I don't know of a way to do this with the csv writer because it limits your quote character to a character. However, you can do something similar to your original attempt:
table = [['aa02', 'fb4a82', '0a0009'], ['02e4', '452ca2', '0b0004']]
ofile = open('test.csv', 'wb')
for i in range(0, len(table)):
for j in range(len(table[i])):
ofile.write('"""%s""",'%table[i][j])
ofile.write("\n")
If opened in a text editor your csv file will read:
"""aa02""","""fb4a82""","""0a0009""",
"""02e4""","""452ca2""","""0b0004""",
This produces the following result in Excel:
If you wanted to use any single character quotation you could use the csv module like so:
import csv
table = [['aa02', 'fb4a82', '0a0009'], ['02e4', '452ca2', '0b0004']]
ofile = open('test.csv', 'wb')
writer = csv.writer(ofile, delimiter=',', quotechar='|',quoting=csv.QUOTE_ALL)
for i in range(len(table)):
writer.writerow(table[i])
The output in the text editor will be:
|aa02|,|fb4a82|,|0a0009|
|02e4|,|452ca2|,|0b0004|
and Excel will show:
My data, that I have in a Python list, can contain quotes, etc.:
the Government's
projections`
indexation" that
Now I'd like to write it into a CSV file but it seems the special chars "break" the CSV structure.
csv.register_dialect('doublequote', quotechar='"', delimiter=';', quoting=csv.QUOTE_ALL)
with open ( 'csv_data.csv', 'r+b' ) as f:
header = next (csv.reader(f))
dict_writer = csv.DictWriter(f, header, -999, dialect='doublequote')
dict_writer.writerow(csv_data_list)
It usually can write up to the first 50 lines or so.
I tried to delete the next row from the source list and it wrote to 60 lines.
Is there any "better" way of writing all sorts of data into a CSV?
I'm trying sth like this:
data['title'] = data['title'].replace("'", '`' )
but that doesn't seem to be right?
i just wondering how i can read special field from a CVS File with next structure:
40.0070222,116.2968604,2008-10-28,[["route"], ["sublocality","political"]]
39.9759505,116.3272935,2008-10-29,[["route"], ["establishment"], ["sublocality", "political"]]
the way that on reading cvs files i used to work with:
with open('routes/stayedStoppoints', 'rb') as csvfile:
spamreader = csv.reader(csvfile, delimiter=',', quotechar='"')
The problem with that is the first 3 fields no problem i can use:
for row in spamreader:
row[0],row[1],row[2] i can access without problem. but in the last field and i guess that with csv.reader(csvfile, delimiter=',', quotechar='"') split also for each sub-list:
so when i tried to access just show me:
[["route"]
Anyone has a solution to handle the last field has a full list ( list of list indeed)
[["route"], ["sublocality","political"]]
in order to can access to each category.
Thanks
Your format is close to json. You only need to wrap each line in brackets, and to quote the dates.
For each line l just do:
lst=json.loads(re.sub('([0-9]+-[0-9]+-[0-9]+)',r'"\1"','[%s]'%(l)))
results in lst being
[40.0070222, 116.2968604, u'2008-10-28', [[u'route'], [u'sublocality', u'political']]]
You need to import the json parser and regular expressions
import json
import re
edit: you asked how to access the element containing 'route'. the answer is
lst[3][0][0]
'political' is at
lst[3][1][1]
If the strings ('political' and others) may contain strings looking like dates, you should go with the solution by #unutbu
Use line.split(',', 3) to split on just the first 3 commas:
import json
with open(filename, 'rb') as csvfile:
for line in csvfile:
row = line.split(',', 3)
row[3] = json.loads(row[3])
print(row)
yields
['40.0070222', '116.2968604', '2008-10-28', [[u'route'], [u'sublocality', u'political']]]
['39.9759505', '116.3272935', '2008-10-29', [[u'route'], [u'establishment'], [u'sublocality', u'political']]]
That is not a valid CSV file. The csv module won't be able to read this.
If the line structure is always like this (two numbers, a date, and a nested list), you can do this:
import ast
result = []
with open('routes/stayedStoppoints') as infile:
for line in infile:
coord_x, coord_y, datestr, objstr = line.split(",", 3)
result.append([float(coord_x), float(coord_y),
datestr, ast.literal_eval(objstr)])
Result:
>>> result
[[40.0070222, 116.2968604, '2008-10-28', [['route'], ['sublocality', 'political']]],
[39.9759505, 116.3272935, '2008-10-29', [['route'], ['establishment'], ['sublocality', 'political']]]]