How can I write a dictionary to a csv file? - python

The dictionary looks like the following.
res = {'Qatar': ['68.61994212', '59.03245947', '55.10905996'],
       'Burundi': ['0.051012487', '0.048311391', '0.046681908'],
       'Japan': ['9.605144835', '9.247590692', '9.542878595']}
I want to get rid of the quotes and square brackets in my csv file.
I want the output csv to be:
Qatar 68.61994212 59.03245947 55.10905996
Burundi 0.051012487 0.048311391 0.046681908
Japan 9.605144835 9.247590692 9.542878595

Try the code below. The reason you are getting '[]' is that you are writing the value of the dictionary as-is, which is a list. Instead, you need to retrieve the items of the list and write those.
import csv

res = {'Qatar': ['68.61994212', '59.03245947', '55.10905996'],
       'Burundi': ['0.051012487', '0.048311391', '0.046681908'],
       'Japan': ['9.605144835', '9.247590692', '9.542878595']}

with open('./result.csv', 'w', newline='') as res_file:
    csv_writer = csv.writer(res_file)
    for k, v in res.items():
        # prepend the key to its list of values to form one row
        csv_writer.writerow([k] + v)
OUTPUT:
The contents of the file (result.csv) are as below:
Burundi,0.051012487,0.048311391,0.046681908
Japan,9.605144835,9.247590692,9.542878595
Qatar,68.61994212,59.03245947,55.10905996

Aside from Jay's answer, if you are allowed to use pandas, then you can use pandas' to_csv function to make the csv in one line.
import pandas as pd
df = pd.DataFrame(res)
df.to_csv('my_result.csv', index=False)
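Note that pd.DataFrame(res) makes each country a column, so the one-liner above writes the data sideways relative to the desired output. A transposed sketch (reusing the res dict from the question) that yields one row per country:

```python
import pandas as pd

res = {'Qatar': ['68.61994212', '59.03245947', '55.10905996'],
       'Burundi': ['0.051012487', '0.048311391', '0.046681908'],
       'Japan': ['9.605144835', '9.247590692', '9.542878595']}

# Transpose so each dict key becomes a row instead of a column,
# then write without the 0,1,2 header row
df = pd.DataFrame(res).T
df.to_csv('my_result.csv', header=False)
```

This keeps the country names as the row index, which to_csv writes as the first field of each line.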

Try this:
[(k,) + tuple(res[k]) for k in res]
You will get a list of tuples like this, which you can write to a csv file:
[('Burundi', '0.051012487', '0.048311391', '0.046681908'), ('Japan', '9.605144835', '9.247590692', '9.542878595'), ('Qatar', '68.61994212', '59.03245947', '55.10905996')]
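Those tuples can then be written directly with the csv module; a sketch reusing the res dict from the question (the file name is just an example):

```python
import csv

res = {'Qatar': ['68.61994212', '59.03245947', '55.10905996'],
       'Burundi': ['0.051012487', '0.048311391', '0.046681908'],
       'Japan': ['9.605144835', '9.247590692', '9.542878595']}

# build one tuple per row: the key first, then its values
rows = [(k,) + tuple(res[k]) for k in res]

# writerows writes each tuple as one csv line
with open('result.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)
```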

Pandas will do it:
import pandas as pd
res = {'Qatar': ['68.61994212', '59.03245947', '55.10905996'],
       'Burundi': ['0.051012487', '0.048311391', '0.046681908'],
       'Japan': ['9.605144835', '9.247590692', '9.542878595']}
df = pd.DataFrame.from_dict(res, orient='index')
df.to_csv('res.csv', header=False)
Be sure to use orient='index' when creating the dataframe so that each dictionary key becomes a row in the csv:
Qatar,68.61994212,59.03245947,55.10905996
Burundi,0.051012487,0.048311391,0.046681908
Japan,9.605144835,9.247590692,9.542878595

Related

How do I give three separate lists three separate columns in a csv file?

I'm trying to output my three lists (the ones stored in word_storage) into a csv file, but they all get crammed into the first column of the spreadsheet. Is there a way to give each of the three lists its own column?
Suppose you have three lists like this,
a = list(range(1,4))
b = list(range(2,5))
c = list(range(3,6))
You can convert them into a csv file using this,
csv_text = ""  # initialise the accumulator (avoid naming it csv, which would shadow the module)
for i in range(len(a)):
    for lst in [a, b, c]:
        csv_text += f"{lst[i]},"
    csv_text = csv_text.rstrip(",")
    csv_text += "\n"

with open("test.csv", "w+") as f:
    f.write(csv_text)
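The same result can be had more directly with the csv module and zip, which pairs up the i-th elements of the lists; a sketch using the same a, b, c lists:

```python
import csv

a = list(range(1, 4))
b = list(range(2, 5))
c = list(range(3, 6))

# zip(a, b, c) yields one tuple per index: (1, 2, 3), (2, 3, 4), (3, 4, 5)
with open("test.csv", "w", newline="") as f:
    csv.writer(f).writerows(zip(a, b, c))
```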
Using pandas' to_csv function will be helpful here.
import pandas as pd

df = pd.DataFrame({'name': ['Raphael', 'Donatello'],
                   'mask': ['red', 'purple'],
                   'weapon': ['sai', 'bo staff']})
df.to_csv('output.csv', index=False)
The above example uses three lists and produces the header name,mask,weapon.

How to return a specific data structure with inner dictionary of lists

I have a csv file (image attached) and need to take the CSV file and create a dictionary of lists with the format "{method},{number},{orbital_period},{mass},{distance},{year}".
So far I have this code:
import csv

with open('exoplanets.csv') as inputfile:
    reader = csv.reader(inputfile)
    inputm = list(reader)
    print(inputm)
but my output is coming out like ['Radial Velocity', '1', '269.3', '7.1', '77.4', '2006']
when I want it to look like :
"Radial Velocity" : {"number":[1,1,1], "orbital_period":[269.3, 874.774, 763.0], "mass":[7.1, 2.21, 2.6], "distance":[77.4, 56.95, 19.84], "year":[2006.0, 2008.0, 2011.0] } , "Transit" : {"number":[1,1,1], "orbital_period":[1.5089557, 1.7429935, 4.2568], "mass":[], "distance":[200.0, 680.0], "year":[2008.0, 2008.0, 2008.0] }
Any ideas on how I can alter my code?
Hey SKR01, welcome to Stack Overflow!
I would suggest working with the pandas library. It is meant for table-like content such as you have there. What you are looking for is a groupby on your #method column.
import pandas as pd

def remove_index(row):
    d = row._asdict()
    del d["Index"]
    return d

df = pd.read_csv("https://docs.google.com/uc?export=download&id=1PnQzoefx-IiB3D5BKVOrcawoVFLIPVXQ")
{row.Index: remove_index(row) for row in df.groupby('#method').aggregate(list).itertuples()}
The only thing that remains is removing the nan values from the resulting dict.
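For that last remaining step, a sketch of stripping the NaN entries out of the nested dict (the {method: {column: [values]}} shape is assumed from the groupby result above):

```python
import math

def drop_nan(d):
    """Strip NaN entries from the lists of a {method: {column: [values]}} dict."""
    return {method: {col: [x for x in vals
                           if not (isinstance(x, float) and math.isnan(x))]
                     for col, vals in cols.items()}
            for method, cols in d.items()}
```

For example, drop_nan({'Transit': {'mass': [float('nan'), 2.1]}}) returns {'Transit': {'mass': [2.1]}}.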
If you don't want to use Pandas, maybe something like this is what you're looking for:
import csv

with open('exoplanets.csv') as inputfile:
    reader = csv.reader(inputfile)
    inputm = list(reader)

header = inputm.pop(0)
del header[0]  # probably you don't want "#method"

# create and populate the final dictionary
data = {}
for row in inputm:
    if row[0] not in data:
        data[row[0]] = {h: [] for h in header}
    for i, h in enumerate(header):
        data[row[0]][h].append(row[i + 1])

print(data)
This is a bit complex, and I'm questioning why you want the data this way, but this should get you the output format you want without requiring any external libraries like Pandas.
import csv

with open('exoplanets.csv') as input_file:
    rows = list(csv.DictReader(input_file))

# Create the data structure
methods = {d["#method"]: {} for d in rows}

# Get a list of fields, trimming off the method column
fields = list(rows[0])[1:]

# Fill in the data structure
for method in methods:
    methods[method] = {
        # Null-trimmed version of the list comprehension:
        # f: [r[f] for r in rows if r["#method"] == method and r[f]]
        f: [r[f] for r in rows if r["#method"] == method]
        for f in fields
    }
Note: This could be one multi-tiered list/dict comprehension, but I've broken it apart for clarity.

Csv to json by the same key-python

I have a big csv file (aprx. 1GB) that I want to convert to a json file in the following way:
the csv file has the following structure:
header: tid;inkey;outkey;value
values:
tid1;inkey1;outkey1;value1
tid1;inkey2;outkey2;value2
tid2;inkey2;outkey3;value2
tid2;inkey4;outkey3;value2
etc.
The idea is to convert this csv to a json with the following structure, basically to group everything by "tid":
{
    "tid1": {
        "inkeys": ["inkey1", "inkey2"],
        "outkeys": ["outkey1", "outkey2"]
    }
}
I can imagine how to do it with normal python dicts and lists, but my problem is also the huge amount of data that I have to process. I suppose pandas can help here, but I am still very confused by this tool.
I think this should be straightforward to do with standard Python data structures such as defaultdict. Unless you have very limited memory, I see no reason why a 1 GB file would be problematic using a straightforward approach.
Something like (did not test):
from collections import defaultdict
import csv
import json

out_data = defaultdict(lambda: {"inkeys": [], "outkeys": [], "values": []})

with open("your-file.csv") as f:
    reader = csv.reader(f, delimiter=";")
    next(reader)  # skip the header row
    for line in reader:
        tid, inkey, outkey, value = line
        out_data[tid]["inkeys"].append(inkey)
        out_data[tid]["outkeys"].append(outkey)
        out_data[tid]["values"].append(value)

print(json.dumps(out_data))
There might be a faster or more memory efficient way to do it with Pandas or others, but simplicity and zero dependencies go a long way.
First you need to use pandas and read your csv into a dataframe. Say the csv is saved in a file called my_file.csv; then you call
import pandas as pd

my_df = pd.read_csv('my_file.csv', sep=';')
Then you need to convert this dataframe to the form that you specified. The following call will convert it to a dict with the specified structure:
my_json = dict(my_df.set_index('tid').groupby(level=0).apply(lambda x: x.to_json(orient='records')))
Now you can export it to a json file if you want
import json
with open('my_json.json', 'w') as outfile:
json.dump(my_json, outfile)
You can use Pandas with groupby and a dictionary comprehension:
from io import StringIO
import pandas as pd
mystr = StringIO("""tid1;inkey1;outkey1;value1
tid1;inkey2;outkey2;value2
tid2;inkey2;outkey3;value2
tid2;inkey4;outkey3;value2""")
# replace mystr with 'file.csv'
df = pd.read_csv(mystr, sep=';', header=None,
                 names=['tid', 'inkeys', 'outkeys', 'values'], index_col=0)
# group by the tid index
grouper = df.groupby(level=0)
# nested dictionary comprehension with selected columns
res = {k: {col: v[col].tolist() for col in ('inkeys', 'outkeys')} for k, v in grouper}
print(res)
{'tid1': {'inkeys': ['inkey1', 'inkey2'], 'outkeys': ['outkey1', 'outkey2']},
 'tid2': {'inkeys': ['inkey2', 'inkey4'], 'outkeys': ['outkey3', 'outkey3']}}
Similar to the other defaultdict() answer:
from collections import defaultdict

d = defaultdict(lambda: defaultdict(list))

with open('file.txt') as in_file:
    next(in_file)  # skip the header row
    for line in in_file:
        tid, inkey, outkey, value = line.strip().split(';')
        d[tid]['inkeys'].append(inkey)
        d[tid]['outkeys'].append(outkey)
        d[tid]['values'].append(value)
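To finish the conversion the question asks for, the resulting defaultdict can be serialized with the json module; a minimal sketch with hand-filled sample data:

```python
import json
from collections import defaultdict

d = defaultdict(lambda: defaultdict(list))
d['tid1']['inkeys'].append('inkey1')
d['tid1']['outkeys'].append('outkey1')

# defaultdict is a dict subclass, so json.dumps handles it directly
print(json.dumps(d))
```

For a large input, json.dump(d, out_file) writes straight to a file handle instead of building the whole string in memory.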

Convert this list of lists in CSV

I am a novice in Python, and after several searches about how to convert my list of lists into a CSV file, I didn't find how to fix my issue.
Here is my code :
#!C:\Python27\read_and_convert_txt.py
import csv

if __name__ == '__main__':
    with open('c:/python27/mytxt.txt', "r") as t:
        lines = t.readlines()
    list = [line.split() for line in lines]
    with open('c:/python27/myfile.csv', 'w') as f:
        writer = csv.writer(f)
        for sublist in list:
            writer.writerow(sublist)
The first open() will create a list of lists from the txt file like
list = [["hello","world"], ["my","name","is","bob"], .... , ["good","morning"]]
then the second part will write the list of lists into a csv file but only in the first column.
What I need is from this list of lists to write it into a csv file like this :
Column 1, Column 2, Column 3, Column 4 ......
hello world
my name is bob
good morning
To sum up, when I open the csv file in a text editor:
hello;world
my;name;is;bob
good;morning
Simply use a pandas dataframe:
import pandas as pd
df = pd.DataFrame(list)
df.to_csv('filename.csv')
By default missing values will be filled in with None; to replace None use
df.fillna('', inplace=True)
So your final code should be like
import pandas as pd
df = pd.DataFrame(list)
df.fillna('', inplace=True)
df.to_csv('filename.csv')
Cheers!!!
Note: You should not use list as a variable name, as it shadows the built-in list type in Python.
I do not know if this is what you want:
import csv

list = [["hello", "world"], ["my", "name", "is", "bob"], ["good", "morning"]]

with open("d:/test.csv", "w") as f:
    writer = csv.writer(f, delimiter=";")
    writer.writerows(list)
Gives as output file:
hello;world
my;name;is;bob
good;morning

save a list of different Dataframes to json

I have different pandas dataframes, which I put in a list.
I want to save this list in json (or any other format) which can be read by R.
import pandas as pd

def create_df_predictions(extra_periods):
    """
    Make an empty df for predictions.
    params: extra_periods = how many predictions in the future the user wants
    """
    df = pd.DataFrame({'model': ['a'], 'name_id': ['a']})
    for col in range(1, extra_periods + 1):
        name_col = 'forecast' + str(col)
        df[name_col] = 0
    return df

df1 = create_df_predictions(9)
df2 = create_df_predictions(12)
list_df = [df1, df2]
The question is how to save list_df in a readable format for R? Note that df1 and df2 have a different number of columns!
I don't know pandas DataFrames in detail, so maybe this won't work. But in case they behave like a traditional dict, you should be able to use the json module.
df1 = create_df_predictions(9)
df2 = create_df_predictions(12)
list_df = [df1, df2]
You can write it to a file, using json.dumps(list_df), which will convert your list of dicts to a valid json representation.
import json
with open("my_file", 'w') as outfile:
outfile.write(json.dumps(list_df))
Edit: as commented by DaveR, dataframes aren't JSON-serializable. You can convert each one to a dict and then dump the list to json.
import json
with open("my_file", 'w') as outfile:
outfile.write(json.dumps([df.to_dict() for df in list_df]))
Alternatively pd.DataFrame and pd.Series have a to_json() method, maybe have a look at those as well.
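For reading the result back in R (e.g. with jsonlite), orient='records' tends to be the easiest shape; a sketch with a stand-in frame (the real ones would come from create_df_predictions):

```python
import pandas as pd

# stand-in for one of the prediction frames
df = pd.DataFrame({'model': ['a'], 'forecast1': [0]})

# orient='records' emits a list of row objects, one dict per row
print(df.to_json(orient='records'))
```

which prints [{"model":"a","forecast1":0}].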
To export the list of DataFrames to a single json file, you can serialize each DataFrame with to_json() and wrap the results in one json array:
import json

json_output = json.dumps([json.loads(df.to_json()) for df in list_df])

with open("output.txt", 'w') as outfile:
    outfile.write(json_output)
This exports the full dataset as a single json string and writes it to a file.
