I have this data in .csv format:
I want to convert it into .json format like this:
[
    {
        "title": "view3",
        "sharedWithOrganization": false,
        "sharedWithUsers": [
            "81241",
            "81242",
            "81245",
            "81265"
        ],
        "filters": [{"field": "Account ID", "comparator": "==", "value": "prod"}]
    },
    {
        "title": "view3",
        "sharedWithOrganization": true,
        "sharedWithUsers": [],
        "filters": [{"field": "Environment_AG", "comparator": "=#", "value": "Development"}]
    }
]
Below are the conversions for the comparator:
'equals' means '=='
'not equal' means '!='
'contains' means '=#'
'does not contain' means '!=#'
Can you please help me convert the .csv to .json? I am unable to do the conversion using Python.
Here is what I would do, without giving you the complete answer (doing it yourself is better for learning).
First: create an object containing your information:
class View():
    def __init__(self, title, field, comparator, value, sharedWithOrganization,
                 user1, user2, user3, user4, user5, user6):
        self.title = title
        self.field = field
        self.comparator = comparator
        self.value = value
        self.sharedWithOrganization = sharedWithOrganization
        self.user1 = user1
        # ... and so on for user2 through user5 ...
        self.user6 = user6
Then I would load the CSV, create an object for each line, and store them in a dict with the following structure:
loadedCsv = { "your line title (e.g. view3)": [list of all the objects with the title view3] }
Yes, with this approach the title parameter is duplicated; you can choose to remove it from the object.
When this is done, I would, for each title in my dictionary, get all the elements I need and format them as JSON using import json (see the Python documentation: https://docs.python.org/3/library/json.html), as sketched below.
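For the grouping step, something along these lines would work (just a rough sketch; it assumes the CSV has a title column, which is taken from the desired JSON above, and reuses the test.csv filename from the answer below):

import csv
from collections import defaultdict

loadedCsv = defaultdict(list)
with open("test.csv", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        # every row is filed under its title, e.g. loadedCsv["view3"]
        loadedCsv[row["title"]].append(row)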
Here I'm posting my take on your question; I hope you and others will find it helpful.
But I'd encourage you to try it yourself first.
import csv
import json

def csv_to_json(csvFilePath, jsonFilePath):
    jsonArray = []
    with open(csvFilePath, encoding='utf-8') as csvf:
        csvReader = csv.DictReader(csvf)
        for row in csvReader:
            # Map the human-readable comparator names onto their symbols
            if row["comparator"] == "equals":
                row["comparator"] = "=="
            elif row["comparator"] == "not equal":
                row["comparator"] = "!="
            elif row["comparator"] == "contains":
                row["comparator"] = "=#"
            elif row["comparator"] == "does not contain":
                row["comparator"] = "!=#"
            final_data = {
                "title": row["title"],
                # bool("false") is True, so compare against the string instead
                "sharedWithOrganization": row["sharedWithOrganization"].strip().lower() == "true",
                # keep only the user columns that actually contain a value
                "sharedWithUsers": [
                    row[user] for user in
                    ("user1", "user2", "user3", "user4", "user5", "user6")
                    if row[user]
                ],
                "filters": [{"field": row["field"], "comparator": row["comparator"], "value": row["value"]}]
            }
            jsonArray.append(final_data)

    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
        jsonString = json.dumps(jsonArray, indent=4)
        jsonf.write(jsonString)

csvFilePath = r'test.csv'
jsonFilePath = r'test11.json'
csv_to_json(csvFilePath, jsonFilePath)
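As an aside, the if/elif chain above could also be driven by a plain lookup table built from the conversion list in the question (just a sketch; either style works):

COMPARATORS = {
    "equals": "==",
    "not equal": "!=",
    "contains": "=#",
    "does not contain": "!=#",
}

# inside the loop, leaving the value unchanged if it is already a symbol:
# row["comparator"] = COMPARATORS.get(row["comparator"], row["comparator"])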
I have a Python script that takes some data from an Excel file and exports it to a JSON file.
import json
from collections import OrderedDict
from itertools import islice
from openpyxl import load_workbook

# use a raw string so the backslash in the path is not treated as an escape
wb = load_workbook(r'E:\test.xlsx')
sheet = wb['Sheet1']

deviceData_list = []
# skip the header row, then read one device per row
for row in islice(sheet.values, 1, sheet.max_row):
    deviceData = OrderedDict()
    deviceData['hostname'] = row[2]
    deviceData['ip'] = row[7]
    deviceData['username'] = row[13]
    deviceData['password'] = row[15]
    deviceData['secret'] = row[9]
    deviceData_list.append(deviceData)

j = json.dumps(deviceData_list)
print(j)

with open('data.json', 'w') as f:
    f.write(j)
It outputs a JSON file like this:
[{"hostname": "sw1", "ip": "8.8.8.8", "username": "contoso", "password": "contoso", "secret": "contoso"}, {"hostname": "sw2", "ip": "8.8.8.9", "username": "contoso", "password": "contoso2", "secret": "contoso2"}]
and what I would like is to make it look like this:
{"switch_list": [{"hostname": "sw1","ip": "8.8.8.8","username": "contoso","password": "contoso","secret": "contoso"},{"hostname": "sw2","ip": "8.8.8.9","username": "contoso","password": "contoso2","secret": "contoso2"}]}
So basically I need to put { "switch_list": in front of the current output and ]} at the end, but every idea I tried gave me a different result. I figured out two places to do it: either before json.dumps, or by editing the JSON file after it is created, but I don't know what to target since "switch_list" sits outside the list :) This also means I'm a bit of a dummy regarding Python, or programming in general :) Any help is appreciated; I won't post what I tried since it was useless. This is also my first post here, so please forgive any silliness. Cheers
Instead of:

j = json.dumps(deviceData_list)

do:

output = {"switch_list": deviceData_list}
j = json.dumps(output)
This creates a new dictionary where the only key is switch_list and its contents are your existing list. Then you dump that data.
Change
j = json.dumps(deviceData_list)
to something like:
j = json.dumps({"switch_list": deviceData_list})
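If it helps to see the whole step in isolation, here is a minimal standalone sketch (the sample entries are made up to mirror the output shown in the question):

import json

deviceData_list = [
    {"hostname": "sw1", "ip": "8.8.8.8"},
    {"hostname": "sw2", "ip": "8.8.8.9"},
]

# wrap the list in a dictionary before dumping
j = json.dumps({"switch_list": deviceData_list})
print(j)  # {"switch_list": [{"hostname": "sw1", ...}, {"hostname": "sw2", ...}]}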
Okay so I have a spreadsheet and I want to put all the entries into my nested dictionary lists.
I decided to use two for loops to iterate through the spreadsheet and save each cell value into the corresponding nested dictionary.
Here is my code (I know it's messy, I'm pretty inexperienced):
def SpGetLink():
    global SpDic
    for row in SpDicGen():
        Data = dict.fromkeys(SpDic.values(), {})
        Data[SpDic[row]]["Link"] = []
        Data[SpDic[row]]["Title"] = []
        for col in range(1000):
            if ws.cell(row=row, column=col + 6).hyperlink is not None:
                data = str(ws.cell(row=row, column=col + 6).hyperlink.target)
                if data.startswith("http"):
                    if data not in Data[SpDic[row]]["Link"]:
                        Data[SpDic[row]]["Link"].append(data)
                        json.dump(Data, open("Data.json", "w+"), indent=4)  # , sort_keys=True)
                else:
                    Data[SpDic[row]]["Title"].append(data)
SpDic is a separate dictionary that maps each row to its corresponding name.
Now my problem is the following.
When I open Data.json, every list that should contain the links from its corresponding row instead contains the same 5 links, which are the last 5 links in the spreadsheet. It looks something like this:
"smile": {
"Link": [
"https://media.giphy.com/media/k7J8aS3xpmhpK/giphy.gif",
"https://media.giphy.com/media/aY1HMl4E1Ju1y/giphy.gif",
"https://media.giphy.com/media/RLJxQtX8Hs7XytaoyX/giphy.gif",
"https://media.giphy.com/media/1448TKNMMg4BFu/giphy.gif",
"https://media.giphy.com/media/b7l5cvG94cqo8/giphy.gif"
],
"Title": []
},
"grin": {
"Link": [
"https://media.giphy.com/media/k7J8aS3xpmhpK/giphy.gif",
"https://media.giphy.com/media/aY1HMl4E1Ju1y/giphy.gif",
"https://media.giphy.com/media/RLJxQtX8Hs7XytaoyX/giphy.gif",
"https://media.giphy.com/media/1448TKNMMg4BFu/giphy.gif",
"https://media.giphy.com/media/b7l5cvG94cqo8/giphy.gif"
],
"Title": []
},
"laugh": {
"Link": [
"https://media.giphy.com/media/k7J8aS3xpmhpK/giphy.gif",
"https://media.giphy.com/media/aY1HMl4E1Ju1y/giphy.gif",
"https://media.giphy.com/media/RLJxQtX8Hs7XytaoyX/giphy.gif",
"https://media.giphy.com/media/1448TKNMMg4BFu/giphy.gif",
"https://media.giphy.com/media/b7l5cvG94cqo8/giphy.gif"
],
"Title": []
},
Does anyone have an idea why this is happening and how to fix it?
I think there are two things going on. The main reason every name ends up with the same links is that dict.fromkeys(SpDic.values(), {}) makes every key share one single dictionary, and Data is rebuilt from scratch on every row. On top of that, you are dumping the JSON inside the for loop, so the file only ever reflects whatever was processed last. Build Data once before the loop, give each key its own lists, and dump once after the loops have finished:
def SpGetLink():
    global SpDic
    # build Data once, giving every name its own "Link"/"Title" lists
    Data = {name: {"Link": [], "Title": []} for name in SpDic.values()}
    for row in SpDicGen():
        for col in range(1000):
            cell = ws.cell(row=row, column=col + 6)
            if cell.hyperlink is not None:
                data = str(cell.hyperlink.target)
                if data.startswith("http"):
                    if data not in Data[SpDic[row]]["Link"]:
                        Data[SpDic[row]]["Link"].append(data)
                else:
                    Data[SpDic[row]]["Title"].append(data)
    # dump once, after both loops have finished
    with open("Data.json", "w") as f:
        json.dump(Data, f, indent=4)
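If the shared-dictionary part is surprising, here is a tiny standalone illustration (nothing to do with the spreadsheet code itself; the names are borrowed from the output above):

# dict.fromkeys with a mutable default makes every key point at the same dict
shared = dict.fromkeys(["smile", "grin"], {})
shared["smile"]["Link"] = ["http://example.com"]
print(shared["grin"])    # {'Link': ['http://example.com']} -- both keys share one dict

# a dict comprehension builds a fresh inner dict for each key
separate = {name: {"Link": [], "Title": []} for name in ["smile", "grin"]}
separate["smile"]["Link"].append("http://example.com")
print(separate["grin"])  # {'Link': [], 'Title': []} -- each key has its own dict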
I fixed it by just making one of the for loops into a function and calling it in another for loop.
I have this method that writes JSON data to a file. The title is based on books, and the data is the book publisher, date, author, etc. The method works fine if I only want to add one book.
Code
import json

def createJson(title, firstName, lastName, date, pageCount, publisher):
    print "\n*** Inside createJson method for " + title + "***\n"
    data = {}
    data[title] = []
    data[title].append({
        'firstName:', firstName,
        'lastName:', lastName,
        'date:', date,
        'pageCount:', pageCount,
        'publisher:', publisher
    })
    with open('data.json', 'a') as outfile:
        json.dump(data, outfile, default=set_default)

def set_default(obj):
    if isinstance(obj, set):
        return list(obj)

if __name__ == '__main__':
    createJson("stephen-king-it", "stephen", "king", "1971", "233", "Viking Press")
JSON File with one book/one method call
{
"stephen-king-it": [
["pageCount:233", "publisher:Viking Press", "firstName:stephen", "date:1971", "lastName:king"]
]
}
However, if I call the method multiple times to add more book data to the JSON file, the format comes out wrong. For instance, if I simply call the method twice with a main method of

if __name__ == '__main__':
    createJson("stephen-king-it", "stephen", "king", "1971", "233", "Viking Press")
    createJson("william-golding-lord of the flies", "william", "golding", "1944", "134", "Penguin Books")

my JSON file looks like this:
{
"stephen-king-it": [
["pageCount:233", "publisher:Viking Press", "firstName:stephen", "date:1971", "lastName:king"]
]
} {
"william-golding-lord of the flies": [
["pageCount:134", "publisher:Penguin Books", "firstName:william","lastName:golding", "date:1944"]
]
}
Which is obviously wrong. Is there a simple fix to my method that produces correct JSON? I looked at many simple examples online of writing JSON data from Python, but all of them gave me format errors when I checked them on JSONLint.com. I have been racking my brain and editing the file by hand to fix this problem, but all my efforts were to no avail. Any help is appreciated. Thank you very much.
Simply appending new objects to your file doesn't create valid JSON. You need to add your new data inside the top-level object, then rewrite the entire file.
This should work:
def createJson(title, firstName, lastName, date, pageCount, publisher):
    print "\n*** Inside createJson method for " + title + "***\n"
    # Load any existing JSON data, or start from an empty object.
    # IOError: data.json doesn't exist yet; ValueError: it is empty or not valid JSON.
    try:
        with open('data.json') as infile:
            data = json.load(infile)
    except (IOError, ValueError):
        data = {}
    data[title] = []
    data[title].append({
        'firstName:', firstName,
        'lastName:', lastName,
        'date:', date,
        'pageCount:', pageCount,
        'publisher:', publisher
    })
    # rewrite the whole file with the updated top-level object
    with open('data.json', 'w') as outfile:
        json.dump(data, outfile, default=set_default)
A JSON document can be either an array or an object (dictionary) at the top level. In your case the file ends up with two separate objects, one with the key stephen-king-it and another with william-golding-lord of the flies. Either of these on its own would be fine, but the way they are concatenated is invalid.
Using an array you could do this:
[
{ "stephen-king-it": [] },
{ "william-golding-lord of the flies": [] }
]
Or a dictionary style format (I would recommend this):
{
"stephen-king-it": [],
"william-golding-lord of the flies": []
}
Also, the data you are appending is actually a set literal (note the commas instead of colons), so the key/value pairing is lost. It should be formatted as key-value pairs in a dictionary (which would be ideal). You need to change it to this:

data[title].append({
    'firstName': firstName,
    'lastName': lastName,
    'date': date,
    'pageCount': pageCount,
    'publisher': publisher
})
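To see the difference between the two literal forms in isolation, here is a tiny standalone check (not part of either answer above):

# Commas between the items build a *set* of strings; the key/value pairing is lost:
book = {'firstName:', 'stephen', 'lastName:', 'king'}
print(type(book))   # set

# Colons build a dict, which json.dumps can serialize as a proper JSON object:
book = {'firstName': 'stephen', 'lastName': 'king'}
print(type(book))   # dict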
When I run my query below and some of the fields such as ["VT","NCR","N","DT","RD"] have no data, the script fails with the error message:
ValueError: dict contains fields not in fieldnames: 'VT'
Is there a way to keep it running when some of the fields have no data, so it still grabs the data for the fields that do exist? For example with a try/except/pass approach?
I have been struggling with this for days; could someone show me how to do this?
My Code:
from datetime import datetime
from elasticsearch import Elasticsearch
import csv

es = Elasticsearch(["9200"])

res = es.search(index="search", body={
    "_source": ["VT", "NCR", "N", "DT", "RD"],
    "query": {
        "bool": {
            "must": [
                {"range": {"VT": {"gte": "now/d", "lte": "now+1d/d"}}},
                {"wildcard": {"user": "mike*"}}
            ]
        }
    }
}, size=10)

csv_file = 'File_' + str(datetime.now().strftime('%Y_%m_%d - %H.%M.%S')) + '.csv'
header_names = {'VT': 'Date', 'NCR': 'ExTime', 'N': 'Name', 'DT': 'Party', 'RD': 'Period'}

with open(csv_file, 'w', newline='') as f:
    header_present = False
    for doc in res['hits']['hits']:
        my_dict = doc['_source']
        if not header_present:
            w = csv.DictWriter(f, my_dict.keys())
            # will write DATE, TIME, ... in correct place
            w.writerow(header_names,)
            header_present = True
        w.writerow(my_dict)
I'd like to point out a flaw in your comment
# will write DATE, TIME, ... in correct place
w.writerow(header_names,)
Actually, it writes the values of that dictionary as an ordinary data row under the columns named by its keys, so you're essentially writing your headers out as a row of data.
Regarding the error: according to the documentation, you can tell DictWriter to ignore unexpected fields and to supply a default value for missing ones.
The optional restval parameter specifies the value to be written if the dictionary is missing a key in fieldnames. If the dictionary passed to the writerow() method contains a key not found in fieldnames, the optional extrasaction parameter indicates what action to take. If it is set to 'raise', the default value, a ValueError is raised. If it is set to 'ignore', extra values in the dictionary are ignored.
For example
with open(csv_file, 'w', newline='') as f:
    # Open one csv for all the results
    w = csv.DictWriter(f, fieldnames=header_names.keys(), restval='', extrasaction='ignore')
    # There's only one header, so no need for a boolean flag
    w.writeheader()
    # proceed to write results
    for doc in res['hits']['hits']:
        my_dict = doc['_source']
        # Parse this dictionary however you need to write a valid CSV row
        w.writerow(my_dict)
Otherwise, don't use a DictWriter and build each CSV row yourself. You can use dict.get() to extract values, supplying a default for any key that doesn't exist in the data, as in the sketch below.
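A rough sketch of that alternative, reusing the res, header_names, and csv_file names from above:

with open(csv_file, 'w', newline='') as f:
    w = csv.writer(f)
    # write the friendly column names once as the header row
    w.writerow(header_names.values())
    for doc in res['hits']['hits']:
        my_dict = doc['_source']
        # dict.get() falls back to '' for any field missing from this hit
        w.writerow([my_dict.get(field, '') for field in header_names])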
I want to extract the 'avail' value from JSON output that looks like this:
{
    "result": {
        "code": 100,
        "message": "Command Successful"
    },
    "domains": {
        "yolotaxpayers.com": {
            "avail": false,
            "tld": "com",
            "price": "49.95",
            "premium": false,
            "backorder": true
        }
    }
}
The problem is that the ['avail'] value is under ["domains"]["domain_name"] and I can't figure out how to get the domain name.
My spider is below. The first part works fine, but not the second one.
import scrapy
import json
from whois.items import WhoisItem

class whoislistSpider(scrapy.Spider):
    name = "whois_list"
    start_urls = []

    f = open('test.txt', 'r')
    global lines
    lines = f.read().splitlines()
    f.close()

    def __init__(self):
        for line in lines:
            self.start_urls.append('http://www.example.com/api/domain/check/%s/com' % line)
    def parse(self, response):
        for line in lines:
            jsonresponse = json.loads(response.body_as_unicode())
            item = WhoisItem()
            item["avail"] = jsonresponse["domains"]['%s.com' % line]["avail"]
            item["domain"] = line
            yield item
Thank you in advance for your replies.
Currently, it tries to get the value by the ('%s.com' % line) key, but the line read from the file may still carry surrounding whitespace, so the key doesn't match. You need to do the string formatting correctly:
domain_name = "%s.com" % line.strip()
item["avail"] = jsonresponse["domains"][domain_name]["avail"]
Assuming you are only expecting one result per response:
domain_name = list(jsonresponse['domains'].keys())[0]
item["avail"] = jsonresponse["domains"][domain_name]["avail"]
This will work even if there is a mismatch between the domain in the file "test.txt" and the domain in the result.
To get the domain name from the above JSON response you can use a list comprehension, e.g.:
domain_name = [x for x in jsonresponse["domains"].keys()]
To get the "avail" values, use the same approach, e.g.:
avail = [x["avail"] for x in jsonresponse["domains"].values() if "avail" in x]
To get the values as plain strings, index the results with [0] (e.g. domain_name[0] and avail[0]), because a list comprehension stores its results in a list.
More info on list comprehension
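For anyone who wants to try the key-lookup approach outside the spider, here is a small self-contained check against the sample response from the question:

import json

sample = '''{
    "result": {"code": 100, "message": "Command Successful"},
    "domains": {
        "yolotaxpayers.com": {"avail": false, "tld": "com", "price": "49.95",
                              "premium": false, "backorder": true}
    }
}'''

jsonresponse = json.loads(sample)
domain_name = list(jsonresponse["domains"].keys())[0]
print(domain_name)                                    # yolotaxpayers.com
print(jsonresponse["domains"][domain_name]["avail"])  # False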