How can I fill my JSON's keys automatically? - Python

I have a JSON template that I would like to autofill.
Currently I do it manually, like this:
for i in range(len(self.dict_trie_csv_infos)):
    self.template_json[i]['name'] = self.dict_trie_csv_infos[i]['name']
    self.template_json[i]['title'] = self.dict_trie_csv_infos[i]['title']
    self.template_json[i]['startDateTime'] = self.dict_trie_csv_infos[i]['startDateTime']
    self.template_json[i]['endDateTime'] = self.dict_trie_csv_infos[i]['endDateTime']
    self.template_json[i]['address'] = self.dict_trie_csv_infos[i]['address']
    self.template_json[i]['locationName'] = self.dict_trie_csv_infos[i]['locationName']
    self.template_json[i]['totalTicketsCount'] = self.dict_trie_csv_infos[i]['totalTicketsCount']
    # ...etc.
As you can see, both sides use the same key names.
I tried to do it with a loop, but it didn't work.
How can I fill it automatically (is it possible)?
Thanks for your answer.

If the key names match, you could loop over the keys of each record:
for i in range(len(self.dict_trie_csv_infos)):
    for key in self.dict_trie_csv_infos[i]:
        self.template_json[i][key] = self.dict_trie_csv_infos[i][key]
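A minimal sketch of the same idea outside the class, written for Python 3. It assumes the template and the CSV data are both lists of dicts of equal length; the `fill_template` name and the sample data are made up for illustration. It copies only the keys the template entry already declares, so extra CSV columns do not leak into the output.

```python
def fill_template(template, records):
    """Copy values from each record into the template entry at the same index.

    Only keys already present in the template entry are copied.
    """
    for entry, record in zip(template, records):
        for key in entry:
            if key in record:
                entry[key] = record[key]
    return template


# Hypothetical sample data standing in for the real template and CSV rows.
template_json = [{"name": None, "title": None, "address": None}]
csv_infos = [
    {"name": "Expo", "title": "Opening night", "address": "1 Main St", "extra": "ignored"}
]

filled = fill_template(template_json, csv_infos)
print(filled)
```

If every key of the record should be copied regardless of the template, `entry.update(record)` would do the same job in one call.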

Related

How to update a field with bibtexparser?

I want to read a .bib file (which I will be downloading frequently), go through all of its entries and, wherever one of a predefined set of fields is missing from an entry, add that field with some static information and then update the file.
For example, if I have a file input.bib like this:
@article{ ISI:000361215300002,
  Abstract = {{some abstract}},
  Year = {{2016}},
  Volume = {{47}}
}
Then I would execute something like this:
python code.py < input.bib > input.bib
And inside the code.py, I want to do something like:
import sys
import bibtexparser
def populateKey(data):
    for entry in data.entries:
        if 'key' not in entry:
            entry['key'] = KEY_VALUE
bib_str = ""
for line in sys.stdin:
    bib_str += line
bib_data = bibtexparser.loads(bib_str)
populateKey(bib_data)
bibtex_str = bibtexparser.dumps(bib_data)
print bibtex_str
After I execute above code, I am getting an output that looks like:
@article{ ISI:000361215300002,
  abstract = {some abstract},
  key = value
  volume = {47},
  year = {2016},
}
The bibtex module is corrupting my format in that it makes everything lowercase and removes redundant brackets and jumbles the fields. Is there a way to not overwrite the file and just add a specific field wherever the specific field is not present?

How to check whether many variables are not null in Python?

I'm writing a Python scraper for OpenData and I have one question: how do I check whether every value is filled in on the site and, if a value is missing, set it to null?
My scraper is here.
I'm currently working on optimizing it.
My variables currently look like this:
evcisloval = soup.find_all('td')[3].text.strip()
prinalezival = soup.find_all('td')[5].text.strip()
popisfaplnenia = soup.find_all('td')[7].text.replace('\"', '')
hodnotafaplnenia = soup.find_all('td')[9].text[:-1].replace(",", ".").replace(" ", "")
datumdfa = soup.find_all('td')[11].text
datumzfa = soup.find_all('td')[13].text
formazaplatenia = soup.find_all('td')[15].text
obchmenonazov = soup.find_all('td')[17].text
sidlofirmy = soup.find_all('td')[19].text
pravnaforma = soup.find_all('td')[21].text
sudregistracie = soup.find_all('td')[23].text
ico = soup.find_all('td')[25].text
dic = soup.find_all('td')[27].text
cislouctu = soup.find_all('td')[29].text
And the output:
scraperwiki.sqlite.save(unique_keys=["invoice_id"],
    data={"invoice_id": number,
          "invoice_price": hodnotafaplnenia,
          "evidence_no": evcisloval,
          "paired_with": prinalezival,
          "invoice_desc": popisfaplnenia,
          "date_received": datumdfa,
          "date_payment": datumzfa,
          "pay_form": formazaplatenia,
          "trade_name": obchmenonazov,
          "trade_form": pravnaforma,
          "company_location": sidlofirmy,
          "court": sudregistracie,
          "ico": ico,
          "dic": dic,
          "accout_no": cislouctu,
          "invoice_attachment": urlfa,
          "invoice_url": url})
I googled it but without success.
First, write a configuration dict of your variables in the form:
conf = {'evidence_no': (3, str.strip),
        'trade_form': (21, None),
        ...}
That is, each key is the output key, and each value is a tuple of the index into soup.find_all('td') and an optional function to apply to the result (None if no function is needed). You don't need those Slavic variable names that may confuse other SO members.
Then iterate over conf and fill the data dict.
Also, run soup.find_all('td') once, before the loop.
tds = soup.find_all('td')
data = {}
for name, (num, func) in conf.iteritems():
    text = tds[num].text
    # replace text with None or "NULL" or whatever if needed
    ...
    if func is None:
        data[name] = text
    else:
        data[name] = func(text)
This removes a lot of duplicated code and makes the scraper easier to maintain.
Also, I am not sure the strings "NULL" are the best way to write missing data. Doesn't sqlite support Python's real None objects?
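To make the configuration-dict idea concrete, here is a minimal sketch in Python 3 (where `items()` replaces `iteritems()`). It assumes the cell texts have already been pulled out of the page into a plain list. The `cells` sample, the `extract` helper, and the choice of `None` for empty cells are illustrative assumptions, not part of the original scraper.

```python
def extract(cells, conf):
    """Build the output dict from a list of cell texts and a config mapping.

    conf maps output key -> (cell index, optional cleanup function).
    Empty results are stored as None instead of an empty string.
    """
    data = {}
    for name, (num, func) in conf.items():
        text = cells[num]
        if func is not None:
            text = func(text)
        data[name] = text if text else None
    return data


# Hypothetical cell texts; in the scraper they would come from
# [td.text for td in soup.find_all('td')].
cells = ["label", "  2014/001 ", "label", ""]

conf = {
    "evidence_no": (1, str.strip),
    "paired_with": (3, None),
}

data = extract(cells, conf)
print(data)
```

Adding a field then means adding one line to `conf` rather than a new variable plus a new entry in the output dict.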
I just read your attached link, and it seems what you want is:
evcisloval = soup.find_all('td')[3].text.strip() or "NULL"
But be careful: you should only do this with strings. If the part before "or" is falsy (an empty string, False, None, or 0), it will be replaced with "NULL".
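A short illustration of that caveat, written for Python 3. The `or_null` helper name is made up for this sketch; it simply wraps the `value or "NULL"` expression from the answer.

```python
def or_null(value):
    """Return value unless it is falsy, in which case return the string "NULL"."""
    return value or "NULL"


print(or_null("FA-2014"))  # a non-empty string is kept
print(or_null(""))         # an empty string becomes "NULL"
print(or_null(0))          # 0 is falsy too, so it also becomes "NULL"
print(or_null(None))       # None becomes "NULL"
```

The third call is the one to watch: a legitimate numeric 0 is lost, which is why the idiom is only safe for strings.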

How to check for blank fields in input json data in python?

Suppose that in the Python function below I receive JSON feeds like this:
def mapper_1(self, key, line):
    j_feed = json.loads(line)
    unicoded = j_feed[u'category_description'].encode("utf-8")
    cn = j_feed[u'categoryname']
    location = j_feed[u'location']
How can I check whether any of the categoryname/category_description/location fields are blank in the data from input.json?
If you are unsure whether a field is present, you can use .get and give it a default sentinel value:
fields = ['categoryname', 'category_description', 'location']
for field in fields:
    print j_feed.get(field, "not set!")
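Note that `.get` with a default only covers keys that are missing altogether; a key that is present but holds an empty string or `null` still comes back as-is. Here is a minimal sketch in Python 3 that reports both cases, assuming the field names from the question (the `blank_fields` helper and the sample line are made up for illustration):

```python
import json


def blank_fields(feed, fields):
    """Return the names of fields that are missing, None, or whitespace-only."""
    blank = []
    for field in fields:
        value = feed.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            blank.append(field)
    return blank


# Hypothetical input line: one field is blank, one field is absent.
line = '{"categoryname": "books", "category_description": "  "}'
fields = ["categoryname", "category_description", "location"]

result = blank_fields(json.loads(line), fields)
print(result)
```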

Using Keys as Variables in Python

There is probably a term for what I'm attempting to do, but it escapes me. I'm using peewee to set some values in a class, and want to iterate through a list of keys and values to generate the command to store the values.
Not all 'collections' contain each of the values within the class, so I want to just include the ones that are contained within my data set. This is how far I've made it:
for value in result['response']['docs']:
    for keys in value:
        print keys, value[keys]  # keys are "identifier", "title", "language"
#for value in result['response']['docs']:
#    collection = Collection(
#        identifier = value['identifier'],
#        title = value['title'],
#        language = value['language'],
#        mediatype = value['mediatype'],
#        description = value['description'],
#        subject = value['subject'],
#        collection = value['collection'],
#        avg_rating = value['avg_rating'],
#        downloads = value['downloads'],
#        num_reviews = value['num_reviews'],
#        creator = value['creator'],
#        format = value['format'],
#        licenseurl = value['licenseurl'],
#        publisher = value['publisher'],
#        uploader = value['uploader'],
#        source = value['source'],
#        type = value['type'],
#        volume = value['volume']
#    )
#    collection.save()
for value in result['response']['docs']:
    Collection(**value).save()
See this question for an explanation on how **kwargs work.
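As a rough illustration of what `**value` does, the sketch below uses a plain stand-in class rather than a real peewee model; the `Collection` class here, its three fields, and the sample docs are assumptions made for the example. Unpacking a dict with `**` passes each key as a keyword argument, so fields absent from a given doc fall back to their defaults. Written for Python 3.

```python
class Collection:
    """Stand-in for the model class; every field is optional."""

    def __init__(self, identifier=None, title=None, language=None):
        self.identifier = identifier
        self.title = title
        self.language = language


# Hypothetical docs: the second one has no 'language' key.
docs = [
    {"identifier": "a1", "title": "First", "language": "eng"},
    {"identifier": "b2", "title": "Second"},
]

# Collection(**doc) is equivalent to Collection(identifier=..., title=..., ...)
collections = [Collection(**doc) for doc in docs]

for c in collections:
    print(c.identifier, c.title, c.language)
```

In this stand-in, a doc containing a key the class does not accept would raise a TypeError, so unexpected keys may need to be filtered out before unpacking.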
Are you talking about how to find out whether a key is in a dict or not?
>>> somedict = {'firstname': 'Samuel', 'lastname': 'Sample'}
>>> if somedict.get('firstname'):
...     print somedict['firstname']
...
Samuel
>>> print somedict.get('address', 'no address given')
no address given
If there is a different problem you'd like to solve, please clarify your question.

How do I access a dictionary value for use with the urllib module in python?

Example - I have the following dictionary...
URLDict = {'OTX2':'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=OTX2&action=view_all',
           'RAB3GAP':'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=RAB3GAP1&action=view_all',
           'SOX2':'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=SOX2&action=view_all',
           'STRA6':'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=STRA6&action=view_all',
           'MLYCD':'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=MLYCD&action=view_all'}
I would like to use urllib to open each URL in a for loop. How can this be done?
I have successfully done this with the URLs in a list, like this:
OTX2 = 'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=OTX2&action=view_all'
RAB3GAP = 'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=RAB3GAP1&action=view_all'
SOX2 = 'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=SOX2&action=view_all'
STRA6 = 'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=STRA6&action=view_all'
MLYCD = 'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=MLYCD&action=view_all'
URLList = [OTX2,RAB3GAP,SOX2,STRA6,PAX6,MLYCD]
for URL in URLList:
    sourcepage = urllib.urlopen(URL)
    sourcetext = sourcepage.read()
But I also want to be able to print the key later when returning data. With a list, each key is just a variable name, so I cannot access it for printing; I would only be able to print the value.
Thanks for any help.
Tom
Have you tried (as a simple example):
for key, value in URLDict.iteritems():
    print key, value
Doesn't look like a dictionary is even necessary.
dbs = ['OTX2', 'RAB3GAP', 'SOX2', 'STRA6', 'PAX6', 'MLYCD']
urlbase = 'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=%s&action=view_all'
for db in dbs:
    sourcepage = urllib.urlopen(urlbase % db)
    sourcetext = sourcepage.read()
I would go about it like this:
for url_key in URLDict:
    URL = URLDict[url_key]
    sourcepage = urllib.urlopen(URL)
    sourcetext = sourcepage.read()
The URL is URLDict[url_key], and the key itself stays available in the variable url_key. For example:
print url_key
will print OTX2 on the iteration for that entry (dict ordering is not guaranteed in Python 2, so it may not be the first one).
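Putting the answers together, here is a minimal Python 3 sketch that loops over the dictionary and stores each page's content under its key, so the key is still available when the data is reported. In Python 3, `urllib.urlopen` lives at `urllib.request.urlopen`. The `read_url` and `fetch_all` helpers and the injectable `fetch` parameter are assumptions made for this example (they let the loop be tried without network access); only two of the URLs from the question are repeated.

```python
from urllib.request import urlopen


def read_url(url):
    """Download a URL and return its body as bytes."""
    with urlopen(url) as page:
        return page.read()


def fetch_all(url_dict, fetch=read_url):
    """Return a dict mapping each key to the content fetched from its URL."""
    results = {}
    for key, url in url_dict.items():
        results[key] = fetch(url)
    return results


URLDict = {
    'OTX2': 'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=OTX2&action=view_all',
    'SOX2': 'http://lsdb.hgu.mrc.ac.uk/variants.php?select_db=SOX2&action=view_all',
}

# Offline demonstration with a fake fetcher; drop the second argument
# to download the real pages instead.
pages = fetch_all(URLDict, fetch=lambda url: "<html>%s</html>" % url)
for key, text in pages.items():
    print(key, len(text))
```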
