Python list variable substitution within multiline string...almost(edited) - python

(Edited; the previous version is further below.)
I got it to work by moving the string substitution inside the for loop, but the output is indented by about five tabs that I can't seem to get rid of.
I'm leaving this up for a bit in case anyone has an answer, and maybe to help someone who runs into the same thing later.
Code:
for i in dns_list:
    with open("output.txt", "a") as output:
        alert_dns = textwrap.dedent("""
        {
        \"tests\":[
        {{
        \"token\":\"DNS\",
        \"type\":\"text\",
        \"operator\":\"contains\",
        \"preservecase\":false,
        \"value\":\"%s\"
        }
        ]
        """)%(i)
        alert_dns=alert_dns.strip()
        output.write(alert_dns.strip())
(Previous)
I have a list of domain names. I need to iterate through the list (dns_list) and substitute each entry for the placeholder 'insert' in a multiline string (alert_dns):
alert_dns="""
{{
\"tests\":[
{{
\"token\":\"DNS\",
\"type\":\"text\",
\"operator\":\"contains\",
\"preservecase\":false,
\"value\":\"{insert}\"
}}
]
}}
"""
dns_list=[]
temp_file_name = 'daily.csv'
with open(temp_file_name, 'r') as temp_file:
    lines = temp_file.read()
    dns = re.findall(urlmarker.WEB_URL_REGEX,lines)
    for i in dns:
        dns_list.append(i)
with open("output.txt", "w") as output:
    for i in dns_list:
        for insert in alert_dns:
            # i=insert
            alert_dns.format(i)
            output.write(alert_dns+'\n')
I keep getting --
alert_dns.format(i)
KeyError: 'insert'

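Instead of alert_dns.format(i)
you should call
alert_dns.format(insert=i)
The template uses the named placeholder {insert}, so the value has to be passed by that name. Note also that format() returns a new string and does not modify alert_dns, so the returned value is what needs to be written to the file.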
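A minimal sketch of that fix, assuming dns_list holds plain strings (the two domains below are made up for illustration). It drops the inner loop over alert_dns, which would iterate over the characters of the template, and writes the string returned by format():

```python
import io

# Doubled braces are literal braces for str.format; {insert} is the placeholder.
alert_dns = """\
{{
  "tests":[
    {{
      "token":"DNS",
      "type":"text",
      "operator":"contains",
      "preservecase":false,
      "value":"{insert}"
    }}
  ]
}}
"""

# Hypothetical sample data standing in for the list built from daily.csv.
dns_list = ["example.com", "example.org"]

# io.StringIO stands in for open("output.txt", "w") so the sketch has no side effects.
output = io.StringIO()
for i in dns_list:
    # format() returns a new string; the template itself is left untouched.
    output.write(alert_dns.format(insert=i) + "\n")

result = output.getvalue()
```

Swapping io.StringIO() for the real file handle should be the only change needed to write output.txt.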

Alright, this is the answer if anyone looks for it...
import textwrap

with open("output.txt", "w") as output:
    for i in dns_list:
        # %-formatting treats braces literally, so they are not doubled here
        alert_dns = textwrap.dedent("""\
            {
              \"tests\":[
                {
                  \"token\":\"DNS\",
                  \"type\":\"text\",
                  \"operator\":\"contains\",
                  \"preservecase\":false,
                  \"value\":\"%s\"
                }
              ]
            }
            """) % (i,)
        output.write(alert_dns + '\n')
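An alternative sketch, not the approach used in the answer above: build the structure as a Python dict and let json.dumps produce the text, which avoids brace escaping and indentation problems altogether. The helper name build_alert and the sample domain are hypothetical.

```python
import json

def build_alert(domain):
    # Hypothetical helper: mirrors the structure of the template above.
    alert = {
        "tests": [
            {
                "token": "DNS",
                "type": "text",
                "operator": "contains",
                "preservecase": False,
                "value": domain,
            }
        ]
    }
    # json.dumps handles quoting, escaping and indentation.
    return json.dumps(alert, indent=2)

text = build_alert("example.com")
```

Each call returns one complete JSON object as a string, which can then be written to the output file inside the loop.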

Related

unable to add the variable value into a query using f string in python

I have a JSON-like query and I want to insert a variable's value into it using an f-string in Python, but I was unable to do it.
name = "harsha"
query = """mutation {
createUsers(input:{
user_name: f"{name}"
})
{
users{
user_name
}
}
}
"""
When I print query, I get the text exactly as written, not the substituted value harsha.
Everything within the triple quotes is interpreted as being part of the string, not as code. Move the f in front of the triple quotes to have it be interpreted as an f-string.
This will require escaping the rest of the braces, so you'll end up with this.
name = "harsha"
query = f"""mutation {{
  createUsers(input:{{
    user_name: "{name}"
  }})
  {{
    users{{
      user_name
    }}
  }}
}}
"""

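A small sketch of the f-string approach described above. Using json.dumps to add the double quotes around the value is an assumption of this sketch rather than part of the answer; it is one way to make sure names containing quotes or backslashes are escaped.

```python
import json

name = "harsha"

# json.dumps adds the surrounding double quotes and escapes special characters.
quoted_name = json.dumps(name)

# The f prefix goes in front of the triple quotes; literal braces are doubled.
query = f"""mutation {{
  createUsers(input: {{
    user_name: {quoted_name}
  }})
  {{
    users {{
      user_name
    }}
  }}
}}
"""
```

Printing query should then show single braces and the substituted, quoted name.
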
Syntax to load nested in nested keys of JSON files

I have a big tree in a JSON file and I'm looking for the Python syntax to read values stored under keys nested several levels deep in this JSON.
Assume I have this:
{
    "FireWall": {
        "eth0": {
            "INPUT": {
                "PING": 1
            }
        }
    }
}
According to the man page and some questions on Stack Overflow, I tried this (and some variations):
import json
config = open('config.json', 'r')
data = json.load('config')
config.close()
if data['{"FireWall", {"eth0", {"INPUT", {"Ping"}}}}'] == 1:
    print('This is working')
With no result. What is the right way to do this (as simple as possible)? Thank you!
There are two problems. First, data = json.load('config') passes a string, but json.load expects an open file object. Second, data['{"FireWall", {"eth0", {"INPUT", {"Ping"}}}}'] is not the right way to access a value in a nested dictionary; index one key at a time instead, using the exact key names (keys are case-sensitive, so "PING", not "Ping").
import json
with open('config.json', 'r') as f:
    data = json.load(f)
if data["FireWall"]["eth0"]["INPUT"]["PING"] == 1:
    print('This is working')
data is a nested dictionary, so:
data["FireWall"]["eth0"]["INPUT"]["PING"]
will be equal to 1; or at least it will once you fix your call to json.load.
Try this:
data["FireWall"]["eth0"]["INPUT"]["PING"]
This will give you the value in PING
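The nested lookup from the answers above can be tried without a file. This sketch parses an inline copy of the example with json.loads; get_nested is a hypothetical helper, added only to show one way of handling a missing or misspelled key without an exception.

```python
import json

# Inline copy of the example config; JSON does not allow a trailing comma.
raw = '{"FireWall": {"eth0": {"INPUT": {"PING": 1}}}}'
data = json.loads(raw)

# Index one level at a time; keys are case-sensitive.
ping = data["FireWall"]["eth0"]["INPUT"]["PING"]

def get_nested(mapping, *keys, default=None):
    """Hypothetical helper: walk nested dicts, returning default if a key is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

# "Ping" does not match "PING", so this falls back to the default (None).
missing = get_nested(data, "FireWall", "eth0", "INPUT", "Ping")
```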

Writing JSON data in python. Format

I have this method that writes JSON data to a file. The title is the book's title, and the data is the book's publisher, date, author, etc. The method works fine if I only want to add one book.
Code
import json
def createJson(title,firstName,lastName,date,pageCount,publisher):
    print "\n*** Inside createJson method for " + title + "***\n";
    data = {}
    data[title] = []
    data[title].append({
        'firstName:', firstName,
        'lastName:', lastName,
        'date:', date,
        'pageCount:', pageCount,
        'publisher:', publisher
    })
    with open('data.json','a') as outfile:
        json.dump(data,outfile , default = set_default)
def set_default(obj):
    if isinstance(obj,set):
        return list(obj)
if __name__ == '__main__':
    createJson("stephen-king-it","stephen","king","1971","233","Viking Press")
JSON File with one book/one method call
{
"stephen-king-it": [
["pageCount:233", "publisher:Viking Press", "firstName:stephen", "date:1971", "lastName:king"]
]
}
However, if I call the method multiple times, adding more book data to the JSON file, the format is all wrong. For instance, if I simply call the method twice with a main block of
if __name__ == '__main__':
    createJson("stephen-king-it","stephen","king","1971","233","Viking Press")
    createJson("william-golding-lord of the flies","william","golding","1944","134","Penguin Books")
My JSON file looks like
{
"stephen-king-it": [
["pageCount:233", "publisher:Viking Press", "firstName:stephen", "date:1971", "lastName:king"]
]
} {
"william-golding-lord of the flies": [
["pageCount:134", "publisher:Penguin Books", "firstName:william","lastName:golding", "date:1944"]
]
}
Which is obviously wrong. Is there a simple fix to my method that produces correctly formatted JSON? I looked at many simple examples online of writing JSON data in Python, but all of them gave me format errors when I checked the output on JSONLint.com. I have been racking my brain trying to fix this, including editing the file by hand to make it correct, but all my efforts were to no avail. Any help is appreciated. Thank you very much.
Simply appending new objects to your file doesn't create valid JSON. You need to add your new data inside the top-level object, then rewrite the entire file.
This should work:
import json

def createJson(title,firstName,lastName,date,pageCount,publisher):
    print("\n*** Inside createJson method for " + title + "***\n")
    # Load any existing json data,
    # or create an empty object if the file is not found,
    # or is empty (json.load raises ValueError on an empty file)
    try:
        with open('data.json') as infile:
            data = json.load(infile)
    except (FileNotFoundError, ValueError):
        data = {}
    if not data:
        data = {}
    data[title] = []
    data[title].append({
        'firstName:', firstName,
        'lastName:', lastName,
        'date:', date,
        'pageCount:', pageCount,
        'publisher:', publisher
    })
    # Rewrite the whole file so it holds a single top-level object
    with open('data.json','w') as outfile:
        json.dump(data,outfile , default = set_default)
A JSON document has exactly one top-level value, usually an array or an object (a dictionary in Python). Your file ends up with two top-level objects, one with the key stephen-king-it and another with william-golding-lord of the flies. Either of these on its own would be valid, but placing one after the other is not.
Using an array you could do this:
[
    { "stephen-king-it": [] },
    { "william-golding-lord of the flies": [] }
]
Or a dictionary style format (I would recommend this):
{
    "stephen-king-it": [],
    "william-golding-lord of the flies": []
}
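Also, the data you are appending looks like it should be formatted as key-value pairs in a dictionary (which would be ideal). As written, the commas between the braces create a set of strings rather than a dictionary, which is why the default handler is needed to serialize it. You need to change it to this:
data[title].append({
    'firstName': firstName,
    'lastName': lastName,
    'date': date,
    'pageCount': pageCount,
    'publisher': publisher
})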
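Putting both answers together, a minimal sketch of the load, update, rewrite cycle with real key-value pairs. The function name add_book and the temporary file path are assumptions made for this example; unlike the original, it uses setdefault so that calling it twice with the same title appends rather than overwrites.

```python
import json
import os
import tempfile

def add_book(path, title, first_name, last_name, date, page_count, publisher):
    # Load existing data; start fresh if the file is missing, empty or invalid.
    try:
        with open(path) as infile:
            data = json.load(infile)
    except (FileNotFoundError, ValueError):
        data = {}
    # A dict with colons (not commas) serializes as a JSON object.
    data.setdefault(title, []).append({
        "firstName": first_name,
        "lastName": last_name,
        "date": date,
        "pageCount": page_count,
        "publisher": publisher,
    })
    # Rewrite the whole file so it always holds one top-level object.
    with open(path, "w") as outfile:
        json.dump(data, outfile, indent=2)

# A temporary directory keeps the example from touching a real data.json.
path = os.path.join(tempfile.mkdtemp(), "data.json")
add_book(path, "stephen-king-it", "stephen", "king", "1971", "233", "Viking Press")
add_book(path, "william-golding-lord of the flies", "william", "golding", "1944", "134", "Penguin Books")

with open(path) as infile:
    books = json.load(infile)
```

After the two calls, the file parses as a single object with one key per book.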

Parse Code from Text - Python

I am analyzing StackOverflow's dump file "Posts.Small.xml" using pySpark. I want to separate 'code block' from 'text' in a Row. A typical parsed row looks like:
['[u"<p>I want to use a track-bar to change a form\'s opacity.</p>
<p>This is my code:</p>
<pre><code>decimal trans = trackBar1.Value / 5000;
this.Opacity = trans;
</code></pre>
<p>When I try to build it, I get this error:</p>
<blockquote>
<p>Cannot implicitly convert type \'decimal\' to \'double\'.
</p>
</blockquote>
<p>I tried making <code>trans</code> a <code>double</code>, but then the control doesn\'t work.',
'", u\'This code has worked fine for me in VB.NET in the past.',
'\', u"</p>
When setting a form\'s opacity should I use a decimal or double?"]']
I've tried "itertools" and some python functions but couldn't get the result.
My initial code to extract the above row is:
postsXml = textFile.filter(lambda line: not line.startswith("<?xml version="))
postsRDD = postsXml.map(............)
tokensentRDD = postsRDD.map(lambda x:(x[0], nltk.sent_tokenize(x[3])))
new = tokensentRDD.map(lambda x: x[1]).take(1)
a = ''.join(map(str,new))
b = a.replace("&lt;", "<")
final = b.replace("&gt;", ">")
nltk.sent_tokenize(final)
Any ideas are appreciated!
You can extract the code contents by using XPath (the lxml library will help) and then extract the text content selecting everything else, for example:
import lxml.etree
data = '''<p>I want to use a track-bar to change a form's opacity.</p>
<p>This is my code:</p> <pre><code>decimal trans = trackBar1.Value / 5000; this.Opacity = trans;</code></pre>
<p>When I try to build it, I get this error:</p>
<p>Cannot implicitly convert type 'decimal' to 'double'.</p>
<p>I tried making <code>trans</code> a <code>double</code>.</p>'''
html = lxml.etree.HTML(data)
code_blocks = html.xpath('//code/text()')
text_blocks = html.xpath('//*[not(descendant-or-self::code)]/text()')
The easiest way will probably be to apply a regex to the text, matching the tags '<code>' and '</code>'. That would enable you to find the code blocks. You don't say what you would do with them afterwards, though. So ...
from itertools import zip_longest
sample_paras = [
"""<p>I want to use a track-bar to change a form\'s opacity.</p>
<p>This is my code:</p>
<pre><code>decimal trans = trackBar1.Value / 5000;
this.Opacity = trans;
</code></pre>
<p>When I try to build it, I get this error:</p>
<blockquote>
<p>Cannot implicitly convert type \'decimal\' to \'double\'. </p>
</blockquote>
<p>I tried making <code>trans</code> a <code>double</code>, but then the control doesn\'t work.""",
"""This code has worked fine for me in VB.NET in the past.""",
"""</p>
When setting a form\'s opacity should I use a decimal or double?""",
]
single_block = " ".join(sample_paras)
import re
separate_code = re.split(r"</?code>", single_block)
text_blocks, code_blocks = zip(*zip_longest(*[iter(separate_code)] * 2))
print("Text:\n")
for t in text_blocks:
    print("--")
    print(t)
print("\n\nCode:\n")
for t in code_blocks:
    print("--")
    print(t)
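A standard-library variation on the regex idea, offered as a sketch rather than a replacement for either answer. It assumes well-formed, non-nested <code>...</code> pairs; the function name split_code_and_text and the shortened sample post are made up for illustration.

```python
import html
import re

# Non-greedy match so each <code>...</code> pair is captured separately;
# DOTALL lets a block span several lines.
CODE_RE = re.compile(r"<code>(.*?)</code>", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")

def split_code_and_text(post_html):
    # Collect the contents of every code element, decoding entities like &lt;.
    code_blocks = [html.unescape(block) for block in CODE_RE.findall(post_html)]
    # Remove the code elements, then the remaining tags, then tidy whitespace.
    without_code = CODE_RE.sub(" ", post_html)
    without_tags = TAG_RE.sub(" ", without_code)
    text = " ".join(html.unescape(without_tags).split())
    return code_blocks, text

sample = (
    "<p>This is my code:</p>"
    "<pre><code>decimal trans = trackBar1.Value / 5000;\n"
    "this.Opacity = trans;</code></pre>"
    "<p>I tried making <code>trans</code> a double.</p>"
)
code_blocks, text = split_code_and_text(sample)
```

For HTML that is not this regular, a real parser such as the lxml approach shown earlier is likely the safer choice.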

indexError, searching within

I am writing a program that will read a CSV file with data that looks like this:
"10724_artifact11679.jpg","H. 3 1/4 in. (8.26 cm)","10.210.114","This artwork is currently on display in Gallery 171","11679"
And write it into an HTML table. I only want the rows whose field at index 3 says "This artwork is not on display", but I've been having issues with this set of data:
import csv
metlist4 = []
newList = csv.reader(open("v2img_10724_list.csv", 'r'))
for row in newList:
    metlist4.append(row)
artifact_template = """<td>
<div>
<img src= "%(image)s" alt = "artifact" />
<p>Dimensions: %(dimension)s </p>
<p>Accession #: %(accession)s </p>
<p>Display: %(display)s </p>
<p>index2: %(index2)s </p>
</div>
</td>"""
html_list = []
count = 5794
for artifact in metlist4:
    if artifact[3] in ["This artwork is not on display"]:
        artifactinfo = {}
        artifactinfo["image"]=artifact[0]
        artifactinfo["dimension"]=artifact[1]
        artifactinfo["accession"]=artifact[2]
        artifactinfo["display"]=artifact[3]
        artifactinfo["index2"]=count
        count = count + 1
        html_list.append(artifact_template % artifactinfo)
    else:
        pass
f = open("v3display_test.txt", "w")
f.write("\n".join(html_list))
f.close()
I get this error, but only when I run the entire metlist4...
File "/Users/Rose/Documents/workspace/METProjectFOREAL/src/no_display_Met4.py", line 34, in <module>
if artifact[3] in ["This artwork is not on display"]:
IndexError: list index out of range
If I run just a section, for example metlist4[0:500], the error does not occur. Any ideas or suggestions would be greatly appreciated! Thanks!
There is at least one row that doesn't have a 4th element. Perhaps the line is empty.
Test for the length, and print the row to see which one is short:
if len(artifact) < 4:
    print('short row', artifact)
If it is an empty line, just skip it:
if not artifact: continue
You are using a lot of verbose and redundant code; there is no need to build a separate list when you can just loop over the csv.reader() object directly, and there is no need to add an empty else: pass block either.
Idiomatic Python code would be:
import csv

artifact_template = """<td>
<div>
<img src= "%(image)s" alt = "artifact" />
<p>Dimensions: %(dimension)s </p>
<p>Accession #: %(accession)s </p>
<p>Display: %(display)s </p>
<p>index2: %(index2)s </p>
</div>
</td>"""
html_list = []
fields = 'image dimension accession display'.split()
with open("v2img_10724_list.csv", newline='') as inputfile:
    # restkey collects any extra columns (such as the 5th one) under '_ignored'
    reader = csv.DictReader(inputfile, fieldnames=fields, restkey='_ignored')
    for count, artifact in enumerate(reader, 5794):
        if artifact and artifact['display'] == "This artwork is not on display":
            artifact["index2"] = count
            html_list.append(artifact_template % artifact)
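This uses csv.DictReader() to create a dictionary per row, a with statement to ensure the file is closed when done, and enumerate() with a start value to track count. Note that enumerate() numbers every row the reader yields, not only the matching ones, so the index2 values will differ from the original loop, which only incremented count on a match.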
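To see the length check in isolation, here is a small sketch that reads made-up rows from an in-memory string instead of v2img_10724_list.csv. The sample data is hypothetical; it includes a blank line and a short row, the two cases that would make row[3] raise IndexError.

```python
import csv
import io

# Hypothetical sample: one displayed item, one not on display,
# one blank line, and one row with too few fields.
sample = (
    '"a.jpg","H. 3 in.","10.1","This artwork is currently on display in Gallery 171","1"\n'
    '"b.jpg","H. 4 in.","10.2","This artwork is not on display","2"\n'
    '\n'
    '"c.jpg","H. 5 in."\n'
)

wanted = []
skipped = []
for row in csv.reader(io.StringIO(sample)):
    # Guard before indexing: blank lines come back as [] and short rows
    # have fewer than four fields.
    if len(row) < 4:
        skipped.append(row)
        continue
    if row[3] == "This artwork is not on display":
        wanted.append(row)
```

Inspecting skipped afterwards shows which rows in the real file would have triggered the error.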
