I'm editing a PDF template using pdftk:
command = ("pdftk " + '"' +
template + '"' +
" fill_form " + '"' +
pathUser + user['mail'] + ".xfdf" + '"' +
" output " + '"' +
pathUser + user['mail'] + ".pdf" + '"' +
" need_appearances")
command = command.replace('/', '\\')
os.system(command)
First, I write my data to an .xfdf file:
for key, value in user.items():
    print(key, value)
    fields.append(u"""<field name="%s"><value>%s</value></field>""" % (key, value))
tpl = u"""<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<fields>
%s
</fields>
</xfdf>""" % "\n".join(fields)
f = open(pathUser + user['mail'] + '.xfdf', 'wb')
f.write(tpl.encode("utf-8"))
f.close()
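A side note on the XFDF generation above: the values are interpolated into the XML without escaping, so a value containing & or < would produce an invalid file. Here is a minimal sketch of an escaped variant, assuming the same flat field layout. build_xfdf is a hypothetical helper name, not part of pdftk.

```python
from xml.sax.saxutils import escape, quoteattr


def build_xfdf(values):
    """Return an XFDF document (as str) for a dict of field name -> value."""
    fields = []
    for key, value in values.items():
        # quoteattr adds the surrounding quotes and escapes the field name;
        # escape handles &, < and > in the value text.
        fields.append("<field name=%s><value>%s</value></field>"
                      % (quoteattr(str(key)), escape(str(value))))
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">\n'
            '<fields>\n%s\n</fields>\n</xfdf>' % "\n".join(fields))
```

The result can be written to disk the same way as tpl, by encoding it as UTF-8 first.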
I fetch the template and, as shown above, write the data from the XFDF file to the PDF, but for some reason only the ime field gets written.
Templates get fetched using some basic conditional logic as shown below:
for item in user['predavanja']:
    user[acthead + str(actn)] = item
    actn += 1
for item in user['radionice']:
    user[acthead + str(actn)] = item
    actn += 1
for item in user['izlet']:
    user[acthead + str(actn)] = item
    actn += 1
print(actn)
templates = {}
templates['0'] = "Template/2019/certificate_2019.pdf"
templates['5'] = "Template/2019/certificate_2019_5.pdf"
templates['10'] = "Template/2019/certificate_2019_10.pdf"
templates['15'] = "Template/2019/certificate_2019_15.pdf"
templates['20'] = "Template/2019/certificate_2019_20.pdf"
templates['25'] = "Template/2019/certificate_2019_25.pdf"
templates['30'] = "Template/2019/certificate_2019_30.pdf"
templates['35'] = "Template/2019/certificate_2019_35.pdf"
templates['40'] = "Template/2019/certificate_2019_40.pdf"
templates['45'] = "Template/2019/certificate_2019_45.pdf"
templates['50'] = "Template/2019/certificate_2019_50.pdf"
This is the data I'm writing:
user['id'] = data['recommendations'][0]['role_in_team']['user']['id']
user['ime'] = data['recommendations'][0]['role_in_team']['user']['first_name']
user['prezime'] = data['recommendations'][0]['role_in_team']['user']['last_name']
user['tim'] = data['recommendations'][0]['role_in_team']['team']['short_name']
user['mail'] = data['recommendations'][0]['role_in_team']['user']['estudent_email']
user['puno_ime'] = (data['recommendations'][0]['role_in_team']['user']['first_name'] + ' ' +
data['recommendations'][0]['role_in_team']['user']['last_name'])
user['predavanja'] = predavanja
user['radionice'] = radionice
user['izlet'] = izlet
One note: predavanja, radionice and izlet are lists.
I've tried printing tpl, which shows all the data being properly added to the XFDF.
It turns out the issue was the naming of the variables: they didn't match the field names in the AcroForm PDF. The solution was to rename the variables in the code to match the field names.
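To spot this kind of mismatch early, the keys can be compared against the field names reported by pdftk's dump_data_fields operation (pdftk template.pdf dump_data_fields). The sketch below only parses that text output. The helper names and the sample dump are made up for illustration, and running pdftk itself is left out.

```python
def parse_field_names(dump_text):
    """Extract field names from `pdftk <file> dump_data_fields` output."""
    names = []
    for line in dump_text.splitlines():
        # Only "FieldName: ..." lines carry the name; "FieldNameAlt:" is skipped.
        if line.startswith("FieldName:"):
            names.append(line.split(":", 1)[1].strip())
    return names


def unmatched_keys(data, field_names):
    """Return the keys in data that have no PDF field of the same name."""
    known = set(field_names)
    return sorted(k for k in data if k not in known)


# Illustrative dump output; these field names are assumptions, not the real template's.
sample_dump = """---
FieldType: Text
FieldName: ime
FieldFlags: 0
---
FieldType: Text
FieldName: prezime_polje
FieldFlags: 0
"""
user = {"ime": "Ana", "prezime": "Kovac"}
print(unmatched_keys(user, parse_field_names(sample_dump)))  # ['prezime']
```

Any key that shows up in the result would be silently ignored when the form is filled, which matches the symptom of only ime being written.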
This is my code to create a hashtag file. The issue is that it does not put the # before the first hashtag, and at the end it repeats the last hashtag, as shown below.
passiveincome, #onlinemarketing, #wahmlife, #cash, #entrepreneurlifestyle, #makemoneyonline, #makemoneyfast, #entrepreneurlifestyle, #mlm, #mlm
How do I get the code to remove the double output and put the # at the beginning?
import random, os, sys
basepath = os.path.dirname(sys.argv[0]) + "/"
outputpath = "C:/Users/matth/OneDrive/Desktop/Create hashtags/"
paragraphsmin = 9
paragraphsmax = 9
sentencemin = 1
sentencemax = 1
keywords = []
for line in open(basepath + "/base.txt", "r"):
    keywords.append(line.replace("\n",""))
keywordlist = []
keyword = open(basepath + "/text-original.txt", "r")
for line in keyword:
    keywordlist.append(line.replace("\n", "\n"))
def type(name):
    value = name[random.randint(0,len(name)-1)]
    return value
"""
def xyz(num):
    s1 = '' + type(keywordlist).strip()
    return eval('s' + str(num))
"""
def s1():
    return '' + type(keywordlist).strip()
def randomSentence():
    sent = eval("s" + str(random.randint(1,1)) + "()")
    return sent
for keyword in keywords:
    outputfile = open(outputpath + keyword.replace(" ", " ") + ".txt", "w")
    outputfile.write('')
    for p in range(1,random.randint(paragraphsmin,paragraphsmax) + 1):
        outputfile.write('')
        for s in range(1,random.randint(sentencemin,sentencemax) + 1):
            sentence = randomSentence()
            if str(sentence)[0] == "\"":
                outputfile.write("" + str(sentence)[0] + str(sentence)[1] + str(sentence)[2:] + " ")
            else:
                outputfile.write("" + str(sentence)[0] + str(sentence)[1:] + ", #")
        outputfile.write('')
    outputfile.write(sentence.replace("", "") + "")
    outputfile.close()
Try replacing
outputfile.write("" + str(sentence)[0] + str(sentence)[1:] + ", #")
with
outputfile.write("#" + str(sentence)[0] + str(sentence)[1:] + ", ")
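The replacement above puts the # in front of every tag. The repeated tag at the end most likely comes from the extra outputfile.write(sentence...) call after the loops, which writes the last sentence a second time. That assumes the call sits outside the inner loops, since the posted code has lost its indentation. A simpler approach is to collect the tags first and let join() place the separators. The function name and the sample words here are illustrative only.

```python
import random


def make_hashtag_line(words, count, rng=random):
    """Pick `count` random words and format them as '#a, #b, #c'."""
    picked = [rng.choice(words).strip() for _ in range(count)]
    # Prefix each tag individually and let join() insert the commas,
    # so there is no missing '#' at the start and nothing dangling at the end.
    return ", ".join("#" + tag for tag in picked)


print(make_hashtag_line(["mlm", "cash", "wahmlife"], 5))
```

The resulting string can then be written to the file with a single outputfile.write(...) call per keyword.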
I am working on a Python (3) XML parser that should extract the text content of specific nodes from every xml file within a folder. Then, the script should write the collected data into a tab-separated text file. So far, all the functions seem to be working. The script returns all the information that I want from the first file, but it always breaks, I believe, when it starts to parse the second file.
When it breaks, it returns "TypeError: 'str' object is not callable." I've checked the second file and found that the functions work just as well on that as the first file when I remove the first file from the folder. I'm very new to Python/XML. Any advice, help, or useful links would be greatly appreciated. Thanks!
import xml.etree.ElementTree as ET
import re
import glob
import csv
import sys
content_file = open('WWP Project/WWP_texts.txt','wt')
quotes_file = open('WWP Project/WWP_quotes.txt', 'wt')
list_of_files = glob.glob("../../../Documents/WWPtextbase/distribution/*.xml")
ns = {'wwp':'http://www.wwp.northeastern.edu/ns/textbase'}
def content(tree):
    lines = ''.join(ET.tostring(tree.getroot(),encoding='unicode',method='text')).replace('\n',' ').replace('\t',' ').strip()
    clean_lines = re.sub(' +',' ', lines)
    return clean_lines.lower()
def quotes(tree):
    quotes_list = []
    for node in tree.findall('.//wwp:quote', namespaces=ns):
        quote = ET.tostring(node,encoding='unicode',method='text')
        clean_quote = re.sub(' +',' ', quote)
        quotes_list.append(clean_quote)
    return ' '.join(str(v) for v in quotes_list).replace('\t','').replace('\n','').lower()
def pid(tree):
    for node in tree.findall('.//wwp:sourceDesc//wwp:author/wwp:persName[1]', namespaces=ns):
        pid = node.attrib.get('ref')
        return pid.replace('personography.xml#','') # will need to replace 'p:'
def trid(tree): # this function will eventually need to call OT (.//wwp:publicationStmt//wwp:idno)
    for node in tree.findall('.//wwp:sourceDesc',namespaces=ns):
        trid = node.attrib.get('n')
        return trid
content_file.write('pid' + '\t' + 'trid' + '\t' +'text' + '\n')
quotes_file.write('pid' + '\t' + 'trid' + '\t' + 'quotes' + '\n')
for file_name in list_of_files:
    file = open(file_name, 'rt')
    tree = ET.parse(file)
    file.close()
    pid = pid(tree)
    trid = trid(tree)
    content = content(tree)
    quotes = quotes(tree)
    content_file.write(pid + '\t' + trid + '\t' + content + '\n')
    quotes_file.write(pid + '\t' + trid + '\t' + quotes + '\n')
content_file.close()
quotes_file.close()
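You are overwriting your function names with the values they return. After the first pass through the loop, pid, trid, content and quotes refer to strings rather than functions, so calling them for the second file raises the TypeError. Changing the function names should fix it.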
import xml.etree.ElementTree as ET
import re
import glob
import csv
import sys
content_file = open('WWP Project/WWP_texts.txt','wt')
quotes_file = open('WWP Project/WWP_quotes.txt', 'wt')
list_of_files = glob.glob("../../../Documents/WWPtextbase/distribution/*.xml")
ns = {'wwp':'http://www.wwp.northeastern.edu/ns/textbase'}
def get_content(tree):
    lines = ''.join(ET.tostring(tree.getroot(),encoding='unicode',method='text')).replace('\n',' ').replace('\t',' ').strip()
    clean_lines = re.sub(' +',' ', lines)
    return clean_lines.lower()
def get_quotes(tree):
    quotes_list = []
    for node in tree.findall('.//wwp:quote', namespaces=ns):
        quote = ET.tostring(node,encoding='unicode',method='text')
        clean_quote = re.sub(' +',' ', quote)
        quotes_list.append(clean_quote)
    return ' '.join(str(v) for v in quotes_list).replace('\t','').replace('\n','').lower()
def get_pid(tree):
    for node in tree.findall('.//wwp:sourceDesc//wwp:author/wwp:persName[1]', namespaces=ns):
        pid = node.attrib.get('ref')
        return pid.replace('personography.xml#','') # will need to replace 'p:'
def get_trid(tree): # this function will eventually need to call OT (.//wwp:publicationStmt//wwp:idno)
    for node in tree.findall('.//wwp:sourceDesc',namespaces=ns):
        trid = node.attrib.get('n')
        return trid
content_file.write('pid' + '\t' + 'trid' + '\t' +'text' + '\n')
quotes_file.write('pid' + '\t' + 'trid' + '\t' + 'quotes' + '\n')
for file_name in list_of_files:
    file = open(file_name, 'rt')
    tree = ET.parse(file)
    file.close()
    pid = get_pid(tree)
    trid = get_trid(tree)
    content = get_content(tree)
    quotes = get_quotes(tree)
    content_file.write(pid + '\t' + trid + '\t' + content + '\n')
    quotes_file.write(pid + '\t' + trid + '\t' + quotes + '\n')
content_file.close()
quotes_file.close()
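The same effect can be reproduced in a few lines, which also shows why a single file works on its own. This is a stripped-down sketch with made-up names (get_pid here just upper-cases its input), not the parser itself.

```python
def get_pid(name):
    return name.upper()


def demo_shadowing(names):
    pid = get_pid  # stands in for a function that was originally named `pid`
    out = []
    for name in names:
        # Rebinding: after the first pass `pid` is a str, so the second
        # pass tries to call a string and raises TypeError.
        pid = pid(name)
        out.append(pid)
    return out


def demo_fixed(names):
    out = []
    for name in names:
        # The function keeps its own name; the result gets a different one.
        pid = get_pid(name)
        out.append(pid)
    return out


print(demo_fixed(["a", "b"]))   # ['A', 'B']
print(demo_shadowing(["a"]))    # ['A'] (one item never hits the second call)
try:
    demo_shadowing(["a", "b"])
except TypeError as exc:
    print(exc)
```

With one input the broken version succeeds, which is why each XML file parsed fine when it was the only one in the folder.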
This is the code. However, it can only parse exactly four Arabic characters (paw1 to paw4). I want it to parse dynamically, so that the number of characters does not matter: it should handle one, two, or more characters, depending on how many exist in the file.
import xml.etree.ElementTree as ET
import os, glob
import csv
from time import time
#read xml path
xml_path = glob.glob(r'D:\1. Thesis FINISH!!!\*.xml')  # raw string, so '\1' is not read as an escape
#create file declaration for saving the result
file = open("parsing.csv","w")
#file = open("./%s" % ('parsing.csv'), 'w')
#create variable of starting time
t0=time()
#create file header
file.write('wordImage_id'+'|'+'paw1'+'|'+'paw2'+'|' + 'paw3' + '|' + 'paw4' + '|'+'font_size'+'|'+'font_style'+
           '|'+'font_name'+'|'+'specs_effect'+'|'+'specs_height'
           +'|'+'specs_width'+'|'+'specs_encoding'+'|'+'generation_filtering'+
           '|'+'generation_renderer'+'|'+'generation_type' + '\n')
for doc in xml_path:
    print 'Reading file - ', os.path.basename(doc)
    tree = ET.parse(doc)
    #tree = ET.parse('D:\1. Thesis FINISH!!!\Image_14_AdvertisingBold_13.xml')
    root = tree.getroot()
    #get wordimage id
    image_id = root.attrib['id']
    #get paw 1 and paw 2
    paw1 = root[0][0].text
    paw2 = root[0][1].text
    paw3 = root[0][2].text
    paw4 = root[0][3].text
    #get properties of font
    for font in root.findall('font'):
        size = font.get('size')
        style = font.get('fontStyle')
        name = font.get('name')
    #get properties of specs
    for specs in root.findall('specs'):
        effect = specs.get('effect')
        height = specs.get('height')
        width = specs.get('width')
        encoding = specs.get('encoding')
    #get properties for generation
    for generation in root.findall('generation'):
        filtering = generation.get('filtering')
        renderer = generation.get('renderer')
        types = generation.get('type')
    #save the result in csv
    file.write(image_id + '|' + paw1 + '|' + paw2 + '|' + paw3 + '|' + paw4 + '|' + size + '|' +
               style + '|' + name + '|' + effect + '|' + height + '|'
               + width + '|' + encoding + '|' + filtering + '|' + renderer + '|' + types + '\n')
#close the file
file.close()
#print time execution
print("process done in %0.3fs." % (time() - t0))
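One possible way to drop the fixed paw1 to paw4 accesses is to loop over the children of root[0] instead of indexing them. The sketch below is written for Python 3 and assumes the first child of the root holds one element per character, as the root[0][i] accesses suggest. The sample XML, the element names in it, and both helper names are invented for illustration.

```python
import xml.etree.ElementTree as ET


def extract_paws(root):
    """Return the text of every child of the first element under root."""
    return [(child.text or "") for child in root[0]]


def format_row(image_id, paws, max_paws):
    """Build one '|'-separated row, padding missing paw columns with ''."""
    padded = paws + [""] * (max_paws - len(paws))
    return "|".join([image_id] + padded)


# Hypothetical document shape, inferred from the root[0][i] accesses.
sample = ('<wordImage id="w1"><content>'
          '<paw>ab</paw><paw>c</paw>'
          '</content></wordImage>')
root = ET.fromstring(sample)
print(format_row(root.attrib["id"], extract_paws(root), 4))  # w1|ab|c||
```

Because the number of paw columns now varies per file, max_paws has to be known before the header is written, for example by a first pass over all files that records the longest list.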
I need to encrypt 3 .bin files which contain 2 keys for Diffie-Hellman. I have no clue how to do that; all I could think of was what I did in the following Python file. I have an example of what the output should look like, but my code doesn't seem to produce the right keys. The output file server.ini is used by a client to connect to a server.
import base64
fileList = [['game_key.bin', 'Game'], ['gate_key.bin', 'Gate'], ['auth_key.bin', 'Auth']]
iniList = []
for i in fileList:
    file = open(i[0], 'rb')
    n = list(file.read(64))
    x = list(file.read(64))
    file.close()
    n.reverse()
    x.reverse()
    iniList.append(['Server.' + i[1] + '.N "' + base64.b64encode("".join(n)) + '"\n', 'Server.' + i[1] + '.X "' + base64.b64encode("".join(x)) + '"\n'])
iniList[0].append('\n')
#time for user Input
ip = '"' + raw_input('Hostname: ') + '"'
dispName = 'Server.DispName ' + '"' + raw_input('DispName: ') + '"' + '\n'
statusUrl = 'Server.Status ' + '"' + raw_input('Status URL: ') + '"' + '\n'
signupUrl = 'Server.Signup ' + '"' + raw_input('Signup URL: ') + '"' + '\n'
for l in range(1, 3):
    iniList[l].append('Server.' + fileList[l][1] + '.Host ' + ip + '\n\n')
for l in [[dispName], [statusUrl], [signupUrl]]:
    iniList.append(l)
outFile = open('server.ini', 'w')
for l in iniList:
    for i in l:
        outFile.write(i)
outFile.close()
The following was in my example file:
# Keys are Base64-encoded 512 bit RC4 keys, as generated by DirtSand's keygen
# command. Note that they MUST be quoted in the commands below, or the client
# won't parse them correctly!
I also tried it without reversing n and x.
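For comparing both variants side by side, the read-split-encode step can be isolated into a small function. This is only a sketch, written for Python 3 (where file contents are bytes): it assumes each file holds two consecutive 64-byte values, as the code above does, and it does not settle whether the target format expects the bytes reversed. The function names are hypothetical.

```python
import base64


def encode_key_pair(raw, key_len=64, reverse=False):
    """Split raw bytes into two key_len-byte values and Base64-encode each."""
    if len(raw) < 2 * key_len:
        raise ValueError("expected at least %d bytes, got %d"
                         % (2 * key_len, len(raw)))
    n, x = raw[:key_len], raw[key_len:2 * key_len]
    if reverse:
        # Byte-order reversal; whether the consumer needs it is an open question.
        n, x = n[::-1], x[::-1]
    return (base64.b64encode(n).decode("ascii"),
            base64.b64encode(x).decode("ascii"))


def ini_lines(section, raw, reverse=False):
    """Format the two quoted key lines for one section, e.g. 'Gate'."""
    n_b64, x_b64 = encode_key_pair(raw, reverse=reverse)
    return ['Server.%s.N "%s"' % (section, n_b64),
            'Server.%s.X "%s"' % (section, x_b64)]
```

A 64-byte (512-bit) value always encodes to 88 Base64 characters, so a quick length check against the example file can confirm that the split is at least the right size.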
I've put together a tkinter form and Python script for downloading files from an FTP site. The filenames are in the attribute table of a shapefile, along with an overall Name that the filenames correspond to. In other words, I look up a Name such as "CABOT" and download the filename 34092_18.tif. However, if a Name has an apostrophe, such as "O'KEAN", it gives me trouble. I try to replace the apostrophe, as I've done in previous scripts, but it doesn't download anything.
whereExp = quadField + " = " + "'" + quadName.replace("'", '"') + "'"
quadFields = ["FILENAME"]
c = arcpy.da.SearchCursor(collarlessQuad, quadFields, whereExp)
for row in c:
    tifFile = row[0]
    tifName = quadName.replace("'", '') + '_' + tifFile
    #fullUrl = ftpUrl + tifFile
    local_filename = os.path.join(downloadDir, tifName)
    lf = open(local_filename, "wb")
    ftp.retrbinary('RETR ' + tifFile, lf.write)
    lf.close()
Here is an example of a portion of a script that works fine by replacing the apostrophe:
where_clause = quadField + " = " + "'" + quad.replace("'", '"') + "'"
#out_quad = quad.replace("'", "") + ".shp"
arcpy.MakeFeatureLayer_management(quadTable, "quadLayer")
select_out_feature_class = arcpy.SelectLayerByAttribute_management("quadLayer", "NEW_SELECTION", where_clause)
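One thing that may be worth trying: in standard SQL string literals, an apostrophe inside the value is escaped by doubling it ('O''KEAN'), not by swapping it for a double quote. Whether the data source behind the cursor accepts that form is an assumption to verify. The helpers below are plain string functions with hypothetical names, and "NAME" is a placeholder field.

```python
def build_where(field, value):
    """Build "field = 'value'" with embedded single quotes doubled (SQL style)."""
    return "%s = '%s'" % (field, value.replace("'", "''"))


def safe_name(value):
    """Drop apostrophes for use in a local file name, as the script above does."""
    return value.replace("'", "")


print(build_where("NAME", "O'KEAN"))  # NAME = 'O''KEAN'
print(safe_name("O'KEAN"))            # OKEAN
```

Printing the where clause before passing it to the cursor, and checking whether the loop body runs at all, would show whether the query matches any rows.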