Using other languages with Pylatex - python

I'm trying to get Hebrew to print into a PDF using pylatex. In a sample Hebrew .tex file whose format I'm trying to emulate, the header looks like this:
%\title{Hebrew document in WriteLatex - מסמך בעברית}
\documentclass{article}
\usepackage[utf8x]{inputenc}
\usepackage[english,hebrew]{babel}
\selectlanguage{hebrew}
\usepackage[top=2cm,bottom=2cm,left=2.5cm,right=2cm]{geometry}
I was able to emulate this entire header except for the line \selectlanguage{hebrew}. I'm not sure how I should go about getting this in my .tex file using pylatex. The code for generating the rest of the file is:
doc = pylatex.Document('basic', inputenc = 'utf8x', lmodern = False, fontenc = None, textcomp = None)
packages = [Package('babel', options = ['english', 'hebrew']), Package('inputenc', options = 'utf8enc')]
doc.packages.append(Package('babel', options = ['english', 'hebrew']))
doc.append(text.decode('utf-8'))
doc.generate_pdf(clean_tex=False, compiler = "XeLaTeX ")
doc.generate_tex()
And the header of the .tex file generated is:
\documentclass{article}%
\usepackage[utf8x]{inputenc}%
\usepackage{lastpage}%
\usepackage[english,hebrew]{babel}%
How do I get the \selectlanguage line in there? I'm pretty new to LaTeX, so I apologize if my terminology isn't accurate.

You want to use Command:
from pylatex import Command
To add it to your preamble,
doc.preamble.append(Command('selectlanguage', 'hebrew'))
or to another specific place in your document,
doc.append(Command('selectlanguage', 'hebrew'))

Related

Is there a way to detect an existing link from a text file in Python

I have code in a Jupyter notebook that uses requests to check whether a URL exists and then writes the result to a text file. Here is the code:
import requests
Instaurl = open("dictionaries/insta.txt", 'w', encoding="utf-8")
cli = ['duolingo', 'ryanair', 'mcguinness.paddy', 'duolingodeutschland', 'duolingobrasil']
exist = []
url = []
for i in cli:
    r = requests.get("https://www.instagram.com/" + i + "/")
    if r.apparent_encoding == 'Windows-1252':
        exist.append(i)
        url.append("instagram.com/" + i + "/")
# write() expects a string, so join the list into one URL per line
Instaurl.write("\n".join(url))
Instaurl.close()
Let's say that I accidentally add to the cli list a username that is already in the text file (duolingo, for example). Is there a way to make sure that, if requests finds a URL that is already in the text file, it is not added to the text file again?
Thank you!
You defined a list:
cli = ['duolingo', ...]
It sounds like you would prefer to define a set:
cli = {'duolingo', ...}
That way, duplicates will be suppressed.
This applies to duplicates in the initial assignment and to any duplicate cli.add(entry) you might attempt later.
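If the goal is also to skip URLs that are already in the text file from an earlier run, one possible approach is to load the existing lines into a set first and append only the new ones. This is a minimal sketch, assuming one URL per line; the helper name append_new_urls and the temporary file path are made up for illustration, and the requests check is left out.

```python
import os
import tempfile


def append_new_urls(path, usernames):
    """Append instagram.com/<name>/ lines to path, skipping ones already present."""
    # Load what is already in the file (if any) into a set for fast lookups.
    seen = set()
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            seen = {line.strip() for line in f if line.strip()}

    added = []
    with open(path, "a", encoding="utf-8") as f:
        for name in usernames:
            url = "instagram.com/" + name + "/"
            if url in seen:
                continue  # already recorded, in this run or a previous one
            f.write(url + "\n")
            seen.add(url)
            added.append(url)
    return added


# Demo on a temporary file so nothing real is touched.
demo_path = os.path.join(tempfile.mkdtemp(), "insta.txt")
first = append_new_urls(demo_path, ["duolingo", "ryanair", "duolingo"])
second = append_new_urls(demo_path, ["duolingo", "duolingobrasil"])
print(first)
print(second)
```

The file is opened in append mode ('a') rather than write mode ('w'), so entries from earlier runs are kept.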

Read form fields in a PDF created by Adobe LiveCycle Designer

How do I get the fields from this PDF file? It is a dynamic PDF created by Adobe LiveCycle Designer. If you open the link in a web browser, you will probably see a single page that starts with 'Please wait...'. If you download the file and open it in Adobe Reader (5.0 or higher), you should see all 8 pages.
So, when reading it via PyPDF2, you get an empty dictionary, because PyPDF2 sees the file as the same single page that a web browser shows.
def print_fields(path):
    from PyPDF2 import PdfFileReader
    reader = PdfFileReader(str(path))
    fields = reader.getFields()
    print(fields)
You can use the Java-dependent library tika to read the contents of all 8 pages. However, the results are messy, and I would like to avoid the Java dependency.
def read_via_tika(path):
    from tika import parser
    raw = parser.from_file(str(path))
    content = raw['content']
    print(content)
So, basically, I can manually use Edit -> Form Options -> Export Data… in Adobe Acrobat DC to get a nice XML. I need to get the same form fields and their values via Python.
Thanks to this awesome answer, I managed to retrieve the fields using pdfminer.six.
Navigate through Catalog > AcroForm > XFA, then call pdfminer.pdftypes.resolve1 on the object right after the b'datasets' element in the list.
In my case, the following code worked (source: ankur garg)
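import PyPDF2 as pypdf

def findInDict(needle, haystack):
    # Recursively search nested dictionaries for the first value stored under `needle`.
    for key in haystack.keys():
        try:
            value = haystack[key]
        except Exception:
            continue
        if key == needle:
            return value
        if isinstance(value, dict):
            x = findInDict(needle, value)
            if x is not None:
                return x

pdfobject = open('CTRX_filled.pdf', 'rb')
pdf = pypdf.PdfFileReader(pdfobject)
xfa = findInDict('/XFA', pdf.resolvedObjects)
# Index 7 worked for this particular file; the position may differ in other PDFs.
xml = xfa[7].getObject().getData()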
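Once the XML has been extracted, the field values can be read with the standard library. This is a rough sketch, assuming the form data is made of simple nested elements; the sample XML and the helper name leaf_fields are invented for illustration, and a real XFA datasets packet will have its own element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the bytes returned by getData() above.
sample_xml = b"""<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
  <xfa:data>
    <form1>
      <Name>Jane Doe</Name>
      <Address>
        <City>Springfield</City>
      </Address>
    </form1>
  </xfa:data>
</xfa:datasets>"""


def leaf_fields(xml_bytes):
    """Return {tag: text} for every element that has no child elements."""
    root = ET.fromstring(xml_bytes)
    fields = {}
    for element in root.iter():
        if len(element) == 0:  # a leaf: no children, so it holds a value
            tag = element.tag.split("}")[-1]  # drop any '{namespace}' prefix
            fields[tag] = (element.text or "").strip()
    return fields


fields = leaf_fields(sample_xml)
print(fields)
```

Note that if two leaf elements share a tag name, the later one overwrites the earlier one in this sketch; collecting values in a list per tag would avoid that.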

Extract and get value from a text file python

I have executed SSH commands on a remote machine using the paramiko library and written the output to a text file. Now I want to extract a few values from that text file. The content of the file is pasted below:
b'\nMS Administrator\n(C) Copyright 2006-2016 LP\n\n[MODE]> SHOW INFO\n\n\nMode: \nTrusted Certificates\n1 Details\n------------\n\tDeveloper ID: MS-00c1\n\tTester ID: ms-00B1\n\tValid from: 2030-01-29T06:51:15Z\n\tValid until: 2030-01-30T06:51:15Z\n\t
How do I get the values of Developer ID and Tester ID? The file is huge.
As suggested by users, I have written the snippet below.
file = open("Output.txt").readlines()
for lines in file:
    word = re.findall('Developer\sID:\s(.*)\n', lines)[0]
    print(word)
I get the error IndexError: list index out of range.
If I remove the index, I get empty output.
file = open("Output.txt").readlines()
developer_id = ""
for line in file:
    if 'Developer ID' in line:
        developer_id = line.split(":")[-1].strip()
print(developer_id)
You can use regular expressions:
text = """\nMS Administrator\n(C) Copyright 2006-2016 LP\n\n[MODE]> SHOW INFO\n\n\nMode: \nTrusted Certificates\n1 Details\n------------\n\tDeveloper ID: MS-00c1\n\tTester ID: ms-00B1\n\tValid from: 2030-01-29T06:51:15Z\n\tValid until: 2030-01-30T06:51:15Z\n\t"""
import re
# re.search scans the whole string; re.match only matches at the very start.
developerID = re.search(r"Developer ID:\s*(.+)", text).group(1)
testerID = re.search(r"Tester ID:\s*(.+)", text).group(1)
If your output is consistent in format, you can use something as simple as split():
developer_id = text.split('\n')[11].split(':', 1)[1].strip()
tester_id = text.split('\n')[12].split(':', 1)[1].strip()
Again, this assumes that the output always has the same layout. Otherwise, use a regex as suggested above.
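For completeness, here is a small sketch of the regex approach that picks up both IDs in one pass. It assumes the captured output is available as bytes or text; the sample below is modelled on the output in the question, and the helper name extract_ids is made up for illustration.

```python
import re

# Sample modelled on the output shown in the question.
raw = (
    b"\nMS Administrator\n(C) Copyright 2006-2016 LP\n\n[MODE]> SHOW INFO\n\n\n"
    b"Mode: \nTrusted Certificates\n1 Details\n------------\n"
    b"\tDeveloper ID: MS-00c1\n\tTester ID: ms-00B1\n"
    b"\tValid from: 2030-01-29T06:51:15Z\n\tValid until: 2030-01-30T06:51:15Z\n\t"
)


def extract_ids(data):
    """Return a dict with the Developer ID and Tester ID found in the output."""
    # paramiko returns bytes, so decode before applying a text pattern.
    text = data.decode("utf-8") if isinstance(data, bytes) else data
    # MULTILINE lets ^ match at the start of every line; [ \t]* skips the indentation.
    pattern = r"^[ \t]*(Developer ID|Tester ID):[ \t]*(\S+)"
    return dict(re.findall(pattern, text, flags=re.MULTILINE))


ids = extract_ids(raw)
print(ids["Developer ID"], ids["Tester ID"])
```

If the output lists several certificates, the dict keeps only the last value for each label; iterating over re.findall directly would keep them all.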

Username and Password login

I'd like to create a login that opens a text/CSV file, reads the valid usernames and passwords from it, and allows access to the rest of the program only if what the user entered matches an entry in the file.
How would I change the code below so that it opens a file, reads the valid usernames and passwords, and checks them against the user's input?
Currently I have something that works, but there is only one password, which I have set in the code.
Password = StringVar()
Username = StringVar()
def EnterPassword():
    # Open read-only; opening with 'w' first would erase the file's contents
    with open('Logins.txt') as file: #Text file i am using
        data = file.read() #data=current text in text file
    UsernameAttempt = Username.get() #how to get value from entry box
    PasswordAttempt = Password.get() #how to get value from entry box
    if PasswordAttempt == '' and UsernameAttempt == '':
        self.delete()
        Landlord = LandlordMenu()
    else:
        PasswordError = messagebox.showerror('Password/Username Entry','Incorrect Username or Password entered.\n Please try again.')
PasswordButton = Button(self.GenericGui,text = 'Landlord Section',height = 3, width = 15, command = EnterPassword, font = ('TkDefaultFont',14),relief=RAISED).place(x=60,y=175)
Some assistance would be appreciated
Please have a look at the documentation. The question in your code comments (#how to get value from entry box) is easy to solve using the official documentation.
For reading files, there is also official documentation on strings and file operations (reading a file line by line into strings, then using string.split(';') to get lists instead of raw row strings).
Please read the documentation before writing applications. You do not need to know the complete API of every Python module, only where to look. It is exhausting to depend on other users and developers when there is no actual need for it (there is very detailed documentation and there are tons of how-tos for this kind of thing).
This is not meant to be offensive but to show you how easily you can find documentation. Both results were the first results from a search engine (ddg).
Please keep in mind that SO is neither a code-writing service nor a let-me-google-that-for-you forum.
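As a rough sketch of the file-reading approach described above: it assumes one username;password pair per line, and the helper names load_logins and is_valid_login are made up for illustration. Plain-text passwords are used only to keep the example short; a real application should store salted hashes instead.

```python
import csv
import io


def load_logins(lines, delimiter=";"):
    """Build a {username: password} dict from 'username;password' rows."""
    reader = csv.reader(lines, delimiter=delimiter)
    # Skip blank or malformed rows that do not have both columns.
    return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}


def is_valid_login(logins, username, password):
    """Return True only if the username exists and the password matches."""
    return username in logins and logins[username] == password


# An in-memory stand-in for open('Logins.txt'); any iterable of lines works.
sample_file = io.StringIO("alice;wonderland\nbob;builder\n")
logins = load_logins(sample_file)
print(is_valid_login(logins, "alice", "wonderland"))
print(is_valid_login(logins, "alice", "wrong"))
```

In the Tkinter callback, the values from Username.get() and Password.get() would be passed to is_valid_login in place of the hard-coded comparison.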

maintaining formatting of imported text with mako and rst2pdf

I've created a template which renders pdf files from csv input. However, when the csv input fields contain user formatting, with line breaks and indentations, it messes with the rst2pdf formatting engine. Is there a way to consistently deal with user input in a way that doesn't break the document flow, but also maintains the formatting of the input text? Example script below:
from mako.template import Template
from rst2pdf.createpdf import RstToPdf
mytext = """This is the first line
Then there is a second
Then a third
    This one could be indented
I'd like it to maintain the formatting."""
template = """
My PDF Document
===============
It starts with a paragraph, but after this I'd like to insert `mytext`.
It should keep the formatting intact, though I don't know what formatting to expect.
${mytext}
"""
mytemplate = Template(template)
pdf = RstToPdf()
pdf.createPdf(text=mytemplate.render(mytext=mytext),output='foo.pdf')
I have tried adding the following function in the template to insert | at the start of each line, but that doesn't seem to work either.
<%!
def wrap(text):
    return text.replace("\\n", "\\n|")
%>
Then ${mytext} would become |${mytext | wrap}. This throws the error:
<string>:10: (WARNING/2) Inline substitution_reference start-string without end-string.
It turns out I was on the right track; I just needed a space between the | and the text. The following code works:
from mako.template import Template
from rst2pdf.createpdf import RstToPdf
mytext = """This is the first line
Then there is a second
Then a third
    How about an indent?
I'd like it to maintain the formatting."""
template = """
<%!
def wrap(text):
    return text.replace("\\n", "\\n| ")
%>
My PDF Document
===============
It starts with a paragraph, but after this I'd like to insert `mytext`.
It should keep the formatting intact.
| ${mytext | wrap}
"""
mytemplate = Template(template)
pdf = RstToPdf()
#print mytemplate.render(mytext=mytext)
pdf.createPdf(text=mytemplate.render(mytext=mytext),output='foo.pdf')
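The same line-block idea can be tried outside the template. This is a minimal sketch in plain Python, with no mako or rst2pdf involved; the function name to_line_block is made up for illustration. It prefixes every line with "| " and turns empty lines into a bare "|", which reStructuredText reads as a blank line inside the line block.

```python
def to_line_block(text):
    """Prefix every line with '| ' so reStructuredText treats it as a line block."""
    return "\n".join(
        "| " + line if line.strip() else "|"  # a bare '|' keeps blank lines in the block
        for line in text.splitlines()
    )


sample = "This is the first line\nThen a second\n    This one is indented\nLast line"
block = to_line_block(sample)
print(block)
```

Unlike the replace-based version, this also prefixes the first line, so the template would only need ${mytext | wrap} without the leading "| ".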
