Yaml not working properly in Python3 version of pythonanywhere - python

Good day. I am trying to create a quick-and-dirty configuration file for my pythonanywhere code. I tried to use YAML, but the result is weird.
import os
import yaml

yaml_str = """Time_config:
    Tiempo_entre_avisos: 1
    Tiempo_entre_backups: 7
    Tiempo_entre_pushes: 30
Other_config:
    Morosos_treshold: 800
Mail_config:
    Comunication_mail: ''
    Backup_mail: ''
    Director_mail: []
"""

try:
    yaml_file = open("/BBDD/file.yml", 'w+')
except:
    print("FILE NOT FOUND")
else:
    print("PROCESSING FILE")
    yaml.dump(yaml_str, yaml_file, default_flow_style=False)
    a = yaml.dump(yaml_str, default_flow_style=False)
    print(a)  # I make a print to debug
    yaml_file.close()
The code seems to work fairly well. However, the result seems corrupted. Both in the file and in the print it looks like this (including the quote characters):
"Time_config:\n Tiempo_entre_avisos: 1\n Tiempo_entre_backups: 7\n Tiempo_entre_pushes:\ \ 30\nOther_config:\n Morosos_treshold: 800\nMail_config:\n Comunication_mail:\ \ ''\n Backup_mail: ''\n Director_mail: []\n"
If I copy and paste that string in the python console, yaml gives me the intended result, which is this one:
Time_config:
    Tiempo_entre_avisos: 1
    Tiempo_entre_backups: 7
    Tiempo_entre_pushes: 30
Other_config:
    Morosos_treshold: 800
Mail_config:
    Comunication_mail: ''
    Backup_mail: ''
    Director_mail: []
Why does this happen? Why am I not getting the intended result on the first try? Why is it printing the newline escape (\n) instead of inserting an actual new line, and why does it include the " symbols?

Your code dumps the string itself, so PyYAML serializes it as a single quoted scalar (which is where the quotes and the \n escapes come from). I think you should be loading the yaml from the string first, then continuing:
# Everything before here is the same
print("PROCESSING FILE")
yaml_data = yaml.load(yaml_str)
yaml.dump(yaml_data, yaml_file, default_flow_style=False)
a = yaml.dump(yaml_data, default_flow_style=False)
print(a) #I make a print to debug
yaml_file.close()
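On PyYAML 5.1 or newer, a bare `yaml.load(yaml_str)` also emits a deprecation warning unless you pass a `Loader`; `safe_load`/`safe_dump` avoid that. A minimal sketch of the load-then-dump round trip (key names borrowed from the question, trimmed for brevity):

```python
import yaml

# Parse the YAML text into Python objects first...
yaml_str = """Time_config:
    Tiempo_entre_avisos: 1
Other_config:
    Morosos_treshold: 800
"""
data = yaml.safe_load(yaml_str)   # a dict of dicts, not a string

# ...then dump the objects. Dumping the raw string instead would
# serialize the string itself as one quoted scalar with \n escapes.
out = yaml.safe_dump(data, default_flow_style=False)
```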

Related

searching for a particular character, except in comments of a file

I am working on a Python 2 to 3 migration.
I want to check whether the files contain a "/" (division) operation. Since there are too many files, I plan to use a script to do so.
The script works fine, but some files have comments, and those comments contain "/" as well, which gets falsely detected.
Eg:
File:
import sys
#blah blah
#get/set ---This gets detected
a=5
b=2
c=a/b --- I want to detect this
d=5/3 --- I want to detect this
I do not want the comments section to be considered, is there any regex that could help me here?
Script:
import os
import pathlib
import re

text = '/'
APP_FOLDER = r"C:\Users\Files"  # was `APP_FOLDER: "C\Users\Files"`, which is not valid Python
file_count = 0
for dirpath, dirnames, filenames in os.walk(APP_FOLDER):
    for inputFile in filenames:
        if pathlib.Path(inputFile).suffix == ".py":
            file_path = os.path.join(dirpath, inputFile)
            with open(file_path) as f:
                num_lines = len(f.readlines())
            with open(file_path, 'r') as fp:
                for line in fp:
                    if re.findall(text, line, flags=re.IGNORECASE):
                        file_count = file_count + 1
                        print("File path: " + file_path)
                        print("File name: " + inputFile)
                        print("*******************************************************************************")
                        break
Looking forward to suggestions. PS: The # symbol need not be the first character in the line.
NOTE:
The comments on your question actually give a better answer than this...
You can do this quite easily by simply splitting on the # character and only evaluating the part before the # character. See below:
def find_char_in_text(text, subtext, commentchar='#'):
    result = []
    for line in text.split('\n'):
        if commentchar in line:
            # Split on the comment character.
            # The reason not to change `line` itself directly is
            # so you can add the whole line to the results.
            evaluate_this = line.split(commentchar)[0]
        else:
            evaluate_this = line
        if subtext in evaluate_this:
            result.append(line)
    return result
text = """File:
import sys
#blah blah
#get/set ---This gets detected
a=5
b=2
c=a/b --- I want to detect this
d=5/3 --- I want to detect this"""
for result in find_char_in_text(text, '/'):
    print(result)
output
c=a/b --- I want to detect this
d=5/3 --- I want to detect this
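Splitting on # has an edge case of its own: a / or # inside a string literal is mishandled. Since the files being scanned are Python, the stdlib tokenize module can tell operators apart from comments and strings; a sketch (the function name is mine):

```python
import io
import tokenize

def lines_with_division(source):
    # Collect the (1-based) line numbers of every '/' OP token;
    # COMMENT and STRING tokens are different token types, so a '/'
    # inside them is never counted.
    hits = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP and tok.string == '/':
            hits.add(tok.start[0])
    src_lines = source.splitlines()
    return [src_lines[n - 1] for n in sorted(hits)]

code = """import sys
#get/set
a = 5
c = a / 2
d = 5/3
"""
```

Note that floor division `//` tokenizes as its own operator, so it is not counted either.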

filecmp returns False even though files are equal

I'm trying to compare two files with filecmp. The problem is that the result is always "No, the files are NOT the same" (i.e. False), even though the files are the same.
I'm writing the same content to two different files. First I write to revision_1.txt:
original_stdout = sys.stdout
with open('revision_1.txt', 'w') as rev1:
    sys.stdout = rev1
    print(revision)  # revision is output from command i took before
sys.stdout = original_stdout
if filecmp.cmp('revision_1.txt', 'revision_2.txt'):
    # revision_2.txt is file I c
    print("Both the files are same")
else:
    # Do whatever you want if the files are NOT the same
    print("No, the files are NOT the same")
original_stdout = sys.stdout
with open('revision_2.txt', 'w') as rev2:
    sys.stdout = rev2
    print(revision)  # revision is output from command i took before
sys.stdout = original_stdout
My goal: if the files are equal, stop the script. If they are not, rewrite revision_2.txt and then send mail (I already wrote the code for mail).
Your usage of files is unusual:
import filecmp

revision = "08/15"

with open('revision_1.txt', 'w') as rev1:
    rev1.write(revision)
with open('revision_2.txt', 'w') as rev2:
    rev2.write(revision)
with open('revision_3.txt', 'w') as rev3:
    rev3.write(revision + "-42")

# should compare equal
if filecmp.cmp('revision_1.txt', 'revision_2.txt'):
    print("Identical")
else:
    print("No, the files are NOT the same")

# should NOT compare equal
if filecmp.cmp('revision_1.txt', 'revision_3.txt'):
    print("Identical")
else:
    print("No, the files are NOT the same")
prints
Identical
No, the files are NOT the same
Try setting shallow to False (the default is True), i.e.
if filecmp.cmp('revision_1.txt', 'revision_2.txt', shallow=False):
From the documentation:
If shallow is true, files with identical os.stat() signatures are taken to be equal. Otherwise, the contents of the files are compared.
https://docs.python.org/3/library/filecmp.html#filecmp.cmp
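A small self-contained check of that behaviour, using temporary files (the paths are made up for the demo):

```python
import filecmp
import os
import tempfile

# Two files with identical contents; shallow=False forces a
# byte-by-byte comparison instead of trusting os.stat() signatures.
tmpdir = tempfile.mkdtemp()
p1 = os.path.join(tmpdir, "revision_1.txt")
p2 = os.path.join(tmpdir, "revision_2.txt")
for p in (p1, p2):
    with open(p, "w") as f:
        f.write("fpc1-1603878922-228\n")

same = filecmp.cmp(p1, p2, shallow=False)
```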
Thank you all for the replies.
As I said, I'm very new to Python.
Following your recommendations I changed the code; this time I'm posting the full script with an explanation.
I managed to compare 'revision' and 'd'; my problem is that I'm getting a different rpc-reply message-id each time.
How can I ignore the message-id (I only need the Revision value)?
See script output:
Not equal
Revision: fpc1-1603878922-228
FFFFFFF
Revision: fpc1-1603878922-228
FFFFFFF
Script:
import smtplib
import email.message
from email.mime.text import MIMEText
from ncclient import manager
from ncclient.xml_ import *
import sys
import time
import filecmp

# Connecting to juniper cc-vc-leg
conn = manager.connect(
    host='10.1.1.1',
    port='830',
    username='test',
    password='test',
    timeout=10,
    device_params={'name': 'junos'},
    hostkey_verify=False)

# Take juniper commands
resault = conn.command('show version | match Hostname', format='text')
revision = conn.command('show system commit revision', format='text')
compare_config = conn.compare_configuration(rollback=1)

# Open & read file vc-lg_rev.text
f = open('vc-lg_rev.text', 'r')
d = f.read()

# Check if revision output is equal to file "vc-lg_rev.text"
# If equal exit the script
if (revision == d):
    print('equal')
    exit()
    print('I hop script stopped')
else:
    print('Not equal')
    print(revision)
    print('FFFFFFF')
    print(d)
    print('FFFFFFF')

# To save last revision number to "vc-lg_rev.text"
with open('vc-lg_rev.text', 'w', buffering=1) as rev1:
    rev1.write(str(revision))
    rev1.flush()
    rev1.close()

# This is how i copy "compare_config" output to file "vc-lg_compare.text"
original_stdout = sys.stdout
with open('vc-lg_compare.text', 'w') as a:
    sys.stdout = a
    print(compare_config)
    sys.stdout = original_stdout

def send_email(compare):
    server = smtplib.SMTP('techunix.technion.ac.il', 25)
    email_reciver = 'rafish#technion.ac.il', 'rafi1shemesh#gmail.com'
    message = f"'Subject': mail_subject \n\n {compare}"
    ID = 'Juniper_Compare'
    server.sendmail(ID, email_reciver, message)

with open('vc-lg_compare.text', 'r') as compare:  # "as" means file object called compare
    text = str(compare.read())  # I want to recive the output as string to look specific word in the file
if (text.find('+') > -1) or (text.find('- ') > -1):
    send_email(text)
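On the follow-up question about the changing message-id: since the rpc-reply differs only in that attribute, one option is to compare just the extracted revision value rather than the whole reply text. A sketch with a made-up helper and made-up reply strings (the real replies come from conn.command(...)):

```python
import re

def extract_revision(reply_text):
    # Grab only the "Revision: <value>" token; everything else in the
    # rpc-reply, including the per-call message-id, is ignored.
    m = re.search(r"Revision:\s*([\w.-]+)", str(reply_text))
    return m.group(1) if m else None

# Two hypothetical replies that differ only in message-id:
reply_a = '<rpc-reply message-id="urn:uuid:aaa">\nRevision: fpc1-1603878922-228\n</rpc-reply>'
reply_b = '<rpc-reply message-id="urn:uuid:bbb">\nRevision: fpc1-1603878922-228\n</rpc-reply>'
```

Comparing `extract_revision(revision)` against the stored value, and writing only that value to vc-lg_rev.text, sidesteps the message-id entirely.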

Check if a variable string exists in a text file

So guys, I'm trying to make a password generator, but I'm having this trouble:
First, the code I use for tests:
import os
import random

idTest = "TEST"
passwrd = str(random.randint(11, 99))

if not os.path.exists('Senhas.txt'):
    txtFileW = open('Senhas.txt', 'w')
    txtFileW.writelines(f'{idTest}: {passwrd}\n')
    txtFileW.close()
else:
    txtFileA = open('Senhas.txt', 'a')
    txtFileA.write(f'{idTest}: {passwrd}\n')
    txtFileA.close()
print(f'{idTest}: {passwrd}')
Well, what I'm expecting is something like this:
else:
    with open('Senhas.txt', 'r+') as opened:
        opened.read()
        for lines in opened:
            if something == idTest:
                lines.replace(f'{something}', f'{idTest}')
            else:
                break
    txtFileA = open('Senhas.txt', 'a')
    txtFileA.write(f'{idTest}: {passwrd}\n')
    txtFileA.close()
    print(f'{idTest}: {passwrd}')
I've searched for it, but all I've found are ways to separate it into 2 files (which doesn't fit my project) or ways that work with "static" strings, which don't work for me either.
You can use the fileinput module to update the file in place.
import fileinput

with fileinput.input(files=('Senhas.txt',), inplace=True) as f:
    for line in f:
        if line.startswith(idTest + ':'):
            print(f'{idTest}: {passwrd}')
        else:
            print(line, end='')  # `line` keeps its newline, so suppress print's own

Mafft only creating one file with Python

So I'm working on a project to align sequence IDs with their sequences. I was given a barcode file, which contains tags for DNA sequences, e.g. TTAGG. There are several tags (ATTAC, ACCAT, etc.), which get removed from a sequence file and paired with a seq ID.
Example:
sequence file --> SEQ 01 TTAGGAACCCAAA
barcode file --> TTAGG
the output file I want will remove the barcode and use it to create a new fasta format file.
Example:
testfile.TTAGG which when opened should have
>SEQ01
AACCCAAA
There are several of these files. I want to take each of the files I create and run them through mafft, but when I run my script, mafft only processes one file. The files I mentioned above come out fine, but when mafft runs, it only runs on the last file created.
Here's my script:
#!/usr/bin/python

import sys
import os

fname = sys.argv[1]
barcodefname = sys.argv[2]

barcodefile = open(barcodefname, "r")
for barcode in barcodefile:
    barcode = barcode.strip()
    outfname = "%s.%s" % (fname, barcode)
    outf = open(outfname, "w+")
    handle = open(fname, "r")
    mafftname = outfname + ".mafft"
    for line in handle:
        newline = line.split()
        seq = newline[0]
        brc = newline[1]
        potential_barcode = brc[:len(barcode)]
        if potential_barcode == barcode:
            outseq = brc[len(barcode):]
            barcodeseq = ">%s\n%s\n" % (seq, outseq)
            outf.write(barcodeseq)
    handle.close()
    outf.close()

cmd = "mafft %s > %s" % (outfname, mafftname)
os.system(cmd)
barcodefile.close()
I hope that was clear enough! Please help! I've tried changing my indentation and adjusting when I close the files. Most of the time it won't make the .mafft file at all; sometimes it does but puts nothing in it, and mostly it only works on the last file created.
Example:
the beginning of the code creates files such as -
testfile.ATTAC
testfile.AGGAC
testfile.TTAGG
then when it runs mafft it only creates
testfile.TTAGG.mafft (with the correct input)
I have tried closing the outf file and then opening it again, which tells me I'm coercing it.
I've changed the outf file to write-only; that doesn't change anything.
The reason why mafft only aligns the last file is that its execution is outside the loop.
As your code stands, you create an input file name variable (outfname) in each iteration of the loop, but this variable is always overwritten in the next iteration. Therefore, when your code eventually reaches the mafft execution command, the outfname variable will contain the last file name of the loop.
To correct this, simply insert the mafft execution command inside the loop:
#!/usr/bin/python

import sys
import os

fname = sys.argv[1]
barcodefname = sys.argv[2]

barcodefile = open(barcodefname, "r")
for barcode in barcodefile:
    barcode = barcode.strip()
    outfname = "%s.%s" % (fname, barcode)
    outf = open(outfname, "w+")
    handle = open(fname, "r")
    mafftname = outfname + ".mafft"
    for line in handle:
        newline = line.split()
        seq = newline[0]
        brc = newline[1]
        potential_barcode = brc[:len(barcode)]
        if potential_barcode == barcode:
            outseq = brc[len(barcode):]
            barcodeseq = ">%s\n%s\n" % (seq, outseq)
            outf.write(barcodeseq)
    handle.close()
    outf.close()
    cmd = "mafft %s > %s" % (outfname, mafftname)
    os.system(cmd)
barcodefile.close()
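As a side note, `os.system` with shell redirection works, but subprocess gives you error checking for free. A sketch of a drop-in replacement (the helper name is mine; it assumes the tool, e.g. mafft, is on PATH):

```python
import subprocess

def run_tool(cmd, infile, outfile):
    # Equivalent of "cmd infile > outfile" without going through a
    # shell; check=True raises CalledProcessError if the tool fails,
    # instead of silently leaving an empty output file behind.
    with open(outfile, "w") as out:
        subprocess.run([cmd, infile], stdout=out, check=True)

# inside the barcode loop:
#   run_tool("mafft", outfname, mafftname)
```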

Script that reads PDF metadata and writes to CSV

I wrote a script to read PDF metadata to ease a task at work. The current working version is not very usable in the long run:
from pyPdf import PdfFileReader

BASEDIR = ''
PDFFiles = []

def extractor():
    output = open('windoutput.txt', 'r+')
    for file in PDFFiles:
        try:
            pdf_toread = PdfFileReader(open(BASEDIR + file, 'r'))
            pdf_info = pdf_toread.getDocumentInfo()
            #print str(pdf_info) #print full metadata if you want
            x = file + "~" + pdf_info['/Title'] + " ~ " + pdf_info['/Subject']
            print x
            output.write(x + '\n')
        except:
            x = file + '~' + ' ERROR: Data missing or corrupt'
            print x
            output.write(x + '\n')
            pass
    output.close()

if __name__ == "__main__":
    extractor()
Currently, as you can see, I have to manually input the working directory and manually populate the list of PDF files. It also just prints out the data in the terminal in a format that I can copy/paste/separate into a spreadsheet.
I'd like the script to work automatically in whichever directory I throw it in and populate a CSV file for easier use. So far:
from pyPdf import PdfFileReader
import csv
import os

def extractor():
    basedir = os.getcwd()
    extension = '.pdf'
    pdffiles = [filter(lambda x: x.endswith('.pdf'), os.listdir(basedir))]
    with open('pdfmetadata.csv', 'wb') as csvfile:
        for f in pdffiles:
            try:
                pdf_to_read = PdfFileReader(open(f, 'r'))
                pdf_info = pdf_to_read.getDocumentInfo()
                title = pdf_info['/Title']
                subject = pdf_info['/Subject']
                csvfile.writerow([file, title, subject])
                print 'Metadata for %s written successfully.' % (f)
            except:
                print 'ERROR reading file %s.' % (f)
                #output.writerow(x + '\n')
                pass

if __name__ == "__main__":
    extractor()
In its current state it just prints a single error message (the message from my except block, not an error raised by Python) and then stops. I've been staring at it for a while and I'm not really sure where to go from here. Can anyone point me in the right direction?
writerow([file, title, subject]) should be writerow([f, title, subject]). Note also that csvfile is a plain file object, which has no writerow method; create writer = csv.writer(csvfile) and call writer.writerow(...).
You can use sys.exc_info() to print the details of your error
http://docs.python.org/2/library/sys.html#sys.exc_info
Did you check the pdffiles variable contains what you think it does? I was getting a list inside a list... so maybe try:
for files in pdffiles:
    for f in files:
        # do stuff with f
I personally like glob. Notice I add * before the .pdf in the extension variable:
import os
import glob

basedir = os.getcwd()
extension = '*.pdf'
pdffiles = glob.glob(os.path.join(basedir, extension))
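Here's a sketch of the CSV write path with csv.writer, in Python 3 style (the original is Python 2, where you'd open the file with 'wb' and use print statements):

```python
import csv

def write_rows(csv_path, rows):
    # csv.writer(...) is what provides writerow()/writerows();
    # calling writerow() on the bare file object raises AttributeError.
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)

def read_rows(csv_path):
    with open(csv_path, newline="") as f:
        return list(csv.reader(f))
```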
Figured it out. The script I used to download the files was saving them with a trailing '\r\n' after the file name, which I didn't notice until I actually ls'd the directory to see what was up. Thanks for everyone's help.
