Appending print commands with .decode() method - python

I have a script of about 300 lines (part of which is pasted below) with a lot of print commands, and I am trying to clean up the output it produces. If I leave it the way it is, all the print commands print bytes, including the literal \r\n, to the console.
I figured out that if I append .decode('utf-8') to the variable I need to print, the output is what I expect (a Unicode string). For example, compare the print (data1) and print (data3) commands below. What I want to do is go through all of the code and append .decode() to the variable in every print statement.
All the print commands are in this format: print (dataxxxx)
import telnetlib
import time
import sys
import random
from xlwt import Workbook
shelfIp = "10.10.10.10"
shelf = "33"
print ("Shelf IP is: " + str(shelfIp))
print ("Shelf number is: " + str(shelf))
def addCard():
    tn = telnetlib.Telnet(shelfIp)
    ### Telnet session
    tn.read_until(b"<",5)
    cmd = "ACT-USER::ADMIN:ONE::ADMIN;"
    tn.write(bytes(cmd,encoding="UTF-8"))
    data1 = tn.read_until(b"ONE COMPLD", 5)
    print (data1.decode('utf-8'))
    ### Entering second network element
    cmd = "ENT-CARD::CARD" + shelf + "-" + shelf + ":TWO:xyz:;"
    tn.write(bytes(cmd,encoding="UTF-8"))
    data3 = tn.read_until(b"TWO COMPLD", 5)
    print (data3)
    ### Entering third network element
    cmd = "ENT-CARD::CARD-%s-%s:ADM:ABC:;" %(shelf,shelf)
    tn.write(bytes(cmd,encoding="UTF-8"))
    dataAmp = tn.read_until(b"ADM COMPLD", 5)
    print (dataAmp)
    tn.close()
addCard()

If you are looking to do some sort of find-and-replace on the code, you can try this:
import re
# Read the script as text so the str pattern below can be applied to it
with open('script.py', 'r') as f:
    script = f.read()
# Raw strings keep \( and \g<1> intact; \s* allows the space in "print (x)"
newscript = re.sub(r"(print\s*\(.*)\)", r"\g<1>.decode('utf-8'))", script)
with open('script.py', 'w') as f:
    f.write(newscript)
What I did in the regular expression:
Catch text of the form print(......) and save the print(..... part into group 1.
Replace the closing ) that follows print(.... with .decode('utf-8')), using \g<1> to put saved group 1 back as the prefix of the replacement text.
Note that this rewrites every print call on a line, including ones that print plain strings or already call .decode(), so review the result before running it.
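If only prints of a single bare variable should be touched, a narrower pattern may be safer. The following is a minimal sketch, not a drop-in tool: the names BARE_PRINT and add_decode are made up for illustration, and it assumes every bare variable being printed really holds bytes.

```python
import re

# Matches only calls of the form print(name) or print (name), where the
# argument is a single bare identifier such as data3 or dataAmp.
BARE_PRINT = re.compile(r"\bprint\s*\(\s*([A-Za-z_]\w*)\s*\)")

def add_decode(source):
    """Return source with .decode('utf-8') appended to bare-variable prints."""
    return BARE_PRINT.sub(r"print(\1.decode('utf-8'))", source)

sample = "\n".join([
    'print ("Shelf IP is: " + str(shelfIp))',   # string expression: left alone
    "print (data1.decode('utf-8'))",            # already decoded: left alone
    "print (data3)",                            # bare variable: rewritten
])
print(add_decode(sample))
```

Lines that build a string or already call .decode() do not match the pattern, so only the last sample line changes.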

Appending .decode() to the print() call itself will fail, because print() returns None and None has no .decode() method.
>>> x=u"testing"
>>> print(x).decode('utf-8')
testing
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'decode'
You must apply .decode('utf-8') to the variables you wish to decode, which is not easily accomplished using regex-based tools.

Related

Why do I get a SyntaxError using exec()?

This function grabs a Python script from a paste on Pastebin titled py_0001. When execution reaches the try/except block, it throws an error:
SyntaxError: unexpected character after line continuation character
If you copy the value of script_ and declare it as a string variable, it executes without any errors.
The function works fine until it reaches the error-handling part.
def get_script():
    ''' grabs python script from pastebin '''
    reg_ = r'[a-zA-Z0-9]*\">py_0001'
    resp = requests.get(url='https://pastebin.com/u/'+usr_name)
    path = re.findall(reg_ , str(resp.content) , re.MULTILINE)
    url2 = "https://pastebin.com/raw/"+ str(path[0]).replace('">py_0001' , '')
    resp2 = requests.get(url2)
    script_ = str(resp2.content)[2:-1]
    print (script_)
    try:
        exec(script_)
    except:
        print ("3rr0r")
This is the output of the paste on pastebin
import os\r\nimport time \r\nimport random \r\n \r\ndef fun_9991():\r\n ## a simple code example to test \r\n for i in range (0 , 10 ):\r\n print ( " loop count {} , random number is {} , time is {} ".format(i , random.randrange(10) , int(time.time()/1000)))\r\n print ("loop reached the end")\r\n \r\n \r\nif __name__ == "__main__":\r\n fun_9991()\r\n\r\n\r\n
Your problem is calling str() on a bytes object. NEVER call str() on a bytes object to convert it to a string, since it behaves like repr(). Slicing with [2:-1] only strips the leading b' and the trailing quote; it does not undo the escaping of special characters, so sequences such as \r\n stay in the text as literal backslashes.
You can do this:
script_ = resp2.content.decode('utf-8')
Or this:
script_ = resp2.text
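To see the difference without touching the network, here is a small sketch. The bytes literal raw is a made-up stand-in for resp2.content, and compiles is a hypothetical helper that only checks whether the text parses; nothing is executed.

```python
raw = b'import os\r\nprint("hi")\r\n'   # stand-in for resp2.content

# str() on bytes gives the repr, so each line break turns into the
# literal characters backslash-r backslash-n instead of real control characters.
via_str = str(raw)[2:-1]
via_decode = raw.decode('utf-8')

def compiles(source):
    """Return True if source parses as Python, without running it."""
    try:
        compile(source, '<paste>', 'exec')
        return True
    except SyntaxError:
        return False

print(compiles(via_str))      # the stray backslashes break parsing
print(compiles(via_decode))   # the decoded text parses normally
```

The stray backslash after "import os" is what the parser reports as an unexpected character after a line continuation character.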
Also, executing random code from the internet is an incredibly bad idea.

removing space from string in python

def digits_plus(test):
    test=0
    while (test<=3):
        print str(test)+"+",
        test = test+1
    return()
digits_plus(3)
The output is:
0+ 1+ 2+ 3+
However, I would like to get: 0+1+2+3+
Another way to do that is to build a list of the numbers as strings and then join them.
mylist = []
for num in range(0, 4):
    mylist.append(str(num))
# mylist is now ['0', '1', '2', '3']
print '+'.join(mylist) + '+'
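For Python 3, the same join idea can be written as a function that returns the string instead of printing piece by piece. This is only a sketch; the parameter name limit is an assumption about what the original test argument was meant to be.

```python
def digits_plus(limit):
    """Return '0+1+...+limit+' with no spaces between the parts."""
    return ''.join(str(n) + '+' for n in range(limit + 1))

print(digits_plus(3))  # prints 0+1+2+3+
```

Returning the string keeps the formatting separate from the output, so the caller decides where it goes.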
If you're stuck using Python 2.7, start your module with
from __future__ import print_function
Then instead of
print str(test)+"+",
use
print(str(test)+"+", end='')
You'll probably want to add a print() at the end (outside the loop!) to get a newline after you're done printing the rest.
You could also use the sys.stdout object to write output (to stdout), which gives you finer control. It outputs exactly and only the characters you tell it to, whereas print adds separators and line endings and converts values to strings for you.
#!/usr/bin/env python
import sys
test = '0'
sys.stdout.write(str(test)+"+")
# Or my preferred string formatting method
# (the '%s' converts the value to a string):
sys.stdout.write("%s+" % test)
# You probably don't need to do this explicitly, but
# if output is unexpectedly missing, you can
# flush the buffer like this:
sys.stdout.flush()

Python: Writing multiple variables to a file

I'm fairly new to Python and I've written a scraper that prints the data I scrape exactly the way I need it, but I'm having trouble writing the data to a file. I need it to look exactly the same and be in the same order as it does when it prints in IDLE.
import requests
import re
from bs4 import BeautifulSoup
year_entry = raw_input("Enter year: ")
week_entry = raw_input("Enter week number: ")
week_link = requests.get("http://sports.yahoo.com/nfl/scoreboard/?week=" + week_entry + "&phase=2&season=" + year_entry)
page_content = BeautifulSoup(week_link.content)
a_links = page_content.find_all('tr', {'class': 'game link'})
for link in a_links:
    r = 'http://www.sports.yahoo.com' + str(link.attrs['data-url'])
    r_get = requests.get(r)
    soup = BeautifulSoup(r_get.content)
    stats = soup.find_all("td", {'class':'stat-value'})
    teams = soup.find_all("th", {'class':'stat-value'})
    scores = soup.find_all('dd', {"class": 'score'})
    try:
        game_score = scores[-1]
        game_score = game_score.text
        x = game_score.split(" ")
        away_score = x[1]
        home_score = x[4]
        home_team = teams[1]
        away_team = teams[0]
        away_team_stats = stats[0::2]
        home_team_stats = stats[1::2]
        print away_team.text + ',' + away_score + ',',
        for stats in away_team_stats:
            print stats.text + ',',
        print '\n'
        print home_team.text + ',' + home_score +',',
        for stats in home_team_stats:
            print stats.text + ',',
        print '\n'
    except:
        pass
I am totally confused on how to get this to print to a txt file the same way it prints in IDLE. The code is built to only run on completed weeks of the NFL season. So if you test the code, I recommend year = 2014 and week = 12 (or before)
Thanks,
JT
To write to a file, you need to build up each line as a string and then write that line to the file.
You'd use something like:
# Open/create a file for your output
with open('my_output_file.csv', 'w') as csv_out:
    ...
    # Your BeautifulSoup code and parsing goes here
    ...
    # Then build up the fields of each output line in a list
    for link in a_links:
        away_fields = [away_team.text, away_score]
        for stat in away_team_stats:
            away_fields.append(stat.text)
        home_fields = [home_team.text, home_score]
        for stat in home_team_stats:
            home_fields.append(stat.text)
        # Join the fields with commas and write the lines to the file
        csv_out.write(",".join(away_fields) + '\n')
        csv_out.write(",".join(home_fields) + '\n')
This is a quick and dirty fix. To do it properly, you probably want to look into the csv module.
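As a rough illustration of the csv module route in Python 3, the sketch below writes two rows through csv.writer. The function name write_game and the sample rows are invented for the example, and an in-memory buffer stands in for the real output file.

```python
import csv
import io

def write_game(out, away, home):
    """Write one away row and one home row to a file-like object."""
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(away)
    writer.writerow(home)

# Hypothetical sample rows: team name, score, then stat values.
away_row = ['Away Team', '17', '300', '2']
home_row = ['Home Team', '24', '350', '1']

# Stand-in for: open('my_output_file.csv', 'w', newline='')
buffer = io.StringIO()
write_game(buffer, away_row, home_row)
print(buffer.getvalue())
```

The writer also takes care of quoting, so a team name or stat that contains a comma would not break the columns.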
From the structure of your output I agree with Jamie that using CSV is a logical choice.
But since you're using Python 2, it's possible to use an alternate form of the print statement to print to a file.
From https://docs.python.org/2/reference/simple_stmts.html#the-print-statement
print also has an extended form, defined by the second portion of the
syntax described above. This form is sometimes referred to as “print
chevron.” In this form, the first expression after the >> must
evaluate to a “file-like” object, specifically an object that has a
write() method as described above. With this extended form, the
subsequent expressions are printed to this file object. If the first
expression evaluates to None, then sys.stdout is used as the file for
output.
E.g.,
outfile = open("myfile.txt", "w")
print >>outfile, "Hello, world"
outfile.close()
However, this syntax is not supported in Python 3, so I guess it's probably not a good idea to use it. :) FWIW, I generally use the file write() method in my code when writing to files, except that I tend to use print >>sys.stderr for error messages.
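In Python 3, the print() function takes a file keyword argument that plays the same role as the chevron form. A short sketch, with an in-memory buffer standing in for the opened file:

```python
import io
import sys

outfile = io.StringIO()          # stand-in for open("myfile.txt", "w")
print("Hello, world", file=outfile)

# Error messages can be sent to stderr the same way.
print("something went wrong", file=sys.stderr)

print(repr(outfile.getvalue()))
```

In Python 2.7 the same call works after from __future__ import print_function.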

Two regex functions together do not work

I am trying to get the index for the start of a tag and the end of another tag. However, when I use one regex it works absolutely fine but for two regex functions, it gives an error for the second one.
Kindly help in explaining the reason
The below code works fine:
import re
f = open('C:/Users/Jyoti/Desktop/PythonPrograms/try.xml','r')
opentag = re.search('<TEXT>',f.read())
begin = opentag.start()+6
print begin
But when I add another similar regex, it gives me the error
AttributeError: 'NoneType' object has no attribute 'start'
which I understand is due to the start() function returning None
Below is the code:
import re
f = open('C:/Users/Jyoti/Desktop/PythonPrograms/try.xml','r')
opentag = re.search('<TEXT>',f.read())
begin = opentag.start()+6
print begin
closetag = re.search('</TEXT>',f.read())
end = closetag.start() - 1
print end
Please provide a solution to how can I get this working. Also I am a newbie here so please don't mind if I ask more questions on the solution.
The first f.read() reads the whole file and leaves the file position at the end, so when you call f.read() a second time there is nothing left to read and it returns an empty string.
If you need to search on the same text again, save the output of f.read(), and then do a regular expression search on it as below:
import re
f = open('C:/Users/Jyoti/Desktop/PythonPrograms/try.xml','r')
text = f.read()
opentag = re.search('<TEXT>',text)
begin = opentag.start()+6
print begin
closetag = re.search('</TEXT>',text)
end = closetag.start() - 1
print end
f.read() reads the whole file. So there's nothing left to read on the second f.read() call.
See https://docs.python.org/2/tutorial/inputoutput.html#methods-of-file-objects
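The behaviour is easy to reproduce with an in-memory file. In this sketch the XML snippet is made up, io.StringIO stands in for the real file, and opentag.end() is used instead of start()+6 to get the index just past the opening tag.

```python
import io
import re

f = io.StringIO("<DOC><TEXT>hello</TEXT></DOC>")  # stand-in for the opened XML file

first = f.read()     # the whole content
second = f.read()    # '' because the position is already at the end
f.seek(0)            # rewind to the start
third = f.read()     # the whole content again

opentag = re.search(r'<TEXT>', first)
closetag = re.search(r'</TEXT>', first)
begin = opentag.end()        # index just after <TEXT>
end = closetag.start()       # index where </TEXT> begins
print(repr(second), first[begin:end])
```

Searching an empty string is why the second re.search() in the question returned None.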
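First of all, you have to know that after f.read() reads the file, the file pointer is left at EOF, so calling f.read() again gives you the empty string ''. Secondly, it is good practice to put r before a string passed as a pattern to re.search. The r means raw: Python leaves backslashes in the literal untouched, so regex escapes such as \d reach the re module intact. It does not escape regex special characters, and these particular patterns behave the same with or without it. So you can do something like this: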
import re
f = open('C:/Users/Jyoti/Desktop/PythonPrograms/try.xml','r')
data = f.read()
opentag = re.search(r'<TEXT>',data)
begin = opentag.start()+6
print begin
closetag = re.search(r'</TEXT>',data)
end = closetag.start() - 1
print end
gl & hf with Python :)

"cannot concatenate 'str' and 'list' objects" keeps coming up :(

I'm writing a Python program that calculates Latin squares using two numbers the user enters on a previous page. But an error keeps coming up: "cannot concatenate 'str' and 'list' objects". Here is the program:
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# enable debugging
import cgi
import cgitb
cgitb.enable()
def template(file, **vars):
    return open(file, 'r').read() % vars
print "Content-type: text/html\n"
print
form = cgi.FieldStorage() # instantiate only once!
num_1 = form.getfirst('num_1')
num_2 = form.getfirst('num_2')
int1r = str(num_1)
int2r = str(num_2)
def calc_range(int2r, int1r):
    start = range(int2r, int1r + 1)
    end = range(1, int2r)
    return start+end
int1 = int(int1r)
int2 = int(int2r)
out_str = ''
for i in range(0, int1):
    first_line_num = (int2 + i) % int1
    if first_line_num == 0:
        first_line_num = int1
    line = calc_range(first_line_num, int1)
    out_str += line
print template('results.html', output=out_str, title="Latin Squares")
range returns a list object, so when you say
line = calc_range(first_line_num, int1)
You are assigning a list to line. This is why out_str += line throws the error.
You can use str() to convert a list to a string, or you can build up a string a different way to get the results you are looking for.
By doing out_str += line, you're trying to add a list (from calc_range) to a string. I don't even know what this is supposed to be doing, but that's where the problem lies.
You didn't say what line you're getting the error from, but I'm guessing it's:
out_str += line
The first variable is a string. The second is a list of numbers. You can't concatenate a list onto a string. I don't know what you're trying to do exactly, but how about:
out_str += ", ".join(str(n) for n in line)
That will add the numbers joined by commas onto out_str. Each number has to be converted with str() first, because join() only accepts strings.
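Putting the pieces together, a Python 3 sketch of the row-building loop might look like this. The function latin_square and the one-row-per-line output are guesses at the intended result, since the question does not show the expected output.

```python
def calc_range(first, size):
    """Return first..size followed by 1..first-1, as a list of ints."""
    return list(range(first, size + 1)) + list(range(1, first))

def latin_square(size, start):
    """Build the rows the way the question's loop does, one string per row."""
    rows = []
    for i in range(size):
        first = (start + i) % size
        if first == 0:
            first = size
        rows.append(", ".join(str(n) for n in calc_range(first, size)))
    return "\n".join(rows)

print(latin_square(3, 1))
```

In Python 3, range() no longer returns a list, which is why calc_range wraps each range in list() before adding them.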
calc_range() returns a list; however, you are attempting to add it to a string (out_str).
It looks like your code is unfinished - don't you want to do something with the range of numbers returned by calc_range()? Like, say, something with the form?
line = ''.join(num_1[index] for index in calc_range(first_line_num, int1))
I don't know if that's what you want - but maybe something like that?
