Python syntax error in email address extraction script

I found this Python script on MetaFilter and modified the addresses and password as needed, but I get a syntax error at the very last line. In the error message, the little caret symbol is underneath the quote after print "\n.
Not being a coder at all, I have no idea where to turn next. Any help would be greatly appreciated.
What I actually want to do is pull out only the "From" addresses, not the To and Cc addresses, but I figured I would get this working properly first.
The script was run on Windows using Python 3.2, with the output redirected to a text file.
import email
import getpass
import imaplib
HOST = "mail.-----.com"
USER = "sales#-----.com"
FOLDER = "Folder"
connection = imaplib.IMAP4_SSL(HOST)
res, data = connection.login(USER, getpass.getpass())
assert res == "OK"
res, count = connection.select(FOLDER)
assert res == "OK"
res, (msg_nums,) = connection.search(None, "ALL")
assert res == "OK"
for msg_num in msg_nums.split():
    res, message_text = connection.fetch(msg_num, "(RFC822)")
    assert res == "OK"
    message = email.message_from_string(message_text[0][1])
    tos = message.get_all("From") or []
    ccs = message.get_all("Cc") or []
    all_recipients = email.Utils.getaddresses(tos + ccs)
    print "\n".join(addr.lower() for realname, addr in all_recipients)

print is a function in Python 3, so it needs to be:
print("\n".join(addr.lower() for realname, addr in all_recipients))
The script was probably originally written for use with Python 2.x.
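Beyond print, two other lines of the script would likely need attention on Python 3: email.Utils is spelled email.utils there, and imaplib hands back bytes, so email.message_from_bytes is the matching parser. As a minimal sketch of the "From only" goal mentioned in the question, here is the extraction step applied to a hard-coded message. The sample message and the helper name from_addresses are made up for illustration; in the real script the bytes would come from connection.fetch.

```python
import email
import email.utils

# Made-up sample message standing in for message_text[0][1] from IMAP.
RAW_MESSAGE = (
    b"From: Alice Example <Alice@Example.com>\r\n"
    b"To: sales@example.org\r\n"
    b"Cc: Bob <bob@example.net>\r\n"
    b"Subject: hello\r\n"
    b"\r\n"
    b"body\r\n"
)


def from_addresses(raw_bytes):
    """Return the lower-cased addresses found in the From header only."""
    message = email.message_from_bytes(raw_bytes)
    senders = message.get_all("From") or []
    return [addr.lower() for _name, addr in email.utils.getaddresses(senders)]


print("\n".join(from_addresses(RAW_MESSAGE)))
```

Leaving the Cc header out of the list passed to getaddresses is all it takes to restrict the output to senders.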

Related

Getting JSON content from a packet using Scapy with Python

I have a pcapng file that contains a little bit of traffic. One of the packets I am trying to print out contains JSON data. If I open the packet in Wireshark, I can see the values in the JSON, but when I use scapy to read the file and print the packet, I don't see them.
from scapy.all import IP, sniff
from scapy.layers import http
def process_tcp_packet(packet):
    if packet.haslayer(http.HTTPRequest):
        http_layer = packet.getlayer(http.HTTPRequest)
        ip_layer = packet.getlayer(IP)
        #print('\n{0[src]} just requested a {1[Method]} {1[Host]}{1[Path]}'.format(ip_layer.fields, http_layer.fields))
        #print(ip_layer.fields)
        #print(http_layer.fields)
        #packet.show()
        print('Packet: ' + str(packet))
        print('\n\n')

# Start sniffing the network.
sniff(offline='test.pcapng', prn=process_tcp_packet, count=2)
Here is the JSON content Wireshark is showing me:
And this is the output I am getting for that packet using the code above:
Packet: b'\x18\x0fv\xef0\x8a\xc4\x98\\\xe7=\x18\x08\x00E\x00\x01&&S#\x00#\x06}\n\xc0\xa8\x89\x94#\xa7(\x91\x9b\xd0\x00P\x16-/\x9e\xb1\xa1\xe8V\x80\x18\x01K\x97\xaf\x00\x00\x01\x01\x08\n\x00\x00\t\xd5\xfb\xc3b\x89POST /v1/identify HTTP/1.1\r\nHost: api.segment.io\r\nUser-Agent: Roku/DVP-9.10 (489.10E04121A)\r\nAccept: application/json\r\nAuthorization: Basic: NHJmY3AzUEJmTUhPVlJsWVZZNTZKRDZ0N1JuMUNoaVY=\r\nContent-Type: application/json\r\nContent-Length: 704\r\n\r\n'
I was reading about how to print the entire content of the packet, and that's where I came across both packet.show() and print(packet); however, both of them are still missing the JSON data.
I want to get the JSON data because I want to be able to manually parse it. I don't like how Wireshark has all the JSON nested into arrows that I have to drop down to see.
This is the output of show:
And I am using the latest version of scapy.

It's an old question, but for future people who search for an answer, here is how I did it:
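import json

# `packet` is a scapy packet, e.g. the one passed to the sniff() callback.
packet_dict = {}
for line in packet.show2(dump=True).split('\n'):
    if '###' in line:
        # Layer header line: start a new sub-dictionary.
        layer = line.strip('#[] ')
        packet_dict[layer] = {}
    elif '=' in line:
        # Field line: store it under the current layer.
        key, val = line.split('=', 1)
        packet_dict[layer][key.strip()] = val.strip()
print(json.dumps(packet_dict))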
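To try the parsing logic without a capture file, the same loop can be run over a hard-coded string. The sample below is hand-written to imitate the general shape of a show2(dump=True) dump; its layers and field values are invented for illustration and are not taken from the packet in the question. The helper name dump_to_dict is also hypothetical.

```python
import json

# Hand-written text imitating the shape of a scapy dump; values are made up.
SAMPLE_DUMP = """\
###[ IP ]###
  version   = 4
  ttl       = 64
  src       = 192.168.137.148
###[ TCP ]###
  sport     = 39888
  dport     = http
###[ Raw ]###
  load      = '{"userId": "abc"}'
"""


def dump_to_dict(dump_text):
    """Turn '###[ Layer ]###' / 'key = value' lines into nested dicts."""
    packet_dict = {}
    layer = None
    for line in dump_text.split("\n"):
        if "###" in line:
            layer = line.strip("#[] ")
            packet_dict[layer] = {}
        elif "=" in line and layer is not None:
            key, val = line.split("=", 1)
            packet_dict[layer][key.strip()] = val.strip()
    return packet_dict


parsed = dump_to_dict(SAMPLE_DUMP)
print(json.dumps(parsed, indent=2))
```

Every value stays a string in this version, which is the limitation the next answer sets out to address.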

If it can be useful to someone, starting from Yechiel's code I made some improvements:
Key values are returned in the correct format instead of all as a string
Sublayers are parsed
import ast
import logging

log = logging.getLogger(__name__)

def pkt2dict(pkt):
    packet_dict = {}
    for line in pkt.show2(dump=True).split('\n'):
        if '###' in line:
            if '|###' in line:
                sublayer = line.strip('|#[] ')
                packet_dict[layer][sublayer] = {}
            else:
                layer = line.strip('#[] ')
                packet_dict[layer] = {}
        elif '=' in line:
            if '|' in line and 'sublayer' in locals():
                key, val = line.strip('| ').split('=', 1)
                packet_dict[layer][sublayer][key.strip()] = val.strip('\' ')
            else:
                key, val = line.split('=', 1)
                val = val.strip('\' ')
                if val:
                    try:
                        # literal_eval instead of eval: packet contents are
                        # untrusted and should never be executed as code.
                        packet_dict[layer][key.strip()] = ast.literal_eval(val)
                    except (ValueError, SyntaxError):
                        packet_dict[layer][key.strip()] = val
        else:
            log.debug("pkt2dict packet not decoded: " + line)
    return packet_dict
I still need to check whether it works on all types of layers returned by scapy.
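The "correct format" conversion can be looked at in isolation. This sketch uses ast.literal_eval, which only accepts Python literals, as a substitute for a bare eval; that substitution is my assumption about what is safe for packet data, and the helper name convert_field is hypothetical.

```python
import ast


def convert_field(text):
    """Return text as a Python literal when possible, else the string itself."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a literal (e.g. a host name or a flag name): keep the string.
        return text


for sample in ["4", "0x10", "api.segment.io", "DF"]:
    print(repr(sample), "->", repr(convert_field(sample)))
```

Numeric fields come back as integers, while anything that does not parse as a literal is kept as text.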

Indentation Error Python Not Working

I'm trying to run my code and I get this error:
  File "C:/trcrt/trcrt.py", line 42
    def checkInternet():
    ^
IndentationError: unexpected unindent
The code is supposed to run a traceroute to a website. I know it's not very smart, but it's what I was told to do.
I've checked the code using pep8 and everything seems to be fine.
'''
Developer: Roei Edri
File name: trcrt.py
Date: 24.11.17
Version: 1.1.0
Description: Get an url as an input and prints the traceroute to it.
'''
import sys
import urllib2
i, o, e = sys.stdin, sys.stdout, sys.stderr
from scapy.all import *
from scapy.layers.inet import *
sys.stdin, sys.stdout, sys.stderr = i, o, e
def trcrt(dst):
    """
    Check for the route for the given destination
    :param dst: Final destination, in a form of a website.
    :type dst: str
    """
    try:
        pckt = IP(dst=dst)/ICMP()  # Creates the
        # packet
        ip = [p for p in pckt.dst]  # Gets the ip
        print "Tracerouting for {0} : {1}".format(dst, ip[0])
        for ttl in range(1, 40):
            pckt = IP(ttl=ttl, dst=dst)/ICMP()
            timeBefore = time.time()
            reply = sr1(pckt, verbose=0, timeout=5)
            timeAfter = time.time()
            timeForReply = (timeAfter - timeBefore)*1000
            if reply is not None:
                print "{0} : {1} ; Time for reply: {2}".format(ttl,
                    reply.src, timeForReply)
                if reply.type == 0:
                    print "Tracerout Completed"
                    break
            else:
                print "{0} ... Request Time Out".format(ttl)


def checkInternet():
    """
    Checks if there is an internet connection
    :return: True if there is an internet connection
    """
    try:
        urllib2.urlopen('http://45.33.21.159', timeout=1)
        return True
    except urllib2.URLError as IntError:
        return False
Thanks for any help...
Btw, pep8 says "module level import not at top of file" for lines 12 and 13.

The try block is missing its except clause.
    try:
        pckt = IP(dst=dst)/ICMP()  # Creates the
        # packet
        ip = [p for p in pckt.dst]  # Gets the ip
        print "Tracerouting for {0} : {1}".format(dst, ip[0])
        for ttl in range(1, 40):
            pckt = IP(ttl=ttl, dst=dst)/ICMP()
            timeBefore = time.time()
            reply = sr1(pckt, verbose=0, timeout=5)
            timeAfter = time.time()
            timeForReply = (timeAfter - timeBefore)*1000
            if reply is not None:
                print "{0} : {1} ; Time for reply: {2}".format(ttl,
                    reply.src, timeForReply)
                if reply.type == 0:
                    print "Tracerout Completed"
                    break
            else:
                print "{0} ... Request Time Out".format(ttl)
    except:  # Here : Add the exception you wish to catch
        pass  # handle this exception appropriately
As a general rule, do not use catch-all except clauses, and do not just pass on a caught exception: doing so makes the code fail silently.
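To make that rule concrete, here is a small sketch (written for Python 3) of catching one specific exception and reporting it rather than swallowing it. The helper parse_ttl and its 1..255 range check are hypothetical and not part of the traceroute script above.

```python
import logging

log = logging.getLogger(__name__)


def parse_ttl(text):
    """Hypothetical helper: convert user input to a TTL between 1 and 255."""
    try:
        ttl = int(text)
    except ValueError:
        # Only the error we expect is caught, and it is reported, not hidden.
        log.warning("not an integer: %r", text)
        return None
    if not 1 <= ttl <= 255:
        log.warning("TTL out of range: %d", ttl)
        return None
    return ttl


print(parse_ttl("12"))
print(parse_ttl("abc"))
```

Any other exception, such as a TypeError from passing None, would still propagate and show a traceback, which is usually what you want while debugging.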

If this is your full code, there are two things to check:
1) Have you mixed tabs and spaces? Make sure that all tabs are converted to spaces for indentation (I recommend 4 spaces per tab). A good IDE will do this for you.
2) The try: in trcrt(dst) does not have a matching except block.
By the way, PEP8 will also tell you that function names should be lowercase:
check_internet instead of checkInternet, ...
I will give you the same recommendation that I give to everyone working with me: start using an IDE that marks PEP8 violations and other errors for you; there are several around. It helps a lot with spotting those errors and trains you to write clean Python code that is easily readable and (if you put comments in it) also reusable and understandable a few years later.

Python search imap email for a string

New to Python, having some trouble getting past this.
I am fetching emails from Gmail via IMAP (with starter code from https://yuji.wordpress.com/2011/06/22/python-imaplib-imap-example-with-gmail/) and want to search a specific email (which I am able to fetch) for a specific string. Something like this:
ids = data[0]
id_list = ids.split()
latest_email_id = id_list[-1]
result, data = mail.fetch(latest_email_id, "(RFC822)")
raw_email = data[0][1]
def search_raw():
    if 'gave' in raw_email:
        done = 'yes'
    else:
        done = 'no'
and it always sets done to 'no'. Here's the output for the email (the body section):
Content-Type multipart/related;boundary=1_56D8EAE1_29AD7EA0;type="text/html"
--1_56D8EAE1_29AD7EA0
Content-Type text/html;charset="UTF-8"
Content-Transfer-Encoding base64
PEhUTUw+CiAgICAgICAgPEhFQUQ+CiAgICAgICAgICAgICAgICA8VElUTEU+PC9USVRMRT4KICAg
ICAgICA8L0hFQUQ+CiAgICAgICAgPEJPRFk+CiAgICAgICAgICAgICAgICA8UCBhbGlnbj0ibGVm
dCI+PEZPTlQgZmFjZT0iVmVyZGFuYSIgY29sb3I9IiNjYzAwMDAiIHNpemU9IjIiPlNlbnQgZnJv
bSBteSBtb2JpbGUuCiAgICAgICAgICAgICAgICA8QlI+X19fX19fX19fX19fX19fX19fX19fX19f
X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXzwvRk9OVD48L1A+CgogICAgICAg
ICAgICAgICAgPFBSRT4KR2F2ZQoKPC9QUkU+CiAgICAgICAgPC9CT0RZPgo8L0hUTUw+Cg==
--1_56D8EAE1_29AD7EA0--
I know the issue is the html, but can't seem to figure out how to parse the email properly.
Thank you!

The body above is base64-encoded. Python has a module named base64 that lets you decode it.
import base64
import re
def has_gave(raw_email):
    # Note: b64decode expects only the base64-encoded body text,
    # not the full raw message with its headers and boundaries.
    email_body = base64.b64decode(raw_email)
    match = re.search(r'.*gave.*', email_body, re.IGNORECASE)
    if match:
        done = 'yes'
        print 'match found for word ', match.group()
    else:
        done = 'no'
        print 'no match found'
    return done
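An alternative that avoids cutting the base64 text out by hand is to let the standard email package undo the transfer encoding for each part. The following is a sketch for Python 3, assuming the raw message is available as bytes (as imaplib returns it there); the sample message is built locally for illustration and the helper name body_contains is hypothetical.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def body_contains(raw_bytes, needle):
    """Return True if any text part of the message contains needle."""
    message = email.message_from_bytes(raw_bytes)
    for part in message.walk():
        if part.get_content_maintype() != "text":
            continue  # skip multipart containers and attachments
        payload = part.get_payload(decode=True)  # bytes, base64 already undone
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        if needle.lower() in text.lower():
            return True
    return False


# Made-up message resembling the one in the question: one HTML part,
# which MIMEText stores base64-encoded when the charset is utf-8.
sample = MIMEMultipart("related")
sample.attach(MIMEText("<HTML><BODY><PRE>\nGave\n</PRE></BODY></HTML>", "html", "utf-8"))
raw = sample.as_bytes()

print(body_contains(raw, "gave"))
```

The comparison is case-insensitive here because the decoded body in the question contains "Gave" with a capital G.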

Reading windows event log in Python using pywin32 (win32evtlog module)

I would like to read the Windows event log. I am not sure if it's the best way, but I would like to use the win32evtlog module from pywin32 to do so. First and foremost, is it possible to read logs from Windows 7 using this library, and if so, how do I read the events associated with application runs (I assume running an .exe leaves a trace in the Windows event log)?
I have managed to find a small example on the net, but it's not enough for me, and unfortunately the documentation isn't well written.
import win32evtlog
hand = win32evtlog.OpenEventLog(None,"Microsoft-Windows-TaskScheduler/Operational")
print win32evtlog.GetNumberOfEventLogRecords(hand)

You can find plenty of demos related to the Windows API in your C:\PythonXX\Lib\site-packages\win32\Demos folder. In this folder you'll find a script named eventLogDemo.py, which shows how to use the win32evtlog module. Just start this script with eventLogDemo.py -v and it will print entries from your Windows event log with log type Application.
In case you can't find this script:
import win32evtlog
import win32api
import win32con
import win32security # To translate NT Sids to account names.
import win32evtlogutil

def ReadLog(computer, logType="Application", dumpEachRecord = 0):
    # read the entire log back.
    h=win32evtlog.OpenEventLog(computer, logType)
    numRecords = win32evtlog.GetNumberOfEventLogRecords(h)
    # print "There are %d records" % numRecords
    num=0
    while 1:
        objects = win32evtlog.ReadEventLog(h, win32evtlog.EVENTLOG_BACKWARDS_READ|win32evtlog.EVENTLOG_SEQUENTIAL_READ, 0)
        if not objects:
            break
        for object in objects:
            # get it for testing purposes, but dont print it.
            msg = win32evtlogutil.SafeFormatMessage(object, logType)
            if object.Sid is not None:
                try:
                    domain, user, typ = win32security.LookupAccountSid(computer, object.Sid)
                    sidDesc = "%s/%s" % (domain, user)
                except win32security.error:
                    sidDesc = str(object.Sid)
                user_desc = "Event associated with user %s" % (sidDesc,)
            else:
                user_desc = None
            if dumpEachRecord:
                print "Event record from %r generated at %s" % (object.SourceName, object.TimeGenerated.Format())
                if user_desc:
                    print user_desc
                try:
                    print msg
                except UnicodeError:
                    print "(unicode error printing message: repr() follows...)"
                    print repr(msg)
        num = num + len(objects)
    if numRecords == num:
        print "Successfully read all", numRecords, "records"
    else:
        print "Couldn't get all records - reported %d, but found %d" % (numRecords, num)
        print "(Note that some other app may have written records while we were running!)"
    win32evtlog.CloseEventLog(h)

def usage():
    print "Writes an event to the event log."
    print "-w : Dont write any test records."
    print "-r : Dont read the event log"
    print "-c : computerName : Process the log on the specified computer"
    print "-v : Verbose"
    print "-t : LogType - Use the specified log - default = 'Application'"

def test():
    # check if running on Windows NT, if not, display notice and terminate
    if win32api.GetVersion() & 0x80000000:
        print "This sample only runs on NT"
        return
    import sys, getopt
    opts, args = getopt.getopt(sys.argv[1:], "rwh?c:t:v")
    computer = None
    do_read = do_write = 1
    logType = "Application"
    verbose = 0
    if len(args)>0:
        print "Invalid args"
        usage()
        return 1
    for opt, val in opts:
        if opt == '-t':
            logType = val
        if opt == '-c':
            computer = val
        if opt in ['-h', '-?']:
            usage()
            return
        if opt=='-r':
            do_read = 0
        if opt=='-w':
            do_write = 0
        if opt=='-v':
            verbose = verbose + 1
    if do_write:
        ph=win32api.GetCurrentProcess()
        th = win32security.OpenProcessToken(ph,win32con.TOKEN_READ)
        my_sid = win32security.GetTokenInformation(th,win32security.TokenUser)[0]
        win32evtlogutil.ReportEvent(logType, 2,
            strings=["The message text for event 2","Another insert"],
            data = "Raw\0Data".encode("ascii"), sid = my_sid)
        win32evtlogutil.ReportEvent(logType, 1, eventType=win32evtlog.EVENTLOG_WARNING_TYPE,
            strings=["A warning","An even more dire warning"],
            data = "Raw\0Data".encode("ascii"), sid = my_sid)
        win32evtlogutil.ReportEvent(logType, 1, eventType=win32evtlog.EVENTLOG_INFORMATION_TYPE,
            strings=["An info","Too much info"],
            data = "Raw\0Data".encode("ascii"), sid = my_sid)
        print("Successfully wrote 3 records to the log")
    if do_read:
        ReadLog(computer, logType, verbose > 0)

if __name__=='__main__':
    test()
I hope this script fits your needs

Sending png file via socket in Python

I'm using Python 2.7.9 and I'm trying to send a PNG file.
But something strange happens. I'm using sockets and sending a POST request (or something like one).
I send the request from the client to the server, then print the length of the request received on the server; for example, the length is 1051.
Then I use a regex to extract the PNG file data and print its length, and that length is 2632, which is larger than the data received!
I think the problem is that the content is actually written, but not in the right representation. I tried different things but they did not work, so I'm asking here how to solve this problem.
Server source code:
import socket
import re
server = socket.socket()
server.bind(('0.0.0.0',8080))
while True:
    server.listen(2)
    (client, client_addr) = server.accept()
    print 'IP :',client_addr
    res = client.recv(0xfffffff)
    print len(res)
    #get file name
    file_name = res.split('&')[0]
    file_name = str(file_name.split('=')[1])
    print repr(res)
    #get the data of the file
    raw_img = str(re.findall("&photo_data=(.*)" ,res ,re.DOTALL))
    print "File name:" + file_name
    print "Size:" + str(len(raw_img))
    with open(file_name, 'wb') as f:
        f.write(raw_img)
    print "Done"
Client source code:
import socket
client = socket.socket()
client.connect(('127.0.0.1',8080))
raw_data = open('test.png', 'rb').read()
save_file_name = raw_input("Enter the file name:")
print len(raw_data)
output = 'POST /upload HTTP/1.1\r\n'
output += 'Content-Length:' + str(len(raw_data)) + str(len(save_file_name)) + '\r\n\r\n'
output += 'file_name=' + save_file_name + '&'
output += 'photo_data=' + raw_data
print len(output)
client.send(output)
client.close()

First, you should call recv in a while True loop to receive the full data:
res = ''
while True:
    data = client.recv(1024)
    if not data:
        break
    res += data
print len(res)
Then, re.findall actually returns a list, not a string. So you should do this:
r = re.findall("&photo_data=(.*)" ,res ,re.DOTALL)
raw_img = str(r[0])
Now it works fine.
Why didn't the earlier code work? Let's say we have a list:
r = ['\x45']
The data in raw_img is basically like this. If we naively convert the whole list to a str, we get the list's printed representation, brackets and quotes included:
print len(str(r)) # "['E']", 5
What we actually need is r[0]:
print len(str(r[0])) # 1
That's why the file became larger.
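Both points can be tried without a real client and server. The sketch below is written for Python 3, where sockets carry bytes, and uses socket.socketpair() as a stand-in for the connection; the payload, the file name and the helper recv_all are made up for illustration.

```python
import re
import socket


def recv_all(conn, chunk_size=1024):
    """Read from conn until the peer closes the connection."""
    chunks = []
    while True:
        data = conn.recv(chunk_size)
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks)


# Two connected sockets standing in for the client and the server.
sender, receiver = socket.socketpair()
image_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 3000  # fake image data
sender.sendall(b"file_name=out.png&photo_data=" + image_bytes)
sender.close()  # closing is what makes recv() return b"" on the other side

res = recv_all(receiver)
receiver.close()

matches = re.findall(rb"&photo_data=(.*)", res, re.DOTALL)
raw_img = matches[0]  # the captured bytes themselves
as_text = str(matches)  # the list's repr, with brackets, quotes and escapes

print(len(raw_img), len(as_text))
```

The first number printed is the true size of the image data, while the second is the inflated size that results from converting the whole list to a string.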
