Getting JSON content from a packet using Scapy with Python

I have a pcapng file that contains a little bit of traffic. One of the packets I am trying to print contains JSON data. If I open the packet in Wireshark, I can see the values in the JSON, but when I use scapy to read the file and print the packet, I don't see them.
from scapy.all import IP, sniff
from scapy.layers import http
def process_tcp_packet(packet):
    if packet.haslayer(http.HTTPRequest):
        http_layer = packet.getlayer(http.HTTPRequest)
        ip_layer = packet.getlayer(IP)
        #print('\n{0[src]} just requested a {1[Method]} {1[Host]}{1[Path]}'.format(ip_layer.fields, http_layer.fields))
        #print(ip_layer.fields)
        #print(http_layer.fields)
        #packet.show()
        print('Packet: ' + str(packet))
        print('\n\n')
# Start sniffing the network.
sniff(offline='test.pcapng', prn=process_tcp_packet, count=2)
Here is the JSON content Wireshark is showing me:
And this is the output I am getting for that packet using the code above:
Packet: b'\x18\x0fv\xef0\x8a\xc4\x98\\\xe7=\x18\x08\x00E\x00\x01&&S#\x00#\x06}\n\xc0\xa8\x89\x94#\xa7(\x91\x9b\xd0\x00P\x16-/\x9e\xb1\xa1\xe8V\x80\x18\x01K\x97\xaf\x00\x00\x01\x01\x08\n\x00\x00\t\xd5\xfb\xc3b\x89POST /v1/identify HTTP/1.1\r\nHost: api.segment.io\r\nUser-Agent: Roku/DVP-9.10 (489.10E04121A)\r\nAccept: application/json\r\nAuthorization: Basic: NHJmY3AzUEJmTUhPVlJsWVZZNTZKRDZ0N1JuMUNoaVY=\r\nContent-Type: application/json\r\nContent-Length: 704\r\n\r\n'
I was reading about how to print the entire content of a packet, and that's where I came across both packet.show() and print(packet); however, both of them are still missing the JSON data.
I want to get the JSON data so that I can parse it manually. I don't like how Wireshark nests all the JSON under arrows that I have to expand to see.
This is the output of show:
And I am using the latest version of scapy.

It's an old question, but for anyone who searches for an answer in the future, here is how I did it:
import json
packet_dict = {}
for line in packet.show2(dump=True).split('\n'):
    if '###' in line:
        # A line such as "###[ IP ]###" starts a new layer.
        layer = line.strip('#[] ')
        packet_dict[layer] = {}
    elif '=' in line:
        # A line such as "src = 1.2.3.4" is a field of the current layer.
        key, val = line.split('=', 1)
        packet_dict[layer][key.strip()] = val.strip()
print(json.dumps(packet_dict))

In case it is useful to someone: starting from Yechiel's code, I made some improvements:
Key values are returned in their correct type instead of all as strings
Sublayers are parsed
import logging
log = logging.getLogger(__name__)
def pkt2dict(pkt):
    packet_dict = {}
    for line in pkt.show2(dump=True).split('\n'):
        if '###' in line:
            if '|###' in line:
                # "|###[ ... ]###" starts a sublayer of the current layer.
                sublayer = line.strip('|#[] ')
                packet_dict[layer][sublayer] = {}
            else:
                layer = line.strip('#[] ')
                packet_dict[layer] = {}
        elif '=' in line:
            if '|' in line and 'sublayer' in locals():
                key, val = line.strip('| ').split('=', 1)
                packet_dict[layer][sublayer][key.strip()] = val.strip('\' ')
            else:
                key, val = line.split('=', 1)
                val = val.strip('\' ')
                if(val):
                    try:
                        # Note: eval runs the field text as Python code; only use this on trusted captures.
                        packet_dict[layer][key.strip()] = eval(val)
                    except:
                        packet_dict[layer][key.strip()] = val
        else:
            log.debug("pkt2dict packet not decoded: " + line)
    return packet_dict
Still to be checked: whether it works on all types of layers returned by scapy.
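One likely reason the body is missing: the printed packet ends with the blank line that closes the headers, while Content-Length says 704 more bytes follow, so the JSON probably travels in the next TCP segment(s) of the same connection, which Wireshark reassembles for display. Below is a minimal sketch in Python 3, assuming you have already collected the raw TCP payloads of the request in order (with scapy, something like bytes(pkt[TCP].payload) for each packet of the stream). The name extract_json_body is hypothetical and is not part of scapy.

```python
import json

def extract_json_body(segments):
    """Join the TCP payloads of one HTTP request and decode its JSON body.

    segments: list of bytes objects, in stream order.
    Returns the parsed JSON, or None if headers or body are still incomplete.
    """
    stream = b"".join(segments)
    head, sep, body = stream.partition(b"\r\n\r\n")
    if not sep:
        return None  # the header block has not ended yet
    length = None
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None or len(body) < length:
        return None  # body not fully captured
    return json.loads(body[:length].decode("utf-8"))
```

The result is an ordinary dict or list, so it can be printed with json.dumps(result, indent=2) or parsed by hand instead of being expanded node by node in Wireshark.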

Related

python socket file transfer verified with sha256 not working, but only sometimes?

Client side:
def send_file_to_hashed(data, tcpsock):
    time.sleep(1)
    f = data
    flag = 0
    i=0
    tcpsock.send(hashlib.sha256(f.read()).hexdigest())
    f.seek(0)
    time.sleep(1)
    l = f.read(BUFFER_SIZE-64)
    while True:
        while (l):
            tcpsock.send(hashlib.sha256(l).hexdigest() + l)
            time.sleep(1)
            hashok = tcpsock.recv(6)
            if hashok == "HASHOK":
                l = f.read(BUFFER_SIZE-64)
                flag = 1
            if hashok == "BROKEN":
                flag = 0
        if not l:
            time.sleep(1)
            tcpsock.send("DONE")
            break
    return (tcpsock,flag)
def upload(filename):
    flag = 0
    while(flag == 0):
        with open(os.getcwd()+'\\data\\'+ filename +'.csv', 'rU') as UL:
            tuplol = send_file_to_hashed(UL ,send_to_sock(filename +".csv",send_to("upload",TCP_IP,TCP_PORT)))
            (sock,flagn) = tuplol
            flag = flagn
            time.sleep(2)
            sock.close()
Server Side:
elif(message == "upload"):
    message = rec_OK(self.sock)
    fis = os.getcwd()+'/data/'+ time.strftime("%H:%M_%d_%m_%Y") + "_" + message
    f = open(fis , 'w')
    latest = open(os.getcwd()+'/data/' + message , 'w')
    time.sleep(1)
    filehash = rec_OK(self.sock)
    print("filehash:" + filehash)
    while True:
        time.sleep(1)
        rawdata = self.sock.recv(BUFFER_SIZE)
        log.write("rawdata :" + rawdata + "\n")
        data = rawdata[64:]
        dhash = rawdata[:64]
        log.write("chash: " + dhash + "\n")
        log.write("shash: " + hashlib.sha256(data).hexdigest() + "\n")
        if dhash == hashlib.sha256(data).hexdigest():
            f.write(data)
            latest.write(data)
            self.sock.send("HASHOK")
            log.write("HASHOK\n" )
            print"HASHOK"
        else:
            self.sock.send("HASHNO")
            print "HASHNO"
            log.write("HASHNO\n")
        if rawdata == "DONE":
            f.close()
            f = open(fis , 'r')
            if (hashlib.sha256(f.read()).hexdigest() == filehash):
                print "ULDONE"
                log.write("ULDONE")
                f.close()
                latest.close()
                break
            else:
                self.sock.send("BROKEN")
                print hashlib.sha256(f.read()).hexdigest()
                log.write("BROKEN")
                print filehash
                print "BROKEN UL"
                f.close()
The upload works fine in every test I ran from my own computer, and it even worked when uploading over my mobile connection. Still, people sometimes report that it takes a very long time, and they kill it after a few minutes. The data is there on their computers but not on the server. I don't know what is happening, please help!
First of all: this is unrelated to sha.
Streaming over the network is unpredictable. This line
rawdata = self.sock.recv(BUFFER_SIZE)
doesn't guarantee that you read BUFFER_SIZE bytes; in the worst case you may have read only 1 byte. Your server side is therefore completely broken, because it assumes that rawdata contains a whole message. It gets worse: if the client sends the command and the hash in quick succession, you may get e.g. rawdata == 'DONEa2daf78c44(...)', which is two messages mixed together.
The "hanging" part just follows from that. Trace your code and see what happens when the server receives partial or broken messages (I already did that in my imagination :P).
Streaming over the network is almost never as easy as calling sock.send on one side and sock.recv on the other. You need some buffering/framing protocol. For example, you can implement this simple protocol: always interpret the first two bytes as the size of the incoming message, like this:
client (pseudocode)
# convert len of msg into a two-byte array (low byte first)
# I am assuming the max size of msg is 65535
buf = bytearray([len(msg) & 255, len(msg) >> 8])
sock.sendall(buf)
sock.sendall(msg)
server (pseudocode)
size = to_int(sock.recv(1))
size += to_int(sock.recv(1)) << 8
# You need two calls to recv since recv(2) can return 1 byte.
# (well, you can try recv(2) with `if` here to avoid additional
# syscall, not sure if worth it)
buffer = bytearray()
while size > 0:
    tmp = sock.recv(size)
    buffer += tmp
    size -= len(tmp)
Now you have the properly read data in the buffer variable, which you can work with.
WARNING: the pseudocode for the server is simplified. For example, you need to check for an empty recv() result everywhere (including where size is calculated). This is what happens when the client disconnects.
So unfortunately there's a lot of work ahead of you. You have to rewrite the whole sending and receiving code.
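The two-byte length prefix described above can be sketched as runnable Python 3. This is a minimal illustration rather than a drop-in replacement for the upload code: the helper names send_msg, recv_exact and recv_msg are made up for this example, and the "<H" format (unsigned 16-bit, low byte first) is chosen to match the byte order of the pseudocode.

```python
import socket
import struct

def send_msg(sock, payload):
    """Send one message prefixed with its length as a 2-byte little-endian integer."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit a 2-byte length prefix")
    sock.sendall(struct.pack("<H", len(payload)) + payload)

def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            # An empty result means the peer closed the connection.
            raise ConnectionError("peer closed the connection mid-message")
        buf += chunk
    return bytes(buf)

def recv_msg(sock):
    """Read one length-prefixed message and return its payload."""
    (size,) = struct.unpack("<H", recv_exact(sock, 2))
    return recv_exact(sock, size)
```

With framing like this, a control message such as b"DONE" and a data chunk can never be confused, because each recv_msg call returns exactly one message no matter how TCP split or merged the bytes in transit.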

process socket data that ends with a line break

What is the best approach to processing data from a socket connection when I need the data variable to end with a line break (\n)?
I'm using the code below, but sometimes the TCP packets get chunked and it takes a long time before data.endswith("\n") matches.
I've also tried other approaches, like saving the last line if it doesn't end with \n and appending it to data on the next loop iteration, but this doesn't work either, because multiple packets get chunked and the first and second parts don't match.
I have no control over the other end; it basically sends multiple lines that end in \r\n.
Any suggestion is welcome, as I don't have much knowledge of socket connections.
def receive_bar_updates(s):
    global all_bars
    data = ''
    buffer_size = 4096
    while True:
        data += s.recv(buffer_size)
        if not data.endswith("\n"):
            continue
        lines = data.split("\n")
        lines = filter(None, lines)
        for line in lines:
            if line.startswith("BH") or line.startswith("BC"):
                symbol = str(line.split(",")[1])
                all_bars[symbol].append(line)
                y = Thread(target=proccess_bars, kwargs={'symbol': symbol})
                y.start()
        data = ""
Example of "normal" data:
line1\r\n
line2\r\n
line3\r\n
Example of chunked data:
line1\r\n
line2\r\n
lin
If you have raw input that you want to process line by line, the io module is your friend, because it does the low-level work of assembling packets into lines.
You could use:
import io
class SocketIO(io.RawIOBase):
    def __init__(self, sock):
        self.sock = sock
    def read(self, sz=-1):
        if (sz == -1): sz=0x7FFFFFFF
        return self.sock.recv(sz)
    def seekable(self):
        return False
It is more robust than endswith('\n') because if one packet contains an embedded newline ('ab\ncd'), the io module will correctly process it. Your code could become:
def receive_bar_updates(s):
    global all_bars
    data = ''
    buffer_size = 4096
    fd = SocketIO(s) # fd can be used as an input file object
    for line in fd:
        if should_be_rejected_by_filter(line): continue # do not know what filter does...
        if line.startswith("BH") or line.startswith("BC"):
            symbol = str(line.split(",")[1])
            all_bars[symbol].append(line)
            y = Thread(target=proccess_bars, kwargs={'symbol': symbol})
            y.start()
Use socket.socket.makefile() to wrap the socket in a class that implements Text I/O. It handles buffering, converts between bytes and strings, and lets you iterate over lines. Remember to flush any writes.
Example:
#!/usr/bin/env python3
import socket, threading, time
def client(addr):
    with socket.create_connection(addr) as conn:
        conn.sendall(b'aaa')
        time.sleep(1)
        conn.sendall(b'bbb\n')
        time.sleep(1)
        conn.sendall(b'cccddd\n')
        time.sleep(1)
        conn.sendall(b'eeefff')
        time.sleep(1)
        conn.sendall(b'\n')
        conn.shutdown(socket.SHUT_WR)
        response = conn.recv(1024)
        print('client got %r' % (response,))
def main():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0) as listen_socket:
        listen_socket.bind(('localhost', 0))
        listen_socket.listen(1)
        addr = listen_socket.getsockname()
        threading.Thread(target=client, args=(addr,)).start()
        conn, _addr = listen_socket.accept()
        conn_file = conn.makefile(mode='rw', encoding='utf-8')
        for request in conn_file:
            print('server got %r' % (request,))
        conn_file.write('response1\n')
        conn_file.flush()
if __name__ == '__main__':
    main()
$ ./example.py
server got 'aaabbb\n'
server got 'cccddd\n'
server got 'eeefff\n'
client got b'response1\n'
$
Are you accepting different connections? Or is it one stream of data, split up by \r\n's?
When accepting multiple connections you'd wait for a connection with s.accept() and then process all its data. When you have all of the packet, process its data, and wait for the next connection.
What you do then depends on what the structure of each packet would be.
(Example: https://wiki.python.org/moin/TcpCommunication)
If instead you are consuming a stream of data, you should probably process each 'line' you find in a separate thread, while you keep consuming on another.
Edit:
So, if I understand your situation correctly: there is one connection, and the data is a string broken up by \r\n and ending with a \n. The data, however, does not arrive the way you expect, so the code loops forever waiting for a \n.
The socket interface, as I understand it, signals the end of the stream by returning an empty result. So the last buffer might have ended with a \n, but then recv just kept returning empty strings while the loop tried to find another \n.
Instead, try checking the result of each recv call:
chunk = s.recv(buffer_size)
if not chunk:
    break
data += chunk
Full code:
def receive_bar_updates(s):
    global all_bars
    data = ''
    buffer_size = 4096
    while True:
        chunk = s.recv(buffer_size)
        if not chunk:
            # An empty result means the other side closed the connection.
            break
        data += chunk
        if not data.endswith("\n"):
            continue
        lines = data.split("\n")
        lines = filter(None, lines)
        for line in lines:
            if line.startswith("BH") or line.startswith("BC"):
                symbol = str(line.split(",")[1])
                all_bars[symbol].append(line)
                y = Thread(target=proccess_bars, kwargs={'symbol': symbol})
                y.start()
        data = ""
Edit2: Oops, wrong code
I have not tested this code, but it should work:
def receive_bar_updates(s):
    global all_bars
    data = ''
    buf = ''
    buffer_size = 4096
    while True:
        if not "\r\n" in data: # skip recv if we already have another line buffered.
            data += s.recv(buffer_size)
        if not "\r\n" in data:
            continue
        i = data.rfind("\r\n")
        data, buf = data[:i+2], data[i+2:]
        lines = data.split("\r\n")
        lines = filter(None, lines)
        for line in lines:
            if line.startswith("BH") or line.startswith("BC"):
                symbol = str(line.split(",")[1])
                all_bars[symbol].append(line)
                y = Thread(target=proccess_bars, kwargs={'symbol': symbol})
                y.start()
        data = buf
Edit: Forgot to mention, I only modified the code for receiving the data; I have no idea what the rest of the function (starting with lines = data.split("\n")) is for.
Edit 2: Now uses "\r\n" for linebreaks instead of "\n".
Edit 3: Fixed an issue.
You basically seem to want to read lines from the socket. Maybe you're better off not using low-level recv calls and instead using sock.makefile(), treating the result as a regular file from which you can read lines: for line in sfile: ...
That leaves the delay/chunk issue. This is likely to be caused by Nagle's algorithm on the sending side. Try disabling that:
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
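If you prefer to stay with recv, the buffering that the answers above describe can be sketched as a small generator in Python 3. This is a minimal illustration, assuming the peer terminates every line with \r\n as stated in the question; the name iter_lines is made up for this example.

```python
import socket

def iter_lines(sock, bufsize=4096, sep=b"\r\n"):
    """Yield complete lines from a socket, however TCP chunks the stream."""
    buf = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            break  # empty result: the peer closed the connection
        buf += chunk
        # Everything before the last separator is complete; the tail
        # (possibly empty, possibly half a line) waits for the next recv.
        *lines, buf = buf.split(sep)
        for line in lines:
            if line:
                yield line
    if buf:
        yield buf  # unterminated trailing data at end of stream
```

Because the incomplete tail is carried over instead of being tested with endswith, a chunk such as "line1\r\nline2\r\nlin" yields line1 and line2 immediately and completes line3 when the rest arrives.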

Sending png file via socket in Python

I'm using Python 2.7.9 and I am trying to send a PNG file.
But something strange happens. I am using sockets and sending a POST request (or something like one).
I send the request from the client to the server, then print the length of the request received on the server; for example, the length is 1051.
Then I use a regex to extract the PNG file data and print its length, and the length is 2632, which is larger than the whole request?!
I think the problem is that it does write the content, but not in the right representation. I tried different things but they did not work, so I am asking here how to solve this problem.
Server source code:
import socket
import re
server = socket.socket()
server.bind(('0.0.0.0',8080))
while True:
    server.listen(2)
    (client, client_addr) = server.accept()
    print 'IP :',client_addr
    res = client.recv(0xfffffff)
    print len(res)
    #get file name
    file_name = res.split('&')[0]
    file_name = str(file_name.split('=')[1])
    print repr(res)
    #get the data of the file
    raw_img = str(re.findall("&photo_data=(.*)" ,res ,re.DOTALL))
    print "File name:" + file_name
    print "Size:" + str(len(raw_img))
    with open(file_name, 'wb') as f:
        f.write(raw_img)
    print "Done"
Client source code:
import socket
client = socket.socket()
client.connect(('127.0.0.1',8080))
raw_data = open('test.png', 'rb').read()
save_file_name = raw_input("Enter the file name:")
print len(raw_data)
output = 'POST /upload HTTP/1.1\r\n'
output += 'Content-Length:' + str(len(raw_data)) + str(len(save_file_name)) + '\r\n\r\n'
output += 'file_name=' + save_file_name + '&'
output += 'photo_data=' + raw_data
print len(output)
client.send(output)
client.close()
First, you should use a while True loop to receive the full data:
res = ''
while True:
    data = client.recv(1024)
    if not data:
        break
    res += data
print len(res)
Then, re.findall actually returns a list, not a string. So you should do this:
r = re.findall("&photo_data=(.*)" ,res ,re.DOTALL)
raw_img = str(r[0])
Now it works fine.
Why didn't the previous code work? Let's say we have a list:
r = ['\x45']
The value that ended up in raw_img is basically like this. If we naively convert this list to a str, we get:
print len(str(r)) # "['E']", 5
What we actually need is r[0]:
print len(str(r[0])) # 1
That's why the file became larger.
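The same effect can be reproduced in Python 3 with bytes. This is a small sketch, not the asker's server: the sample request and the helper name extract_photo are invented for illustration, and re.search is used instead of re.findall so that a single match object is returned rather than a list.

```python
import re

def extract_photo(request):
    """Return the bytes after '&photo_data=' in a raw request, or None if absent."""
    match = re.search(rb"&photo_data=(.*)", request, re.DOTALL)
    return match.group(1) if match else None

# Hypothetical request body carrying the first bytes of a PNG file.
request = b"file_name=out.png&photo_data=\x89PNG\r\n\x1a\n"
matches = re.findall(rb"&photo_data=(.*)", request, re.DOTALL)
payload = extract_photo(request)

# str() of a list is the printable repr of the list (brackets, quotes and
# escape sequences included), so it is longer than the payload itself.
print(len(str(matches)), len(payload))
```

Writing payload to a file opened in 'wb' mode stores the original bytes, whereas writing str(matches) would store the escaped text representation.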

Reading data from serial in python

I have a GPS module that returns NMEA data.
When I try to print all the data it returns using the following code, I get this.
while True:
    try:
        rcv = port.read()
        print rcv
Then I made some modifications so that the NMEA data reads more cleanly. It looks like this:
port = serial.Serial("/dev/ttyAMA0", baudrate=9600, timeout=10.0)
line = []
print("connected to: " + port.portstr)
while True:
    try:
        rcv = port.read()
    except:
        rcv = ''
    line.append(rcv)
    if rcv == '\n':
        line = "".join(line)
        print line
        line = []
The output looks like this:
$GPGGA,183345.000,5023.3424,N,01857.3817,E,1,7,1.25,313.3,M,42.1,M,,*53
$GPGSA,A,3,09,26,28,08,15,18,17,,,,,,1.52,1.25,0.88*06
$GPRMC,183345.000,A,5023.3424,N,01857.3817,E,0.40,55.07,050214,,,A*54
$GPVTG,55.07,T,,M,0.40,N,0.74,K,A*0D
$GPGGA,183346.000,5023.3423,N,01857.3817,E,1,7,1.25,313.3,M,42.1,M,,*57
$GPGSA,A,3,09,26,28,08,15,18,17,,,,,,1.52,1.25,0.88*06
The problem is that sometimes it drops commas or other data, and the NMEA parser then reads it wrong. Is there a better and cleaner way to read whole NMEA frames over serial?
You can use readline instead of read. It keeps reading characters until an EOL is received.
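A minimal Python 3 sketch of the readline approach, with one addition that goes beyond the answer: each sentence is checked against its NMEA checksum (the two hex digits after '*' are the XOR of all characters between '$' and '*'), so frames that lost characters in transit are dropped instead of reaching the parser. The function names are made up for this example, and port is assumed to be any object with a readline() method that returns bytes, such as an open pyserial port.

```python
def nmea_checksum_ok(sentence):
    """Return True if the hex digits after '*' equal the XOR of the text between '$' and '*'."""
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, given = sentence[1:].partition("*")
    calc = 0
    for ch in body:
        calc ^= ord(ch)
    try:
        return calc == int(given[:2], 16)
    except ValueError:
        return False  # missing or non-hex checksum field

def read_sentences(port):
    """Yield complete, checksum-valid NMEA sentences read line by line from port."""
    while True:
        raw = port.readline()
        if not raw:
            break  # nothing received: timeout or end of stream
        line = raw.decode("ascii", errors="replace").strip()
        if nmea_checksum_ok(line):
            yield line
```

Stopping on an empty read is a simplification for this sketch; a long-running reader would more likely keep looping after a timeout.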

Python syntax error in email address extraction script

I found this Python script on MetaFilter and modified the addresses and password as needed, but I get a syntax error at the very last line. In the error, the little caret symbol is underneath the quote after print "\n
Not being a coder at all, I have no idea where to turn next. Any help would be greatly appreciated.
What I actually want to do is pull out the "From" addresses and not the To and Cc ones, but I figured I would get this working properly first.
The script was run on Windows using Python 3.2, with the output redirected to a text file.
import email
import getpass
import imaplib
HOST = "mail.-----.com"
USER = "sales#-----.com"
FOLDER = "Folder"
connection = imaplib.IMAP4_SSL(HOST)
res, data = connection.login(USER, getpass.getpass())
assert res == "OK"
res, count = connection.select(FOLDER)
assert res == "OK"
res, (msg_nums,) = connection.search(None, "ALL")
assert res == "OK"
for msg_num in msg_nums.split():
    res, message_text = connection.fetch(msg_num, "(RFC822)")
    assert res == "OK"
    message = email.message_from_string(message_text[0][1])
    tos = message.get_all("From") or []
    ccs = message.get_all("Cc") or []
    all_recipients = email.Utils.getaddresses(tos + ccs)
    print "\n".join(addr.lower() for realname, addr in all_recipients)
print is a function in Python 3, so it needs to be:
print("\n".join(addr.lower() for realname, addr in all_recipients))
The script was probably originally written for use with Python 2.x.
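For the stated goal of collecting only the "From" addresses, here is a minimal Python 3 sketch of the per-message step. It assumes the raw message arrives as bytes (which is what imaplib returns on Python 3) and uses email.message_from_bytes and email.utils.getaddresses in place of the Python 2 names in the script; the helper name from_addresses is made up for this example.

```python
import email
from email.utils import getaddresses

def from_addresses(raw_message):
    """Return the lower-cased addresses found in the From header(s) of one raw message."""
    message = email.message_from_bytes(raw_message)
    senders = message.get_all("From") or []
    # getaddresses yields (realname, address) pairs; only the address is kept.
    return [addr.lower() for _realname, addr in getaddresses(senders)]
```

Inside the fetch loop this would be called as from_addresses(message_text[0][1]), and the result printed with print("\n".join(...)).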
