How to passively sniff for TCP/HTTP get requests - python

I am looking to passively sniff HTTP GET requests (with an Rpi) to track the traffic of network devices.
So far I have the following code, which I believe sniffs all packets and filters down to the TCP ones that should contain HTTP requests:
#Packet sniffer in python
#For Linux - Sniffs all incoming and outgoing packets :)
#Silver Moon (m00n.silv3r#gmail.com)
import socket, sys
from threading import RLock
from struct import *
#Convert a string of 6 characters of ethernet address into a dash separated hex string
def eth_addr (a) :
    b = "%.2x:%.2x:%.2x:%.2x:%.2x:%.2x" % (ord(a[0]) , ord(a[1]) , ord(a[2]), ord(a[3]), ord(a[4]) , ord(a[5]))
    return b
#create a AF_PACKET type raw socket (thats basically packet level)
#define ETH_P_ALL 0x0003 /* Every packet (be careful!!!) */
try:
    s = socket.socket( socket.AF_PACKET , socket.SOCK_RAW , socket.ntohs(0x0003))
except socket.error , msg:
    print 'Socket could not be created. Error Code : ' + str(msg[0]) + ' Message ' + msg[1]
    sys.exit()
# receive a packet
while True:
    packet = s.recvfrom(65565)
    #packet string from tuple
    packet = packet[0]
    #parse ethernet header
    eth_length = 14
    eth_header = packet[:eth_length]
    eth = unpack('!6s6sH' , eth_header)
    eth_protocol = socket.ntohs(eth[2])
    source_mac = eth_addr(packet[6:12])
    print 'Destination MAC : ' + eth_addr(packet[0:6]) + ' Source MAC : ' + source_mac + ' Protocol : ' + str(eth_protocol)
    #Parse IP packets, IP Protocol number = 8
    if eth_protocol == 8 :
        #Parse IP header
        #take first 20 characters for the ip header
        ip_header = packet[eth_length:20+eth_length]
        #now unpack them :)
        iph = unpack('!BBHHHBBH4s4s' , ip_header)
        version_ihl = iph[0]
        version = version_ihl >> 4
        ihl = version_ihl & 0xF
        iph_length = ihl * 4
        ttl = iph[5]
        protocol = iph[6]
        s_addr = socket.inet_ntoa(iph[8]);
        d_addr = socket.inet_ntoa(iph[9]);
        #print 'Version : ' + str(version) + ' IP Header Length : ' + str(ihl) + ' TTL : ' + str(ttl) + ' Protocol : ' + str(protocol) + ' Source Address : ' + str(s_addr) + ' Destination Address : ' + str(d_addr)
        #TCP protocol
        if protocol == 6 :
            t = iph_length + eth_length
            tcp_header = packet[t:t+20]
            #now unpack them :)
            tcph = unpack('!HHLLBBHHH' , tcp_header)
            source_port = tcph[0]
            dest_port = tcph[1]
            sequence = tcph[2]
            acknowledgement = tcph[3]
            doff_reserved = tcph[4]
            tcph_length = doff_reserved >> 4
            #print 'Source Port : ' + str(source_port) + ' Dest Port : ' + str(dest_port) + ' Sequence Number : ' + str(sequence) + ' Acknowledgement : ' + str(acknowledgement) + ' TCP header length : ' + str(tcph_length)
            h_size = eth_length + iph_length + tcph_length * 4
            data_size = len(packet) - h_size
            #get data from the packet
            data = packet[h_size:]
            print 'Data: '
            print data
This gives the following:
(Run on Rpi on same subnet as PC browsing wikipedia)
What would I need to do to decode the GET request string from this?
I.e. GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1
Host: net.tutsplus.com...

The complexity depends on how exact you want to be.
To be "perfect", you'll need to largely implement a TCP/IP stack. You need to trace the 3-way handshake, handle (and ignore) retransmits, re-order packets, and merge the data payloads into a single stream. You'll also have to be resilient, making reasonable guesses in the case where your monitor misses a packet that the destination did receive.
To be "good", you can make assumptions such as "a GET request will be in a single packet" (usually true) to avoid actual connection tracking, but you still have to check for retransmitted packets. In this case, "GET" will be the first three characters of the TCP payload, and you can process from there until the blank line that ends the headers (\r\n\r\n) or the end of that payload.
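As a rough illustration of the "good" approach, here is a minimal sketch that looks for a GET request at the start of a single TCP payload. It assumes Python 3 (payloads are bytes) and that the whole request fits in one packet; `parse_http_get` is a hypothetical helper name, not part of the code in the question, and retransmit detection is left out.

```python
def parse_http_get(payload):
    """Return (path, host) if the TCP payload starts an HTTP GET request, else None."""
    if not payload.startswith(b"GET "):
        return None
    # Headers end at the first blank line; if it is missing, use the whole payload.
    head = payload.split(b"\r\n\r\n", 1)[0]
    lines = head.split(b"\r\n")
    parts = lines[0].split(b" ")
    if len(parts) != 3 or not parts[2].startswith(b"HTTP/"):
        return None
    host = None
    for line in lines[1:]:
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            host = value.strip().decode("latin-1")
            break
    return parts[1].decode("latin-1"), host


sample = (b"GET /tutorials/other/top-20-mysql-best-practices/ HTTP/1.1\r\n"
          b"Host: net.tutsplus.com\r\n\r\n")
print(parse_http_get(sample))
```

In the sniffer loop you would call this on the `data` slice taken after the TCP header and skip anything that returns None.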

Related

Threaded Python script only seems to work as expected after signaling keyboard interrupt?

Basically, I have a script that sniffs packets in one thread and appends them to a list. A second thread also runs and, if the list is not empty, performs calculations with the data. However, the sniffing thread only seems to work properly after hitting Ctrl + C. Before sending the keyboard interrupt, the console output that I've produced for debugging is very slow and it seems to be missing packets. After hitting Ctrl + C it runs much faster and works as expected. Any ideas why this might be occurring? My code looks similar to what is below.
Packet sniffer:
import socket, sys
from struct import *
def eth_addr(a):
    b = "%.2x:%.2x:%.2x:%.2x:%.2x:%.2x" % (ord(a[0]) , ord(a[1]) , ord(a[2]), ord(a[3]), ord(a[4]) , ord(a[5]))
    return b
#create a AF_PACKET type raw socket (thats basically packet level)
#define ETH_P_ALL 0x0003 /* Every packet (be careful!!!) */
try:
    s = socket.socket( socket.AF_PACKET , socket.SOCK_RAW , socket.ntohs(0x0003))
except socket.error , msg:
    print 'Socket could not be created. Error Code : ' + str(msg[0]) + ' Message ' + msg[1]
    sys.exit()
def packet_sniff():
    # receive a packet
    while True:
        packet = s.recvfrom(65565)
        #packet string from tuple
        packet = packet[0]
        #parse ethernet header
        eth_length = 14
        eth_header = packet[:eth_length]
        eth = unpack('!6s6sH' , eth_header)
        eth_protocol = socket.ntohs(eth[2])
        print 'Destination MAC : ' + eth_addr(packet[0:6]) + ' Source MAC : ' + eth_addr(packet[6:12]) + ' Protocol : ' + str(eth_protocol)
        #(IP header parsing is omitted in this excerpt; it sets protocol and iph_length)
        #TCP protocol
        if protocol == 6 :
            t = iph_length + eth_length
            tcp_header = packet[t:t+20]
            #now unpack them
            tcph = unpack('!HHLLBBHHH' , tcp_header)
            source_port = tcph[0]
            dest_port = tcph[1]
            sequence = tcph[2]
            acknowledgement = tcph[3]
            doff_reserved = tcph[4]
            tcph_length = doff_reserved >> 4
            print 'Source Port : ' + str(source_port) + ' Dest Port : ' + str(dest_port) + ' Sequence Number : ' + str(sequence) + ' Acknowledgement : ' + str(acknowledgement) + ' TCP header length : ' + str(tcph_length)
            h_size = eth_length + iph_length + tcph_length * 4
            data_size = len(packet) - h_size
            #get data from the packet
            data = packet[h_size:]
            return data
Main would look something roughly like this:
import threading
my_list = []
def run_program():
    processThread1 = threading.Thread(target = sniff_data, args = [])
    processThread2 = threading.Thread(target = process_data, args = [])
    processThread1.start()
    processThread2.start()
def sniff_data():
    global my_list
    while True:
        data = packet_sniff()
        my_list.append(data)
def process_data():
    global my_list
    while True:
        if len(my_list) != 0:
            # Do computation
            my_list = []
run_program()
The if len(my_list) != 0: check runs in an infinite loop with no sleep. That busy loop can and will hog the interpreter, preventing the other thread from getting enough time to change my_list. It needs to go.
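One way to get rid of the busy loop is to hand data over through a blocking queue instead of polling a shared list. The following is only a sketch under assumptions: it targets Python 3 (the module is named `Queue` in Python 2), `fake_packets` stands in for the output of packet_sniff(), and the `STOP` sentinel and the length computation are placeholders for illustration.

```python
import queue
import threading

packets = queue.Queue()
STOP = object()  # sentinel telling the consumer to exit


def sniff_data(source):
    """Producer: push every captured payload onto the queue."""
    for data in source:
        packets.put(data)
    packets.put(STOP)


def process_data(results):
    """Consumer: get() blocks until data arrives, so there is no busy loop."""
    while True:
        data = packets.get()
        if data is STOP:
            break
        results.append(len(data))  # placeholder computation


results = []
fake_packets = [b"abc", b"de", b"f"]  # stand-in for packet_sniff() output
producer = threading.Thread(target=sniff_data, args=(fake_packets,))
consumer = threading.Thread(target=process_data, args=(results,))
producer.start()
consumer.start()
producer.join()
consumer.join()
print(results)
```

Because the consumer sleeps inside get() while the queue is empty, the sniffing thread is free to run, and no packets are lost to a list being reassigned underneath it.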

Why am I not receiving packets in packet sniffer even though I'm in promiscuous mode?

So I have 3 Raspberry Pis all connected on an ad hoc network. I would like the first Pi to send a TCP message to the second Pi, and have the third Pi sniff the packet, then later process the data. I was following this Python code; however, when I run it, it doesn't seem to sniff packets that are not destined for the third Pi, even though that Pi is in promiscuous mode.
#Packet sniffer in python for Linux
#Sniffs only incoming TCP packet
import socket, sys
from struct import *
#create an INET, STREAMing socket
try:
    s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
except socket.error , msg:
    print 'Socket could not be created. Error Code : ' + str(msg[0]) + ' Message ' + msg[1]
    sys.exit()
# receive a packet
while True:
    packet = s.recvfrom(65565)
    #packet string from tuple
    packet = packet[0]
    #take first 20 characters for the ip header
    ip_header = packet[0:20]
    #now unpack them :)
    iph = unpack('!BBHHHBBH4s4s' , ip_header)
    version_ihl = iph[0]
    version = version_ihl >> 4
    ihl = version_ihl & 0xF
    iph_length = ihl * 4
    ttl = iph[5]
    protocol = iph[6]
    s_addr = socket.inet_ntoa(iph[8]);
    d_addr = socket.inet_ntoa(iph[9]);
    print 'Version : ' + str(version) + ' IP Header Length : ' + str(ihl) + ' TTL : ' + str(ttl) + ' Protocol : ' + str(protocol) + ' Source Address : ' + str(s_addr) + ' Destination Address : ' + str(d_addr)
    tcp_header = packet[iph_length:iph_length+20]
    #now unpack them :)
    tcph = unpack('!HHLLBBHHH' , tcp_header)
    source_port = tcph[0]
    dest_port = tcph[1]
    sequence = tcph[2]
    acknowledgement = tcph[3]
    doff_reserved = tcph[4]
    tcph_length = doff_reserved >> 4
    print 'Source Port : ' + str(source_port) + ' Dest Port : ' + str(dest_port) + ' Sequence Number : ' + str(sequence) + ' Acknowledgement : ' + str(acknowledgement) + ' TCP header length : ' + str(tcph_length)
    h_size = iph_length + tcph_length * 4
    data_size = len(packet) - h_size
    #get data from the packet
    data = packet[h_size:]
    print 'Data : ' + data
    print
Also, when I change this:
s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
To this:
s = socket.socket( socket.AF_PACKET , socket.SOCK_RAW , socket.ntohs(0x0003))
It seems to be able to receive packets not destined for it. The problem is that I would prefer to only be receiving a specific protocol (instead of receiving all packets and having to sort through them after the fact).
I've been really struggling to figure this out as I am not the best at networking/socket programming. Any ideas?
Not necessarily an answer, but my line of investigation would be:
Run tcpdump on the 3rd pi to check whether you see the packets between pi1 and pi2. With this you can determine whether the issue is in your code or not.
Check whether ifconfig reports the PROMISC flag (on the interface that you use) when you run your app on pi3.
Are you running as a privileged user?
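On the wish to receive only one protocol: as far as I understand, an AF_PACKET socket can be opened with a specific EtherType instead of 0x0003, which narrows it to IPv4 frames, and the TCP check then becomes a one-byte comparison. The sketch below assumes Python 3 on Linux; `open_ipv4_sniffer` and `is_tcp_frame` are hypothetical names, and whether frames for other hosts actually show up still depends on promiscuous mode and the network itself, so verify with tcpdump first.

```python
import socket
import struct

ETH_P_IP = 0x0800  # EtherType for IPv4


def open_ipv4_sniffer():
    """Raw socket that is handed IPv4 frames only (Linux, needs root)."""
    return socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_IP))


def is_tcp_frame(frame):
    """True if an Ethernet frame carries IPv4 with protocol number 6 (TCP)."""
    if len(frame) < 14 + 20:
        return False
    (eth_type,) = struct.unpack("!H", frame[12:14])
    if eth_type != ETH_P_IP:
        return False
    ip_protocol = frame[14 + 9]  # Python 3: indexing bytes gives an int
    return ip_protocol == socket.IPPROTO_TCP


# Synthetic frame for illustration: zeroed MACs, IPv4 EtherType, protocol byte 6.
demo = (b"\x00" * 12 + struct.pack("!H", ETH_P_IP)
        + b"\x45" + b"\x00" * 8 + b"\x06" + b"\x00" * 10)
print(is_tcp_frame(demo))
```

In a real loop you would call `is_tcp_frame(s.recvfrom(65535)[0])` on a socket from `open_ipv4_sniffer()` and skip everything that returns False.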

Network function to filter/drop packets

I'm trying to implement a stateless network function in python that will sniff incoming packets and then decide if they are "good" packets or "bad" ones.
If they are "good" I want the program to forward the packets (to another network/IP), and if they are bad I want to drop them.
I already have the sniffing part working but I've been really struggling to make the rest of the function to work.
Does anyone have any ideas? Thanks
import socket # Import socket module
import sys
from struct import *
s = socket.socket( socket.AF_PACKET , socket.SOCK_RAW , socket.ntohs(0x0003))
allowed_services = [80, 443]
allowed_hosts = []
for x in range(1, 11): #allows 192.168.0.1-10
    allowed_hosts += ['192.168.0.%d' % x]
while True:
    packet = s.recvfrom(65565)
    #packet string from tuple
    packet = packet[0]
    eth_length = 14
    eth_header = packet[:eth_length]
    eth = unpack('!6s6sH' , eth_header)
    eth_protocol = socket.ntohs(eth[2])
    #Parse IP packets, IP Protocol number = 8
    if eth_protocol == 8 :
        #Parse IP header
        #take first 20 characters for the ip header
        ip_header = packet[eth_length:20+eth_length]
        #now unpack them :)
        iph = unpack('!BBHHHBBH4s4s' , ip_header)
        version_ihl = iph[0]
        version = version_ihl >> 4
        ihl = version_ihl & 0xF
        iph_length = ihl * 4
        protocol = iph[6]
        saddr = socket.inet_ntoa(iph[8]);
        daddr = socket.inet_ntoa(iph[9]);
        # print 'Incoming packet from ' + str(saddr) + ' going to ' + str(daddr) + '\n'
        #TCP protocol
        if protocol == 6:
            t = iph_length + eth_length
            tcp_header = packet[t:t+20]
            #now unpack them :)
            tcph = unpack('!HHLLBBHHH' , tcp_header)
            sport = tcph[0]
            dport = tcph[1]
            if (daddr in allowed_hosts and dport in allowed_services) or (saddr in allowed_hosts and sport in allowed_services):
                print 'Good TCP packet (from %s) \n' % saddr
            else:
                print 'Bad Packet -> Dropping from %s \n Verify IP and/or Port\n' % saddr
            print '---------------------------------------------------------------\n'
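The good/bad decision in the code above can be pulled out into a small function, which makes it easier to test separately from the sniffing loop. This is only a sketch of the classification step: `is_good` is a hypothetical name, the rule sets mirror the ones in the question, and the sample addresses are made up. Note that a packet socket only sees copies of the traffic, so actually dropping or forwarding would need a different mechanism (for example netfilter) that is not shown here.

```python
ALLOWED_SERVICES = {80, 443}
ALLOWED_HOSTS = {"192.168.0.%d" % x for x in range(1, 11)}  # 192.168.0.1-10


def is_good(saddr, sport, daddr, dport):
    """Stateless check: traffic to or from an allowed host on an allowed port."""
    to_allowed = daddr in ALLOWED_HOSTS and dport in ALLOWED_SERVICES
    from_allowed = saddr in ALLOWED_HOSTS and sport in ALLOWED_SERVICES
    return to_allowed or from_allowed


# Example: a client talking to 192.168.0.3 on port 443 passes the check.
print(is_good("10.0.0.5", 51000, "192.168.0.3", 443))
```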

Python Tracer Array

I have the following problem: I want to trace only outgoing packets and I have no idea how to separate them from the rest. I also want to save them to a text file, either "on the fly" or after I stop the program.
Can anyone help? This is my code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#Packet sniffer in python
#For Linux - Sniffs all incoming and outgoing packets :)
import socket, sys
from struct import *
from Tkinter import *
#Convert a string of 6 characters of ethernet address into a dash separated hex string
def eth_addr (a) :
    b = "%.2x:%.2x:%.2x:%.2x:%.2x:%.2x" % (ord(a[0]) , ord(a[1]) , ord(a[2]), ord(a[3]), ord(a[4]) , ord(a[5]))
    return b
#create a AF_PACKET type raw socket (thats basically packet level)
#define ETH_P_ALL 0x0003 /* Every packet (be careful!!!) */
try:
    s = socket.socket( socket.AF_PACKET , socket.SOCK_RAW , socket.ntohs(0x0003))
except socket.error , msg:
    print 'Socket could not be created. Error Code : ' + str(msg[0]) + ' Message ' + msg[1]
    sys.exit()
# receive a packet
while True:
    packet = s.recvfrom(65565)
    #packet string from tuple
    packet = packet[0]
    #parse ethernet header
    eth_length = 14
    eth_header = packet[:eth_length]
    eth = unpack('!6s6sH' , eth_header)
    eth_protocol = socket.ntohs(eth[2])
    #Parse IP packets, IP Protocol number = 8
    if eth_protocol == 8 :
        #Parse IP header
        #take first 20 characters for the ip header
        ip_header = packet[eth_length:20+eth_length]
        #now unpack them :)
        iph = unpack('!BBHHHBBH4s4s' , ip_header)
        version_ihl = iph[0]
        version = version_ihl >> 4
        ihl = version_ihl & 0xF
        iph_length = ihl * 4
        ttl = iph[5]
        protocol = iph[6]
        s_addr = socket.inet_ntoa(iph[8]);
        d_addr = socket.inet_ntoa(iph[9]);
        print ' Source Address : ' + str(s_addr) + ' Destination Address : ' + str(d_addr)
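One possible way to pick out outgoing packets: on an AF_PACKET socket, the second item returned by recvfrom() is an address tuple (ifname, proto, pkttype, hatype, hwaddr), and a pkttype of PACKET_OUTGOING marks frames sent by the local host. The code above throws that tuple away by keeping only packet[0]. The sketch below assumes Python 3 on Linux; `is_outgoing` and `log_outgoing` are hypothetical helpers, and the records passed in are assumed to be (addr, s_addr, d_addr) triples collected in the sniffer loop.

```python
import socket

# Linux value from <linux/if_packet.h>; fall back to 4 if the constant is missing.
PACKET_OUTGOING = getattr(socket, "PACKET_OUTGOING", 4)


def is_outgoing(addr):
    """addr is the AF_PACKET address tuple: (ifname, proto, pkttype, hatype, hwaddr)."""
    return addr[2] == PACKET_OUTGOING


def log_outgoing(records, path):
    """Append one 'src -> dst' line per outgoing packet to a text file."""
    with open(path, "a") as log:
        for addr, s_addr, d_addr in records:
            if is_outgoing(addr):
                log.write("%s -> %s\n" % (s_addr, d_addr))
```

In the loop this would mean unpacking `packet, addr = s.recvfrom(65535)` and calling `log_outgoing([(addr, s_addr, d_addr)], "trace.txt")` after the IP header is parsed; opening the file in append mode writes the trace "on the fly".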

using python to determine dot1x protocol type in ethernet header

I'm using python to enumerate information in a dot1x exchange but I'm having trouble parsing the Ethernet protocol. I know the Ethernet type field is two bytes and dot1x uses "888e". I've confirmed "888e" is being passed via Wireshark but I'm getting the below output. Why is it showing "36488" instead of "888e"?
Destination MAC : 01:80:c2:00:00:03 Source MAC : c2:04:17:9c:f1:03 Protocol : 36488
Destination MAC : 01:80:c2:00:00:03 Source MAC : 08:00:27:83:5b:8b Protocol : 36488
Destination MAC : 01:80:c2:00:00:03 Source MAC : c2:04:17:9c:f1:03 Protocol : 36488
Destination MAC : 01:80:c2:00:00:03 Source MAC : 08:00:27:83:5b:8b Protocol : 36488
Destination MAC : 01:80:c2:00:00:03 Source MAC : c2:04:17:9c:f1:03 Protocol : 36488
My code:
import socket, sys
from struct import *
#Convert a string of 6 characters of ethernet address into a dash separated hex string
def eth_addr (a) :
    b = "%.2x:%.2x:%.2x:%.2x:%.2x:%.2x" % (ord(a[0]) , ord(a[1]) , ord(a[2]), ord(a[3]), ord(a[4]) , ord(a[5]))
    return b
#create a AF_PACKET type raw socket (thats basically packet level)
#define ETH_P_ALL 0x0003 /* Every packet (be careful!!!) */
try:
    s = socket.socket( socket.AF_PACKET , socket.SOCK_RAW , socket.ntohs(0x0003))
except socket.error , msg:
    print 'Socket could not be created. Error Code : ' + str(msg[0]) + ' Message ' + msg[1]
    sys.exit()
# receive a packet
while True:
    packet = s.recvfrom(65565)
    #packet string from tuple
    packet = packet[0]
    #parse ethernet header
    eth_length = 14
    eth_header = packet[:eth_length]
    eth = unpack('!6s6sH' , eth_header)
    eth_protocol = socket.ntohs(eth[2])
    print 'Destination MAC : ' + eth_addr(packet[0:6]) + ' Source MAC : ' + eth_addr(packet[6:12]) + ' Protocol : ' + str(eth_protocol)
It is a matter of hexadecimal versus decimal representation, plus a byte swap.
36488 is 8e88 in hexadecimal. The '!' in the unpack format already converts the field from network byte order, so the additional ntohs() call swaps the bytes a second time (on a little-endian machine), translating 888e to 8e88.
If you want your program to print 888e, drop the ntohs() call and print the number in hexadecimal; check the string formatting specs in the Python docs.
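To make the arithmetic concrete, here is a small sketch that unpacks a header built from the values in the question's output and prints the type in hexadecimal. It assumes Python 3 (bytes instead of str), and the manual byte swap is there only to show where 36488 comes from without depending on the host's endianness.

```python
import struct

# Destination MAC, source MAC, and EtherType taken from the output in the question.
header = bytes.fromhex("0180c2000003" "c204179cf103" "888e")
dst, src, eth_type = struct.unpack("!6s6sH", header)

# '!' already handled network byte order, so eth_type is the real EtherType.
print("Destination MAC : " + ":".join("%02x" % b for b in dst))
print("Protocol : 0x%04x" % eth_type)

# Swapping the two bytes again reproduces the number seen in the question.
swapped = ((eth_type & 0xFF) << 8) | (eth_type >> 8)
print("Swapped : %d (0x%04x)" % (swapped, swapped))
```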
