Filter protocols in python - python

I'm trying to filter packets of a user-chosen protocol out of a given pcap file and then write the matching packets to a new pcap file.
This is the code I have so far:
# ===================================================
# Imports
# ===================================================
from scapy.all import *
from scapy.utils import PcapWriter
"""
your going to need to install the modules below
"""
from Tkinter import Tk
from tkFileDialog import askopenfilename
# ===================================================
# Constants
# ===================================================
#OS commands:
#~~~~~~~~~~~~~
if "linux2" in sys.platform:
    """
    linux based system clear command
    """
    CLEAR_COMMAND = "clear"
elif "win32" in sys.platform:
    """
    windows based system clear command
    """
    CLEAR_COMMAND = "cls"
elif "cygwin" in sys.platform:
    """
    crygwin based clear command
    """
    CLEAR_COMMAND = "printf \"\\033c\""
elif "darwin" in sys.platform:
    """
    mac OS X based clear command
    """
    CLEAR_COMMAND = "printf \'\\33c\\e[3J\'"
#Usage string:
#~~~~~~~~~~~~~~
FILE_STRING = "please choose a pcap file to use"
BROWSE_STRING = "press any key to browser files\n"
BAD_PATH_STRING = "bad file please try agien\n"
BAD_INPUT_STRING = "bad input please try agien\n"
PROTOCOL_STRING = "please enter the protocol you wish to filter\n"
NAME_STRING = "please enter the new pcap file name\n"
# ===================================================
# Code
# ===================================================
def filter_pcap():
    """
    filtering from the given pcap file a protocol the user chooce (from any layer in the OSI model)
    and than asks for a new pcap file name, than filters the given protocol to a new pcap file
    :param none
    :return nothing:
    """
    path = file_browse()
    i = 0
    filtertype = raw_input(PROTOCOL_STRING)
    name = raw_input(NAME_STRING)
    packs = rdpcap(path)
    for i in range(len(packs)):
        if filtertype in packs[i]:
            wrpcap(name +".pcap", packs[i])
def file_browse():
    """
    Purpose: It will allow the user to browse files on his computer
    than it will check if the path is ok and will return it
    :returns - the path to the chosen pcap file
    """
    path = "test"
    while ".pcap" not in path:
        print FILE_STRING
        raw_input(BROWSE_STRING)
        os.system(CLEAR_COMMAND)
        Tk().withdraw()
        path = askopenfilename()
        if ".pcap" not in path:
            print BAD_PATH_STRING
    return path
filter_pcap()
Now the problem is that I'm failing to filter the packets correctly.
The code needs to filter protocols of any kind, from any layer.
I have checked this thread: How can I filter a pcap file by specific protocol using python?
But as you can see, it was not answered, and the user added the same problems I had in the edit. If anyone could help me, it would be great.
Example of how it should work:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Let's say I use the file "sniff" as my first pcap file, and it has 489 packets, 200 of which are HTTP packets.
Now there is this prompt:
please enter the protocol you wish to filter
'http'
and then there is this prompt:
please enter the new pcap file name
'new'
The user input was 'http', so the program will search for every packet that runs over the HTTP protocol and will create a new pcap file called 'new.pcap'.
The file 'new.pcap' will contain 200 HTTP packets.
This should work with any protocol in the OSI model, including protocols like IP, TCP, Ethernet and so on (protocols at any layer of the OSI model).
I have found out that the Wireshark command line has the option -R and tshark has .protocols, but they don't really work... Can anyone check that?
Edit: I found pyshark, but I don't know how to write with it.

I don't believe that scapy has any functions or methods to support application layer protocols in the way that you are after. However, using sport and dport as a filter will do the trick (provided you are going to see/are expecting default ports).
Try something like this:
def filter_pcap(filtertype = None):
    # ..... snip .....
    # Static dict of protocol name to default port. Like so:
    protocols = {'http': 80, 'ssh': 22, 'telnet': 23}
    # Keep asking until the protocol is one we know
    while filtertype not in protocols:
        filtertype = raw_input(PROTOCOL_STRING)
    port = protocols[filtertype]
    # Name for output file
    name = raw_input(NAME_STRING)
    # Read Input File
    packs = rdpcap(path)
    # Keep only TCP packets whose source or destination port matches
    matches = [packet for packet in packs
               if TCP in packet and port in (packet[TCP].sport, packet[TCP].dport)]
    # Write all matches in one call; calling wrpcap() once per packet
    # would overwrite the file each time
    wrpcap(name +".pcap", matches)
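The name-to-port step can also be tried out on its own, without scapy. This is only a sketch under assumptions: `resolve_port` and `uses_port` are made-up helper names, the dict holds just the three default ports from above, and the fallback to `socket.getservbyname` depends on the system having a services database.

```python
import socket

# Default ports for a few protocols; extend as needed (assumed values
# are the standard well-known ports).
DEFAULT_PORTS = {"http": 80, "ssh": 22, "telnet": 23}

def resolve_port(name):
    """Map a user-typed protocol name to a TCP port, or None if unknown."""
    name = name.strip().lower()
    if name in DEFAULT_PORTS:
        return DEFAULT_PORTS[name]
    try:
        # Falls back to the operating system's services database.
        return socket.getservbyname(name, "tcp")
    except OSError:
        return None

def uses_port(sport, dport, port):
    """True if either end of the connection uses the given port."""
    return port is not None and port in (sport, dport)
```

In the scapy loop, `uses_port(packet[TCP].sport, packet[TCP].dport, port)` would take the place of the inline comparison. Keep in mind this only recognises traffic on default ports.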

Related

Python Scapy wireless scan and match a mac address stored in text file

I have some code that scans for wireless packets and displays the MAC address from each of them. What I would like to do is keep a text file of MAC addresses and have the code alert me with a message when one of the addresses in the file is picked up during the wireless scan. I cannot think of a way to implement this. Here is the code for the wireless scan, and below it is an example of the text file.
import sys
from scapy.all import *
devices = set()
def PacketHandler(pkt):
    if pkt.haslayer(Dot11):
        dot11_layer = pkt.getlayer(Dot11)
        if dot11_layer.addr2 and (dot11_layer.addr2 not in devices):
            devices.add(dot11_layer.addr2)
            print dot11_layer.addr2
sniff(iface = sys.argv[1], count = int(sys.argv[2]), prn = PacketHandler)
Here is an example of the text file.
00:11:22:33:44:55
AA:BB:CC:DD:EE:FF
Create a function that reads the .txt file and stores each line (one MAC address per line) in a list.
def getListMac() -> list: # you can pass the path to your .txt file as an argument
    with open('MAClist.txt', 'r') as file:
        res = [x.rstrip('\n') for x in file.readlines()]
    return res
Then check in your PacketHandler function whether the MAC is in this list.
Here you have two choices:
Call getListMac() at the start of your program and store the result in a global variable. Go for this if your .txt file won't change after launching your program.
MACLIST = getListMac()
...
# in your PacketHandler function
if mac in MACLIST:
    print("mac found!") # or whatever you want to do
Call the function each time a packet is sniffed. Go for this option if the list of MAC addresses changes frequently and you need it updated while your program is running. Be careful with it, as this will slow your program down, especially if your list is very long.
# in your PacketHandler function:
if mac in getListMac():
    print("mac found!") # or whatever you want to do
Finally, I will finish this post by advising you to use a real DBMS, which will be much more efficient than reading a txt file. ;)
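One detail worth handling in either option is letter case: the example file uses upper-case hex, while sniffed addresses may arrive in lower case. A minimal sketch of a case-insensitive lookup using a set, assuming one MAC per line; `parse_macs` and `is_known` are illustrative names and not part of scapy.

```python
def parse_macs(lines):
    """Build a set of lower-cased MAC addresses, skipping blank lines."""
    return {line.strip().lower() for line in lines if line.strip()}

def is_known(mac, known):
    """Case-insensitive membership test; tolerates a missing address."""
    return mac is not None and mac.lower() in known

# An open file can be passed directly, e.g.:
#     with open('MAClist.txt') as handle:
#         MACLIST = parse_macs(handle)
MACLIST = parse_macs(["00:11:22:33:44:55\n", "AA:BB:CC:DD:EE:FF\n", "\n"])
```

A set gives constant-time lookups, which matters if the check runs for every sniffed packet.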
EDIT
To answer your comment:
Modify the getListMac function so that it stores the information in a dictionary.
Here is an example, assuming you use " - " as the separator between MAC - Time - Username:
def getListMac() -> dict: # you can pass the path to your .txt file as an argument
    with open('MAClist.txt', 'r') as file:
        res = {x.rstrip('\n').split(" - ")[0]: x.rstrip('\n').split(" - ")[2] for x in file.readlines()}
    return res
Access the data in the dictionary like this:
if MAC in MACLIST:
    print(f"MAC found -> {MAC}, Username -> {MACLIST[MAC]}")
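The dictionary comprehension above raises an IndexError on any line that does not have all three fields. A slightly more defensive sketch, still assuming the " - " separator; `parse_mac_table` is a hypothetical name and the sample lines are invented for illustration.

```python
def parse_mac_table(lines, separator=" - "):
    """Map MAC -> username from 'MAC - Time - Username' lines."""
    table = {}
    for line in lines:
        parts = line.strip().split(separator)
        if len(parts) == 3:  # silently skip malformed lines
            mac, _seen_at, username = parts
            table[mac.lower()] = username
    return table

# Invented sample data in the assumed format.
sample = [
    "00:11:22:33:44:55 - 12:30 - alice\n",
    "not a valid line\n",
    "AA:BB:CC:DD:EE:FF - 09:15 - bob\n",
]
MACLIST = parse_mac_table(sample)
```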

NFQueue/Scapy Man in the Middle

I'm trying to construct a man in the middle attack on a webpage (i.e. HTTP traffic). I'm doing this by using a Linux machine attached to Ethernet and a client attached to the Linux box via its WiFi hotspot.
What I've done so far is use NFQueue from within the IPTables Linux firewall to route all TCP packets on the FORWARD chain to the NFQueue queue, which a Python script is picking up and then processing those rules. I'm able to read the data off of the HTTP response packets, but whenever I try to modify them and pass them back (accept the packets), I'm getting an error regarding the strings:
Exception AttributeError: "'str' object has no attribute 'build_padding'" in 'netfilterqueue.global_callback' ignored
My code is here, which includes things that I've tried that didn't work. Notably, I'm using a third-party extension for scapy called scapy_http that may be interfering with things, and I'm using a webpage that is not being compressed by gzip because that was messing with things as well. The test webpage that I'm using is here.
#scapy
from scapy.all import *
#nfqueue import
from netfilterqueue import NetfilterQueue
#scapy http extension, not really needed
import scapy_http.http
#failed gzip decoding, also tried some other stuff
#import gzip
def print_and_accept(packet):
    #convert nfqueue datatype to scapy-compatible
    pkt = IP(packet.get_payload())
    #is this an HTTP response?
    if pkt[TCP].sport == 80:
        #legacy trial that doesn't work
        #data = packet.get_data()
        print('HTTP Packet Found')
        #check what's in the payload
        stringLoad = str(pkt[TCP].payload)
        #deleted because printing stuff out clogs output
        #print(stringLoad)
        #we only want to modify a specific packet:
        if "<title>Acids and Bases: Use of the pKa Table</title>" in stringLoad:
            print('Target Found')
            #strings kind of don't work, I think this is a me problem
            #stringLoad.replace('>Acids and Bases: Use of the pK<sub>a</sub>', 'This page has been modified: a random ')
            #pkt[TCP].payload = stringLoad
            #https://stackoverflow.com/questions/27293924/change-tcp-payload-with-nfqueue-scapy
            payload_before = len(pkt[TCP].payload)
            # I suspect this line is a problem: the string assigns,
            # but maybe under the hood scapy doesn't like that very much
            pkt[TCP].payload = str(pkt[TCP].payload).replace("Discussion", "This page has been modified")
            #recalculate length
            payload_after = len(pkt[TCP].payload)
            payload_dif = payload_after - payload_before
            pkt[IP].len = pkt[IP].len + payload_dif
            #recalculate checksum
            del pkt[TCP].chksum
            del pkt[IP].chksum
            del pkt.chksum
            print('Packet Modified')
            #redudant
            #print(stringLoad)
            #this throws an error (I think)
            print(str(pkt[TCP].payload))
            #no clue if this works or not yet
            #goal here is to reassign modified packet to original parameter
            packet.set_payload(str(pkt))
            #this was also throwing the error, so tried to move away from it
            #print(pkt.show2())
        #bunch of legacy code that didn't work
        #print(GET_print(pkt))
        #print(pkt.show())
        #decompressed_data = zlib.decompress(str(pkt[TCP].payload), 16 + zlib.MAX_WBITS)
        #print(decompressed_data)
        #print(str(gzip.decompress(pkt[TCP].payload)))
        # print(pkt.getlayer(Raw).load)
        #print('HTTP Contents Shown')
    packet.accept()
def GET_print(packet1):
    ret = "***************************************GET PACKET****************************************************\n"
    ret += "\n".join(packet1.sprintf("{Raw:%Raw.load%}\n").split(r"\r\n"))
    ret += "*****************************************************************************************************\n"
    return ret
print('Test: Modify a very specific target')
print('Program Starting')
nfqueue = NetfilterQueue()
nfqueue.bind(1, print_and_accept)
try:
    print('Packet Interface Starting')
    nfqueue.run()
except KeyboardInterrupt:
    print('\nProgram Ending')
nfqueue.unbind()
Apologies in advance if this is hard to read or badly formatted code; Python isn't a language that I write in often. Any help is greatly appreciated!

scapy PcapReader cut short

I'm trying to rewrite a pcap file with different IP and IPv6 addresses. But after I extract a packet with PcapReader and change its IP addresses, the packets in the output pcap file are cut short (that is to say, the payload of the packet is lost).
Here's the example code:
from scapy.all import PcapReader
from scapy.all import PcapWriter
def test():
    f = "input.pcap"
    writers = PcapWriter("output.pcap")
    with PcapReader(f) as pcap_reader:
        for pkt in pcap_reader:
            # somehow change the IP address
            writers.write(pkt=pkt)
test()
When I open the .pcap file with Wireshark, it shows "The capture file appears to have been cut short in the middle of a packet".
Is there any way in scapy to keep the payload, or is there another Python package you would recommend?
Here I did not change anything, and the results look like this:
input file: [screenshot not included]
output file: [screenshot not included]
I think the problem must be in the code you use to modify the packet (which you did not show), or your source file already had short packets (i.e. a snaplen smaller than the packet length). The following code works for me without problems:
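from scapy.all import PcapReader, PcapWriter, IP
writer = PcapWriter('output.pcap')
for pkt in PcapReader('input.pcap'):
    # somehow change the IP address (only for packets that have an IP layer)
    if IP in pkt:
        pkt[IP].dst = '1.2.3.4'
        pkt[IP].src = '5.6.7.8'
    writer.write(pkt=pkt)
# flush any buffered packets and close the output file
writer.close()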
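To tell the two cases apart (a file that ends in the middle of a record versus packets captured with a small snaplen), the record headers can be walked directly with the standard library. This is a rough sketch, assuming the classic pcap format with microsecond timestamps (not pcapng); `inspect_pcap` is a hypothetical helper name.

```python
import struct

def inspect_pcap(data):
    """Return (complete_records, snapped_records, cut_short) for pcap bytes."""
    if len(data) < 24:
        raise ValueError("too short for a pcap global header")
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"   # file written little-endian
    elif magic == 0xD4C3B2A1:
        endian = ">"   # file written big-endian
    else:
        raise ValueError("not a classic microsecond pcap file")
    offset = 24        # skip the 24-byte global header
    complete = snapped = 0
    cut_short = False
    while offset < len(data):
        if offset + 16 > len(data):        # record header incomplete
            cut_short = True
            break
        _sec, _usec, incl_len, orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        offset += 16
        if offset + incl_len > len(data):  # packet bytes missing
            cut_short = True
            break
        complete += 1
        if incl_len < orig_len:            # truncated at capture time
            snapped += 1
        offset += incl_len
    return complete, snapped, cut_short
```

Calling it as `inspect_pcap(open('output.pcap', 'rb').read())` on both files would show whether the input already had records with a captured length below the original length, or whether the output file itself ends early.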

Pipsta Printer and Printing a list

I'm trying to modify the simple python script provided with my Pipsta Printer so that instead of printing a single line of text, it prints out a list of things.
I know there are probably better ways to do this, but using the script below, could somebody please tell me what changes I need to make around the "txt = " part of the script, so that I can print out a list of items and not just one item?
Thanks.
# BasicPrint.py
# Copyright (c) 2014 Able Systems Limited. All rights reserved.
'''This simple code example is provided as-is, and is for demonstration
purposes only. Able Systems takes no responsibility for any system
implementations based on this code.
This very simple python script establishes USB communication with the Pipsta
printer sends a simple text string to the printer.
Copyright (c) 2014 Able Systems Limited. All rights reserved.
'''
import argparse
import platform
import sys
import time
import usb.core
import usb.util
FEED_PAST_CUTTER = b'\n' * 5
USB_BUSY = 66
# NOTE: The following section establishes communication to the Pipsta printer
# via USB. YOU DO NOT NEED TO UNDERSTAND THIS SECTION TO PROGRESS WITH THE
# TUTORIALS! ALTERING THIS SECTION IN ANY WAY CAN CAUSE A FAILURE TO COMMUNICATE
# WITH THE PIPSTA. If you are interested in learning about what is happening
# herein, please look at the following references:
#
# PyUSB: http://sourceforge.net/apps/trac/pyusb/
# ...which is a wrapper for...
# LibUSB: http://www.libusb.org/
#
# For full help on PyUSB, at the IDLE prompt, type:
# >>> import usb
# >>> help(usb)
# 'Deeper' help can be trawled by (e.g.):
# >>> help(usb.core)
#
# or at the Linux prompt, type:
# pydoc usb
# pydoc usb.core
PIPSTA_USB_VENDOR_ID = 0x0483
PIPSTA_USB_PRODUCT_ID = 0xA053
def parse_arguments():
    '''Parse the arguments passed to the script looking for a font file name
    and a text string to print. If either are missing defaults are used.
    '''
    txt = 'Hello World from Pipsta!'
    parser = argparse.ArgumentParser()
    parser.add_argument('text', help='the text to print',
                        nargs='*', default=txt.split())
    args = parser.parse_args()
    return ' '.join(args.text)
def main():
    """The main loop of the application. Wrapping the code in a function
    prevents it being executed when various tools import the code.
    """
    if platform.system() != 'Linux':
        sys.exit('This script has only been written for Linux')
    # Find the Pipsta's specific Vendor ID and Product ID
    dev = usb.core.find(idVendor=PIPSTA_USB_VENDOR_ID,
                        idProduct=PIPSTA_USB_PRODUCT_ID)
    if dev is None: # if no such device is connected...
        raise IOError('Printer not found') # ...report error
    try:
        # Linux requires USB devices to be reset before configuring, may not be
        # required on other operating systems.
        dev.reset()
        # Initialisation. Passing no arguments sets the configuration to the
        # currently active configuration.
        dev.set_configuration()
    except usb.core.USBError as ex:
        raise IOError('Failed to configure the printer', ex)
    # The following steps get an 'Endpoint instance'. It uses
    # PyUSB's versatile find_descriptor functionality to claim
    # the interface and get a handle to the endpoint
    # An introduction to this (forming the basis of the code below)
    # can be found at:
    cfg = dev.get_active_configuration() # Get a handle to the active interface
    interface_number = cfg[(0, 0)].bInterfaceNumber
    # added to silence Linux complaint about unclaimed interface, it should be
    # release automatically
    usb.util.claim_interface(dev, interface_number)
    alternate_setting = usb.control.get_interface(dev, interface_number)
    interface = usb.util.find_descriptor(
        cfg, bInterfaceNumber=interface_number,
        bAlternateSetting=alternate_setting)
    usb_endpoint = usb.util.find_descriptor(
        interface,
        custom_match=lambda e:
        usb.util.endpoint_direction(e.bEndpointAddress) ==
        usb.util.ENDPOINT_OUT
    )
    if usb_endpoint is None: # check we have a real endpoint handle
        raise IOError("Could not find an endpoint to print to")
    # Now that the USB endpoint is open, we can start to send data to the
    # printer.
    # The following opens the text_file, by using the 'with' statement there is
    # no need to close the text_file manually. This method ensures that the
    # close is called in all situation (including unhandled exceptions).
    txt = parse_arguments()
    usb_endpoint.write(b'\x1b!\x00')
    # Print a char at a time and check the printers buffer isn't full
    for x in txt:
        usb_endpoint.write(x) # write all the data to the USB OUT endpoint
        res = dev.ctrl_transfer(0xC0, 0x0E, 0x020E, 0, 2)
        while res[0] == USB_BUSY:
            time.sleep(0.01)
            res = dev.ctrl_transfer(0xC0, 0x0E, 0x020E, 0, 2)
    usb_endpoint.write(FEED_PAST_CUTTER)
    usb.util.dispose_resources(dev)
# Ensure that BasicPrint is ran in a stand-alone fashion (as intended) and not
# imported as a module. Prevents accidental execution of code.
if __name__ == '__main__':
    main()
Pipsta uses linux line-feed ('\n', 0x0A, 10 in decimal) to mark a new line. For a quick test change BasicPrint.py as follows:
    #txt = parse_arguments()
    txt = 'Receipt:\n========\n1. food 1 - 100$\n2. drink 1 - 200$\n\n\n\n\n'
    usb_endpoint.write(b'\x1b!\x00')
    for x in txt:
        usb_endpoint.write(x)
        res = dev.ctrl_transfer(0xC0, 0x0E, 0x020E, 0, 2)
        while res[0] == USB_BUSY:
            time.sleep(0.01)
            res = dev.ctrl_transfer(0xC0, 0x0E, 0x020E, 0, 2)
I commented out the argument parsing and injected a test string with multi-line content.
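Since a line feed starts a new line, a Python list can be turned into printable text by joining its items with '\n'. A small sketch; `build_print_text` is a hypothetical helper, and the "1. item" numbering is only an example format.

```python
def build_print_text(items):
    """Number each item and join them with line feeds, one item per line."""
    lines = ["{}. {}".format(number, item)
             for number, item in enumerate(items, start=1)]
    return "\n".join(lines) + "\n"

txt = build_print_text(["food 1 - 100$", "drink 1 - 200$"])
```

The resulting string could be assigned to txt in main() in place of the parse_arguments() call, leaving the character-by-character write loop unchanged.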

Incremental parse of appended data to external XML file in Python

I've got a log file on another computer in my LAN. The log is an XML file. The file is not accessible over HTTP and is updated every second.
Currently I copy the log file to my computer and run a parser on it, but I want to parse the file directly from the remote host.
How can I do this in Python? Is it possible to parse the whole file once and afterwards parse only the new content appended to the end?
You can use paramiko and xml.sax's default parser, xml.sax.expatreader, which implements xml.sax.xmlreader.IncrementalParser.
I ran the following script on a local virtual machine to produce the XML.
#!/bin/bash
echo "<root>" > data.xml
I=0
while sleep 2; do
    echo "<entry><a>value $I</a><b foo='bar' /></entry>" >> data.xml;
    I=$((I + 1));
done
Here's the incremental consumer.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import time
import xml.sax
from contextlib import closing
import paramiko.client
class StreamHandler(xml.sax.handler.ContentHandler):
    lastEntry = None
    lastName = None
    def startElement(self, name, attrs):
        self.lastName = name
        if name == 'entry':
            self.lastEntry = {}
        elif name != 'root':
            self.lastEntry[name] = {'attrs': attrs, 'content': ''}
    def endElement(self, name):
        if name == 'entry':
            print({
                'a' : self.lastEntry['a']['content'],
                'b' : self.lastEntry['b']['attrs'].getValue('foo')
            })
            self.lastEntry = None
    def characters(self, content):
        if self.lastEntry:
            self.lastEntry[self.lastName]['content'] += content
if __name__ == '__main__':
    # use default ``xml.sax.expatreader``
    parser = xml.sax.make_parser()
    parser.setContentHandler(StreamHandler())
    client = paramiko.client.SSHClient()
    # or use ``client.load_system_host_keys()`` if appropriate
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect('192.168.122.40', username = 'root', password = 'pass')
    with closing(client) as ssh:
        with closing(ssh.open_sftp()) as sftp:
            with closing(sftp.open('/root/data.xml')) as f:
                while True:
                    buffer = f.read(4096)
                    if buffer:
                        parser.feed(buffer)
                    else:
                        time.sleep(2)
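The key point is that feed() accepts arbitrary chunks and fires the handler callbacks as soon as an element is complete, even though the root element is never closed. A stripped-down sketch with no SSH involved, where in-memory strings stand in for the SFTP reads; `EntryCollector` and `feed_in_chunks` are illustrative names only.

```python
import xml.sax
from xml.sax.handler import ContentHandler

class EntryCollector(ContentHandler):
    """Collect the text of each child element of every <entry>."""
    def __init__(self):
        super().__init__()
        self.entries = []
        self._current = None   # dict for the <entry> being parsed
        self._field = None     # name of the child element being parsed

    def startElement(self, name, attrs):
        if name == "entry":
            self._current = {}
        elif self._current is not None:
            self._field = name
            self._current[name] = ""

    def characters(self, content):
        # May be called several times per element, so append.
        if self._current is not None and self._field is not None:
            self._current[self._field] += content

    def endElement(self, name):
        if name == "entry":
            self.entries.append(self._current)
            self._current = None
        self._field = None

def feed_in_chunks(parser, text, chunk_size):
    """Feed text to the parser in small pieces, as a network read would."""
    for start in range(0, len(text), chunk_size):
        parser.feed(text[start:start + chunk_size])

collector = EntryCollector()
parser = xml.sax.make_parser()
parser.setContentHandler(collector)
# First "read": the start of the file plus one entry.
feed_in_chunks(parser, "<root><entry><a>value 0</a><b foo='bar' /></entry>", 7)
# Later "read": only the newly appended entry.
feed_in_chunks(parser, "<entry><a>value 1</a><b foo='bar' /></entry>", 5)
```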
I am assuming that another process, which you don't have access to, maintains the XML as an object that is updated every so often and then dumps the result.
If you don't have access to the source of the program dumping the XML, you will need to diff the two XML versions to get an incremental update to send over the network.
And I think you would have to parse the new XML each time to be able to compute that diff.
So maybe you could have a Python process watching the file, parsing the new version, and diffing it (for instance using solutions from this article), and then you could send that difference over the network using a tool like xmlrpc. If you want to save bandwidth, that will probably help. Although I think I would send the raw diff directly over the network, then patch and parse the file on the local machine.
However, if only some of your XML values are changing (no node deletion or insertion), there may be a faster solution. Or, if the only operation on your XML file is appending new trees, then you should be able to parse only these new trees and send them over (diff first, then parse on the server, send to the client, merge on the client).
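For the append-only case, one way to avoid re-reading the whole file is to remember how many bytes have already been consumed and read only from that offset onwards. A rough sketch, assuming the file only ever grows; `read_appended` is a hypothetical helper, and `io.BytesIO` stands in for the real log file.

```python
import io

def read_appended(handle, offset):
    """Return (new_bytes, new_offset) for data written past offset."""
    handle.seek(offset)
    data = handle.read()
    return data, offset + len(data)

# Simulated log file that grows between two reads.
log = io.BytesIO()
log.write(b"<root><entry>0</entry>")
first, offset = read_appended(log, 0)

log.seek(0, io.SEEK_END)
log.write(b"<entry>1</entry>")
second, offset = read_appended(log, offset)
```

Only the bytes in `second` would need to be parsed or sent after the first pass.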
