How to decode Common Industrial Protocol (CIP) packets using python? - python

I tried to decode this highlighted segment however i ran into some issues.
I used this code in order to decipher the content
hexed ="01000c0000000040000040400000803f0000003f2af0ce4004040000404000008040cdcc4c3ecdcccc3d305b1a3e2903fa42240000484400006144000048430000c8424ddc4143200000484400006144000048430000c84218380b440000000000000000000000000000000000000000000000000b010001deddf7420b0100016666e6400201000102000000000000000000000000305b1a3e4ddc414318380b4400010000000101000100010002000300121204000200010000050006000600ffffffff00000000deddf742"
ether_pkt = Ether(binascii.unhexlify(hexed))
ether_pkt.show()
And the result i got is:
How do i further decipher this content?
'\x80?\x00\x00\x00?*\xf0\xce#\x04\x04\x00\x00##\x00\x00\x80#\xcd\xccL>\xcd\xcc\xcc=0[\x1a>)\x03\xfaB$\x00\x00HD\x00\x00aD\x00\x00HC\x00\x00\xc8BM\xdcAC \x00\x00HD\x00\x00aD\x00\x00HC\x00\x00\xc8B\x188\x0bD\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0b\x01\x00\x01\xde\xdd\xf7B\x0b\x01\x00\x01ff\xe6#\x02\x01\x00\x01\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x000[\x1a>M\xdcAC\x188\x0bD\x00\x01\x00\x00\x00\x01\x01\x00\x01\x00\x01\x00\x02\x00\x03\x00\x12\x12\x04\x00\x02\x00\x01\x00\x00\x05\x00\x06\x00\x06\x00\xff\xff\xff\xff\x00\x00\x00\x00\xde\xdd\xf7B'
I've tried to .decode() and hex() in order to turn them into string however the output is not human readable

Have a look at pycomm3. Especially its CIP reference.
According to the reference, 0x4c is the "read_tag" custom service for Rockwell devices, whatever that means.
The data you highlighted is listed as "command specific data". That suggests that it is not defined in the CIP, but is custom to the device that sent it. If it had been part of the CIP, wireshark could probably have decoded it further. So you will have to find and read documentation for the device in question.

There is no magic, you need to download the specs and write a parser to decode it. As you can see in your wireshark screenshot, the protocol isn't string/ascii.

Related

Facebook/messenger archive contains emoji that I am unable to parse

I can' figure out how to decode facebook's way of encoding emoji in the messenger archive.
Hi everyone,
I'm trying to code a handy utility to explore messenger's archive file with PYTHON.
The message's file is a "badly encoded "JSON and as stated in this other post: Facebook JSON badly encoded
Using .encode('latin1').decode('utf8) I've been able to deal with most characters such as "é" or "à" and display them correctly. But I'm having a hard time with emojis, as they seem to be encoded in a different way.
Example of a problematic emoji : \u00f3\u00be\u008c\u00ba
The encoding/decoding does not yield any errors, but Tkinter is not willing to display what the function outputs and gives "_tkinter.TclError: character U+fe33a is above the range (U+0000-U+FFFF) allowed by Tcl". Tkinter is not yet this issue thought because trying to display the same emoji in the consol yields "ó¾º" which clearly isn't what's supposed to be displayed ( it's supposed to be a crying face)
I've tried using the emoji library but it doesn't seem to help any
>>> print(emoji.emojize("\u00f3\u00be\u008c\u00ba"))
'ó¾º'
How can I retrieve the proper emoji and display it?
If it's not possible, how can I detect problematic emojis to maybe sanitize and remove them from the JSON in the first place?
Thank you in advance
.encode('latin1').decode('utf8) is correct - it results in the codepoint U+fe33a("󾌺"). This codepoint is in a Private Use Area (PUA) (specifically Supplemental Private Use Area-A), so everyone can assign his own meaning to that codepoint (Maybe facebook wanted to use a crying face, when there wasn't yet one in Unicode, so they used PUA?).
Googling for that char (https://www.google.com/search?q=󾌺) makes google autocorrect it to U+1f62d ("😭") - sadly I have no idea how google maps U+fe33a to U+1f62d.
Googling for U+fe33a site:unicode.org gives https://unicode.org/L2/L2010/10132-emojidata.pdf, which lists U+1F62D as proposed official codepoint.
As that document from unicode lists U+fe33a as a codepoint used by google, I searched for android old emoji codepoints pua. Among other stuff two actually usable results:
How to get Android emoji code point - the question links to :
https://unicodey.com/emoji-data/table.htm - a html table, that seems to be acceptably parsable
and even better: https://github.com/google/mozc/blob/master/src/data/emoji/emoji_data.tsv - a tab sepperated list, that maps modern codepoints to legacy PUA codepoints and other information like this:
1F62D 😭 FE33A E72D E411[...]
https://github.com/googlei18n/noto-emoji/issues/115 - this thread links to:
https://github.com/Crissov/noto-emoji/blob/legacy-pua/emoji_aliases.txt - a machine readable document, that translates legacy PUA codepoints to modern codepoints like this:
FE33A;1F62D # Google
I included my search queries in the answer, because non of the results I found are in any way authoritative - but it should be enough, to get your tool working :-)

Is it possible to access the hexdump of a packet in PyShark?

I am using pyshark to open and parse pcap files. Currently I've been able to access the packet fields. But I cannot seem to find a way to access the hexdump value of each packet. Is there any way to do that?
According to the homepage of PyShark:
[PyShark] doesn't actually parse any packets, it simply uses tshark's (wireshark command-line utility) ability to export XMLs to use its parsing.
The XML exported by tshark is either PSML (Packet Summary Markup Language) or PDML (Packet Details Markup Language) and neither of these format store the full hexadecimal dump of packets.
After digging into the source code and considering the point above, I can say that the feature you are looking for is not implemented in PyShark.

Using python to measure Wi-Fi

I am working on a school project in which I must measure and log Wi-Fi (I know how to log the data, I just don't know the most efficient way to do it). I have tried using by using
subproject.check_output('iwconfig', stderr=subprocess.STDOUT)
but that outputs bytes, which I really don't want to deal with (and I don't know how to, either, so if that is the only option, then can someone explain how to handle bytes). Is there any other way, maybe to get it in plain text? And please do not just give me the code I need, tell me how to do it.
Thank you in advance!
You are almost there. I assume that you are using python 3.x. iwconfig is sending you text encoded in whatever character set your terminal uses. That encoding is available as sys.stdin.encoding. So just put it together to get a string. BTW, you want a command list instead of a string.
raw = subprocess.check_output(['iwconfig'],stderr=subprocess.STDOUT)
data = raw.decode(sys.stdin.encoding)

Interpreting padded string

I have little experience with unicode strings. I am not even sure this fits the criteria.
In any case I was using nmap and ran:
# nmap -sV -O 192.168.0.8
against a box in my LAN. Nmap produced a string over several lines returned from an open port, but I cannot understand a lot of the output due to its formatting. For example, a small snippet looks like this:
-Port8081-TCP:V=6.00%I=7%D=10/20%Time=52642C3A%P=i686-pc-linux-gnu%r(FourOhFourRequest,37,"HTTP/1\.1\x20503\x20Service\x20Unavailable\r\nContent-Length:\x200\r\n\r\n")%r
My first thought was URL encoding which requires decoding, but that's incorrect. It looks almost like padding from serial communication? Anybody able to shed light on how to interpret the "\x200" or "\x20503" or another that shows often is "x\20".
I thought about writing a small Python script to take in the entire string and convert to ASCII with:
>>> s = '<STRING>'
>>> eval('\x20"'+s.replace('"', r'"')+'"').encode('ascii')
Am I on the right track?
The string you see is a service fingerprint. It contains the responses that were received to the various probes that Nmap sends. If you think there is identifying information in the responses, please submit the fingerprint to the Nmap project to improve detection in the future.
More than likely, what happened is that the service is not sending any useful information. The sample you gave, for instance, does not have a Server: header that would identify the HTTP server.
To answer the technical problem of how to turn this string:
"HTTP/1\.1\x20503\x20Service\x20Unavailable\r\nContent-Length:\x200\r\n\r\n"
into an unescaped version, you can do this:
>>> print mystring
"HTTP/1\.1\x20503\x20Service\x20Unavailable\r\nContent-Length:\x200\r\n\r\n"
>>> print mystring.decode('string-escape')
"HTTP/1\.1 503 Service Unavailable
Content-Length: 0
"
Those numbers bring to mind hexidecimal values due to the 'x' in front.
I know that hexidecimal values actually start with '0x' and not just x, but I thought it was worth googling them as hex values with the '0x' in front. I did get a full page of search results which seemed to contain these three values(perhaps inevitable that three random values would show up somewhere, but then again, perhaps not):
0x200, 0x20503, 0x20
Sorry that this isn't an answer as such, but I thought I would mention it since you didn't mention trying this in your post. I wanted to post this as a comment, but the option wasn't available for some reason...

Handling unicode data in XMLRPC

I have to migrate data to OpenERP through XMLRPC by using TerminatOOOR.
I send a name with value "Rotule right Aurélia".
In Python the name with be encoded with value : 'Rotule right Aur\xc3\xa9lia '
But in TerminatOOOR (xmlrpc client) the data is encoded with value 'Rotule middle Aur\357\277\275lia'
So in the server side, the data value is not decoded correctly and I get bad data.
The terminateOOOR is a ruby plugin for Kettle ( Java product) and I guess it should encode data by utf-8.
I just don't know why it happens like this.
Any help?
This issue comes from Kettle.
My program is using Kettle to get an Excel file, get the active sheet and transfer the data in that sheet to TerminateOOOR for further handling.
At the phase of reading data from Excel file, Kettle can not recognize the encoding then it gives bad data to TerminateOOOR.
My work around solution is manually exporting excel to csv before giving data to TerminateOOOR. By doing this, I don't use the feature to mapping excel column name a variable name (used by kettle).
first off, whenever you deal with text (and all text is bound to contain some non-US-ASCII character sooner or later), you'll be much happier doing that in Python 3.x instead of in the 2.x series. if Py3 is not an option, try to always use from __future__ import unicode_literals (available in Python 2.6 and 2.7).
basically, when you send text or any other data over the wire, that will only happen in the form of bytes (octets of bits), so it will have to be encoded at some point. try to find out exactly where that encoding takes place in your tool chain; if necessary, use a debugging tool (or deploy print( repr( x ) ) statements) to look into relevant variables. the other software you mention is presumably written in PHP, a language which is known to have issues with unicode. you say that 'it should encode the data by utf-8', but on the other hand, when the receiving end sees the data of an incoming RPC request, that data should already be in utf-8. it would have to be decoded to obtain unicode again.

Categories