Extracting contents from a header and parts of payload - python

I have the following contents in a data.log file. I wish to extract the ts value and part of the payload (the bytes after deadbeef in the payload: the last two bytes of the third hex row followed by the first two bytes of the fourth. Please refer to the expected output).
data.log
print 1: file offset 0x0
ts=0x584819041ff529e0 2016-12-07 14:13:24.124834649 UTC
type: ERF Ethernet
dserror=0 rxerror=0 trunc=0 vlen=0 iface=1 rlen=96 lctr=0 wlen=68
pad=0x00 offset=0x00
dst=aa:bb:cc:dd:ee:ff src=ca:fe:ba:be:ca:fe
etype=0x0800
45 00 00 32 00 00 40 00 40 11 50 ff c0 a8 34 35 E..2..@.@.P...45
c0 a8 34 36 80 01 00 00 00 1e 00 00 08 08 08 08 ..46............
08 08 50 e6 61 c3 85 21 01 00 de ad be ef 85 d7 ..P.a..!........
91 21 6f 9a 32 94 fd 07 01 00 de ad be ef 85 d7 .!o.2...........
print 2: file offset 0x60
ts=0x584819041ff52b00 2016-12-07 14:13:24.124834716 UTC
type: ERF Ethernet
dserror=0 rxerror=0 trunc=0 vlen=0 iface=1 rlen=96 lctr=0 wlen=68
pad=0x00 offset=0x00
dst=aa:bb:cc:dd:ee:ff src=ca:fe:ba:be:ca:fe
etype=0x0800
45 00 00 32 00 00 40 00 40 11 50 ff c0 a8 34 35 E..2..@.@.P...45
c0 a8 34 36 80 01 00 00 00 1e 00 00 08 08 08 08 ..46............
08 08 68 e7 61 c3 85 21 01 00 de ad be ef 86 d7 ..h.a..!........
91 21 c5 34 77 bd fd 07 01 00 de ad be ef 86 d7 .!.4w...........
print 3806: file offset 0x592e0
ts=0x584819042006b840 2016-12-07 14:13:24.125102535 UTC
type: ERF Ethernet
dserror=0 rxerror=0 trunc=0 vlen=0 iface=1 rlen=96 lctr=0 wlen=68
pad=0x00 offset=0x00
dst=aa:bb:cc:dd:ee:ff src=ca:fe:ba:be:ca:fe
etype=0x0800
45 00 00 32 00 00 40 00 40 11 50 ff c0 a8 34 35 E..2..@.@.P...45
c0 a8 34 36 80 01 00 00 00 1e 00 00 08 08 08 08 ..46............
08 08 50 74 73 c3 85 21 01 00 de ad be ef 62 e6 ..Pts..!......b.
91 21 ed 4a 8c df fd 07 01 00 de ad be ef 62 e6 .!.J..........b.
My expected output
0x584819041ff529e0,85d79121
0x584819041ff52b00,86d79121
0x584819042006b840,62e69121
What I have tried so far
I am able to extract the ts value. I used
awk -v ORS="" '$NF == "UTC"{print sep$1; sep=","} END{print "\n"}' data.log
>> ts=0x584819041ff529e0,ts=0x584819041ff52b00
But I didn't succeed in extracting the payload contents.
Any help is much appreciated.

Here's one way to get it done:
awk -F '=| ' '/^ts=/{printf $2","} /de ad be ef/{if(!a){printf $15$16;a=1}else{print $1$2;a=0}}' data.log
Output:
0x584819041ff529e0,85d79121
0x584819041ff52b00,86d79121
Explanation:
-F '=| ' : set the field separator to both '=' and space
/^ts=/{printf $2","} : if pattern 'ts=' found at line beginning, print the second field
/de ad be ef/{something} : if pattern 'de ad be ef' found, do 'something'
Initially variable a will be equal to 0. if pattern de ad be ef is found for the first time, if(!a) would succeed and hence print the 15th and 16th fields. Now set a to 1. So when de ad be ef pattern is matched in the next line, if(!a) check would fail and hence print the 1st and 2nd fields. Now, reset a to 0 and continue the same process for the rest of the file.
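For comparison, the same two-state extraction can be sketched in Python (a sketch, assuming the data.log layout above, where "de ad be ef" occurs exactly twice per record and the "ef" token appears nowhere earlier in those rows):

```python
def extract(text):
    """Pair each ts= value with the two bytes after the first 'de ad be ef'
    plus the first two bytes of the following hex row."""
    out, ts, tail = [], None, None
    for line in text.splitlines():
        if line.startswith("ts="):
            ts = line.split()[0][3:]          # drop the leading 'ts='
            tail = None
        elif "de ad be ef" in line:
            fields = line.split()
            if tail is None:                  # first match: two bytes after the marker
                i = fields.index("ef")
                tail = fields[i + 1] + fields[i + 2]
            else:                             # second match: first two bytes of the row
                out.append("{},{}{}{}".format(ts, tail, fields[0], fields[1]))
    return out
```

It keeps the same two-state logic as the awk answer: the first deadbeef line stashes its trailing bytes, the second one completes and emits the record.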

If you want sed:
sed -n -e '/^ts/ {s/^ts=\([^ ]*\) \(.*\)/\1/; H;};' \
-e '/de ad be ef/ {N; s/\(.*\)de ad be ef \([0-9a-f]\+\) \([0-9a-f]\+\) \(.*\) \([0-9a-f]\+\) \([0-9a-f]\+\) \(.*\)/,\2\3\5\6/; H;};' \
-e '$ {x; s/\n,/,/g; p;}' file
If you are interested in further info, just ask.

awk variant using deadbeef as switch
awk -F '[= ]' '/^ts/{s=$2",";a=15} /de ad be ef/{s=s $a $(a+1);if(a==1)print s;a=1}' data.log
and a sed variant
sed -n -e '/^ts=/{h;b^J}' -e "/de ad be ef/,//{H;g;s/ts=\([^ ]*\).*\n*de ad be ef \(..\) \(..\).*\n\(..\) \(..\).*/\1,\2\3\4\5/p;}" data.log
info: "^J" is a CTRL+J (newline character) in the POSIX version and a ";" in the GNU version

With GNU awk for gensub():
$ awk -v RS= '{
gsub(/( |\t)+[^\n]*(\n|$)/," ")
print gensub(/.*\nts=(\S+).*de ad be ef (..) (..) (..) (..).*/,"\\1,\\2\\3\\4\\5\\6",1)
}' data.log
0x584819041ff529e0,85d79121
0x584819041ff52b00,86d79121
0x584819042006b840,62e69121
The above will work even if deadbeef is split across lines.

ADODB unable to store DATETIME value with sub-second precision

According to the Microsoft documentation for the DATETIME column type, values of that type can store "accuracy rounded to increments of .000, .003, or .007 seconds." According to their documentation for the data types used by ADODB, the adDBTimeStamp (code 135), which ADODB uses for DATETIME column parameters, "indicates a date/time stamp (yyyymmddhhmmss plus a fraction in billionths)." However, all attempts (tested using multiple versions of SQL Server, and both the SQLOLEDB provider and the newer SQLNCLI11 provider) fail when a parameter is passed with sub-second precision. Here's a repro case demonstrating the failure:
import win32com.client
# Connect to the database
conn_string = "Provider=...." # sensitive information redacted
conn = win32com.client.Dispatch("ADODB.Connection")
conn.Open(conn_string)
# Create the temporary test table
cmd = win32com.client.Dispatch("ADODB.Command")
cmd.ActiveConnection = conn
cmd.CommandText = "CREATE TABLE #t (dt DATETIME NOT NULL)"
cmd.CommandType = 1 # adCmdText
cmd.Execute()
# Insert a row into the table (with whole second precision)
cmd = win32com.client.Dispatch("ADODB.Command")
cmd.ActiveConnection = conn
cmd.CommandText = "INSERT INTO #t VALUES (?)"
cmd.CommandType = 1 # adCmdText
params = cmd.Parameters
param = params.Item(0)
print("param type is {:d}".format(param.Type)) # 135 (adDBTimeStamp)
param.Value = "2018-01-01 12:34:56"
cmd.Execute() # this invocation succeeds
# Show the result
cmd = win32com.client.Dispatch("ADODB.Command")
cmd.ActiveConnection = conn
cmd.CommandText = "SELECT * FROM #t"
cmd.CommandType = 1 # adCmdText
rs, rowcount = cmd.Execute()
data = rs.GetRows(1)
print(data[0][0]) # displays the datetime value stored above
# Insert a second row into the table (with sub-second precision)
cmd = win32com.client.Dispatch("ADODB.Command")
cmd.ActiveConnection = conn
cmd.CommandText = "INSERT INTO #t VALUES (?)"
cmd.CommandType = 1 # adCmdText
params = cmd.Parameters
param = params.Item(0)
print("param type is {:d}".format(param.Type)) # 135 (adDBTimeStamp)
param.Value = "2018-01-01 12:34:56.003" # <- blows up here
cmd.Execute()
# Show the result
cmd = win32com.client.Dispatch("ADODB.Command")
cmd.ActiveConnection = conn
cmd.CommandText = "SELECT * FROM #t"
cmd.CommandType = 1 # adCmdText
rs, rowcount = cmd.Execute()
data = rs.GetRows(2)
print(data[0][1])
This code throws an exception on the line indicated above, with the error message "Application uses a value of the wrong type for the current operation." Is this a known bug in ADODB? If so, I haven't found any discussion of it. (Perhaps there was discussion earlier which disappeared when Microsoft killed the KB pages.) How can the value be of the wrong type if it matches the documentation?
This is a well-known bug in the SQL Server OLEDB drivers going back more than 20 years, which means it is never going to be fixed.
It's also not a bug in ADO. The ActiveX Data Objects (ADO) API is a thin wrapper around the underlying OLEDB API. The bug is in Microsoft's SQL Server OLEDB drivers themselves (all of them). And they will never, never, never fix it now, as they don't want to touch existing code for fear of breaking existing applications.
So the bug has been carried forward for decades:
SQOLEDB (1999) → SQLNCLI (2005) → SQLNCLI10 (2008) → SQLNCLI11 (2010) → MSOLEDB (2012)
The only solution is, rather than parameterizing your datetime as a timestamp:
adTimestamp (aka DBTYPE_DBTIMESTAMP, 135)
you need to parameterize it as an "ODBC 24-hour format" yyyy-mm-dd hh:mm:ss.zzz string:
adChar (aka DBTYPE_STR, 129): 2021-03-21 17:51:22.619
or even with the ADO-specific string type:
adVarChar (200): 2021-03-21 17:51:22.619
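Building that string from a Python datetime might look like this (a sketch; odbc_datetime_string is my name, not an ADO API):

```python
from datetime import datetime

def odbc_datetime_string(dt):
    # 'yyyy-mm-dd hh:mm:ss.zzz': keep milliseconds only, since DATETIME
    # ticks are 1/300 s (~3.33 ms) and extra digits would be discarded anyway
    return dt.strftime("%Y-%m-%d %H:%M:%S.") + "{:03d}".format(dt.microsecond // 1000)

print(odbc_datetime_string(datetime(2021, 3, 21, 17, 51, 22, 619000)))
# 2021-03-21 17:51:22.619
```

The resulting string is then bound as adVarChar (or adChar) instead of adDBTimeStamp, and SQL Server implicitly converts it with the sub-second part intact.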
What about other DBTYPE_xxx's?
You might think that the adDate (aka DBTYPE_DATE, 7) looks promising:
Indicates a date value (DBTYPE_DATE). A date is stored as a double, the whole part of which is the number of days since December 30, 1899, and the fractional part of which is the fraction of a day.
But unfortunately not, as it also parameterizes the value to the server without milliseconds:
exec sp_executesql N'SELECT @P1 AS Sample',N'@P1 datetime','2021-03-21 06:40:24'
You also cannot use adFileTime, which also looks promising:
Indicates a 64-bit value representing the number of 100-nanosecond intervals since January 1, 1601 (DBTYPE_FILETIME).
Meaning it could support a resolution of 0.0000001 seconds.
Unfortunately by the rules of VARIANTs, you are not allowed to store a FILETIME in a VARIANT. And since ADO uses variants for all values, it throws up when it encounters variant type 64 (VT_FILETIME).
Decoding TDS to confirm our suspicions
We can confirm that the SQL Server OLEDB driver is not supplying a datetime with the available precision by decoding the packet sent to the server.
We can issue the batch:
SELECT ? AS Sample
And specify parameter 1: adDBTimestamp - 3/21/2021 6:40:23.693
Now we can capture that packet:
0000 03 01 00 7b 00 00 01 00 ff ff 0a 00 00 00 00 00 ...{............
0010 63 28 00 00 00 09 04 00 01 32 28 00 00 00 53 00 c(.......2(...S.
45 00 4c 00 45 00 43 00 54 00 20 00 40 00 50 00 E.L.E.C.T. .@.P.
0030 31 00 20 00 41 00 53 00 20 00 53 00 61 00 6d 00 1. .A.S. .S.a.m.
0040 70 00 6c 00 65 00 00 00 63 18 00 00 00 09 04 00 p.l.e...c.......
0050 01 32 18 00 00 00 40 00 50 00 31 00 20 00 64 00 .2....#.P.1. .d.
0060 61 00 74 00 65 00 74 00 69 00 6d 00 65 00 00 00 a.t.e.t.i.m.e...
0070 6f 08 08 f2 ac 00 00 20 f9 6d 00 o...... .m.
And decode it:
03 ; Packet type. 0x03 = 3 ==> RPC
01 ; Status
00 7b ; Length. 0x07B ==> 123 bytes
00 00 ; SPID
01 ; Packet ID
00 ; Window
ff ff ; ProcName 0xFFFF => Stored procedure number. UInt16 number to follow
0a 00 ; PROCID 0x000A ==> stored procedure ID 10 (10=sp_executesql)
00 00 ; Option flags (16 bits)
00 00 63 28 00 00 00 09 ; blah blah blah
04 00 01 32 28 00 00 00 ;
53 00 45 00 4c 00 45 00 ; \
43 00 54 00 20 00 40 00 ; |
50 00 31 00 20 00 41 00 ; |- "SELECT @P1 AS Sample"
53 00 20 00 53 00 61 00 ; |
6d 00 70 00 6c 00 65 00 ; /
00 00 63 18 00 00 00 09 ; blah blah blah
04 00 01 32 18 00 00 00 ;
40 00 50 00 31 00 20 00 ; \
64 00 61 00 74 00 65 00 ; |- "@P1 datetime"
74 00 69 00 6d 00 65 00 ; /
00 00 6f 08 08 ; blah blah blah
f2 ac 00 00 ; 0x0000ACF2 = 44,274 ==> 1/1/1900 + 44,274 days = 3/21/2021
20 f9 6d 00 ; 0x006DF920 = 7,207,200 ==> 7,207,200 / 300 = 24,024.000 seconds after midnight = 6h 40m 24.000s = 6:40:24.000 AM
The short version is that a datetime is specified on-the-wire as:
datetime is represented in the following sequence:
One 4-byte signed integer that represents the number of days since January 1, 1900. Negative numbers are allowed to represent dates back to January 1, 1753.
One 4-byte unsigned integer that represents the number of one three-hundredths of a second (300 counts per second) elapsed since 12 AM that day.
Which means we can read the datetime supplied by the driver as:
Date portion: 0x0000acf2 = 44,274 = January 1, 1900 + 44,274 days = 3/21/2021
Time portion: 0x006df920 = 7,207,200 = 7,207,200 / 300 seconds = 6:40:24 AM
So the driver cut off the precision of our datetime:
Supplied date: 2021-03-21 06:40:23.693
Date in TDS: 2021-03-21 06:40:24
In other words:
OLE Automation uses a Double to represent datetimes.
The Double has a resolution of ~0.0000003 seconds.
The driver has the option to encode the time down to 1/300th of a second:
6:40:23.693 → 7,207,108 → 0x006DF8C4
But it chose not to. Bug: Driver.
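To make the missed opportunity concrete, here is a sketch (mine, not the driver's code) of the wire encoding described above, keeping the full 1/300-second precision:

```python
import struct
from datetime import date, datetime

def tds_datetime(dt):
    """Encode a datetime in the 8-byte TDS DATETIME wire format:
    int32 days since 1900-01-01, then uint32 ticks of 1/300 s since midnight."""
    days = (dt.date() - date(1900, 1, 1)).days
    seconds = dt.hour * 3600 + dt.minute * 60 + dt.second + dt.microsecond / 1e6
    return struct.pack("<iI", days, round(seconds * 300))

tds_datetime(datetime(2021, 3, 21, 6, 40, 24)).hex()  # 'f2ac000020f96d00', as captured above
```

Encoding the supplied 6:40:23.693 the same way yields tick count 7,207,108 rather than the 7,207,200 the driver actually sent, which is exactly the precision that gets thrown away.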
Resources to help decoding TDS
2.2.6.6 RPC Request
4.8 RPC Client Request (actual hex example)
2.2.5.5.1.8 Date/Times

Chrome corrupting binary file download by converting to UTF-8

I've recently been assigned to maintain an application written with Flask. I'm trying to add a feature that allows users to download a pre-generated Excel file; however, whenever I try to send it, the browser appears to re-encode the file as UTF-8, which inserts multibyte characters and corrupts the file.
File downloaded with wget:
(venv) luke@ubuntu:~$ hexdump -C wget.xlsx | head -n 2
00000000 50 4b 03 04 14 00 00 00 08 00 06 06 fb 4a 1f 23 |PK...........J.#|
00000010 cf 03 c0 00 00 00 13 02 00 00 0b 00 00 00 5f 72 |.............._r|
The file downloaded with Chrome (notice the EF BF BD sequences?)
(venv) luke@ubuntu:~$ hexdump -C chrome.xlsx | head -n 2
00000000 50 4b 03 04 14 00 00 00 08 00 ef bf bd 03 ef bf |PK..............|
00000010 bd 4a 1f 23 ef bf bd 03 ef bf bd 00 00 00 13 02 |.J.#............|
Does anyone know how I could fix this? This is the code I'm using:
data = b'PK\x03\x04\x14\x00\x00\x00\x08\x00}\x0c\xfbJ\x1f#\xcf\x03\xc0\x00\x00\x00\x13\x02\x00\x00\x0b\x00\x00\x00'
send_file(BytesIO(data), attachment_filename="x.xlsx", as_attachment=True)
Related issue: Encoding problems trying to send a binary file using flask_restful
Chrome was expecting to receive UTF-8-encoded text and found bytes that couldn't be interpreted as a valid UTF-8 encoding of a character, which is normal, because your file is binary. So it replaced those invalid bytes with EF BF BD, the UTF-8 encoding of the Unicode replacement character. The Content-Type header you send is probably text/..... Try something like Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
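The corruption itself is easy to reproduce in Python: any byte that is not valid UTF-8 comes back as EF BF BD after a decode/encode round trip (a sketch using the first bytes of the wget dump above):

```python
data = bytes.fromhex("504b03041400000008000606fb4a1f23")  # start of the intact file
# what a browser does when it thinks the body is text: decode as UTF-8,
# substituting U+FFFD for invalid bytes, then hold the result as text
mangled = data.decode("utf-8", errors="replace").encode("utf-8")
# the invalid byte 0xfb is now the three-byte sequence ef bf bd
print(b"\xef\xbf\xbd" in mangled)
```

The fix, as above, is to send the correct Content-Type, e.g. by passing mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" to send_file, so the browser never treats the body as text.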

Python write to text file only if ASCII value

I am trying to write a program which will allow me to compare SQL files to each other, and have started by writing the full SQL file to a text file. The text file generates successfully, but each line ends with unknown characters (displayed as blocks), as in the example below:
SET ANSI_NULLS ON਍ഀ
GO਍ഀ
SET QUOTED_IDENTIFIER ON਍ഀ
GO਍ഀ
CREATE TABLE [dbo].[CDR](਍ഀ
Below this is the code that generates the text file
#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
from _ast import Num
# imports packages

r = open('master_lines.txt', 'w')
directory = "E:\\"  # file directory, anonymous omission
master = directory + "master"
databases = ["\\1", "\\2", "\\3", "\\4"]
file_types = ["\\StoredProcedure", "\\Table", "\\UserDefinedFunction", "\\View"]
servers = []
server_number = []
master_lines = []

for file in os.listdir("E:\\"):  # adds server paths to an array
    servers.append(file)

for num in range(0, len(servers)):
    for file in os.listdir(directory + servers[num]):  # adds all the servers and paths to an array
        server_number.append(servers[num] + "\\" + file)

master = directory + server_number[server_number.index("master")]
master_var = master + databases[0]
tmp = master_var + file_types[1]

for file in os.listdir(tmp):
    with open(file) as tmp_file:
        line = tmp_file.readlines()
        for num in range(0, len(line)):
            r.write(line[num])
r.close()
I have already tried changing the encoding to both latin1 and utf-8; the current text file is the most successful, as ascii and latin1 produced Chinese and Arabic characters respectively.
Below is the SQL file in text format:
/****** Object: Table [dbo].[CDR] Script Date: 2017-01-12 02:30:49 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[CDR](
[calldate] [datetime] NOT NULL,
[clid] [varchar](80) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[src] [varchar](80) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[dst] [varchar](80) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[dcontext] [varchar](80) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[channel] [varchar](80) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[dstchannel] [varchar](80) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[lastapp] [varchar](80) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[lastdata] [varchar](80) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[duration] [int] NOT NULL,
[billsec] [int] NOT NULL,
[disposition] [varchar](45) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[amaflags] [int] NOT NULL,
[accountcode] [varchar](20) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[userfield] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[uniqueid] [varchar](64) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[cdr_id] [int] NOT NULL,
[cost] [real] NOT NULL,
[cdr_tag] [varchar](10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[importID] [bigint] IDENTITY(-9223372036854775807,1) NOT NULL,
CONSTRAINT [PK_CDR_1] PRIMARY KEY CLUSTERED
(
[uniqueid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [ReadPartition]
) ON [ReadPartition]
GO
SET ANSI_PADDING ON
GO
/****** Object: Index [Idx_Dst_incl_uniqueId] Script Date: 2017-01-12 02:30:50 PM ******/
CREATE NONCLUSTERED INDEX [Idx_Dst_incl_uniqueId] ON [dbo].[CDR]
(
[dst] ASC
)
INCLUDE ( [uniqueid]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [ReadPartition]
GO
Hex dump to understand what happens (not part of the above question):
ff fe 2f 00 2a 00 2a 00 2a 00 2a 00 2a 00 2a 00
20 00 4f 00 62 00 6a 00 65 00 63 00 74 00 3a 00
20 00 20 00 54 00 61 00 62 00 6c 00 65 00 20 00
5b 00 64 00 62 00 6f 00 5d 00 2e 00 5b 00 43 00
44 00 52 00 5d 00 20 00 20 00 20 00 20 00 53 00
63 00 72 00 69 00 70 00 74 00 20 00 44 00 61 00
74 00 65 00 3a 00 20 00 32 00 30 00 31 00 37 00
2d 00 30 00 31 00 2d 00 31 00 32 00 20 00 30 00
32 00 3a 00 33 00 30 00 3a 00 34 00 39 00 20 00
50 00 4d 00 20 00 2a 00 2a 00 2a 00 2a 00 2a 00
2a 00 2f 00 0d 00 0a 00 53 00 45 00 54 00 20 00
41 00 4e 00 53 00 49 00 5f 00 4e 00 55 00 4c 00
4c 00 53 00 20 00 4f 00 4e 00 0d 00 0a 00 47 00
4f 00 0d 00 0a 00 53 00 45 00 54 00 20 00 51 00
55 00 4f 00 54 00 45 00 44 00 5f 00 49 00 44 00
Result of hexdump:
../.*.*.*.*.*.*.
.O.b.j.e.c.t.:.
. .T.a.b.l.e. .
[.d.b.o.]...[.C.
D.R.]. . . . .S.
c.r.i.p.t. .D.a.
t.e.:. .2.0.1.7.
-.0.1.-.1.2. .0.
2.:.3.0.:.4.9. .
P.M. .*.*.*.*.*.
*./.....S.E.T. .
A.N.S.I._.N.U.L.
L.S. .O.N.....G.
O.....S.E.T. .Q.
U.O.T.E.D._.I.D.
Your problem is that the original files are encoded in UTF-16 with an initial Byte Order Mark. This is normally transparent on Windows because almost all file editors detect it automatically thanks to the initial BOM.
But the conversion is not automatic for Python scripts! That means every character is read as the character itself followed by a null byte. This is almost transparent except at ends of lines, because the nulls are simply written back out again to form normal UTF-16 characters. But the \n is no longer preceded by a raw \r but by a null, and as you write in text mode, Python replaces it with a \r\n pair, which is no longer a valid UTF-16 character. This is what causes the block display.
This is trivial to fix, just declare the UTF16 encoding when reading files:
for file in os.listdir(tmp):
    with open(file, encoding='utf_16_le') as tmp_file:
Optionally, if you want to preserve the UTF-16 encoding, you could also open the master file with it. By default, Python will encode it as utf8. But my advice would be to revert to 8-bit encoded files to avoid further problems if you later want to process the output file.
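The round trip that produces those stray characters can be reproduced in a few lines (a sketch: a UTF-16-LE "GO" line is read byte-wise, rewritten in text mode, then viewed as UTF-16 again):

```python
# what the SQL file contains: UTF-16-LE text with \r\n line endings
raw = "GO\r\nGO\r\n".encode("utf-16-le")
# a naive 8-bit read sees every character followed by a null byte
as_seen = raw.decode("latin-1")
# writing in text mode on Windows turns each bare \n back into \r\n...
rewritten = as_seen.replace("\n", "\r\n").encode("latin-1")
# ...so the stream is no longer valid UTF-16: the bytes 0d 0a pair up as U+0A0D
mangled = rewritten.decode("utf-16-le")
print("\u0a0d" in mangled, "\u0d00" in mangled)
```

U+0A0D and U+0D00 are exactly the ਍ and ഀ characters visible in the question's output.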

Telegram Api - Creating an Authorization Key 404 error

I am trying to write a simple program in Python to use the Telegram API (not the bot API, the main messaging API). I have written this code:
#!/usr/bin/env python
import socket
import random
import time
import struct
import requests
def swap32(i):  # "<L" packs a 32-bit unsigned integer
    return struct.unpack("<L", struct.pack(">L", i))[0]
MESSAGE = '0000000000000000'+format(swap32(int(time.time()*1000%1000)<<21|random.randint(0,1048575)<<3|4),'x')+format(swap32(int(time.time())),'x')+'140000007897466068edeaecd1372139bbb0394b6fd775d3'
res = requests.post(url='http://149.154.167.40',
data=bytes.fromhex(MESSAGE),
headers={'connection': 'keep-alive'})
print("received data:", res)
For the payload of the POST data I used the source code of Telegram Web. The 0 auth_key_id and the message id are generated using the algorithm in Telegram Web; next is the length (14000000), just like in the source and the main docs, and then the method and so on.
When I run this code I get received data: <Response [404]>. I have tried both the TCP and HTTP transports with this; the TCP one gives me nothing as an answer from the server. I don't know where I'm wrong in my code.
I would be glad if someone can show me the error in my code.
btw here is hex dump of my generated req:
0000 34 08 04 17 7a ec 48 5d 60 84 ba ed 08 00 45 00
0010 00 50 c6 07 40 00 40 06 76 28 c0 a8 01 0d 95 9a
0020 a7 28 c9 62 00 50 0d 1a 3b df 41 5a 40 7f 50 18
0030 72 10 ca 39 00 00 00 00 00 00 00 00 00 00 6c 28
0040 22 4a 94 a9 c9 56 14 00 00 00 78 97 46 60 68 ed
0050 ea ec d1 37 21 39 bb b0 39 4b 6f d7 75 d3
I have already read this and this and many other docs but can't find my problem.
Thanks in advance.
Update
I used this code as suggested:
TCP_IP = '149.154.167.40'
TCP_PORT = 80
MESSAGE = 'ef0000000000000000'+"{0:0{1}x}".format(int(time.time()*4294.967296*1000),16)+'140000007897466068edeaecd1372139bbb0394b6fd775d3'
BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
s.send(bytes.fromhex(MESSAGE))
data = s.recv(BUFFER_SIZE)
s.close()
and I still get no response.
Hex dump of my request:
0000 34 08 04 17 7a ec 48 5d 60 84 ba ed 08 00 45 00
0010 00 51 e1 44 40 00 40 06 5a ea c0 a8 01 0d 95 9a
0020 a7 28 df 8c 00 50 e4 0d 12 46 e2 98 bf a3 50 18
0030 72 10 af 66 00 00 ef 00 00 00 00 00 00 00 00 00
0040 16 37 dc e1 28 39 23 14 00 00 00 78 97 46 60 68
0050 ed ea ec d1 37 21 39 bb b0 39 4b 6f d7 75 d3
Fixed code
I finally got it working with this code:
import socket
import random
import time
import struct
import requests
def swap32(i):
return struct.unpack("<L", struct.pack(">L", i))[0]
TCP_IP = '149.154.167.40'
TCP_PORT = 80
z = int(time.time()*4294.967296*1000000)
z = format(z,'x')
q = bytearray.fromhex(z)
e = q[::-1].hex()
MESSAGE = 'ef0a0000000000000000'+e+'140000007897466068edeaecd1372139bbb0394b6fd775d3'
BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
s.send(bytes.fromhex(MESSAGE))
data = s.recv(BUFFER_SIZE)
s.close()
print(data)
Here is sample data from a simple TCP handshake with the Telegram servers:
Connect:Success:0
Connected to 149.154.167.40:443
raw_data: 000000000000000000F011DB3B2AA9561400000078974660A9729A4F5B51F18F7943F9C0D61B1315
auth_key_id: 0000000000000000 0
message_id: 56A92A3BDB11F000 6244568794892726272
data_length: 00000014 20
message_data: 78974660A9729A4F5B51F18F7943F9C0D61B1315
message_type: 60469778
>> EF0A000000000000000000F011DB3B2AA9561400000078974660A9729A4F5B51F18F7943F9C0D61B1315
Send:Success:42
Receive:Success:85
<< 15000000000000000001CC0CC93D2AA9564000000063241605A9729A4F5B51F18F7943F9C0D61B1315B4445B94718B3C6DD4136466FAC62DCD082311272BE9FF8F9700000015C4B51C01000000216BE86C022BB4C3
raw_data: 000000000000000001CC0CC93D2AA9564000000063241605A9729A4F5B51F18F7943F9C0D61B1315B4445B94718B3C6DD4136466FAC62DCD082311272BE9FF8F9700000015C4B51C01000000216BE86C022BB4C3
auth_key_id: 0000000000000000 0
message_id: 56A92A3DC90CCC01 6244568803180334081
data_length: 00000040 64
message_data: 63241605A9729A4F5B51F18F7943F9C0D61B1315B4445B94718B3C6DD4136466FAC62DCD082311272BE9FF8F9700000015C4B51C01000000216BE86C022BB4C3
message_type: 05162463
classid: resPQ#05162463
nonce: A9729A4F5B51F18F7943F9C0D61B1315
server_nonce: B4445B94718B3C6DD4136466FAC62DCD
pq: 2311272BE9FF8F97 2526843935494475671
count: 00000001 1
fingerprints: C3B42B026CE86B21 14101943622620965665
Let's break it down:
We are using the TCP abridged version, so we start off with 0xEF
The format for plain-text Telegram messages is auth_key_id + msg_id + msg_len + msg
auth_key_id is always 0 for plain-text messages hence we always start with 0000000000000000
msg_id must approximately equal unixtime*2^32 (see here). I have also seen that some variant of this works quite well for msg_id in any language on any platform: whole_part_of(current_micro_second_time_stamp * 4294.967296)
The first message you send for auth_key generation is reqPQ, which is defined as: reqPQ#0x60469778 {:nonce, :int128}. So it is simply a TL-header + a 128-bit random integer; the total length will always be 4 + 16 = 20, which encoded as little-endian gives msg_len = 14000000.
say we have a 128-bit random integer = 55555555555555555555555555555555; then our reqPQ message would be 7897466055555555555555555555555555555555, which is simply TL-type 60469778 (78974660 in little-endian) followed by your randomly chosen 128-bit nonce.
Before you send out the packet, again recall that TCP-abridged mode requires you to include the total packet length in front of the other bytes, just after the initial 0xEF. This packet length is computed as follows:
let len = total_length / 4
a) If len < 127 then len_header = len as byte
b) If len >=127 then len_header = 0x7f + to_3_byte_little_endian(len)
finally we have:
EF0A000000000000000000F011DB3B2AA956140000007897466055555555555555555555555555555555
or
EF0A
0000000000000000
00F011DB3B2AA956
14000000
78974660
55555555555555555555555555555555
compared to yours:
0000000000000000
6C28224A94A9C956
14000000
78974660
68EDEAECD1372139BBB0394B6FD775D3
I would say: try using TCP-abridged mode by including the 0xEF starting byte, and re-check your msg_id computation.
Cheers.
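The length-header rule above can be sketched as follows (assuming total_length is already a multiple of 4, as TL messages are):

```python
def abridged_len_header(total_length):
    # TCP-abridged prefix: length/4 as a single byte,
    # or 0x7f followed by length/4 as a 3-byte little-endian value
    n = total_length // 4
    if n < 0x7f:
        return bytes([n])
    return b"\x7f" + n.to_bytes(3, "little")

abridged_len_header(40)  # the 8+8+4+20 = 40-byte reqPQ packet
```

For the 40-byte reqPQ packet this yields 0x0A, matching the EF0A prefix of the working request.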

String variable comparison in python

I have 2 variables from an XML file.
Edit: I'm sorry, I pasted the wrong value.
x="00 25 9E B8 B9 19 "
y="F0 00 00 25 9E B8 B9 19 "
When I use the if x in y: statement, nothing happens,
but if I use if "00 25 9E B8 B9 19 " in y: I get results.
Any idea?
I am adding my full code:
import xml.etree.ElementTree as ET
tree =ET.parse('c:/sw_xml_test_4a.xml')
root=tree.getroot()
for sw in root.findall('switch'):
    for switch in root.findall('switch'):
        if sw[6].text.rstrip() in switch.find('GE01').text:
            print switch[0].text
        if sw[6].text.strip() in switch.find('GE02').text.strip():
            print switch[0].text
        if sw[6].text.strip() in switch.find('GE03').text.strip():
            print switch[0].text
        if sw[6].text.strip() in switch.find('GE04').text.strip():
            print switch[0].text
xml file detail;
<switch>
<ci_adi>"aaa_bbb_ccc"</ci_adi>
<ip_adress>10.10.10.10</ip_adress>
<GE01>"F0 00 00 25 9E 2C BC 98 "</GE01>
<GE02>"80 00 80 FB 06 C6 A1 2B "</GE02>
<GE03>"F0 00 00 25 9E B8 BB AA "</GE03>
<GE04>"F0 00 00 25 9E B8 BB AA "</GE04>
<bridge_id>"00 25 9E B8 BB AA "</bridge_id>
</switch>
>>> x = "00 25 9E 2C BC 8B"
>>> y = "F0 00 00 25 9E B8 B9 19"
>>> x in y
False
>>> "00 25 9E 2C BC 8B " in y
False
How exactly are you getting results?
Let me explain what in is checking:
in checks whether the entire value of x is contained anywhere within the value of y. As you can see, the entire value of x is NOT contained in y.
However, some elements of x are; maybe what you are trying to do is:
>>> x = ["00", "25", "9E", "2C", "BC", "8B"]
>>> y = "F0 00 00 25 9E B8 B9 19"
>>> for item in x:
...     if item in y:
...         print item + " is in " + y
00 is in F0 00 00 25 9E B8 B9 19
25 is in F0 00 00 25 9E B8 B9 19
9E is in F0 00 00 25 9E B8 B9 19
The operators in and not in test for collection membership. x in s evaluates to true if x is a member of the collection s, and false otherwise. For strings, this translates to return True if entire string x is a substring of y, else return False.
Other than a mix-up of values in your question, this seems to work the way you want:
sxml="""\
<switch>
<ci_adi>"aaa_bbb_ccc"</ci_adi>
<ip_adress>10.10.10.10</ip_adress>
<GE01>"F0 00 00 25 9E 2C BC 98 "</GE01>
<GE02>"80 00 80 FB 06 C6 A1 2B "</GE02>
<GE03>"F0 00 00 25 9E B8 BB AA "</GE03>
<GE04>"F0 00 00 25 9E B8 BB AA "</GE04>
<bridge_id>"00 25 9E B8 BB AA "</bridge_id>
</switch>"""
import xml.etree.ElementTree as et

tree = et.fromstring(sxml)
x="80 00 80 FB 06 C6 A1 2B" # Note: I used a value of x I could see in the data;
# your value of x="00 25 9E B8 B9 19 " is not present...
for el in tree:
    print '{}: {}'.format(el.tag, el.text)
    if x in el.text:
        print 'I found "{}" by gosh at {}!!!\n'.format(x, el.tag)
