comparing strings and decoded unicode in python3 - python

I'm doing some socket/select programming and one of my events is triggered by the incoming byte string of 'OK'. I'm using utf_8 to encode everything sent from the server and decoding it on the client. However, my client comparisons aren't working and my if statement never evaluates to true. Here is the code in question:
Server side:
def broadcast_string(self, data, omit_sock):  # broadcasts data utf_8 encoded to all socks
    for sock in self.descriptors:
        if sock is not self.server and sock is not omit_sock:
            sock.send(data.encode('utf_8'))
    print(data)

def start_game(self):  # i call this to send 'OK'
    data = 'OK'
    self.broadcast_string(data, 0)
    self.new_round()
Client side:
else:  # got data from server
    if data.decode('utf_8') == 'OK':  # i've tried substituting this with a var, no luck
        self.playstarted = True
    else:
        sys.stdout.write(data.decode('utf_8') + "\n")
        sys.stdout.flush()
    if self.playstarted is True:  # never reached because if statement never True
        command = input("-->")
I've read this and I think I'm following it but apparently not. I've even done the examples using the python shell and have had them evaluate to True, but not when I run this program.
Thanks!

TCP sockets don't have message boundaries. As your last comment says, you are getting multiple messages in one long string. You are responsible for queuing up data until you have a complete message, and then processing it as one complete message.
Each time select says a socket has some data to read, append the data to a read buffer, then check to see if the buffer contains a complete message. If it does, extract just the message from the front of the buffer and process it. Continue until no more complete messages are found, then call select again. Note also you should only decode a complete message, since you might receive a partial UTF-8 multi-byte character otherwise.
Rough example using \n as a message terminator (no error handling):
tmp = sock.recv(1000)
readbuf += tmp
while b'\n' in readbuf:
    msg, readbuf = readbuf.split(b'\n', 1)
    process(msg.decode('utf8'))
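The same idea can be packaged so that the buffering state lives in one place. A minimal sketch, assuming newline-terminated UTF-8 messages; the class name LineBuffer and its feed method are made up for illustration and are not part of the socket module:

```python
class LineBuffer:
    """Collects raw bytes and returns only complete, decoded messages."""

    def __init__(self, terminator=b'\n'):
        self.terminator = terminator
        self.buf = b''

    def feed(self, chunk):
        """Add one recv() result; return the complete messages now available."""
        self.buf += chunk
        messages = []
        while self.terminator in self.buf:
            raw, self.buf = self.buf.split(self.terminator, 1)
            # Decode only whole messages, never a partial multi-byte character.
            messages.append(raw.decode('utf-8'))
        return messages


lb = LineBuffer()
first = lb.feed(b'OK\nhel')      # one full message plus a fragment
second = lb.feed(b'lo\n\xc3')    # completes 'hello'; b'\xc3' is half a character
third = lb.feed(b'\xa9\n')       # the second byte of that character arrives
```

With something like this the client compares each returned message against 'OK', which only works if the server sends 'OK\n' rather than a bare 'OK'.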

Related

Python Socket is receiving inconsistent messages from Server

So I am very new to networking and I was using the Python Socket library to connect to a server that is transmitting a stream of location data.
Here is the code used.
import socket
BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('gump.gatech.edu', 756))
try:
    while (1):
        data = s.recv(BUFFER_SIZE).decode('utf-8')
        print(data)
except KeyboardInterrupt:
    s.close()
The issue is that the data arrives in inconsistent forms.
Most of the times it arrives in the correct form like this:
2016-01-21 22:40:07,441,-84.404153,33.778685,5,3
Yet other times it can arrive split up into two lines like so:
2016-01-21
22:40:07,404,-84.396004,33.778085,0,0
The interesting thing is that when I establish a raw connection to the server using Putty I only get the correct form and never the split. So I imagine that there must be something happening that is splitting the message. Or something Putty is doing to always assemble it correctly.
What I need is for the variable data to contain the proper line always. Any idea how to accomplish this?
It is best to think of a socket as a continuous stream of data that may arrive in dribs and drabs, or in a flood.
In particular, it is the receiver's job to break the data up into the "records" that it should consist of; the socket does not magically know how to do this for you. Here the records are lines, so you must read the data and split it into lines yourself.
You cannot guarantee that a single recv will be a single full line. It could be:
just part of a line;
or several lines;
or, most probably, several lines and another part line.
Try something like: (untested)
# we'll use this to collate partial data
data = ""
while 1:
    # receive the next batch of data
    chunk = s.recv(BUFFER_SIZE).decode('utf-8')
    if not chunk:
        # the server closed the connection, nothing more to read
        break
    data += chunk
    # split the data into lines
    lines = data.splitlines(keepends=True)
    # the last of these may be a part line
    full_lines, last_line = lines[:-1], lines[-1]
    # print (or do something else!) with the full lines
    for l in full_lines:
        print(l, end="")
    # was the last line received a full line, or just half a line?
    if last_line.endswith("\n"):
        # print it (or do something else!)
        print(last_line, end="")
        # and reset our partial data to nothing
        data = ""
    else:
        # reset our partial data to this part line
        data = last_line
The easiest way to fix your code is to print the received data without adding a new line, which the print statement (Python 2) and the print() function (Python 3) do by default. Like this:
Python 2:
print data,
Python 3:
print(data, end='')
Now print will not add its own new line character to the end of each printed value and only the new lines present in the received data will be printed. The result is that each line is printed without being split based on the amount of data received by each socket.recv(). For example:
from __future__ import print_function
import socket
s = socket.socket()
s.connect(('gump.gatech.edu', 756))
while True:
    data = s.recv(3).decode('utf8')
    if not data:
        break  # socket closed, all data read
    print(data, end='')
Here I have used a very small buffer size of 3 which helps to highlight the problem.
Note that this only fixes the problem from the POV of printing the data. If you wanted to process the data line-by-line then you would need to do your own buffering of the incoming data, and process the line when you receive a new line or the socket is closed.
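One way to get that buffering without writing it by hand is socket.makefile(), which wraps the socket in a file-like object that reassembles lines for you. A sketch, using socket.socketpair() as a stand-in for the real server connection so that it runs on its own; the sample records are adapted from the question:

```python
import socket

# socketpair() gives two connected sockets; 'server' stands in for the remote host.
server, client = socket.socketpair()

# Send one record split across several small writes, then a second record.
for piece in (b'2016-01-21 ', b'22:40:07,441,-84.404153,', b'33.778685,5,3\n',
              b'2016-01-21 22:40:08,404,-84.396004,33.778085,0,0\n'):
    server.sendall(piece)
server.close()  # end of stream, so the read loop below terminates

records = []
# makefile() buffers internally and only hands back complete lines.
with client.makefile('r', encoding='utf-8', newline='\n') as stream:
    for line in stream:
        records.append(line.rstrip('\n'))
client.close()

for record in records:
    print(record)
```

Each element of records is one whole line, no matter how the bytes were split on the wire.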
Edit:
socket.recv() is blocking and, like the others said, you won't get an exact line each time you call the method. The socket waits for data, gets what it can, and then returns. When you print this, because of Python's default end argument, you may get more newlines than you expected. So to get the raw stuff from your server, use this:
import socket
BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('gump.gatech.edu', 756))
try:
    while (1):
        data=s.recv(BUFFER_SIZE).decode('utf-8')
        if not data: break
        print(data, end="")
except KeyboardInterrupt:
    s.close()

Python serial port read delay

I'm playing around with this serial module in python. I have a little problem with it. I want my script to get a char from the console, send it to an AVR board, and read back the response.
Every time I read from the USB port and print it out, I see the previous result. Why's that?
For example:
I write 5
I read nothing
I write 6
I read 5
import serial
import sys, time
port=serial.Serial(
    port='/dev/ttyUSB0',\
    baudrate=9600,\
    parity=serial.PARITY_NONE,\
    stopbits=serial.STOPBITS_ONE,\
    bytesize=serial.EIGHTBITS,\
    timeout=0)
i=0
tmp = 0
while True:
    tmp=raw_input('send: ')
    port.write(tmp)
    port.flushOutput()
    print port.read(1)
    port.flushInput()
From the documentation: "Writes are blocking by default, unless writeTimeout is set. For possible values refer to the list for timeout above." Try setting writeTimeout=0 as well in your constructor.
You are probably receiving a single unexpected byte on startup - either the microcontroller is sending it, or it might be noise from connecting a plug. As you are only reading a single byte for each string transmitted, you will always be off by one.
Instead of port.read(1), try:
while True:
    tmp=raw_input('send: ')
    port.write(tmp)
    port.flushOutput()
    print port.read(port.inWaiting())
    port.flushInput()
This would also have happened if you typed in more than one character at the input prompt.

Receive image in Python

The following code is for a python server that can receive a string.
import socket
TCP_IP = '127.0.0.1'
TCP_PORT = 8001
BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((TCP_IP, TCP_PORT))
s.listen(1)
conn, addr = s.accept()
print 'Connection address:', addr
while 1:
    length = conn.recv(1027)
    data = conn.recv(int(length))
    import StringIO
    buff = StringIO.StringIO()
    buff.write(data)
    if not data: break
    print "received data:", data
    conn.send('Thanks') # echo
    get_result(buff)
conn.close()
Can anyone help me to edit this code or create a similar one to be able to receive images instead of string?
First, your code actually can't receive a string. Sockets are byte streams, not message streams.
This line:
length = conn.recv(1027)
… will receive anywhere from 1 to 1027 bytes.
You need to loop around each recv and accumulate a buffer, like this:
def recvall(conn, length):
    buf = b''
    while len(buf) < length:
        data = conn.recv(length - len(buf))
        if not data:
            return data
        buf += data
    return buf
Now you can make it work like this:
while True:
    length = recvall(conn, 1027)
    if not length: break
    data = recvall(conn, int(length))
    if not data: break
    print "received data:", data
    conn.send('Thanks') # echo
You can use StringIO or other techniques instead of concatenation for performance reasons, but I left that out because it's simpler and more concise this way, and understanding the code is more important than performance.
Meanwhile, it's worth pointing out that 1027 bytes is a ridiculously huge amount of space to use for a length prefix. Also, your sending code has to make sure to actually send 1027 bytes, no matter what. And your responses have to always be exactly 6 bytes long for this to work.
def send_string(conn, msg):
    conn.sendall(str(len(msg)).ljust(1027))
    conn.sendall(msg)
    response = recvall(conn, 6)
    return response
But at least now it is workable.
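For comparison, a more common framing is a fixed 4-byte binary length prefix. The sketch below is Python 3 and is not the asker's protocol: send_msg, recv_msg and recv_exact are hypothetical helper names, and socket.socketpair() stands in for a real client and server connection.

```python
import socket
import struct

def recv_exact(conn, length):
    """Read exactly `length` bytes, or return b'' if the peer closes early."""
    buf = b''
    while len(buf) < length:
        data = conn.recv(length - len(buf))
        if not data:
            return b''
        buf += data
    return buf

def send_msg(conn, payload):
    # 4-byte unsigned big-endian length, then the payload itself.
    conn.sendall(struct.pack('!I', len(payload)) + payload)

def recv_msg(conn):
    header = recv_exact(conn, 4)
    if not header:
        return None  # connection closed before a new message started
    (length,) = struct.unpack('!I', header)
    return recv_exact(conn, length)

a, b = socket.socketpair()
send_msg(a, b'first')
send_msg(a, b'second message')
received = [recv_msg(b), recv_msg(b)]
a.close()
after_close = recv_msg(b)
b.close()
```

Four bytes are enough for payloads up to about 4 GiB, and the receiver always knows exactly how many bytes to wait for.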
So, why did you think it worked?
TCP is a stream of bytes, not a stream of messages. There's no guarantee that a single send from one side will match up with the next recv on the other side. However, when you're running both sides on the same computer, sending relatively small buffers, and aren't loading the computer down too badly, they will often happen to match up 1-to-1. After all, each time you call recv, the other side has probably only had time to send one message, which is sitting in the OS's buffers all by itself, so the OS just gives you the whole thing. So, your code will appear to work in initial testing.
But if you send the message through a router to another computer, or if you wait long enough for the other side to make multiple send calls, or if your message is too big to fit into a single buffer, or if you just get unlucky, there could be 2-1/2 messages waiting in the buffer, and the OS will give you the whole 2-1/2 messages. And then your next recv will get the leftover 1/2 message.
So, how do you make this work for images? Well, it depends on what you mean by that.
You can read an image file into memory as a sequence of bytes, and call send_string on that sequence, and it will work fine. Then the other side can save that file, or interpret it as an image file and display it, or whatever it wants.
Alternatively, you can use something like PIL to parse and decompress an image file into a bitmap. Then, you encode the header data (width, height, pixel format, etc.) in some way (e.g., pickle it), send_string the header, then send_string the bitmap.
If the header has a fixed size (e.g., it's a simple structure that you can serialize with struct.pack), and contains enough information for the other side to figure out the length of the bitmap in bytes, you don't need to send_string each one; just use conn.sendall(serialized_header) then conn.sendall(bitmap).
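A sketch of that last variant in Python 3, assuming a made-up header layout of width, height and bytes per pixel (three unsigned 32-bit integers). HEADER, send_bitmap, recv_bitmap and recv_exact are illustrative names, and the "bitmap" is just dummy bytes rather than output from PIL:

```python
import socket
import struct

HEADER = struct.Struct('!III')  # width, height, bytes per pixel (assumed layout)

def recv_exact(conn, length):
    """Keep calling recv() until exactly `length` bytes have arrived."""
    chunks = []
    remaining = length
    while remaining > 0:
        data = conn.recv(remaining)
        if not data:
            raise ConnectionError('socket closed before the full message arrived')
        chunks.append(data)
        remaining -= len(data)
    return b''.join(chunks)

def send_bitmap(conn, width, height, bpp, pixels):
    conn.sendall(HEADER.pack(width, height, bpp))
    conn.sendall(pixels)

def recv_bitmap(conn):
    width, height, bpp = HEADER.unpack(recv_exact(conn, HEADER.size))
    # The header alone tells the receiver how many pixel bytes follow.
    pixels = recv_exact(conn, width * height * bpp)
    return width, height, bpp, pixels

a, b = socket.socketpair()
pixels = bytes(range(24))  # dummy 4x2 image, 3 bytes per pixel
send_bitmap(a, 4, 2, 3, pixels)
result = recv_bitmap(b)
a.close()
b.close()
```

Because the header size is fixed and the pixel count follows from it, no separate length prefix is needed for either part.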

reading from sys.stdin without newline or EOF

I want to receive data from my GPS tracker. It sends data over TCP, so I use xinetd to listen on a TCP port and a Python script to handle the data. This is the xinetd config:
service gps-gprs
{
    disable = no
    flags = REUSE
    socket_type = stream
    protocol = tcp
    port = 57003
    user = root
    wait = no
    server = /path/to/gps.py
    server_args = 3
}
Config in /etc/services
gps-gprs 57003/tcp # Tracking system
And Python script gps.py
#!/usr/bin/python
import sys
def main():
    data = sys.stdin.readline().strip()
    #do something with data
    print 'ok'
if __name__ =='__main__':
    main()
The tracker sends data strings in raw text like
$GPRMC,132017.000,A,8251.5039,N,01040.0065,E,0.00,,010111,0,,A*75+79161234567#
The problem is that sys.stdin in the Python script doesn't receive an end-of-line or end-of-file character, and sys.stdin.readline() blocks forever. I tried to send data from another PC with a Python script:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('', 57003))
s.sendall( u'hello' )
data = s.recv(4024)
s.close()
print 'Received', data
and if the message is 'hello', it fails, but if the message is 'hello\n', everything is fine. But I don't know how to tell the tracker or xinetd to add this '\n' at the end of messages. How can I read the data from sys.stdin without EOF or EOL in it?
Simple:
data=sys.stdin.read().splitlines()
for i in data:
    print i
No newlines
sys.stdin.readline() waits forever until it receives a newline. Then it considers the current line to be complete and returns it in full. If you want to read data that doesn't contain newlines or you don't want to wait until a newline is received before you process (some of) the data, then you're going to have to use something other than readline. Most likely you should call read, which reads arbitrary data up to a given size.
However, your GPS appears to be sending data in the well-known NMEA format, and that format certainly terminates each line with a newline. Actually, it probably terminates each line with CRLF (\r\n), but it is possible that the \r could be getting munged somewhere before it gets to your TCP socket. Either way there's a \n at the very end of each line.
If your readline call is hanging without returning any lines, most likely it's because the sender is buffering lines until it has a full buffer. If you waited long enough for the sender's buffer to fill up, you'd get a whole bunch of lines at once. If that's what's happening, you'll have to change the sender so that it flushes its send buffer after each NMEA sentence.
It seems you are receiving # instead of <CR><LF>, just read until the # sign.
data = ""
while len(data) == 0 or data[-1] <> '#':
    data += sys.stdin.read(1)
#do something with data
print 'ok'
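The same loop in Python 3 form, reading bytes and guarding against end of input so it cannot spin forever. This is only a sketch: read_until is a hypothetical helper, and io.BytesIO stands in for sys.stdin.buffer so the example can run without xinetd.

```python
import io

def read_until(stream, delimiter=b'#'):
    """Read one byte at a time until `delimiter` or end of input."""
    data = bytearray()
    while not data.endswith(delimiter):
        byte = stream.read(1)
        if not byte:
            break  # EOF: the sender closed the connection without a delimiter
        data += byte
    return bytes(data)

# In the real script this would be sys.stdin.buffer.
fake_stdin = io.BytesIO(
    b'$GPRMC,132017.000,A,8251.5039,N,01040.0065,E,0.00,,010111,0,,A*75+79161234567#'
    b'$GPRMC,partial'
)
sentence = read_until(fake_stdin)
leftover = read_until(fake_stdin)
```

The first call returns the full sentence including the trailing '#'; the second returns whatever arrived before the input ended.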
My solution:
var = sys.stdin.readline().replace('\n', '')
It:
finds the newline in the input,
replaces it with '' (i.e. removes it),
assigns the result to the variable.

TCP Socket file transfer

I'm trying to write a secure file transfer program using Python and AES, and I've got a problem I don't totally understand. I send my file by splitting it into 1024-byte chunks and sending them over, but the server side that receives the data crashes (I use AES CBC, therefore my data length must be a multiple of 16 bytes), and the error I get says that it is not.
I printed the length of the data sent on the client side and the length of the data received on the server. The client sends exactly 1024 bytes each time, as it's supposed to, but the server shows that at some point a received packet is shorter than 1024 bytes (for example, 743 bytes).
I tried putting a time.sleep(0.5) between each socket send on the client side and it seems to work. Is it possible that this is some kind of socket buffer failure on the server side? That the client sends too much data too fast and somehow breaks the socket buffer on the server, so that data is corrupted or vanishes and recv(1024) only receives a broken chunk? That's the only thing I could think of, but it may also be completely wrong. If anyone has an idea why this is not working properly, that would be great ;)
Following my idea, I tried:
self.s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 32768000)
print socket.SO_RCVBUF
I tried to set a 32 MB buffer on the server side, but on Windows XP the print shows 4098 and on Linux it shows only 8. I don't know how to interpret this; the only thing I know is that it doesn't seem to have a 32 MB buffer, so the code doesn't work.
Well, it's been a really long post; I hope some of you had the courage to read it all the way here! I'm totally lost, so if anyone has any idea about this, please share it :D
Thanks to Faisal my code is here :
Server Side: ( count is my filesize/1024 )
while 1:
    txt=self.s.recv(1024)
    if txt == " ":
        break
    txt = self.cipher.decrypt(txt)
    if countbis == count:
        txt = txt.rstrip()
    tfile.write(txt)
    countbis+=1
Client side :
while 1:
    txt= tfile.read(1024)
    if not txt:
        self.s.send(" ")
        break
    txt += ' ' * (-len(txt) % 16)
    txt = self.cipher.encrypt(txt)
    self.s.send(txt)
Thanks in advance,
Nolhian
Welcome to network programming! You've just fallen into the same mistaken assumption that everyone makes the first time through: that client sends and server receives should be symmetric. Unfortunately, this is not the case. The OS allows reception to occur in arbitrarily sized chunks. It's fairly easy to work around, though: just buffer your data until the amount you've read equals the amount you wish to receive. Something along the lines of this will do the trick:
buff=''
while len(buff) < 1024:
    buff += s.recv( 1024 - len(buff) )
TCP is a stream protocol, it doesn't conserve message boundaries, as you have just discovered.
As others have pointed out you're probably processing an incomplete message. You need to either have fixed sized messages or have a delimiter (don't forget to escape your data!) so you know when a complete message has been received.
What TCP can guarantee is that all your data arrives, in the right order, at some point. (Unless something unexpected happens, in which case it won't arrive.) But it's very possible that the data you send will still arrive in chunks, largely because of limited send and receive buffers. What you should do is keep calling recv until you have enough data to process. You might also have to call send multiple times; use its return value to keep track of how much data has been sent/buffered so far.
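The point about send's return value can be sketched as follows in Python 3. send_all is a hypothetical helper written out for illustration (the built-in socket.sendall() already performs this loop), and socket.socketpair() stands in for the client and server:

```python
import socket

def send_all(sock, payload):
    """Call send() repeatedly until every byte of `payload` is handed to the OS."""
    view = memoryview(payload)
    total = 0
    while total < len(payload):
        sent = sock.send(view[total:])  # may accept fewer bytes than offered
        if sent == 0:
            raise ConnectionError('socket connection broken')
        total += sent
    return total

a, b = socket.socketpair()
payload = b'x' * 4096
sent_count = send_all(a, payload)
a.close()

# Receiver side: keep calling recv() until the peer closes the connection.
received = bytearray()
while True:
    chunk = b.recv(1024)
    if not chunk:
        break
    received += chunk
b.close()
```

The receiver here needs several recv calls to collect what the sender wrote, which is the same asymmetry the question ran into.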
When you do print socket.SO_RCVBUF, you actually print the symbolic SO_RCVBUF constant (except that Python doesn't really have constants), the one used to tell setsockopt what you want to change. To get the current value, you should instead call getsockopt.
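A short Python 3 sketch of the difference between the option name and the option value. The requested size of 262144 bytes is an arbitrary figure chosen for illustration, and the numbers printed vary by operating system:

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# The option *name*: a small integer that identifies SO_RCVBUF to the OS.
option_id = socket.SO_RCVBUF

# The option *value*: the receive buffer size the OS is actually using.
before = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

# Ask for a larger buffer; the OS may clamp or adjust the request.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 262144)
after = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

print('option id:', option_id, 'before:', before, 'after:', after)
sock.close()
```

The value reported afterwards need not equal the value requested, so it is worth reading it back rather than assuming the request was honoured.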
Not related to TCP (as that has been answered already), but appending to a string repeatedly will be rather inefficient if you're expecting to receive a lot. It might be better to append to a list and then turn the list into a string when you finished receiving by using ''.join(list).
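A minimal Python 3 sketch of the list-and-join approach. recv_until_closed is a hypothetical helper name, and socket.socketpair() stands in for a real connection so the example runs on its own:

```python
import socket

def recv_until_closed(conn, bufsize=8192):
    """Collect chunks in a list and join once, avoiding repeated concatenation."""
    chunks = []
    while True:
        piece = conn.recv(bufsize)
        if not piece:
            break  # peer closed the connection
        chunks.append(piece)
    return b''.join(chunks)

a, b = socket.socketpair()
for part in (b'alpha', b'beta', b'gamma'):
    a.sendall(part)
a.close()
data = recv_until_closed(b)
b.close()
```

In Python 3 the chunks are bytes, so the join uses b'' rather than ''.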
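For many applications, the complexities of TCP are neatly abstracted by Python's asynchat module (deprecated since Python 3.6 and removed in Python 3.12; asyncio streams cover the same ground in current versions).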
Here is the nice snippet of code that I wrote some time ago, may be not the best , but it could be good example of big files transfer over the local network. http://setahost.com/sending-files-in-local-network-with-python/
As mentioned above
TCP is a stream protocol
You can try this code, where the data is your original data, you can read it from the file or user input
Sender
import socket as s
import time as t
sock = s.socket(s.AF_INET, s.SOCK_STREAM)
sock.connect((addr,5000))  # addr is the receiver's address
sock.sendall(data)
finish = t.time()
Receiver
import socket as s
sock = s.socket(s.AF_INET, s.SOCK_STREAM)
sock.setsockopt(s.SOL_SOCKET, s.SO_REUSEADDR, 1)
sock.bind(("", 5000))
sock.listen(1)
conn, _ = sock.accept()
pack = []
while True:
    piece = conn.recv(8192)
    if not piece:
        break
    pack.append(piece.decode())
