How to read a binary file? - python

I'm trying to send a file between a client and a server in my home network. I just want to test with a simple file, client.txt.
The client program should read X bytes and send them over the TCP socket I've created, but I can't wrap my head around how to do the sending part:
f = open("client.txt", "rb")
while 1:
    # should read X bytes and send to the socket
I think I need to check whether the data I want to send is valid, for instance when the file is smaller than the amount (say 1024 bytes) I'm sending in each batch... or does it not work that way?

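Since you mentioned you have problems setting up the server part, I'll rip this out from the Python documentation and edit it slightly:
import socket
HOST = ''      # empty string: listen on all available interfaces
PORT = 50007   # arbitrary non-privileged port
s = socket.socket()
s.bind((HOST, PORT))
s.listen(1)
conn, addr = s.accept()
f = open("client.txt", "rb")
while 1:
    data = f.read(1024)   # at most 1024 bytes; fewer near the end of the file
    if not data: break    # an empty result means end of file
    conn.sendall(data)    # sendall keeps sending until the whole chunk is out
f.close()
conn.close()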
The relevant document can be found here
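For the other end of the connection, here is a minimal Python 3 sketch of a receiver, assuming the sender simply closes the connection when the file is done (as the code above does). The names receive_to_file and download are made up for illustration and do not come from the documentation.

```python
import socket

def receive_to_file(sock, path, bufsize=1024):
    """Write everything received on sock to path until the peer closes."""
    total = 0
    with open(path, "wb") as out:
        while True:
            data = sock.recv(bufsize)  # returns at most bufsize bytes
            if not data:               # b"" means the peer closed the connection
                break
            out.write(data)
            total += len(data)
    return total

def download(host, port, path):
    """Connect to a sender like the one above and save whatever it sends."""
    with socket.create_connection((host, port)) as sock:
        return receive_to_file(sock, path)
```

Nothing here checks that a chunk is "full": a short final chunk is written as-is, and the empty result is what ends the loop.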

read() takes an optional parameter that specifies the number of bytes to read in.
Documentation
To read a file’s contents, call f.read(size), which reads some quantity of data and returns it as a string. size is an optional numeric argument. When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory. Otherwise, at most size bytes are read and returned. If the end of the file has been reached, f.read() will return an empty string ("").
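A small sketch of that behaviour in Python 3, where a file opened in "rb" mode returns bytes and signals end of file with b"" rather than "". The helper name iter_chunks is an assumption made for this example.

```python
def iter_chunks(path, size=1024):
    """Yield successive blocks of at most size bytes from a binary file."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(size)  # the last block may be shorter than size
            if not chunk:         # b"" signals end of file
                return
            yield chunk
```

Each yielded chunk can be handed straight to a socket's sendall(), so a file smaller than the chunk size needs no special handling.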

Related

Why do these Python send / receive socket functions work if invoked slowly, but fail if invoked quickly in a row?

I have a client and a server, where the server needs to send a number of text files to the client.
The send file function receives the socket and the path of the file to send:
import os
CHUNKSIZE = 1_000_000
def send_file(sock, filepath):
    with open(filepath, 'rb') as f:
        sock.sendall(f'{os.path.getsize(filepath)}'.encode() + b'\r\n')
        # Send the file in chunks so large files can be handled.
        while True:
            data = f.read(CHUNKSIZE)
            if not data:
                break
            sock.send(data)
And the receive file function receives the client socket and the path where to save the incoming file:
CHUNKSIZE = 1_000_000
def receive_file(sock, filepath):
    with sock.makefile('rb') as file_socket:
        length = int(file_socket.readline())
        # Read the data in chunks so it can handle large files.
        with open(filepath, 'wb') as f:
            while length:
                chunk = min(length, CHUNKSIZE)
                data = file_socket.read(chunk)
                if not data:
                    break
                f.write(data)
                length -= len(data)
        if length != 0:
            print('Invalid download.')
        else:
            print('Done.')
It works by sending the file size as the first line, then sending the text file line by line.
Both are invoked in loops in the client and the server, so that files are sent and saved one by one.
It works fine if I put a breakpoint and invoke these functions slowly. But if I let the program run uninterrupted, it fails when reading the size of the second file:
File "/home/stark/Work/test/networking.py", line 29, in receive_file
length = int(file_socket.readline())
ValueError: invalid literal for int() with base 10: b'00,1851,-34,-58,782,-11.91,13.87,-99.55,1730,-16,-32,545,-12.12,19.70,-99.55,1564,-8,-10,177,-12.53,24.90,-99.55,1564,-8,-5,88,-12.53,25.99,-99.55,1564,-8,-3,43,-12.53,26.54,-99.55,0,60,0\r\n'
Clearly a lot more data is being received by that length = int(file_socket.readline()) line.
My questions: why is that? Shouldn't that line read only the size given that it's always sent with a trailing \n?
How can I fix this so that multiple files can be sent in a row?
Thanks!
It seems you are reusing the same connection for every file. Because file_socket is buffered, it has actually received more from your socket than your read loop consumed.
In other words, the buffered reader pulls extra data off the socket (often the start of the next file) and that data is lost when the wrapper is closed. The next time you call readline(), you start somewhere in the middle of the stream and get the rest of a line of file data instead of the next length line.
So the real problem happens earlier than the traceback suggests: part of the stream has already been skipped, which is why the next line you read is not the int you expected.
You can say:
with sock.makefile('rb', buffering=0) as file_socket:
instead, to force the file-like access to be unbuffered. Or handle the receiving, buffering and parsing of incoming bytes yourself (working out where one file ends and the next one begins) instead of relying on the file-like wrapper and readline().
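Another way to avoid losing read-ahead data, sketched here as an assumption rather than as part of the original code, is to create the buffered reader once per connection and pass it to every call. The changed receive_file signature (a reader instead of a socket) and the boolean return value are choices made for this example.

```python
import os
import socket

CHUNKSIZE = 1_000_000

def send_file(sock, filepath):
    """Send the size on its own line, then the raw bytes of the file."""
    with open(filepath, 'rb') as f:
        sock.sendall(f'{os.path.getsize(filepath)}\r\n'.encode())
        while True:
            data = f.read(CHUNKSIZE)
            if not data:
                break
            sock.sendall(data)  # sendall retries until the chunk is fully sent

def receive_file(reader, filepath):
    """Read one file from reader, a long-lived sock.makefile('rb') object."""
    length = int(reader.readline())  # raises ValueError if the stream is empty
    with open(filepath, 'wb') as f:
        while length:
            data = reader.read(min(length, CHUNKSIZE))
            if not data:
                break  # connection closed before the whole file arrived
            f.write(data)
            length -= len(data)
    return length == 0
```

The caller would create reader = sock.makefile('rb') once and reuse it for every file, so whatever the reader has buffered beyond the current file is still there for the next readline().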
Keep in mind that socket communication here runs over TCP/IP, whether both ends are on the same machine (in which case the loopback interface is used) or on different machines. A connection is established between two IP addresses, and the data goes through the network stack and usually the network adapter, which is slow compared with accessing RAM. The lower ISO/OSI layers also decide when individual frames are actually sent. TCP requires acknowledgements, but a standard PC is not running an industrial, real-time Ethernet, so timing is not deterministic.
In your code there is a while True loop without any sleep, and you don't check what sock.send returns. If only part of a chunk is accepted, you ignore that and move on to the next chunk. At first glance it looks as if something had been cached and the receiver got whatever was flushed once the connection was re-established.
So the first thing to do is check the number of bytes sock.send returned. If it is less than the length of the data, I believe the remainder has to be sent again (sock.sendall does this for you). I also strongly recommend designing a small custom protocol (the application layer, in OSI/ISO terms). For example, you might have four frame types, START, FILESIZE, DATA and END, each with a unique ID that begins the frame. START is empty, FILESIZE contains a single uint16, DATA contains {FILE NUMBER, LINE NUMBER, LINE_LENGTH, LINE}, and END is empty. Once the client has received an entire frame, it can safely assemble the information.
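As a rough illustration of such a framed protocol, here is a sketch using the struct module. The exact layout (a 1-byte frame type followed by a 4-byte payload length), the numeric IDs and the helper names are all assumptions made for this example, not a format defined anywhere above.

```python
import io
import struct

# Hypothetical frame type IDs for this sketch.
START, FILESIZE, DATA, END = 1, 2, 3, 4
HEADER = struct.Struct("!BI")  # 1-byte frame type, 4-byte payload length

def pack_frame(frame_type, payload=b""):
    """Build one frame: fixed-size header followed by the payload."""
    return HEADER.pack(frame_type, len(payload)) + payload

def read_exact(stream, n):
    """Read exactly n bytes from a file-like object or raise EOFError."""
    buf = b""
    while len(buf) < n:
        part = stream.read(n - len(buf))
        if not part:
            raise EOFError("stream ended in the middle of a frame")
        buf += part
    return buf

def read_frame(stream):
    """Return (frame_type, payload) for the next frame in the stream."""
    frame_type, length = HEADER.unpack(read_exact(stream, HEADER.size))
    return frame_type, read_exact(stream, length)
```

On a real connection, stream could be a single sock.makefile('rb') object kept open for the lifetime of the connection; io.BytesIO works the same way for trying it out without a network.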

how to send UDP data/packet using Socket programming without converting it into bytes using python?

I tried to create a UDP socket and succeeded.
But when I tried to send some raw data, socket.send wanted me to convert it into bytes.
This raw data is a command that changes the time of the application I am working on, so I wanted to send the data as it is.
Is there a way to send this without converting it into bytes?
Here's the code I used:
Socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
bufferSize = 1024
recvBuf = 4096
new_pkt= x00\x1c\x7fb\xb5\xfd\x00PV\xb8\x08\x9f\x08\x00E\x00\x000B\xad\x00\x00\x80\x11\x00\x00\n\xe7\xa0\xc6\n\xe7\x922\xc0\xb8\x05\xdc\x00\x1cH\xf4\t\x8d\x01\x00\x01\x01\x00\x10\x00\x00\x01,\x00\x00\x00\x004\xe6\xc0
S = Socket.connect_ex(("10.231.146.50",port))
Socket.settimeout(10)
Res = Socket.sendall(new_pkt)
Socket.close()
I got an error telling me to convert the packet into bytes when I tried to send it.
I get no error with Python 2.7, for send or sendall, if I put double quotes around your data string:
new_pkt= "x00\x1c\x7fb\xb5\xfd\x00PV\xb8\x08\x9f\x08\x00E\x00\x000B\xad\x00\x00\x80\x11\x00\x00\n\xe7\xa0\xc6\n\xe7\x922\xc0\xb8\x05\xdc\x00\x1cH\xf4\t\x8d\x01\x00\x01\x01\x00\x10\x00\x00\x01,\x00\x00\x00\x004\xe6\xc0"
Give it a try ;-)
If you use Python3.x, you need bytes. So you just prefix your string with a b:
new_pkt_for_py3 = b"x00\x1c\x7fb\xb5\xfd...."
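To make the Python 3 side concrete: a socket only ever carries bytes, so a bytes literal is already "as it is", and only str needs converting. The sketch below uses two made-up helpers, to_payload and send_datagram. Note that text typed as a str with \x escapes should be encoded with latin-1, which maps code points 0 to 255 onto single bytes, because UTF-8 would turn every value above \x7f into two bytes.

```python
import socket

def to_payload(data, encoding="utf-8"):
    """Return data as bytes: bytes-like objects pass through, str is encoded."""
    if isinstance(data, str):
        return data.encode(encoding)
    return bytes(data)

def send_datagram(payload, address):
    """Send one UDP datagram and return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(to_payload(payload), address)
```

If the command is available as a hex dump, bytes.fromhex("001c7f") is another way to build the same kind of payload.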

Python Socket is receiving inconsistent messages from Server

So I am very new to networking and I was using the Python Socket library to connect to a server that is transmitting a stream of location data.
Here is the code used.
import socket
BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('gump.gatech.edu', 756))
try:
    while (1):
        data = s.recv(BUFFER_SIZE).decode('utf-8')
        print(data)
except KeyboardInterrupt:
    s.close()
The issue is that the data arrives in inconsistent forms.
Most of the times it arrives in the correct form like this:
2016-01-21 22:40:07,441,-84.404153,33.778685,5,3
Yet other times it can arrive split up into two lines like so:
2016-01-21
22:40:07,404,-84.396004,33.778085,0,0
The interesting thing is that when I establish a raw connection to the server using Putty I only get the correct form and never the split. So I imagine that there must be something happening that is splitting the message. Or something Putty is doing to always assemble it correctly.
What I need is for the variable data to contain the proper line always. Any idea how to accomplish this?
It is best to think of a socket as a continuous stream of data that may arrive in dribs and drabs, or as a flood.
In particular, it is the receiver's job to break the data up into the "records" it consists of; the socket does not magically know how to do this for you. Here the records are lines, so you must read the data and split it into lines yourself.
You cannot guarantee that a single recv will be a single full line. It could be:
just part of a line;
or several lines;
or, most probably, several lines and another part line.
Try something like: (untested)
# we'll use this to collate partial data
data = ""
while 1:
    # receive the next batch of data
    chunk = s.recv(BUFFER_SIZE)
    if not chunk:
        # an empty result means the server closed the connection
        break
    data += chunk.decode('utf-8')
    # split the data into lines
    lines = data.splitlines(keepends=True)
    # the last of these may be a part line
    full_lines, last_line = lines[:-1], lines[-1]
    # print (or do something else!) with the full lines
    for l in full_lines:
        print(l, end="")
    # was the last line received a full line, or just half a line?
    if last_line.endswith("\n"):
        # print it (or do something else!)
        print(last_line, end="")
        # and reset our partial data to nothing
        data = ""
    else:
        # reset our partial data to this part line
        data = last_line
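The same idea can be packaged as a generator so that the variable the caller sees always holds one complete line. This is a sketch under the assumption that records end with "\n" (optionally preceded by "\r"); the name iter_lines is made up for this example. It buffers raw bytes and decodes only complete lines, which also avoids decoding half of a multi-byte character.

```python
import socket

def iter_lines(sock, bufsize=1024):
    """Yield complete lines from a stream socket, however recv() splits them."""
    buffer = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # peer closed the connection
            break
        buffer += chunk
        # Everything before the last "\n" is complete; the rest stays buffered.
        *complete, buffer = buffer.split(b"\n")
        for line in complete:
            yield line.decode("utf-8").rstrip("\r")
    if buffer:
        # Data left over without a trailing newline when the peer closed.
        yield buffer.decode("utf-8").rstrip("\r")
```

A caller would then write for line in iter_lines(s): and process each record, without caring how the data was split on the wire.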
The easiest way to fix your code is to print the received data without adding a new line, which the print statement (Python 2) and the print() function (Python 3) do by default. Like this:
Python 2:
print data,
Python 3:
print(data, end='')
Now print will not add its own newline character to the end of each printed value, and only the newlines present in the received data will be printed. The result is that each line is printed without being split according to the amount of data returned by each socket.recv() call. For example:
from __future__ import print_function
import socket
s = socket.socket()
s.connect(('gump.gatech.edu', 756))
while True:
    data = s.recv(3).decode('utf8')
    if not data:
        break  # socket closed, all data read
    print(data, end='')
Here I have used a very small buffer size of 3 which helps to highlight the problem.
Note that this only fixes the problem from the POV of printing the data. If you wanted to process the data line-by-line then you would need to do your own buffering of the incoming data, and process the line when you receive a new line or the socket is closed.
Edit:
socket.recv() is blocking and, like the others said, you won't get an exact line each time you call the method. The socket waits for data, takes what it can get and then returns. When you print this, because of Python's default end argument, you may get more newlines than you expected. So to get the raw stuff from your server, use this:
import socket
BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('gump.gatech.edu', 756))
try:
    while (1):
        data = s.recv(BUFFER_SIZE).decode('utf-8')
        if not data: break
        print(data, end="")
except KeyboardInterrupt:
    s.close()

Receive image in Python

The following code is for a python server that can receive a string.
import socket
TCP_IP = '127.0.0.1'
TCP_PORT = 8001
BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((TCP_IP, TCP_PORT))
s.listen(1)
conn, addr = s.accept()
print 'Connection address:', addr
while 1:
    length = conn.recv(1027)
    data = conn.recv(int(length))
    import StringIO
    buff = StringIO.StringIO()
    buff.write(data)
    if not data: break
    print "received data:", data
    conn.send('Thanks')  # echo
    get_result(buff)
conn.close()
Can anyone help me to edit this code or create a similar one to be able to receive images instead of string?
First, your code actually can't receive a string. Sockets are byte streams, not message streams.
This line:
length = conn.recv(1027)
… will receive anywhere from 1 to 1027 bytes.
You need to loop around each recv and accumulate a buffer, like this:
def recvall(conn, length):
    buf = b''
    while len(buf) < length:
        data = conn.recv(length - len(buf))
        if not data:
            return data
        buf += data
    return buf
Now you can make it work like this:
while True:
    length = recvall(conn, 1027)
    if not length: break
    data = recvall(conn, int(length))
    if not data: break
    print "received data:", data
    conn.send('Thanks')  # echo
You can use StringIO or other techniques instead of concatenation for performance reasons, but I left that out because it's simpler and more concise this way, and understanding the code is more important than performance.
Meanwhile, it's worth pointing out that 1027 bytes is a ridiculously huge amount of space to use for a length prefix. Also, your sending code has to make sure to actually send 1027 bytes, no matter what. And your responses always have to be exactly 6 bytes long for this to work.
def send_string(conn, msg):
    conn.sendall(str(len(msg)).ljust(1027))
    conn.sendall(msg)
    response = recvall(conn, 6)
    return response
But at least now it is workable.
So, why did you think it worked?
TCP is a stream of bytes, not a stream of messages. There's no guarantee that a single send from one side will match up with the next recv on the other side. However, when you're running both sides on the same computer, sending relatively small buffers, and aren't loading the computer down too badly, they will often happen to match up 1-to-1. After all, each time you call recv, the other side has probably only had time to send one message, which is sitting in the OS's buffers all by itself, so the OS just gives you the whole thing. So, your code will appear to work in initial testing.
But if you send the message through a router to another computer, or if you wait long enough for the other side to make multiple send calls, or if your message is too big to fit into a single buffer, or if you just get unlucky, there could be 2-1/2 messages waiting in the buffer, and the OS will give you the whole 2-1/2 messages. And then your next recv will get the leftover 1/2 message.
So, how do you make this work for images? Well, it depends on what you mean by that.
You can read an image file into memory as a sequence of bytes, and call send_string on that sequence, and it will work fine. Then the other side can save that file, or interpret it as an image file and display it, or whatever it wants.
Alternatively, you can use something like PIL to parse and decompress an image file into a bitmap. Then, you encode the header data (width, height, pixel format, etc.) in some way (e.g., pickle it), send_string the header, then send_string the bitmap.
If the header has a fixed size (e.g., it's a simple structure that you can serialize with struct.pack), and contains enough information for the other side to figure out the length of the bitmap in bytes, you don't need to send_string each one; just use conn.sendall(serialized_header) then conn.sendall(bitmap).
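A compact Python 3 sketch of the fixed-size header idea: a 4-byte length prefix built with struct, followed by the raw bytes of the image file. The names send_msg and recv_msg, and the choice of a 4-byte unsigned prefix, are assumptions made for this example; this recvall variant returns None on an early close so that an empty message stays distinguishable.

```python
import socket
import struct

LENGTH = struct.Struct("!I")  # 4-byte unsigned length, network byte order

def recvall(conn, length):
    """Receive exactly length bytes, or None if the peer closes early."""
    buf = bytearray()
    while len(buf) < length:
        data = conn.recv(length - len(buf))
        if not data:
            return None
        buf += data
    return bytes(buf)

def send_msg(conn, payload):
    """Send one message: length prefix, then the payload bytes."""
    conn.sendall(LENGTH.pack(len(payload)) + payload)

def recv_msg(conn):
    """Return the next message, or None when the connection is closed."""
    header = recvall(conn, LENGTH.size)
    if header is None:
        return None
    (length,) = LENGTH.unpack(header)
    return recvall(conn, length)
```

The sender would read the image with open(path, "rb") and pass the bytes to send_msg; the receiver writes the result of recv_msg to a file opened with "wb", or hands it to an image library.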

Python sockets - sending string in chunks of 10 bytes

I have simple tcp socket program and I would like to send strings in chunks of 10 bytes. The server will join the chunks.
However I'm not sure how to split a string into binary and how to send the chunks of binaries. Instead of sending 512 bytes at one time I want to send 10 byte several times.
I have found a module Pickle that can serialize data into bytestrings (?) but how do I apply socket.send() on this?
Server:
server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.bind(("", my_port))
server_socket.listen(5)
client_socket, address = server_socket.accept()
data = client_socket.recv(512)
Client:
message = "some string I want to send in chunks"
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect((host, my_port))
client_socket.send(message)
client_socket.close()
First of all, your code is not actually sending 512 bytes at a time, it's receiving 512 bytes at a time.
Now, I think you're really asking two questions.
1. How do you send 10 bytes at a time over a TCP connection (you are using socket.SOCK_STREAM)?
2. How do you send raw bytes using a Python socket?
Let's answer 2. first: If you call socket.send on a byte string, it should be sent out in binary as TCP payload.
For 1., the simplest approach is to split the data into chunks. Since you're operating on strings, you can do that with slice operations (see the Python tutorial on strings), e.g. s[0:10], s[10:20], etc. Next, you need to ensure these slices are sent individually. You could call socket.send for each one, but the problem is that your TCP/IP stack may group them into packets even if you don't want it to; you have, after all, asked it for a stream socket, socket.SOCK_STREAM. If you were writing to a file, you'd flush it at this point, but this does not appear to be easy for Python sockets (see this question).
Now, I think the answer to that question, saying it's impossible, is not quite right. It appears that Scapy will let you send 10-byte TCP chunks (I got the chunks() function from here). I checked it in Wireshark and tried it multiple times with consistent results, but I didn't check the implementation to make sure this is guaranteed to happen.
You should probably ask yourself why you want to send data in chunks of 10 bytes, rather than let TCP deal with what it was designed for, and consider using delimiters.
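For comparison, here is a plain-socket Python 3 sketch of the slicing approach. It makes one sendall() call per 10-byte slice but, as noted above, gives no guarantee about how TCP packs those bytes into packets; the receiver simply joins whatever arrives. The function names are made up for this example.

```python
import socket

def chunks(data, n):
    """Yield successive n-sized slices of data."""
    for i in range(0, len(data), n):
        yield data[i:i + n]

def send_in_chunks(sock, message, n=10):
    """Encode message and hand it to the socket n bytes at a time."""
    payload = message.encode("utf-8")
    for chunk in chunks(payload, n):
        sock.sendall(chunk)

def recv_until_closed(sock, bufsize=512):
    """Collect everything the peer sends until it closes the connection."""
    parts = []
    while True:
        data = sock.recv(bufsize)
        if not data:
            break
        parts.append(data)
    return b"".join(parts).decode("utf-8")
```

The server side does not need to know the chunk size at all, which is the point of treating the connection as a stream.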
Anyway, here's the code using Scapy (fun fact: it looks like running this does not require root privileges):
client:
from scapy.all import *
import socket
host = '192.168.0.x'  # replace with your IP
my_port = 3002
message = "some string I want to send in chunks"
def chunks(lst, n):
    "Yield successive n-sized chunks from lst"
    for i in xrange(0, len(lst), n):
        yield lst[i:i+n]
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect((host, my_port))
ss = StreamSocket(client_socket, Raw)
for chunk in chunks(message, 10):
    print "sending: " + chunk
    ss.send(Raw(chunk))
client_socket.close()
server:
import socket
my_port = 3002
server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.bind(("", my_port))
server_socket.listen(5)
client_socket, address = server_socket.accept()
while (True):
    data = client_socket.recv(512)
    if (data):
        print data
    else:
        break
