socket.shutdown vs socket.close

I recently saw a bit of code that looked like this (with sock being a socket object of course):
sock.shutdown(socket.SHUT_RDWR)
sock.close()
What exactly is the purpose of calling shutdown on the socket and then closing it? If it makes a difference, this socket is being used for non-blocking IO.

close and shutdown have two different effects on the underlying socket.
The first thing to point out is that the socket is a resource in the underlying OS, and multiple processes can have a handle for the same underlying socket.
When you call close, it decrements the handle count by one. If the handle count has reached zero, the socket and its associated connection go through the normal close procedure (effectively sending a FIN / EOF to the peer) and the socket is deallocated.
The thing to pay attention to here is that if the handle count does not reach zero because another process still has a handle to the socket, then the connection is not closed and the socket is not deallocated.
On the other hand, calling shutdown for reading and writing closes the underlying connection and sends a FIN / EOF to the peer regardless of how many processes have handles to the socket. However, it does not deallocate the socket, and you still need to call close afterward.
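The handle-count behaviour described above can be observed in a small experiment. The following is only an illustrative sketch: it assumes Python 3 on a Unix-like system, uses socket.socketpair() as a stand-in for a real TCP connection, and uses dup() to imitate a second process holding a handle. The helper name peer_sees_eof is made up for this example.

```python
import socket

def peer_sees_eof(sock):
    """Return True if a non-blocking recv() on sock reports end-of-stream."""
    sock.setblocking(False)
    try:
        return sock.recv(1) == b""
    except BlockingIOError:
        return False  # no data and no EOF yet: the connection is still open

a, b = socket.socketpair()
a_dup = a.dup()                      # second handle to the same underlying socket

a.close()                            # drops one handle; a_dup keeps the connection alive
after_close = peer_sees_eof(b)

a_dup.shutdown(socket.SHUT_RDWR)     # ends the connection regardless of other handles
after_shutdown = peer_sees_eof(b)

a_dup.close()                        # shutdown did not free the socket; close still needed
b.close()
```

In this setup the peer notices nothing after the first close, and only sees end-of-stream once shutdown is called on the remaining handle.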

Here's one explanation:
Once a socket is no longer required, the calling program can discard the socket by applying a close subroutine to the socket descriptor. If a reliable delivery socket has data associated with it when a close takes place, the system continues to attempt data transfer. However, if the data is still undelivered, the system discards the data. Should the application program have no use for any pending data, it can use the shutdown subroutine on the socket prior to closing it.

Explanation of shutdown and close: Graceful shutdown (msdn)
Shutdown (in your case) indicates to the other end of the connection there is no further intention to read from or write to the socket. Then close frees up any memory associated with the socket.
Omitting shutdown may cause the socket to linger in the OS's stack until the connection has been closed gracefully.
IMO the names 'shutdown' and 'close' are misleading, 'close' and 'destroy' would emphasise their differences.

It's mentioned right in the Socket Programming HOWTO (py2/py3)
Disconnecting
Strictly speaking, you’re supposed to use shutdown on a socket before you close it.
The shutdown is an advisory to the socket at the other end. Depending on the argument you pass it, it can mean “I’m not going to send anymore, but I’ll still listen”, or “I’m not listening, good riddance!”.
Most socket libraries, however, are so used to programmers neglecting to use this piece of etiquette that normally a close is the same as shutdown(); close().
So in most situations, an explicit shutdown is not needed.
...
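The "I'm not going to send anymore, but I'll still listen" case from the HOWTO corresponds to shutdown(SHUT_WR). A minimal sketch, assuming Python 3 and using socket.socketpair() in place of a real client/server connection; read_until_eof is a hypothetical helper, not part of the socket module.

```python
import socket

def read_until_eof(sock):
    """Collect everything the peer sends until recv() returns b'' (EOF)."""
    chunks = []
    while True:
        data = sock.recv(1024)
        if not data:
            return b"".join(chunks)
        chunks.append(data)

client, server = socket.socketpair()

client.sendall(b"request")
client.shutdown(socket.SHUT_WR)        # done sending, but still listening

request = read_until_eof(server)       # the half-close shows up as EOF here
server.sendall(b"reply to " + request)
server.close()

reply = read_until_eof(client)         # the read side of client still works
client.close()
```

The half-close doubles as an end-of-message marker: the server knows the request is complete without any length prefix or delimiter.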

Isn't this code above wrong?
The close call directly after the shutdown call might make the kernel discard all outgoing buffers anyway.
According to
http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable
one needs to wait between the shutdown and the close until read returns 0.
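One way to read the advice in that article is: shut down only the write side, keep reading until the peer's EOF arrives, and close after that. The sketch below assumes Python 3; graceful_close and its timeout parameter are illustrative names, and a real application would probably want to handle the drained data rather than discard it.

```python
import socket

def graceful_close(sock, timeout=5.0):
    """Half-close, drain until the peer's EOF (or a timeout), then close."""
    try:
        sock.shutdown(socket.SHUT_WR)    # send FIN after any queued data
        sock.settimeout(timeout)         # do not wait forever for the peer
        while sock.recv(4096):           # b"" means the peer has closed too
            pass
    except OSError:
        pass                             # peer already gone, or the wait timed out
    finally:
        sock.close()
```

The timeout is a design choice: without it, a peer that never closes its side would keep this function waiting indefinitely.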

There are several flavours of shutdown: http://msdn.microsoft.com/en-us/library/system.net.sockets.socket.shutdown.aspx. *nix is similar.

shutdown(1) forces the socket not to send any more data.
This is useful for:
1- Buffer flushing
2- Strange error detection
3- Safeguarding
Let me explain. When you send data from A to B, it is not guaranteed to have reached B; it is only guaranteed to have reached A's OS buffer, which in turn sends it to B's OS buffer.
So by calling shutdown(1) on A, you flush A's buffer, and an error is raised if the buffer is not empty, i.e. data has not been sent to the peer yet.
However, this is irreversible, so do it only after you have sent all your data and want to be sure that it has at least reached the peer's OS buffer.

Related

What is the correct way to close a raw socket in python?

I have a script that creates raw PACKET sockets to capture all incoming traffic, and I want to make sure that when the script finishes, the sockets are closed.
From python socket documentation, I understand that shutdown() and close() methods should be used to close a socket in a timely fashion.
However, I guess that for this type of socket, the SHUT_RD, SHUT_WR and SHUT_RDWR modes cannot be used, which renders shutdown() unusable.
On occasion, when I used close() only, the script hung and I had to wait a long time for the socket to actually close.
My question is: On Linux, how can I close a raw socket immediately, if I no longer need it?

What is the timeout value for a non-blocking Python socket when reading?

I'm using Python 2.7 and am working with some legacy code. It sets a socket in non-blocking mode with:
self._socket.setblocking(0)
self._socket.settimeout(0)
My question is, when doing a read, what determines the timeout on the socket? Will it be the default used by the TCP stack on the OS? If so, on Linux how would that be changed? Also, will the write timeout be the same as the read timeout?

According to the documentation (emphasis mine):
In non-blocking mode, if a recv() call doesn’t find any data, or if a
send() call can’t immediately dispose of the data, a error exception
is raised.
So it seems as though the "timeout" is an instantaneous check. If there is no data available or a write can't be made exactly when you call the function, you will receive an exception.
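The "instantaneous check" can be seen directly. This sketch assumes Python 3 (where the exception is BlockingIOError rather than Python 2.7's socket.error) and uses socket.socketpair() so that no server is needed.

```python
import socket

a, b = socket.socketpair()
a.setblocking(False)              # equivalent to a.settimeout(0)

try:
    a.recv(1024)                  # nothing has been sent yet
    outcome = "got data"
except BlockingIOError:           # EAGAIN / EWOULDBLOCK
    outcome = "would block"

timeout_value = a.gettimeout()    # the timeout a non-blocking socket reports

a.close()
b.close()
```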

Following is from the help of socket.settimeout
settimeout(...) method of socket._socketobject instance
settimeout(timeout)
Set a timeout on socket operations. 'timeout' can be a float,
giving in seconds, or None. Setting a timeout of None disables
the timeout feature and is equivalent to setblocking(1).
Setting a timeout of zero is the same as setblocking(0).
So I am not sure why both setblocking(0) and settimeout(0) are called above; either one is enough. It means a read returns immediately with EAGAIN or EWOULDBLOCK if there's no data available. When the other end closes the connection, read returns 0 bytes.
A read timeout only makes sense in blocking mode: once the timeout has elapsed and there is still no data to read, the read should return EAGAIN or EWOULDBLOCK.
In general, that is not going to affect the write timeout. When the socket is non-blocking and the write buffers are full (which rarely happens unless the receiver is rather slow), a write that would otherwise block returns immediately with EAGAIN or EWOULDBLOCK, and the onus is on the application to issue the write again.
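For comparison, here is what a non-zero timeout and a closed peer look like from Python. This is a sketch assuming Python 3, where an expired timeout surfaces as a socket.timeout exception and end-of-stream as an empty bytes object; the 50 ms value is arbitrary.

```python
import socket

a, b = socket.socketpair()
a.settimeout(0.05)                # blocking mode with a 50 ms limit

try:
    a.recv(1024)                  # no data arrives within the limit
    timed_out = False
except socket.timeout:            # alias of TimeoutError on Python 3.10+
    timed_out = True

b.close()                         # the other end closes the connection
eof = a.recv(1024)                # returns immediately with b""
a.close()
```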

Python close tcp connection

I have a Python TCP server with a listening thread for every connection.
When I call close on the connection object, a "bad file descriptor" exception is thrown.
By googling I've found some solutions, each using a loop to receive client data and breaking out of that loop when they decide to disconnect the client. My client is written in C# and does not "get" that it's "disconnected" from the server; Python simply ignores incoming data from the C# client.
What's the legit, best-practice way to disconnect a TCP connection from the server side in Python?
Thanks in advance

A bad file descriptor, most likely, means that the socket was already closed by another thread.
Here are some thoughts on general practices. For the client, one way to know that it is disconnected is to check whether recv() returns 0 bytes. If it does, the remote side has closed the connection. Basically, you should use select (or poll) and pass the fds of all the clients and the server to select. If you get a read event on any of the fds, what happens next depends on the fd type. If the fd is the server type, a read event means that there is a pending connection and you should issue an accept() to get the new connection. On the other hand, if the fd is a non-server type (meaning a regular TCP connection), a read event means that there is some data and you should issue a recv() call to read it.
You would use a loop for the select. Basically, start the loop using a select() call and once you get an event, do something with that event, and then reenter the loop and issue the next select().
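The loop described above might look roughly like this. It is a sketch under several assumptions: Python 3, a fixed number of rounds instead of an endless loop (so that it terminates), and a made-up function name select_loop. It only collects data and does not reply to clients.

```python
import select
import socket

def select_loop(server, rounds, timeout=0.2):
    """Run a fixed number of select() rounds; return the data of closed clients."""
    received = {}                                  # client socket -> bytes so far
    completed = []
    for _round in range(rounds):
        readable, _, _ = select.select([server, *received], [], [], timeout)
        for sock in readable:
            if sock is server:                     # read event on the listener:
                conn, _addr = server.accept()      # a connection is pending
                received[conn] = b""
            else:
                data = sock.recv(4096)
                if data:
                    received[sock] += data
                else:                              # b"" means the client closed
                    completed.append(received.pop(sock))
                    sock.close()
    for sock in received:                          # tidy up anything still open
        sock.close()
    return completed

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))                      # port 0: let the OS pick a port
server.listen()

client = socket.create_connection(server.getsockname())
client.sendall(b"hello")
client.close()

messages = select_loop(server, rounds=4)
server.close()
```

A real server would loop indefinitely and would usually also watch for writability before sending large replies.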
You might find these links helpful: http://ilab.cs.byu.edu/python/select/echoserver.html and http://docs.python.org/2/library/socket.html

From the docs:
Note: close() releases the resource associated with a connection but
does not necessarily close the connection immediately. If you want to
close the connection in a timely fashion, call shutdown() before
close().
So you should call shutdown() before calling close(). Also, you should pass the SHUT_RDWR flag to completely shut down the connection:
from socket import SHUT_RDWR
...
try:
    s.shutdown(SHUT_RDWR)
except Exception:
    pass  # e.g. the socket was never connected or is already shut down
finally:
    s.close()  # always release the descriptor, even if shutdown failed
The "bad file descriptor" error means (most likely) that the socket is already closed (at least from the Python side).

What is the correct procedure for multiple, sequential communications over a socket?

I've been struggling along with sockets, making OK progress, but I keep running into problems, and feeling like I must be doing something wrong for things to be this hard.
There are plenty of tutorials out there that implement a TCP client and server, usually where:
The server runs in an infinite loop, listening for and echoing back data to clients.
The client connects to the server, sends a message, receives the same thing back, and then quits.
That I can handle. However, no one seems to go into the details of what you should and shouldn't be doing with sequential communication between the same two machines/processes.
I'm after the general sequence of function calls for doing multiple messages, but for the sake of asking a real question, here are some constraints:
Each event will be a single message client->server, and a single string response.
The messages are pretty short, say 100 characters max.
The events occur relatively slowly, max of say, 1 every 5 seconds, but usually less than half that speed.
and some specific questions:
Should the server be closing the connection after its response, or trying to hang on to the connection until the next communication?
Likewise, should the client close the connection after it receives the response, or try to reuse the connection?
Does a closed connection (either through close() or through some error) mean the end of the communication, or the end of the life of the entire object?
Can I reuse the object by connecting again?
Can I do so on the same port of the server?
Or do I have to reinstantiate another socket object with a fresh call to socket.socket()?
What should I be doing to avoid getting 'address in use' errors?
If a recv() times out, is the socket reusable, or should I throw it away? Again, can I start a new connection with the same socket object, or do I need a whole new socket?

If you know that you will communicate between the two processes soon again, there is no need for closing the connection. If your server has to deal with other connections as well, you want to make it multithreaded, though.
The same. You know that both have to do the same thing, right?
You have to create a new socket on the client and you can also not reuse the socket on the server side: you have to use the new socket returned by the next (clientsocket, address) = serversocket.accept() call. You can use the same port. (Think of webservers, they always accept connections to the same port, from thousands of clients)
In both cases (closing or not closing), you should however have a message terminator, for example a \n. Then you have to read from the socket until you reach the terminator. This usage is so common that Python has a construct for it: socket.makefile and file.readline
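A newline-terminated exchange over one long-lived connection could be sketched as follows. Assumptions: Python 3, socket.socketpair() standing in for the accepted connection, and an upper-casing "server" invented purely for illustration.

```python
import socket

client, server = socket.socketpair()

# Wrap each end in a text file object so that readline() finds the "\n" for us.
client_file = client.makefile("rw", encoding="utf-8", newline="\n")
server_file = server.makefile("rw", encoding="utf-8", newline="\n")

replies = []
for message in ("first event", "second event"):
    client_file.write(message + "\n")
    client_file.flush()                          # push the buffered line out

    request = server_file.readline().rstrip("\n")
    server_file.write(request.upper() + "\n")    # hypothetical response
    server_file.flush()

    replies.append(client_file.readline().rstrip("\n"))

client_file.close()
server_file.close()
client.close()
server.close()
```

Both messages travel over the same connection; nothing is closed or reconnected between events.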
UPDATE:
Post the code. Probably you have not closed the connection correctly.
You can call recv() again.
UPDATE 2:
You should never assume that the connection is reliable, but include mechanisms to reconnect in case of errors. Therefore it is ok to try to use the same connection even if there are longer gaps.
As for errors you get: if you need specific help for your code, you should post small (but complete) examples.

how to unblock a blocked socket?

Synopsis:
My program occasionally runs into a condition where it wants to send data over a socket, but that socket is blocked waiting for a response to a previous command that never came. Is there any way to unblock the socket and pick back up with it when this happens? If not that, how could I test whether the socket is blocked so I could close it and open a new one? (I need blocking sockets in the first place)
Details:
I'm connecting to a server over two sockets. Socket 1 is for general command communication. Socket 2 is for aborting running commands. Aborts can come at any time and frequently. Every command sent over socket 1 gets a response, such as:
socket1 send: set command data
socket1 read: set command ack
There is always some time between the send and the read, as the server doesn't send anything back until the command is finished executing.
To interrupt commands in progress, I connect over another socket and issue an abort command. I then use socket 1 to issue a new command.
I am finding that occasionally commands issued over socket 1 after an abort hang the program. It appears that socket 1 is blocked waiting for a response to a previously issued command that never returned (and that got interrupted). Usually it works, but sometimes it doesn't (I didn't write the server).
In these cases, is there any way for me to check to see if socket 1 is blocked waiting for a read, and if so, abandon that read and move on? Or even any way to check at all so I can close that socket and start again?
thx!
UPDATE 1: thanks for the answers. As for why I'm using blocking sockets, it's because I'm controlling a CNC-type machine with this code, and I need to know when the command I've asked it to execute is done executing. The server returns the ACK when it's done, so that seems like a good way to handle it. I like the idea of refactoring for non-blocking but can't envision a way to get info on when the command is done otherwise. I'll look at select and the other options.

Not meaning to seem disagreeable, but you say you need blocking sockets and then go on to describe some very good reasons for needing non-blocking sockets. I would recommend refactoring to use non-blocking.
Aside from that, the only method I'm aware of to know if a socket is blocked is the fact that your program called recv or one of its variants and has not yet returned. Someone else may know an API that I don't, but setting a "blocked" boolean before the recv call and clearing it afterward is probably the best hack to get you that info. But you don't want to do that. Trust me, the refactor will be worth it in the long run.

The traditional solution to this problem is to use select. Before writing, test whether the socket will support writing, and if not, do something else (such as waiting for a response first). One level above select, Python provides the asyncore module to enable such processing. Two levels above, Twisted is an entire framework dealing with asynchronous processing of messages.
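Testing for writability with select can be sketched like this. It assumes Python 3 and a Unix-like system; can_write is a hypothetical helper, and the send buffer is filled artificially just to show the "not writable" case.

```python
import select
import socket

def can_write(sock, timeout=0.0):
    """Return True if select() reports that sock can accept more outgoing data."""
    _, writable, _ = select.select([], [sock], [], timeout)
    return bool(writable)

a, b = socket.socketpair()
writable_when_idle = can_write(a)

a.setblocking(False)
try:
    while True:
        a.send(b"x" * 65536)      # nobody reads from b, so the buffer fills up
except BlockingIOError:
    pass                          # the send buffer is now full
writable_when_full = can_write(a)

a.close()
b.close()
```

Note that writability only says whether the OS can buffer more outgoing data; it does not say whether the server has answered the previous command.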

Sockets should be full duplex. If Python blocks a thread from writing to a socket while another thread is reading from the same socket I would regard it as a major bug in Python. This doesn't occur in any other programming language I've used.

What you really want is to block on a select() or poll(). The only way to unblock a blocked socket is to receive data or a signal, which is probably not acceptable. A select() or poll() call can block waiting for one or more sockets, either on reading or writing (waiting for buffer space). They can also take a timeout if you want to wait periodically to check on other things. Take a look at my answer to Block Socket with Unix and C/C++ Help
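Applied to the ACK problem in the question, blocking on select() with a timeout might look like the sketch below. It assumes Python 3; wait_for_ack is a hypothetical helper, and socket.socketpair() stands in for the connection to the real server.

```python
import select
import socket

def wait_for_ack(sock, timeout):
    """Return the next chunk from sock, or None if nothing arrives in time."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None               # caller may retry, abort, or reconnect
    return sock.recv(4096)        # b"" here would mean the server closed

machine, controller = socket.socketpair()

missing = wait_for_ack(controller, timeout=0.1)   # no ACK was ever sent
machine.sendall(b"set command ack")
ack = wait_for_ack(controller, timeout=0.1)

machine.close()
controller.close()
```

The program still waits for the ACK before moving on, but a response that never comes now costs one timeout instead of hanging forever.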
