I'm trying to make a python function that scans a range of addresses. I started a socket and pass the socket as an argument to the function that connects to it:
def scan(socket, address, port):
    c = socket.connect_ex((address, port))
    print(c)
Then I call scan for each address, each in its own thread. I'm getting Error 114: Operation already in progress.
Do I need to start a new socket for each connection? I'm trying to read about socket reusage, and I found that there exists flags like SO_ADDREUSE or something like that. I tried to set it, but it didn't work.
I'm trying to think how a socket works. I think the moment I create one, it chooses a tcp source port, and then when I create a connection, it sends to a destination port. I think I can't reuse the same socket because the source port would be the same for all destination ports, so the clients would answer to the same port and would cause confusion.
So do I need to create a new socket for each connection?
You cannot connect a stream socket multiple times.
One of the possible connect errors is EISCONN:
The socket is already connected.
This applies to stream sockets.
man bind also has this:
[EINVAL] The socket is already bound to an address, and the
protocol does not support binding to a new address; or
the socket has been shut down.
Again, this goes for stream sockets.
From the man connect:
Generally, stream sockets may successfully connect() only once; datagram sockets may use connect() multiple times to change their association.
I made emphasis on the important line.
Stream sockets cannot be connected multiple times; datagram sockets can. Generally speaking, BSD sockets have multiple protocols, types, and domains available. You should read the documentation for your particular case.
P.S. Get familiar with the readings suggested in the comments on your question. They explain enough to work with the socket family of functions.
Do I need to start a new socket for each connection?
Yes.
I'm trying to read about socket reusage
There is no such thing as 'socket reusage'. There is port reuse. Not the same thing. You cannot reconnect an existing socket once you've tried to connect it, even if the connect attempt failed.
I found that there exists flags like SO_ADDREUSE or something like that
SO_REUSEADDR means to reuse the port. Not the socket.
I'm trying to think how a socket works. I think the moment I create one, it chooses a tcp source port,
Between creating a socket using the socket() system call and using it to create an outgoing connection with the connect() system call, there is an opportunity to optionally use the bind() system call to set source IP address and/or port if you want to. If you don't use bind(), the operating system will automatically bind the socket to the first available port in the appropriate range when you use the connect() system call. In this case, the source IP address is normally selected to match the network interface that provides the shortest route to the specified destination according to the routing table.
At least, that's how it works at the system call level. Some programming languages or libraries may choose to combine some of these operations into one.
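As a rough illustration of the automatic binding described above, the sketch below connects an unbound client socket to a throwaway listener on the loopback interface and then asks the OS which source address it picked. The helper name show_source_address and the use of a local listener are assumptions made for the demo; they are not part of the original answer.

```python
import socket

def show_source_address():
    """Connect to a throwaway local listener and report the client's source address."""
    # Throwaway listener on the loopback interface; port 0 lets the OS pick a free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # No bind() call here, so the OS assigns the source IP and port during connect().
        client.connect(listener.getsockname())
        return client.getsockname()
    finally:
        client.close()
        listener.close()

source_ip, source_port = show_source_address()
print(source_ip, source_port)
```

Calling client.bind() with an explicit address before connect() would replace the automatic choice, as the answer describes.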
To your actual question, man 7 ip says:
A TCP local socket address that has been bound is unavailable for some
time after closing, unless the SO_REUSEADDR flag has been set. Care
should be taken when using this flag as it makes TCP less reliable.
The idea is to delay the re-use of a port until any possible re-sent copies of packets that belonged to the closed connection have certainly expired on the network.
According to the bind() man page, trying to re-bind a socket that is already bound to an address will result in an EINVAL error. So "recycling" a socket using bind(socket, INADDR_ANY, 0) (after ending a connection that used SO_REUSEADDR) does not seem to be possible.
And even if that were possible, when you're using multiple threads on a modern multi-core system, you end up (very probably) doing multiple things in parallel. A socket can be used for just one outgoing connection at a time. Each of your scan threads will need its own socket.
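Putting the answers together, a minimal sketch of the scanner might create a fresh socket inside scan() instead of receiving one as an argument. The scan_ports helper, the timeout value, and the results dictionary are illustrative assumptions, not something the original posts specify.

```python
import socket
import threading

def scan(address, port, timeout=1.0):
    """Try one TCP connection on a fresh socket; return connect_ex()'s code (0 means open)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((address, port))

def scan_ports(address, ports, timeout=1.0):
    """Scan each port in its own thread and return {port: error_code}."""
    results = {}
    lock = threading.Lock()

    def worker(port):
        code = scan(address, port, timeout)
        with lock:  # protect the shared dict while recording the result
            results[port] = code

    threads = [threading.Thread(target=worker, args=(p,)) for p in ports]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because every call to scan() owns its socket, no two threads ever call connect_ex() on the same socket, which is what triggered the "Operation already in progress" error.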
Thank you for reading this, I appreciate any help!
I don't really seem to find an answer that satisfies the following questions, mostly explained unclearly.
Imagine I'd create a socket object in Python:
socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
Then, I'd like to set the options of that socket object (server), with the following three arguments.
socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR,1)
I'm kind of confused by these arguments.
Firstly, the SOL_SOCKET, is it some kind of constant value that actually allows the following arguments in the signature (like reuseaddr) to implement it on the socket level? (More info is welcome)
Secondly the REUSEADDR what does it actually do? It allows the server to reuse (accept connections) the same ip and port, while it's in close-wait or time-wait state. If that's correct, I don't seem to get why that is needed, can't I just keep accepting connections on the same port and ip, without using it, isn't that setting automatically used, it would be my best guess that you can have multiple connections on a single port and ip address without using that argument?
Finally, what does the 1 mean at the end?
The primary reason I'm asking this question because I thought if I wouldn't use REUSEADDR that I could still accept other connections on the same port and ip
Thank you for the help, have a great day!
Firstly, the SOL_SOCKET, is it some kind of constant value that actually allows the following arguments in the signature (like reuseaddr) to implement it on the socket level?
Yes. setsockopt() options are organized in groups identified by levels. There are socket-level options, IP-level options, TCP-level options, etc. SO_REUSEADDR (and SO_REUSEPORT) is a socket-level option, as it affects the socket object itself (when it is binding to a local IP/port pair).
Secondly the REUSEADDR what does it actually do?
This is well-documented on most platforms. Python sockets are just a thin wrapper around platform BSD-style sockets.
It allows the server to reuse (accept connections) the same ip and port, while it's in close-wait or time-wait state.
It has nothing to do with accepting connections. It has only to do with being able to bind() a new socket to a local IP/port pair after a previous socket has stopped using that same pair.
If that's correct, I don't seem to get why that is needed
Because a local IP/port pair can't normally be reused for a new socket binding while the pair is in the CLOSE_WAIT or TIME_WAIT state. The whole purpose of those states is to wait a period of time for pending data to be flushed for a previous communication. By allowing a new socket to re-use the IP/port pair while data is still pending, a new socket can potentially read data from a previous conversation. So, SO_REUSEADDR is disabled by default. But this is not really a problem for TCP server sockets (more so for UDP sockets), so SO_REUSEADDR is commonly used to allow rapid reuse of the IP/port pair after closing a server and restarting it.
can't I just keep accepting connections on the same port and ip, without using it, isn't that setting automatically used
If your listening TCP socket stays active, yes. SO_REUSEADDR has nothing to do with a bound listening socket's ability to accept client connections.
it would be my best guess that you can have multiple connections on a single port and ip address without using that argument?
Once a listening socket has been successfully bound to the IP/port pair, yes.
Finally, what does the 1 mean at the end?
SO_REUSEADDR is a boolean option. It only has two defined values, 0 (off) and 1 (on).
The primary reason I'm asking this question because I thought if I wouldn't use REUSEADDR that I could still accept other connections on the same port and ip
As long as your listening socket is active, yes. But if you close your listening socket and create a new one, it has to be re-bound to a local IP/port before it can start accepting connections, and that IP/port might not be ready for re-use yet, unless SO_REUSEADDR is enabled.
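To make the three setsockopt() arguments concrete, here is a small sketch of a listening socket that enables SO_REUSEADDR before binding. The function name make_listener and the choice of the loopback address with port 0 (let the OS pick) are assumptions for the demo.

```python
import socket

def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket with SO_REUSEADDR enabled before bind()."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Level: SOL_SOCKET, option: SO_REUSEADDR, value: 1 (on).
    # The option only affects bind(), so it has to be set first.
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(5)
    return server

server = make_listener()
# getsockopt() returns a non-zero value while the option is enabled.
print(server.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR))
server.close()
```

The option matters when the server is stopped and restarted on the same port; while a single listening socket stays open, it accepts any number of connections either way.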
The context of my question is that I work for a company in the networking area. This company has several stores around the country where DVRs are accessible through port 2781 and a domain, so that people can access security cameras. The problem is that for these people to successfully access the DVRs through the domain and port, a DMZ must be configured in each store's modem. To verify the DMZ I'm trying to use Python with the sockets module, but I don't understand the best way to do it yet.
import socket
s = socket.socket()
s.connect((domain, port))
s.close()
Once I make the connection, what is the best way to check that communication works? Should I handle it with an exception, or just use socket.recv and detect whether the result is empty?
For connect to succeed, some connectivity must already exist; otherwise the TCP handshake would fail. Thus the first step is to check whether connect succeeds or throws an exception.
It is still possible that a deep packet inspection firewall is in place which does not block the initial connection but blocks the later data exchange. To find out whether this is the case you have to do actual bidirectional communication. What this communication should look like depends on the specific application protocol, which is unknown in your case. You still need to check that a) sending and receiving works (catching exceptions) and b) a response returns the expected data.
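The first step (checking whether connect succeeds) could be sketched as follows. The function name can_connect and the 3-second timeout are assumptions; the sketch only verifies the TCP handshake and does not attempt the protocol-specific exchange mentioned above.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP handshake with (host, port) completes within the timeout."""
    try:
        # create_connection() resolves the name, connects, and raises on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For the stores in the question this would be called as can_connect(domain, 2781), with domain being the store's host name.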
I'm building a network music player with my Raspberry Pi and I'm trying to come up with a scheme that will allow me to send a "command" to my Pi that will allow it to do various things over the network (such as transport control).
This is what I'm thinking on the receiver (in sort-of pseudo-code):
while True:
    while nothingIsRecvD:
        do_stuff()
    do_something_with(theDataRecvDfromSocket)
Is there some basic code for beginners I can look at?
You'll need to use the socket module and the select module.
To set up the socket, you'll need to:
1. Use socket.socket to create a socket. You'll probably want to use the AF_INET address family. For TCP, use SOCK_STREAM; for UDP, use SOCK_DGRAM.
2. Bind the socket to the interface and port you want to listen on.
3. For TCP, call listen on the socket. 5 is the typical backlog value used.
If you're using TCP, you've just created a listening socket. In order to actually receive data, you'll need to accept a connection using accept. With a connected socket you can recv or send data.
UDP is similar, except accepting is not necessary and you'll use recvfrom and sendto rather than recv and send.
These methods block, however, and if I understand you correctly, you don't want that. select.select lets you wait for an event to occur on any of a given set of sockets. You can also provide a zero timeout if you want to just check if there is some activity. Once it has detected activity, you can usually perform the appropriate action once without blocking.
Once you're done with sockets, be polite and close them after shutting down any connected sockets.
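The steps above can be sketched as a single polling pass over a listening TCP socket and its accepted clients. The function name poll_once, the 1024-byte read size, and the returned list of (socket, data) pairs are assumptions chosen for illustration.

```python
import select
import socket

def poll_once(server, clients, timeout=0):
    """Do one select() pass: accept new connections and read any pending data.

    Returns a list of (client_socket, data) pairs received during this pass.
    A timeout of 0 makes the call return immediately when nothing is ready.
    """
    received = []
    readable, _, _ = select.select([server] + clients, [], [], timeout)
    for sock in readable:
        if sock is server:
            # The listening socket is readable when a connection is waiting.
            conn, _addr = server.accept()
            clients.append(conn)
        else:
            data = sock.recv(1024)
            if data:
                received.append((sock, data))
            else:
                # An empty read means the peer closed the connection.
                clients.remove(sock)
                sock.close()
    return received
```

In the receiver loop from the question, poll_once(server, clients) would be called on each iteration; when it returns an empty list the player carries on with do_stuff(), otherwise it handles the received commands.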
You could consider using sockets to communicate between the music player and server. The recv() call (typically used with TCP sockets) or recvfrom() call (typically used with UDP sockets) are blocking -- so they should provide a nice blocking context to your nothingIsRecvd case and would allow you to get rid of the "while True" loop. You can find examples on Python Library reference: http://docs.python.org/release/2.5.2/lib/socket-example.html
I have a Python program with many threads. I was thinking of creating a socket, binding it to localhost, and having the threads read from and write to this central location. However, I do not want this socket open to the rest of the network; only connections from 127.0.0.1 should be accepted. How would I do this (in Python)? And is this a suitable design? Or is there something a little more elegant?
Given a socket created with socket.socket(), you can use bind() before listening:
socket.bind(('127.0.0.1', 80))
Using the address 127.0.0.1 indicates that the socket should bind to the local interface only.
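A slightly fuller sketch of the same idea is shown below. It uses port 0 so the OS picks a free port, since binding to port 80 usually requires elevated privileges on Unix-like systems; the helper name make_local_listener is an assumption.

```python
import socket

def make_local_listener(port=0):
    """Return a TCP socket listening on the loopback interface only."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Binding to 127.0.0.1 (rather than '' or 0.0.0.0) restricts the
    # listener to the loopback interface.
    server.bind(("127.0.0.1", port))
    server.listen(5)
    return server

server = make_local_listener()
host, port = server.getsockname()
print(host, port)
server.close()
```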
http://www.amk.ca/python/howto/sockets/ shows some socket examples. I think this tidbit is relevant to you:
we used socket.gethostname() so that the socket would be visible to the outside world. If we had used s.bind(('', 80)) or s.bind(('localhost', 80)) or s.bind(('127.0.0.1', 80)) we would still have a "server" socket, but one that was only visible within the same machine.
I guess there is your answer (see below for correction)
As for the validity of using this method for thread communication, I'm not sure how well it handles multiple threads reading and writing.
EDIT
There seems to be a python recipe linked below that does some inter-thread communication
http://code.activestate.com/recipes/491281/
Have fun!
EDIT
The article is incorrect and as pointed out "s.bind(('', 80)) will bind to INADDR_ANY"
If you are running on a UNIX-based system, you might want to consider using UNIX Domain Sockets instead of Internet sockets. I think something like the following should work:
>>> # in one window/shell
>>> import socket
>>> sd = socket.socket(socket.AF_UNIX)
>>> sd.bind('/path/to/my/socket')
>>> sd.listen(5)
>>> (client,addr) = sd.accept()
>>> client.recv(1024)
b'hello'
>>>
>>> # in a different shell
>>> import socket
>>> sd = socket.socket(socket.AF_UNIX)
>>> sd.connect('/path/to/my/socket')
>>> sd.send(b'hello')
You might want to use the queue module from the standard library instead. It's designed specifically to facilitate communication between threads. A quote from the docs:
The Queue module implements multi-producer, multi-consumer queues. It is especially useful in threaded programming when information must be exchanged safely between multiple threads. The Queue class in this module implements all the required locking semantics. It depends on the availability of thread support in Python; see the threading module.
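As a minimal sketch of that approach: in Python 3 the module is named queue (the quoted documentation uses the Python 2 name Queue). The worker function, the doubling step, and the None sentinel used to stop the thread are illustrative assumptions.

```python
import queue
import threading

def worker(tasks, results):
    """Consume items until a None sentinel arrives, pushing results to another queue."""
    while True:
        item = tasks.get()
        if item is None:
            break
        results.put(item * 2)

tasks = queue.Queue()
results = queue.Queue()
t = threading.Thread(target=worker, args=(tasks, results))
t.start()

for n in (1, 2, 3):
    tasks.put(n)
tasks.put(None)  # sentinel: tell the worker to stop
t.join()

doubled = [results.get() for _ in range(3)]
print(doubled)  # [2, 4, 6]
```

No socket or port is involved, so nothing is exposed to the network, and the queue handles the locking between threads.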
On TCP/IP networks, 127.0.0.0/8 is a non-routable network, so you should not be able to send an IP datagram destined for 127.0.0.1 across a routed infrastructure; the router will just discard the datagram. However, it is possible to construct and send datagrams with a destination address of 127.0.0.1, so a host on the same network (in the IP sense of network) as your host could possibly get the datagram to your host's TCP/IP stack. This is where your local firewall comes into play. Your local (host) firewall should have a rule that discards IP datagrams destined for 127.0.0.0/8 coming into any interface other than lo0 (or the equivalent loopback interface). If your host either 1) has such firewall rules in place or 2) exists on its own network (or shares it only with completely trusted hosts) behind a well-configured router, you can safely just bind to 127.0.0.1 and be fairly certain that any datagrams you receive on the socket came from the local machine. The prior answers address how to open and bind to 127.0.0.1.
If you do sock.bind(('127.0.0.1', port)) it will only listen on localhost, and not on other interfaces, so that's all you need.
I'm in the process of implementing a service -- written in Python with the Twisted framework, running on Debian GNU/Linux -- that checks the availability of SIP servers. For this I use the OPTIONS method (a SIP protocol feature), as this seems to be a commonplace practice. In order to construct correct and RFC compliant headers, I need to know the source IP address and the source port for the connection that is going to be established. [How] can this be done with Twisted?
This is what I tried:
I subclassed protocol.DatagramProtocol and within startProtocol(self) I used self.transport.getHost().host and self.transport.getHost().port. The latter is indeed the port that's going to be used, whereas the former only yields 0.0.0.0.
I guess that at this point Twisted doesn't [yet?] know which interface and as such which source IP address will be used. Does Twisted provide a facility that could help me with this or do I need to interface with the OS (routing) in a different way? Or did I just use self.transport.getHost().host incorrectly?
For the sake of completeness I answer my own question:
Make sure you use connect() on the transport before trying to determine the host's source IP address. The following excerpt shows the relevant part of a protocol implementation:
class FooBarProtocol(protocol.DatagramProtocol):
    def startProtocol(self):
        self.transport.getHost().host   # => 0.0.0.0
        self.transport.connect(self.dstHost, self.dstPort)
        self.transport.getHost().host   # => 192.168.1.102
If you are using UDP then the endpoint is determined by either:
calling bind() on the socket and explicitly giving it an address
sending a packet
If you want a few more details, check this response.
The problem is that I'm not that familiar with twisted. From what I can tell by a quick perusal of the source, it looks like you might want to use a reactor like t.i.d.SelectReactor instead. This is what t.n.d.DNSDatagramProtocol does under the hood.
If you take twisted out of the picture, then the following snippet shows what is going on:
>>> import socket
>>> s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, 0)
>>> s
<socket._socketobject object at 0x10025d670>
>>> s.getsockname() # this is an unbound or unnamed socket
('0.0.0.0', 0)
>>> s.bind( ('0.0.0.0', 0) ) # 0.0.0.0 is INADDR_ANY, 0 means pick a port
>>> s.getsockname() # IP is still zero, but it has picked a port
('0.0.0.0', 56814)
Getting the host name is a little trickier if you need to support multiple network interfaces or IPv4 and IPv6. If you can make the interface used configurable, then pass it in as the first member of the tuple to socket.bind() and you are set.
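When the interface cannot be configured up front, one commonly used plain-socket approach is to call connect() on a UDP socket and then read getsockname(), which mirrors the transport.connect() fix in the self-answer above. The helper name local_address_for is an assumption, and 5060 is used only because it is the standard SIP port.

```python
import socket

def local_address_for(dest_host, dest_port):
    """Return the (ip, port) the OS would use as the source for UDP traffic to the destination."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a datagram socket sends nothing; it only records the peer
        # and lets the OS choose a local address based on the routing table.
        s.connect((dest_host, dest_port))
        return s.getsockname()
    finally:
        s.close()

print(local_address_for("127.0.0.1", 5060))
```

With a real SIP server address as dest_host, the returned IP would be that of the interface the routing table selects for that destination.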
Now the hard part is doing this within the confines of the abstractions that twisted provides. Unfortunately, I can't help a whole lot there. I would recommend looking for examples on how you can get access to the underlying socket or find a way to pass the socket information into the framework.
Good luck.
Did you check whether what you want to do is possible with the SIP implementation that is part of Twisted?
In any case, how you set the source address and port for UDP in Twisted is quite similar to how you set them without Twisted. In Twisted, reactor.listenUDP(port, protocol, interface) binds a UDP socket to a specific port and interface and hands the received datagrams to your protocol. Inside the protocol, self.transport.write(msg, addr) sends a datagram to addr using the address that the protocol is bound to as the source address.
Reading your question again, I think the only part you were missing was passing interface to reactor.listenUDP(...).