http://wiki.apache.org/HttpComponents/FrequentlyAskedConnectionManagementQuestions
1. Connections in TIME_WAIT State
After running your HTTP application, you use the netstat command and detect a lot of connections in state TIME_WAIT. Now you wonder why these connections are not cleaned up.
1.1. What is the TIME_WAIT State?
The TIME_WAIT state is a protection mechanism in TCP. The side that closes a socket connection orderly will keep the connection in state TIME_WAIT for some time, typically between 1 and 4 minutes. This happens after the connection is closed. It does not indicate a cleanup problem. The TIME_WAIT state protects against loss of data and data corruption. It is there to help you. For technical details, have a look at the Unix Socket FAQ, section 2.7.
1.2. Some Connections Go To TIME_WAIT, Others Not
If a connection is orderly closed by your application, it will go to the TIME_WAIT state. If a connection is orderly closed by the server, the server keeps it in TIME_WAIT and your client doesn’t. If a connection is reset or otherwise dropped by your application in a non-orderly fashion, it will not go to TIME_WAIT.
Unfortunately, it will not always be obvious to you whether a connection is closed orderly or not. This is because connections are pooled and kept open for re-use by default. HttpClient 3.x, HttpClient 4, and also the standard Java HttpURLConnection do that for you. Most applications will simply execute requests, then read from the response stream, and finally close that stream.
Closing the response stream is not the same thing as closing the connection! Closing the response stream returns the connection to the pool, but it will be kept open if possible. This saves a lot of time if you send another request to the same host within a few seconds, or even minutes.
Connection pools have a limited number of connections. A pool may have 5 connections, or 100, or maybe only 1. When you send a request to a host, and there is no open connection to that host in the pool, a new connection needs to be opened. But if the pool is already full, an open connection has to be closed before a new one can be opened. In this case, the old connection will be closed orderly and go to the TIME_WAIT state.
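If you want to see this for yourself, a small experiment along the following lines may help. It is only a sketch using JDK classes: the class name StreamCloseSketch and the throwaway loopback server are assumptions made for illustration, not part of HttpClient. It sends two requests, reads and closes each response stream without ever calling disconnect(), and returns the client-side port numbers the server saw.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class StreamCloseSketch {
    /** Sends two GETs to a local test server; returns the client ports the server observed. */
    static List<Integer> clientPortsSeenByServer() {
        List<Integer> ports = new CopyOnWriteArrayList<>();
        try {
            // Hypothetical test server on an ephemeral loopback port.
            HttpServer server = HttpServer.create(
                    new InetSocketAddress(InetAddress.getByName("127.0.0.1"), 0), 0);
            server.createContext("/", exchange -> {
                ports.add(exchange.getRemoteAddress().getPort());
                byte[] body = "ok".getBytes(StandardCharsets.US_ASCII);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            try {
                URL url = java.net.URI.create(
                        "http://127.0.0.1:" + server.getAddress().getPort() + "/").toURL();
                for (int i = 0; i < 2; i++) {
                    HttpURLConnection conn =
                            (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
                    // Read the response completely, then close the stream.
                    // This hands the connection back for re-use; it does not close it.
                    try (InputStream in = conn.getInputStream()) {
                        while (in.read() != -1) {
                            // drain
                        }
                    }
                    // Deliberately no conn.disconnect() here.
                }
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ports;
    }

    public static void main(String[] args) {
        System.out.println(clientPortsSeenByServer());
    }
}
```

If both port numbers are identical, the second request travelled over the connection that was kept open after the first stream was closed. Whether that happens depends on the keep-alive settings of your JVM and of the server.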
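The eviction behavior described above can be sketched with a toy pool. This is not how HttpClient implements its connection managers; the class TinyPoolSketch, its one-connection-per-host simplification, and the least-recently-used eviction rule are all assumptions chosen to keep the example short.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class TinyPoolSketch {
    private final int maxTotal;
    private final Function<String, Closeable> opener;
    // Access-ordered map: iteration starts with the least recently used host.
    private final LinkedHashMap<String, Closeable> open = new LinkedHashMap<>(16, 0.75f, true);

    TinyPoolSketch(int maxTotal, Function<String, Closeable> opener) {
        if (maxTotal < 1) {
            throw new IllegalArgumentException("maxTotal must be at least 1");
        }
        this.maxTotal = maxTotal;
        this.opener = opener;
    }

    /** Returns the kept-open connection for the host, opening (and evicting) if needed. */
    synchronized Closeable connectionFor(String host) {
        Closeable existing = open.get(host); // also marks the host as recently used
        if (existing != null) {
            return existing;
        }
        if (open.size() >= maxTotal) {
            // Pool is full: close the least recently used connection first.
            // With a real socket this is an orderly close, so it goes to TIME_WAIT.
            Iterator<Map.Entry<String, Closeable>> it = open.entrySet().iterator();
            Closeable eldest = it.next().getValue();
            it.remove();
            try {
                eldest.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        Closeable fresh = opener.apply(host);
        open.put(host, fresh);
        return fresh;
    }

    /** Pool of size 2, four requests; returns the hosts whose connections were closed. */
    static List<String> demo() {
        List<String> closedHosts = new ArrayList<>();
        // Fake connections that only record their own close() call.
        TinyPoolSketch pool = new TinyPoolSketch(2, host -> () -> closedHosts.add(host));
        pool.connectionFor("a.example");
        pool.connectionFor("b.example");
        pool.connectionFor("a.example"); // re-used, nothing is closed
        pool.connectionFor("c.example"); // pool full, b.example is evicted
        return closedHosts;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [b.example]
    }
}
```

Only the request for the third host forces a close. The two requests to a.example share one connection, which is why you see some connections go to TIME_WAIT and others not.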
When your application exits and the JVM terminates, the open connections in the pools will not be closed orderly. They are reset or cancelled, without going to TIME_WAIT. To avoid this, you should call the shutdown method of the connection pools your application is using before exiting. The standard Java HttpURLConnection has no public method to shut down its connection pool.
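One way to make sure the shutdown call is not forgotten is a JVM shutdown hook. The sketch below uses plain sockets and a hypothetical PoolShutdownSketch class so that it runs without any library; with HttpClient, the hook would instead call the shutdown method of whatever connection manager your application uses.

```java
import java.io.IOException;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

class PoolShutdownSketch {
    private final List<Socket> pooled = new ArrayList<>();

    synchronized void add(Socket socket) {
        pooled.add(socket);
    }

    /** Closes every pooled socket orderly and returns how many were closed. */
    synchronized int shutdown() {
        int closed = 0;
        for (Socket socket : pooled) {
            try {
                socket.close();
                closed++;
            } catch (IOException e) {
                // Keep going: one failing close should not block the others.
            }
        }
        pooled.clear();
        return closed;
    }

    /** Registers a hook that runs shutdown() when the JVM exits normally. */
    void closeOnExit() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::shutdown));
    }

    public static void main(String[] args) {
        PoolShutdownSketch pool = new PoolShutdownSketch();
        pool.add(new Socket()); // unconnected placeholder socket
        pool.closeOnExit();
    }
}
```

Shutdown hooks run on a normal exit and on most termination signals, but not when the process is killed forcibly, so connections may still be reset in that case.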
1.3. Running Out Of Ports
Some applications open and orderly close a lot of connections within a short time, for example when load-testing a server. A connection in state TIME_WAIT will prevent that port number from being re-used for another connection. That is not an error, it is the purpose of TIME_WAIT.
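A rough calculation shows how quickly this can become a limit. The sketch below assumes an ephemeral port range of 49152 to 65535 (the IANA dynamic range; your operating system may use a different one) and a TIME_WAIT duration of 60 seconds. Both numbers are assumptions for illustration, not measurements.

```java
class PortBudgetSketch {
    /**
     * Rough upper bound on connections per second that can be opened and
     * orderly closed by this side, if every closed connection blocks its
     * local port for timeWaitSeconds.
     */
    static double maxSustainedConnectionsPerSecond(int firstPort, int lastPort, int timeWaitSeconds) {
        int ports = lastPort - firstPort + 1;
        return (double) ports / timeWaitSeconds;
    }

    public static void main(String[] args) {
        // Assumed values: 16384 ephemeral ports, 60 seconds in TIME_WAIT.
        double rate = maxSustainedConnectionsPerSecond(49152, 65535, 60);
        System.out.printf("about %.0f connections per second%n", rate);
    }
}
```

With these assumptions the budget is roughly 273 new connections per second. A load test that opens and closes connections faster than that for more than a minute would run out of ports.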
TCP is configured at the operating system level, not through Java. Your first action should be to increase the number of ephemeral ports on the machine. Windows in particular has a rather low default for the ephemeral ports. The PerformanceWiki has tuning tips for the common operating systems; have a look at the respective Network section.
Only if increasing the number of ephemeral ports does not solve your problem should you consider decreasing the duration of the TIME_WAIT state. You probably have to reduce the maximum lifetime of IP packets, as the duration of TIME_WAIT is typically twice that timespan to allow for a round-trip delay. Be aware that this will affect all applications running on the machine. Don’t ask us how to do it; we’re not the experts for network tuning.
There are some ways to deal with the problem at the application level. One way is to send a “Connection: close” header with each request. That will tell the server to close the connection, so it goes to TIME_WAIT on the other side. Of course this also disables the keep-alive feature of connection pooling and thereby degrades performance. If you are running load tests against a server, the atypical behavior of your application may distort the test results.
Another way is to not orderly close connections. There is a trick to set SO_LINGER to a special value, which will cause the connection to be reset instead of orderly closed. Note that the HttpClient API will not support that directly; you’ll have to extend or modify some classes to implement this hack.
Yet another way is to re-use ports that are still blocked by a connection in TIME_WAIT. You can do that by specifying the SO_REUSEADDR option when opening a socket. Java 1.4 introduced the method Socket.setReuseAddress for this purpose. You will have to extend or modify some classes of HttpClient for this too, but at least it’s not a hack.
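For orientation, here is what the two options above look like with plain JDK classes. This is a sketch, not HttpClient code: the class CloseOptionsSketch and the URL passed in are assumptions, and as noted above HttpClient does not expose the SO_LINGER setting directly. The "special value" is taken here to be linger enabled with a timeout of zero.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.Socket;
import java.net.URI;

class CloseOptionsSketch {
    /** Prepares (but does not send) a request that asks the server to close the connection. */
    static HttpURLConnection requestWithConnectionClose(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            // The server closes first, so TIME_WAIT ends up on the server side.
            conn.setRequestProperty("Connection", "close");
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns a socket whose close() is meant to reset the connection instead of closing it orderly. */
    static Socket socketThatResetsOnClose() {
        try {
            Socket socket = new Socket();
            // The hack: linger enabled with a timeout of 0 seconds.
            socket.setSoLinger(true, 0);
            return socket;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back the linger value of such a socket, then closes it. */
    static int lingerSecondsOfResettingSocket() {
        try (Socket socket = socketThatResetsOnClose()) {
            return socket.getSoLinger();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical URL; openConnection() does not perform any network I/O yet.
        HttpURLConnection conn = requestWithConnectionClose("http://localhost:8080/");
        System.out.println(conn.getRequestProperty("Connection"));
        System.out.println(lingerSecondsOfResettingSocket());
    }
}
```

Keep in mind that the Unix Socket FAQ text quoted further down advises against the SO_LINGER trick, because a reset skips the protection that TIME_WAIT provides.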
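A minimal sketch of the SO_REUSEADDR approach with a plain socket follows. The class ReuseAddressSketch and the loopback test server are assumptions for illustration; wiring this into HttpClient would require the custom classes mentioned above. The option is set before the socket is bound, which is why the sketch starts from an unconnected socket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class ReuseAddressSketch {
    /** Opens a client socket with SO_REUSEADDR enabled before binding and connecting. */
    static Socket openWithReuseAddress(InetSocketAddress remote, int timeoutMillis) throws IOException {
        Socket socket = new Socket(); // unconnected and unbound
        try {
            socket.setReuseAddress(true); // set before the socket is bound
            socket.bind(null);            // let the system pick a local port
            socket.connect(remote, timeoutMillis);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    /** Connects to a throwaway loopback listener and reports whether the option is set. */
    static boolean demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            InetSocketAddress remote =
                    new InetSocketAddress(server.getInetAddress(), server.getLocalPort());
            try (Socket client = openWithReuseAddress(remote, 2000)) {
                return client.isConnected() && client.getReuseAddress();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```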
1.4. Further Reading
java.net.Socket.setReuseAddress
Discussion on the HttpClient mailing list in December 2007
netstat command line tool
http://www.softlab.ntua.gr/facilities/documentation/unix/unix-socket-faq/unix-socket-faq-2.html#ss2.7
2.7 Please explain the TIME_WAIT state.
Remember that TCP guarantees all data transmitted will be delivered, if at all possible. When you close a socket, the server goes into a TIME_WAIT state, just to be really really sure that all the data has gone through. When a socket is closed, both sides agree by sending messages to each other that they will send no more data. This, it seemed to me, was good enough, and after the handshaking is done, the socket should be closed. The problem is two-fold. First, there is no way to be sure that the last ack was communicated successfully. Second, there may be “wandering duplicates” left on the net that must be dealt with if they are delivered.
Andrew Gierth (andrewg@microlise.co.uk) helped to explain the closing sequence in the following usenet posting:
Assume that a connection is in ESTABLISHED state, and the client is about to do an orderly release. The client’s sequence no. is Sc, and the server’s is Ss. The pipe is empty in both directions.

    Client                                                    Server
    ======                                                    ======
    ESTABLISHED                                               ESTABLISHED
    (client closes)
    ESTABLISHED                                               ESTABLISHED
                 <CTL=FIN+ACK><SEQ=Sc><ACK=Ss> ------->>
    FIN_WAIT_1
                 <<-------- <CTL=ACK><SEQ=Ss><ACK=Sc+1>
    FIN_WAIT_2                                                CLOSE_WAIT
                 <<-------- <CTL=FIN+ACK><SEQ=Ss><ACK=Sc+1>   (server closes)
                                                              LAST_ACK
                 <CTL=ACK>,<SEQ=Sc+1><ACK=Ss+1> ------->>
    TIME_WAIT                                                 CLOSED
    (2*msl elapses...)
    CLOSED

Note: the +1 on the sequence numbers is because the FIN counts as one byte of data. (The above diagram is equivalent to fig. 13 from RFC 793).
Now consider what happens if the last of those packets is dropped in the network. The client has done with the connection; it has no more data or control info to send, and never will have. But the server does not know whether the client received all the data correctly; that’s what the last ACK segment is for. Now the server may or may not care whether the client got the data, but that is not an issue for TCP; TCP is a reliable protocol, and must distinguish between an orderly connection close where all data is transferred, and a connection abort where data may or may not have been lost.
So, if that last packet is dropped, the server will retransmit it (it is, after all, an unacknowledged segment) and will expect to see a suitable ACK segment in reply. If the client went straight to CLOSED, the only possible response to that retransmit would be a RST, which would indicate to the server that data had been lost, when in fact it had not been.
(Bear in mind that the server’s FIN segment may, additionally, contain data.)
DISCLAIMER: This is my interpretation of the RFCs (I have read all the TCP-related ones I could find), but I have not attempted to examine implementation source code or trace actual connections in order to verify it. I am satisfied that the logic is correct, though.
More commentary from Vic:
The second issue was addressed by Richard Stevens (rstevens@noao.edu, author of “Unix Network Programming”, see 1.5 Where can I get source code for the book [book title]?). I have put together quotes from some of his postings and email which explain this. I have brought together paragraphs from different postings, and have made as few changes as possible.
From Richard Stevens (rstevens@noao.edu):
If the duration of the TIME_WAIT state were just to handle TCP’s full-duplex close, then the time would be much smaller, and it would be some function of the current RTO (retransmission timeout), not the MSL (the packet lifetime).
A couple of points about the TIME_WAIT state.
- The end that sends the first FIN goes into the TIME_WAIT state, because that is the end that sends the final ACK. If the other end’s FIN is lost, or if the final ACK is lost, having the end that sends the first FIN maintain state about the connection guarantees that it has enough information to retransmit the final ACK.
- Realize that TCP sequence numbers wrap around after 2**32 bytes have been transferred. Assume a connection between A.1500 (host A, port 1500) and B.2000. During the connection one segment is lost and retransmitted. But the segment is not really lost, it is held by some intermediate router and then re-injected into the network. (This is called a “wandering duplicate”.) But in the time between the packet being lost & retransmitted, and then reappearing, the connection is closed (without any problems) and then another connection is established between the same host, same port (that is, A.1500 and B.2000; this is called another “incarnation” of the connection). But the sequence numbers chosen for the new incarnation just happen to overlap with the sequence number of the wandering duplicate that is about to reappear. (This is indeed possible, given the way sequence numbers are chosen for TCP connections.) Bingo, you are about to deliver the data from the wandering duplicate (the previous incarnation of the connection) to the new incarnation of the connection. To avoid this, you do not allow the same incarnation of the connection to be reestablished until the TIME_WAIT state terminates. Even the TIME_WAIT state doesn’t completely solve the second problem, given what is called TIME_WAIT assassination. RFC 1337 has more details.
- The reason that the duration of the TIME_WAIT state is 2*MSL is that the maximum amount of time a packet can wander around a network is assumed to be MSL seconds. The factor of 2 is for the round-trip. The recommended value for MSL is 120 seconds, but Berkeley-derived implementations normally use 30 seconds instead. This means a TIME_WAIT delay between 1 and 4 minutes. Solaris 2.x does indeed use the recommended MSL of 120 seconds.
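The arithmetic behind the "between 1 and 4 minutes" figure can be written out as a tiny calculation. The class name TimeWaitMathSketch is only illustrative; the two MSL values are the ones quoted above.

```java
class TimeWaitMathSketch {
    /** TIME_WAIT lasts twice the maximum segment lifetime (MSL). */
    static int timeWaitSeconds(int mslSeconds) {
        return 2 * mslSeconds;
    }

    public static void main(String[] args) {
        // Recommended MSL of 120 seconds: 240 seconds, i.e. 4 minutes.
        System.out.println(timeWaitSeconds(120));
        // Berkeley-derived MSL of 30 seconds: 60 seconds, i.e. 1 minute.
        System.out.println(timeWaitSeconds(30));
    }
}
```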
A wandering duplicate is a packet that appeared to be lost and was retransmitted. But it wasn’t really lost … some router had problems, held on to the packet for a while (order of seconds, could be a minute if the TTL is large enough) and then re-injects the packet back into the network. But by the time it reappears, the application that sent it originally has already retransmitted the data contained in that packet.
Because of these potential problems with TIME_WAIT assassinations, one should not avoid the TIME_WAIT state by setting the SO_LINGER option to send an RST instead of the normal TCP connection termination (FIN/ACK/FIN/ACK). The TIME_WAIT state is there for a reason; it’s your friend and it’s there to help you :-)
I have a long discussion of just this topic in my just-released “TCP/IP Illustrated, Volume 3”. The TIME_WAIT state is indeed one of the most misunderstood features of TCP.
I’m currently rewriting “Unix Network Programming” (see 1.5 Where can I get source code for the book [book title]?) and will include lots more on this topic, as it is often confusing and misunderstood.
An additional note from Andrew:
Closing a socket: if SO_LINGER has not been called on a socket, then close() is not supposed to discard data. This is true on SVR4.2 (and, apparently, on all non-SVR4 systems) but apparently not on SVR4; the use of either shutdown() or SO_LINGER seems to be required to guarantee delivery of all data.
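In Java, the closest counterpart to shutdown() on the sending direction is Socket.shutdownOutput(). The following sketch shows an orderly close built on it: send everything, half-close, wait for the peer's end-of-stream, then close. The class OrderlyCloseSketch and the loopback peer are assumptions for illustration, and the behavior on SVR4 mentioned above is not something this sketch can verify.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CompletableFuture;

class OrderlyCloseSketch {
    /** Half-closes the socket, waits until the peer has closed too, then releases it. */
    static void closeOrderly(Socket socket) throws IOException {
        socket.shutdownOutput(); // previously written data is sent, followed by FIN
        InputStream in = socket.getInputStream();
        while (in.read() != -1) {
            // discard anything the peer still sends until end-of-stream
        }
        socket.close();
    }

    /** Sends the payload to a loopback peer and returns how many bytes the peer received. */
    static int demo(byte[] payload) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            CompletableFuture<Integer> received = CompletableFuture.supplyAsync(() -> {
                try (Socket peer = server.accept()) {
                    InputStream in = peer.getInputStream();
                    int count = 0;
                    while (in.read() != -1) {
                        count++;
                    }
                    return count;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
            client.getOutputStream().write(payload);
            closeOrderly(client);
            return received.join();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(new byte[1000]));
    }
}
```

Because the client sends the first FIN here, it is the client side that ends up in TIME_WAIT, exactly as described in the closing sequence earlier.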
http://blog.csdn.net/liuxuejin/article/details/8552677