Tomcat occasionally returns a response without HTTP headers

I’m investigating a problem where Tomcat (7.0.90 7.0.92) returns a response with no HTTP headers very occasionally.

According to the captured packets by Wireshark, after Tomcat receives a request it just returns only a response body. It returns neither a status line nor HTTP response headers.

It makes a downstream Nginx instance produce the error “upstream sent no valid HTTP/1.0 header while reading response header from upstream”, return 502 error to the client and close the corresponding http connection between Nginx and Tomcat.

What can be a cause of this behavior? Is there any possibility which makes Tomcat behave this way? Or there can be something which strips HTTP headers under some condition? Or Wireshark failed to capture the frames which contain the HTTP headers? Any advice to narrow down where the problem is is also greatly appreciated.

This is a screenshot of Wireshark's "Follow HTTP Stream" which is showing the problematic response:

enter image description here

EDIT:

This is a screen shot of "TCP Stream" of the relevant part (only response). It seems that the chunks in the second response from the last looks fine:

enter image description here

EDIT2:

I forwarded this question to the Tomcat users mailing list and got some suggestions for further investigation from the developers:

http://tomcat.10.x6.nabble.com/Tomcat-occasionally-returns-a-response-without-HTTP-headers-td5080623.html

But I haven’t found any proper solution yet. I’m still looking for insights to tackle this problem..

728x90

3 Answers Tomcat occasionally returns a response without HTTP headers

We see that you are reusing an established connection to send the POST request and that, as you said, the response comes without the status-line and the headers.

after Tomcat receives a request it just returns only a response body.

Not exactly. It starts with 5d which is probably a chunk-size and this means that the latest "full" response (with status-line and headers) got from this connection contained a "Transfer-Encoding: chunked" header. For any reason, your server still believes the previous response isn't finished by the time it starts sending this new response to your last request.

A missing chunked seems confirmed as the screenshot doesn't show a last-chunk (value = 0) ending the previous request. Note that the last response ends with a last-chunk (the last byte shown is 0).

What causes this ? The previous response isn't technically considered as fully answered. It can be a bug on Tomcat, your webservice library, your own code. Maybe even, you're sending your request too early, before the previous one was completely answered.

Are some bytes missing if you compare the chunk-sizes from what is actually sent to the client ? Are all buffers flushed ? Beware of the line endings (CRLF vs LF only) too.

One last cause that I'm thinking about, if your response contains some kind of user input taken from the request, you can be facing HTTP Splitting.

Possible solutions.

It is worth trying to disable the chunked encoding at your library level, for example with Axis2 check the HTTP Transport.

When reusing a connection, check your client code to make sure that you aren't sending a request before you read all of the previous response (to avoid overlapping).

Further reading

RFC 2616 3.6.1 Chunked Transfer Coding

4 months ago

The issues you experience stem from pipelining multiple requests over a single connection with the upstream, as explained by yesterday's answer here by Eugène Adell.

Whether this is a bug in nginx, tomcat, your application, or the interaction of any combination of the above, would probably be a discussion for another forum, but for now, let's consider what would be the best solution:

Can you post your nginx configuration? Specifically, if you're using keepalive and a non-default value of proxy_http_version within nginx? – cnst 1 hour ago

@cnst I'm using proxy_http_version 1.1 and keepalive 100 – Kohei Nozaki 1 hour ago

As per an earlier answer to an unrelated question here on SO, yet sharing the configuration parameters as above, you might want to reconsider the reasons behind your use of the keepalive functionality between the front-end load-balancer (e.g., nginx) and the backend application server (e.g., tomcat).

As per a keepalive explanation on ServerFault in the context of nginx, the keepalive functionality in the upstream context of nginx wasn't even supported until very-very recently in the nginx development years. Why? It's because there are very few valid scenarios for using keepalive when it's basically faster to establish a new connection than to wait for an existing one to become available:

  • When the latency between the client and the server is on the order of 50ms+, keepalive makes it possible to reuse the TCP and SSL credentials, resulting in a very significant speedup, because no extra roundtrips are required to get the connection ready for servicing the HTTP requests.

    This is why you should never disable keepalive between the client and nginx (controlled through http://nginx.org/r/keepalive_timeout in http, server and location contexts).

  • But when the latency between the front-end proxy server and the backend application server is on the order of 1ms (0.001s), using keepalive is a recipe for chasing Heisenbugs without reaping any benefits, as the extra 1ms latency to establish a connection might as well be less than the 100ms latency of waiting for an existing connection to become available. (This is a gross oversimplification of connection handling, but it just shows you how extremely insignificant any possible benefits of the keepalive between the front-end load-balancer and the application server would be, provided both of them live in the same region.)

    This is why using http://nginx.org/r/keepalive in the upstream context is rarely a good idea, unless you really do need it, and have specifically verified that it produces the results you desire, given the points as above.

    (And, just to make it clear, these points are irrespective of what actual software you're using, so, even if you weren't experiencing the problems you experience with your combination of nginx and tomcat, I'd still recommend you not use keepalive between the load-balancer and the application server even if you decide to switch away from either or both of nginx and tomcat.)


My suggestion?

  • The problem wouldn't be reproducible with the default values of http://nginx.org/r/proxy_http_version and http://nginx.org/r/keepalive.

  • If your backend is within 5ms of front-end, you most certainly aren't even getting any benefits from modifying these directives in the first place, so, unless chasing Heisenbugs is your path, you might as well keep these specific settings at their most sensible defaults.

4 months ago

It turned out that the "sjsxp" library which JAX-WS RI v2.1.3 uses makes Tomcat behave this way. I tried a different version of JAX-WS RI (v2.1.7) which doesn't use the "sjsxp" library anymore and it solved the issue.

A very similar issue posted on Metro mailing list: http://metro.1045641.n5.nabble.com/JAX-WS-RI-2-1-5-returning-malformed-response-tp1063518.html

3 months ago