Spring Actuator-Based gRPC Health Check

We use Spring Boot and gRPC on AWS ECS Fargate behind AWS Application Load Balancers. We wanted to allow the ALB to query Spring Actuator’s robust health checks. However, the default Tomcat Actuator interface adds substantial bloat to a microservice and requires a slightly more involved configuration listening on a second port.

We decided to implement a gRPC service that allows the ALB to check the Actuator health. We’ve used the gRPC HealthGrpc interface for other purposes, so it made sense to write an implementation that proxies to Actuator’s HealthEndpoint.

Unfortunately, the implementation is specific to the needs of the ALB. Instead of signaling health in the response body with ServingStatus.SERVING or ServingStatus.NOT_SERVING, it must return a different gRPC status code on failure. We chose UNAVAILABLE. The ALB matcher checks for status “0” (OK), so any other status indicates a service failure. We also didn’t bother to implement the streaming interface.
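To make the status-code contract concrete, here is a minimal, simplified model of it in plain Java. It assumes the numeric gRPC codes OK = 0 and UNAVAILABLE = 14, and it assumes a matcher written as a single code, a comma-separated list, or a range. The class and method names are hypothetical and exist only for illustration; the real matching happens inside the ALB.

```java
import java.util.Arrays;

/** Simplified, hypothetical model of comparing a gRPC status code with an ALB matcher. */
final class AlbGrpcMatcherSketch {

  // Numeric gRPC status codes as defined by the gRPC specification.
  static final int GRPC_OK = 0;
  static final int GRPC_UNAVAILABLE = 14;

  /** Maps an Actuator-style status name to the gRPC code the health service returns. */
  static int grpcCodeFor(String actuatorStatus) {
    return "UP".equals(actuatorStatus) ? GRPC_OK : GRPC_UNAVAILABLE;
  }

  /** Returns true if the code satisfies a matcher such as "0", "0,1" or "0-5". */
  static boolean matches(String matcher, int grpcCode) {
    return Arrays.stream(matcher.split(","))
        .map(String::trim)
        .anyMatch(part -> {
          // A single code like "0" yields one bound, which serves as both low and high.
          String[] bounds = part.split("-");
          int low = Integer.parseInt(bounds[0].trim());
          int high = Integer.parseInt(bounds[bounds.length - 1].trim());
          return grpcCode >= low && grpcCode <= high;
        });
  }

  public static void main(String[] args) {
    System.out.println(matches("0", grpcCodeFor("UP")));   // true
    System.out.println(matches("0", grpcCodeFor("DOWN"))); // false
  }
}
```

With a matcher of “0”, only an OK response counts as healthy, which is why a NOT_SERVING body delivered with an OK status would still look healthy to the load balancer.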

We inject both the HealthEndpoint and a List of HealthIndicators, because the list of HealthIndicators available via the HealthEndpoint was incomplete. We found it helpful both to log all failing health indicators and to return a list of them in the gRPC error description.
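As a rough illustration of how such an error description can be assembled, the sketch below filters out healthy entries and joins the rest with a separator so each failure stays readable. It uses a plain map of indicator name to status as a stand-in for real HealthIndicator beans; the class name, method name, and sample data are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical stand-in that builds a failure description from name/status pairs. */
final class FailureDescriptionSketch {

  /** Joins every entry whose status is not "UP" into one string, separated by "; ". */
  static String describeFailures(Map<String, String> statusByIndicator) {
    return statusByIndicator.entrySet().stream()
        .filter(entry -> !"UP".equals(entry.getValue()))
        .map(entry -> String.format("Health Check %s: %s", entry.getKey(), entry.getValue()))
        .collect(Collectors.joining("; "));
  }

  public static void main(String[] args) {
    // Sample data only; LinkedHashMap keeps insertion order so the output is stable.
    Map<String, String> statuses = new LinkedHashMap<>();
    statuses.put("DatabaseCheck", "DOWN");
    statuses.put("DiskCheck", "UP");
    statuses.put("CacheCheck", "OUT_OF_SERVICE");
    System.out.println("Health failures: " + describeFailures(statuses));
    // Health failures: Health Check DatabaseCheck: DOWN; Health Check CacheCheck: OUT_OF_SERVICE
  }
}
```

If every entry is UP, the method returns an empty string, so a caller may want to handle that case separately.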

This has been working in production for several months. Without further ado, here’s a sample implementation:

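// Package names below assume grpc-java's health service and Spring Boot 2.x/3.x Actuator.
import io.grpc.health.v1.HealthCheckRequest;
import io.grpc.health.v1.HealthCheckResponse;
import io.grpc.health.v1.HealthGrpc;
import io.grpc.stub.StreamObserver;
import java.util.List;
import java.util.stream.Collectors;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.actuate.health.HealthEndpoint;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.boot.actuate.health.Status;

public final class GrpcActuatorHealthService extends HealthGrpc.HealthImplBase {
  // Assumes SLF4J, which matches the {} placeholders used in the log call below.
  private static final Logger logger = LoggerFactory.getLogger(GrpcActuatorHealthService.class);

  private final HealthEndpoint healthEndpoint;
  private final List<HealthIndicator> healthIndicatorList;

  // Both collaborators are injected through the constructor.
  public GrpcActuatorHealthService(
      HealthEndpoint healthEndpoint, List<HealthIndicator> healthIndicatorList) {
    this.healthEndpoint = healthEndpoint;
    this.healthIndicatorList = healthIndicatorList;
  }

  // Given a HealthIndicator, logs it as failed and returns a nice String description
  private static String logAndExtractStatus(HealthIndicator indicator) {
    var className = indicator.getClass().getSimpleName();
    // Evaluate the indicator once so the log line and the description agree.
    var health = indicator.health();
    logger.error("Health Check {} failed: {}", className, health);
    return String.format("Health Check %s: %s", className, health.getStatus());
  }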

  @Override
  public void check(HealthCheckRequest request, StreamObserver<HealthCheckResponse> responseObserver) {

    var status = healthEndpoint.health().getStatus();

    // If actuator says we're healthy, return a normal gRPC Health SERVING response.
    // equals() is used instead of == so the check does not depend on object identity.
    if (Status.UP.equals(status)) {

      responseObserver.onNext(
          HealthCheckResponse.newBuilder()
          .setStatus(HealthCheckResponse.ServingStatus.SERVING)
          .build());

      responseObserver.onCompleted();

    } else {

      // We're not UP (DOWN, OUT_OF_SERVICE, ...): loop through HealthIndicators and log any that are not UP.
      // The "; " delimiter keeps the individual failures readable in the description.
      var result = healthIndicatorList.stream()
          .filter(current -> !Status.UP.equals(current.health().getStatus()))
          .map(GrpcActuatorHealthService::logAndExtractStatus)
          .collect(Collectors.joining("; "));

      // Return the list of failed services as part of the gRPC error for remote visibility.
      responseObserver.onError(
          io.grpc.Status.UNAVAILABLE
              .withDescription("Health failures: " + result)
              .asRuntimeException());
    }
  }
}