Spring Boot makes it easy to get an app running on your laptop. Getting it to behave well in production is a different set of problems, and they tend to surface at the worst time - during an incident, during a deploy, during a traffic spike. These are the three that have bitten us most: configuration, observability, and shutdown.
Config belongs in the environment
Hardcoded values and committed application.properties with real credentials are how secrets leak. We keep configuration in environment variables, injected at deploy time, with sensible defaults in the code only for things that are safe to default. The twelve-factor rule holds up: the same artifact runs in every environment, and only the config changes.
- Secrets come from the environment or a secret manager, never from the repo
- Use Spring profiles for behavior differences, not for hiding credentials
- Fail fast on startup if a required variable is missing - do not limp along
- Bind config to typed @ConfigurationProperties so a typo is a startup error, not a runtime surprise
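The fail-fast rule in the list above can be sketched without any framework. The snippet below is a minimal illustration in plain Java, not the Spring binding itself: the class name AppConfig and the variable names DATABASE_URL, API_KEY and HTTP_TIMEOUT_SECONDS are hypothetical, chosen only for the example. In a Spring Boot app the same role would normally be played by a @ConfigurationProperties class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical typed config; the variable names are illustrative only. */
record AppConfig(String databaseUrl, String apiKey, int httpTimeoutSeconds) {

    /** Builds config from an environment map, failing fast if anything required is missing. */
    static AppConfig fromEnv(Map<String, String> env) {
        List<String> missing = new ArrayList<>();
        String databaseUrl = required(env, "DATABASE_URL", missing);
        String apiKey = required(env, "API_KEY", missing);
        if (!missing.isEmpty()) {
            // Report every missing variable at once so a single fix-and-restart is enough.
            throw new IllegalStateException("Missing required environment variables: " + missing);
        }
        // A timeout is safe to default; a credential is not.
        int timeout = parsePositiveInt(
                env.getOrDefault("HTTP_TIMEOUT_SECONDS", "5"), "HTTP_TIMEOUT_SECONDS");
        return new AppConfig(databaseUrl, apiKey, timeout);
    }

    private static String required(Map<String, String> env, String name, List<String> missing) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            missing.add(name);
            return null;
        }
        return value;
    }

    private static int parsePositiveInt(String raw, String name) {
        try {
            int value = Integer.parseInt(raw.trim());
            if (value <= 0) {
                throw new NumberFormatException("not positive");
            }
            return value;
        } catch (NumberFormatException e) {
            throw new IllegalStateException(name + " must be a positive integer, got: " + raw, e);
        }
    }

    /** Redacts the secret so logging the config object cannot leak it. */
    @Override
    public String toString() {
        return "AppConfig[databaseUrl=" + databaseUrl
                + ", apiKey=***, httpTimeoutSeconds=" + httpTimeoutSeconds + "]";
    }
}
```

Calling AppConfig.fromEnv(System.getenv()) as the first thing at startup means a missing or malformed variable stops the process before it serves traffic, instead of surfacing later as a failed request.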
Observability is not optional
When something breaks at 3am, you are debugging with whatever you instrumented beforehand. Actuator plus Micrometer gives us metrics with almost no code, and we wire structured JSON logging so a log line is queryable, not just human-readable. Every request carries a trace id that flows through logs and downstream calls, so we can follow one user's request across services.
    management:
      endpoints:
        web:
          exposure:
            include: health,info,metrics,prometheus

The three things we make sure we can answer without redeploying: what is the error rate, how long do requests take at the 95th percentile, and which downstream call is slow. If the answer to any of those requires adding a log line and shipping, we instrumented too little.
You cannot add observability during the incident. You either had it, or you are guessing in the dark.
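To make the trace id idea concrete, here is a minimal, framework-free sketch of the mechanics: reuse the caller's id if one arrived, otherwise start a new one, and stamp it on every structured log line. The class TraceLog and the header name X-Trace-Id are hypothetical; in a real Spring Boot service this job is usually handled by a tracing library and the logging framework's context map rather than hand-written code.

```java
import java.util.UUID;

/** Illustrative only: a hand-rolled trace context plus a one-line JSON log formatter. */
final class TraceLog {
    // Hypothetical header name; many setups use the W3C "traceparent" header instead.
    static final String TRACE_HEADER = "X-Trace-Id";

    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TraceLog() {}

    /** Reuses the caller's trace id if present, otherwise starts a new trace. */
    static String begin(String incomingTraceId) {
        String id = (incomingTraceId == null || incomingTraceId.isBlank())
                ? UUID.randomUUID().toString()
                : incomingTraceId.trim();
        CURRENT.set(id);
        return id;
    }

    /** Clears the id at the end of a request so pooled threads do not carry it over. */
    static void end() {
        CURRENT.remove();
    }

    static String currentTraceId() {
        return CURRENT.get();
    }

    /** Renders one log event as a single JSON line that a log store can query by field. */
    static String jsonLine(String level, String message) {
        String traceId = CURRENT.get();
        String traceField = traceId == null ? "null" : "\"" + escape(traceId) + "\"";
        return "{\"level\":\"" + escape(level) + "\",\"traceId\":" + traceField
                + ",\"message\":\"" + escape(message) + "\"}";
    }

    private static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }
}
```

The id returned by begin is also what would be copied onto outgoing calls under the same header, which is what lets one request be followed across services.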
Graceful shutdown so deploys do not drop requests
By default a container can be killed mid-request during a rolling deploy, and the user gets a connection reset through no fault of their own. Spring Boot supports graceful shutdown: stop accepting new requests, finish the in-flight ones, then exit. We set it explicitly and pair it with a shutdown grace period in the orchestrator that is longer than our slowest reasonable request.
There is a subtlety we learned the hard way. Readiness has to flip to "not ready" before shutdown begins, so the load balancer stops sending new traffic a few seconds before the process actually starts draining. Setting server.shutdown: graceful and spring.lifecycle.timeout-per-shutdown-phase: 30s handles the draining, but without that readiness gap you still drop the requests that arrive in the window between the kill signal and the load balancer noticing.
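The ordering matters more than any single setting, so here is a minimal sketch of the sequence in plain Java. It is not how Spring Boot implements graceful shutdown; the class DrainSequence, its field names and the durations are assumptions made for illustration, with an ExecutorService standing in for the server's request threads.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustrative shutdown ordering: fail readiness, wait, stop accepting, drain. */
final class DrainSequence {
    private final AtomicBoolean ready = new AtomicBoolean(true);
    private final ExecutorService requestPool;
    private final Duration readinessGap;
    private final Duration drainTimeout;

    DrainSequence(ExecutorService requestPool, Duration readinessGap, Duration drainTimeout) {
        this.requestPool = requestPool;
        this.readinessGap = readinessGap;
        this.drainTimeout = drainTimeout;
    }

    /** What a readiness probe would report. */
    boolean isReady() {
        return ready.get();
    }

    /** Returns true if all in-flight work finished within the drain timeout. */
    boolean shutdown() {
        // 1. Report "not ready" first so the load balancer stops routing here.
        ready.set(false);
        try {
            // 2. Keep serving during the gap: requests already routed to us still succeed.
            Thread.sleep(readinessGap.toMillis());
            // 3. Stop accepting new work; tasks already submitted keep running.
            requestPool.shutdown();
            // 4. Wait for in-flight work, bounded by the drain timeout.
            if (requestPool.awaitTermination(drainTimeout.toMillis(), TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // Timed out or interrupted: cancel whatever is left.
        requestPool.shutdownNow();
        return false;
    }
}
```

The orchestrator's grace period has to cover the readiness gap plus the drain timeout; if it is shorter, the process is killed before step 4 finishes and the slowest requests are dropped anyway.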