This is a guest post by Bob Reselman.
Kubernetes is a big technology that brings a lot of power and control to working with web-scale applications. However, a drawback is that much of the activity internal to Kubernetes is obscured from view. Applications running in Kubernetes can be hard to debug, let alone test.
Yet being able to adequately debug a Kubernetes application is essential, not only to remedy issues discovered in testing, but also as part of the day-to-day activity of the modern developer.
In the spirit of making things a bit easier all around, I’ll share five tips that will make debugging Kubernetes an easier undertaking, both for developers and test practitioners.
1. No Matter What, Log
Logging is an essential first step for observing application behavior, particularly when applications are distributed over a number of nodes in a Kubernetes cluster.
Kubernetes captures container log output within the cluster and provides the command kubectl logs <pod name> out of the box to access it. See figure 1, below.
Figure 1: Kubernetes provides logging internal to the cluster, but a logging service provides an ongoing record of activity among clusters.
The kubectl logs command displays the log activity generated by a particular pod or, in a multi-container pod, by a particular container. Listing 1 below shows how to access the logs from a single-container pod.
root@linuxlab:~# kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
pinger-8dd8df684-r25xd   1/1     Running   0          3m
root@linuxlab:~# kubectl logs pinger-8dd8df684-r25xd
Listening on port 3000
{ app: 'pinger',
  request:
   IncomingMessage {
     _readableState:
      ReadableState {
        objectMode: false,
        highWaterMark: 16384,
        buffer: [Object],
        length: 0,
        pipes: null,
        pipesCount: 0,
        flowing: null,
        ended: false,
        endEmitted: false,
        reading: false,
        sync: true,
        needReadable: false,
        emittedReadable: false,
        readableListening: false,
        resumeScheduled: false,
        destroyed: false,
        defaultEncoding: 'utf8',
        awaitDrain: 0,
        readingMore: true,
        decoder: null,
        encoding: null },
     readable: true,
     domain: null,
     _events: {},
     _eventsCount: 0,
     _maxListeners: undefined,
. . .
Listing 1: The kubectl logs command reports a pod’s log activity within the cluster
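Listing 1 shows the simplest case. When a pod runs more than one container, or you want to watch output as it is written, kubectl logs takes a few additional flags. The pod and container names below are the ones from the listings in this article and are purely illustrative:

# Logs from a specific container in a multi-container pod
kubectl logs pinger-8dd8df684-r25xd -c pinger

# Stream log output as it happens
kubectl logs -f pinger-8dd8df684-r25xd

# Logs from the previous instance of a crashed container
kubectl logs --previous pinger-8dd8df684-r25xd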
While kubectl logs is useful, it has a shortcoming: the command shows log activity only for a single pod or container, not for all the pods running in a cluster. To get that broader view, you'll need to use one of the many online logging services available.
Getting up and running with a logging service is straightforward, provided you design your application and its components to support remote logging. This means abstracting logging away from any specific log solution: instead of referencing a particular logging technology directly in code, write a helper function or class that can be configured at deployment time to use the service of your choice, as in the sketch below. Getting locked into a specific logging technology can cause problems down the road when a change needs to be made.
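As a minimal sketch of what "configured at deployment time" can look like, the commands below assume a hypothetical LOG_ENDPOINT environment variable that the application's logging helper reads at startup; the variable name and value are illustrative only:

# Point the logging helper at the service of choice (hypothetical variable and endpoint)
kubectl set env deployment/pinger LOG_ENDPOINT=https://logs.example.com/ingest

# Verify the environment the deployment's containers will see
kubectl set env deployment/pinger --list

Because the helper reads only the environment variable, switching logging services becomes a deployment change rather than a code change.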
However, no matter which logging service you use — or if you decide to just go with Kubernetes’s kubectl logs — having a structured, predictable approach to logging is fundamental to observing and debugging an application running on Kubernetes.
2. Use kubectl describe as Your First Line of Inquiry
If you want to figure out the state and status of a Kubernetes API resource, as well as the events that unfold over the lifetime of that resource, the Kubernetes command kubectl describe will give you a fast, clear picture of what’s going on.
Listing 2 below shows how to execute kubectl describe against a Kubernetes deployment that has the name “pinger.”
root@linuxlab:~# kubectl describe deployment pinger
Name:                   pinger
Namespace:              default
CreationTimestamp:      Sat, 18 May 2019 11:48:01 -0700
Labels:                 app=pinger
Annotations:            deployment.kubernetes.io/revision: 1
                        kubectl.kubernetes.io/last-applied-configuration:
                          {"apiVersion":"extensions/v1beta1","kind":"Deployment","metadata":{"annotations":{},"name":"pinger","namespace":"default"},"spec":{"replic...
Selector:               app=pinger
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  1 max unavailable, 1 max surge
Pod Template:
  Labels:  app=pinger
  Containers:
   pinger:
    Image:      reselbob/pinger:v2.2
    Port:       3000/TCP
    Host Port:  0/TCP
    Environment:
      CURRENT_VERSION:  LESSON_08
    Mounts:             <none>
  Volumes:              <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   pinger-8dd8df684 (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  46m   deployment-controller  Scaled up replica set pinger-8dd8df684 to 1
Listing 2: The Kubernetes command kubectl describe gives the state, status and events about a deployment
Notice that the command not only provides details about the deployment, including the deployment’s pods and associated containers, but also, toward the end of the listing, event information describing the various occurrences over the deployment’s lifetime.
When you’re troubleshooting a time-sensitive issue in Kubernetes, kubectl describe often provides the immediate insight you need to guide your investigation.
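The same command works for any resource type, and pods are usually the next stop when a workload won’t start or keeps restarting. The pod name below is the one from Listing 1; kubectl get events is a standard companion for viewing cluster-wide events in order:

kubectl describe pod pinger-8dd8df684-r25xd                 # per-pod conditions, restart counts and events
kubectl get events --sort-by=.metadata.creationTimestamp    # cluster-wide events, oldest first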
Those in the know about Kubernetes use kubectl describe as the first line of inquiry.
3. Invest in Application Performance Monitoring
As mentioned earlier, Kubernetes is a complex technology with a lot of moving parts. There is really no way that any single human or group of humans can keep track of it all. Tools are needed, particularly when it comes to performance monitoring.
A good performance monitoring tool keeps an eye on a wide variety of operational activities going on in the digital infrastructure powered by Kubernetes, including network IO, disk activity and CPU utilization.
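Even before you adopt a dedicated tool, Kubernetes can give you a quick read on CPU and memory consumption with kubectl top, assuming the metrics-server add-on is installed in the cluster:

kubectl top nodes   # CPU and memory usage per node
kubectl top pods    # CPU and memory usage per pod in the current namespace

It’s no substitute for a full monitoring solution, but it’s often enough to spot an obvious hot spot.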
Whether you’re running on a cloud provider such as AWS, Google Cloud or Azure, or taking an on-premises or hybrid approach, investing in comprehensive application performance monitoring tools will save you time as well as heartache. You can use an open source solution such as Prometheus, or take advantage of a commercial product such as AppDynamics or Datadog.
Whatever route you take, the important thing is to really make the investment required. While money is always a consideration, the most important part of the investment is to ensure staff are well versed in using the selected tool. This means making sure that employees are given the time and support required to achieve mastery. Just saying “We’re going to invest in APM” is not enough; action must follow.
4. Implement Distributed Tracing
The days of the monolithic cloud application are coming to a close. Today’s modern architectures are distributed and ephemeral. An application will be made up of a variety of services, each of which is dedicated to a particular area of concern. The workloads backing a service can also come or go at a moment’s notice, scaling up or down to meet current demand, with code updates happening all the time. This is why Kubernetes is so popular: It’s designed to support an application infrastructure that is constantly changing.
Keeping track of it all is a daunting task, particularly when trying to identify bottlenecks in application performance. Remember, in a distributed environment, a request can make its way through any one of a number of services backed by logic that can exist in any geographical location at any time. It’s no longer about measuring request and response time; a deeper understanding is needed.
This is where distributed tracing comes in. See figure 2 below.
Figure 2: Distributed tracing enables comprehensive observation of application activity among a variety of microservices
Distributed tracing is designed to report request behavior and the environmental conditions relevant to that request as it makes its way among the variety of services needed to meet the task at hand. Distributed tracing is an important technology that is becoming standard for working with Kubernetes.
There are a number of open source distributed tracing tools you can use, such as Zipkin and Jaeger. Most commercial performance monitoring products also provide distributed tracing. As with performance monitoring, the important thing is to make sure adequate preparation and support are given to those charged with implementing distributed tracing in your company’s application stack. Your efforts will only be as good as the attention given.
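If you want to experiment before committing to a full rollout, a single-pod Jaeger instance is a common starting point. The sketch below assumes a recent kubectl (where kubectl run creates a single pod), Jaeger’s publicly available all-in-one image and its default UI port; it’s suitable for experimentation, not production:

kubectl run jaeger --image=jaegertracing/all-in-one:latest --port=16686   # collector, storage and UI in one pod
kubectl port-forward pod/jaeger 16686:16686                               # then browse to http://localhost:16686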
5. Encourage Developers to Start Using Code-Level Debugging Tools
An exciting new area in Kubernetes debugging is tooling that lets developers do step-by-step debugging of code running in a container in any Kubernetes cluster. Google has a beta extension for Visual Studio Code called Cloud Code that enables line-by-line debugging of containers running in Kubernetes from within the Visual Studio Code IDE. This is a big deal, and once a company as big as Google makes headway in this area, other big companies such as Amazon, Microsoft and Red Hat are sure to jump in with similar solutions.
Debugging running code in a cluster will be a game changer for developers and test practitioners alike. The time to prepare your company’s staff is now.
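You don’t have to wait for an IDE extension to get a feel for this. One common building block for in-cluster debugging is kubectl port-forward. As a minimal sketch, assuming the pinger container from the earlier listings is a Node.js process started with the inspector enabled (--inspect=0.0.0.0:9229), you can tunnel the debug port to your workstation and attach any inspector-compatible debugger:

kubectl port-forward pod/pinger-8dd8df684-r25xd 9229:9229   # tunnel the Node.js inspector port to localhost
# Attach Chrome DevTools or VS Code to localhost:9229 and set breakpoints as usual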
Putting It All Together
Debugging Kubernetes is a challenging undertaking. There really is no single tool or technique that can provide the breadth of inspection and information required to do effective troubleshooting: Getting a handle on issues requires a variety of approaches.
This includes making sure effective logging is in place. Staff also must have skills with the basic commands that Kubernetes provides to inspect various aspects of the given cluster. And you should take advantage of time-tested performance monitoring and distributed tracing solutions, as well as the new inline debuggers that are on the horizon.
As the experienced Kubernetes professional will tell you, simply deploying containers and spinning up services is not enough. Making sure that everything is running to expectation — and keeps running to expectation — is where most of the work happens, not only for DevOps professionals, but for testing practitioners as well. Hopefully these five tips will make the work easier and your efforts more effective.
Bob Reselman is a nationally known software developer, system architect, industry analyst, and technical writer/journalist. Bob has written many books on computer programming and dozens of articles about topics related to software development technologies and techniques, as well as the culture of software development. Bob is a former Principal Consultant for Cap Gemini and Platform Architect for the computer manufacturer Gateway. In addition to his software development and testing activities, Bob is in the process of writing a book about the impact of automation on human employment. He lives in Los Angeles and can be reached on LinkedIn at www.linkedin.com/in/bobreselman.