Tracing

Overview

A microservice architecture differs from a traditional monolith in many respects. From the perspective of request observability, a single request flow crosses asynchronous boundaries between the microservices that compose it. Moreover, these microservices can have heterogeneous monitoring semantics. A tracing solution that provides a holistic view of the request flow helps you understand the system and make informed decisions about troubleshooting and performance optimization.

Tracing in Kyma uses Jaeger as its back end. Jaeger provides the query mechanism for displaying information about traces.

Architecture

The Jaeger-based tracing component provides the necessary functionality to collect and query traces. Both operations may occur at the same time. This way, you can inspect specific traces in the Jaeger UI while Jaeger collects and stores new traces in parallel. See the diagram for details:

Tracing architecture

Collect traces

The process of collecting traces by Jaeger looks as follows:

  1. The application receives a request, either from an internal or external source.
  2. If the application has Istio injection enabled, the Istio proxy propagates the correct HTTP headers of the requests to the Jaeger Deployment. The Istio proxy calls Jaeger using the Zipkin service, which exposes a Jaeger port compatible with the Zipkin protocol.
  3. Jaeger processes the data. Specifically, the Jaeger Agent component receives the spans, batches them, and forwards them to the Jaeger Collector service.
  4. The BadgerDB database stores the data and persists it using a PersistentVolume resource.

Query traces

The process of querying traces from Jaeger looks as follows:

  1. A Kyma user accesses the Jaeger UI to look for specific traces.
  2. Jaeger UI passes the request to the Jaeger Query service. The request goes through the Istio Ingress Gateway which forwards the incoming connections to the service.
  3. Jaeger Query passes the request to the Keycloak Gatekeeper for authorization. The Gatekeeper calls Dex to authenticate the user and the request, and grants further access if the authentication is successful.
  4. Finally, the functionality provided by the Jaeger Deployment allows you to retrieve trace information.

Details

Benefits of distributed tracing

Observability tools should clearly show the big picture, whether they help you monitor a couple of components or many. In a cloud-native microservice architecture, a user request often flows through dozens of different microservices. Tools such as logging or monitoring help to track the request path. However, they treat each component or microservice in isolation, which results in operational issues.

Distributed tracing charts out the transactions in cloud-native systems, helping you to understand the application behavior and the relations between front-end actions and the back-end implementation.

The diagram shows how distributed tracing helps to track the request path.

Distributed tracing

Jaeger

Overview

Jaeger is a monitoring and tracing tool for microservice-based distributed systems. Its features include the following:

  • Distributed context propagation
  • Distributed transaction monitoring
  • Root cause analysis
  • Service dependency analysis
  • Performance and latency optimization

Usage

The Envoy sidecar uses Jaeger to trace the request flow in the Istio Service Mesh. Jaeger is compatible with the Zipkin protocol, which Istio and Envoy use to communicate with the tracing back end. This allows you to use the Zipkin protocol and clients in Istio, Envoy, and Kyma services.

For details, see Istio's Distributed Tracing.

Install Jaeger locally

Read this document to learn how to install Jaeger locally.

Access Jaeger

Access the Jaeger UI either locally at https://jaeger.kyma.local or on a cluster at https://jaeger.{domain-of-kyma-cluster}.

Propagate HTTP headers

The Envoy proxy controls the inbound and outbound traffic of the application and automatically sends the trace information to Zipkin. However, to track the flow of REST API calls or service injections in Kyma, the application code must cooperate, because Envoy cannot correlate an inbound request with the outbound calls it triggers on its own. To enable such cooperation, configure the application to propagate the tracing context in HTTP headers when making outbound calls. See the Istio documentation for details on the headers required to ensure correct tracing in Kyma.
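As a minimal sketch of the propagation step described above, the following Python snippet copies the tracing headers from an incoming request so they can be attached to outbound calls. The header names are the ones listed in the Istio distributed tracing documentation. The function name `extract_trace_headers` and the sample values are hypothetical and serve only as an illustration, not as Kyma's own implementation.

```python
# Tracing headers named in the Istio distributed tracing documentation.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "x-ot-span-context",
)


def extract_trace_headers(incoming_headers):
    """Return only the tracing headers present on the incoming request.

    Hypothetical helper: pass the result as headers of every outbound call
    made while handling that request.
    """
    # HTTP header names are case-insensitive, so normalize before matching.
    lowered = {name.lower(): value for name, value in incoming_headers.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


# Sample incoming request headers (illustrative values only).
incoming = {
    "X-Request-Id": "req-1",
    "X-B3-TraceId": "80f198ee56343ba864fe8b2a57d3eff7",
    "X-B3-SpanId": "e457b5a2e4d86bd1",
    "Content-Type": "application/json",
}
outbound_headers = extract_trace_headers(incoming)
```

With an HTTP client of your choice, you would merge `outbound_headers` into the headers of each downstream request. Headers unrelated to tracing, such as `Content-Type`, are deliberately left out.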

Compare traces

Trace comparison allows you to compare the structure of two traces, rendered as a tree of connected services and operations. The colors help you to distinguish the differences between two traces.

Compare the traces using the Jaeger user interface.

  1. In the search page for traces, select the traces to compare and click Compare Traces.

    Tracing architecture

  2. The page shows the comparison of two traces selected in the previous step. The traces are marked with A and B.

    Tracing architecture

  3. Use the top menus for A and B to select the traces you want to compare.

    Tracing architecture

    Trace spans have different colors which indicate their meaning:

    • Dark colors indicate that the span is missing from one of the traces:
      • Dark red: The span is only present in trace A.
      • Dark green: The span is only present in trace B.
    • Light colors indicate that the span is present in both traces but occurs more often in one of the traces:
      • Light red: The span in A has more spans than the span in B.
      • Light green: The span in B has more spans than the span in A.
    • Gray indicates that both traces have the span and the same number of further spans grouped in it.

    Additionally, spans are marked with numerical values indicating how often they occur in compared traces. The values can be positive or negative.

    NOTE: A missing span can mean either that the application does not call the downstream service, which might be a bug, or that the downstream service is down.

    Tracing architecture
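The color rules above can be sketched as a small Python function. This is a simplification that assumes each span is reduced to a pair of occurrence counts, one per trace. The function name `diff_color` is hypothetical and does not come from Jaeger.

```python
def diff_color(count_a, count_b):
    """Map a span's occurrence counts in traces A and B to a comparison color."""
    if count_a < 0 or count_b < 0:
        raise ValueError("counts must not be negative")
    if count_a == 0 and count_b == 0:
        raise ValueError("the span must be present in at least one trace")
    if count_b == 0:
        return "dark red"      # present only in trace A
    if count_a == 0:
        return "dark green"    # present only in trace B
    if count_a > count_b:
        return "light red"     # present in both, more often in A
    if count_b > count_a:
        return "light green"   # present in both, more often in B
    return "gray"              # present in both with equal counts
```

For example, `diff_color(2, 0)` yields the dark red case, which corresponds to a span that trace B is missing.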

Search for traces

You can search traces using tags. Tags are key-value pairs configured for each service. You can find the full list of tags for a service in the details of that service's span.

For example, use these tags for event-publish-service:

  • event-type
  • event-type-ver
  • event-id
  • source-id

To search the traces, you can use either a single tag, such as event-type="order.created", or multiple tags, such as event-type="order.created" event-type-ver="v1".
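To illustrate how such a tag query selects spans, here is a minimal Python sketch that parses a query in the format shown above and checks it against a span's tags. It assumes that a span matches only if every tag in the query matches. The function names and the sample span are hypothetical, and the snippet does not call Jaeger itself.

```python
import shlex


def parse_tag_query(query):
    """Parse a query such as 'key="value" other="x"' into a dictionary."""
    tags = {}
    # shlex.split separates on whitespace and strips the quotes around values.
    for token in shlex.split(query):
        key, separator, value = token.partition("=")
        if not separator or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        tags[key] = value
    return tags


def span_matches(span_tags, query):
    """Return True if the span carries every tag named in the query."""
    wanted = parse_tag_query(query)
    return all(span_tags.get(key) == value for key, value in wanted.items())


# Sample span tags (illustrative values only).
span_tags = {"event-type": "order.created", "event-type-ver": "v1"}
matches_both = span_matches(
    span_tags, 'event-type="order.created" event-type-ver="v1"'
)
matches_v2 = span_matches(span_tags, 'event-type-ver="v2"')
```

In this sketch, `matches_both` is true and `matches_v2` is false, because the sample span carries version `v1`.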

Configuration

Jaeger chart

To configure the Jaeger chart, override the default values of its values.yaml file. This document describes parameters that you can configure.

TIP: To learn more about how to use overrides in Kyma, see the following documents:

Configurable parameters

This table lists the configurable parameters, their descriptions, and default values:

| Parameter | Description | Default value |
|---|---|---|
| resources.limits.memory | Defines the maximum amount of memory that is available for storing traces in Jaeger. | 128M |
| jaeger.persistence.storageType | Defines the storage type for span data. | badger |
| jaeger.persistence.dataPath | Defines the directory path where span data is stored. | /badger/data |
| jaeger.persistence.keyPath | Defines the directory path where data keys are stored. | /badger/key |
| jaeger.persistence.ephemeral | Defines whether the storage uses a temporary file system. | false |
| jaeger.persistence.accessModes | Defines the access mode settings for the PersistentVolumeClaim (PVC). | ReadWriteOnce |
| jaeger.persistence.size | Defines the disk size requested through the PVC. | 1Gi |
| jaeger.persistence.storageClassName | Defines the storage class name of the PVC. | |
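The parameter names in the table are dotted paths into the nested structure of values.yaml. As a rough illustration, the following Python sketch expands dotted keys into that nested form. The helper `to_nested` and the override values `256M` and `2Gi` are hypothetical examples, not recommended settings.

```python
def to_nested(overrides):
    """Expand dotted parameter names into a nested dictionary."""
    nested = {}
    for dotted_key, value in overrides.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        # Walk down the path, creating intermediate levels as needed.
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Hypothetical override values, chosen only for illustration.
overrides = to_nested({
    "resources.limits.memory": "256M",
    "jaeger.persistence.size": "2Gi",
})
```

The resulting dictionary mirrors how the same overrides would be nested in a YAML file, with `memory` under `resources.limits` and `size` under `jaeger.persistence`.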

Troubleshooting

Basic Troubleshooting

Jaeger shows only a few traces

Istio Pilot sets the trace sampling rate at 1.0, where 100 is the maximum value. This means that only 1 out of 100 requests is sent to Jaeger for trace recording. To change this system behavior, run:

kubectl -n istio-system edit deploy istio-pilot

Set the traceSampling parameter to a desired value, such as 60.

NOTE: Using a very high value may affect the performance and stability of Jaeger and Istio. If you raise the sampling rate, also increase the memory limits of the Jaeger Deployment.
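To make the sampling arithmetic concrete, the sketch below estimates how many requests end up as traces for a given sampling value, assuming the value is a percentage between 0 and 100 as described above. The function name `expected_traces` is hypothetical. Because sampling is probabilistic, the result is an average, not an exact count.

```python
def expected_traces(request_count, trace_sampling_percent):
    """Estimate the average number of requests recorded as traces."""
    if not 0.0 <= trace_sampling_percent <= 100.0:
        raise ValueError("the sampling value must be between 0 and 100")
    return request_count * trace_sampling_percent / 100.0


# 1000 requests at the sampling rate of 1.0 and at the example value of 60.
at_default = expected_traces(1000, 1.0)
at_sixty = expected_traces(1000, 60)
```

Here `at_default` is 10.0 and `at_sixty` is 600.0, which shows how quickly the trace volume that Jaeger has to store grows with the sampling value.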