Tracing

Overview

The microservice architecture differs from traditional monoliths in many respects. From the request observability perspective, the microservices that compose a request flow are separated by asynchronous boundaries. Moreover, these microservices can have heterogeneous monitoring semantics. A tracing solution that provides a holistic view of the request flow helps you understand the system and make informed decisions about troubleshooting and performance optimization.

Tracing in Kyma uses Jaeger as the back end, which serves as the query mechanism for displaying information about traces.

Architecture

The diagram presents the tracing flow including the details of requesting and storing traces.

Tracing architecture

The Jaeger Deployment is the central element of the tracing architecture. It serves as a target of all query requests sent from the Jaeger UI. It is also the space for storing and processing the spans and traces created by Envoy, Istio, and Kyma services.

Request traces

The process of requesting traces from Jaeger looks as follows:

  1. A Kyma user accesses Jaeger UI.
  2. The user requests the trace details for a given service by selecting the service from the Services drop-down menu and clicking the Find Traces button. Jaeger passes the request to jaeger-query, which is the UI facade.
  3. The jaeger-query forwards the details to the Jaeger Deployment. The kcproxy verifies each request and, if the authentication is successful, Jaeger sends the requested information back.

Request traces

Store traces

Traces are stored in Jaeger in the following way:

  1. A Kyma user configures the application to propagate the correct HTTP headers for the outbound calls.
  2. Envoy passes the trace details to the Zipkin Kubernetes service. This service acts as a facade receiving the trace and span details.
  3. The Zipkin service forwards the tracing information to the Jaeger Deployment, which processes it.

Store traces
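To make the "trace and span details" more concrete, the following is a minimal sketch of what a single span could look like when serialized for a Zipkin-compatible endpoint. It assumes the Zipkin v2 JSON layout; which Zipkin API version Envoy uses in a given Kyma release is not stated here. The function name, service name, and operation name are hypothetical, and in practice Envoy creates these spans for you.

```python
import json
import secrets
import time


def make_zipkin_span(service_name, operation, duration_us, trace_id=None, parent_id=None):
    """Build one span as a dict in the Zipkin v2 JSON layout (assumed format)."""
    span = {
        "traceId": trace_id or secrets.token_hex(16),  # 128-bit trace ID, hex-encoded
        "id": secrets.token_hex(8),                    # 64-bit span ID, hex-encoded
        "name": operation,
        "timestamp": int(time.time() * 1_000_000),     # start time in microseconds
        "duration": duration_us,                       # duration in microseconds
        "localEndpoint": {"serviceName": service_name},
    }
    if parent_id is not None:
        span["parentId"] = parent_id  # links this span to the span of its caller
    return span


# Zipkin-compatible collectors accept a JSON array of spans per request.
# "orders" and "get /orders" are hypothetical example values.
payload = json.dumps([make_zipkin_span("orders", "get /orders", 1500)])
```

All spans that share the same traceId belong to one trace, which is how Jaeger can later reassemble the request flow.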

Search traces by tags

You can search traces using tags. Tags are key-value pairs configured for each service.

See the full list of tags for a service from the details of that service's span.

For example, use these tags for event-publish-service:

  • event-type
  • event-type-ver
  • event-id
  • source-id

To search the traces, you can use either a single tag, such as event-type="order.created", or multiple tags, such as event-type="order.created" event-type-ver="v1".
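If you build such search strings programmatically, a small helper can keep the format consistent. The sketch below only reproduces the quoting style of the examples above; the function name build_tag_query is hypothetical and not part of Jaeger or Kyma.

```python
def build_tag_query(tags):
    """Join key-value pairs into a space-separated tag filter string."""
    parts = []
    for key, value in tags.items():
        # Quote each value the same way as the examples in this section.
        parts.append(f'{key}="{value}"')
    return " ".join(parts)


single = build_tag_query({"event-type": "order.created"})
multiple = build_tag_query({"event-type": "order.created", "event-type-ver": "v1"})
print(single)
print(multiple)
```

The two printed strings match the single-tag and multiple-tag examples given above.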

Details

Benefits of distributed tracing

Observability tools should clearly show the big picture, whether they help you monitor just a couple of components or many. In a cloud-native microservice architecture, a user request often flows through dozens of different microservices. Tools such as logging or monitoring help to track the request path, but they treat each component or microservice in isolation. This isolated view results in operational issues.

Distributed tracing charts out the transactions in cloud-native systems, helping you to understand the application behavior and the relations between front-end actions and the back-end implementation.

The diagram shows how the distributed tracing helps to track the request path.

Distributed tracing

Jaeger

Overview

Jaeger is a monitoring and tracing tool for microservice-based distributed systems. Its features include the following:

  • Distributed context propagation
  • Distributed transaction monitoring
  • Root cause analysis
  • Service dependency analysis
  • Performance and latency optimization

Usage

The Envoy sidecar uses Jaeger to trace the request flow in the Istio Service Mesh. Jaeger is compatible with the Zipkin protocol, which Istio and Envoy use to communicate with the tracing back end. This allows you to use the Zipkin protocol and clients in Istio, Envoy, and Kyma services.

For details, see Istio's Distributed Tracing.

Install Jaeger locally

Read this document to learn how to install Jaeger locally.

Access Jaeger

Access the Jaeger UI either locally at https://jaeger.kyma.local or on a cluster at https://jaeger.{domain-of-kyma-cluster}.

Propagate HTTP headers

The Envoy proxy controls the inbound and outbound traffic in the application and automatically sends the trace information to Zipkin. However, to track the flow of REST API calls or service injections in Kyma, the application must cooperate with the microservices code. To enable such cooperation, configure the application to propagate the tracing context in HTTP headers when making outbound calls. See the Istio documentation for details on the headers required to ensure correct tracing in Kyma.
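The propagation step can be sketched as copying the tracing headers from the incoming request onto every outbound call. The example below is a framework-independent sketch, not Kyma code: the function names are hypothetical, and the header list is an assumption based on Istio's distributed tracing documentation, so verify it against the Istio version you run.

```python
# Header names as listed in Istio's distributed tracing documentation at the
# time of writing (an assumption; check the documentation for your version).
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "x-ot-span-context",
)


def extract_trace_headers(incoming_headers):
    """Return only the tracing headers found in an incoming request."""
    # HTTP header names are case-insensitive, so normalize before matching.
    lowered = {name.lower(): value for name, value in incoming_headers.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


def with_trace_headers(incoming_headers, outbound_headers=None):
    """Merge the tracing headers into the headers of an outbound call."""
    merged = dict(outbound_headers or {})
    merged.update(extract_trace_headers(incoming_headers))
    return merged
```

In a request handler, you would pass the result of with_trace_headers to whichever HTTP client the application uses, so that Envoy can attach the outbound span to the same trace.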

Compare traces

Trace comparison allows you to compare the structure of two traces, rendered as a tree of connected services and operations. The colors help you to distinguish the differences between two traces.

Compare the traces using the Jaeger user interface.

  1. On the search page for traces, select the traces to compare and click Compare Traces.

    Tracing architecture

  2. The page shows the comparison of two traces selected in the previous step. The traces are marked with A and B.

    Tracing architecture

  3. Use the top menus for A and B to select the traces you want to compare.

    Tracing architecture

    Trace spans have different colors which indicate their meaning:

    • Dark colors indicate that the span is missing from one of the traces:
      • Dark red: The span is only present in trace A.
      • Dark green: The span is only present in trace B.
    • Light colors indicate that the span is present in both traces but occurs more often in one of the traces:
      • Light red: The span occurs more often in trace A than in trace B.
      • Light green: The span occurs more often in trace B than in trace A.
    • Gray: The span is present in both traces with the same number of further spans grouped in it.

    Additionally, spans are marked with numerical values indicating how often they occur in compared traces. The values can be positive or negative.

    NOTE: A missing span can mean either that the application does not call the downstream service, which might be a bug, or that the downstream service is down.

    Tracing architecture
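The color rules for trace comparison can be summarized as a small decision function. This is only a sketch of the rules described above, not Jaeger's actual implementation; the function name and the idea of passing per-trace occurrence counts are assumptions made for illustration.

```python
def classify_span(count_a, count_b):
    """Map a span's occurrence counts in traces A and B to a color label."""
    if count_a == 0 and count_b == 0:
        raise ValueError("span is absent from both traces")
    if count_b == 0:
        return "dark red"     # only present in trace A
    if count_a == 0:
        return "dark green"   # only present in trace B
    if count_a > count_b:
        return "light red"    # occurs more often in trace A
    if count_b > count_a:
        return "light green"  # occurs more often in trace B
    return "gray"             # same count in both traces
```

For example, a span that appears three times in trace A and never in trace B is classified as dark red, which is the case the NOTE about missing spans refers to.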

Configuration

Jaeger chart

To configure the Jaeger chart, override the default values of its values.yaml file. This document describes parameters that you can configure.

TIP: To learn more about how to use overrides in Kyma, see the following documents:

Configurable parameters

This table lists the configurable parameters, their descriptions, and default values:

Parameter | Description | Default value
resources.limits.memory | Defines the maximum amount of memory that is available for storing traces in Jaeger. | 128M
jaeger.persistence.storageType | Defines the storage type for span data. | badger
jaeger.persistence.dataPath | Defines the directory path where span data is stored. | /badger/data
jaeger.persistence.keyPath | Defines the directory path where data keys are stored. | /badger/key
jaeger.persistence.ephemeral | Defines whether the storage uses a temporary file system. | false
jaeger.persistence.accessModes | Defines the access mode settings for the persistent volume claim (PVC). | ReadWriteOnce
jaeger.persistence.size | Defines the disk size requested through the persistent volume claim. | 1Gi
jaeger.persistence.storageClassName | Defines the storage class name of the persistent volume claim. |
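The dotted parameter names in the table correspond to nested keys in values.yaml. The following sketch shows that mapping for a set of overrides; the helper name nest_overrides and the override values 256M and 2Gi are hypothetical and chosen only for illustration.

```python
def nest_overrides(flat_overrides):
    """Expand dotted parameter names into the nested layout of values.yaml."""
    nested = {}
    for dotted_key, value in flat_overrides.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            # Create the intermediate mapping if it does not exist yet.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Hypothetical override values, not recommendations.
overrides = nest_overrides({
    "resources.limits.memory": "256M",
    "jaeger.persistence.size": "2Gi",
})
```

Here, overrides holds a resources.limits.memory entry and a jaeger.persistence.size entry as nested mappings, which is the structure the chart reads from values.yaml.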

Troubleshooting

Basic Troubleshooting

Jaeger shows only a few traces

The current Istio Pilot settings define the trace sampling rate at 1.0, where 100 is the maximum value. This means that only 1 out of 100 requests is sent to Jaeger for trace recording. To change this system behavior, run:

kubectl -n istio-system edit deploy istio-pilot

Set the traceSampling parameter to a desired value, such as 60.

NOTE: Using a very high value may affect Istio's performance and stability.
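To estimate the effect of a sampling value before changing it, you can compute the expected number of recorded traces. The sketch below treats the sampling value as a percentage, as described above; the function name and the request count of 1000 are hypothetical.

```python
def expected_traces(request_count, sampling_percent):
    """Estimate how many requests are traced at a given sampling percentage."""
    if not 0 <= sampling_percent <= 100:
        raise ValueError("sampling_percent must be between 0 and 100")
    return request_count * sampling_percent / 100


print(expected_traces(1000, 1.0))  # default rate described in this section
print(expected_traces(1000, 60))   # example value from this section
```

With the default rate, about 10 of 1000 requests are recorded, while a value of 60 records about 600, which illustrates why high values increase the load on Istio and Jaeger.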