This repository contains labs for the OpenTelemetry Hands-On Session. Content updated as of March 2022. The tech stack is Java and Go.
In this step we will prepare the environment for the hands-on exercises.
Make sure you are in the shopizer directory; otherwise change directory by executing cd ~/shopizer
mvn clean install
The final output should look like the following image
If you would like to download the source repo for your own use, you can get it here https://github.com/Dynatrace-APAC/Workshop-otel-shopizer
You are now ready to start the hands-on exercises!
Persona: IT Operations
Objective: OpenTelemetry complementing OOTB OneAgent instrumentation
Scenario
Within your terminal navigate into the folder sm-shop by entering
cd sm-shop
Now we are ready to launch the Web Application using the following command.
mvn spring-boot:run
Open a new browser window and navigate to http://the.public.ip:8080
Feel free to navigate around within that application. In the interest of simplicity we have introduced a load generator that automatically requests the pages that are relevant for today's session.
Switch to the Dynatrace environment.
You will also notice that it contains two different kinds of service calls:
http://127.0.0.1:8080/shop/product/*
and
http://127.0.0.1:8080/shop/category/*
Take a look at a PurePath that is named similarly to this:
http://127.0.0.1:8080/shop/category/laptop-bags.html/ref=c:3
This is what Dynatrace captures out of the box with its very own Sensors. After a small configuration change we will see that there's a lot more data available.
The application we are working with is as a matter of fact already augmented with OpenTelemetry. We just haven't told Dynatrace yet to take advantage of it.
mvn spring-boot:run
Switch to the Dynatrace environment and within the Requests executed in background threads of com.salesmanager.shop.application.ShopApplication service > PurePaths, look for the transaction
http://127.0.0.1:8080/shop/category/laptop-bags.html/ref=c:3
Among the well-known PurePath nodes (Database Calls, ...) you will now notice additional entries with the OpenTelemetry icon.
The developer has chosen to "signal" to monitoring solutions which portions of the service flow are of importance. In this case, the developer has already augmented the relevant code with OpenTelemetry. Each of these items/nodes is called a Span. What we have just done is tell Dynatrace to "recognize" these spans as part of the PurePath.
In this specific case the developer might have been a bit overzealous. The Span query-category is visible countless times within the PurePath. Dynatrace by default captures every span.
As a member of the "IT Operations" team, you do not have access to the application code, and asking the developer to make this small change is not feasible. Dynatrace offers a way to configure this via the Dynatrace UI. You can control which spans are captured by adding an exclusion rule within the Settings menu.
Persona: Application developers
Objectives:
Scenario
In the Dynatrace UI > Services > Requests executed in background threads of com.salesmanager.shop.application.ShopApplication service > PurePaths, look for the transaction
http://127.0.0.1:8080/shop/product/vintage-courier-bag.html/ref=c:2
Take a look at the PurePath nodes:
You'll notice that it also contains a Span named calc
We are looking to introduce a bit more context information to the outgoing HTTP request following the initial call of the calc method. Let's take a look at the source code that makes it possible.
Let's investigate and understand the Java code that provides OpenTelemetry instrumentation via the Otel Java SDK.
Within Visual Studio Code expand the following folders:
shopizer/sm-shop/src/main/java/com/salesmanager/shop/store/controller/product
Select the file ShopProductController.java.
Scroll down to line 100. The method calcPrice is the one that currently produces this additional span.
public void calcPrice(Model model) {
    Span span = getTracer().spanBuilder("calc")
        .setAttribute("model", model.toString())
        .startSpan();
    try (Scope scope = span.makeCurrent()) {
        HttpUtil.Get("http://127.0.0.1:8090/calc");
    } finally {
        span.end();
    }
}
The code below creates a basic span and names it calc. Reference Otel SDK doc - Create a basic span
Span span = getTracer().spanBuilder("calc")
The code below annotates the span using Span attributes. Reference Otel SDK doc - Span Attribute
.setAttribute("model", model.toString())
The code below signals the start of the span and also starts timing it. The start and end times of the span are set automatically by the OpenTelemetry SDK. Reference Otel SDK doc - Create a basic span
.startSpan();
The .makeCurrent() function in the code below is an automated way offered by the SDK to propagate the parent span on the current thread. After all, we want calc to be nested under the "caller", which, by the way, is done automatically by the OneAgent instrumentation. Reference Otel SDK doc - Create nested span
try (Scope scope = span.makeCurrent())
The final code below signals the end of the span.
span.end();
It is now YOUR turn to augment additional methods with OpenTelemetry. A few lines above method calcPrice you will find the method handleQuote at line 91.
Line 92 makes an outbound HTTP GET call to another API called quote
public void handleQuote(final String reference) {
    HttpUtil.Get("http://127.0.0.1:8090/quote");
}
In the Dynatrace PurePaths we see a Golang icon next to the outgoing HTTP GET, but only because the HTTP call is made to a service on the same server. What if the call is made to a remote server, or a Lambda function? The call would never be listed in the vintage-courier-bag.html PurePaths at all...
mvn spring-boot:run
The new Service Calls should now contain an additional OpenTelemetry Span.
Let's take a closer look at the quote span.
Dynatrace realizes that this span comes with additional metadata, a so-called Span Attribute. Its value, however, is not captured by default. Click on the link below the attribute and configure Dynatrace to also collect its value.
There is no need to rebuild or restart the application for that. The attribute is already being reported. Dynatrace has simply ignored it until now.
Wait for new Service Calls to come in and verify that the value for the reference is getting captured.
Persona: Application developer
Objective: Configure "remote" trace ingest and send OTel traces to Dynatrace
Scenario
In the previous section we successfully added context information to these outbound web requests. Unfortunately that doesn't get us insight into what's happening within the application that receives these calls. The server side of these calls is a locally running application written in Golang - unfortunately a version of Golang that's not yet supported by OneAgent.
The application is however already prepared for sending OpenTelemetry Spans to Dynatrace. We just need to configure where to send that data to.
Within Visual Studio Code expand the following folders and navigate to
shopizer/gosrvc/main.go
Around line 38, OpenTelemetry is configured. In our code, we are using the OTLP exporter, which sends trace telemetry data over HTTP with protobuf-encoded payloads. This eliminates the need for an OpenTelemetry Collector, as Dynatrace has an API endpoint that accepts such payloads directly.
client := otlptracehttp.NewClient(
    otlptracehttp.WithEndpoint("######.sprint.dynatracelabs.com"),
    otlptracehttp.WithURLPath("/api/v2/otlp/v1/traces"),
    otlptracehttp.WithHeaders(map[string]string{
        "Authorization": "Api-Token ############################################################################################",
    }),
)
What is left is to configure the Dynatrace URL and also the API token.
// SaaS
otlptracehttp.WithEndpoint("######.live.dynatrace.com")
// Managed with Let's Encrypt SSL cert
otlptracehttp.WithEndpoint("######.managed-dynatrace.com")
// Domain with own SSL cert
otlptracehttp.WithEndpoint("<BASE URL>")
Take note that the SDK uses https by default, so you do NOT need to specify the full URL, only the domain.
Specifically for Managed
As Managed requires an additional path segment for the environment, the URL path must also include /e/Dynatrace-Environment-ID/
otlptracehttp.WithURLPath("/e/<EnvID>/api/v2/otlp/v1/traces"),
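To see how the endpoint domain and the URL path fit together, here is a minimal sketch that only uses the Go standard library. It is not part of the lab code or the OpenTelemetry SDK; the helper names urlPath and ingestURL, the domains, and the environment ID my-env-id are placeholders made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// otlpTracePath is the trace ingest path used in this lab.
const otlpTracePath = "/api/v2/otlp/v1/traces"

// urlPath returns the value you would pass to WithURLPath.
// envID is empty for SaaS; for Managed it is the environment ID.
// (Hypothetical helper, not part of any SDK.)
func urlPath(envID string) string {
	if envID == "" {
		return otlpTracePath
	}
	return "/e/" + envID + otlpTracePath
}

// ingestURL shows the full URL that results from combining the
// endpoint domain with the URL path. A scheme or trailing slash
// pasted by mistake is stripped first, since only the domain is expected.
func ingestURL(domain, envID string) string {
	domain = strings.TrimSuffix(strings.TrimPrefix(domain, "https://"), "/")
	return "https://" + domain + urlPath(envID)
}

func main() {
	// Placeholder values, replace with your own environment details.
	fmt.Println(ingestURL("abc12345.live.dynatrace.com", ""))
	fmt.Println(ingestURL("dynatrace.example.com", "my-env-id"))
}
```

If the printed URL does not match what you expect for your environment, the values passed to WithEndpoint and WithURLPath are probably the first place to look.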
Navigate to Access Tokens within your Dynatrace environment and generate a token with the OpenTelemetry trace ingest permission.
Copy and paste the access token into this line. Ensure that everything is copied correctly, without additional spaces or hidden or special characters.
"Authorization": "Api-Token ############################################################################################"
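Stray whitespace or hidden characters in a pasted token are easy to miss. The following is a minimal sketch of a check you could run on the token string before using it; the helper authHeader is hypothetical and not part of the lab code, and it makes no assumption about the token format beyond "printable ASCII without spaces".

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode"
)

// authHeader builds the Authorization header value used in this lab
// ("Api-Token <token>"). It trims surrounding whitespace and rejects
// tokens that still contain whitespace, control, or non-ASCII characters.
func authHeader(token string) (string, error) {
	token = strings.TrimSpace(token)
	if token == "" {
		return "", errors.New("token is empty")
	}
	for i, r := range token {
		if r > unicode.MaxASCII || unicode.IsSpace(r) || unicode.IsControl(r) {
			return "", fmt.Errorf("unexpected character %q at byte offset %d", r, i)
		}
	}
	return "Api-Token " + token, nil
}

func main() {
	// "example-token" is a placeholder, not a real Dynatrace token.
	header, err := authHeader("  example-token\n")
	fmt.Println(header, err)
}
```

The returned string is what would go into the map passed to otlptracehttp.WithHeaders under the "Authorization" key.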
mvn spring-boot:run
In the Dynatrace UI > Services > Requests executed in background threads of com.salesmanager.shop.application.ShopApplication service > PurePaths, look for the transaction
http://127.0.0.1:8080/shop/product/vintage-courier-bag.html/ref=c:2
The PurePath nodes under the Go node should now show what's going on for the quote calls.
What's not covered yet are the calc calls.
Let's investigate and understand the Go code that provides OpenTelemetry instrumentation via the Otel Golang SDK.
Take another look at the file main.go and scroll down to line 70 where function quote starts.
func quote(w http.ResponseWriter, req *http.Request) {
    ctx := otel.GetTextMapPropagator().Extract(req.Context(), propagation.HeaderCarrier(req.Header))
    var span trace.Span
    ctx, span = otel.Tracer(name).Start(ctx, "quote", trace.WithSpanKind(trace.SpanKindServer))
    process(ctx, uint(rand.Intn(20)))
    span.End()
    fmt.Fprintf(w, "done\n")
}
Since the Java code is making a call to the Go service, we need to capture the traceid from the caller. Think of the caller (i.e. the Java code/process) as the parent, which passes some "DNA information" (blood type, for example) to the child (i.e. the separate Golang code/process). Parent and child are different but share similar traits because of how the DNA is passed down. This is called trace context propagation.
Reference
The code below extracts the traceid from the HTTP header.
ctx := otel.GetTextMapPropagator().Extract(req.Context(), propagation.HeaderCarrier(req.Header))
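To make the "DNA" a bit more tangible, here is a minimal standard-library sketch of what such a header can look like. It assumes the W3C Trace Context format, where the trace context travels in a traceparent header; which propagator is actually configured in main.go is not shown here, so treat that as an assumption. The type traceParent and the function parseTraceParent are made up for illustration and are no substitute for the SDK's propagator.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// traceParent holds the fields of a W3C "traceparent" header (version 00).
type traceParent struct {
	TraceID  string // 32 hex characters, shared by every span of the trace
	ParentID string // 16 hex characters, the span ID of the caller
	Sampled  bool   // lowest bit of the trace flags
}

// isLowerHex reports whether s consists of exactly n lowercase hex digits.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for _, c := range s {
		if !(c >= '0' && c <= '9' || c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// parseTraceParent parses "00-<trace-id>-<parent-id>-<flags>".
// Only version 00 is handled in this sketch.
func parseTraceParent(value string) (traceParent, error) {
	parts := strings.Split(value, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return traceParent{}, errors.New("missing or unsupported traceparent")
	}
	traceID, parentID, flags := parts[1], parts[2], parts[3]
	if !isLowerHex(traceID, 32) || !isLowerHex(parentID, 16) || !isLowerHex(flags, 2) {
		return traceParent{}, errors.New("malformed traceparent")
	}
	if traceID == strings.Repeat("0", 32) || parentID == strings.Repeat("0", 16) {
		return traceParent{}, errors.New("all-zero IDs are invalid")
	}
	f, _ := strconv.ParseUint(flags, 16, 8) // cannot fail after isLowerHex
	return traceParent{TraceID: traceID, ParentID: parentID, Sampled: f&1 == 1}, nil
}

func main() {
	// Simulate the headers of an incoming request from the caller.
	h := http.Header{}
	h.Set("traceparent", "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")

	tp, err := parseTraceParent(h.Get("traceparent"))
	fmt.Println(tp.TraceID, tp.ParentID, tp.Sampled, err)
}
```

The trace ID is what lets the Go spans join the same trace as the Java caller; the parent ID tells the backend under which span they should be nested.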
The code below declares a variable called span with type trace.Span.
var span trace.Span
We need to start the span and name it. In addition, we give it a label, the span kind, which indicates the role a Span plays in a Trace. Since quote is the entry point that handles the incoming request and in turn calls sub-functions, we label the role of this span as Server. The SpanKind is a way to help the observability tool understand the relationship between spans, and can be used to represent relationships such as "client-server" or "caller-callee".
Reference
The code below starts the span, names it quote and labels it as a Server type.
ctx, span = otel.Tracer(name).Start(ctx, "quote", trace.WithSpanKind(trace.SpanKindServer))
The code below calls the process method. Because we also want to nest all callees under the caller (in this case, the quote span), we must remember to pass the trace context ctx to the callee (i.e. process).
process(ctx, uint(rand.Intn(20)))
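Why passing ctx matters can be shown with a toy model that uses only the standard library. This is not how the OpenTelemetry SDK is implemented; toySpan, startSpan and runExample are invented for this sketch, and the process function here is a simplified stand-in for the one in main.go. The idea is the same, though: the callee finds its parent span in the context it receives.

```go
package main

import (
	"context"
	"fmt"
)

// toySpan is a stand-in for a real span; it only records its name and parent.
type toySpan struct {
	name   string
	parent *toySpan
}

// spanKey is the private context key under which the current span is stored.
type spanKey struct{}

// startSpan creates a span whose parent is whatever span ctx carries
// (none if ctx carries no span) and returns a context carrying the new span.
func startSpan(ctx context.Context, name string) (context.Context, *toySpan) {
	parent, _ := ctx.Value(spanKey{}).(*toySpan)
	s := &toySpan{name: name, parent: parent}
	return context.WithValue(ctx, spanKey{}, s), s
}

// path renders the chain from the root span down to s.
func (s *toySpan) path() string {
	if s.parent == nil {
		return s.name
	}
	return s.parent.path() + " > " + s.name
}

// process mimics a callee that starts its own span from the ctx it receives.
func process(ctx context.Context) *toySpan {
	_, s := startSpan(ctx, "process")
	return s
}

// runExample calls process once with the caller's ctx and once with a
// fresh context, and returns the resulting span paths.
func runExample() (nested, detached string) {
	ctx, _ := startSpan(context.Background(), "quote")
	return process(ctx).path(), process(context.Background()).path()
}

func main() {
	nested, detached := runExample()
	fmt.Println(nested)   // quote > process
	fmt.Println(detached) // process
}
```

With the caller's ctx the process span ends up under quote; with a fresh context it has no parent and would show up as a separate root.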
The final code below then ends the span and stops the timer. Reference Otel SDK doc - Creating spans
span.End()
It is now YOUR turn to augment additional methods with OpenTelemetry. Line 80 is the code for the function calc. It handles the incoming calc requests.
func calc(w http.ResponseWriter, req *http.Request) {
    process(req.Context(), uint(rand.Intn(20)))
    fmt.Fprintf(w, "done\n")
}
mvn spring-boot:run
The new Service Calls should now contain additional OpenTelemetry Spans for calc, in the same way as for quote.
When inspecting the ingested traces together with the full trace or full PurePath, you might come across this
Persona: IT Operations
Objectives
...
button of the Failure rate chart and click on Show in Data Explorer. A problem card should appear after a few minutes. If no problem card appears, check the threshold settings and ensure that the threshold is below the average failure rate in the chart.
...
on the top right-hand corner. These are retrieved from the pre-instrumented Go application.
Once the span attributes are added, wait a little while and inspect the latest transactions again.
What can you tell about the conditions surrounding the errors?
An SRE team is responsible for finding good service-level indicators (SLIs) for a given service in order to closely monitor the reliable delivery of that service. SLIs can differ from one service to the other, as not all services are equally critical in terms of time and error constraints.
Using a typical failure rate metric as an SLI helps the SRE team understand what service levels the organization needs to uphold in order to provide a resilient service for the business. The SRE team can then define, and further tweak, the operational goal as a service-level objective (SLO).
Follow the online guide to define an SLO.
For the SLO name, use Processing limit SLO.
For the metric expression, use
100-builtin:span_failure_rate:filter(eq("span.kind",internal))
For the target percentage, use 80.
For the warning percentage, use 90.
For the timeframe to evaluate, use -1h.
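The arithmetic behind the metric expression and the two percentages can be sketched as follows. This is only an illustration in plain Go: the function sloStatus and the labels "ok", "warning" and "failure" are made up here, and the classification rule (below target is a failure, between target and warning is a warning) is an assumption about how the thresholds are meant to be read, not Dynatrace's implementation.

```go
package main

import "fmt"

// sloStatus converts a failure rate (in percent) into an SLO status
// percentage (100 - failure rate) and classifies it against the
// target and warning thresholds.
func sloStatus(failureRate, target, warning float64) (float64, string) {
	status := 100 - failureRate
	switch {
	case status < target:
		return status, "failure"
	case status < warning:
		return status, "warning"
	default:
		return status, "ok"
	}
}

func main() {
	// Example failure rates, evaluated against target 80 and warning 90.
	for _, failureRate := range []float64{5, 15, 25} {
		status, state := sloStatus(failureRate, 80, 90)
		fmt.Printf("failure rate %.0f%% -> SLO status %.0f%% (%s)\n", failureRate, status, state)
	}
}
```

With a target of 80, the service may fail up to 20% of the internal spans in the evaluated timeframe before the objective is missed.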
We hope you enjoyed this lab and found it useful. We would love your feedback!