Welcome to new things

[Technical] [Electronic work] [Gadget] [Game] memo writing

Using GCP to monitor goroutine leaks in the Go language as effortlessly as possible

Goroutine Leak

Go is an appealing language because goroutines (lightweight, thread-like units of execution) are so easy to create.

However, because a goroutine runs separately from the main flow, it is easy to miss when one stalls midway, for example due to a deadlock. You think it has finished, but in fact it is still alive.

If the program has an end, such as a batch job, any remaining goroutines are terminated when the program exits. In a long-running program such as a server, however, goroutines that stall midway never finish, and they keep accumulating.

Solution

The fix is to check the number of running goroutines during development and to write programs that do not leak goroutines.

Even if I can confirm locally that no goroutines are leaking, I still worry a little about whether the program is really leak-free when it runs on a server for a long time.

So, even on the server, I had been checking the goroutine count by periodically writing it to a log, but to be honest, I found this tedious.

I want to make it easier.

I had been using GCP's Cloud Monitoring to send external notifications whenever a container in a Kubernetes app writes to standard error.

Cloud Monitoring mainly monitors the status of GCP services, but when I looked at the documentation, I found a feature that can be used almost as-is to collect application metrics, so I decided to use it to monitor the goroutine count.

Approach

Send the Go runtime values (pprof) to Cloud Monitoring and graph the goroutine count there.

To send values from a program to Cloud Monitoring, you can use an open source library called OpenCensus, but it requires a lot of setup, and even sending a single small value takes quite a bit of code.

However, the OpenCensus Go library has a feature that collects the pprof values and sends them for you. It takes only a few lines of code and requires little knowledge of OpenCensus, so I used it as-is this time.

Procedure

Authorization settings

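To send data to Cloud Monitoring, the app's service account needs permission to write metrics. I granted it the "Monitoring Admin" role; the narrower "Monitoring Metric Writer" role should also be sufficient for writing metrics.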

Sample program

package main

import (
    "context"
    "log"
    "os"
    "time"

    "contrib.go.opencensus.io/exporter/stackdriver"
    "go.opencensus.io/metric/metricexport"
    "go.opencensus.io/plugin/runmetrics"
)

func main() {
    //////////////////////////////////////////////////
    // Send pprof to Cloud Monitoring
    //////////////////////////////////////////////////
    os.Setenv("GOOGLE_APPLICATION_CREDENTIALS", "./service-account-key.json")

    // exporter(Cloud Monitoring)
    exporter, err := stackdriver.NewExporter(stackdriver.Options{
        // Prefix used to find these metrics in Cloud Monitoring
        MetricPrefix: "test-app-go-pprof",
        // Reporting interval (must be at least 60 seconds)
        ReportingInterval: 120 * time.Second,
    })
    if err != nil {
        log.Fatal(err)
    }
    defer exporter.Flush()
    exporter.StartMetricsExporter()
    defer exporter.StopMetricsExporter()

    // metrics(pprof)
    err = runmetrics.Enable(runmetrics.RunMetricOptions{
        EnableCPU:    true,
        EnableMemory: true,
    })
    if err != nil {
        log.Fatal(err)
    }

    // metrics -> exporter
    metricexport.NewReader().ReadAndExport(exporter)

    //////////////////////////////////////////////////
    // Goroutine Leak Test
    //////////////////////////////////////////////////
    ctx := context.Background()
    for {
        go func() {
            var a [1000 * 1000 * 10]byte // Memory consumption
            _ = a
            <-ctx.Done() // Blocks forever: ctx is never cancelled
        }()
        time.Sleep(60 * time.Second)
    }
}

Supplementary explanation

  • OpenCensus is split into a data part (metrics) and a sending part (exporter).
  • The exporter here is Cloud Monitoring (formerly Stackdriver), and runmetrics generates the metrics from pprof.
  • The exporter (Cloud Monitoring) is wired up to read the metrics (pprof) and send them out.
  • Because many kinds of data are sent to Cloud Monitoring, a "MetricPrefix" is set to make these metrics easy to identify.

Cloud Monitoring

When the program is run, the number of goroutines is recorded in Cloud Monitoring under the name

custom.googleapis.com/opencensus/test-app-go-pprof/process/cpu_goroutines

and can be viewed in [Cloud Monitoring]-[Metrics Explorer].

Applications

Because the other pprof values are sent as well, values such as the goroutine count and memory usage can be shown together on a dashboard without modifying the program.

You can also set up an alert, for example a Slack notification when the goroutine count exceeds a threshold.

Impressions, etc.

Now I can leave goroutine leak checking to Cloud Monitoring with peace of mind.

There are many competing services and methods for this kind of monitoring, but the psychological hurdle to adopting them is high because they are not compatible with one another and have to be embedded in the application.

If you ask me whether OpenCensus has become the dominant choice, I feel it is not quite there yet, so I don't feel like going too deep into it.

What I really wanted to check was channel buffer usage, but sending arbitrary values suddenly became a lot more work, so I decided not to go any deeper and to settle for the goroutine count (pprof).
