The Diffusion metrics API

Knowing how your system is running is crucial in most production environments. Diffusion has always provided a wide range of metrics, and in many cases it lets you define your own through custom metric collectors.

Getting these metrics into your own monitoring system is straightforward too, usually through Diffusion’s Prometheus endpoint, although JMX is also an option.

However, if you have written your own adapter (or control client) that needs metrics from the server, you might not want to create a separate connection (be it HTTP to the Prometheus endpoint, or JMX) just to gather that information. After all, you already have a WebSocket open to the server, so why not use that?

Diffusion 6.10 introduced a metrics API that lets you do just that. Your session needs the VIEW_SERVER permission (you probably don’t want your metrics exposed to the public!), and then you can write code like this:

CompletableFuture<Metrics.MetricsResult> future =
    session.feature(Metrics.class)
           .metricsRequest()
           .currentServer()
           .filter(Set.of("diffusion_topics_subscriptions",
                          "diffusion_topics_count"))
           .fetch();

This example uses the Java API, but the new metrics API is available in the other Diffusion SDKs too.
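Because fetch() hands back a standard CompletableFuture, the usual java.util.concurrent tools apply to it. As a minimal sketch, here is one way to bound the wait and fall back to a default if the request fails or takes too long. The awaitOrDefault helper and the stand-in futures in main are hypothetical and not part of the Diffusion API; in real code you would pass in the future returned by fetch().

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class FutureHandling {

    // Hypothetical helper (not a Diffusion API): waits at most 'timeout' for the
    // future and returns 'fallback' if it fails or does not complete in time.
    // Note that orTimeout (Java 9+) completes the supplied future exceptionally
    // when the timeout fires.
    static <T> T awaitOrDefault(CompletableFuture<T> future, Duration timeout, T fallback) {
        return future
            .orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS)
            .exceptionally(error -> fallback)
            .join();
    }

    public static void main(String[] args) {
        // Stand-ins for the future returned by fetch().
        CompletableFuture<String> completed = CompletableFuture.completedFuture("metrics");
        CompletableFuture<String> neverCompletes = new CompletableFuture<>();

        System.out.println(awaitOrDefault(completed, Duration.ofSeconds(1), "none"));
        System.out.println(awaitOrDefault(neverCompletes, Duration.ofMillis(100), "none"));
    }
}
```

Swallowing the error like this keeps a monitoring loop alive, but you may prefer to log the exception inside exceptionally so that permission or connection problems are not hidden.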

The metric names match those you would see in Prometheus. You can request them either with a Set of names, as above, or with a regular expression:

CompletableFuture<Metrics.MetricsResult> future =
    session.feature(Metrics.class)
           .metricsRequest()
           .currentServer()
           .filter(Pattern.compile("diffusion_.*"))
           .fetch();
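Before sending a pattern to the server, it can be handy to check locally which metric names it would select. The sketch below does that with plain java.util.regex. It assumes the pattern must match the whole metric name, which this article does not confirm, so treat it as a sanity check rather than a statement of the server's behaviour. The select helper and the jvm_memory_used name are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class MetricNameFilter {

    // Hypothetical helper: returns the names the pattern matches in full.
    // Assumption: whole-name matching; the server's exact semantics may differ.
    static List<String> select(Pattern pattern, List<String> names) {
        return names.stream()
            .filter(name -> pattern.matcher(name).matches())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of(
            "diffusion_topics_subscriptions",
            "diffusion_topics_count",
            "jvm_memory_used"); // hypothetical name, for contrast only

        System.out.println(select(Pattern.compile("diffusion_.*"), names));
    }
}
```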

The API can also fetch metrics from a specific server in a cluster, provided that you know its server name:

CompletableFuture<Metrics.MetricsResult> future =
    session.feature(Metrics.class)
           .metricsRequest()
           .server("server-1")
           .filter(Pattern.compile("diffusion_.*"))
           .fetch();

What happens if you don’t specify a server at all?

CompletableFuture<Metrics.MetricsResult> future =
    session.feature(Metrics.class)
           .metricsRequest()
           .filter(Pattern.compile("diffusion_.*"))
           .fetch();

Then you get metrics for all servers in the cluster, and the returned MetricsResult will let you identify which server provided which values.
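Once you have values from several servers, a common next step is to organise them per server before handing them to your own monitoring code. The sketch below shows that grouping on a simplified, hypothetical model: the Sample class is not a Diffusion type and the values are made up for illustration. How you read server names and samples out of the real MetricsResult is left to the SDK documentation.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PerServerMetrics {

    // Hypothetical flattened sample; NOT a Diffusion type.
    static final class Sample {
        final String server;
        final String metric;
        final double value;

        Sample(String server, String metric, double value) {
            this.server = server;
            this.metric = metric;
            this.value = value;
        }
    }

    // Groups samples as server name -> (metric name -> value).
    static Map<String, Map<String, Double>> byServer(List<Sample> samples) {
        Map<String, Map<String, Double>> result = new TreeMap<>();
        for (Sample sample : samples) {
            result.computeIfAbsent(sample.server, key -> new TreeMap<>())
                  .put(sample.metric, sample.value);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up values; "server-2" is a hypothetical second cluster member.
        List<Sample> samples = List.of(
            new Sample("server-1", "diffusion_topics_count", 120.0),
            new Sample("server-1", "diffusion_topics_subscriptions", 45.0),
            new Sample("server-2", "diffusion_topics_count", 120.0));

        byServer(samples).forEach((server, metrics) ->
            System.out.println(server + " -> " + metrics));
    }
}
```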

