The Diffusion metrics API
September 7, 2023 | Adam Turnbull
Getting information about how your system is running is crucial in most production environments. Diffusion has always been pretty good at providing lots of metrics, and allowing you to define your own in many cases through custom metric collectors.
Accessing these metrics and getting the data into your own monitoring system is pretty straightforward too. Mostly this is done through Diffusion’s Prometheus endpoint, although JMX is an option too.
However, if you have written your own adapter (or control client) that needs metrics from the server, you might not want to create a separate connection (be it HTTP to Prometheus, or JMX) to gather information. After all, you already have a WebSocket open to the server, so why not use that?
With Diffusion 6.10 we created a metrics API that allows you to do just that. Your session will need the VIEW_SERVER permission (you probably don’t want your metrics exposed to the public!), and then you can write code like this:
CompletableFuture<Metrics.MetricsResult> future =
    session.feature(Metrics.class)
        .metricsRequest()
        .currentServer()
        .filter(Set.of("diffusion_topics_subscriptions", "diffusion_topics_count"))
        .fetch();
This is using the Java API, but the other Diffusion SDKs also have the new metrics API available in them.
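Because the request returns a CompletableFuture, the result arrives asynchronously and the caller decides how long to wait for it. The following is a minimal sketch of one way to do that, not the Diffusion API itself: a plain Map<String, Double> stands in for Metrics.MetricsResult, and fakeFetch() and fetchOrEmpty() are hypothetical helpers invented for illustration.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class MetricsFutureSketch {

    // Hypothetical stand-in for fetch(): completes immediately with canned data.
    // A Map of metric name to value stands in for Metrics.MetricsResult.
    static CompletableFuture<Map<String, Double>> fakeFetch() {
        return CompletableFuture.completedFuture(
            Map.of("diffusion_topics_count", 42.0));
    }

    // Waits up to five seconds; falls back to an empty result if the
    // future fails or times out, so the caller never sees an exception.
    static Map<String, Double> fetchOrEmpty(CompletableFuture<Map<String, Double>> future) {
        return future
            .orTimeout(5, TimeUnit.SECONDS)
            .exceptionally(error -> Map.of())
            .join();
    }

    public static void main(String[] args) {
        Map<String, Double> result = fetchOrEmpty(fakeFetch());
        result.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Falling back to an empty result keeps a monitoring loop alive when a single request fails; an adapter that must distinguish "no metrics" from "request failed" would propagate the exception instead.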
The metric names match those that you would see in Prometheus. You can request them either with a Set of names, as above, or with a regular expression instead:
CompletableFuture<Metrics.MetricsResult> future =
    session.feature(Metrics.class)
        .metricsRequest()
        .currentServer()
        .filter(Pattern.compile("diffusion_.*"))
        .fetch();
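To make the difference between the two filter styles concrete, here is a small sketch in plain Java that applies a name set and a regular expression to a list of metric names. It runs locally and does not call the server. The class and its methods are hypothetical, "example_other_metric" is a made-up name, and the whole-name matching used in byPattern is an assumption rather than documented server behaviour.

```java
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class MetricFilterSketch {

    // Keeps only the names that appear in the requested set.
    static List<String> byNames(List<String> available, Set<String> wanted) {
        return available.stream()
            .filter(wanted::contains)
            .sorted()
            .collect(Collectors.toList());
    }

    // Keeps the names the pattern matches in full.
    // Assumption: whole-name matching; the server may match differently.
    static List<String> byPattern(List<String> available, Pattern pattern) {
        return available.stream()
            .filter(name -> pattern.matcher(name).matches())
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> available = List.of(
            "diffusion_topics_count",
            "diffusion_topics_subscriptions",
            "example_other_metric"); // made-up name for contrast

        System.out.println(byNames(available, Set.of("diffusion_topics_count")));
        System.out.println(byPattern(available, Pattern.compile("diffusion_.*")));
    }
}
```

A name set is the better fit when you know exactly which values a dashboard needs; a pattern is convenient when you want a whole family of metrics without listing each one.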
The API can also fetch metrics from other servers in a cluster, provided that you know the ServerName:
CompletableFuture<Metrics.MetricsResult> future =
    session.feature(Metrics.class)
        .metricsRequest()
        .server("server-1")
        .filter(Pattern.compile("diffusion_.*"))
        .fetch();
What happens if you don’t specify a server at all?
CompletableFuture<Metrics.MetricsResult> future =
    session.feature(Metrics.class)
        .metricsRequest()
        .filter(Pattern.compile("diffusion_.*"))
        .fetch();
Then you get metrics for all servers in the cluster, and the returned MetricsResult will let you identify which server provided which values.
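Once values are tagged with the server that produced them, a common next step is to group them per server or to total a metric across the cluster. The sketch below illustrates that idea with a hypothetical flat Sample type; it is not the shape of the real MetricsResult, and the server names and values are invented.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ClusterMetricsSketch {

    // Hypothetical flat sample: which server reported which metric value.
    static final class Sample {
        final String server;
        final String metric;
        final double value;

        Sample(String server, String metric, double value) {
            this.server = server;
            this.metric = metric;
            this.value = value;
        }
    }

    // Groups samples as server name -> (metric name -> value).
    static Map<String, Map<String, Double>> byServer(List<Sample> samples) {
        Map<String, Map<String, Double>> grouped = new TreeMap<>();
        for (Sample s : samples) {
            grouped.computeIfAbsent(s.server, k -> new TreeMap<>()).put(s.metric, s.value);
        }
        return grouped;
    }

    // Sums one metric over every server that reported it.
    static double clusterTotal(List<Sample> samples, String metric) {
        return samples.stream()
            .filter(s -> s.metric.equals(metric))
            .mapToDouble(s -> s.value)
            .sum();
    }

    public static void main(String[] args) {
        List<Sample> samples = List.of(
            new Sample("server-1", "diffusion_topics_subscriptions", 10),
            new Sample("server-2", "diffusion_topics_subscriptions", 5));

        System.out.println(byServer(samples));
        System.out.println(clusterTotal(samples, "diffusion_topics_subscriptions"));
    }
}
```

Whether a cluster-wide sum is meaningful depends on the metric: values that are replicated across servers would be counted more than once, so the per-server view is the safer default.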