Extra Topic View Features in the Diffusion 6.6 Release
March 24, 2021 | Paddy Walsh
Diffusion™ 6.3 introduced topic views, a mechanism to produce virtual topics (known as reference topics) that take their values from other topics in the topic tree. In 6.4, we added topic view expansion, enabling a single source topic to be expanded to produce many derived reference topics. In Diffusion 6.5, the concepts of fan-out and topic views were combined to allow remote topic views, where the source topic can be on another server. In the 6.6 preview phase, we added topic view inserts, which allow data from topics other than the selected source topics to be merged into the resulting reference topic data.
The latest 6.6 release extends the capabilities of topic views even further, with new ‘preserve topics’ and ‘separator’ clauses.
Preserve Topics
When the path of a reference topic is derived from the value of the source topic, if the value changes, the path of the reference topic can also change. The simplest case of this is when the ‘scalar’ path directive is used, for example:
map ?Cars/ to <scalar(/owner)>/<path(1)>
In this example, for every topic found with a path prefix of Cars/, a new reference topic will be created with the same value as the source topic, but at a path made up of the value at the key owner within the source topic, followed by the path of the source topic starting from element 1 (where the first element is 0).
To make this clearer, suppose that the Cars subtopics all have a path that is the registration number of the car. So you might have the following topics with values as shown:
Cars/PCA556Y = { "type":"Ford", "model":"Sierra", "owner":"Bill" }
Cars/D38TMA = { "type":"Ford", "model":"Escort", "owner":"Ben" }
Cars/HY58XPA = { "type":"Fiat", "model":"Panda", "owner":"Bill" }
An evaluation of the topic view would result in the following topics:
Bill/PCA556Y = { "type":"Ford", "model":"Sierra", "owner":"Bill" }
Ben/D38TMA = { "type":"Ford", "model":"Escort", "owner":"Ben" }
Bill/HY58XPA = { "type":"Fiat", "model":"Panda", "owner":"Bill" }
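The path mapping above can be modelled in a few lines of Python. This is only an illustrative sketch of what `<scalar(/owner)>/<path(1)>` computes, not the Diffusion API; the function name `reference_path` and its parameters are hypothetical.

```python
# Illustrative model only: this is not the Diffusion API.
def reference_path(source_path, value, key="owner", start=1):
    """Mimic <scalar(/owner)>/<path(1)>: the value at `key`, followed by
    the source path elements from index `start` (the first element is 0)."""
    elements = source_path.split("/")
    return "/".join([str(value[key])] + elements[start:])

cars = {
    "Cars/PCA556Y": {"type": "Ford", "model": "Sierra", "owner": "Bill"},
    "Cars/D38TMA": {"type": "Ford", "model": "Escort", "owner": "Ben"},
    "Cars/HY58XPA": {"type": "Fiat", "model": "Panda", "owner": "Bill"},
}

# Each reference topic keeps the source value but lives at the derived path.
reference_topics = {reference_path(p, v): v for p, v in cars.items()}

for path in reference_topics:
    print(path)
# prints:
# Bill/PCA556Y
# Ben/D38TMA
# Bill/HY58XPA
```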
The topic view therefore groups the cars by owner. Now if an update occurred to Cars/HY58XPA so that the owner field changed to ‘Ben’, then Bill/HY58XPA would be removed and Ben/HY58XPA created:
Bill/PCA556Y = { "type":"Ford", "model":"Sierra", "owner":"Bill" }
Ben/D38TMA = { "type":"Ford", "model":"Escort", "owner":"Ben" }
Ben/HY58XPA = { "type":"Fiat", "model":"Panda", "owner":"Ben" }
The above behavior is most likely what you would want in a real application dealing with car ownership. But let’s say you want to keep historic topics when changes occur, so you do not want Bill/HY58XPA to be removed. This is where the ‘preserve topics’ clause comes into play. The topic view specification can be changed to be as follows:
map ?Cars/ to <scalar(/owner)>/<path(1)> preserve topics
Now, when the update described above occurs, the removal does not take place, so the resulting topics are:
Bill/PCA556Y = { "type":"Ford", "model":"Sierra", "owner":"Bill" }
Ben/D38TMA = { "type":"Ford", "model":"Escort", "owner":"Ben" }
Bill/HY58XPA = { "type":"Fiat", "model":"Panda", "owner":"Bill" }
Ben/HY58XPA = { "type":"Fiat", "model":"Panda", "owner":"Ben" }
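The difference between the two specifications can be simulated with a small model. The sketch below is an assumption-laden illustration, not the server's implementation: the class `TopicViewModel` and its attributes are hypothetical, and it hard-codes the `<scalar(/owner)>/<path(1)>` mapping.

```python
# Illustrative model only: this is not the Diffusion API.
class TopicViewModel:
    def __init__(self, preserve_topics=False):
        self.preserve_topics = preserve_topics
        self.reference_topics = {}  # reference path -> value
        self._current = {}          # source path -> reference path it maps to now

    def update(self, source_path, value):
        # Hard-coded equivalent of <scalar(/owner)>/<path(1)>.
        new_path = value["owner"] + "/" + source_path.split("/", 1)[1]
        old_path = self._current.get(source_path)
        moved = old_path is not None and old_path != new_path
        if moved and not self.preserve_topics:
            # Without 'preserve topics' the stale reference topic is removed.
            del self.reference_topics[old_path]
        self.reference_topics[new_path] = dict(value)
        self._current[source_path] = new_path

def run(preserve_topics):
    view = TopicViewModel(preserve_topics)
    view.update("Cars/HY58XPA", {"type": "Fiat", "model": "Panda", "owner": "Bill"})
    view.update("Cars/HY58XPA", {"type": "Fiat", "model": "Panda", "owner": "Ben"})
    return sorted(view.reference_topics)

print(run(False))  # ['Ben/HY58XPA']
print(run(True))   # ['Ben/HY58XPA', 'Bill/HY58XPA']
```

Note that the preserved Bill/HY58XPA entry keeps the value it had before the owner changed, matching the listing above.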
Let’s consider a more realistic case for preserving topics. Suppose you have a data feed coming from some external source that provides you with foreign exchange rates, updating a topic (called Rate) with the following structure:
Rate = { "currency":"GBP/USD", "rate":"1.40"}
And we have a topic view that has the specification:
map Rate to <scalar(/currency)> as <value(/rate)>
In this case, you would get a reference topic with path and value as follows:
GBP/USD = 1.40
However, suppose the Rate topic is then updated to the following:
Rate = { "currency":"GBP/EUR", "rate":"1.16"}
The GBP/USD topic will be removed, and a new GBP/EUR topic created with a value of “1.16”. In this application, the preferred behavior might be to preserve the original reference topic, so that a record of all rates that have been passed from the feed remains. This is where ‘preserve topics’ becomes useful, so we change the topic view specification to:
map Rate to <scalar(/currency)> as <value(/rate)> preserve topics
And now after the above update, both topics would remain, so you have:
GBP/USD = 1.40
GBP/EUR = 1.16
Updates for further new currencies would result in new reference topics. Updates for a currency that already has a reference topic would simply update the reference topic value.
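As a rough sketch of how a sequence of feed updates plays out, the function below replays updates to the single Rate source topic with and without ‘preserve topics’. It is a simplified model rather than Diffusion code: `apply_feed` is a hypothetical name, and the third update (a later GBP/USD rate of 1.41) is invented purely for illustration.

```python
# Illustrative model only: this is not the Diffusion API.
def apply_feed(updates, preserve_topics):
    """Replay updates to the single Rate topic and return the reference topics."""
    reference_topics = {}
    for update in updates:
        if not preserve_topics:
            # One source topic maps to one reference topic at a time,
            # so any previously derived topic is removed.
            reference_topics.clear()
        reference_topics[update["currency"]] = update["rate"]
    return reference_topics

feed = [
    {"currency": "GBP/USD", "rate": "1.40"},
    {"currency": "GBP/EUR", "rate": "1.16"},
    {"currency": "GBP/USD", "rate": "1.41"},  # hypothetical later update
]

print(apply_feed(feed, preserve_topics=True))
# {'GBP/USD': '1.41', 'GBP/EUR': '1.16'}
print(apply_feed(feed, preserve_topics=False))
# {'GBP/USD': '1.41'}
```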
Reference topics that are created when the ‘preserve topics’ clause is used remain until the source topic is removed or the topic view is removed.
The ‘preserve topics’ clause applies only to topic views that contain path directives that change the path of the target reference topic(s), that is, those that contain scalar or expand directives. For other topic views, the clause is ignored.
Separator
In the above examples relating to ‘preserve topics’, we see a value from within a source topic being used to derive the path of a reference topic. For example, the currency field contained a value GBP/USD and was used to generate a topic path for the reference topic. In this case, we actually have two nodes created within the topic tree: GBP is the parent node and USD is a child node. In this scenario, GBP can have many child topics and there can even be a separate topic at the GBP path.
This behavior is normally fine, but there may be cases where you do not want a / character within a value to be treated as a path separator and lead to a separation of topic nodes. This is where the new ‘separator’ clause comes in. It allows you to specify a string which will replace all path separator characters encountered within a field value when creating a path mapping.
So, the example used above can be extended as follows:
map Rate to <scalar(/currency)> as <value(/rate)> separator '_'
Whenever a path separator is encountered in a path mapping directive, it will be replaced by the specified string, in this case the underscore character. If the field value is “GBP/USD”, rather than generating a topic path of “GBP/USD”, it will generate a topic path of GBP_USD. The important distinction here is that GBP_USD is a single topic tree node.
Note that the separator replacement string does not need to be a single character. It can be a string of any length, even zero length.
So:
map Rate to <scalar(/currency)> as <value(/rate)> separator ""
would result in GBP/USD being mapped to a path of GBPUSD.
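The effect of the ‘separator’ clause on a field value amounts to a plain string replacement. The following is a minimal sketch of that idea, assuming a hypothetical helper `scalar_path`; it is not how the server is implemented.

```python
# Illustrative model only: this is not the Diffusion API.
def scalar_path(field_value, separator=None):
    """Return the path fragment derived from a field value.

    With no separator clause the value is used as-is, so each '/' starts
    a new topic tree node. With a separator, every '/' is replaced."""
    if separator is None:
        return field_value
    return field_value.replace("/", separator)

print(scalar_path("GBP/USD"))       # GBP/USD (two nodes: GBP, then USD)
print(scalar_path("GBP/USD", "_"))  # GBP_USD (a single node)
print(scalar_path("GBP/USD", ""))   # GBPUSD  (a single node)
```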
Caveats
Reference topics that are created when the ‘preserve topics’ clause is used remain until the source topic is removed or the topic view is removed.
Reference topics are currently not persisted to file (or the cluster), so if the server shuts down, the effect will be the same as removing the source topic. Upon restart, the reference topics would be built from scratch, update by update. This may also mean that new servers joining a cluster would not necessarily have the same reference topics as those retained on other servers in the cluster. In a future release, this issue will be resolved.
Summary
We’re constantly working to improve topic views. The two new options described above are the result of direct analysis of the needs of Push Technology customers. The ‘preserve topics’ option allows the changes from a source to be retained and thus makes the publication of values from an external feed easier to manage. The ‘separator’ option allows further control over the generated hierarchy of reference topics.
Topic views will continue to evolve to provide Diffusion users with more and more powerful data wrangling capabilities.