Merge Topic Data with Topic View Inserts in Diffusion 6.6

Diffusion™ 6.3 introduced topic views, a mechanism to produce virtual topics (known as reference topics) that refer to other topics in the topic tree for their value. In 6.4, we added topic view expansion, enabling a single source topic to be expanded to produce many derived reference topics. In Diffusion 6.5, the concepts of fan-out and topic views were combined to allow remote topic views to specify source topics at another server.

Now, in 6.6 we have added topic view inserts, which allow data from topics other than the selected source topics to be merged into the resulting reference topic data.

Inserts allow data from JSON or scalar topics to be inserted into JSON topics.

How are inserts specified?

Inserts are enabled using the new insert clause in a topic view specification.

The insert clause follows the path mapping in a topic view specification and takes the following form:

insert path_specification key source_key at insertion_key default constant

Only the insert and at keywords are mandatory; all other parts are optional.

So, for example:

map ?Some_Source_Topics/ to Mapped_Topics/<path(1)> insert Some_Other_Topic at /Some_JSON_Pointer

This maps all topics beneath the path Some_Source_Topics to similarly named topics under the path Mapped_Topics, inserting the complete value of Some_Other_Topic into the current data at the key Some_JSON_Pointer.

The meaning of each part of the insert clause is described below:

insert path_specification

The insert keyword introduces the clause and specifies the path of the topic from which data is to be obtained and merged with the current data value. The meaning of ‘current data value’ can depend upon the other clauses in the specification and is defined in more detail below. But for now, just think of it as the value from the selected source topic.

The path_specification defines the path of the topic to insert data from. It is similar to the target path mapping in that it can contain:

  1. Constants
  2. <path()> directives
  3. <scalar()> directives

The path directives operate on the path of the selected ‘source’ topic and the scalar directives operate on the current input data as defined above.

For example:

Topic/<path(1, 2)>/<scalar(/foo)>

specifies insertion from a topic whose path is Topic/ followed by elements 1 and 2 of the source topic path (two elements, starting at index 1), followed by / and the scalar value at the key /foo in the current input data.
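To make the resolution concrete, here is a minimal Python sketch of how such a path specification could be expanded. It is an illustrative model only, not Diffusion code: the function names are hypothetical, and it assumes that path elements are numbered from 0 and that the optional second argument of <path()> is a count of elements. What happens when the scalar is missing is not modelled.

```python
import re


def json_pointer_get(doc, pointer):
    """Return the value at a JSON pointer such as /a/0, or None if absent."""
    node = doc
    for token in pointer.split("/")[1:]:
        if isinstance(node, dict) and token in node:
            node = node[token]
        elif isinstance(node, list) and token.isdigit() and int(token) < len(node):
            node = node[int(token)]
        else:
            return None
    return node


def resolve_insert_path(spec, source_path, current_value):
    """Expand <path(...)> and <scalar(...)> directives in an insert path."""
    parts = source_path.split("/")

    def path_directive(match):
        start = int(match.group(1))
        # Assumption: the second argument, when present, is a number of elements.
        count = int(match.group(2)) if match.group(2) else len(parts) - start
        return "/".join(parts[start:start + count])

    def scalar_directive(match):
        # Scalar directives read from the current data value, not the path.
        return str(json_pointer_get(current_value, match.group(1)))

    spec = re.sub(r"<path\((\d+)(?:,\s*(\d+))?\)>", path_directive, spec)
    return re.sub(r"<scalar\(([^)]*)\)>", scalar_directive, spec)


resolved = resolve_insert_path(
    "Topic/<path(1, 2)>/<scalar(/foo)>", "a/b/c/d", {"foo": "bar"}
)
```

With a source topic at a/b/c/d whose value has "bar" at /foo, this model resolves the insert path to Topic/b/c/bar.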

key source_key

This optionally specifies the key (a JSON pointer) of an item within the topic indicated by path_specification. If not specified then it is assumed that the whole of the data value of the selected topic will be inserted.

at insertion_key

This specifies a JSON pointer indicating the location of the insertion in the current data value.

Typically this would be an object key, naming the key under which the value is placed in the data. If the data already had an item with the same key it would be overwritten; otherwise a new item would be added to the parent indicated by the specified key. The parent would have to exist, otherwise the insertion would not occur and a warning would be logged.

This can also indicate an entry in an array. If an index key is provided, the existing entry at the specified index would be replaced. An index of one greater than the current number of entries could be used to append to an array, but it is much easier to use the special ‘-’ character instead. For example, to append to the end of an array at MyArray you can use an insertion key of /MyArray/-.

So if the key was specified as at /Address/Street, it would indicate that the value from the selected topic to insert from would be inserted within the current data value within an object called Address at a key called Street. If the object currently had a key called Street, it would be overwritten; otherwise, it would be inserted into the object.

If the resolved key indicates a scalar item then no insertion will take place.
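The insertion rules above can be sketched as a small Python function. This is a simplified model under stated assumptions, not the server's implementation: insert_at is a hypothetical name, appending is only modelled through the ‘-’ token, and the scalar rule is read here as "a scalar parent cannot receive an insert", which is an interpretation.

```python
import copy


def insert_at(current, pointer, value):
    """Return a copy of current with value inserted at pointer.

    Returns current unchanged when the insertion cannot take place.
    """
    tokens = pointer.split("/")[1:]
    if not tokens:
        return current
    result = copy.deepcopy(current)
    parent = result
    # Walk down to the parent of the insertion point.
    for token in tokens[:-1]:
        if isinstance(parent, dict) and token in parent:
            parent = parent[token]
        elif isinstance(parent, list) and token.isdigit() and int(token) < len(parent):
            parent = parent[int(token)]
        else:
            return current  # parent missing: no insertion (a warning would be logged)
    last = tokens[-1]
    if isinstance(parent, dict):
        parent[last] = value  # overwrites an existing key or adds a new one
    elif isinstance(parent, list):
        if last == "-":
            parent.append(value)  # special append token
        elif last.isdigit() and int(last) < len(parent):
            parent[int(last)] = value  # replace the existing entry
        else:
            return current
    else:
        return current  # assumption: a scalar parent cannot take an insert
    return result


with_street = insert_at({"Address": {"City": "London"}}, "/Address/Street", "High St")
appended = insert_at({"MyArray": [1, 2]}, "/MyArray/-", 3)
```

In this model, with_street gains a Street key inside Address, appended ends with 3 in MyArray, and an insert at /Address/Street into an empty object is skipped because the parent Address does not exist.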

default constant

Normally if the source topic to insert from cannot be found (or was not a JSON or scalar topic) or the specified key within it does not exist then no insertion will take place in the current data value. It will simply be as if the insert had not been specified. However, it is possible to override this behavior using the default keyword.

If default is specified and the topic to insert from or the key within it are not found, the constant value will be inserted as a scalar value at the insertion point.
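The interaction between key and default can be modelled in a few lines of Python. This is a sketch under assumptions rather than Diffusion's own logic: the topic tree is represented as a plain dictionary from path to value, the names value_to_insert, lookup and NO_INSERT are hypothetical, and the check on topic type (JSON or scalar) is not modelled.

```python
NO_INSERT = object()  # sentinel meaning "behave as if insert was not specified"


def lookup(doc, pointer):
    """Return the value at a JSON pointer, or NO_INSERT if it does not exist."""
    node = doc
    for token in pointer.split("/")[1:]:
        if isinstance(node, dict) and token in node:
            node = node[token]
        elif isinstance(node, list) and token.isdigit() and int(token) < len(node):
            node = node[int(token)]
        else:
            return NO_INSERT
    return node


def value_to_insert(topics, insert_path, key="", default=NO_INSERT):
    """Pick the value an insert clause would contribute to the current data."""
    if insert_path not in topics:
        return default  # topic not found: fall back to the default, if any
    found = lookup(topics[insert_path], key)
    return default if found is NO_INSERT else found


topics = {"AnotherTopic": {"foo": 1}}
whole = value_to_insert(topics, "AnotherTopic")
item = value_to_insert(topics, "AnotherTopic", key="/foo")
fallback = value_to_insert(topics, "Missing", default="unknown")
```

Here whole is the full value of AnotherTopic, item is 1, and fallback is the constant "unknown" because the topic to insert from does not exist.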

What is the ‘current data value’?

The insert clause specifies the path of the topic from which data is to be obtained and merged with the current data value. The current data value is typically the value from the selected source topic; however, there are some situations where this is not the case:

  • If the path mapping includes one or more expand directives, the current data value will be the expanded data value.
    So, for example, if a path mapping is used to expand an array of 5 elements then the insert clause would be executed 5 times, once for each element.
  • If the insert is preceded by an as <value(key)> directive, the current data will be the data indicated by the key.
  • If the insert was preceded by another insert clause, the current data will be the output from that clause.
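A toy Python model may help to picture how the current data value flows through a view. It is only a sketch: evaluate and insert_key are hypothetical names, inserts are limited to top-level object keys, the inserted values are passed in directly rather than read from topics, and the as <value(key)> case is not modelled.

```python
def insert_key(current, key, value):
    """Toy insert: add value under a top-level key of a JSON object."""
    if not isinstance(current, dict):
        return current  # nothing is inserted into non-object data
    return {**current, key: value}


def evaluate(source_value, inserts, expand=False):
    """Return the reference topic values produced from one source value."""
    # With an expand directive, each array element becomes a current data value.
    currents = source_value if expand else [source_value]
    results = []
    for current in currents:
        for key, value in inserts:
            # Each insert works on the output of the insert before it.
            current = insert_key(current, key, value)
        results.append(current)
    return results


chained = evaluate({"a": 1}, [("other", 2), ("yetAnother", 3)])
expanded = evaluate([{"n": 1}, {"n": 2}], [("other", "x")], expand=True)
```

In this model, chained holds one value carrying both inserted keys, while expanded holds two values, one per array element, each with its own copy of the insert.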

Examples of Inserts

map Topic1 to Topic2 insert AnotherTopic at /other

This is the very simplest example, where Topic1 is mapped to Topic2 and the data within AnotherTopic is inserted into it at the key named other. If AnotherTopic does not exist (or is not JSON or scalar), Topic2 will be created with the same value as Topic1 but with nothing inserted. It is assumed that the value of Topic1 is an object – if it is an array, then no insertion will occur.

map Topic1 to Topic2 insert AnotherTopic at /other default "unknown"

As in the previous example, but in this case if AnotherTopic does not exist then Topic2 will be created, with key other inserted with a scalar value of unknown.

map ?Topics/ to Mapped/<path(1)> insert AnotherTopic at /other

Like the previous example, but in this case all of the topics under the path Topics will be selected and mapped to topics with the same name under the path Mapped. Every selected topic will have the value of AnotherTopic inserted into it (assuming they are JSON objects) – unless AnotherTopic does not exist, in which case no insertions would take place.

map ?Topics/ to Mapped/<path(1)> insert Others/<path(1)> at /other

This example introduces the more powerful path mapping capabilities of the insert clause. In this case, each selected topic has an insertion from a topic with the same name under the path Others. For example, Topics/A/B would generate a reference topic at path Mapped/A/B which has the value of Others/A/B inserted at the key other.

map ?Topics/ to Mapped/<path(1)> insert Others/<scalar(/foo)> at /other

Similar to the previous example, but in this case the path of the insertion topic will be derived from a value within the selected source topic. So if topic Topics/A/B has a value of “bar” at key “foo” then the topic selected to insert from would be Others/bar.

map ?Topics/ to Mapped/<path(1)> insert Others/<path(1)> key /foo at /other

All previous examples have shown the insertion of the whole value of another topic. Here the key keyword is used to select a specific item foo within the insertion topic value. If the insertion topic does not have a value at key foo, then a reference topic will be created but no insertion will occur, as we have specified no default.

When expand directives are used, the insert will occur for every output from the expansion, so for example:

map Topic1 to Expanded/<expand()> insert AnotherTopic at /other

If we assume that the content of Topic1 is an array of objects, then each array element will be expanded to produce a new topic at path Expanded/0, Expanded/1 and so on, and each resulting reference topic will have the value from AnotherTopic inserted at the key /other.

Insert clauses can be chained as shown in the following example:

map Topic1 to Topic2 insert AnotherTopic at /other insert YetAnotherTopic at /yetAnother

In the above example, values from two different topics are inserted into the data to produce the reference topic.

And finally, the insert clause can be used along with as <value()> clauses, for example:

map Topic1 to Topic2 insert AnotherTopic at /foo/bar as <value(/foo)>

In this example, the data from AnotherTopic is inserted at the key /foo/bar, then the full value of /foo is projected.

Multi-line specifications

As the Domain Specific Language used to write topic views has become more complex, newlines and comments are now supported to improve readability. For example:

map ?A//
from Server1
to <path(1)>
as <value(/foo)>
# Join 2 topics
insert Topic2 at /T2
insert Topic3 at /T3
throttle to 1 update every minute

Caveats

Insertion only occurs at the point in time that a topic view is evaluated for a source topic. This happens when the topic is created and every time it is updated. Interim changes to the insertion topic(s) will NOT be reflected in the reference topic until the source topic is updated.
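The timing behaviour described above can be illustrated with a small simulation. This is a toy model in Python, not how the server is implemented: ReferenceTopic and update_topic are hypothetical names, and the topic tree is a plain dictionary.

```python
class ReferenceTopic:
    """Toy model: the reference value is recomputed only when the source changes."""

    def __init__(self, topics, source, insert_from, at_key):
        self.topics = topics
        self.source = source
        self.insert_from = insert_from
        self.at_key = at_key
        self.value = None
        self.evaluate()  # evaluated once when the view is created

    def evaluate(self):
        merged = dict(self.topics[self.source])
        if self.insert_from in self.topics:
            # The insertion topic is read at evaluation time only.
            merged[self.at_key] = self.topics[self.insert_from]
        self.value = merged


def update_topic(topics, views, path, value):
    """Update a topic and re-evaluate only the views that use it as a source."""
    topics[path] = value
    for view in views:
        if view.source == path:
            view.evaluate()


topics = {"Topic1": {"a": 1}, "AnotherTopic": "v1"}
view = ReferenceTopic(topics, "Topic1", "AnotherTopic", "other")

update_topic(topics, [view], "AnotherTopic", "v2")
after_insert_update = dict(view.value)  # still carries "v1"

update_topic(topics, [view], "Topic1", {"a": 2})
after_source_update = dict(view.value)  # now carries "v2"
```

In this model the change to AnotherTopic only shows up in the reference value once Topic1 itself is updated.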

The normal caveats relating to topic views also apply to views that use inserts. For example, you should avoid view specifications that derive topics from a highly volatile set of source fields, as there can be high CPU and memory costs relating to rapid addition and removal of topics.

Summary

Topic view inserts introduce powerful new functionality into the ever-developing language of topic views.

You can try topic view inserts right now in Diffusion 6.6 Preview 1, which is not supported for production use. Let us know what you think, especially if you come across any issues or have feature requests about how we can make topic views even more useful.

