Merge Topic Data with Topic View Inserts in Diffusion 6.6
October 23, 2020 | Paddy Walsh
Diffusion™ 6.3 introduced topic views, a mechanism to produce virtual topics (known as reference topics) that refer to other topics in the topic tree for their value. In 6.4, we added topic view expansion, enabling a single source topic to be expanded to produce many derived reference topics. In Diffusion 6.5, the concepts of fan-out and topic views were combined to allow remote topic views to specify source topics at another server.
Now, in 6.6, we have added topic view inserts, which allow data from topics other than the selected source topics to be merged into the resulting reference topic data.
Inserts allow data from JSON or scalar topics to be inserted into JSON topics.
How are inserts specified?
Inserts are enabled using the new insert clause in a topic view specification.
The insert clause follows the path mapping in a topic view specification and takes the following form:
insert path_specification key source_key at insertion_key default constant
Only the insert and at keywords are mandatory; the key and default parts are optional.
So, for example:
map ?Some_Source_Topics/ to Mapped_Topics/<path(1)> insert Some_Other_Topic at /Some_JSON_Pointer
This maps all topics beneath the path Some_Source_Topics to similarly named topics under the path Mapped_Topics, inserting the complete value of Some_Other_Topic into the current data at the key named Some_JSON_Pointer.
The meaning of each part of the insert clause is described below:
insert path_specification
The insert keyword introduces the clause and specifies the path of the topic from which data is to be obtained and merged with the current data value. The meaning of ‘current data value’ can depend upon the other clauses in the specification and is defined in more detail below. But for now, just think of it as the value from the selected source topic.
The path_specification defines the path of the topic to insert data from. It is similar to the target path mapping in that it can contain:
- Constants
- <path()> directives
- <scalar()> directives
The path directives operate on the path of the selected ‘source’ topic and the scalar directives operate on the current input data as defined above.
For example:
Topic/<path(1, 2)>/<scalar(/foo)>
would specify insertion from a topic whose path is Topic/ followed by two elements of the source topic path starting at index 1 (that is, elements 1 and 2, counting from 0), followed by / and the scalar value at the key /foo in the current input data.
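To make the resolution rules concrete, here is a minimal Python sketch that models how an insert path might be resolved. It is not Diffusion code: resolve_insert_path and lookup are hypothetical helpers, and the sketch assumes that path indexes count from 0 and that the optional second argument of <path()> is a number of elements.

```python
import re


def lookup(value, pointer):
    """Follow a simple JSON pointer such as /foo through nested dicts."""
    for token in pointer.split("/")[1:]:
        value = value[token]
    return value


def resolve_insert_path(spec, source_path, current_value):
    """Model (not Diffusion code) of resolving an insert path specification."""
    parts = source_path.split("/")

    def path_directive(match):
        start = int(match.group(1))
        # Assumption: the optional second argument is a count of elements.
        if match.group(2) is None:
            end = len(parts)
        else:
            end = start + int(match.group(2))
        return "/".join(parts[start:end])

    def scalar_directive(match):
        # Scalar directives read from the current data value.
        return str(lookup(current_value, match.group(1)))

    spec = re.sub(r"<path\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)>", path_directive, spec)
    return re.sub(r"<scalar\(\s*([^)\s]+)\s*\)>", scalar_directive, spec)


# Source topic Topics/A/B with value {"foo": "bar"}
print(resolve_insert_path("Others/<path(1)>", "Topics/A/B", {"foo": "bar"}))
print(resolve_insert_path("Others/<scalar(/foo)>", "Topics/A/B", {"foo": "bar"}))
```

Under these assumptions the two calls resolve to Others/A/B and Others/bar respectively.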
key source_key
This optionally specifies the key (a JSON pointer) of an item within the topic indicated by path_specification. If not specified then it is assumed that the whole of the data value of the selected topic will be inserted.
at insertion_key
This specifies a JSON pointer indicating the location of the insertion in the current data value.
Typically this would be an object key which would indicate the key of the value in the data. If the data already had an item with the same key it would be overwritten, otherwise a new item would be added to the parent indicated by the specified key. The parent would have to exist otherwise the insertion would not occur and a warning would be logged.
This can also indicate an entry in an array. If an index key is provided, the existing entry at the specified index would be replaced. An index of one greater than the current number of entries could be used to append to an array, but it is much easier to use the special ‘-’ character instead. For example, to append to the end of an array at MyArray you can use an insertion key of /MyArray/-.
So if the key was specified as at /Address/Street, it would indicate that the value from the topic to insert from would be inserted into the current data value, within an object called Address, at a key called Street. If the object currently had a key called Street, it would be overwritten; otherwise, it would be inserted into the object.
If the resolved key indicates a scalar item then no insertion will take place.
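The insertion rules above can be modelled in a few lines of Python. This is only a sketch of the described behaviour, not Diffusion's implementation: insert_at is a hypothetical helper, it reads "a scalar item" as meaning that the parent of the insertion point is a scalar, and it models appending to an array only through the ‘-’ form.

```python
def insert_at(current, pointer, value):
    """Insert value into current at a JSON pointer; return True on success."""
    # RFC 6901 unescaping: ~1 becomes /, then ~0 becomes ~
    tokens = [t.replace("~1", "/").replace("~0", "~")
              for t in pointer.split("/")[1:]]
    if not tokens:
        return False  # replacing the whole value is not modelled here

    parent = current
    for token in tokens[:-1]:
        if isinstance(parent, dict) and token in parent:
            parent = parent[token]
        elif (isinstance(parent, list) and token.isdecimal()
              and int(token) < len(parent)):
            parent = parent[int(token)]
        else:
            return False  # parent does not exist: no insertion

    last = tokens[-1]
    if isinstance(parent, dict):
        parent[last] = value  # overwrites an existing key or adds a new one
        return True
    if isinstance(parent, list):
        if last == "-":
            parent.append(value)  # append to the end of the array
            return True
        if last.isdecimal() and int(last) < len(parent):
            parent[int(last)] = value  # replace the existing entry
            return True
    return False  # parent is a scalar, or the array index is not modelled


doc = {"Address": {"Street": "Old Road"}, "MyArray": [1, 2]}
insert_at(doc, "/Address/Street", "New Road")  # overwrite
insert_at(doc, "/MyArray/-", 3)                # append
print(doc)
```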
default constant
Normally, if the source topic to insert from cannot be found (or is not a JSON or scalar topic), or the specified key within it does not exist, then no insertion will take place in the current data value. It will simply be as if the insert had not been specified. However, it is possible to override this behavior using the default keyword.
If default is specified and the topic to insert from or the key within it is not found, the constant value will be inserted as a scalar value at the insertion point.
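The fallback behaviour can be illustrated with a small Python model. It is a sketch under simplifying assumptions, not Diffusion code: apply_insert is a hypothetical function, topics is a plain dict standing in for the topic tree, and both the insertion key and the source key are limited to top-level object keys.

```python
import copy

_NO_DEFAULT = object()  # sentinel meaning "no default keyword was given"


def apply_insert(current, topics, insert_path, at_key,
                 source_key=None, default=_NO_DEFAULT):
    """Model one insert clause against a dict of topic path -> value."""
    result = copy.deepcopy(current)
    found = insert_path in topics
    value = topics.get(insert_path)

    if found and source_key is not None:
        # The optional key clause selects one item of the insertion topic.
        if isinstance(value, dict) and source_key in value:
            value = value[source_key]
        else:
            found = False

    if not found:
        if default is _NO_DEFAULT:
            return result  # as if the insert had not been specified
        value = default    # the constant is inserted as a scalar

    if isinstance(result, dict):
        result[at_key] = value
    return result


source = {"name": "A"}
print(apply_insert(source, {}, "AnotherTopic", "other"))
print(apply_insert(source, {}, "AnotherTopic", "other", default="unknown"))
```

In this model the first call leaves the value untouched, while the second adds the key other with the scalar value unknown.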
What is the ‘current data value’?
The insert clause specifies the path of the topic from which data is to be obtained and merged with the current data value. The current data value is typically the value from the selected source topic; however, there are some situations where this is not the case:
- If the path mapping includes one or more expand directives, the current data value will be the expanded data value. So, for example, if a path mapping is used to expand an array of 5 elements then the insert clause would be executed 5 times, once for each element.
- If the insert is preceded by an as <value(key)> directive, the current data will be the data indicated by the key.
- If the insert is preceded by another insert clause, the current data will be the output from that clause.
Examples of Inserts
map Topic1 to Topic2 insert AnotherTopic at /other
This is the very simplest example, where Topic1 is mapped to Topic2 and the data within AnotherTopic is inserted into it at the key named other. If AnotherTopic does not exist (or is not JSON or scalar), Topic2 will be created with the same value as Topic1 but with nothing inserted. It is assumed that the value of Topic1 is an object; if it is an array, then no insertion will occur.
map Topic1 to Topic2 insert AnotherTopic at /other default "unknown"
As in the previous example, but in this case if AnotherTopic does not exist then Topic2 will be created with the key other inserted with a scalar value of unknown.
map ?Topics/ to Mapped/<path(1)> insert AnotherTopic at /other
Like the first example, but in this case all of the topics under the path Topics will be selected and mapped to topics with the same name under the path Mapped. Every selected topic will have the value of AnotherTopic inserted into it (assuming they are JSON objects), unless AnotherTopic does not exist, in which case no insertions would take place.
map ?Topics/ to Mapped/<path(1)> insert Others/<path(1)> at /other
This example introduces the more powerful path mapping capabilities of the insert clause. In this case, each selected topic has an insertion from a topic with the same name under the path Others. For example, Topics/A/B would generate a reference topic at path Mapped/A/B which has the value of Others/A/B inserted at the key other.
map ?Topics/ to Mapped/<path(1)> insert Others/<scalar(/foo)> at /other
Similar to the previous example, but in this case the path of the insertion topic will be derived from a value within the selected source topic. So if topic Topics/A/B has a value of “bar” at key “foo”, then the topic selected to insert from would be Others/bar.
map ?Topics/ to Mapped/<path(1)> insert Others/<path(1)> key /foo at /other
All previous examples have shown the insertion of the whole value of another topic. Here the key keyword is used to select a specific item foo within the insertion topic value. If the insertion topic does not have a value with key foo, then a reference topic will be created but no insertion will occur, as we have specified no default.
When expand directives are used, the insert will occur for every output from the expansion, so for example:
map Topic1 to Expanded/<expand()> insert AnotherTopic at /other
If we assume that the content of Topic1 is an array of objects, then each array element will be expanded to produce a new topic at path Expanded/0, Expanded/1 and so on, and each resulting reference topic will have the value from AnotherTopic inserted at the key /other.
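As a rough illustration of the expand case, the following Python sketch models what the reference topics would contain. It is not Diffusion code: expand_and_insert is a hypothetical function, and the sample values for Topic1 and AnotherTopic are invented for the example.

```python
def expand_and_insert(source_array, insert_value,
                      prefix="Expanded", at_key="other"):
    """Model: one reference topic per array element, each with the insert."""
    reference_topics = {}
    for index, element in enumerate(source_array):
        merged = dict(element)         # the expanded element is the current data
        merged[at_key] = insert_value  # the insert runs once per element
        reference_topics[f"{prefix}/{index}"] = merged
    return reference_topics


topic1 = [{"name": "first"}, {"name": "second"}]  # invented value of Topic1
another = {"region": "EU"}                        # invented value of AnotherTopic
for path, value in expand_and_insert(topic1, another).items():
    print(path, value)
```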
Insert clauses can be chained as shown in the following example:
map Topic1 to Topic2 insert AnotherTopic at /other insert YetAnotherTopic at /yetAnother
In the above example, values from two different topics are inserted into the data to produce the reference topic.
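Chaining can be thought of as a fold over the insert clauses, where the output of one clause becomes the current data value of the next. The Python sketch below models that idea only; apply_inserts is a hypothetical function, topics is a plain dict standing in for the topic tree, and insertion keys are limited to top-level object keys.

```python
from functools import reduce


def apply_inserts(source_value, clauses, topics):
    """Apply (insert_path, at_key) clauses in order to the source value."""
    def step(current, clause):
        insert_path, at_key = clause
        if insert_path not in topics or not isinstance(current, dict):
            return current  # nothing to insert: pass the value through
        return {**current, at_key: topics[insert_path]}

    return reduce(step, clauses, source_value)


topics = {"AnotherTopic": 1, "YetAnotherTopic": {"x": 2}}  # invented values
clauses = [("AnotherTopic", "other"), ("YetAnotherTopic", "yetAnother")]
print(apply_inserts({"id": 7}, clauses, topics))
```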
And finally, the insert clause can be used along with as <value()> clauses, for example:
map Topic1 to Topic2 insert AnotherTopic at /foo/bar as <value(/foo)>
In this example, the data from AnotherTopic is inserted at the key /foo/bar, then the full value of /foo is projected.
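The order of operations in this example (insert first, then project) can be sketched in Python as follows. This is a model of the described result rather than Diffusion code: insert_then_project is a hypothetical function, the sample values are invented, and the case where the source value has no foo object is not modelled beyond returning None.

```python
import copy


def insert_then_project(source_value, insert_value):
    """Model of: insert AnotherTopic at /foo/bar as <value(/foo)>."""
    current = copy.deepcopy(source_value)
    foo = current.get("foo")
    if isinstance(foo, dict):
        foo["bar"] = insert_value  # step 1: insert at /foo/bar
    return current.get("foo")      # step 2: project the value at /foo


topic1 = {"foo": {"a": 1}, "ignored": True}  # invented value of Topic1
print(insert_then_project(topic1, "from AnotherTopic"))
```

Only the content of foo, including the inserted bar item, ends up in the modelled reference topic value; the ignored key is dropped by the projection.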
Multi-line specifications
As the Domain Specific Language used to write topic views has become more complex, newlines and comments are now supported to improve readability. For example:
map ?A//
from Server1
to <path(1)>
as <value(/foo)>
# Join 2 topics
insert Topic2 at /T2
insert Topic3 at /T3
throttle to 1 update every minute
Caveats
Insertion only occurs at the point in time that a topic view is evaluated for a source topic. This happens when the topic is created and every time it is updated. Interim changes to the insertion topic(s) will NOT be reflected in the reference topic until the source topic is updated.
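The timing caveat is easy to miss, so here is a small Python model of it. This is purely illustrative and not Diffusion code: InsertViewModel is a hypothetical class that re-evaluates its reference value only when the source topic is updated, as described above.

```python
class InsertViewModel:
    """Toy model of a view: map Source to Ref insert Other at /other."""

    def __init__(self):
        self.insertion_value = None
        self.reference_value = None

    def update_insertion_topic(self, value):
        # The view is NOT re-evaluated when the insertion topic changes.
        self.insertion_value = value

    def update_source_topic(self, value):
        # The view is evaluated when the source is created or updated.
        merged = dict(value)
        if self.insertion_value is not None:
            merged["other"] = self.insertion_value
        self.reference_value = merged


view = InsertViewModel()
view.update_insertion_topic("v1")
view.update_source_topic({"id": 1})
view.update_insertion_topic("v2")    # reference topic still holds "v1"
print(view.reference_value)
view.update_source_topic({"id": 1})  # now "v2" is picked up
print(view.reference_value)
```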
The normal caveats relating to topic views also apply to views that use inserts. For example, you should avoid view specifications that derive topics from a highly volatile set of source fields, as there can be high CPU and memory costs relating to rapid addition and removal of topics.
Summary
Topic view inserts introduce powerful new functionality into the ever-developing language of topic views.
You can try topic view inserts right now in Diffusion 6.6 Preview 1, which is supported for production use. Let us know what you think, and especially if you come across any issues or have feature requests about how we can make topic views even more useful.