With devices deployed everywhere, and software and data engineers jumping on board with streaming data technology, there's one small thing that is not looked at, and it baffles me: data integration and business verification.
Event
On 2023-01-01 at 13:55, I finished my lunch. This is considered an event; not a life-changing event, but an event nevertheless. The data in this event is the date and time, and the fact that it was my lunch. All this is fine if one knows the context, and if that's all you're interested in, as long as nobody asks who I am or what I had for lunch. Which brings me to the point of integration. The event is very difficult to integrate back into the cafeteria's lunch administration system. Yes, I had lunch, but which lunch? It is also difficult to integrate with the cash register, for nobody knows who I am, and therefore nobody knows which payment belongs with that lunch.
Minecraft
In the following simplified world of Minecraft, we model the facts regarding the player. The facts are pretty straightforward:
[Inventory]
"player 99234890823 holds torches"
"player 99234890823 holds diamond sword"
[Player Age]
"player 99234890823 is 338 cycles of age"
[Player Experience]
"player 99234890823 has 8873729 experience points"
character and enemy
The enemies in Minecraft are also very simple. The following facts regarding an enemy (being a creeper character) are modelled as well. The character has a few properties which apply as defaults once a character is instantiated into the game as a real living enemy of our player.
[Character Experience]
"creeper has 5 experience points"
[Character Health]
"creeper has 20 health points"
[Character Max Attack]
"creeper has 97 maximum attack points"
[Enemy Age]
"enemy 50 is 338 cycles of age"
[Enemy Character]
"enemy 50 is a creeper"
[Enemy Experience]
"enemy 50 has 5 experience points"
[Enemy Health]
"enemy 50 has 20 health points"
the distance
Naturally, while the player moves around in the world of Minecraft, so will the creepers. The players have their missions, while creepers simply explode as soon as they approach the player. To determine whether a creeper should explode, we first need the fact that states the distance.
[Player Enemy Distance]
"the distance is 3.43 units between player 99234890823 and the enemy 50"
So far, everything is pretty simple, and it allows us to build a system that administers the players, enemies, and characters. On top of that, it can calculate the distance between the parties and trigger a simple creeper explosion when applicable.
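To make the model so far a little more tangible, here is a minimal sketch of how such a system might hold characters and enemies and decide on an explosion. It is an illustration only: the class and function names, the `position` coordinates, and the `TRIGGER_RADIUS` value are assumptions, since the facts above only state a distance and never say at which distance a creeper explodes.

```python
import math
from dataclasses import dataclass

TRIGGER_RADIUS = 4.0  # assumption: the article does not state a trigger distance


@dataclass(frozen=True)
class Character:
    """Defaults shared by every enemy of this kind."""
    name: str
    experience_points: int
    health_points: int
    max_attack_points: int


@dataclass
class Enemy:
    """A living instance of a character, identified by enemy_id."""
    enemy_id: int
    character: Character
    age_cycles: int
    experience_points: int
    health_points: int
    position: tuple  # hypothetical (x, y, z) coordinates


def spawn(enemy_id, character, position, age_cycles=0):
    """Instantiate an enemy, copying the character's defaults."""
    return Enemy(enemy_id, character, age_cycles,
                 character.experience_points, character.health_points, position)


def should_explode(player_position, enemy, trigger_radius=TRIGGER_RADIUS):
    """True when the enemy is within the trigger radius of the player."""
    return math.dist(player_position, enemy.position) <= trigger_radius


creeper = Character("creeper", experience_points=5, health_points=20, max_attack_points=97)
enemy_50 = spawn(50, creeper, position=(3.0, 1.0, 1.0), age_cycles=338)
print(should_explode((0.0, 0.0, 0.0), enemy_50))  # True: sqrt(11) is about 3.32
```

Note that the enemy carries its own identifier (`enemy_id`) from the moment it is spawned. Inside the system that identifier is always available; whether it survives into a reported event is a separate decision.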
Creeper explosion
Statistically speaking, a creeper will explode at some point in the game to fulfill its mission.
[Creeper Explosion Power]
"creeper enemy 50 exploded with a force of .887"
[Distance When Exploding]
"the distance is 3.43 units between player 99234890823 and the enemy 50 when the enemy explodes"
Player Death
In the unfortunate case that the creeper exploded close enough to the player to end the player's life, we need to report a few things as our event.
[Death by Creeper Explosion]
"player 99234890823 died by exploded creeper 50"
The event
Now we have modelled the part of Minecraft relevant to players and creepers, up to the point where an explosion killed the player. Suppose we want to let some other system know about this event. We want to report the death of the player, the cause of death, the circumstances, and the inventory the player left in the playing field.
In this event report we would like to see:
- Which player was killed, which items from the inventory were left behind, the age and experience level.
- Regarding the cause, we'd like to report the character that caused the death and some details of that cause, such as explosion power and distance, and the age of the enemy.
- And naturally the time of death.
To report this data we need only part of the entire model, since we're not interested in exactly which creeper it was. There are too many irrelevant creepers in Minecraft; only the actual players and the items from their inventory are relevant.
A selection of the data elements from the model are these:
As you'll instantly see, there's very little that ties together the strength of the explosion, the distance between the creeper and the player, and the age of the creeper. This is due to our requirements, which did not include knowing which creeper was responsible for the explosion and the death of our player.
And still the event may be reported and be perfectly valid. Looking at the reported data elements in a hierarchy (graphically, or in a JSON or XML file) would show very little problem for the reader. This is the nature of data events: we can pick and choose any data element we would like to report or have reported.
This data event can be reported in different hierarchies as well. In the accompanying screenshot, we organized all data elements as sub-items of our dead player. Nothing stands in our way of organizing it by Item, Age, or anything else for that matter. That is the nature of ever-changing perspectives. We can recognize this as one version of the facts, and many versions of the truth.
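As a rough illustration of "one version of the facts, many versions of the truth", the sketch below takes the selected data elements and arranges them in two different hierarchies. The field names and the two helper functions are assumptions made for this example, and the time of death is a hypothetical value because the model above does not state one. The values themselves come from the facts listed earlier.

```python
import json

# Data elements selected from the model. Deliberately no enemy identifier.
elements = {
    "player": 99234890823,
    "items": ["torches", "diamond sword"],
    "player_age": 338,
    "player_experience": 8873729,
    "cause_character": "creeper",
    "explosion_power": 0.887,
    "distance": 3.43,
    "enemy_age": 338,
    "time_of_death": "2023-01-01T14:00:00",  # hypothetical value
}


def by_player(e):
    """Organize every element as a sub-item of the dead player."""
    return {
        "player": {
            "id": e["player"],
            "age": e["player_age"],
            "experience": e["player_experience"],
            "items_left": list(e["items"]),
            "death": {
                "time": e["time_of_death"],
                "cause": {
                    "character": e["cause_character"],
                    "explosion_power": e["explosion_power"],
                    "distance": e["distance"],
                    "enemy_age": e["enemy_age"],
                },
            },
        }
    }


def by_item(e):
    """Organize the same facts per item left behind."""
    return {
        "items": [
            {"name": item, "left_by_player": e["player"], "left_at": e["time_of_death"]}
            for item in e["items"]
        ]
    }


print(json.dumps(by_player(elements), indent=2))
print(json.dumps(by_item(elements), indent=2))
```

Both documents are valid reports built from the same facts, and neither of them says which enemy exploded.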
Integration
When the data events are coming from our system and we need to integrate them into a real Minecraft world, we will be missing crucial data. The most obvious question in this case is "Which enemy exploded?". And that question, simple as the example is, points to the heart of the problem.
To integrate data from various systems, we need to know and provide the shared identifiers of the data elements on both ends of the communication pipeline. Integration has always proved to be the hard problem. Writing and providing data events is very flexible, but in some cases too flexible for the purpose of real integration.
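The effect of a missing shared identifier can be sketched in a few lines. The receiving system below is hypothetical: enemies 51 and 52, the field names, and the `candidates` helper are all made up for the illustration. The point is only that a reporting-style event cannot be matched to exactly one enemy, while the same event with the identifier can.

```python
# Hypothetical state of the receiving system: several creepers exist.
enemies = [
    {"enemy_id": 50, "character": "creeper", "age": 338},
    {"enemy_id": 51, "character": "creeper", "age": 338},  # made-up twin
    {"enemy_id": 52, "character": "creeper", "age": 12},   # made-up youngster
]


def candidates(event):
    """Return every enemy in the receiving system that the event could refer to."""
    if "enemy_id" in event:
        # Shared identifier present: match on it and nothing else.
        return [e for e in enemies if e["enemy_id"] == event["enemy_id"]]
    # No identifier: fall back to descriptive attributes, which may be ambiguous.
    return [
        e for e in enemies
        if e["character"] == event["cause_character"] and e["age"] == event["enemy_age"]
    ]


reporting_event = {"player": 99234890823, "cause_character": "creeper", "enemy_age": 338}
integration_event = {**reporting_event, "enemy_id": 50}

print(len(candidates(reporting_event)))    # 2: ambiguous, cannot be integrated
print(len(candidates(integration_event)))  # 1: exactly one enemy to update
```

Matching on descriptive attributes may happen to work in a small world, but it stops working as soon as two enemies share the same description.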
So, the next time you come across an API, a Data Event, or a Message Bus, always ask yourself the following questions:
- Is this meant to be for reporting only?
- Is there, at any point in time, a need to integrate the data?
- Can we use the same structure to update, insert or delete the data, and might we need to?
- Does the structure align with cross-system communication definitions?
- How does every data element used map to business terms, meanings and identifications?
game over
This Minecraft example is made up. We do not condone violence in any form, but do recognize the need for play. The real issue displayed above is that too many data events are handled as if they're easy to build, and even easier to manage. And the integration questions, if left unanswered, simply show that the business/IT game is not being played fairly. Technically, a lot is possible, but we tend to take shortcuts, and that leads to technical debt that results in having to do it all over from the start. We should consider the real impact, or it'll be game over all over again.
Note: This article is loosely inspired by Event Modeling by keen.io. It is not meant to burn their product or article. It is meant to make a point about integration needs.