In a previous post I looked at the history of trust insertion technologies for enterprise storage systems. I commented that these enterprise concepts can be applied to data flowing into an edge ecosystem.
In this post I'd like to propose an example stack that applies trust insertion techniques to incoming device data. The diagram below shows the stack and compares it against the enterprise model.
The right-hand side of the diagram above shows a "thing" that is periodically generating data. This data is "augmented" by trust insertion technologies as it flows towards applications. Each technology is described below (starting from the bottom up):
- Hardware root of trust: if sensor data is ingested into a hardware entity with dedicated trust characteristics, the data can be viewed as "more trustworthy". This type of hardware can, for example, enable the following:
- The data is flowing into a device that has undergone a secure on-boarding process.
- The data is flowing into a device that has completed a secure boot.
- The data can be digitally signed, for example, by using a securely-stored private key. This provides a level of proof that the data passed through this trusted hardware layer.
- Note that the "thing" itself may have its own hardware root of trust characteristics (e.g., it was manufactured with a private key that signs data at the time of creation).
- Trusted identity/ownership: data with a clearly identified (i.e., cryptographically provable) identity can help reduce compliance fines (e.g., the owners of the data are much more likely to comply with regulations regarding the data) and increase data profits (e.g., if the owners ever want to sell their data, they have proof that they are the original owner). Identity may be built into hardware devices (e.g., a TPM), or it may be assigned to IoT ingest frameworks by using new forms of decentralized identity.
- A securely-onboarded, secure-boot HW root of trust device can make a strong statement as to ownership of the data if it signs the data before pushing it to other parts of a corporation or to another edge ecosystem.
- An edge device (e.g., a sensor) may have already signed the data before it arrives at a framework like EdgeX (see below).
- A non-HW-root-of-trust environment may have an owner configured into the system. If this ownership has already been configured and the identity's private key is used to sign the data, the data may be considered more trustworthy.
- Open Ingest Software: the use of an “open” IoT data ingest platform (like EdgeX Foundry) can give a level of confidence to applications that process edge data. With a closed system it is harder to know what functionality was present when the data was ingested. If a well-known, fully open release of ingestion software is being used, with a published version number, the data flowing through that software can be thought of as more trustworthy.
- Provenance Metadata: in addition to Open Ingest software, there may be other software or hardware elements on the ingest device that give additional confidence in the data (e.g., the version of Linux), or software elements that guard against intrusion or threats (e.g., RSA’s Iris project). The presence and versions of these elements can be gathered and “attached” to sensor data as a permanent record that travels with the data (or is otherwise permanently associated with it). This provenance metadata is also important for monetization: it can tell where the data came from (e.g., geography, owner), and it can be used to enforce compliance laws/regulations or as proof of ownership when trying to monetize data.
- Authentication/Authorization support: if the ingest environment has strong authentication or authorization software capabilities, it is making a statement (to a potential application) that additional care was taken to verify that only certain people can (a) make changes to the ingest configuration, or (b) subscribe to receive data events from sensors. If the control path of the ingest device only allows changes to be made by specific administrators (e.g., an EdgeX admin who wants to add a new EdgeX microservice), the data can be viewed as more trustworthy. Similarly, the use of a secure on-boarding management stack can enhance confidence in edge data.
- Secure Transport: as data flows into and out of an ingest ecosystem, it may be possible to use encrypted, deterministic network flows to protect and deliver data. VMware Velocloud (SD-WAN) is an example of a technology that can be used for that purpose. If the network channels have a specific degree of security associated with them, this can improve application confidence in data. Protecting management commands in transit (e.g., configuration changes) may also increase confidence. Finally, if a secure WAN is used to protect data and deliver it on time to a cloud environment, the data flowing over that connection may be viewed as more trustworthy.
- Scale-out, Immutable storage: an edge ingest ecosystem may store data locally on an open edge close to the actual devices. The choice of storage can play a big part in data trustworthiness, and many of the following storage attributes can influence this trust:
- Is the storage system scalable to multiple decentralized parties (e.g., BitTorrent-like accessibility and performance, which is the claim of the IPFS storage system)? This may improve confidence that the data can be delivered in a timely fashion to applications.
- Does the storage system have immutability characteristics (e.g., a content-addressable system that assigns a hash value to stored objects)? This can give the application confidence that the data has not changed.
- Does the storage system have non-delete or data retention capabilities? If so, this may give the application confidence that the data will still be around if needed in the future (or that it will be deleted after a predetermined amount of time).
- Ledgers: the rise of performant, scalable ledgers (e.g., Ethereum, Hyperledger, Project Concord) provides an immutable, trusted way to register the creation of new data assets and associate rich metadata, provenance, and confidence characteristics with them. Ledgers can provide the ability to manage edge data as an asset (keeping a “ledger for data” in the same way one keeps a ledger for financial assets). In addition, when ledgers are applied in a cryptocurrency environment, they provide the potential to exchange data assets for value (e.g., the rise in activity in the area of data brokers and data marketplaces). The use of a ledger can therefore increase application confidence in several ways:
- If a ledger is used to register the creation of assets, confidence increases.
- If a ledger is used to record the fact that policies (see below) are being applied to those assets, confidence increases because now there is an audit trail that can be referenced in the case of an audit, proving compliance and reducing the risk of significant fines.
- Ledgers can also be used to establish data ownership (a private key is used to sign the ledger entry). If data itself can’t be signed, ledgers can be used to prove ownership.
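As a rough illustration of how a few of these layers might combine, here is a minimal Python sketch that wraps a sensor reading with provenance metadata, a content hash, and a signature, and then registers the result in a toy hash-chained ledger. Everything in it is an assumption made for illustration: the function names, the metadata fields, and the key are hypothetical, the ledger is an in-memory list rather than Ethereum, Hyperledger, or Project Concord, and an HMAC stands in for the private-key signature that a hardware root of trust would produce (the Python standard library has no asymmetric signing).

```python
import hashlib
import hmac
import json

# Hypothetical key for illustration only. In a real stack the signing key
# would be a private key protected by hardware (e.g., a TPM).
DEVICE_KEY = b"demo-key-not-for-production"


def canonical(obj):
    """Serialize to deterministic JSON bytes so hashes are reproducible."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def wrap_reading(reading, provenance, key=DEVICE_KEY):
    """Attach provenance metadata, a content hash, and a keyed signature."""
    body = {"reading": reading, "provenance": provenance}
    payload = canonical(body)
    envelope = dict(body)
    envelope["content_hash"] = hashlib.sha256(payload).hexdigest()
    # HMAC is a stand-in for an asymmetric signature (assumption).
    envelope["signature"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return envelope


def verify_envelope(envelope, key=DEVICE_KEY):
    """Recompute the hash and signature and compare with the stored values."""
    body = {"reading": envelope["reading"], "provenance": envelope["provenance"]}
    payload = canonical(body)
    expected_sig = hmac.new(key, payload, hashlib.sha256).hexdigest()
    hash_ok = hashlib.sha256(payload).hexdigest() == envelope["content_hash"]
    return hash_ok and hmac.compare_digest(expected_sig, envelope["signature"])


class Ledger:
    """Toy append-only ledger: each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def register(self, envelope):
        """Record the creation of a data asset by its content hash."""
        prev = self.entries[-1]["entry_hash"] if self.entries else self.GENESIS
        entry = {"content_hash": envelope["content_hash"], "prev_hash": prev}
        entry["entry_hash"] = hashlib.sha256(canonical(entry)).hexdigest()
        self.entries.append(entry)
        return entry

    def is_intact(self):
        """Walk the chain and check every link and every entry hash."""
        prev = self.GENESIS
        for e in self.entries:
            core = {"content_hash": e["content_hash"], "prev_hash": e["prev_hash"]}
            recomputed = hashlib.sha256(canonical(core)).hexdigest()
            if e["prev_hash"] != prev or recomputed != e["entry_hash"]:
                return False
            prev = e["entry_hash"]
        return True
```

An application receiving an envelope could call `verify_envelope` to check that the reading and its provenance have not changed since ingest, and an auditor could call `is_intact` to check that no ledger entry has been rewritten. The provenance dictionary is free-form here; a real deployment would need to agree on which fields (ingest software version, owner, geography, and so on) are recorded.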
These trust insertion technologies are only examples; there are other techniques that can be used (e.g., determining whether sensor readings themselves can be trusted through historical comparisons). The ledger section also refers to the application of policies at the point of ingest: this is another great way to insert trust.
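To make the idea of historical comparison concrete, here is a minimal Python sketch that flags a reading as implausible when it falls too far from the recent history of the same sensor. The function name, the three-standard-deviation threshold, and the choice of a simple z-score are all assumptions made for illustration; a real deployment would likely need something that accounts for trends and seasonality.

```python
from statistics import mean, stdev


def is_plausible(history, reading, threshold=3.0):
    """Return True if `reading` lies within `threshold` standard
    deviations of the mean of `history` (a simple z-score check)."""
    if len(history) < 2:
        # Not enough history to judge; accept rather than reject.
        return True
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: only an identical value is plausible.
        return reading == mu
    return abs(reading - mu) / sigma <= threshold
```

The boolean result could itself be attached to the data as one more piece of confidence metadata rather than being used to drop readings outright.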
These trust insertion technologies can work together to provide trust to applications. How can they be stitched together in an open way? I will explore this question in a future post.
Steve
Dell Technologies Fellow