Linking Data in Supply Chains to Bring Much-Needed Resilience to Global Businesses

OriginTrail v4.1: Permissioned and public data in a single decentralized graph

Branimir Rakic
OriginTrail

--

The ongoing dichotomy between public and permissioned networks in the DLT space with regard to data privacy calls for a radically different, connection-first approach. Connecting permissioned and public data in one Decentralized Knowledge Graph is the key new functionality of OriginTrail v4.1, the first major upgrade to the OriginTrail node client in the Freedom-Gemini stage. It enables exciting new use cases, such as data marketplaces, data portability, and automated GDPR compliance, and positions OriginTrail at the forefront of innovation in DLT solutions for supply chains.

Ever since the inception of the OriginTrail project in early 2013, when it focused on showcasing food provenance to end consumers, it has been clear that obtaining the information needed to trace a product back to its origin is a difficult problem. Understanding what has happened to any object of interest involves tracing the history of relevant supply chain events across the many stages of a possibly globe-spanning supply chain. That history includes events involving the product’s components (i.e. “parts,” usually obtained from different suppliers) and often its aggregations (e.g. a group of products traced together on a pallet). To illustrate the effect of this problem in simple terms, consider a survey conducted by Resilinc in late January and early February, immediately following the COVID-19 outbreak in China: a staggering 70% of the 300 respondents said they were still in data collection and assessment mode, manually trying to identify which of their suppliers had a site in the specific locked-down regions of China.

The reason behind this surprising statistic is that gathering the data needed to obtain such knowledge involves major hurdles. These include heterogeneous data structures across IT systems, limited data accessibility, integration complexity caused by low system interoperability, and a lack of mechanisms for ensuring the integrity of data and digital processes within the supply chain network, so that only relevant and trustworthy information is shared.

Such issues have led to an information landscape of “data silos”, the bridging of which eludes even the largest technology vendors for a very good reason. Tackling this difficult problem of a “fragmented reality” requires a radically new, connection-first approach to how we build supply chain data networks. We need to design, build, and engineer systems with an ever-increasing amount of data in mind (“connected data”), systems that can easily communicate with and ultimately trust one another (“connected systems”). Such interconnectedness of data and systems will vastly improve organizational and supply chain resilience, which is of crucial importance in times of crisis such as the current global pandemic.

The good news is that we’ve seen similar systems built before. The World Wide Web (WWW) is a “network of machines” that effectively communicate with one another. The WWW is also a “network of data” thanks to key underlying concepts such as hypertext. The WWW owes its success to a complex stack of open protocols, standards, and infrastructure built on connection-first principles.

Connecting Systems Can Now Be Done While Escaping Vendor Lock-Ins

The OriginTrail ecosystem has been evolving with a connection-first approach from the very beginning. The launch of the permissionless OriginTrail Decentralized Network (ODN) in 2018 (Vostok stage) provided the necessary infrastructure, efficiently utilizing a stack of open technologies: the Ethereum blockchain in the consensus layer, a decentralized P2P Kademlia overlay network, strong cryptographic primitives, and open data standards, to name a few. Through iterative development, the system has vastly improved over time, culminating in the major v4 release of the network client in December 2019, which marked the beginning of the Freedom-Gemini development stage and enabled higher data scalability, better system connectivity, and interoperability.

As of today, the ODN spans the globe with participating network nodes distributed across continents, operated by both companies and individuals in a permissionless manner (according to the specifications of the OriginTrail protocol), and facilitating easy connectivity with legacy systems from major vendors such as Oracle, SAP, Microsoft, and Salesforce. Being a completely open-source, permissionless network, OriginTrail effectively tackles the vendor lock-in problem, enabling frictionless system connections through high interoperability.

Connecting Data, the Google Way

The mechanisms within the OriginTrail protocol today closely resemble the way Google uses hyperlinks between web pages to understand their contents, the key factor that differentiated Google from early web page indexes such as Yahoo and that later partially evolved into the famous PageRank algorithm. Both technologies harness the power of their respective connection-first data structures, also known as knowledge graphs. However, important differences between the supply chain IT landscape and the World Wide Web require a more complex approach to building the global supply chain knowledge graph than the one Google has taken, specifically with respect to data governance, decentralization, and the standards employed.

Connections are “first-class citizens” in knowledge graph data structures.

Therefore, all participating ODN nodes support a novel data structure, the Decentralized Knowledge Graph (DKG), within the ODN data layer. It has several key characteristics:

  • Linked-data-first structure: The graph enables connections between data points from all datasets published on the network and conforms to Semantic Web technologies such as RDF and JSON-LD.
  • Schema flexibility: Enabling the mapping of virtually any data model, preferably structured according to relevant standards (such as GS1 EPCIS and CBV) and recommendations (W3C Web of Things, Verifiable Credentials, PROV, etc.) for machine readability.
  • Identity verification: Enabling the utilization of novel identity frameworks such as self-sovereign identity (SSI), in conjunction with industry-specific identification schemes (such as GS1 GTIN, GIAI, and GRAI).
  • Efficient cryptographic integrity verification of subgraphs: Using associated dataset graph fingerprints, computed as Merkle roots of the input datasets.
  • Cryptographic connection entanglement: Allowing the linking of data points only when specific cryptographic rules are satisfied.
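
The dataset fingerprint mentioned in the list above can be illustrated with a short sketch. It is not the OriginTrail implementation: the canonical JSON encoding, the use of SHA-256, the rule of pairing an odd node with itself, the sample records, and the function names `leaf_hash` and `merkle_root` are all assumptions made for this example.

```python
import hashlib
import json
from typing import List


def leaf_hash(record: dict) -> bytes:
    """Hash one record using a canonical JSON encoding (an assumption of this sketch)."""
    encoded = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    # The 0x00 prefix keeps leaf hashes distinct from inner-node hashes.
    return hashlib.sha256(b"\x00" + encoded).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Fold leaf hashes pairwise into one root; an odd node is paired with itself."""
    if not leaves:
        raise ValueError("cannot fingerprint an empty dataset")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Made-up supply chain events for one product identifier.
dataset = [
    {"id": "urn:epc:id:sgtin:0614141.107346.2017", "event": "commissioning"},
    {"id": "urn:epc:id:sgtin:0614141.107346.2017", "event": "shipping"},
    {"id": "urn:epc:id:sgtin:0614141.107346.2017", "event": "receiving"},
]
fingerprint = merkle_root([leaf_hash(r) for r in dataset])
print(fingerprint.hex())
```

Changing any field of any record changes the root, which is what makes a single short fingerprint sufficient to check the integrity of a whole dataset.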

In short, querying the OriginTrail DKG makes it possible to find all available (connected) information on a particular supply chain object in a matter of seconds. The result can span any number of datasets with any structure, originating from different IT systems, while the provenance and integrity of the data remain easy to verify.
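
To make the idea of one query spanning several datasets more concrete, here is a small sketch that merges edges from independent datasets into one graph and walks outward from an object identifier. The dataset names, identifiers, relations, and the function `connected_info` are invented for illustration; a real DKG query goes through the node client rather than an in-memory dictionary.

```python
from collections import defaultdict, deque

# Edges published by three hypothetical systems: (subject, relation, object).
datasets = {
    "producer_erp": [("batch:42", "containsIngredient", "batch:7")],
    "logistics_wms": [("pallet:9", "aggregates", "batch:42")],
    "supplier_erp": [("batch:7", "producedAt", "site:farm-1")],
}


def connected_info(start, datasets):
    """Breadth-first walk over the merged graph, ignoring edge direction.

    Returns every reachable edge together with the dataset it came from.
    """
    adjacency = defaultdict(list)
    for source, edges in datasets.items():
        for subj, rel, obj in edges:
            edge = (subj, rel, obj, source)
            adjacency[subj].append(edge)
            adjacency[obj].append(edge)

    seen_nodes, seen_edges, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for edge in adjacency[node]:
            if edge not in seen_edges:
                seen_edges.append(edge)
            for neighbour in (edge[0], edge[2]):
                if neighbour not in seen_nodes:
                    seen_nodes.add(neighbour)
                    queue.append(neighbour)
    return seen_edges


for subj, rel, obj, source in connected_info("batch:42", datasets):
    print(f"{subj} -{rel}-> {obj}  (from {source})")
```

Starting from `batch:42`, the walk reaches the pallet it was shipped on and the farm its ingredient came from, even though each fact was published by a different system.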

Not All Data Is Created Equal

Ever since the emergence of public decentralized networks, a major concern for both enterprises and individuals has been the inherent transparency of such networks. Naturally, the prospect of publishing sensitive data on a public, decentralized, and immutable network has been an obstacle to the adoption and proliferation of the value propositions of novel consensus mechanisms implemented in a large number of upcoming innovative protocols. Hoping to harness the value offered by the revolutionary technology of the blockchain, many have experimented with building “permissioned”, private blockchain networks that are usually managed (and controlled) by a small number of parties. Such implementations offer some additional value in certain cases, including somewhat higher transaction throughput (TPS), since fewer network nodes are needed to form consensus. However, they inevitably become another information silo, with a questionable degree of trust for anyone outside that permissioned network.

The OriginTrail Decentralized Knowledge Graph approaches this dichotomy of private and public from a unified perspective. In any arbitrary sample of information belonging to a data subject (a company, institution, or individual), some parts of the data will inevitably be shared only with specific parties, while other parts are shared publicly. As an example, a person might want to put relevant information about their professional career on their public LinkedIn profile, but will most certainly try to keep information such as their social security number (SSN) secret and present it only to parties that need to see it.

Connecting “Private” and “Public” Data: The 4.1 ODN Release

OriginTrail applies the same principle as in the example above. A data owner can store structured, cryptographically verifiable information publicly in a cost-effective, decentralized way, while keeping that information connected, via relevant cryptographic pointers, to the private data held in the owner’s local version of the DKG. Participating parties can decide sovereignly whether to share their private information (or private subgraphs) with other participants (verifiers) in the network, based on the verifiers’ network identity (i.e. their ERC725 identity). Verifiers can then cryptographically verify the integrity and provenance of the shared information using the publicly available proofs in the DKG.
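
One common way to check privately shared data against a public proof is a Merkle inclusion proof, sketched below. This is an illustration of the general technique under stated assumptions, not the OriginTrail protocol: the hash function (SHA-256), the proof format, the sample records, and the names `leaf`, `build_levels`, `make_proof`, and `verify` are all hypothetical.

```python
import hashlib
from typing import List, Tuple


def leaf(record: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + record).digest()


def node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every tree level, leaves first; an odd node is paired with itself."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        level = levels[-1]
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        levels.append([node(level[i], level[i + 1]) for i in range(0, len(level), 2)])
    return levels


def make_proof(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Collect sibling hashes from leaf to root; the flag marks a left-hand sibling."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(record: bytes, proof: List[Tuple[bytes, bool]], public_root: bytes) -> bool:
    """Recompute the path from the shared record up to the root and compare."""
    current = leaf(record)
    for sibling, sibling_is_left in proof:
        current = node(sibling, current) if sibling_is_left else node(current, sibling)
    return current == public_root


# The owner keeps these records private and publishes only the root.
private_records = [b"supplier=ACME;site=plant-3", b"price=12.40 EUR", b"contract=2020-017"]
levels = build_levels([leaf(r) for r in private_records])
public_root = levels[-1][0]

# Later, record 0 and its proof are sent privately to a chosen verifier.
proof = make_proof(levels, 0)
print(verify(private_records[0], proof, public_root))            # True
print(verify(b"supplier=OTHER;site=plant-3", proof, public_root))  # False
```

The verifier needs only the shared record, the proof, and the public root, so the other private records are never revealed.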

In this way, the system mimics the natural state: Not everyone has all the same information nor wants to share their private information with everybody, but they are joined to the same network of entities (ODN nodes, connecting systems) and the same network of data (the DKG, connecting data). Because OriginTrail is not a blockchain, but rather a decentralized graph data sharing and management network operating in conjunction with the blockchain within its protocol stack, it evades many constraints that are imposed by the nature of distributed consensus algorithms and focuses on linking data in a permissionless, connection-first decentralized network.

After significant effort and testing by the core developers, we are proud to announce the launch of the first major update — the v4.1 version — coming to the OriginTrail Testnet and Mainnet this week! With this update, the OriginTrail DKG will enable:

  • Adding data to the network DKG by attaching permissioned claims, according to the verifiable credentials data model;
  • Assigning specific access permissions at the subgraph vertex level; and
  • Verifying graph integrity (containing both public and permissioned data), utilizing the same OriginTrail protocol spec.
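
As a rough sketch of how vertex-level permissions and a single integrity check can coexist, the example below replaces the data of any vertex a viewer may not see with its hash and computes the graph fingerprint over those hashes, so the public view and the permitted view verify to the same value. The graph layout, the identities, and the functions `data_hash`, `view_for`, and `fingerprint` are assumptions made for illustration, not the OriginTrail protocol spec.

```python
import hashlib
import json


def data_hash(data: dict) -> str:
    encoded = json.dumps(data, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Hypothetical graph: `allowed` is None for public vertices, else a set of identities.
graph = {
    "product:1": {"data": {"name": "Olive oil 1L"}, "allowed": None},
    "price:1": {"data": {"unit_price_eur": 4.2}, "allowed": {"0xRetailer"}},
}


def view_for(identity: str, graph: dict) -> dict:
    """Return what `identity` may see: full data where permitted, only the hash otherwise."""
    view = {}
    for vertex_id, vertex in graph.items():
        allowed = vertex["allowed"]
        if allowed is None or identity in allowed:
            view[vertex_id] = {"data": vertex["data"]}
        else:
            view[vertex_id] = {"data_hash": data_hash(vertex["data"])}
    return view


def fingerprint(view: dict) -> str:
    """Hash over (vertex id, data hash) pairs, so redacted and full views agree."""
    pairs = sorted(
        (vid, v["data_hash"] if "data_hash" in v else data_hash(v["data"]))
        for vid, v in view.items()
    )
    return hashlib.sha256(json.dumps(pairs).encode("utf-8")).hexdigest()


public_view = view_for("0xAnyone", graph)
retailer_view = view_for("0xRetailer", graph)
print("price visible to public:  ", "data" in public_view["price:1"])
print("price visible to retailer:", "data" in retailer_view["price:1"])
print("same fingerprint:", fingerprint(public_view) == fingerprint(retailer_view))
```

A production design would also salt the per-vertex hashes, since low-entropy values such as prices could otherwise be guessed from an unsalted hash.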

Note: The v4.1 update is backwards compatible and will be accompanied by updated documentation for developers and system integrators who want to test the new features. We highly recommend that node runners update to v4.1: forthcoming updates will introduce features that are incompatible with older (pre-v4.1) versions, so nodes that are not updated will have limited functionality in the future.

New Use Cases Enabled by the OriginTrail DKG

With the new features introduced, the ODN enables several new use cases for the OriginTrail technology. The key ones to highlight are:

  • Supply Chain Search Engine (a decentralized, trust-minimized EPCIS repository): As with common search engines, companies, shoppers, and regulatory bodies can look up any supply chain entity (product, actor, document, etc.) by its identifier to retrieve its historical graph with all relevant public information, including pointers to data that exists but is permissioned. All the retrieved information is structured according to relevant standards, machine-readable, and interoperable with existing systems.
  • Decentralized Data Marketplace: Enabling the purchase of datasets directly from content (data) creators without intermediaries and with inherent trust. Such an MVP implementation has already been developed within the Food Data Market project, utilizing the OriginTrail DKG and the FairSwap protocol, and will be piloted within the Laboratory Data Marketplace project through the Block.IS program.
  • GDPR-related data provenance and portability: Enabling web portals to show their visitors, transparently and with verifiable integrity, the provenance of their personal data and how it has been used, and enabling visitors to easily extract all of their personal data and transfer it to another service. As OriginTrail is already compatible with the SSI framework, the new features enable a truly decentralized and automated DSAR (data subject access request) process.

Visit www.origintrail.io and learn more about the possibilities for building supply chains of the future. Subscribe here and keep in touch for more information to come.

--

Branimir Rakic
OriginTrail

Builder, explorer and a glass-half-full type of a person. Into social empowering technologies & art. Co-founder and CTO at @origin_trail