Complex Service Interactions: Service Ontology and the API of APIs

In a basic transaction on Osiris, a user gives tokens to a single service provider, who performs the requested AI task. However, many tasks will require a more complex combination of actions by multiple AI service providers. For example, control of humanoid robots requires multiple AI services—natural language processing, motor control, speech synthesis, etc.—to collaborate according to a particular architecture.

For a simpler example, suppose Alice asks Osiris to summarize a French-language website with embedded video. Her request is sent to a document summarizer service, but perhaps the top-ranked service specializes in summarizing English text. Without recourse to other services, Alice's request cannot be fulfilled. However, by drawing on the network of AI services, we can create an arrangement in which

  1. the text on the website is sent to a document translation service, which returns an English version;

  2. the embedded video is sent to a video summarization service, which returns a textual summary of key facts and events in the video; and

  3. the original document summarizer service puts together these results and provides a useful summary of the website, even though it cannot understand French text or process video.
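The three-step composition above can be sketched in code. This is a minimal illustration, not a real Osiris API: the function names and signatures are hypothetical, and the stubs stand in for paid calls to other service providers on the network.

```python
# Hypothetical sketch of the three-step composition above. The function
# names and signatures are illustrative, not a real Osiris API.

def summarize_website(text_fr: str, video_frames: list) -> str:
    translated = translate(text_fr, source="fr", target="en")      # step 1
    video_summary = summarize_video(video_frames)                  # step 2
    combined = translated + "\n" + video_summary
    return summarize_document(combined)                            # step 3

# Stubs standing in for network calls to other providers.
def translate(text, source, target):
    return f"[{source}->{target}] {text}"

def summarize_video(frames):
    return f"Video summary of {len(frames)} frames."

def summarize_document(doc):
    return "Summary: " + doc[:60]
```

The summarizer never processes French text or video itself; it only consumes the English text the other two services return.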

Because of these interactions between services, the document summarizer provides higher value for its customers and can earn more. Moreover, demand for the other two services grows. The result is a more vibrant marketplace. Interactions can grow more and more complex. The video summarizer can outsource face recognition, object recognition, speech detection, and speech-to-text transcription. The document summarizer may also outsource entity recognition to other services. Any of those can explicitly hire hardware services for storage or GPU access.

Out of this complex, dynamic interaction of numerous network participants carrying out complex AI services using their collective intelligence comes an Osiris-wide AI mind with a level of intelligence that is greater than the sum of its parts. (Notably, contemporary neuroscience’s best understanding is that in the human brain, general intelligence emerges from 300 to 400 distinct subnetworks working together, each with its own architecture and set of functions and connected to specific other subnetworks in a carefully patterned way.) Furthermore, this emergent AI mind will be continually enhanced as AI developers around the world add new nodes to the network, contributing to and profiting from the Osiris economy.

The platform will enable these complex interactions through three layered resources:

  1. At the bottom, the type repository in the Registry allows services to state their inputs and outputs in a standard way. Service ads can say things like the following:

a. I provide outputs of this given type (“text”).

b. I provide outputs of this given type and value (e.g., “Language” is “English”).

c. I require inputs of this given type.

d. I require inputs of this given type and value (e.g., I can summarize docs if “Language” is “English”).

  2. Built on top of the type repository, we will have a collection of APIs, which refer to concrete type data and metadata. This allows for standard specifications for AI tasks like “face recognition,” “document summarization,” “genomic dataset annotation,” and so forth. These APIs are semantically richer versions of the standard gRPC specifications already provided by the services. Because the APIs are public, multiple services built by different developers can implement the same API, providing the same function.

  3. At the top level is an ontology of AI services that makes the APIs understandable and browseable. This ontology will be a directed acyclic graph with a few different roots, covering, for instance, areas of AI, application domains, and so forth. So “face recognition” would be found somewhere in the ontology, and it would be a child node of nodes such as “image processing,” “deep neural networks,” etc.
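The three layers can be sketched as data structures: typed input/output declarations in service ads (bottom), an API name tying those types together (middle), and an ontology DAG for browsing (top). All service names, field names, and ontology entries below are hypothetical illustrations, not the actual repository schema.

```python
# Bottom and middle layers: service ads declaring typed inputs/outputs
# and the public API each service implements (names are hypothetical).
SERVICE_ADS = [
    {
        "service": "acme-doc-summarizer",
        "api": "document-summarization",
        "inputs": {"type": "text", "Language": "English"},  # requires English text
        "outputs": {"type": "text"},                        # produces text
    },
    {
        "service": "babel-translator",
        "api": "document-translation",
        "inputs": {"type": "text"},
        "outputs": {"type": "text", "Language": "English"},
    },
]

# Top layer: the ontology as a DAG, mapping each node to its parents.
ONTOLOGY = {
    "face recognition": ["image processing", "deep neural networks"],
    "image processing": ["areas of AI"],
    "deep neural networks": ["areas of AI"],
}

def ancestors(node: str) -> set:
    """Walk the DAG upward to collect every category a node falls under."""
    seen, stack = set(), [node]
    while stack:
        for parent in ONTOLOGY.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

A browsing tool could use `ancestors("face recognition")` to surface the service under both “image processing” and “deep neural networks,” since a DAG (unlike a tree) allows multiple parents.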

These three levels allow developers to find services that will accept their data as input and perform the desired function. The document summarizer AI developer in the example above needs this structure to identify auxiliary services needed to complete the job.

Provided the specifications at each level are sufficiently precise, they also enable programmatic service discovery: AIs, called matchmaking agents, that connect other AIs.
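A matchmaking agent's core operation can be sketched as type matching: given a consumer's declared input requirements, find providers whose declared outputs satisfy them. The ad format and service names below are assumptions for illustration.

```python
# Minimal matchmaking sketch over hypothetical service ads: match a
# consumer's required input types/values against providers' outputs.

ADS = {
    "babel-translator":    {"in":  {"type": "text"},
                            "out": {"type": "text", "Language": "English"}},
    "acme-doc-summarizer": {"in":  {"type": "text", "Language": "English"},
                            "out": {"type": "text"}},
}

def satisfies(output: dict, required: dict) -> bool:
    # Every key/value the consumer requires must appear in the provider's output ad.
    return all(output.get(k) == v for k, v in required.items())

def find_upstream(consumer: str) -> list:
    """List providers whose outputs meet the consumer's input requirements."""
    need = ADS[consumer]["in"]
    return [name for name, ad in ADS.items()
            if name != consumer and satisfies(ad["out"], need)]
```

Here the summarizer requires English text, so the matchmaker identifies the translator as a compatible upstream service, which is exactly the chain built in Alice's example.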

AI services that serve as evaluators can be developed and launched on the network. They will specialize in assessing and rating the quality of work done by a particular service. This will allow users to search for services that offer a particular service (e.g., face recognition), are compatible with a particular API, and meet a certain standard according to an independent AI evaluator. These automated evaluations are useful for consumers, highly valuable for matchmaking agents, and a key input to the reputation system described in the next section.
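Such a search combining API compatibility with evaluator ratings might look like the following sketch. The record fields and the 0-to-1 score scale are assumptions, not a defined Osiris interface.

```python
# Hypothetical search: services implementing a given API, filtered by an
# independent evaluator's quality score (field names are illustrative).

SERVICES = [
    {"name": "faces-r-us", "api": "face-recognition",       "score": 0.92},
    {"name": "quickface",  "api": "face-recognition",       "score": 0.71},
    {"name": "doc-wizard", "api": "document-summarization", "score": 0.88},
]

def search(api: str, min_score: float) -> list:
    """Return services implementing `api` rated at or above `min_score`."""
    return [s["name"] for s in SERVICES
            if s["api"] == api and s["score"] >= min_score]
```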

Independent evaluators and public, standard APIs make it easy for new entrants to the marketplace to find customers. They can support popular APIs, enabling plug-and-play replacement of existing providers, and use independent evaluators to show the quality of their services to the marketplace.

Particularly for large enterprise customers, specialized agents can be developed to scan for new, exciting services on the market and test them on a particular problem (for example, finding patterns in a financial dataset), helping the customer select a service based on A/B testing or multiarmed bandit selection.
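The multiarmed bandit selection mentioned above can be sketched with a simple epsilon-greedy policy: the scanning agent usually calls the best-performing service so far, but occasionally tries another to keep testing the market. The reward bookkeeping here is a stand-in for whatever quality metric the customer's problem defines.

```python
# Epsilon-greedy selection over candidate services, as a scanning agent
# might use. The stats format and reward source are assumptions.
import random

def pick_service(stats: dict, epsilon: float = 0.1) -> str:
    """stats maps service name -> (total_reward, trials)."""
    if random.random() < epsilon or all(t == 0 for _, t in stats.values()):
        return random.choice(list(stats))  # explore an arbitrary service
    # Exploit: highest mean reward observed so far.
    return max(stats, key=lambda s: stats[s][0] / max(stats[s][1], 1))

def update(stats: dict, service: str, reward: float) -> None:
    """Record the observed reward after trying a service."""
    total, trials = stats[service]
    stats[service] = (total + reward, trials + 1)
```

Over repeated trials, spend shifts toward the service that performs best on the customer's own data, rather than relying solely on marketplace-wide ratings.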
