Reputation System

As the number of services on Osiris grows, many AI services will perform the same function, and AI service users (whether humans or other AIs) will be faced with a choice. The Osiris reputation system helps them navigate this choice by quantifying the reputation each service has earned through its previous work.

This is critical for making choices about everyday transactions in the network, and it also plays a core role in network governance and resource allocation.

Rating system design is complicated, and the Osiris reputation system will need to evolve along with the network. We are currently experimenting with an initial design that will combine explicit ratings by consumers, financial transaction trails, and machine learning to detect fraudulent and malicious behavior.

At the most basic level, after each exchange of services for tokens (or for other services), all parties involved are asked to rate each other on a [0, 1] scale. In this simple version, an AI service's rating is the distribution of its past ratings. This distribution can be summarized as an average value together with a count of how many times the service has been rated. The average can incorporate time decay, so that recent ratings are weighted more heavily than those in the distant past.
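The basic scheme above can be sketched as follows. This is an illustrative implementation, not the Osiris specification: the `RatingSummary` structure and the 90-day half-life are assumptions chosen for the example.

```python
import math
from dataclasses import dataclass, field

HALF_LIFE_DAYS = 90.0  # assumed decay half-life; the actual value is a design choice

@dataclass
class RatingSummary:
    """Ratings in [0, 1], stored as (timestamp_seconds, value) pairs."""
    ratings: list = field(default_factory=list)

    def add(self, value: float, timestamp: float) -> None:
        if not 0.0 <= value <= 1.0:
            raise ValueError("ratings must lie in [0, 1]")
        self.ratings.append((timestamp, value))

    @property
    def count(self) -> int:
        # How many times the service has been evaluated.
        return len(self.ratings)

    def decayed_average(self, now: float) -> float:
        """Exponentially time-decayed mean: recent ratings weigh more."""
        if not self.ratings:
            return 0.0
        weights = [
            math.exp(-math.log(2) * (now - ts) / (HALF_LIFE_DAYS * 86400))
            for ts, _ in self.ratings
        ]
        total = sum(w * v for w, (_, v) in zip(weights, self.ratings))
        return total / sum(weights)
```

With a 90-day half-life, a rating given one half-life ago counts half as much as one given now, so the summary drifts toward a service's recent behavior.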

Consumers and providers are not required to rate each other. Some defaults can be inferred from their behavior: if a customer withholds payment and triggers escrow arbitration, it is safe to assume they are dissatisfied with the service provider; if a customer comes back, it can be assumed they are satisfied. How consumers and providers manage their payment channels can also indicate trust (substantial commitments, long-lived channels) or dissatisfaction (a channel is not renewed after expiration).
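One way to fold these behavioral signals into the rating scheme is to map each signal to an implicit rating and fall back on those when no explicit rating was given. The signal names and the implied values below are assumptions for illustration only.

```python
from typing import Optional

# Hypothetical mapping from observed behavior to an implied [0, 1] rating.
# These values are illustrative assumptions, not part of any specification.
IMPLICIT_RATINGS = {
    "escrow_arbitration_triggered": 0.0,  # payment withheld → dissatisfied
    "repeat_purchase": 1.0,               # customer came back → satisfied
    "channel_renewed": 0.9,               # long-lived channel → trust
    "channel_expired_unrenewed": 0.3,     # channel not renewed → dissatisfaction
}

def infer_rating(explicit: Optional[float], signals: list) -> Optional[float]:
    """Prefer an explicit rating; otherwise average any recognized signals."""
    if explicit is not None:
        return explicit
    observed = [IMPLICIT_RATINGS[s] for s in signals if s in IMPLICIT_RATINGS]
    return sum(observed) / len(observed) if observed else None
```

Returning `None` when neither an explicit rating nor a recognized signal exists keeps unrated transactions out of the average rather than counting them as neutral.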

Ratings can be multidimensional. This multidimensional rating system is a critical component of the Osiris economic and governance models. Dimensions of reputation can include general service performance, timeliness, accuracy, value for money, and so on. Other aspects reflect measures taken by a network participant to demonstrate its good standing. The following are some examples:

● a stake deposited by a consumer or service owner, to be forfeited should its rating (in some dimension) fall below a given threshold

● a “benefit rating” component, which derives from evaluations restricted to an AI service's performance on beneficial tasks (this is key for future access to benefit tasks)

● validation by external actors, such as proof of ownership by a reputable company provided by a KYC service or a legal agreement promising to uphold data privacy regulations

● in the case of open-source software, validation via a checksum that ensures the code being advertised matches a specific release in the repository.
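The last item, checksum validation, can be sketched concretely. The example below compares the SHA-256 digest of an advertised code artifact against the digest published for a release; the artifact bytes and digest here are illustrative.

```python
import hashlib

def verify_release(artifact: bytes, advertised_sha256: str) -> bool:
    """Return True if the artifact's SHA-256 matches the advertised digest."""
    return hashlib.sha256(artifact).hexdigest() == advertised_sha256

# Illustrative usage: the "release" bytes stand in for a real tarball.
release = b"example release tarball contents"
published_digest = hashlib.sha256(release).hexdigest()

print(verify_release(release, published_digest))            # True
print(verify_release(b"tampered contents", published_digest))  # False
```

A match proves the advertised code is byte-for-byte identical to the repository release; any tampering changes the digest.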

Despite the need for multiple dimensions and conceptual aspects in a ratings system, for some purposes it is valuable to have a single-number rating—for example, to assess the basic integrity and trustworthiness of an Agent. To fulfill this requirement, the Osiris reputation system includes a “base reputation” rating for each Agent that is a real number between 0 and 5. For some purposes, the number 2 is used as a “base reputation threshold.” For example, full participation in governance is accessible only to Agents with a base reputation of 2 or higher.
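The base-reputation gate described above amounts to a simple threshold check. The 0–5 scale and the threshold of 2 come from the text; the function name and validation are illustrative.

```python
BASE_REPUTATION_THRESHOLD = 2.0  # minimum for full governance participation

def can_participate_in_governance(base_reputation: float) -> bool:
    """Full participation in governance requires base reputation >= 2."""
    if not 0.0 <= base_reputation <= 5.0:
        raise ValueError("base reputation must lie in [0, 5]")
    return base_reputation >= BASE_REPUTATION_THRESHOLD
```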

Defense against rating-system fraud and attacks is a nuanced issue and will likely require a variety of machine learning models dedicated to analyzing transaction and rating patterns to detect malicious participants. This is an area of active research within Osiris.