Versile Decentral Identities

This is the Versile Decentral Identities 0.8 specification.

VDI is a mechanism for decentrally created identities which can be used to authenticate with services or other users, and which can encrypt/sign information on behalf of the identity. A decentral identity is represented by the public-key component of an RSA keypair generated from a set of secret identity data.

As an example, VDI would generate from the passphrase 'w9FLk2pDnc9G9f' the following 1024-bit RSA key pair which can be used for authentication, encryption or signatures:

-----BEGIN VERSILE RSA KEY PAIR-----
/3eu580QMwsVNmB/8Tvl3TEscThfxo8Ly2rVFEKHuHNZiTsERWl03pyANvAfNZCDMi03CmmXmWWn
EIojDWof/zG0c6AwSXOvyFeNVTgWQBZBhw3YMrh0Oz42T5KaKW4AmTi5+sGZajvF1U/z37ZUt8tk
qoxzwApFo9cDI7b+i9ifkPj/Cv93obSfy31VnswCVeeDOFbTMsKOm0XpVL9ByzSK+g3XX41XqfA+
upVh3cxtHn2YWSOX2A5hZ46kA9xvyk6JsoFuwl6bMhR5v3w8i6RfVmJ5Ts0Jp500mveLqapoJLsk
LqjXyT1zBRE+H61LfGEyvJLTgpD1qyvtz+4fmAd4tqVxNer/N8tZvpRV+D/KoVhHFclxVAh5O64N
6DZVdc007Lgc5XeVUmsE919hjodtpsU0XjgfobNj5EpLJMFWAXHmR7kvMrr/N9wwsNKaeoT/QI+s
Wq6Gq8Sja/KYyuVOsFo0EqcK/Oar7z96CRpZXXI7PWeTiSxU2+3mfiORxwFsw4XP/tJBvMA=
-----END VERSILE RSA KEY PAIR-----

VDI enables public-key based mechanisms with the same convenience as (somewhat complicated) passwords.

The Problem Solved by VDI

Standard identity management practices typically involve forced central authentication of identities (e.g. a “user” is authenticated by a “password”), weak password protection schemes (as people often choose insecure passwords) and complex user/password ecosystems (with too many usernames and passwords to remember, so people often reuse the same username and password).

VDI aims to improve this situation by using public keys to authenticate an identity (which enables decentral authentication), with keypairs that are generated from a set of secrets fed through cryptographically strong pseudo-random functions. By using the same algorithm for key generation on different devices, the same set of identity secrets will always produce the same identity.

With VDI the public key component takes the role of a ‘username’, which can be authenticated by the private key component, which plays the role of a ‘password’. However, there is no need for central authentication because a user can prove knowledge of the ‘password’ using the private key. Also, the private key component never needs to be revealed to a third party, so the ‘password’ cannot be stolen from a server.

Note

A central service can still maintain a map from more user-friendly “user names”, such as “email addresses” or “aliases”, to VDI identities. Such a map and user space would have to be authenticated by a central service.

A user can use the same identity across multiple services, or if the user wants to use different identities for different services it is sufficient to introduce slight variation in the identity secrets (such as adding the name of the service) in order to produce a unique new identity.

Decentral Identity Scheme A

The VDI scheme Decentral Identity Scheme A (currently the only defined scheme) is identified with scheme identifier ‘dia’, or in combined form as ‘dia<bits>’ which includes the number of bits in the generated RSA key of the identity, e.g. ‘dia2048’ for a 2048-bit RSA key.

Secret Phrase

An identity RSA key length must be specified, given as the number of bits of the key. DIA requires the length to be at least 512 bits (in practice this provides insufficient security, but the standard does not force higher bit counts), and the key length must be a multiple of 8.
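The key length rules and the combined ‘dia<bits>’ identifier can be illustrated with a minimal Python sketch. The function name dia_scheme_id is hypothetical and not part of the specification; only the 512-bit minimum, the multiple-of-8 rule and the identifier format are taken from the text.

```python
def dia_scheme_id(bits: int) -> str:
    """Return the combined scheme identifier, e.g. 'dia2048'.

    Hypothetical helper: the name is an assumption of this sketch; the
    two checks mirror the key length rules stated by DIA.
    """
    if bits < 512:
        raise ValueError("DIA requires a key length of at least 512 bits")
    if bits % 8 != 0:
        raise ValueError("DIA requires the key length to be a multiple of 8")
    return "dia%d" % bits


print(dia_scheme_id(2048))  # dia2048
```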

The identity is generated from three string data components:

Component    Represents
Purpose      A label of the purpose for the identity
Personal     Some easy-to-remember personal data which helps make the identity unique
Passphrase   Passphrase protecting the identity

The ‘purpose’ component is any string which identifies the intended scope the identity is used with, e.g. it could be ‘gmail.com’, ‘ebay.com’, or another part of a URL associated with the service the identity is used with. It does not need to be secure; it should be something which is very easy to remember and associate with the identity used (such as the URL of a service).

The ‘personal’ component is an inexpensive way of adding some data to the key which will help make it ‘unique’, and make it even harder to use brute-force attacks against the key. E.g. a string like ‘my car is really slow’ is easy to remember. Even using a name or birthdate would add some additional protection.

The ‘passphrase’ component is the main component protecting the key. It should be “long and complex” in order to protect it against brute-force attacks. DIA requires the passphrase to be at least 10 characters long, and for reasonable ‘practical security’ it should be at least 13-14 characters (see Identity Security).

Warning

The passphrase must be of high quality (which typically implies it should be randomly generated from a good source of cryptographically strong random data) and of sufficient length in order to protect the identity.
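One possible way to produce such a passphrase is sketched below in Python, using the standard library secrets module as the source of cryptographically strong randomness. The function name random_passphrase, the default length of 14 and the 62-symbol alphabet (letters and digits) are assumptions of this sketch; the specification only requires a minimum of 10 characters.

```python
import secrets
import string

# Assumed alphabet: upper- and lower-case letters plus digits (62 symbols).
ALPHABET = string.ascii_letters + string.digits


def random_passphrase(length: int = 14) -> str:
    """Return a random passphrase drawn uniformly from ALPHABET."""
    if length < 10:
        raise ValueError("DIA requires a passphrase of at least 10 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_passphrase())  # different on every run
```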

Identity Generation

An identity is generated from secret identity data as per the following algorithm:

  1. Let p_str be the concatenation of the strings ‘Purpose:’, purpose, ‘:Personal:’, personal, ‘:Passphrase:’, passphrase
  2. Let p_in be the UTF-8 encoded byte representation of p_str
  3. Let phrase be the concatenation of p_in, b’:Scheme:dia’, and the (ASCII) number of key bits

E.g. for a 1024-bit key with purpose ‘str_a’, personal ‘str_b’ and passphrase ‘str_c’ the phrase would be

'Purpose:str_a:Personal:str_b:Passphrase:str_c:Scheme:dia1024'
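Steps 1-3 can be sketched in Python as follows. The function name dia_phrase is hypothetical; the string labels, the UTF-8 encoding and the ':Scheme:dia' suffix are taken from the steps above.

```python
def dia_phrase(purpose: str, personal: str, passphrase: str, bits: int) -> bytes:
    """Build the byte string 'phrase' from the three secret components."""
    # Step 1: concatenate the labelled components.
    p_str = ("Purpose:" + purpose + ":Personal:" + personal
             + ":Passphrase:" + passphrase)
    # Step 2: UTF-8 encode.
    p_in = p_str.encode("utf-8")
    # Step 3: append the scheme identifier and the key length in ASCII digits.
    return p_in + b":Scheme:dia" + str(bits).encode("ascii")


print(dia_phrase("str_a", "str_b", "str_c", 1024))
# b'Purpose:str_a:Personal:str_b:Passphrase:str_c:Scheme:dia1024'
```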

Let the pseudo-random function p_rand be the PRF defined by RFC 5246 using SHA256 as the hash function, with an empty secret and seed=phrase.
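A minimal Python sketch of such a byte stream is shown below. It implements the P_hash expansion from RFC 5246 section 5 with HMAC-SHA256. Note the assumption: RFC 5246 defines PRF(secret, label, seed) as P_hash(secret, label + seed), and since no label is mentioned here, this sketch assumes an empty label. The names p_sha256 and p_rand_bytes are hypothetical.

```python
import hashlib
import hmac


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash data expansion from RFC 5246 section 5, with SHA256."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def p_rand_bytes(phrase: bytes, length: int) -> bytes:
    """First 'length' bytes of p_rand (ASSUMPTION: empty label)."""
    return p_sha256(b"", phrase, length)
```

Because the expansion is deterministic, requesting more bytes only extends the output; the leading bytes stay the same.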

Let n be the number of bytes (not bits) of the generated identity key, i.e. the key length in bits divided by 8. Let p_len be n divided by 2 and rounded down. Let q_len be n-p_len.

Generate a prime p by extracting the first p_len bytes from p_rand, setting the two most significant bits and the least significant bit to 1, and interpreting the resulting byte sequence as a non-negative integer (most significant byte first). Let candidate be the resulting number. Test whether candidate is a prime; if not, increase candidate by 2. Keep testing for primality and adding 2 until a prime is found. Let p be the resulting prime.

Generate a prime q by extracting the next q_len bytes from p_rand and executing the same algorithm to produce a prime.

Create an RSA key from components p and q, using 65537 as the public exponent (if 65537 is not a valid exponent use the smallest number e>65537 which is valid).
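The prime search and key construction can be sketched in Python as below. This is an illustration under stated assumptions, not a reference implementation: the primality test is not specified here, so the sketch uses Miller-Rabin with random bases; a "valid" exponent is assumed to mean one coprime to (p-1)(q-1); and the private exponent is computed modulo (p-1)(q-1). All function names are hypothetical, and stream stands for the first n bytes of p_rand.

```python
import math
import secrets

_SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_probable_prime(n: int, rounds: int = 40) -> bool:
    """Miller-Rabin with random bases (a choice made by this sketch)."""
    if n < 2:
        return False
    for sp in _SMALL_PRIMES:
        if n % sp == 0:
            return n == sp
    d, s = n - 1, 0
    while d % 2 == 0:  # write n - 1 as d * 2**s with d odd
        d //= 2
        s += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a witnesses that n is composite
    return True


def prime_from_bytes(data: bytes) -> int:
    """Set the top two bits and the lowest bit, then step by 2 to a prime."""
    buf = bytearray(data)
    buf[0] |= 0xC0
    buf[-1] |= 0x01
    candidate = int.from_bytes(buf, "big")  # most significant byte first
    while not is_probable_prime(candidate):
        candidate += 2
    return candidate


def dia_rsa_from_stream(stream: bytes, bits: int):
    """Return (p, q, n, e, d) built from the first bits // 8 stream bytes."""
    n_bytes = bits // 8
    if len(stream) < n_bytes:
        raise ValueError("not enough pseudo-random bytes")
    p_len = n_bytes // 2
    q_len = n_bytes - p_len
    p = prime_from_bytes(stream[:p_len])
    q = prime_from_bytes(stream[p_len:p_len + q_len])
    phi = (p - 1) * (q - 1)
    e = 65537
    while math.gcd(e, phi) != 1:  # ASSUMPTION: valid means coprime to phi
        e += 2
    d = pow(e, -1, phi)  # modular inverse, Python 3.8+
    return p, q, p * q, e, d
```

Setting the two most significant bits of each prime guarantees that the product p*q has the full requested bit length.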

Identity Security

The primary protection of the identity is the ‘passphrase’ which critically needs to be a “strong” password in order to protect against brute-force attacks.

Note

A strong passphrase is critical to the security of the identity, and strong passphrases can be hard for many users to create and remember. However, because the scheme essentially allows reusing the same passphrase for many different identities, the user has an incentive to create and remember a single very complex passphrase, with the benefit that everything else gets much easier to remember.

The public key component of a decentral identity can be attacked by brute force, as an attacker can perform offline computations to try to identify a set of secret identity data which produces the identity’s known public key component. It is important that the secret id data makes such attacks practically infeasible, i.e. the expected time/resources should by far exceed the value of achieving a successful attack.

Let us compute what this means in terms of how the secret id data is constructed. For our computation we will assume there are no weaknesses discovered in the pseudo-random function and its relationship with prime number generation - nor the rise of “quantum computing” or other technologies which would obsolete RSA altogether.

Very High Security

Let us identify a “very high” level of secret identity data protection for the identity. For a brute-force attack, testing a set of input data could potentially be reduced to the operation of

  1. Creating PRF data for key generation
  2. Extracting prime number candidates for prime number generation
  3. Validating if their product is “near” the n parameter of the known public key
  4. If yes then try to construct prime numbers which generate n.

Let us assume that (4) occurs sufficiently infrequently that its associated time/resources can be ignored. Further assumptions could be:

  • Everyone on the planet is involved in cracking the key, with 7 billion people dedicating their computing resources to this effort (assume the earth population stays constant)
  • Every individual contributes 1000 cores of computing power
  • Specialized hardware is used which is able to perform steps 1-3 at 1GHz frequency, churning through one billion candidates of secret id data per second per core

For our very high level of security we will require that the expected time to identify the key via brute-force with the above available resources must be 100 years, meaning the time to exhaust all possible keys is 200 years.

For a brute-force attack with these resources to have an expected duration of 100 years, the secret id data must have ~4.4e31 possible combinations (7e21 candidates tested per second, sustained for 200 years).
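The arithmetic behind this figure can be reproduced with a short Python sketch. The function name and the use of 365.25 days per year are assumptions of the sketch; the resource figures are those listed above.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length


def keyspace_needed(people: float, cores_each: float,
                    tests_per_core_per_s: float, expected_years: float) -> float:
    """Combinations needed so the expected brute-force time is expected_years.

    The expected time to find the secret is half the time needed to
    exhaust the whole space, hence the factor 2.
    """
    rate = people * cores_each * tests_per_core_per_s  # candidates per second
    return rate * 2 * expected_years * SECONDS_PER_YEAR


very_high = keyspace_needed(7e9, 1000, 1e9, 100)
print("%.1e" % very_high)  # 4.4e+31
```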

We assume the ‘passphrase’ component of the secret id data consists entirely of lower- and upper-case letters A-Z and digits, so there are 62 possibilities for each character. If the passphrase is generated entirely at random, an 18-character passphrase has ~1.8e32 combinations, roughly four times the required number of combinations.

If a hard-to-guess ‘personal’ secret id data component is used, that also contributes to key strength. If the number of permutations contributed by the ‘personal’ component is equivalent to 2-3 characters in the passphrase, equivalent security is provided by a good ‘personal’ phrase plus a 15-16 character passphrase.
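The passphrase lengths quoted here follow from a simple search, sketched below in Python. The function name and the personal_chars parameter (the number of passphrase characters the ‘personal’ phrase is assumed to be worth) are hypothetical.

```python
def min_passphrase_length(required: float, alphabet_size: int = 62,
                          personal_chars: int = 0) -> int:
    """Smallest random passphrase length whose keyspace reaches 'required'.

    personal_chars credits the 'personal' component with the strength of
    that many extra passphrase characters (an assumption, not a measurement).
    """
    length = 0
    while alphabet_size ** (length + personal_chars) < required:
        length += 1
    return length


print(min_passphrase_length(4.4e31))                    # 18
print(min_passphrase_length(4.4e31, personal_chars=2))  # 16
print(min_passphrase_length(4.4e31, personal_chars=3))  # 15
```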

Note

Unless weaknesses are found that can be exploited in the Pseudo-Random Function and key generation (or other RSA weaknesses), with the above simplifying assumptions a passphrase of 15-16 (entirely random) characters plus a reasonably good ‘personal’ phrase should be sufficient to protect against a 100-year planetary effort.

Practical Security

If we relax the security requirements we can get away with a shorter passphrase. Let us replace the ‘very high security level’ assumptions with the following assumptions:

  • Computing resources equivalent to 1 billion people contributing 64 computing cores each (e.g. a dedicated ‘major power’ effort)
  • Each core can process 100 million keys per second
  • Expected time of brute-force attack 10 years for that computing power (so 20 years to exhaust the whole key space)

The number of identity secret data permutations required to withstand this attack is now down to 4.0e27 combinations, which is about 10% of the number of permutations of a random passphrase of 16 characters (~4.8e28 permutations). If the ‘personal’ phrase is good enough to account for 2-3 characters worth of security, then only 13-14 passphrase characters are required for equivalent security.
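These relaxed figures can be checked with a few lines of Python. The variable names and the 365.25-day year are assumptions of this sketch; the resource numbers are those in the list above.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length

# 1 billion people x 64 cores x 100 million candidates per second per core.
rate = 1e9 * 64 * 1e8

# A 10-year expected attack means 20 years to exhaust the whole space.
required = rate * 20 * SECONDS_PER_YEAR
print("%.1e" % required)  # 4.0e+27

# Share of the keyspace of a 16-character random passphrase (62 symbols).
share = required / 62 ** 16
print("%.1f%%" % (share * 100))  # 8.5%
```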

Note

Unless weaknesses are discovered that can be exploited in the Pseudo-Random Function and key generation (or other RSA weaknesses), a passphrase of 13-14 (entirely random) characters plus a reasonably good ‘personal’ phrase should be sufficient to protect against a 10-year dedicated effort of a major power against the secret identity.

Choosing Password Length

The above estimates suggest 13-14 (entirely random) passphrase characters plus a reasonably good personal phrase should be strong enough for almost any practical purpose, and 15-16 characters plus a good personal phrase should be enough even for the ‘very high security’ scenario.

Warning

Of course there are no guarantees; crypto systems that were thought to be strong have later been broken. Technology constantly moves forward with increased computing power and people are already speculating whether quantum computing could “break” RSA.

Implementations

An implementation of Versile Platform must provide a mechanism for generating decentral identities from their associated input data.