## Introduction
Add data validation and data versioning to [models](https://github.com/eblanshey/safe-api/issues/2), both of which are critical for the SAFE network. The descriptions below are copied from the proposal:
- **Data validation:** every app has set rules on what its data should look like. If an object contains a `name` field, for example, it expects it to be a string of a certain length, and NOT an integer or array. In SAFE, anybody can upload any data they want, so client-side data validation must be included in the app to prevent breakage. Task: create an easy way for the dev to set rules for their data. If rules are not met, then either use some defaults, or remove the object entirely. The dev will have a guarantee that the object returned from the API will follow the application's rules, even if a nefarious user tries to cause trouble. Allow devs to attach error messages for each validation rule so they could be used to quickly validate forms, too.
- **Data versioning:** Object schemas change over time: a `string` field may eventually become an `array` field to allow multiple options. The developer cannot go and change all existing data to reflect this change, so his app needs to be smart and maintain backwards compatibility for every single object schema change. Over time, this can lead to massive headaches and spaghetti code in order to make sure old data is supported. Task: introduce versioning for models. Every object PUT to the network should have an integer version number. Upon fetching it, if the model detects that the version number is not the latest one, it incrementally modifies its data using the model's set rules until it matches the rules of the latest version. This should make app development much easier knowing that even old data on the network will be returned using the latest schemas. This allows newer code to read older data, but not vice versa. Bonus: if the requester of the data is also the data owner, automatically UPDATE the object on the network (if possible) so that the same version processing won't be required in the future for all requests. With Structured Data this will be free.
## Validation
Fields validated will have rules set like this:
``` javascript
// Define a model that has 5 properties: title, myNumber, date, myName, and hello.
Model.validation = {
  0: { // version of the rules (explained later)
    title: { // property name
      rules: Rules.required().string().max(100), // chainable rules
      default: "Default Title" // if validation fails, return this value
    },
    myNumber: {
      rules: Rules.integer().min(0),
      default: undefined // remove this property if validation fails
    },
    date: {
      rules: Rules.date().dateBetween('2015-01-01', '2015-12-31'), // as parsable by MomentJS
      default: Remove // special object for critical props: if validation fails, the whole entity is ignored
    },
    myName: {
      rules: Rules.custom(function(entity) {
        if (!entity.myName.includes('Hi, my name is')) {
          return false; // fails
        } else {
          return true; // passes
        }
      }),
      default: 'Hi, my name is John Doe'
    },
    hello: {} // no validation at all, but the key must be listed or the field is discarded
  }
}
```
Each property in the object being retrieved or put to the network must pass the validation rules provided. Each property's entry in the validation object contains `rules` and `default`, both optional.
A `Rules` object is supplied to `rules` that contains many chainable validation methods, including `custom()` allowing you to supply your own validation function that is passed an entity object. It should return a boolean indicating passing or failure. If it returns a function, then the return value of that function will be used as the new value. No validation rules are run if `rules` is not supplied. Rules are run in order of appearance.
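To make the chaining idea concrete, here is a minimal sketch of how such a rule builder could work. The `RuleChain` class and its `passes` method are hypothetical names used only for illustration, not the library's actual implementation. Only `string`, `max`, and `custom` are shown, and the case where a custom rule returns a function is not handled.

```javascript
// Hypothetical sketch: each chainable method records a check and returns `this`.
class RuleChain {
  constructor() {
    this.checks = []
  }
  string() {
    this.checks.push((value) => typeof value === 'string')
    return this
  }
  max(limit) {
    // For strings, compare the length; otherwise compare the value itself.
    this.checks.push((value) =>
      typeof value === 'string' ? value.length <= limit : value <= limit)
    return this
  }
  custom(fn) {
    // Custom rules receive the whole entity, as described above.
    this.checks.push((value, entity) => fn(entity))
    return this
  }
  // Runs the checks in order of appearance and stops at the first failure.
  passes(value, entity) {
    return this.checks.every((check) => check(value, entity) === true)
  }
}

// Stand-in for the `Rules` entry point: every call starts a fresh chain.
const Rules = {
  string: () => new RuleChain().string(),
  custom: (fn) => new RuleChain().custom(fn),
}

const titleRules = Rules.string().max(100)
titleRules.passes('Hello', {}) // true
titleRules.passes(42, {})      // false
```

Returning `this` from every method is what allows calls like `Rules.string().max(100)` to be written on one line.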
If data is being `PUT` to the network and validation fails, it will throw an error. The `required()` rule only applies before `PUTs`.
If data is being retrieved, the `default` value will be used in place of fields that failed validation. If `default` is not set, it defaults to `undefined`. If `default` is `undefined`, the field will be removed from the entity. If `default` is a function, it will be called with the `entity` object, allowing you to create a default value based on other fields. If `default` is an instance of `Remove`, the entire entity will be marked `exists = false` and its data discarded, as if it didn't exist on the network. If the field does not exist on the object at all, it will be set to the default value.
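The fallback behaviour on retrieval can be sketched roughly as follows. The `applyDefault` helper is a hypothetical name, and `Remove` is modelled here as a plain sentinel object compared by identity, which is an assumption about how the real marker would work.

```javascript
// Assumption: `Remove` is a unique sentinel object.
const Remove = Object.freeze({})

// Applies a field's `default` after that field failed validation.
// Returns false if the whole entity should be treated as nonexistent.
function applyDefault(entity, field, spec) {
  let fallback = spec.default
  if (fallback === Remove) {
    return false // discard the entire entity
  }
  if (typeof fallback === 'function') {
    fallback = fallback(entity) // derive the default from other fields
  }
  if (fallback === undefined) {
    delete entity[field] // no default: drop the field
  } else {
    entity[field] = fallback
  }
  return true
}

const entity = { title: 42, myNumber: -1 }
applyDefault(entity, 'title', { default: 'Default Title' })
applyDefault(entity, 'myNumber', {})
// entity is now { title: 'Default Title' }
```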
Example validation rules:
- Primitives: `string`, `number`, `array`, `boolean`, etc
- Numbers: `max`, `min`, `between`
- Strings: `max`, `min`, `size`, `alpha`, `alphanum`, `base64`, etc
- Date: `date`, `dateBetween`, `dateFormat`
- Field modifiers: `trim` (strings)
If a field is present in the entity that has no matching key in the validation object, it is discarded. Thus, every field in an entity _must_ be defined in the validator.
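As a rough illustration of the discard rule, the sketch below keeps only fields that have a key in the validation object. The `stripUnknownFields` helper is hypothetical. How the `version` property would be exempted is not specified here, so the sketch does not treat it specially.

```javascript
// Hypothetical helper: keep only fields that have a key in the validation object.
function stripUnknownFields(entity, validation) {
  const cleaned = {}
  for (const field of Object.keys(entity)) {
    if (Object.prototype.hasOwnProperty.call(validation, field)) {
      cleaned[field] = entity[field]
    }
  }
  return cleaned
}

const validation = { title: {}, hello: {} }
const cleaned = stripUnknownFields({ title: 'Hi', hello: 1, extra: 'x' }, validation)
// cleaned is { title: 'Hi', hello: 1 }
```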
## Versioning
When defining models, add a static `currentVersion` property that starts at `0`. Every time any of the above rules are modified in a non-backwards-compatible way, such as a `string` becoming an `array`, increment the static version by one and add a transformer function that transforms the entity object from the previous rule-set.
Every entity **must** have a property `version` set that is an integer value, indicating the version of the validation used when creating or last updating the object. If not set on the entity, it is discarded (discuss--or should it assume the latest version?)
When changing an existing rule, the process is as follows:
1. Increment the model version by 1.
2. Add a new validation object with the key set as the new model version.
3. Add a transformer that takes an entity using the previous ruleset and returns an updated entity.
Take the following Post model as an example:
``` javascript
class Post extends Model {}
Post.name = 'Post'
Post.currentVersion = 0
Post.validation = {
  0: {
    title: {
      rules: Rules.required().string()
    },
    category: {
      rules: Rules.required().string() // e.g. 'art'
    }
  }
}
```
We now have a Post model whose current version is 0. The current validation rules define `title` and `category` fields, which are both strings. Now what if we wanted to make `category` an array instead of a string to support multiple categories? We can't just change the rules, since all the existing data on the network has a string and would therefore fail validation. We need to define a transformer along with the new rule:
``` javascript
Post.currentVersion = 1 // increment the current model version
Post.validation = {
  0: {
    title: {
      rules: Rules.string()
    },
    category: {
      rules: Rules.string(), // e.g. 'art'
      default: 'Misc'
    }
  },
  1: { // add a new validation rule that overrides the previous one
    category: {
      rules: Rules.array(),
      default: ['Misc']
    }
  }
}
Post.transformers = {
  // Takes entity with version 0 and turns it into version 1.
  // By the time this function is called, validation 0 has already been called.
  // Thus you are guaranteed to work with valid data.
  1: function(entity) {
    // turn category into an array
    entity.category = [entity.category]
    return entity // transformers return the updated entity
  }
}
```
If a post is retrieved from the network with a `version` set to 0, meaning it was uploaded when version 1 didn't exist, the following will happen:
1. Run entity through validation for version 0.
2. Transform the resulting entity to version 1.
3. Run validation for version 1.
4. If the entity owner is the currently authenticated user, PUT the latest object data to the network to prevent running transformers again.
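The first three steps generalize to any number of versions and could be sketched as a loop like the one below. The `upgrade` function and the `validate(entity, version)` helper it receives are hypothetical names, and the model is a plain object standing in for a real `Model` subclass. Step 4, writing the result back to the network, is left out.

```javascript
// Hypothetical sketch of the upgrade loop. `validate(entity, version)` is an
// assumed helper that returns the entity after applying that version's rules.
function upgrade(entity, model, validate) {
  let current = validate(entity, entity.version)
  while (current.version < model.currentVersion) {
    const next = current.version + 1
    current = model.transformers[next](current) // version N -> N + 1
    current.version = next
    current = validate(current, next)
  }
  return current
}

// Usage with the category example and a pass-through validator.
const PostSketch = {
  currentVersion: 1,
  transformers: {
    1: (entity) => ({ ...entity, category: [entity.category] }),
  },
}
const upgraded = upgrade(
  { version: 0, title: 'Hi', category: 'art' },
  PostSketch,
  (entity) => entity
)
// upgraded is { version: 1, title: 'Hi', category: ['art'] }
```

Because each transformer only needs to know about the version immediately before it, an entity stored at version 0 can be brought up to any later version by applying the transformers one after another.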
## Future Considerations
- Since the dev has already spent time creating validation rules, it makes sense to allow him to use them to validate user input from a form. How should this be taken care of? As an example, every field in validation could have a string "inputError" that contains an error message if validation didn't pass, and an additional rules object could be provided that only runs on input and could have `required()`, `isNumeric()`, etc. I am thinking of delaying this part as it's not technically related to SAFE.
Have any comments or questions? Please share!