upvote
When I came to this idea on my own, I called it "translation at the edge." But for me it was more that just centralizing data validation, it also was about giving you access to all the tools your programming language has for manipulating data.

My main example was working with a co-worker whose application used a number of timestamps. They were passing them around as strings and parsing and doing math with them at the point of usage. But, by parsing the inputs into the language's timestamp representation, their internal interfaces were much cleaner and their purpose was much more obvious since that math could be exposed at the invocation and not the function logic, and thus necessarily, through complex function names.

reply
I disagree. I think the key insight is to carry the proof with you in the structure of the type you 'parse' into.
reply
Could you clarify what you mean by "carry the proof"?
reply
Let's say you have the example from the article of wanting a non-empty list, but you don't use the NonEmpty type and instead are just using an ordinary list. As functions get called that require the NonEmpty property, they either have to trust that the data was validated earlier or perform the validation themselves. The data and its type carry no proof that it is, in fact, non-empty.

If you instead parse the data (which includes a validation step) and produce a Maybe NonEmpty, if the result is a Just NonEmpty (vs Nothing) you can pass around the NonEmpty result to all the calls and no more validation ever needs to occur in the code from that point on, and you obviously reject it rather than continue if the result is Nothing. Once you have a NonEmpty result, you have a proof (the type itself) that is carried with it in the rest of the program.

reply
Typed functional programming has the perspective that types are like propositions and their values are proofs of that proposition. For example, the product type A * B encodes logical conjunction, and having a pair with its first element of type A and its second element of type B "proves" the type signature A * B. Similarly, the NonEmpty type encodes the property that at least one element exists. This way, the program is "correct by construction."

This types-are-propositions persoective is called the Curry-Howard correspondence, and it relates to constructive mathematics (wherein all proofs must provide an algorithm for finding a "witness" object satisfying the desired property).

reply
From the article:

    validateNonEmpty :: [a] -> IO ()
    validateNonEmpty (_:_) = pure ()
    validateNonEmpty [] = throwIO $ userError "list cannot be empty"
    
    parseNonEmpty :: [a] -> IO (NonEmpty a)
    parseNonEmpty (x:xs) = pure (x:|xs)
    parseNonEmpty [] = throwIO $ userError "list cannot be empty"
Both consolidate all the invariants about your data; in this example there is only one invariant but I think you can get the point. The key difference between the "validate" and "parse" versions is that the structure of `NonEmpty` carries the proof that the list is not empty. Unlike the ordinary linked list, by definition you cannot have a nil value in a `NonEmpty` and you can know this statically anywhere further down the call stack.
reply
I think that's an excellent way to build a defensive parsing system but... I still want to build that and then put a validator in front of it to run a lot of the common checks and make sure we can populate easy to understand (and voluminus) errors to the user/service/whatever. There is very little as miserable as loading a 20k CSV file into a system and receiving "Invalid value for name on line 3" knowing that there are likely a plethora of other issues that you'll need to discover one by one.
reply