Inside the magic AI box is literally nothing but this loop:

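    int n_tokens = 0;
    while (n_tokens < TOKENS_MAX) {
        // Run the model over the context so far and pick the next token.
        int next_token = decode(context, ++position);
        // Turn the token id back into text and emit it right away.
        print(token_to_text(next_token));
        ++n_tokens;
    }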
If you don't believe me then just download llama.cpp and see for yourself.
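
For anyone who wants something that compiles, here is a toy sketch of the same loop in plain C. To be clear, this is not llama.cpp code: `VOCAB`, `decode` and `generate` are all made up for illustration, and the "model" is just a rule that returns the token after the most recent one. The only point is the shape of the loop: predict one token, append it to the context, repeat.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define VOCAB_SIZE 4

/* Hypothetical toy vocabulary; a real model has tens of thousands of entries. */
static const char *VOCAB[VOCAB_SIZE] = {"the ", "cat ", "sat ", "down "};

/* Stand-in for the model's forward pass plus greedy sampling.
   A real decode() would run the network over the context; this toy
   simply returns the token id after the most recent one. */
static int decode(const int *context, int n_context) {
    assert(n_context > 0);
    return (context[n_context - 1] + 1) % VOCAB_SIZE;
}

/* Generate up to n_new tokens after the prompt already in context.
   Each new token is appended to context so it feeds the next step,
   and its text is appended to out. Returns the new context length. */
static int generate(int *context, int n_context, int capacity, int n_new,
                    char *out, size_t out_size) {
    assert(out_size > 0);
    out[0] = '\0';
    for (int i = 0; i < n_new && n_context < capacity; ++i) {
        int next_token = decode(context, n_context);
        context[n_context++] = next_token; /* feed the output back in */
        strncat(out, VOCAB[next_token], out_size - strlen(out) - 1);
    }
    return n_context;
}
```

Calling `generate` with a one-token prompt `{0}` and `n_new = 3` fills `out` with "cat sat down ". Swap the toy `decode` for a real forward pass and a sampler and you have the loop above.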