Sure, but understanding the core concepts is essential to making things efficient, and as far as I understand, this is mainly for educational purposes (it does not even run on a GPU).
reply
yep, agreed. wasn’t knocking the project at all, it’s great for exactly that purpose
reply
I think the hard part is improving on the basic concept.

The current top-of-the-line models are extremely overfitted and produce so much nonsense that they are useless for anything but the simplest tasks.

This architecture was an interesting experiment, but it is not the future.

reply