So, UUIDv4 as a PK on a clustered index can be perfectly feasible for cases where you want to avoid disclosing stuff and row insertion performance isn’t that important.
For a single database, bigints are smaller and faster, with less footguns.
UUIDs can be nice for an opaque public ID, however I'd still prefer something like a Sqid for space and usability.
But be careful!! Javascript WILL interpret your bigints as Number() and round them down because they are too big without telling you!!!
Famously seen by every snowflake user that has interacted with Javascript, quite an annoying problem.
Then it will always be a string and you will be free to change the format/type of the key in the future to UUID or whatever you like.
const json = '{ "a": 9007199254740993 }'
JSON.parse(json, (_key, value, context) => /^\d+$/.test(context.source) ? BigInt(context.source) : value)In JS - BigInt is 64bit integer.
In anything else - BigInt is a arbitrarily large integer.
const obj = { a: 9007199254740993n }
JSON.stringify(obj, (_key, value) => typeof value === 'bigint' ? JSON.rawJSON(value.toString()) : value) {"a":9007199254740993}
not {"a":"9007199254740993"} const json = '{ "a": 9007199254740993 }'
JSON.parse(json, (_key, value, context) => /^\d+$/.test(context.source) ? BigInt(context.source) : value)Node.js drivers will correctly read int64 as string or bigint, not number.
E.g. pg for PostgreSQL
Maybe there’s a buggy driver but I don’t know it.
You can of course, change the api such that it does {"id": "1324535222364012585"} instead and voila, it will no longer try parsing it as number. Or the many other workarounds people have recommended above (like appending a prefix, or using a different encoding), but why is it trying to parse a number thats too big and instead of throwing it just rounds down without telling you????!
It does not actually make it impossible to query the wrong table it just tells you quickly when you’ve done so.
All sequences use step 16.
Type A has discriminant/offset 0, yielding IDs {0, 16, 32, 48, 64, …}.
Type B has discriminant/offset 1, mapping to IDs {1, 17, 33, 49, 65, …}.
All the way up to Type P with discriminant/offset 15 and IDs {15, 31, 47, 63, 79, …}.
This is also trivially invertible so that you can determine the type from the ID.
A more common approach is to make IDs opaque strings and put a type prefix—A0, B12, P34, that kind of thing. But this way you can keep it as a number, if you wish.
Inappropriate aliasing of integer keys allows for silent errors in queries because it will actually return some result a lot of the time. A UUID is immune to this problem. The model recognizes its mistake a lot more reliably when previously non-empty tables start showing up empty after attempting a join.
I read this post more as an illustration of the *value* of UUIDv7 as primary key, over integer primary keys, in lieu of minimal loss of read/write performance, and marginally more data on disk bloat.
SQLite's automatic integer rowID primary key is a no-brainer, when the SQLite application is local-only, such as application storage format (mobile and desktop). Or is never intended to grow beyond a single server instance. Basically, where each SQLite file is private to a singular instance of the application.
However, if there is even an outside chance of needing to cooperate across application instances, e.g. the minimal limit case of a personal knowledge base that should seamlessly sync across a person's devices, as well as a hosted service, then a high-quality sequential random ID starts to make a lot more sense. (No-brainer arbitrary table merges / splits / remerges, de-duplication, etc.)
Random ID primary key is a bad idea period, whether it be the UU kind or the SQ kind, or any other kind. As far as my DB knowledge goes, this class of ID destroys all tree-algorithms, and we are stuck with the fact that there is no practically better way, than an appropriate tree-structure, to group and organise a meaningful amount of data, efficiently and effectively.
Thanks to its oh so convenient automatic integer rowIDs, I believe one can amortise some of the other overheads of UUIDv7s for "in-between" queries, viz. indices, joins, ctes, virtual tables etc., with appropriate schema / query design.
UUID v7 so far seems like the best solution if you want UUID benefits and ordering.
Why would you force database to order rows on the drive according to random id?
I get saving 8 bytes per row seems attractive, but the tradeoff is not explained.
The Integer id is used for joins and looks ups and such but that's it. If I need to send anything to the frontend or outside of the app/DB then that's the UUID.
How much trouble does SQLite reysing rowid's actually cause?
Regular rowids are definitely the way to go if you can use them.
Why would you store it as as str column and not the inbuilt type for this?
https://www.postgresql.org/docs/current/datatype-uuid.html
If you are using SQLite well I guess that doesn't work.
I disagree. I tried this once. Now you need a client access layer to touch the DB in any context. All your console tools no longer work well or at all. If they show up in URLs you need to deoptimize them for transport.
You give up a lot of convenience for this optimization. You should be absolutely sure your design requires it before using it.
Contention and coordination are real killers, concurrent writes (that require coordination like postgres) often underdeliver.
(defonce db
(d/init-db! "db/db.db"
{:pool-size 4 :pragma {:synchronous "FULL"}}))
That's writing to disk.There's only one index so there's no real write amplification. The numbers will go down as you add more data and indexes.
Also INSERT speed instead of SELECT? Typically most time is spend in SELECT or UPDATE.