upvote
I'm not sure I follow. Fetching the data in one query should almost always be faster and more efficient - the same work is being done regardless, except now you have the additional overhead involved with a second query (network and connection overhead, parsing etc.)
reply
Depends on how the database engine sets up the join. I've seen queries where the join ends up spilling into a temp table and it takes a lot longer to do it that way. IIRC, this can happen especially if you've asked the database engine to sort things.

Same with UNION vs IN. If you union 10 queries for one row each by id, it'll hit the index every time; but if you do it with IN, maybe it decides to do an index scan which takes longer.

You could say well the database engine is broken if client-side join works better than database-side join, and sure it probably is, but given the choice of fix the database engine or do a client-side join, I know which one is feasible in the short term.

reply
In that case, I'd suggest using EXPLAIN and CREATE INDEX on whatever requires full table scans.

Worst case, you could CREATE MATERIALIZED VIEW whatever would've happened in that temp table.

At the end of the day, your single query will be faster.

reply
> database engine sets up the join

Every major database uses nested loop, hash, or sort-merge joins. If the data is small enough to fetch in a second query, I don't see any scenario where it would make a query spill to disk when it otherwise wouldn't have (except those pesky OR's).

Most JOIN issues can be resolved by composing using sub-queries/CTE's to defer a join to a smaller intermediary result, and the only time this generally can't be done is if you need a predicate on the joined table.

> you've asked the database engine to sort things

I've only found this to be true when there's an OR condition where the single query ends up doing a bitmap against every row, in which case a UNION will be faster since it's just appending two already-index-ordered streams.

> Same with UNION vs IN

A union is almost always going to be worse - most query planners cannot optimize across queries in a UNION, so each query is going to have separate ops vs. a single op with the IN. The only case I've seen this be true is for multi-column predicates with an OR clause against a composite index. I just tested the former in both Postgres and MSSQL and the UNION query cost was 2x the IN clause.

> maybe it decides to do an index scan which takes longer

This has not been my experience unless the table statistics are bad.

reply