Hi, I work on Lakebase (but not on storage). Here's how I understand it.

For Lakebase and Neon, our architecture needs the caching layer regardless (what we call Pageservers). Performing reads from S3 directly is too slow, so we reconstruct pages and keep them on an NVMe server for faster querying. Changing the format on S3 to Parquet effectively introduces no additional copies over our existing architecture.

by dsauerbrun 4 hours ago

Hmm, if the caching layer doesn't change (I assume it was optimized for OLAP-style queries), and the new Parquet format is better for OLAP... I'm still not understanding how it performs well for OLTP reads.

I'll give the article another read... Maybe I missed something. Thank you for the response! Really nice to be able to get info straight from people who work on the product.

by nikita 4 hours ago

Recent data plus the working set is always in Postgres page format.

Historical data, when pushed to S3, is in Parquet. This happens asynchronously, not on the transaction hot path.

So older data below a certain LSN is on S3 in Parquet, available to all analytics processing. Hot data is on the Pageservers in page format for OLTP.

You can be smart about querying both representations for real-time analytical queries.
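
To make the "query both representations" idea concrete, here is a rough sketch of how a reader might combine the two tiers. It is a toy model under stated assumptions, not how Lakebase or Neon actually implements it: `RowVersion`, `merged_sum`, and `flush_lsn` are hypothetical names, and plain Python lists stand in for the Parquet files on S3 and the page-format data on the Pageservers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowVersion:
    key: int     # primary key of the row
    lsn: int     # log sequence number at which this version was written
    amount: int  # the column being aggregated


def merged_sum(cold_rows, hot_rows, flush_lsn):
    """Sum `amount` over the newest version of each key across both tiers.

    Assumption: everything at or below `flush_lsn` has already been
    exported to the cold (columnar) tier; anything newer lives only in
    the hot (page-format) tier.
    """
    latest = {}
    # Cold tier: historical versions at or below the flush boundary.
    for row in cold_rows:
        if row.lsn <= flush_lsn:
            current = latest.get(row.key)
            if current is None or row.lsn > current.lsn:
                latest[row.key] = row
    # Hot tier: recent versions above the boundary override older ones.
    for row in hot_rows:
        if row.lsn > flush_lsn:
            current = latest.get(row.key)
            if current is None or row.lsn > current.lsn:
                latest[row.key] = row
    return sum(row.amount for row in latest.values())


cold = [RowVersion(1, 10, 100), RowVersion(2, 20, 50)]
hot = [RowVersion(2, 35, 70), RowVersion(3, 40, 5)]
print(merged_sum(cold, hot, flush_lsn=30))  # prints 175
```

Key 2 was updated after the flush boundary, so the hot version (70) replaces the cold one (50), and key 3 exists only in the hot tier. The analytical result still reflects the latest writes even though most of the data was read from the cold copy.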

by viccis 3 hours ago

From what I have seen, it's basically a Lambda architecture.