I was quite impressed by the impact of including the ORDER BY column in your index (when the query has a LIMIT) in #postgresql
> An important special case is ORDER BY in combination with LIMIT n: an explicit sort will have to process all the data to identify the first n rows, but if there is an index matching the ORDER BY, the first n rows can be retrieved directly, without scanning the remainder at all.
https://www.postgresql.org/docs/current/indexes-ordering.html
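A minimal sketch of the effect described in that quote. It uses SQLite from Python's standard library as a stand-in, only because it runs without a server; PostgreSQL's planner and its EXPLAIN output are different. The table `events`, the column `created_at`, and the index name are made up for illustration.

```python
import sqlite3


def plan(conn, sql):
    """Return the 'detail' strings from SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, created_at INTEGER, payload TEXT)"
)
# Hypothetical data: 10,000 rows with distinct, scrambled created_at values.
conn.executemany(
    "INSERT INTO events (created_at, payload) VALUES (?, ?)",
    [((i * 7919) % 10007, "x") for i in range(10000)],
)

query = "SELECT id, created_at FROM events ORDER BY created_at LIMIT 10"

# Without a matching index, the plan includes an explicit sort step.
plan_without_index = plan(conn, query)

# With an index matching the ORDER BY, rows can be read in index order
# and the scan can stop once the LIMIT is satisfied.
conn.execute("CREATE INDEX events_created_at_idx ON events (created_at)")
plan_with_index = plan(conn, query)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

On PostgreSQL, the equivalent check would be running `EXPLAIN` on the query before and after `CREATE INDEX` and looking for whether a Sort node appears in the plan.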
@benoit this reminds me of a fun mongodb "feature". When you had both a sort and a limit, it would not return correct results. We had to ask the db for the whole sorted result set, and then only keep the top ones.
@clementd oh no.
YOU HAD ONE JOB!
@benoit Data ordering is underappreciated in general because so many applications/domains get it "for free". Data arrives in an order that is very close to how it is consumed.
You really notice this when your distributed database destroys this order during a shuffle. Suddenly your table is 2x larger on disk! But order preservation is actually really hard and opens you up to terrible skew problems.
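A toy sketch of why losing order can inflate size on disk, assuming the growth comes from compression working less well on unordered data (the post does not say which mechanism is involved). The column of sequential integers is made up, and `zlib` over text lines is only a stand-in for whatever encoding a real database uses, so the exact ratio here means nothing.

```python
import random
import zlib

# Hypothetical column: values that arrive in order, e.g. increasing IDs.
values = list(range(100_000))

# The same values after a "shuffle" destroys the arrival order.
shuffled = values[:]
random.Random(0).shuffle(shuffled)


def compressed_size(column):
    """Serialize the column as text lines and return its zlib-compressed size."""
    raw = "\n".join(map(str, column)).encode()
    return len(zlib.compress(raw))


sorted_size = compressed_size(values)
shuffled_size = compressed_size(shuffled)

print("ordered: ", sorted_size, "bytes")
print("shuffled:", shuffled_size, "bytes")
```

Both inputs contain exactly the same values; only the order differs, and the ordered version compresses to fewer bytes.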
@joeharris76 Appreciate your comment :)