
For asset managers, a handful of analytical tables sit at the center of the operating model: daily fund results, composite returns, NAV history, performance attribution facts, and other datasets that support reporting and decision-making across the firm.
These are the datasets behind morning checks, monthly fact sheets, quarterly reporting, compliance reviews, institutional client requests, and analyst research. They are also some of the most expensive datasets to operate.
The reason is straightforward: they grow every business day, across every fund, portfolio, share class, vehicle, and return period. At the same time, more teams depend on them as reporting, analytics, AI, and self-service use cases expand.
In a modern Lakehouse environment, this growth is expected. The challenge is that cost often scales with data volume and consumption - not necessarily with business value.
That was the challenge we evaluated with a leading global asset manager. Their data services team had built a modern and well-architected Lakehouse: open formats on S3, Databricks for processing, and governed self-service access for downstream teams. The platform worked. The issue was not architecture. The issue was cost trajectory.
As fund analytics workloads grew, the firm needed a way to reduce compute cost and improve performance without disrupting existing pipelines, dashboards, or analyst workflows.
Qbeast was evaluated for exactly that purpose.
The evaluation focused on five representative queries from the asset manager's real analytical workload.
These were not synthetic benchmarks. They were production-shaped SQL queries reflecting how analysts, reporting jobs, and downstream systems actually interact with fund-performance data.
The common thread: highly selective filters on portfolio identifiers, vehicle classes, share classes, and date ranges, joined to small reference tables, often grouped by month or by share class for the final aggregate. This is what the daily life of a fund-analytics Lakehouse looks like.
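As a rough illustration of that query shape, here is a minimal sketch that runs against an in-memory SQLite database rather than Databricks. The table names (`fund_daily_results`, `share_class_ref`), the columns, and the sample rows are all hypothetical and are not taken from the evaluated workload; only the shape of the query (selective filters, a small reference join, a monthly aggregate) mirrors the description above.

```python
import sqlite3

# Hypothetical schema: one large fact table plus a small reference table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fund_daily_results (
    portfolio_id TEXT, share_class TEXT, vehicle TEXT,
    as_of_date TEXT, daily_return REAL
);
CREATE TABLE share_class_ref (share_class TEXT PRIMARY KEY, description TEXT);
""")
conn.executemany(
    "INSERT INTO share_class_ref VALUES (?, ?)",
    [("A", "Retail"), ("I", "Institutional")],
)
# Made-up sample rows; several are there only to be filtered out.
conn.executemany(
    "INSERT INTO fund_daily_results VALUES (?, ?, ?, ?, ?)",
    [
        ("P001", "A", "MUTUAL_FUND", "2024-01-02", 0.010),
        ("P001", "A", "MUTUAL_FUND", "2024-01-03", 0.020),
        ("P001", "I", "MUTUAL_FUND", "2024-02-01", 0.030),
        ("P002", "A", "MUTUAL_FUND", "2024-01-02", 0.500),  # other portfolio
        ("P001", "A", "SMA", "2024-01-02", 0.900),          # other vehicle
        ("P001", "A", "MUTUAL_FUND", "2023-12-29", 0.700),  # outside date range
    ],
)

# Selective filters on portfolio, vehicle, and date range; a join to a
# small reference table; a final aggregate grouped by month and share class.
QUERY = """
SELECT strftime('%Y-%m', f.as_of_date) AS month,
       r.description                   AS share_class,
       COUNT(*)                        AS n_days,
       AVG(f.daily_return)             AS avg_daily_return
FROM fund_daily_results AS f
JOIN share_class_ref AS r ON r.share_class = f.share_class
WHERE f.portfolio_id = ?
  AND f.vehicle = ?
  AND f.as_of_date BETWEEN ? AND ?
GROUP BY month, r.description
ORDER BY month, r.description
"""
rows = conn.execute(
    QUERY, ("P001", "MUTUAL_FUND", "2024-01-01", "2024-03-31")
).fetchall()
for row in rows:
    print(row)
```

Only three of the six sample rows survive the filters, which is the point: on a real fact table, the fraction of rows a query like this needs is tiny compared with what the engine may have to read to find them.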
We compared the existing Delta tables against Qbeast-indexed equivalents on the same Databricks Photon runtime, with the Spark environment restarted between runs to eliminate caching effects. Tables, queries, pipelines, and dashboards were all unchanged. The only difference was the layout underneath.
The largest query produced the most significant result.
On the 22-year, 25-billion-row fact table, Qbeast reduced data scanned from 82 GB to 877 MB - a 99% reduction.
That result matters because it shows what is possible on the types of large, frequently queried tables that drive fund analytics costs.
But the impact was not limited to the largest query. Across the smaller fund daily results workloads, the pattern remained consistent: significantly less data scanned, lower executor runtime, and improved cost efficiency.
A note on Query 4: it read slightly more bytes than the baseline because the original query was already extremely selective, scanning only 4.1 MB. Executor runtime still dropped by 71%, showing that better-organized data has value beyond lower scan volume.
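For readers who want to check the headline figure for the largest query, the arithmetic can be sketched as follows. The only assumption is that GB and MB are decimal units (10^9 and 10^6 bytes); the rounded result is the same with binary units.

```python
# Scan volumes reported for the largest query, assuming decimal units.
GB = 10**9
MB = 10**6

def reduction_pct(before_bytes: float, after_bytes: float) -> float:
    """Percentage reduction going from before_bytes to after_bytes."""
    return 100.0 * (1.0 - after_bytes / before_bytes)

before = 82 * GB
after = 877 * MB

scan_reduction = reduction_pct(before, after)
scan_ratio = before / after  # how many times less data is read

print(f"scan reduction: {scan_reduction:.1f}%")
print(f"data read is about {scan_ratio:.1f}x smaller")
```

The reduction comes out just under 99%, which is where the rounded figure comes from; put another way, the indexed table read roughly one ninety-fourth of the bytes.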
The immediate benefit is lower compute cost. But the broader value is strategic.
When core analytics queries become faster and cheaper, the operating model changes. Teams can ask more questions, run deeper analysis, and support more downstream use cases without requiring a proportional increase in infrastructure spend.
Asset management analytics has a distinctive query pattern, and that pattern is well suited to multi-dimensional indexing.
An analyst rarely filters on only one field.
A typical query may ask for a specific portfolio, share class, vehicle, fee treatment, and date range. Traditional partitioning often optimizes for one primary column. Sorting and clustering can improve access along a chosen order, but filters outside that order may still require unnecessary scanning.
Qbeast organizes data across multiple dimensions, allowing the query planner to prune more effectively across combinations of filters.
That is particularly valuable for fund analytics, where different teams use the same tables in different ways.
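The pruning argument can be made concrete with a toy model. This is a deliberately simplified sketch and not Qbeast's actual indexing algorithm: it assumes a uniform grid of 100 portfolios by 100 days, files of 100 rows each, and an engine that skips a file whenever its per-column min/max statistics do not overlap the filter. It compares a layout split on one column (one file per day) with a layout split on both columns (10 x 10 blocks). All names are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Toy dataset: one row per (portfolio_id, day), 100 x 100 = 10,000 rows.
ROWS = list(product(range(100), range(100)))

def build_layout(rows, assign):
    """Group rows into files; keep (min, max) per column for each file."""
    files = defaultdict(list)
    for row in rows:
        files[assign(row)].append(row)
    stats = []
    for members in files.values():
        portfolios = [p for p, _ in members]
        days = [d for _, d in members]
        stats.append(((min(portfolios), max(portfolios)),
                      (min(days), max(days))))
    return stats

def overlaps(a, b):
    """True if closed intervals a and b intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

def files_scanned(stats, portfolio_range, day_range):
    """Count files that min/max pruning cannot skip for this filter."""
    return sum(
        1 for p_stat, d_stat in stats
        if overlaps(p_stat, portfolio_range) and overlaps(d_stat, day_range)
    )

# Layout 1: one file per day (split on a single column).
by_day = build_layout(ROWS, lambda r: r[1])
# Layout 2: 10 x 10 blocks (split on both columns).
grid = build_layout(ROWS, lambda r: (r[0] // 10, r[1] // 10))

ANY = (0, 99)
queries = {
    "one portfolio, ten days": ((42, 42), (30, 39)),
    "one portfolio, all days": ((42, 42), ANY),
    "all portfolios, one day": (ANY, (5, 5)),
}
for name, (p_rng, d_rng) in queries.items():
    print(name, files_scanned(by_day, p_rng, d_rng),
          files_scanned(grid, p_rng, d_rng))
```

In this toy setup, both layouts hold the same 100 files. The single-column layout wins only when the filter is on its chosen column alone; once the portfolio filter is involved, the two-column layout touches far fewer files. That tradeoff across different filter combinations is the behavior described above, shown here in miniature.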
Performance reporting, compliance, fact sheet generation, exploratory analysis, and AI workflows may all depend on the same fund-result tables.
Optimizing for only one query shape can create tradeoffs elsewhere. A sustainable layout strategy needs to work across the broader filter space, not just for one reporting workload.
Qbeast is designed for that type of multi-dimensional access pattern.
One of the most important aspects of this evaluation was what did not change.

Qbeast operates beneath the table, preserving the existing Lakehouse architecture while improving how data is physically organized.
For a governed self-service environment, that matters. The data services team can improve performance and cost efficiency centrally, while downstream consumers continue using the tools and queries they already know.
Because Qbeast operates beneath the table (the format stays open, the queries don't change, the pipelines keep running), the gains accrue to every downstream team without coordination. There's no "migrate your dashboard to the new system" conversation. There's no new SDK. The team that tunes the table and the teams that consume it can stay decoupled, which is the whole point of governed self-service.
The most important takeaway is not that one query became faster.
The important takeaway is that the cost profile of a critical fund analytics workload changed.
For the largest query, data scanned fell by 99% and executor runtime dropped by 88%. Across representative daily fund-result queries, executor runtime fell by 50% to 71%.
That level of improvement does more than reduce the current Databricks bill. It creates headroom.
And it does so without forcing the firm into an architectural migration.
For an asset manager that has already invested in a modern Lakehouse, this is an important distinction. The architecture was already sound. What changed was the physical layout layer underneath it.
That is why the evaluation was so compelling: it showed that meaningful cost reduction and performance improvement can be achieved without disrupting the systems, workflows, and governance model already in place.
Fund analytics workloads are only becoming larger, more complex, and more widely consumed. For asset managers, the question is not whether these datasets will grow. They will.
The question is whether infrastructure cost must grow at the same rate.
This evaluation showed that it does not have to.
By applying Qbeast's multi-dimensional indexing to existing Lakehouse tables, a leading global asset manager significantly reduced data scanned, lowered executor runtime, and improved query performance - without changing pipelines, dashboards, or SQL.
For firms running similar workloads across fund administration, NAV history, performance reporting, fact sheet generation, compliance, or AI-driven analytics, the opportunity is clear:
Reach out to us to request a demo and explore how Qbeast can help reduce the cost of your fund analytics workloads: https://qbeast.io/request-demo