Because your Anaplan workspace allowance is tied to your subscription rate, it pays to build efficient models. A model might meet your business analysis and planning needs well, yet still cost more than it should because of its size. Anaplan's cloud solution offers high data cube capacity and calculation performance across multiple dimensions, but this doesn't mean you should ignore size when designing and building models. Anaplan offers several modeling features to keep models lean and efficient. Use this section to learn how to manage your model size and avoid overheads, such as unnecessary data duplication or multiplication of empty data cells (sparsity).
You can make some quick checks to keep model size under control as you build models and add modules. Anaplan also provides built-in features and modeling techniques that help you avoid unnecessary increases in model size.
If you want to review a model to find out which of its modules are the largest, you can export the model's module list to Excel and then sort the modules by their cell count value.
Here are some tips for optimizing model size as you build a model and add modules:
- Line Item Checks
  - Review the summary setting on line items. If a sum is not being used, then it should be set to None. See Summary Methods.
  - Review the dimensionality of line items. Often, line items don't need to have the same dimensions as the module. Also, look at the timescale and versions dimensions to see whether some line items need not have these dimensions applied, even though the module does.
- Line Item Duplication
  - Try to limit duplication between modules. If a line item is duplicated between modules, then make sure there's a valid reason (such as different permissions).
- Granularity of Hierarchies
  - Review any hierarchies you have built into your model to ensure the level of granularity is adequate and appropriate to represent the data. If you have hierarchies which allow a finer granularity than the data requires, this will prove costly in terms of model size. Keep hierarchy granularity as coarse as possible to meet data requirements.
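To see why these checks matter, here is a rough Python sketch of how dimensionality and summary items multiply cell counts. It assumes, as a simplification, that a line item's cell count is the product of the number of items in each of its dimensions, and that switching summaries on adds parent or total items to each dimension. The list sizes are hypothetical and the function is not part of any Anaplan API.

```python
from math import prod

def cell_count(dimension_sizes):
    """Estimate cells for one line item as the product of its dimension sizes."""
    return prod(dimension_sizes)

# Hypothetical leaf-level list sizes, chosen only for illustration.
products, regions, months = 200, 10, 12

# Assumption: summaries add one total per list, plus 4 quarter totals
# and 1 year total on the time dimension.
with_summaries = cell_count([products + 1, regions + 1, months + 5])
without_summaries = cell_count([products, regions, months])

# A line item that varies only by product (for example, a unit price)
# needs neither the region nor the time dimension.
product_only = cell_count([products])

print(with_summaries, without_summaries, product_only)
```

Under these assumptions, dropping unused summaries removes roughly a third of the cells, while removing two unneeded dimensions from a single line item shrinks it by orders of magnitude.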
Line item subsets let you group together a set of line items that belong to one or many modules. When you create a line item subset, you can use it as a normal list — either as a dimension in another module or for list formatting a line item.
Use a line item subset to:
- Re-use existing line items in other modules and avoid duplication of those line items. See Line Item Subsets for details of how to create and use line item subsets.
- Serve as a dimension on a module. You can then use the COLLECT function to pull in data values from the source modules to which the original line items belong, and so avoid unnecessary data duplication. See the COLLECT page for details of how to do this with examples.
When you create a numbered list, each list item is assigned a unique, system-generated ID number — the item's display name is optional. Numbered lists allow you to avoid data sparsity in your models. Sparsity occurs when data cells that will always remain empty appear in your modules. These empty cells increase the size of your model. For example, a common use case that can generate sparsity is a module that tracks Product Sales by Sales Rep by Customer. Most Sales Reps will sell more than one Product to more than one Customer, but it is very unlikely that any single Sales Rep will sell all Products to all Customers. Many cells will always be empty — those that sit at the possible unused intersections of Product Sales/Sales Rep/Customer.
You can use a numbered list to prevent this data-sparsity overhead by representing only those intersections that you know are valid combinations and will hold data. A numbered list ensures that your module's line items realize only the valid data combinations.
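The difference between a fully dimensioned module and a list of valid combinations can be sketched in Python. The sales records, names, and the generated ID format below are hypothetical. The sketch only illustrates the counting argument, not how Anaplan stores numbered lists.

```python
# Hypothetical records: (sales rep, product, customer) combinations that occur.
sales = [
    ("Ana", "Widget", "Acme"),
    ("Ana", "Gadget", "Acme"),
    ("Ben", "Widget", "Globex"),
    ("Ben", "Widget", "Initech"),
]

reps = {rep for rep, _, _ in sales}
products = {product for _, product, _ in sales}
customers = {customer for _, _, customer in sales}

# Fully dimensioned module: one cell per possible intersection.
dense_cells = len(reps) * len(products) * len(customers)

# Numbered-list style: one item per valid combination, keyed by a generated ID.
valid_combinations = {
    f"#{i}": combo for i, combo in enumerate(sorted(set(sales)), start=1)
}

# Share of cells in the dense layout that would always stay empty.
sparsity = 1 - len(valid_combinations) / dense_cells
print(dense_cells, len(valid_combinations), round(sparsity, 2))
```

Even in this tiny example, two thirds of the dense cells would never hold a value. With realistic list sizes the gap is usually far larger.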
Numbered lists prevent sparsity by representing, in a hierarchical structure, only the valid combinations (those that carry a value) of what would otherwise be intersections of multiple dimensions. This is one strategy for limiting dimensions as you add modules and build out your model. Managing dimensionality and preserving data density are key parts of creating efficient models:
- When deciding which modules you need in a model, try to design your data modules to respect a natural dimensionality, so that they express the intrinsic relationships between data sets, for example between cost-center data and employee data.
- If you find you need more than five dimensions in a module, it is worth reviewing the purpose of the module and asking whether you really need two separate modules instead.
- Design modules to serve a general role within an overall model structure and to serve data management:
  - Source modules, by carrying relatively few dimensions, can be kept data-dense and avoid sparsity.
  - Results modules can calculate specific outcomes tailored to specific purposes and based on data gathered from source modules.
  - Data management across a model's lifecycle can all be done at the level of the baseline data-source modules. Updates to baseline data are immediately propagated through to results modules.
The SUM aggregation function and the LOOKUP function are designed to facilitate this modeling approach, because you can use them to pull and aggregate data from data-dense source modules into results modules. See the SUM and LOOKUP pages for details of how to use these functions with examples.
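As a conceptual analogy only, the source-module and results-module pattern can be sketched in Python. This is not Anaplan formula syntax. The module names, mapping, and values are hypothetical, and the dictionaries merely stand in for a data-dense source module and a mapping line item.

```python
from collections import defaultdict

# Hypothetical source module: salary by employee (one dimension, fully populated).
salary_by_employee = {"E1": 50_000, "E2": 62_000, "E3": 48_000}

# Hypothetical mapping: the cost center each employee belongs to.
cost_center_of = {"E1": "CC-10", "E2": "CC-10", "E3": "CC-20"}

# SUM-style aggregation: roll employee values up to their cost center.
salary_by_cost_center = defaultdict(int)
for employee, salary in salary_by_employee.items():
    salary_by_cost_center[cost_center_of[employee]] += salary

# LOOKUP-style retrieval: fetch a value held at cost-center level per employee.
budget_by_cost_center = {"CC-10": 120_000, "CC-20": 50_000}
budget_for_employee = {
    employee: budget_by_cost_center[cost_center]
    for employee, cost_center in cost_center_of.items()
}

print(dict(salary_by_cost_center), budget_for_employee)
```

The source data is held once, at its natural dimensionality, and the results are derived from it through a mapping rather than stored in a wider, sparser module.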
You can check the relative footprint of the modules in a model at Settings > Modules. The Cell Count column shows each module's size.
To identify which modules have the most significant impact on model size, export the module list. To do this, go to Settings > Modules and export to Excel. Once exported, you can sort the modules from highest to lowest cell count.
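If you prefer to rank the exported list with a script rather than in Excel, a minimal Python sketch might look like the following. It assumes the export has been saved as CSV with columns named `Module` and `Cell Count`. Those column names, and the sample rows, are assumptions for illustration and may not match the actual export.

```python
import csv
import io

# Stand-in for the exported file; column names and values are assumptions.
exported = io.StringIO(
    "Module,Cell Count\n"
    "Revenue,1200000\n"
    "Headcount,35000\n"
    "Sales by Rep,8400000\n"
)

rows = list(csv.DictReader(exported))

# Sort from highest to lowest cell count.
rows.sort(key=lambda row: int(row["Cell Count"]), reverse=True)

# Show each module's share of the total to highlight the biggest contributors.
total = sum(int(row["Cell Count"]) for row in rows)
for row in rows:
    cells = int(row["Cell Count"])
    print(f'{row["Module"]:<15}{cells:>12,}{cells / total:>8.1%}')
```

To use a real export, replace the `io.StringIO` stand-in with `open(path, newline="")` and adjust the column names to match the file.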