Create a Predictive Insights dataset to train models, score, and enrich your data.
Before you start
Before you can create a dataset, you must have a model in Anaplan that contains a module with a list of accounts. This list consists of account names (companies) for existing and prospective customers, and must respect these rules:
- This list must be the only list in the module.
- The list must contain at least 100 unique accounts, but no more than 500,000.
- The list cannot have any gaps.
- The module cannot have any hierarchies, multiple lists, or any form of multi-dimensionality.
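Before you load accounts into the module, it can help to pre-check an exported list of account names against the size and gap rules above. The following is a minimal sketch, not part of Anaplan or Predictive Insights: the function name `check_account_list` and the `ListCheck` type are hypothetical, "gaps" is assumed to mean blank entries, and the comparison of names is assumed to be case-insensitive.

```python
from dataclasses import dataclass

# Limits taken from the rules above.
MIN_ACCOUNTS = 100
MAX_ACCOUNTS = 500_000


@dataclass
class ListCheck:
    """Result of a local pre-check (hypothetical helper type)."""
    unique_count: int      # number of distinct, non-blank account names
    blank_rows: list[int]  # 1-based positions of blank entries
    duplicates: list[str]  # repeated names, reported for information only

    @property
    def ok(self) -> bool:
        # Passes when the unique count is within limits and no entry is blank.
        in_range = MIN_ACCOUNTS <= self.unique_count <= MAX_ACCOUNTS
        return in_range and not self.blank_rows


def check_account_list(names):
    """Check account names against the size and gap rules.

    Assumptions: a "gap" is a blank or whitespace-only entry, and two
    names that differ only by letter case count as the same account.
    """
    seen = set()
    duplicates = []
    blank_rows = []
    for row, raw in enumerate(names, start=1):
        name = (raw or "").strip()
        if not name:
            blank_rows.append(row)
            continue
        key = name.casefold()
        if key in seen:
            duplicates.append(name)
        else:
            seen.add(key)
    return ListCheck(len(seen), blank_rows, duplicates)
```

Calling `check_account_list(names).ok` on a list of names returns `True` only when the list holds between 100 and 500,000 unique accounts and has no blank entries. Duplicates are reported but do not fail the check, because the rules above only constrain the number of unique accounts.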
Create a Predictive Insights dataset
To set up a Predictive Insights dataset:
- In Home > Predictive Insights, select Datasets in the left-side panel, then New dataset.
- Enter a unique Name for your dataset. Choose a name that describes the accounts included, for example, Midmarket Prospects or North America Target Accounts.
- Select a Usage for your dataset. There are two usage types: one is for scoring or enriching accounts only; the other is for predictive model training, in addition to scoring and enrichment.
- To define which module contains your data, select in turn: a Workspace, a Model, a Module, and a Module view. Only views that meet the module prerequisites listed under Before you start display in the dropdown. If no custom views exist, only the default view displays.
- In Account name, select the line item in your module that contains the list of accounts.
- Optionally, to obtain a better match with the Predictive Insights database, select line items from your module for URL or Email. These line items must use these data types:
    - URL – a text data type with the type Link.
    - Email – a text data type with the type Email.
- Optionally, select an Account CRM ID, which is a number data type. You can later use it as a unique identifier to match with data contained elsewhere in Anaplan.
- Select Confirm.
The dataset creation process may take time to complete, depending on the number of accounts that are matched with the Predictive Insights database. The dataset is available when its status displays as Ready in the Status column of the Datasets screen.