**Attributes**

Attributes are categorical, non-numeric values that group historical data items by their shared characteristics (for example, product category). They help PlanIQ discover patterns across similar items.

They often represent item hierarchy elements or business dimensions (such as SKU by store). Attributes are used to generate more accurate forecasts.

**Backtest**

A backtest is a standard technique used to validate ML and time series forecast models. Historical data is split into a train set and a test set. The train set is used to train a forecast model and produce forecasts for the timeframe represented in the test set. The forecast model is then evaluated by comparing those forecasts with the actuals (the known observed values) in the test set.

The backtest time period used is equal to the forecast horizon. The model training process withholds historical data for this period and then uses the same period to generate forecasts.

Backtesting results are used to assess forecast model quality.
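The split-and-compare procedure described above can be sketched in a few lines of Python. This is an illustration only, not PlanIQ's implementation: the function name `backtest`, the seasonal naive stand-in model, and the MAPE metric are all assumptions chosen to keep the example small.

```python
from statistics import mean


def backtest(history, horizon, season_length):
    """Hold out the last `horizon` points, forecast them, and score the result."""
    if horizon <= 0 or len(history) < horizon + season_length:
        raise ValueError("not enough history for this horizon")

    # The backtest period equals the forecast horizon: the last `horizon`
    # points are withheld as the test set, the rest is the train set.
    train, test = history[:-horizon], history[-horizon:]

    # Stand-in model (an assumption): seasonal naive, which repeats the
    # value observed one season earlier.
    forecasts = []
    for _ in range(horizon):
        extended = train + forecasts
        forecasts.append(extended[len(extended) - season_length])

    # Mean absolute percentage error over the test set; zero actuals are skipped.
    errors = [abs(f - a) / abs(a) for f, a in zip(forecasts, test) if a != 0]
    mape = mean(errors) if errors else float("nan")
    return forecasts, test, mape


# Hypothetical data: three "seasons" of four periods each.
units_sold = [100, 120, 90, 110, 100, 120, 90, 110, 110, 120, 90, 110]
forecasts, actuals, mape = backtest(units_sold, horizon=4, season_length=4)
```

A lower error on the withheld period suggests a better fit, although any single metric gives only a partial view of model quality.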

**Data collection**

A data collection includes historical data, and optionally, related data and attributes. It's used to train forecast models to generate forecast results.

**Decomposition**

A forecasting technique where the historical data is split out into different components or variables. Each variable in turn is then used in the forecast. This allows you to assess how much influence a variable has on your forecast.

For example, you might want to see whether the price per unit has less influence on profit than your monthly advertising costs.

**Drivers**

One or more time-series numerical variables that correlate with or sometimes affect the forecasted time series. Examples might include a competitive sales price per unit, or the number of product promotions.

**Explainability**

Models and methods that make the decisions and behavior of machine learning tools more understandable.

**Forecast action**

A forecast action generates a prediction from the most recent actuals. The action imports the forecast results and explainability information (for supported algorithms) from PlanIQ into the Anaplan model.

**Forecast horizon**

The length of time into the future for which the forecast generates predictions. The maximum forecast horizon is set by the depth of historical data and the selected algorithm.

**Forecast model**

An algorithm trained from a data collection to generate forecasts.

**Forecast time interval**

The interval between forecasted data points, as defined by the input data (historical and related) and the selected algorithm. The forecast time interval can be daily, weekly, or monthly.

**Historical data (actuals)**

The primary data type used in a forecast. This data is mandatory. It must be numerical and include a time dimension. Examples: Units sold, Expenditure.

**Hyperparameter**

A parameter whose value is used to control the learning process of the forecast model. Hyperparameter tuning is part of the model configuration procedure within PlanIQ.

**Lag**

The number of historical time periods used to forecast a future time period. For example, PlanIQ needs 3 months of historical data (March, April, and May) to calculate a trend for June. Similarly, to obtain a single yearly trend data point for March 2021, PlanIQ needs 12 months of historical data, from March 2020 to February 2021.
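As a minimal sketch of how a lag maps to calendar periods, the snippet below lists the months that fall inside a lag window for a monthly series. The helper `lag_window` is hypothetical and only counts back whole months; it does not reflect how PlanIQ selects lags internally.

```python
from datetime import date


def lag_window(target, lag):
    """Return the first day of each of the `lag` months before `target`."""
    months = []
    year, month = target.year, target.month
    for _ in range(lag):
        month -= 1
        if month == 0:
            # Step back across a year boundary.
            year, month = year - 1, 12
        months.append(date(year, month, 1))
    return list(reversed(months))


# A lag of 3 for a June forecast covers March, April, and May.
window = lag_window(date(2021, 6, 1), lag=3)
print([d.isoformat() for d in window])
# ['2021-03-01', '2021-04-01', '2021-05-01']
```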

**Noise**

Also known as residuals, noise is the portion of the data that deviates from the forecasting trend. In time series forecasting, noise is unpredictable, random variation that departs from the typical behavior of the time series. Noise can be a useful indicator of model quality.
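To make the idea concrete, the sketch below fits a straight-line trend by ordinary least squares and reports each point's deviation from it. The straight line is an assumption used for illustration; a real forecast model would supply its own fitted values, and `linear_trend_residuals` is a hypothetical helper name.

```python
def linear_trend_residuals(values):
    """Fit y = intercept + slope * t by least squares and return the residuals."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")

    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n

    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean

    # Residual = observed value minus the value the trend line predicts.
    return [y - (intercept + slope * x) for x, y in zip(xs, values)]


# Hypothetical series: an upward trend with some irregular variation.
residuals = linear_trend_residuals([10, 13, 11, 16, 15, 19])
```

Small residuals with no visible pattern suggest the trend captures most of the structure in the data; large or patterned residuals suggest the model is missing something.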

**Related data (drivers)**

An optional dataset that provides additional drivers (business factors and values) to your forecast. Related data can help PlanIQ improve forecast accuracy. Examples include historical and forward-looking promotions. Related data includes:

- A time dimension.
- Historical data points leading up to the forecast horizon.
- Future data points to cover the length of the forecast horizon (optional but recommended for increased forecast accuracy).
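The requirements listed above can be checked before a dataset is used. The sketch below reports which periods a driver series does not cover, for both the historical range and the forecast horizon. The function `missing_driver_periods`, the period labels, and the promotion counts are hypothetical and are not part of PlanIQ.

```python
def missing_driver_periods(driver, history_periods, horizon_periods):
    """List the periods for which the driver series has no value."""
    return {
        "historical": [p for p in history_periods if p not in driver],
        "future": [p for p in horizon_periods if p not in driver],
    }


# Hypothetical driver: number of promotions per month.
promotions = {"2024-01": 0, "2024-02": 2, "2024-03": 1, "2024-04": 3}

gaps = missing_driver_periods(
    promotions,
    history_periods=["2024-01", "2024-02", "2024-03"],
    horizon_periods=["2024-04", "2024-05"],
)
print(gaps)
# {'historical': [], 'future': ['2024-05']}
```

Here the driver covers the full history but only part of the forecast horizon, so the future values for the last period would need to be supplied to cover the whole horizon.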

**Sparse data**

A dataset where many of the values are 0. For example, a dataset that represents demand for a product over time is sparse if demand is zero in many periods.

Before you include a dataset in a data collection, you can choose to distinguish between true zeros and zeros that represent missing data. This ensures your forecast result is not skewed. For more information, see Exclude values.
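The difference between true zeros and zeros that stand in for missing data can be illustrated outside PlanIQ. In this sketch, missing periods are marked as `None` so they are excluded from the calculation; the helper `zero_share`, the sample values, and the set of missing periods are all assumptions for illustration.

```python
def zero_share(values):
    """Fraction of known (non-missing) observations that are exactly zero."""
    known = [v for v in values if v is not None]
    if not known:
        return 0.0
    return sum(1 for v in known if v == 0) / len(known)


# Hypothetical demand series in which every gap was recorded as 0.
raw_demand = [0, 5, 0, 0, 3, 0, 0, 0]

# Assumption: nothing was recorded for periods 2 and 3, so those zeros
# represent missing data rather than genuine zero demand.
missing_periods = {2, 3}
cleaned_demand = [
    None if i in missing_periods else v for i, v in enumerate(raw_demand)
]

raw_share = zero_share(raw_demand)          # 6 zeros out of 8 values
cleaned_share = zero_share(cleaned_demand)  # 4 zeros out of 6 known values
```

Treating the missing periods as genuine zeros overstates how often demand was actually zero, which is the kind of skew the distinction is meant to avoid.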

**Time series**

A sequence of successive data points (observations) recorded at regular time intervals.

**Time series forecasting**

Prediction of a time series’ future values based on historical and other data types.