# DaskXGBForecast

Dask XGBoost forecaster.

Wrapper of `xgboost.dask.DaskXGBRegressor` that adds a `model_` property that contains the fitted model and is sent to the workers in the forecasting step.
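The wrapper pattern described above can be illustrated with a toy sketch. Every class below (`ToyDistributedRegressor`, `ToyLocalModel`, `ToyForecast`) is hypothetical and uses only the standard library; it is not the mlforecast or XGBoost implementation. It only shows the idea of a `model_` property that exposes the fitted model so it can be applied to each partition during forecasting.

```python
class ToyDistributedRegressor:
    """Hypothetical stand-in for a distributed estimator; not the xgboost API."""

    def fit(self, partitions):
        # Each partition is a list of (feature, target) pairs.
        targets = [target for part in partitions for _, target in part]
        self._mean = sum(targets) / len(targets)
        return self


class ToyLocalModel:
    """Hypothetical lightweight model that predicts without a cluster."""

    def __init__(self, mean):
        self.mean = mean

    def predict(self, features):
        # Predict the training mean for every requested row.
        return [self.mean for _ in features]


class ToyForecast(ToyDistributedRegressor):
    """Wrapper that adds a `model_` property holding the fitted model."""

    @property
    def model_(self):
        return ToyLocalModel(self._mean)


train_partitions = [[(1, 10.0), (2, 20.0)], [(3, 30.0)]]
fitted = ToyForecast().fit(train_partitions)

# In the forecasting step, the same fitted model is applied to each partition,
# which is what sending `model_` to the workers amounts to.
future_partitions = [[4, 5], [6]]
local_model = fitted.model_
forecasts = [local_model.predict(part) for part in future_partitions]
```

Here the list comprehension plays the role of the workers: each partition receives the same fitted model and produces its own predictions.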

### DaskXGBForecast

`DaskXGBForecast (max_depth:Optional[int]=None, max_leaves:Optional[int]=None, max_bin:Optional[int]=None, grow_policy:Optional[str]=None, learning_rate:Optional[float]=None, n_estimators:int=100, verbosity:Optional[int]=None, objective:Union[str,Callable[[numpy.ndarray,numpy.ndarray],Tuple[numpy.ndarray,numpy.ndarray]],NoneType]=None, booster:Optional[str]=None, tree_method:Optional[str]=None, n_jobs:Optional[int]=None, gamma:Optional[float]=None, min_child_weight:Optional[float]=None, max_delta_step:Optional[float]=None, subsample:Optional[float]=None, sampling_method:Optional[str]=None, colsample_bytree:Optional[float]=None, colsample_bylevel:Optional[float]=None, colsample_bynode:Optional[float]=None, reg_alpha:Optional[float]=None, reg_lambda:Optional[float]=None, scale_pos_weight:Optional[float]=None, base_score:Optional[float]=None, random_state:Union[numpy.random.mtrand.RandomState,int,NoneType]=None, missing:float=nan, num_parallel_tree:Optional[int]=None, monotone_constraints:Union[Dict[str,int],str,NoneType]=None, interaction_constraints:Union[str,Sequence[Sequence[str]],NoneType]=None, importance_type:Optional[str]=None, gpu_id:Optional[int]=None, validate_parameters:Optional[bool]=None, predictor:Optional[str]=None, enable_categorical:bool=False, feature_types:Sequence[str]=None, max_cat_to_onehot:Optional[int]=None, max_cat_threshold:Optional[int]=None, eval_metric:Union[str,List[str],Callable,NoneType]=None, early_stopping_rounds:Optional[int]=None, callbacks:Optional[List[xgboost.callback.TrainingCallback]]=None, **kwargs:Any)`

*Implementation of the Scikit-Learn API for XGBoost.*
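The `objective` annotation in the signature above allows a callable that takes two arrays and returns two arrays. A minimal sketch of such a callable follows, assuming the usual XGBoost scikit-learn convention that the arguments are the true targets and the current predictions (in that order) and that the return values are the gradient and the hessian. The function name `squared_error_objective` is made up for illustration.

```python
from typing import Tuple

import numpy as np


def squared_error_objective(
    y_true: np.ndarray, y_pred: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """Gradient and hessian of 0.5 * (y_pred - y_true) ** 2 w.r.t. y_pred."""
    grad = y_pred - y_true          # first derivative
    hess = np.ones_like(y_pred)     # second derivative is constant
    return grad, hess


# Evaluate the objective on a tiny hand-made example.
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])
grad, hess = squared_error_objective(y_true, y_pred)
```

A callable of this shape could then be passed as `objective=squared_error_objective`; whether a given distributed setup accepts it should be checked against the installed XGBoost version.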

| | Type | Default | Details |
|---|---|---|---|
| max_depth | Optional | None | Maximum tree depth for base learners. |
| max_leaves | Optional | None | Maximum number of leaves; 0 indicates no limit. |
| max_bin | Optional | None | If using a histogram-based algorithm, maximum number of bins per feature. |
| grow_policy | Optional | None | Tree growing policy. 0: favor splitting at nodes closest to the root, i.e. grow depth-wise. 1: favor splitting at nodes with highest loss change. |
| learning_rate | Optional | None | Boosting learning rate (xgb's "eta"). |
| n_estimators | int | 100 | Number of gradient boosted trees. Equivalent to the number of boosting rounds. |
| verbosity | Optional | None | The degree of verbosity. Valid values are 0 (silent) to 3 (debug). |
| objective | Union | None | Specify the learning task and the corresponding learning objective, or a custom objective function to be used. |
| booster | Optional | None | |
| tree_method | Optional | None | |
| n_jobs | Optional | None | Number of parallel threads used to run xgboost. When used with other Scikit-Learn algorithms like grid search, you may choose which algorithm to parallelize and balance the threads. Creating thread contention will significantly slow down both algorithms. |
| gamma | Optional | None | (min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. |
| min_child_weight | Optional | None | Minimum sum of instance weight (hessian) needed in a child. |
| max_delta_step | Optional | None | Maximum delta step we allow each tree's weight estimation to be. |
| subsample | Optional | None | Subsample ratio of the training instances. |
| sampling_method | Optional | None | Sampling method. Used only by the `gpu_hist` tree method.<br/>- `uniform`: select random training instances uniformly.<br/>- `gradient_based`: select random training instances with higher probability when the gradient and hessian are larger (cf. CatBoost). |
| colsample_bytree | Optional | None | Subsample ratio of columns when constructing each tree. |
| colsample_bylevel | Optional | None | Subsample ratio of columns for each level. |
| colsample_bynode | Optional | None | Subsample ratio of columns for each split. |
| reg_alpha | Optional | None | L1 regularization term on weights (xgb's alpha). |
| reg_lambda | Optional | None | L2 regularization term on weights (xgb's lambda). |
| scale_pos_weight | Optional | None | Balancing of positive and negative weights. |
| base_score | Optional | None | The initial prediction score of all instances, global bias. |
| random_state | Union | None | Random number seed. Note: using the gblinear booster with the shotgun updater is nondeterministic, as it uses the Hogwild algorithm. |
| missing | float | nan | Value in the data that is to be treated as a missing value. |
| num_parallel_tree | Optional | None | |
| monotone_constraints | Union | None | Constraint of variable monotonicity. See the tutorial (`/tutorials/monotonic`) for more information. |
| interaction_constraints | Union | None | Constraints for interaction representing permitted interactions. The constraints must be specified in the form of a nested list, e.g. `[[0, 1], [2, 3, 4]]`, where each inner list is a group of indices of features that are allowed to interact with each other. See the tutorial (`/tutorials/feature_interaction_constraint`) for more information. |
| importance_type | Optional | None | |
| gpu_id | Optional | None | Device ordinal. |
| validate_parameters | Optional | None | Give warnings for unknown parameters. |
| predictor | Optional | None | Force XGBoost to use a specific predictor. Available choices are `cpu_predictor` and `gpu_predictor`. |
| enable_categorical | bool | False | Added in version 1.5.0. Experimental support for categorical data. When enabled, cudf/pandas.DataFrame should be used to specify the categorical data type. Also, the JSON/UBJSON serialization format is required. |
| feature_types | Sequence | None | Added in version 1.7.0. Used for specifying feature types without constructing a dataframe. See `DMatrix` for details. |
| max_cat_to_onehot | Optional | None | Added in version 1.6.0. Experimental. A threshold for deciding whether XGBoost should use a one-hot-encoding-based split for categorical data. When the number of categories is less than the threshold, one-hot encoding is chosen; otherwise the categories are partitioned into children nodes. Also, `enable_categorical` needs to be set to have categorical feature support. See Categorical Data (`/tutorials/categorical`) and `cat-param` for details. |
| max_cat_threshold | Optional | None | Added in version 1.7.0. Experimental. Maximum number of categories considered for each split. Used only by partition-based splits to prevent over-fitting. Also, `enable_categorical` needs to be set to have categorical feature support. See Categorical Data (`/tutorials/categorical`) and `cat-param` for details. |
| eval_metric | Union | None | Added in version 1.6.0. Metric used for monitoring the training result and early stopping. It can be a string or list of strings naming predefined metrics in XGBoost (see doc/parameter.rst), one of the metrics in `sklearn.metrics`, or any other user-defined metric that looks like those in `sklearn.metrics`. If a custom objective is also provided, the custom metric should implement the corresponding reverse link function. Unlike the `scoring` parameter commonly used in scikit-learn, when a callable object is provided it is assumed to be a cost function, and by default XGBoost will minimize the result during early stopping. For advanced usage of early stopping, such as directly choosing to maximize instead of minimize, see `xgboost.callback.EarlyStopping`. See Custom Objective and Evaluation Metric (`/tutorials/custom_metric_obj`) for more. Note: this parameter replaces `eval_metric` in the `fit` method. The old one receives un-transformed predictions regardless of whether a custom objective is being used. Example:<br/>`import xgboost as xgb`<br/>`from sklearn.datasets import load_diabetes`<br/>`from sklearn.metrics import mean_absolute_error`<br/>`X, y = load_diabetes(return_X_y=True)`<br/>`reg = xgb.XGBRegressor(tree_method="hist", eval_metric=mean_absolute_error)`<br/>`reg.fit(X, y, eval_set=[(X, y)])` |
| early_stopping_rounds | Optional | None | Added in version 1.6.0. Activates early stopping. The validation metric needs to improve at least once in every `early_stopping_rounds` round(s) to continue training. Requires at least one item in `eval_set` in `fit`. The method returns the model from the last iteration (not the best one). If there is more than one item in `eval_set`, the last entry will be used for early stopping. If there is more than one metric in `eval_metric`, the last metric will be used for early stopping. If early stopping occurs, the model will have three additional fields: `best_score`, `best_iteration` and `best_ntree_limit`. Note: this parameter replaces `early_stopping_rounds` in the `fit` method. |
| callbacks | Optional | None | List of callback functions that are applied at the end of each iteration. It is possible to use predefined callbacks through the Callback API (`callback_api`). Note: states in callbacks are not preserved during training, which means callback objects cannot be reused for multiple training sessions without reinitialization or deepcopy. Example:<br/>`for params in parameters_grid:`<br/>`    # be sure to (re)initialize the callbacks before each run`<br/>`    callbacks = [xgb.callback.LearningRateScheduler(custom_rates)]`<br/>`    xgboost.train(params, Xy, callbacks=callbacks)` |
| kwargs | Any | | Keyword arguments for the XGBoost Booster object. Full documentation of parameters can be found in the parameter documentation (`/parameter`). Attempting to set a parameter via the constructor args and the `**kwargs` dict simultaneously will result in a TypeError. Note: `**kwargs` is unsupported by scikit-learn. We do not guarantee that parameters passed via this argument will interact properly with scikit-learn. |
| Returns | None | | |
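The `early_stopping_rounds` description above says the validation metric must improve at least once in every `early_stopping_rounds` rounds, and that the model from the last iteration (not the best one) is returned. The rule can be sketched in plain Python as follows. This is an illustration of the stopping logic only, assuming a metric that is minimised; the function `run_with_early_stopping` is hypothetical and is not XGBoost code.

```python
def run_with_early_stopping(validation_scores, early_stopping_rounds):
    """Return (last_iteration, best_iteration, best_score) for a minimised metric."""
    best_score = float("inf")
    best_iteration = -1
    last_iteration = -1
    for iteration, score in enumerate(validation_scores):
        last_iteration = iteration
        if score < best_score:
            # The metric improved, so record the new best round.
            best_score = score
            best_iteration = iteration
        elif iteration - best_iteration >= early_stopping_rounds:
            # No improvement for `early_stopping_rounds` rounds: stop training.
            break
    return last_iteration, best_iteration, best_score


# One validation score per boosting round (made-up numbers).
scores = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.60]
last_iteration, best_iteration, best_score = run_with_early_stopping(
    scores, early_stopping_rounds=3
)
```

With these scores, training stops at round 5 while the best round was 2, which is why the last iteration and the best iteration can differ. The final score of 0.60 is never reached.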