Models
DaskXGBForecast
dask XGBoost forecaster
Wrapper of `xgboost.dask.DaskXGBRegressor` that adds a `model_` property, which contains the fitted model and is sent to the workers in the forecasting step.
source
DaskXGBForecast
Implementation of the Scikit-Learn API for XGBoost. See the XGBoost documentation (`/python/sklearn_estimator`) for more information.
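As a usage illustration (not part of the generated reference above), the sketch below fits a `DaskXGBForecast` through `DistributedMLForecast` on a small synthetic panel. The import paths, constructor arguments, and the `unique_id`/`ds`/`y` column layout are assumptions based on the mlforecast distributed tutorial and may need adjusting for your installed versions.

```python
# Minimal sketch, assuming the mlforecast distributed API described above.
import numpy as np
import pandas as pd
import dask.dataframe as dd
from dask.distributed import Client

from mlforecast.distributed import DistributedMLForecast
from mlforecast.distributed.models.dask.xgb import DaskXGBForecast

if __name__ == "__main__":
    client = Client(n_workers=2, threads_per_worker=1)  # local cluster

    # Two daily series in long format: unique_id, ds, y
    dates = pd.date_range("2023-01-01", periods=100, freq="D")
    pdf = pd.concat(
        [
            pd.DataFrame(
                {"unique_id": uid, "ds": dates, "y": np.random.rand(len(dates)) + i}
            )
            for i, uid in enumerate(["series_0", "series_1"])
        ],
        ignore_index=True,
    )
    # In practice, partition so that each series stays in a single partition.
    series = dd.from_pandas(pdf, npartitions=2)

    fcst = DistributedMLForecast(
        models=[DaskXGBForecast(n_estimators=50, max_depth=4)],
        freq="D",
        lags=[1, 7],
    )
    fcst.fit(series)          # each worker trains on its partitions
    preds = fcst.predict(14)  # the fitted model_ is sent to the workers here
    print(preds.compute().head())
```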
Parameter | Type | Default | Details |
---|---|---|---|
max_depth | Optional | None | Maximum tree depth for base learners. |
max_leaves | Optional | None | Maximum number of leaves; 0 indicates no limit. |
max_bin | Optional | None | If using histogram-based algorithm, maximum number of bins per feature |
grow_policy | Optional | None | Tree growing policy. - depthwise: Favors splitting at nodes closest to the root. - lossguide: Favors splitting at nodes with highest loss change. |
learning_rate | Optional | None | Boosting learning rate (xgb’s “eta”) |
n_estimators | Optional | None | Number of gradient boosted trees. Equivalent to number of boosting rounds. |
verbosity | Optional | None | The degree of verbosity. Valid values are 0 (silent) - 3 (debug). |
objective | Union | None | Specify the learning task and the corresponding learning objective or a custom objective function to be used. For custom objective, see /tutorials/custom_metric_obj and custom-obj-metric for more information, along with the end note for function signatures. |
booster | Optional | None | |
tree_method | Optional | None | Specify which tree method to use. Default to auto. If this parameter is set to default, XGBoost will choose the most conservative option available. It's recommended to study this option from the parameters document (tree method). |
n_jobs | Optional | None | Number of parallel threads used to run xgboost. When used with other Scikit-Learn algorithms like grid search, you may choose which algorithm to parallelize and balance the threads. Creating thread contention will significantly slow down both algorithms. |
gamma | Optional | None | (min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. |
min_child_weight | Optional | None | Minimum sum of instance weight(hessian) needed in a child. |
max_delta_step | Optional | None | Maximum delta step we allow each tree’s weight estimation to be. |
subsample | Optional | None | Subsample ratio of the training instance. |
sampling_method | Optional | None | Sampling method. Used only by the GPU version of hist tree method. - uniform: Select random training instances uniformly. - gradient_based: Select random training instances with higher probability when the gradient and hessian are larger. (cf. CatBoost) |
colsample_bytree | Optional | None | Subsample ratio of columns when constructing each tree. |
colsample_bylevel | Optional | None | Subsample ratio of columns for each level. |
colsample_bynode | Optional | None | Subsample ratio of columns for each split. |
reg_alpha | Optional | None | L1 regularization term on weights (xgb’s alpha). |
reg_lambda | Optional | None | L2 regularization term on weights (xgb’s lambda). |
scale_pos_weight | Optional | None | Balancing of positive and negative weights. |
base_score | Optional | None | The initial prediction score of all instances, global bias. |
random_state | Union | None | Random number seed. Note: using the gblinear booster with the shotgun updater is nondeterministic as it uses the Hogwild algorithm. |
missing | float | nan | Value in the data to be treated as a missing value. Defaults to numpy.nan. |
num_parallel_tree | Optional | None | |
monotone_constraints | Union | None | Constraint of variable monotonicity. See the tutorial (/tutorials/monotonic) for more information. |
interaction_constraints | Union | None | Constraints for interaction representing permitted interactions. The constraints must be specified in the form of a nested list, e.g. [[0, 1], [2, 3, 4]], where each inner list is a group of indices of features that are allowed to interact with each other. See the tutorial (/tutorials/feature_interaction_constraint) for more information. |
importance_type | Optional | None | |
device | Optional | None | Added in 2.0.0. Device ordinal; available options are cpu, cuda, and gpu. |
validate_parameters | Optional | None | Give warnings for unknown parameters. |
enable_categorical | bool | False | See the same parameter of DMatrix for details. |
feature_types | Optional | None | Added in 1.7.0. Used for specifying feature types without constructing a dataframe. See DMatrix for details. |
max_cat_to_onehot | Optional | None | Added in 1.6.0. Note: this parameter is experimental. A threshold for deciding whether XGBoost should use one-hot encoding based splits for categorical data. When the number of categories is lower than the threshold, one-hot encoding is chosen; otherwise the categories will be partitioned into children nodes. Also, enable_categorical needs to be set to have categorical feature support. See the Categorical Data tutorial (/tutorials/categorical) and cat-param for details. |
max_cat_threshold | Optional | None | Added in 1.7.0. Note: this parameter is experimental. Maximum number of categories considered for each split. Used only by partition-based splits for preventing over-fitting. Also, enable_categorical needs to be set to have categorical feature support. See the Categorical Data tutorial (/tutorials/categorical) and cat-param for details. |
multi_strategy | Optional | None | Added in 2.0.0. Note: this parameter is a work in progress. The strategy used for training multi-target models, including multi-target regression and multi-class classification. See /tutorials/multioutput for more information. - one_output_per_tree: One model for each target. - multi_output_tree: Use multi-target trees. |
eval_metric | Union | None | Added in 1.6.0. Metric used for monitoring the training result and early stopping. It can be a string or list of strings as names of predefined metrics in XGBoost (see doc/parameter.rst), one of the metrics in sklearn.metrics, or any other user defined metric that looks like sklearn.metrics. If a custom objective is also provided, then the custom metric should implement the corresponding reverse link function. Unlike the scoring parameter commonly used in scikit-learn, when a callable object is provided, it's assumed to be a cost function and by default XGBoost will minimize the result during early stopping. For advanced usage of early stopping, like directly choosing to maximize instead of minimize, see xgboost.callback.EarlyStopping. See /tutorials/custom_metric_obj and custom-obj-metric for more information. Example: from sklearn.datasets import load_diabetes; from sklearn.metrics import mean_absolute_error; X, y = load_diabetes(return_X_y=True); reg = xgb.XGBRegressor(tree_method="hist", eval_metric=mean_absolute_error); reg.fit(X, y, eval_set=[(X, y)]) |
early_stopping_rounds | Optional | None | Added in 1.6.0. - Activates early stopping. Validation metric needs to improve at least once in every early_stopping_rounds round(s) to continue training. Requires at least one item in eval_set in fit. - If early stopping occurs, the model will have two additional attributes: best_score and best_iteration. These are used by the predict and apply methods to determine the optimal number of trees during inference. If users want to access the full model (including trees built after early stopping), they can specify the iteration_range in these inference methods. In addition, other utilities like model plotting can also use the entire model. - If you prefer to discard the trees after best_iteration, consider using the callback function xgboost.callback.EarlyStopping. - If there's more than one item in eval_set, the last entry will be used for early stopping. If there's more than one metric in eval_metric, the last metric will be used for early stopping. |
callbacks | Optional | None | List of callback functions that are applied at the end of each iteration. It is possible to use predefined callbacks from the Callback API. Note: states in callbacks are not preserved during training, which means callback objects cannot be reused for multiple training sessions without reinitialization or deepcopy. Be sure to (re)initialize the callbacks before each run, e.g.: for params in parameters_grid: callbacks = [xgb.callback.LearningRateScheduler(custom_rates)]; reg = xgboost.XGBRegressor(**params, callbacks=callbacks); reg.fit(X, y) |
kwargs | Any | | Keyword arguments for the XGBoost Booster object. Full documentation of parameters can be found in the XGBoost parameter docs (/parameter). Attempting to set a parameter via the constructor args and **kwargs dict simultaneously will result in a TypeError. Note: **kwargs is unsupported by scikit-learn. We do not guarantee that parameters passed via this argument will interact properly with scikit-learn. |
Returns | None | | |
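Since the constructor mirrors `xgboost.dask.DaskXGBRegressor`, the parameters listed above can be passed directly as keyword arguments. A minimal sketch, assuming the import path used earlier; the values below are illustrative, not the defaults.

```python
# Sketch: keyword arguments are forwarded to the underlying
# xgboost.dask.DaskXGBRegressor.
from mlforecast.distributed.models.dask.xgb import DaskXGBForecast

model = DaskXGBForecast(
    n_estimators=200,      # number of boosting rounds
    learning_rate=0.05,    # xgb's "eta"
    max_depth=6,           # maximum tree depth for base learners
    subsample=0.8,         # row subsample ratio per tree
    colsample_bytree=0.8,  # column subsample ratio per tree
    tree_method="hist",    # histogram-based tree construction
)
```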