Wrapper around xgboost.dask.DaskXGBRegressor that adds a model_ property holding the fitted model, which is sent to the workers during the forecasting step.
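
A minimal sketch of the idea, not the actual implementation: the subclass and property names below are illustrative only, showing how a booster trained on a Dask cluster can be exposed as a plain, picklable local model that the forecasting step can broadcast to the workers.

.. code-block:: python

    import xgboost as xgb
    from xgboost import dask as dxgb

    class DaskXGBForecastSketch(dxgb.DaskXGBRegressor):  # illustrative name, not the real class
        @property
        def model_(self):
            # Serialize the booster trained on the Dask cluster and load it into a
            # regular single-machine regressor that can be pickled and shipped to
            # the workers for prediction.
            raw = self.get_booster().save_raw(raw_format="ubj")
            local = xgb.XGBRegressor()
            local.load_model(raw)
            return local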


source

DaskXGBForecast

 DaskXGBForecast (max_depth:Optional[int]=None,
                  max_leaves:Optional[int]=None,
                  max_bin:Optional[int]=None,
                  grow_policy:Optional[str]=None,
                  learning_rate:Optional[float]=None,
                  n_estimators:Optional[int]=None,
                  verbosity:Optional[int]=None,
                  objective:Union[str,xgboost.sklearn._SklObjWProto,Callable[[Any,Any],Tuple[numpy.ndarray,numpy.ndarray]],NoneType]=None,
                  booster:Optional[str]=None,
                  tree_method:Optional[str]=None,
                  n_jobs:Optional[int]=None,
                  gamma:Optional[float]=None,
                  min_child_weight:Optional[float]=None,
                  max_delta_step:Optional[float]=None,
                  subsample:Optional[float]=None,
                  sampling_method:Optional[str]=None,
                  colsample_bytree:Optional[float]=None,
                  colsample_bylevel:Optional[float]=None,
                  colsample_bynode:Optional[float]=None,
                  reg_alpha:Optional[float]=None,
                  reg_lambda:Optional[float]=None,
                  scale_pos_weight:Optional[float]=None,
                  base_score:Optional[float]=None,
                  random_state:Union[numpy.random.mtrand.RandomState,numpy.random._generator.Generator,int,NoneType]=None,
                  missing:float=nan,
                  num_parallel_tree:Optional[int]=None,
                  monotone_constraints:Union[Dict[str,int],str,NoneType]=None,
                  interaction_constraints:Union[str,Sequence[Sequence[str]],NoneType]=None,
                  importance_type:Optional[str]=None,
                  device:Optional[str]=None,
                  validate_parameters:Optional[bool]=None,
                  enable_categorical:bool=False,
                  feature_types:Optional[Sequence[str]]=None,
                  max_cat_to_onehot:Optional[int]=None,
                  max_cat_threshold:Optional[int]=None,
                  multi_strategy:Optional[str]=None,
                  eval_metric:Union[str,List[str],Callable,NoneType]=None,
                  early_stopping_rounds:Optional[int]=None,
                  callbacks:Optional[List[xgboost.callback.TrainingCallback]]=None,
                  **kwargs:Any)

Implementation of the Scikit-Learn API for XGBoost. See :doc:/python/sklearn_estimator for more information.

Parameters

max_depth : Optional[int], default=None
    Maximum tree depth for base learners.

max_leaves : Optional[int], default=None
    Maximum number of leaves; 0 indicates no limit.

max_bin : Optional[int], default=None
    If using histogram-based algorithm, maximum number of bins per feature.

grow_policy : Optional[str], default=None
    Tree growing policy.

    - depthwise: Favors splitting at nodes closest to the root.
    - lossguide: Favors splitting at nodes with the highest loss change.

learning_rate : Optional[float], default=None
    Boosting learning rate (xgb's "eta").

n_estimators : Optional[int], default=None
    Number of gradient boosted trees. Equivalent to the number of boosting rounds.

verbosity : Optional[int], default=None
    The degree of verbosity. Valid values are 0 (silent) to 3 (debug).

objective : Union[str, xgboost.sklearn._SklObjWProto, Callable[[Any, Any], Tuple[numpy.ndarray, numpy.ndarray]], NoneType], default=None
    Specify the learning task and the corresponding learning objective or a custom
    objective function to be used.

    For custom objective, see :doc:/tutorials/custom_metric_obj and
    :ref:custom-obj-metric for more information, along with the end note for
    function signatures.

booster : Optional[str], default=None

tree_method : Optional[str], default=None
    Specify which tree method to use. Default to auto. If this parameter is set to
    default, XGBoost will choose the most conservative option available. It's
    recommended to study this option from the parameters document
    :doc:tree method </treemethod>.

n_jobs : Optional[int], default=None
    Number of parallel threads used to run xgboost. When used with other
    Scikit-Learn algorithms like grid search, you may choose which algorithm to
    parallelize and balance the threads. Creating thread contention will
    significantly slow down both algorithms.

gamma : Optional[float], default=None
    (min_split_loss) Minimum loss reduction required to make a further partition on
    a leaf node of the tree.

min_child_weight : Optional[float], default=None
    Minimum sum of instance weight (hessian) needed in a child.

max_delta_step : Optional[float], default=None
    Maximum delta step we allow each tree's weight estimation to be.

subsample : Optional[float], default=None
    Subsample ratio of the training instances.

sampling_method : Optional[str], default=None
    Sampling method. Used only by the GPU version of hist tree method.

    - uniform: Select random training instances uniformly.
    - gradient_based: Select random training instances with higher probability
      when the gradient and hessian are larger. (cf. CatBoost)

colsample_bytree : Optional[float], default=None
    Subsample ratio of columns when constructing each tree.

colsample_bylevel : Optional[float], default=None
    Subsample ratio of columns for each level.

colsample_bynode : Optional[float], default=None
    Subsample ratio of columns for each split.

reg_alpha : Optional[float], default=None
    L1 regularization term on weights (xgb's alpha).

reg_lambda : Optional[float], default=None
    L2 regularization term on weights (xgb's lambda).

scale_pos_weight : Optional[float], default=None
    Balancing of positive and negative weights.

base_score : Optional[float], default=None
    The initial prediction score of all instances, global bias.

random_state : Union[numpy.random.mtrand.RandomState, numpy.random._generator.Generator, int, NoneType], default=None
    Random number seed.

    .. note::

       Using gblinear booster with shotgun updater is nondeterministic as
       it uses Hogwild algorithm.

missing : float, default=nan
    Value in the data which needs to be present as a missing value. Default to
    :py:data:numpy.nan.

num_parallel_tree : Optional[int], default=None

monotone_constraints : Union[Dict[str, int], str, NoneType], default=None
    Constraint of variable monotonicity. See :doc:tutorial </tutorials/monotonic>
    for more information.

interaction_constraints : Union[str, Sequence[Sequence[str]], NoneType], default=None
    Constraints for interaction representing permitted interactions. The
    constraints must be specified in the form of a nested list, e.g. [[0, 1], [2, 3, 4]],
    where each inner list is a group of indices of features that are allowed to
    interact with each other. See :doc:tutorial </tutorials/feature_interaction_constraint>
    for more information.

importance_type : Optional[str], default=None

device : Optional[str], default=None
    .. versionadded:: 2.0.0

    Device ordinal, available options are cpu, cuda, and gpu.

validate_parameters : Optional[bool], default=None
    Give warnings for unknown parameters.

enable_categorical : bool, default=False
    See the same parameter of :py:class:DMatrix for details.

feature_types : Optional[Sequence[str]], default=None
    .. versionadded:: 1.7.0

    Used for specifying feature types without constructing a dataframe. See
    :py:class:DMatrix for details.

max_cat_to_onehot : Optional[int], default=None
    .. versionadded:: 1.6.0

    .. note:: This parameter is experimental

    A threshold for deciding whether XGBoost should use one-hot encoding based split
    for categorical data. When the number of categories is less than the threshold,
    one-hot encoding is chosen; otherwise the categories will be partitioned into
    children nodes. Also, enable_categorical needs to be set to have categorical
    feature support. See :doc:Categorical Data </tutorials/categorical> and
    :ref:cat-param for details.

max_cat_threshold : Optional[int], default=None
    .. versionadded:: 1.7.0

    .. note:: This parameter is experimental

    Maximum number of categories considered for each split. Used only by
    partition-based splits for preventing over-fitting. Also, enable_categorical
    needs to be set to have categorical feature support. See
    :doc:Categorical Data </tutorials/categorical> and :ref:cat-param for details.

multi_strategy : Optional[str], default=None
    .. versionadded:: 2.0.0

    .. note:: This parameter is a work in progress.

    The strategy used for training multi-target models, including multi-target
    regression and multi-class classification. See :doc:/tutorials/multioutput for
    more information.

    - one_output_per_tree: One model for each target.
    - multi_output_tree: Use multi-target trees.

eval_metric : Union[str, List[str], Callable, NoneType], default=None
    .. versionadded:: 1.6.0

    Metric used for monitoring the training result and early stopping. It can be a
    string or list of strings as names of predefined metrics in XGBoost (see
    doc/parameter.rst), one of the metrics in :py:mod:sklearn.metrics, or any
    other user defined metric that looks like sklearn.metrics.

    If custom objective is also provided, then custom metric should implement the
    corresponding reverse link function.

    Unlike the scoring parameter commonly used in scikit-learn, when a callable
    object is provided, it's assumed to be a cost function and by default XGBoost
    will minimize the result during early stopping.

    For advanced usage on early stopping, like directly choosing to maximize
    instead of minimize, see :py:obj:xgboost.callback.EarlyStopping.

    See :doc:/tutorials/custom_metric_obj and :ref:custom-obj-metric for more
    information.

    .. code-block:: python

        import xgboost as xgb
        from sklearn.datasets import load_diabetes
        from sklearn.metrics import mean_absolute_error

        X, y = load_diabetes(return_X_y=True)
        reg = xgb.XGBRegressor(
            tree_method="hist",
            eval_metric=mean_absolute_error,
        )
        reg.fit(X, y, eval_set=[(X, y)])

early_stopping_rounds : Optional[int], default=None
    .. versionadded:: 1.6.0

    - Activates early stopping. Validation metric needs to improve at least once in
      every early_stopping_rounds round(s) to continue training. Requires at
      least one item in eval_set in :py:meth:fit.

    - If early stopping occurs, the model will have two additional attributes:
      :py:attr:best_score and :py:attr:best_iteration. These are used by the
      :py:meth:predict and :py:meth:apply methods to determine the optimal
      number of trees during inference. If users want to access the full model
      (including trees built after early stopping), they can specify the
      iteration_range in these inference methods. In addition, other utilities
      like model plotting can also use the entire model.

    - If you prefer to discard the trees after best_iteration, consider using the
      callback function :py:class:xgboost.callback.EarlyStopping.

    - If there's more than one item in eval_set, the last entry will be used for
      early stopping. If there's more than one metric in eval_metric, the last
      metric will be used for early stopping.
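
    A minimal sketch of early stopping with the scikit-learn interface (a
    single-machine XGBRegressor and an illustrative validation split, mirroring
    the eval_metric example above):

    .. code-block:: python

        import xgboost as xgb
        from sklearn.datasets import load_diabetes
        from sklearn.model_selection import train_test_split

        X, y = load_diabetes(return_X_y=True)
        X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)
        reg = xgb.XGBRegressor(
            n_estimators=500,
            early_stopping_rounds=10,  # stop after 10 rounds without improvement
        )
        reg.fit(X_train, y_train, eval_set=[(X_valid, y_valid)])
        print(reg.best_iteration, reg.best_score)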

callbacks : Optional[List[xgboost.callback.TrainingCallback]], default=None
    List of callback functions that are applied at the end of each iteration.
    It is possible to use predefined callbacks by using
    :ref:Callback API <callback_api>.

    .. note::

       States in callback are not preserved during training, which means callback
       objects can not be reused for multiple training sessions without
       reinitialization or deepcopy.

    .. code-block:: python

        for params in parameters_grid:
            # be sure to (re)initialize the callbacks before each run
            callbacks = [xgb.callback.LearningRateScheduler(custom_rates)]
            reg = xgb.XGBRegressor(**params, callbacks=callbacks)
            reg.fit(X, y)

kwargs : Any
    Keyword arguments for XGBoost Booster object. Full documentation of parameters
    can be found :doc:here </parameter>.
    Attempting to set a parameter via the constructor args and **kwargs
    dict simultaneously will result in a TypeError.

    .. note:: **kwargs unsupported by scikit-learn

       **kwargs is unsupported by scikit-learn. We do not guarantee
       that parameters passed via this argument will interact properly
       with scikit-learn.

Returns
    None
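
For context, a hedged usage sketch of how this class is typically plugged into the distributed forecasting workflow. The import paths, the generate_daily_series helper, and the chosen arguments are assumptions for illustration and may differ across mlforecast versions; check the mlforecast documentation for the exact API.

.. code-block:: python

    import dask.dataframe as dd
    from dask.distributed import Client
    from mlforecast.distributed import DistributedMLForecast  # assumed import path
    from mlforecast.distributed.models.dask.xgb import DaskXGBForecast  # assumed import path
    from mlforecast.utils import generate_daily_series  # assumed helper for synthetic data

    client = Client()  # connect to (or start) a Dask cluster

    # Synthetic daily series with unique_id, ds, y columns, partitioned for Dask.
    series = generate_daily_series(n_series=10, min_length=100, max_length=200)
    series_ddf = dd.from_pandas(series, npartitions=2)

    fcst = DistributedMLForecast(
        models=[DaskXGBForecast(n_estimators=100, learning_rate=0.1)],
        freq="D",
        lags=[7, 14],
    )
    fcst.fit(series_ddf)
    preds = fcst.predict(14)  # the fitted model_ is shipped to the workers here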