xgboost.spark.SparkXGBRegressor that adds an
extract_local_model method to get a local version of the trained model
and broadcast it to the workers.
SparkXGBForecast
*SparkXGBRegressor is a PySpark ML estimator. It implements the XGBoost regression algorithm on top of the XGBoost Python library, and it can be used in a PySpark Pipeline and with PySpark ML meta-algorithms such as pyspark.ml.tuning.CrossValidator, pyspark.ml.tuning.TrainValidationSplit, and pyspark.ml.classification.OneVsRest.
SparkXGBRegressor automatically supports most of the parameters of the
xgboost.XGBRegressor constructor and most of the parameters
used in the xgboost.XGBRegressor.fit and
xgboost.XGBRegressor.predict methods.
To enable GPU support, set device to cuda or gpu.
SparkXGBRegressor doesn’t support setting base_margin explicitly, but it
supports another param called base_margin_col. See the doc below
for more details.
SparkXGBRegressor doesn’t support the validate_features and
output_margin params.
SparkXGBRegressor doesn’t support setting the nthread xgboost param.
Instead, the nthread param for each xgboost worker is set equal
to the spark.task.cpus config value.*
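
The parameter restrictions described above can be sketched as a small pre-check that runs before the estimator is built. This is only an illustration: `UNSUPPORTED`, `check_spark_xgb_params` and `make_regressor` are hypothetical names that do not exist in xgboost or in this library, and the list of rejected params is taken directly from the docstring rather than from the library's own validation. `make_regressor` assumes an xgboost build with PySpark support is installed, so it is defined but not called here.

```python
# Params the docstring above lists as unsupported by SparkXGBRegressor.
# The hints are paraphrased from that docstring (assumption: nothing else is rejected).
UNSUPPORTED = {
    "base_margin": "pass a column name through base_margin_col instead",
    "validate_features": "not supported",
    "output_margin": "not supported",
    "nthread": "set per worker from the spark.task.cpus config value",
}


def check_spark_xgb_params(params):
    """Hypothetical helper: raise if any param listed as unsupported is present."""
    bad = sorted(set(params) & set(UNSUPPORTED))
    if bad:
        details = "; ".join(f"{name} ({UNSUPPORTED[name]})" for name in bad)
        raise ValueError(f"unsupported SparkXGBRegressor params: {details}")
    return dict(params)


def make_regressor(**params):
    """Hypothetical helper: build the estimator after the pre-check.

    Requires xgboost with PySpark support; imported lazily so the
    pre-check above can be used without Spark installed.
    """
    from xgboost.spark import SparkXGBRegressor

    return SparkXGBRegressor(**check_spark_xgb_params(params))


# Params in the style the docstring describes: regular XGBRegressor
# constructor arguments, GPU training via device, and a margin column name.
params = check_spark_xgb_params(
    {
        "n_estimators": 50,
        "max_depth": 4,
        "device": "cuda",
        "base_margin_col": "margin",  # hypothetical column name
    }
)
```

With a Spark session available, `make_regressor(**params)` would return an estimator that can be fitted like any other PySpark ML estimator, while passing `nthread=4` would fail in the pre-check before any Spark job starts.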
