model_id: (Optional) Specify a custom name for the model to use as a reference. By default, H2O automatically generates a destination key.

training_frame: (Required) Specify the dataset used to build the model. NOTE: In Flow, if you click the Build a model button from the Parse cell, the training frame is entered automatically.

validation_frame: (Optional) Specify the dataset used to evaluate the model.

nfolds: Specify the number of folds for cross-validation. This value defaults to 0 (no cross-validation).

y: (Required) Specify the column to use as the dependent variable.

x: Specify a vector containing the names or indices of the predictor variables to use when building the model. If x is missing, then all columns except y are used.

ignored_columns: (Optional, Python and Flow only) Specify the column or columns to be excluded from the model. In Flow, click the checkbox next to a column name to add it to the list of columns excluded from the model. To add all columns, click the All button. To remove a column from the list of ignored columns, click the X next to the column name. To remove all columns from the list of ignored columns, click the None button. To search for a specific column, type the column name in the Search field above the column list. To only show columns with a specific percentage of missing values, specify the percentage in the Only show columns with more than 0% missing values field. To change the selections for the hidden columns, use the Select Visible or Deselect Visible buttons.

ignore_const_cols: Specify whether to ignore constant training columns, since no information can be gained from them. This option defaults to true (enabled).

ntrees: Specify the number of trees to build (defaults to 50).

max_depth: Specify the maximum tree depth. Higher values will make the model more complex and can lead to overfitting. Setting this value to 0 specifies no limit.

min_rows: Specify the minimum number of observations for a leaf.

nbins: (Numerical/real/int only) Specify the number of bins for the histogram to build, then split at the best point (defaults to 20).

nbins_cats: (Categorical/enums only) Specify the maximum number of bins for the histogram to build, then split at the best point. Higher values can lead to more overfitting. The levels are ordered alphabetically; if there are more levels than bins, adjacent levels share bins. This value has a more significant impact on model fitness than nbins. Larger values may increase runtime, especially for deep trees and large clusters, so tuning may be required to find the optimal value for your configuration.

seed: Specify the random number generator (RNG) seed for algorithm components dependent on randomization. The seed is consistent for each H2O instance so that you can create models with the same starting conditions in alternative configurations. This value defaults to -1 (time-based random number).

learn_rate: Specify the learning rate. The range is 0.0 to 1.0, and the default value is 0.1.

learn_rate_annealing: Specifies to reduce the learn_rate by this factor after every tree. So for N trees, GBM starts with learn_rate and ends with learn_rate * learn_rate_annealing^N. For example, instead of using learn_rate=0.01, you can now try learn_rate=0.05 and learn_rate_annealing=0.99. This method would converge much faster with almost the same accuracy. This value defaults to 1.

distribution: Specify the distribution (i.e., the loss function). The options are AUTO (default), bernoulli, multinomial, gaussian, poisson, gamma, laplace, quantile, huber, or tweedie. If the distribution is bernoulli, the response column must be 2-class categorical. If the distribution is quasibinomial, the response column must be numeric and binary. If the distribution is multinomial, the response column must be categorical. If the distribution is poisson, laplace, tweedie, gaussian, huber, gamma, or quantile, the response column must be numeric.
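
The learn_rate_annealing arithmetic described in this section can be illustrated with a short, library-free sketch. It only evaluates the formula learn_rate * learn_rate_annealing^N for the example values given in the text (learn_rate=0.05, learn_rate_annealing=0.99); the helper names annealed_rate and trees_until_below are hypothetical and are not part of the H2O API, and the sketch does not reproduce how H2O applies the rate internally.

```python
def annealed_rate(learn_rate: float, annealing: float, n_trees: int) -> float:
    """Effective learning rate after n_trees trees have been built.

    Implements learn_rate * annealing ** n_trees, the formula stated in
    the learn_rate_annealing description. Hypothetical helper, not H2O code.
    """
    return learn_rate * annealing ** n_trees


def trees_until_below(learn_rate: float, annealing: float, threshold: float) -> int:
    """Smallest number of trees after which the annealed rate is below threshold.

    Assumes 0 < annealing < 1 and threshold > 0, so the loop terminates.
    """
    n = 0
    while annealed_rate(learn_rate, annealing, n) >= threshold:
        n += 1
    return n


if __name__ == "__main__":
    # Example values taken from the text: start at 0.05 and anneal by 0.99.
    start, factor = 0.05, 0.99

    for n in (0, 50, 100, 200):
        print(f"after {n:3d} trees: rate = {annealed_rate(start, factor, n):.5f}")

    # How many trees until the annealed rate drops under the constant
    # learn_rate=0.01 that the text uses as the comparison point.
    print("trees until rate < 0.01:", trees_until_below(start, factor, 0.01))
```

With these values the annealed rate stays above the constant 0.01 rate for the first 160 trees, which is consistent with the text's claim that the annealed configuration converges faster early on. With learn_rate_annealing left at its default of 1, the rate never changes.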