Run predictions for multiple votes
predict_votes.Rd
This function predicts the outcome of multiple votes based on a number of past vote results. It uses the machine learning models available in the caret package. To make results reproducible, call set.seed() before running the function.
Usage
predict_votes(
x,
traindata,
testdata = traindata,
method = "svmRadial",
trControl = NULL,
exclude_votes = TRUE,
geovars = c("gemeinde", "v_gemwkid"),
training_prop = NA,
...
)
Arguments
- x
Column names of the dependent variables.
- traindata
Data used to train the model. It must contain the dependent variables and the predictor columns.
- testdata
Dataset on which the prediction should be run. The data must contain all columns of the model's training data (model$trainingData).
- method
A string specifying which classification or regression model to use. Possible values can be listed with names(getModelInfo()). See http://topepo.github.io/caret/train-models-by-tag.html. A list of functions can also be passed for a custom model function. See http://topepo.github.io/caret/using-your-own-model-in-train.html for details.
- trControl
A list of values that define how this function acts. See trainControl and http://topepo.github.io/caret/using-your-own-model-in-train.html. (NOTE: If given, this argument must be named.)
- exclude_votes
If set to TRUE, the variables to be predicted are excluded from each other's models. This makes sense on a vote Sunday: because the counting processes differ, many of the votes in the data may still contain NAs and should therefore not be used as predictors. Defaults to TRUE.
- geovars
Variables containing labels and IDs of the spatial units.
- training_prop
Optional share of observations to keep, chosen at random, in the training data. The training dataset is built by dropping the complementary share of observations from the training data.
- ...
Optional parameters that can be passed to the caret::train() function.
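The effect of exclude_votes and training_prop can be pictured on a toy data frame. The following is only a minimal base R sketch under stated assumptions, not the function's actual implementation: the data, the column past1 and the helper objects (toy, target, predictors, keep, train_subset) are made up for illustration, and dropping geovars from the predictors is an assumption as well.

```r
# Toy data: two votes to predict (Eidg1, Kant1), one past vote (past1),
# and the spatial labels. All values are made up for illustration.
set.seed(42)
toy <- data.frame(
  gemeinde  = paste0("G", 1:10),
  v_gemwkid = 1:10,
  Eidg1     = runif(10, 20, 60),
  Kant1     = runif(10, 20, 60),
  past1     = runif(10, 20, 60)
)

x       <- c("Eidg1", "Kant1")
geovars <- c("gemeinde", "v_gemwkid")

# exclude_votes = TRUE, pictured: when modelling Eidg1, the other
# dependent variable (Kant1) is not used as a predictor.
# Assumption: the label/ID columns in geovars are not predictors either.
target     <- "Eidg1"
predictors <- setdiff(names(toy), c(x, geovars))
predictors

# training_prop = 0.8, pictured: keep a random 80% of the rows for
# training and drop the remaining 20%.
training_prop <- 0.8
keep          <- sample(nrow(toy), size = round(training_prop * nrow(toy)))
train_subset  <- toy[keep, c(target, predictors)]
dim(train_subset)
```

Because the rows are drawn at random, calling set.seed() first is what makes a run with training_prop repeatable.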
Examples
# Set seed for reproducibility
set.seed(42)
predict_votes(c("Eidg1", "Kant1"), vote_data)
#> # A tibble: 342 × 5
#>    gemeinde           v_gemwkid  pred  real vorlage
#>    <chr>                  <dbl> <dbl> <dbl> <chr>
#>  1 Adlikon                   21  23.4  21.5 Eidg1
#>  2 Adliswil                 131  47.4  48.3 Eidg1
#>  3 Aesch                    241  30.0  30.9 Eidg1
#>  4 Aeugst am Albis            1  33.2  31.5 Eidg1
#>  5 Affoltern am Albis         2  40.7  39.8 Eidg1
#>  6 Altikon                  211  28.9  29.8 Eidg1
#>  7 Andelfingen               30  33.0  32.1 Eidg1
#>  8 Bachenbülach              51  38.3  39.9 Eidg1
#>  9 Bachs                     81  31.4  30.5 Eidg1
#> 10 Bäretswil                111  32.6  33.3 Eidg1
#> # ℹ 332 more rows
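A further sketch, under stated assumptions, of how the method and trControl arguments might be combined. The 5-fold cross-validation setup is an illustrative choice, not a recommendation from this page, and the names ctrl, method and preds are hypothetical. The call to predict_votes() only runs if the function and vote_data are available in the session.

```r
library(caret)

# Resampling setup: 5-fold cross-validation (an illustrative choice).
ctrl <- trainControl(method = "cv", number = 5)

# Check that the chosen model code is known to caret before using it.
method <- "svmRadial"
stopifnot(method %in% names(getModelInfo()))

# Assumes predict_votes() and vote_data from the example above are
# available; trControl is passed by name, as the argument notes require.
if (exists("predict_votes") && exists("vote_data")) {
  set.seed(42)
  preds <- predict_votes(
    c("Eidg1", "Kant1"),
    vote_data,
    method    = method,
    trControl = ctrl
  )
  print(head(preds))
}
```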