Package 'manymodelr' reference manual

Title:	Build and Tune Several Models
Description:	Frequently one needs a convenient way to build and tune several models in one go. The goal is to provide a number of machine learning convenience functions. It provides the ability to build, tune and obtain predictions of several models in one function. The models are built using functions from 'caret' with easier to read syntax. Kuhn (2014) <doi:10.48550/arXiv.1405.6974>.
Authors:	Nelson Gonzabato [aut, cre]
Maintainer:	Nelson Gonzabato <[email protected]>
License:	GPL-2
Version:	0.3.9
Built:	2025-03-20 02:31:48 UTC
Source:	https://github.com/nelson-gon/manymodelr

Add predictions to the data set

Description

A dplyr compatible way to add predictions to a data set.

Usage

add_model_predictions(model = NULL, old_data = NULL, new_data = NULL)

Arguments

`model`	A model object from 'fit_model'
`old_data`	The data set to which predicted values will be added.
`new_data`	The data set to use for predicting.

Value

A data.frame object with a new column for predicted values

Examples

data("yields", package="manymodelr")
yields1 <- yields[1:50,]
yields2<- yields[51:100,]
lm_model <- fit_model(yields1,"weight","height","lm") 
head(add_model_predictions(lm_model,yields1,yields2))
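
For intuition, the same idea can be sketched in base R. This is only an illustration under assumptions, not the package's implementation: it uses the built-in mtcars data, and the column name predicted is a hypothetical choice.

```r
# Base-R sketch: fit a model on one block of rows, predict for another
# block, and attach the predictions as a new column.
old_data <- mtcars[1:16, ]
new_data <- mtcars[17:32, ]
fit <- lm(mpg ~ wt, data = old_data)

# 'predicted' is a hypothetical column name chosen for this sketch.
with_preds <- old_data
with_preds$predicted <- unname(predict(fit, newdata = new_data))
head(with_preds)
```

Both blocks have the same number of rows here, which is what makes the column assignment valid.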

Add model residuals

Description

A dplyr compatible convenience function to add residuals to a data set

Usage

add_model_residuals(model = NULL, old_data = NULL)

Arguments

`model`	A model object from 'fit_model'
`old_data`	The data set to which residuals will be added.

Value

A data.frame object with residuals added.

Examples

data("yields", package="manymodelr")
yields1 <- yields[1:50,]
yields2 <- yields[51:100,]
lm_model <- fit_model(yields1,"weight","height","lm") 
head(add_model_residuals(lm_model, yields2))

A convenient way to perform grouped operations

Description

This function performs operations by grouping the data.

Usage

agg_by_group(data_set = NULL, my_formula = NULL, func = NULL, ...)

Arguments

`data_set`	The data set on which the grouped operation is performed
`my_formula`	A formula such as A~B where B is the grouping variable (normally a factor). See examples below.
`func`	The kind of operation, e.g. sum, mean, min, max, manymodelr::get_mode
`...`	Other arguments to 'aggregate'; see ?aggregate for details.

Value

A grouped data.frame object with results of the chosen operation.

Examples

head(agg_by_group(airquality,.~Month,sum))
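
Since extra arguments are documented as being passed to 'aggregate', the formula interface can be illustrated with base R directly. This is a sketch of the underlying idea only; the name monthly_sums is made up for the example.

```r
# aggregate() with a formula: '.' stands for all other columns,
# Month is the grouping variable, and sum is applied within each group.
# Note: the formula method drops rows containing NA by default.
monthly_sums <- aggregate(. ~ Month, data = airquality, FUN = sum)
head(monthly_sums)
```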

Drops non numeric columns from a data.frame object

Description

Drops non numeric columns from a data.frame object

Usage

drop_non_numeric(df)

Arguments

`df`	A data.frame object for which non-numeric columns will be dropped

Examples

drop_non_numeric(data.frame(A=1:2, B=c("A", "B")))

Extract important model attributes

Description

Provides a convenient way to extract any kind of model information from common model objects

Usage

extract_model_info(model_object = NULL, what = NULL, ...)

Arguments

`model_object`	A model object, for example a linear model, generalized linear model, or analysis of variance object.
`what`	Character. The attribute you would like to obtain, for instance "p_value".
`...`	Arguments to other functions e.g. AIC, BIC, deviance etc

Details

This provides a convenient way to extract model information for any kind of model. For linear models, one can extract such attributes as coefficients, p value ("p_value"), standard error ("std_err"), estimate, t value ("t_value"), residuals, aic and other known attributes. For analysis of variance (aov) models, further attributes are available, such as sum of squares ("ssq"), mean squares ("msq"), degrees of freedom ("df") and "p_value".

Examples

# perform analysis of variance
data("yields", package="manymodelr")
aov_mod <- fit_model(yields, "weight","height + normal","aov")
extract_model_info(aov_mod, "ssq")
extract_model_info(aov_mod, c("ssq","predictors"))
# linear regression
lm_model <-fit_model(yields, "weight","height","lm")
extract_model_info(lm_model,c("aic","bic"))
## glm
glm_model <- fit_model(yields, "weight","height","glm")
extract_model_info(glm_model,"aic")
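
To show where such attributes live in ordinary model objects, here is a base-R sketch using the built-in mtcars data. It is not the package's code; it only illustrates the kind of lookups a helper like this saves, and all variable names are chosen for the example.

```r
# Linear model: coefficient table from summary(), plus AIC and BIC.
fit <- lm(mpg ~ wt, data = mtcars)
coefs <- summary(fit)$coefficients   # Estimate, Std. Error, t value, Pr(>|t|)
p_values <- coefs[, "Pr(>|t|)"]
std_err <- coefs[, "Std. Error"]
info <- c(aic = AIC(fit), bic = BIC(fit))

# Analysis of variance: sums of squares and mean squares from the aov table.
aov_fit <- aov(mpg ~ factor(cyl), data = mtcars)
aov_tab <- summary(aov_fit)[[1]]
ssq <- aov_tab[["Sum Sq"]]
msq <- aov_tab[["Mean Sq"]]
```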

Fit and predict in a single function.

Description

Fit and predict in a single function.

Usage

fit_model(
  df = NULL,
  yname = NULL,
  xname = NULL,
  modeltype = NULL,
  drop_non_numeric = FALSE,
  ...
)

Arguments

`df`	A data.frame object
`yname`	The outcome variable
`xname`	The predictor variable(s)
`modeltype`	A character specifying the model type, e.g. lm for a linear model
`drop_non_numeric`	Should non-numeric columns be dropped? Defaults to FALSE
`...`	Other arguments to specific model types.

Examples

data("yields", package="manymodelr")
fit_model(yields,"height","weight","lm")
fit_model(yields, "weight","height + I(yield)**2","lm")
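
One plausible way string arguments like these could be turned into a model call is sketched below in base R. This is an assumption for illustration, not a description of how fit_model works internally; the variable names mirror the documented arguments and the data is the built-in mtcars.

```r
# Build a formula from character inputs and dispatch on the model type.
yname <- "mpg"
xname <- "wt + hp"
modeltype <- "lm"

fml <- as.formula(paste(yname, "~", xname))
# do.call() accepts the function name as a string.
fit <- do.call(modeltype, list(formula = fml, data = mtcars))
coef(fit)
```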

Fit several models with different response variables

Description

Fit several models with different response variables

Usage

fit_models(
  df = NULL,
  yname = NULL,
  xname = NULL,
  modeltype = NULL,
  drop_non_numeric = FALSE,
  ...
)

Arguments

`df`	A data.frame object
`yname`	The outcome variable
`xname`	The predictor variable(s)
`modeltype`	A character specifying the model type, e.g. lm for a linear model
`drop_non_numeric`	Should non-numeric columns be dropped? Defaults to FALSE
`...`	Other arguments to specific model types.

Value

A list of model objects that can be used later.

Examples

data("yields", package="manymodelr")
fit_models(df=yields,yname=c("height","yield"),xname="weight",modeltype="lm")
#many model types
fit_models(df=yields,yname=c("height","yield"),xname="weight",
modeltype=c("lm", "glm"))

A pipe friendly way to get summary stats for exploratory data analysis

Description

A pipe friendly way to get summary stats for exploratory data analysis

Usage

get_data_Stats(
  x = NULL,
  func = NULL,
  exclude = NULL,
  na.rm = FALSE,
  na_action = NULL,
  ...
)

get_stats(
  x = NULL,
  func = NULL,
  exclude = NULL,
  na.rm = FALSE,
  na_action = NULL,
  ...
)

Arguments

`x`	The data for which stats are required
`func`	The function to apply, e.g. mean
`exclude`	What kind of data should be excluded? Use for example c("character","factor") to drop character and factor columns.
`na.rm`	Logical. Should NAs be removed? Defaults to FALSE.
`na_action`	If na.rm is set to TRUE, this uses na_replace to replace missing values.
`...`	Other arguments to na_replace. See ?na_replace for details.

Details

A convenient wrapper especially useful for get_mode

Value

A data.frame object showing the requested stats

Examples

head(get_data_Stats(airquality,mean,na.rm = TRUE,na_action = "get_mode"))
get_stats(airquality,mean,"non_numeric",na.rm = TRUE,na_action = "get_mode")

Get the exponent of any number or numbers

Description

Get the exponent of any number or numbers

Usage

get_exponent(y = NULL, x = NULL)

Arguments

`y`	The number or numeric columns for which an exponent is required
`x`	The power to which y is raised

Details

Depends on the expo and expo1 functions in expo

Value

A data.frame object showing the value, power and result

Examples

df<-data.frame(A=c(1123,25657,3987))
get_exponent(df,3)
get_exponent(1:5, 2)

A convenience function that returns the mode

Description

A convenience function that returns the mode

Usage

get_mode(x, na.rm = TRUE)

Arguments

`x`	The dataframe or vector for which the mode is required.
`na.rm`	Logical. Should 'NA's be dropped? Defaults to 'TRUE'

Details

Useful when used together with get_stats in a pipe fashion. These functions are for exploratory data analysis. The smallest number is returned if there is a tie in values. The function is currently slow for more than 300,000 rows: it may take up to a minute and may work with inaccuracies. By default, NAs are discarded.

Value

a data.frame or vector showing the mode of the variable(s)

Examples

test<-c(1,2,3,3,3,3,4,5)
test2<-c(455,7878,908981,NA,456,455,7878,7878,NA)
get_mode(test)
get_mode(test2)
## Not run: 
mtcars %>%
get_data_Stats(get_mode)
get_data_Stats(mtcars,get_mode)
## End(Not run)
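
The tie-breaking rule described in Details (the smallest value wins) can be sketched in base R as follows. The function mode_sketch is hypothetical and written only for this illustration; it is not the package's get_mode.

```r
# Hypothetical helper: most frequent value, smallest value on ties.
mode_sketch <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  ux <- sort(unique(x))                      # candidates in ascending order
  counts <- tabulate(match(x, ux), nbins = length(ux))
  ux[which.max(counts)]                      # which.max() picks the first maximum
}

mode_sketch(c(1, 2, 3, 3, 3, 3, 4, 5))       # 3
mode_sketch(c(2, 2, 1, 1))                   # tie between 1 and 2, returns 1
```

Because the candidates are sorted before counting, taking the first maximum is what yields the smallest value on a tie.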

Helper function to easily access elements

Description

Helper function to easily access elements

Usage

get_this(where = NULL, what = NULL)

Arguments

`where`	Where do you want to get it from? Currently only supports 'list' and 'data.frame' objects.
`what`	What do you want to extract from the 'data.frame' or 'list'? No quotes. See examples below.

Details

This is a helper function useful if you would like to extract data from the output of 'multi_model_1'.

Examples

my_list<-list(list(A=520),list(B=456,C=567))
get_this(what="A",my_list)
get_this(my_list,"C")
# use values
get_this(my_list, "B")

Get correlations between variables

Description

This function returns the correlations between different variables.

Usage

get_var_corr(
  df,
  comparison_var = NULL,
  other_vars = NULL,
  method = "pearson",
  drop_columns = c("factor", "character"),
  ...
)

Arguments

`df`	The data set for which correlations are required
`comparison_var`	The variable to compare to
`other_vars`	Variables for which correlation with comparison_var is required. If not supplied, all variables will be used.
`method`	The method used to perform the correlation test as defined in 'cor.test'. Defaults to "pearson".
`drop_columns`	A character vector specifying column classes to drop. Defaults to c("factor","character")
`...`	Other arguments to 'cor.test'; see ?cor.test for details.

Value

A data.frame object containing correlations between comparison_var and each of other_vars

Examples

# Get correlations between all variables
get_var_corr(mtcars,"mpg")
# Use only a few variables
get_var_corr(mtcars,"mpg", other_vars = c("disp","drat"), method = "kendall",exact=FALSE)
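
Because the method argument is defined in terms of 'cor.test', the underlying computation can be sketched with base R. This is an illustration only: the columns comparison_var, other_var and correlation follow names that appear in this manual, while p.value and the overall layout are assumptions.

```r
# Correlate one comparison variable against a few others with cor.test().
comparison_var <- "mpg"
other_vars <- c("disp", "drat")

rows <- lapply(other_vars, function(v) {
  ct <- cor.test(mtcars[[comparison_var]], mtcars[[v]], method = "pearson")
  data.frame(comparison_var = comparison_var,
             other_var = v,
             p.value = ct$p.value,
             correlation = unname(ct$estimate))
})
corr_df <- do.call(rbind, rows)
corr_df
```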

Get correlations for combinations

Description

Get correlations for combinations

Usage

get_var_corr_(
  df,
  subset_cols = NULL,
  drop_columns = c("character", "factor"),
  ...
)

Arguments

`df`	A 'data.frame' object for which correlations are required in combinations.
`subset_cols`	A 'list' of length 2. The values in the list correspond to the comparison_var and other_vars arguments in 'get_var_corr'. See examples below.
`drop_columns`	A character vector specifying column classes to drop. Defaults to c("factor","character")
`...`	Other arguments to 'get_var_corr'

Details

This function extends get_var_corr by providing an opportunity to get correlations for combinations of variables. It is currently slow and may take up to a minute depending on system specifications.

Value

A data.frame object with combinations.

Examples

get_var_corr_(mtcars,method="pearson")
#use only a subset of the data.
get_var_corr_(mtcars,
              subset_cols = list(c("mpg","vs"),
                                 c("disp","wt")),
              method="spearman",exact=FALSE)

Simultaneously train and predict on new data.

Description

This function provides a convenient way to train several model types. It allows a user to predict on new data and depending on the metrics, the user is able to decide which model predictions to finally use. The models are built based on Max Kuhn's models in the caret package.

Usage

multi_model_1(
  old_data,
  yname,
  xname,
  method = NULL,
  metric = NULL,
  control = NULL,
  new_data = NULL,
  ...
)

Arguments

`old_data`	The data holding the training dataset
`yname`	The outcome variable
`xname`	The predictor variable(s)
`method`	A vector containing methods to be used as defined in the caret package
`metric`	One of several metrics, e.g. Accuracy, RMSE, MAE.
`control`	See caret ?trainControl for details.
`new_data`	A data set to validate the model or for which predictions are required
`...`	Other arguments to caret's train function

Details

Most of the details of the parameters can be found in the caret package documentation. This function is meant to help in exploratory analysis to make an informed choice of the best models

Value

A list containing two objects: a tibble containing a summary of the metrics per model, and a tibble containing predicted values and information concerning the model.

References

Kuhn (2014), "Futility Analysis in the Cross-Validation of Machine Learning Models", http://arxiv.org/abs/1405.6974

Kuhn (2008), "Building Predictive Models in R Using the caret Package", http://www.jstatsoft.org/article/view/v028i05/v28i05.pdf

Examples

library(caret) # provides createDataPartition() and trainControl()
data("yields", package="manymodelr")
# set the seed before the random partition so the split is reproducible
set.seed(233)
train_set<-createDataPartition(yields$normal,p=0.8,list=FALSE)
valid_set<-yields[-train_set,]
train_set<-yields[train_set,]
ctrl<-trainControl(method="cv",number=5)
m<-multi_model_1(train_set,"normal",".",c("knn","rpart"),
"Accuracy",ctrl,new_data =valid_set)
m$Predictions
m$Metrics
m$modelInfo

Fit and predict in one function

Description

Fit and predict in one function

Usage

multi_model_2(old_data, new_data, yname, xname, modeltype, ...)

Arguments

`old_data`	The data set to which predicted values will be added.
`new_data`	The data set to use for predicting.
`yname`	The outcome variable
`xname`	The predictor variable(s)
`modeltype`	A character specifying the model type e.g lm for linear model
`...`	Other arguments to specific model types.

Examples

# fit a linear model and get predictions
multi_model_2(iris[1:50,],iris[50:99,],"Sepal.Length","Petal.Length","lm")
# multilinear
multi_model_2(iris[1:50,],iris[50:99,],"Sepal.Length",
    "Petal.Length + Sepal.Width","lm")
# glm
multi_model_2(iris[1:50,],iris[50:99,],"Sepal.Length","Petal.Length","glm")

Replace missing values

Description

Replace missing values

Usage

na_replace(df, how = NULL, value = NULL)

Arguments

`df`	The data set (data.frame or vector) for which replacements are required
`how`	How should missing values be replaced? One of ffill, samples, value or any other known method, e.g. mean, median, max, min. The default is NULL, meaning no imputation is done. For character vectors, the use of 'get_mode' is also supported. There is no implementation for class factor (yet).
`value`	If how is set to value, this allows the user to provide a specific fill value for the NAs.

Details

This function currently does not support grouping although this may be achieved with some inaccuracies using grouping functions from other packages.

Value

A data.frame object with missing values replaced.

Examples

head(na_replace(airquality,how="value", value="Missing"))
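
Two of the replacement strategies mentioned above (a fixed value and a summary statistic such as the mean) can be sketched in base R. The helpers fill_value and fill_mean are hypothetical names for this illustration and are not part of the package.

```r
# Hypothetical helpers: replace NAs with a fixed value or the column mean.
fill_value <- function(x, value) { x[is.na(x)] <- value; x }
fill_mean  <- function(x) { x[is.na(x)] <- mean(x, na.rm = TRUE); x }

aq <- airquality
aq$Ozone   <- fill_mean(aq$Ozone)        # mean imputation
aq$Solar.R <- fill_value(aq$Solar.R, 0)  # fixed fill value
head(aq)
```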

Replace NAs by group

Description

A convenient way to replace NAs by group.

Usage

na_replace_grouped(df, group_by_cols = NULL, ...)

Arguments

`df`	A data.frame object for which grouped NA replacement is desired.
`group_by_cols`	The column(s) to use for the grouping.
`...`	Other arguments to 'na_replace'

Value

A 'data.frame' object with 'NA's replaced.

Examples

test2 <- data.frame(A=c("A","A","A","B","B","B"),
B=c(NA,5,2,2,NA,2))
head(na_replace_grouped(test2,"A",how="value","Replaced"))
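
Grouped replacement can also be pictured with base R's ave(), which applies a function within each group. This sketch fills NAs with the group mean; it illustrates the concept rather than the package's implementation, and grouped_df is a made-up name.

```r
# Fill NAs in column B with the mean of B within each level of A.
grouped_df <- data.frame(A = c("A", "A", "A", "B", "B", "B"),
                         B = c(NA, 5, 2, 2, NA, 2))
grouped_df$B <- ave(grouped_df$B, grouped_df$A, FUN = function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
})
grouped_df
```

Here the NA in group "A" becomes 3.5 (the mean of 5 and 2) and the NA in group "B" becomes 2.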

Plot a correlations matrix

Description

This function plots the results produced by 'get_var_corr_'.

Usage

plot_corr(
  df,
  x = "comparison_var",
  y = "other_var",
  xlabel = "comparison_variable",
  ylabel = "other_variable",
  title = "Correlations Plot",
  plot_style = "circles",
  title_just = 0.5,
  round_which = NULL,
  colour_by = NULL,
  decimals = 2,
  show_which = "corr",
  size = 12.6,
  value_angle = 360,
  shape = 16,
  value_size = 3.5,
  value_col = "black",
  width = 1.1,
  custom_cols = c("indianred2", "green2", "gray34"),
  legend_labels = waiver(),
  legend_title = NULL,
  signif_cutoff = 0.05,
  signif_size = 7,
  signif_col = "gray13",
  ...
)

Arguments

`df`	The data to be plotted. A 'data.frame' object produced by 'get_var_corr_'
`x`	Value for the x axis. Defaults to "comparison_var"
`y`	Values for the y axis. Defaults to "other_var".
`xlabel`	label for the x axis
`ylabel`	label for the y axis
`title`	plot title.
`plot_style`	One of squares and circles (currently).
`title_just`	Justification of the title. Defaults to 0.5, title is centered.
`round_which`	Character. The column name to be rounded off.
`colour_by`	The column to use for coloring. Defaults to "correlation". Colour strength thus indicates the strength of correlations.
`decimals`	Numeric. To how many decimal places should the rounding be done? Defaults to 2.
`show_which`	Character. One of either corr or signif to control whether to show the correlation values or significance stars of the correlations. This is case sensitive and defaults to corr i.e. correlation values are shown.
`size`	Size of the circles for plot_style set to circles
`value_angle`	What angle should the text be?
`shape`	Values for the shape if plot_style is circles
`value_size`	Size of the text.
`value_col`	What colour should the text in the squares/circles be?
`width`	width value for plot_style set to squares.
`custom_cols`	A vector(length 2) of colors to use for the plot. The first colour specifies the lower end of the correlations. The second specifies the higher end.
`legend_labels`	Text to use for the legend labels. Defaults to the default labels produced by the plot method.
`legend_title`	Title to use for the legend.
`signif_cutoff`	Numeric. If show_which is set to signif, this defines the cutoff point for significance. Defaults to 0.05.
`signif_size`	Numeric. Defines size of the significance stars.
`signif_col`	Character. Defines the col for the significance stars.
`...`	Other arguments to get_var_corr_

Details

This function uses 'ggplot2' backend. 'ggplot2' is thus required for the plots to work. Since the correlations are obtained by 'get_var_corr_', the default is to omit correlation between a variable and itself. Therefore blanks in the plot would indicate a correlation of 1.

Value

A 'ggplot2' object showing the correlations plot.

Examples

plot_corr(mtcars,show_which = "corr",
round_values = TRUE,
round_which = "correlation",decimals = 2,x="other_var",
y="comparison_var",plot_style = "circles",width = 1.1,
custom_cols = c("green","blue","red"),colour_by = "correlation")

Create a simplified report of a model's summary

Description

Create a simplified report of a model's summary

Usage

report_model(model_object = NULL, response_name = "Score")

Arguments

`model_object`	A model object
`response_name`	Name of the response variable. Defaults to "Score".

Value

A data.frame object showing a simple model report that includes the effect of each predictor variable on the response.

Examples

models<-fit_models(df=yields,yname=c("height","yield"),xname="weight",
modeltype=c("lm", "glm"))
report_model(models[[2]][[1]])

Get row differences between values

Description

This function returns the differences between rows depending on the user's choice.

Usage

rowdiff(
  df,
  direction = "forward",
  exclude = NULL,
  na.rm = FALSE,
  na_action = NULL,
  ...
)

Arguments

`df`	The data set for which differences are required
`direction`	One of forward and reverse. The default is forward, meaning that the difference between the current value and the next is returned.
`exclude`	A character vector specifying what classes should be removed. See examples below.
`na.rm`	Logical. Should missing values be removed? The missing values referred to are those introduced during the calculation, i.e. when subtracting a row with itself. Defaults to FALSE.
`na_action`	If na.rm is TRUE, how should missing values be replaced? Depending on the value as set out in 'na_replace', the value can be replaced as per the user's requirement.
`...`	Other arguments to 'na_replace'.

Value

A data.frame object of row differences

Examples

# Remove factor columns
data("yields", package="manymodelr")
rowdiff(yields,exclude = "factor",direction = "reverse")
rowdiff(yields[1:5,], exclude="factor", na.rm = TRUE, 
na_action = "get_mode",direction = "reverse")
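
The row-difference idea can be illustrated with base R's diff(). This sketch subtracts the previous row from the current one and pads the first row with NA; whether that ordering corresponds to rowdiff's forward or reverse direction is not asserted here, and the toy data frame is made up.

```r
# Column-wise differences between consecutive rows, padded with NA
# so the result has the same number of rows as the input.
toy <- data.frame(a = c(1, 4, 9), b = c(2, 2, 5))
row_diffs <- as.data.frame(lapply(toy, function(col) c(NA, diff(col))))
row_diffs
```

The NA in the first row is the kind of calculation-introduced missing value that the na.rm and na_action arguments are documented to handle.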

A convenient selector gadget

Description

A convenient selector gadget

Usage

select_col(df, ...)

Arguments

`df`	The data set from which to select a column
`...`	columns to select, no quotes

Details

A friendly way to select a column or several columns. Mainly for non-pipe usage. It is recommended to use known select functions to do pipe manipulations. Otherwise convert to tibble.

Value

Returns a dataframe with selected columns

Examples

select_col(yields,height,weight,normal)
# A pipe friendly example
## Not run: 
library(dplyr)
as_tibble(yields) %>%
select_col(height, weight, normal)

## End(Not run)

Get the row corresponding to a given percentile

Description

Get the row corresponding to a given percentile

Usage

select_percentile(df = NULL, percentile = NULL, descend = FALSE)

Arguments

`df`	A 'data.frame' object for which a percentile is required. Other data structures are not yet supported.
`percentile`	The percentile required, e.g. 10 for the 10th percentile
`descend`	Logical. Should the data be arranged in descending order? Defaults to FALSE.

Details

Returns the value corresponding to a percentile. Returns mean values if the position of the percentile is a whole number. Values are sorted in ascending order. You can change this by setting descend to TRUE.

Value

A dataframe showing the row corresponding to the required percentile.

Examples

data("yields", package="manymodelr")
select_percentile(yields,5)

Plant yields

Description

A simulated data set of plant yields, height, weight, and a binary class

Usage

yields

Author(s)

Nelson Gonzabato
