The main TangledFeatures function
Usage
TangledFeatures(
  Data,
  Y_var,
  Focus_variables = list(),
  corr_cutoff = 0.85,
  RF_coverage = 0.95,
  plot = FALSE,
  fast_calculation = FALSE,
  cor1 = "pearson",
  cor2 = "polychoric",
  cor3 = "spearman"
)
Arguments
- Data
The input data frame
- Y_var
The name of the dependent variable in Data, given as a string (for example 'Sales')
- Focus_variables
A list of variables to bias toward in the correlation matrix. Defaults to an empty list
- corr_cutoff
The correlation cutoff. Defaults to 0.85
- RF_coverage
The share of Random Forest explainability that the selected variables should cover. Defaults to 0.95 (95 percent)
- plot
Whether plots are to be produced. Logical, TRUE or FALSE. Defaults to FALSE
- fast_calculation
If TRUE, returns the variable list without running many Random Forest iterations, by simply picking one variable from each correlated group. Defaults to FALSE
- cor1
The correlation metric between two continuous features. Defaults to Pearson correlation ("pearson")
- cor2
The correlation metric between one categorical feature and one continuous feature. Described here as defaulting to biserial correlation, although the signature under Usage sets cor2 = "polychoric"
- cor3
The correlation metric between two categorical features. Described here as defaulting to Cramer's V, although the signature under Usage sets cor3 = "spearman"
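To make the role of corr_cutoff more concrete, here is a minimal base R sketch of flagging feature pairs whose correlation exceeds a cutoff. It is not the package's implementation: the toy data, the use of absolute Pearson correlation, and the names toy, cor_mat and correlated_pairs are all assumptions made for illustration.

```r
# Toy data (hypothetical): x2 is nearly a copy of x1, x3 is independent noise.
set.seed(1)
x1 <- rnorm(200)
toy <- data.frame(
  x1 = x1,
  x2 = x1 + rnorm(200, sd = 0.1),
  x3 = rnorm(200)
)

# Same value as the default shown under Usage.
corr_cutoff <- 0.85

# Absolute Pearson correlations between all columns
# (taking the absolute value is an assumption of this sketch).
cor_mat <- abs(cor(toy, method = "pearson"))

# Look only at the upper triangle so each pair is reported once,
# then keep the pairs above the cutoff.
high <- which(cor_mat > corr_cutoff & upper.tri(cor_mat), arr.ind = TRUE)

correlated_pairs <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]],
  abs_cor = cor_mat[high]
)
print(correlated_pairs)
```

In this toy data only the x1 and x2 pair passes the cutoff, so the two would be treated as one correlated group, while x3 stays on its own. Lowering corr_cutoff would make grouping more aggressive.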
Examples
TangledFeatures(Data = TangledFeatures::Advertisement, Y_var = 'Sales')
#> Warning: length(LHS)==0; no columns to delete or assign RHS to.
#> $Final_Variables
#> [1] "tv" "radio"
#>
#> $Variable_groups
#> NULL
#>
#> $Correlation_heatmap
#> NULL
#>
#> $Graph_plot
#> NULL
#>
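The printed output above suggests that the function returns a named list. The following is a small sketch of storing that list and reading its elements, assuming the TangledFeatures package is installed. The element names are taken from the output above; the variable names res and selected are hypothetical, and the selected variables may differ between runs.

```r
library(TangledFeatures)

# Store the result instead of printing it. The cutoff values simply
# repeat the defaults shown under Usage.
res <- TangledFeatures(
  Data = TangledFeatures::Advertisement,
  Y_var = "Sales",
  corr_cutoff = 0.85,
  RF_coverage = 0.95,
  plot = FALSE
)

# The selected variables, printed as a character vector in the output above.
selected <- res$Final_Variables
print(selected)

# The plot-related elements are NULL in the output above, so guard
# against NULL before using them (assumes a printable plot object).
if (!is.null(res$Correlation_heatmap)) {
  print(res$Correlation_heatmap)
}
```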