Monday, July 9, 2018

Basics of R- Session 11- Factor Analysis

Factor Analysis

library(psych)
library(Hmisc)

#import data set

factoranalysis<-read.csv("E:/2 presentation for class/R/1 R Open elective/data set/FACTOR ANALYSIS.csv")

# check the structure of the data set, here we are using only 10 Variables

# labeling the variables
label(factoranalysis$ï..Resp)<-"Respondent"
label(factoranalysis$X1)<-"refreshing"
label(factoranalysis$X2)<-"bad for health"
label(factoranalysis$X3)<-"very convenient to serve"
label(factoranalysis$X4)<-"avoided with age"
label(factoranalysis$X5)<-"very tasty"
label(factoranalysis$X6)<-"not good for children"
label(factoranalysis$X7)<-"consumed occasionally"
label(factoranalysis$X8)<-"not be taken in large quantity"
label(factoranalysis$X9)<-"not as good as energy drinks"
label(factoranalysis$X10)<-"better than fruit juices"
label(factoranalysis$S)<-"Recommending aerated drinks to others"

str(factoranalysis)

fix(factoranalysis)
View(factoranalysis)

#-----------------#
#--------------------#
#----------------------#
# here the library(psych) will be used

# bartlet test of speriocity 
# the file should contain only those variables for which we have to apply factor analysis
# remove those variables, which are not included

fix(factoranalysis)
factoranalysis<-factoranalysis[,c(-1,-12)]
fix(factoranalysis)

cortest.bartlett(factoranalysis)  # data set


#KMO test 
KMO(factoranalysis)

#-----------------#
#--------------------#
#----------------------#

#code for principal in factor analysis

principal(r, nfactors = 1, residuals = FALSE,rotate="varimax",n.obs=NA, covar=FALSE,
 scores=TRUE,missing=FALSE,impute="median",oblique.scores=TRUE,method="regression",...)

Rotation-->
"none", "varimax", "quartimax", "promax", "oblimin", "simplimax", and "cluster"

PCA2<-principal(factoranalysis)
PCA2$rotation
PCA2$values
PCA2$communality
PCA2$factors
PCA2$scores

# for screeplot
X11<-PCA2$values
Y11<-1:length(PCA2$values)
plot(X11,Y11, type="l")

# scree plot
VSS.scree(factoranalysis)

The column h2 is a measure of communalities, and u2 is uniqueness; 

Communalities refer to shared variance with the other items, while uniqueness is variance not explained by the other items, but that could be explained by the latent variable as well as measurement error. 



No comments:

Post a Comment