
Lavaan&R

(2019-04-05 03:52:41)
Tags: research, miscellany
Category: 知无涯 (Knowledge Without Bounds)
Catalog: 3. notes on Bauer & Curran, 2019; 2. lavaan; 1. R functions

3[personal notes on Bauer & Curran, 2019]
Bauer, D.J. & Curran, P.J. (2019). Structural equation modeling: R demonstration notes (Version 2019.3). Curran-Bauer Analytics, Durham: NC. URL https://curranbauer.org/sem-r-notes-2019-3/
The PDF and the R code can be downloaded for free:
https://curranbauer.org/wp-content/uploads/2019/04/SEM-R-notes-2019-3.pdf

"visual =~ NA*visperc +c(a,a)*visperc. The NA modifier overrides the lavaan default to fix the first loading for each factor to one. These loadings will instead be estimated. "

"fit.1 <- sem(mod.1, sample.cov=mip.cov, sample.mean=mip.mns, sample.nobs=132,
meanstructure=TRUE, std.lv=TRUE)"#Chapter 5, data in summary format
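
The NA* modifier and the summary-format call quoted above can be tried together on lavaan's built-in HolzingerSwineford1939 data. This is only a minimal sketch: the indicators x1-x3 and the object names (hs_cov, hs_mns, hs_n, fit_na, fit_std) are my own stand-ins, not the data or names used in the notes. It fits the same one-factor model twice from a covariance matrix and mean vector, once with NA* plus a fixed factor variance and once with std.lv=TRUE, which should give the same loadings.

```r
library(lavaan)

# Assumption: x1-x3 from lavaan's built-in data stand in for the notes' indicators
vars   <- c("x1", "x2", "x3")
hs_cov <- cov(HolzingerSwineford1939[, vars])       # covariance matrix
hs_mns <- colMeans(HolzingerSwineford1939[, vars])  # mean vector
hs_n   <- nrow(HolzingerSwineford1939)              # sample size

# Version 1: free the first loading by hand and fix the factor variance to 1
mod_na <- '
  visual =~ NA*x1 + x2 + x3
  visual ~~ 1*visual
'
fit_na <- sem(mod_na, sample.cov = hs_cov, sample.mean = hs_mns,
              sample.nobs = hs_n, meanstructure = TRUE)

# Version 2: default syntax, letting std.lv = TRUE do the same thing
fit_std <- sem('visual =~ x1 + x2 + x3', sample.cov = hs_cov,
               sample.mean = hs_mns, sample.nobs = hs_n,
               meanstructure = TRUE, std.lv = TRUE)

# Compare the three loadings from both parameterizations
load_names <- c("visual=~x1", "visual=~x2", "visual=~x3")
cbind(NA_syntax = coef(fit_na)[load_names], std_lv = coef(fit_std)[load_names])
```
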

"coverage <- lavInspect(fit.int, what = "coverage")
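coverage*405"
#generates the coverage matrix: the proportion of cases with data available for each element of the covariance matrix; multiplying by 405 (presumably the sample size of the example) turns the proportions into numbers of cases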
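
To see what the coverage matrix looks like, here is a small sketch on lavaan's built-in data. The missing values are introduced artificially by me purely for illustration, and the object names (dat, fit_cov) are hypothetical; the notes use their own data set.

```r
library(lavaan)

# Assumption: built-in data with some values deleted on purpose
dat <- HolzingerSwineford1939[, c("x1", "x2", "x3")]
dat$x1[1:30]  <- NA   # 30 cases missing on x1
dat$x2[21:60] <- NA   # 40 cases missing on x2 (10 overlap with x1)

fit_cov <- cfa('visual =~ x1 + x2 + x3', data = dat, missing = "fiml")

# Proportion of cases observed for each variance/covariance element
coverage <- lavInspect(fit_cov, what = "coverage")
coverage

# Multiply by the sample size to get the number of cases per element
n_obs <- lavInspect(fit_cov, what = "nobs")
round(coverage * n_obs)
```
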

-To conduct inferential tests of indirect effects using bootstrapped confidence intervals, see p.50.
-To quickly plot simple slopes for a latent growth curve model, go to ch08.R (for the notation, search "simple slope" in the pdf).
-Categorical-indicator SEM using Diagonally Weighted Least Squares (DWLS) estimation is on p.113 and plotting Item Characteristic Curves on p.123; not useful for myself in the near future.
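
For the first item above (bootstrapped confidence intervals for an indirect effect), a generic lavaan sketch looks roughly like this. It is not the code from p.50 of the notes: the data are simulated and the names (x, m, y, a, b, cp, indirect) are hypothetical. The indirect effect is defined with := and the percentile intervals are requested in parameterEstimates().

```r
library(lavaan)

# Simulated mediation data (assumption: purely illustrative)
set.seed(1)
n <- 200
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)
y <- 0.4 * m + 0.2 * x + rnorm(n)
dat <- data.frame(x, m, y)

med_mod <- '
  m ~ a*x
  y ~ b*m + cp*x
  indirect := a*b      # defined parameter: the indirect effect
'

# Bootstrap standard errors; 200 draws keeps the sketch fast,
# a real analysis would normally use many more
fit_med <- sem(med_mod, data = dat, se = "bootstrap", bootstrap = 200)

# Percentile bootstrap confidence intervals
pe <- parameterEstimates(fit_med, boot.ci.type = "perc", level = 0.95)
ind <- pe[pe$label == "indirect", c("label", "est", "ci.lower", "ci.upper")]
ind
```
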

2[lavaan]
library(lavaan)
model<-paste("apple+orange~gender+cage",      
               "+race_3+race_2+race_5+race_4+race_1+race_6")
#The object "model" is a character vector. If the model is too long to fit on one line, do not press Enter in the middle of a formula to break it; instead, use paste() to join the pieces into a single string.
fit <- sem(model,data = dataset, missing="fiml",cluster= "cluster")#missing="fiml" avoids listwise deletion and cluster= requests cluster-robust estimation; neither is always necessary
parameterEstimates(fit)
summary(fit, rsquare=TRUE)
fitMeasures(fit, c("chisq", "df", "pvalue", "cfi","tli", "rmsea", "srmr"))#to display fit stats I need 
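
What paste() does to the model string above can be checked in base R without fitting anything. A minimal sketch using the same variable names; the comparison object one_line is my own addition. By default paste() joins its arguments with a single space, so the result is one string holding one formula.

```r
# Two pieces of one long formula, joined into a single string
model <- paste("apple+orange~gender+cage",
               "+race_3+race_2+race_5+race_4+race_1+race_6")

length(model)     # a character vector with one element
cat(model, "\n")  # print the string as lavaan will receive it

# The same formula typed on one line, for comparison
one_line <- "apple+orange~gender+cage +race_3+race_2+race_5+race_4+race_1+race_6"
identical(model, one_line)
```
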

-Modification indices (MI), displayed according to a chosen criterion
mod_ind <- modindices(fit)
mod_ind <- mod_ind[mod_ind$op == "~~", ]
head(mod_ind[order(mod_ind$mi, decreasing=TRUE), ], 10)
subset(mod_ind[order(mod_ind$mi, decreasing=TRUE), ], mi > 30)

-Extract a specific part of the output
Est <- parameterEstimates(fit1, ci = FALSE, standardized = TRUE)
cfatable<-subset(Est, op == "=~")

Other
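Devw2~c(v1,v2)*Devw1+c(v3,v3)*apple+c(v5,v6)*orange #multi-group path constraints: the same label in both groups (v3,v3) constrains a path to be equal; different labels (v1,v2) let it differ
fit <- sem(model,data = dataset, group="pd2", missing="fiml",
           optim.method="L-BFGS-B")
anova(fit1,fit3)[2,"Pr(>Chisq)"]#to get only the p-value of the model comparison test; selecting the column by name is safer than by position ([2,7]), because the column layout can differ across lavaan versions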
#"new option missing = "ml.x" or missing = "fiml.x" will not delete cases with missing values on exogenous covariates, even if fixed.x = TRUE (this restores the behavior of < 0.6); this can be useful for models with a large number of exogenous covariates, which you can treat as stochastic, but where fixed.x = TRUE is just more convenient" (http://lavaan.ugent.be/history/dot6.html)
#The anova function for lavaan objects simply calls the lavTestLRT function, which has a few additional arguments.
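
A small sketch of pulling only the p-value out of a likelihood ratio test, using lavaan's built-in data. The two nested models and the object names (fit_free, fit_fixed, lrt) are hypothetical examples of mine; the second model fixes the factor covariance to zero.

```r
library(lavaan)

# Assumption: two nested CFA models on lavaan's built-in data
m_free <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
'
m_fixed <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  visual ~~ 0*textual      # restriction: uncorrelated factors
'
fit_free  <- cfa(m_free,  data = HolzingerSwineford1939)
fit_fixed <- cfa(m_fixed, data = HolzingerSwineford1939)

# anova() on lavaan objects calls lavTestLRT(), so either works here
lrt <- lavTestLRT(fit_free, fit_fixed)
lrt

# Row 2 is the more restricted model; pick the p-value column by name
p_value <- lrt[2, "Pr(>Chisq)"]
p_value
```
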

lavOptions(x = NULL, default = NULL, mimic = "lavaan")
lavOptions("std.lv")

1[R functions]
dataset[sapply(dataset, is.factor)] <- data.matrix(dataset[sapply(dataset, is.factor)])
Purpose: convert every factor variable in dataset (those with built-in values) to numeric.
Use with caution.
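
Why the caution: data.matrix() replaces a factor by its internal integer codes, which are not necessarily the numbers shown in the labels. A tiny base R sketch with made-up values (the data frame df and its columns are hypothetical):

```r
# A factor whose labels look like numbers
f <- factor(c("10", "20", "10", "5"))

levels(f)                     # levels are sorted as text: "10" "20" "5"
as.numeric(f)                 # internal codes, not the labels
as.numeric(as.character(f))   # the numbers actually shown in the labels

# data.matrix() uses the internal codes for every factor column
df <- data.frame(score = f, grp = factor(c("b", "a", "b", "a")))
codes <- data.matrix(df)
codes
```

If the labels themselves are the values wanted, going through as.character() first is the safer route.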

library(skimr)
library(dplyr)
histogram<-skim(group_by(dataset, apple) )
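Purpose: quickly view basic information on all variables in dataset, including a compact histogram; group_by can be used to view it by group.
apple is the name of the grouping variable, and histogram can be any name. After exporting it with write.table, it can be manipulated in Excel, and the compact histograms are included.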

Note: for a review of other packages with similar functionality, see https://dabblingwithdata.wordpress.com/2018/01/02/my-favourite-r-package-for-summarising-data/

library(multilevel)
UNIV.GROW<-make.univ(dataset,dataset[,3:5])
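Purpose: convert dataset from wide to long format.
The values in columns [,3:5] are stacked to form the values of a new final column, MULTDV, together with an index variable TIME; the remaining variables are copied automatically. For finer control, see
Multilevel Modeling in R (2.6) by Paul Bliese https://cran.r-project.org/doc/contrib/Bliese_Multilevel.pdf  p.68-69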
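
To make the stacking concrete without installing anything, here is a sketch of the same wide-to-long idea using base R's reshape() instead of make.univ(). The toy data frame, the 0/1/2 coding of TIME and the final sort are my own assumptions; only the column names MULTDV and TIME follow the description above.

```r
# Toy wide data: one row per person, three repeated measures in columns 3:5
wide <- data.frame(id = 1:2,
                   x  = c("a", "b"),
                   y1 = c(1, 4),
                   y2 = c(2, 5),
                   y3 = c(3, 6))

# Stack y1-y3 into one column, with a TIME index (assumed coding: 0, 1, 2)
long <- reshape(wide,
                direction = "long",
                varying   = c("y1", "y2", "y3"),
                v.names   = "MULTDV",
                timevar   = "TIME",
                times     = 0:2,
                idvar     = "id")

# Sort so that each person's rows sit together
long <- long[order(long$id, long$TIME), ]
long
```
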

#
library(fastDummies)
dumvar <- dummy_cols(dataset,
                     select_columns = c("county","race"))
dataset<-dumvar
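Purpose: quickly dummy-code a variable. Ordinary linear regression handles factor variables automatically, but lavaan does not, so they need to be processed first. Taking county as an example, the dummy variables are named county_1, county_2, and so on by default, and NA is also turned into a dummy variable, county_NA. The new data frame automatically includes these new dummy variables. The example above ends with a separate assignment line, rather than overwriting dataset in one step, so that the dummy variables can be checked before the original data are overwritten.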


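Why I keep these notes: after the baptism of doing a PhD, what it takes to make me happy has become very simple; for example, discovering a nice R function can make my whole day. On the subject of lifting one's mood, I once read an article called "The Awesomest 7-Year Postdoc", in which the author offers an idea: create a "feel good" email folder, save any achievements, good things that happen, and praise from others, and look back at it when feeling down. It is indeed a good idea, but for me right now there is a better way: spend a little time praising some of my favourite R functions and sharing the joy, so I wanted to add an entry here and update it from time to time. When I feel down I can come back and look, and remember that there are still interesting people in this world who are willing to contribute something, or pursue something, without being paid.

Link to the article mentioned above: https://blogs.scientificamerican.com/guest-blog/the-awesomest-7-year-postdoc-or-how-i-learned-to-stop-worrying-and-love-the-tenure-track-faculty-life/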
