En esta práctica:
if (!require("pacman")) install.packages("pacman") # solo la primera vez
## Loading required package: pacman
pacman::p_load(lme4,
reghelper,
haven,
stargazer,
ggplot2, # gráficos
dplyr, # manipulacion de datos
texreg # tablas lme4
) # paquetes a cargar
Para esta práctica utilizaremos la librería lme4
, en particular la función lmer
, así como también la librería haven
, para lectura de base de datos externa. Además stargazer
y texreg
son paquetes para visualizar outputs en procesadores de texto como \(Latex\) y también en Rmarkdown.
Como en sesión anterior
mlm = read_dta("http://www.stata-press.com/data/mlmus3/hsb.dta")
“mlm”" es el nombre que le daremos a la base de datos “High School and Beyond”
Variables relevantes para ejercicios:
Seleccionar variables
mlm=mlm %>% select(minority,female,ses,mathach,size,sector,pracad,disclim,himinty,mnses,schoolid) %>% as.data.frame()
names(mlm)
## [1] "minority" "female" "ses" "mathach" "size" "sector"
## [7] "pracad" "disclim" "himinty" "mnses" "schoolid"
#Tabla estadisticos descriptivos
stargazer(mlm,title="Estadísticos descriptivos", type = "text")
##
## Estadísticos descriptivos
## ===================================================================
## Statistic N Mean St. Dev. Min Pctl(25) Pctl(75) Max
## -------------------------------------------------------------------
## minority 7,185 0.275 0.446 0 0 1 1
## female 7,185 0.528 0.499 0 0 1 1
## ses 7,185 0.0001 0.779 -3.758 -0.538 0.602 2.692
## mathach 7,185 12.748 6.878 -2.832 7.275 18.317 24.993
## size 7,185 1,056.862 604.172 100 565 1,436 2,713
## sector 7,185 0.493 0.500 0 0 1 1
## pracad 7,185 0.534 0.251 0.000 0.320 0.700 1.000
## disclim 7,185 -0.132 0.944 -2.416 -0.817 0.460 2.756
## himinty 7,185 0.280 0.449 0 0 1 1
## mnses 7,185 0.0001 0.414 -1.194 -0.323 0.327 0.825
## schoolid 7,185 5,277.898 2,499.578 1,224 3,020 7,342 9,586
## -------------------------------------------------------------------
Modelo 0: Null model
results_0 = lmer(mathach ~ 1 + (1 | schoolid), data = mlm)
summary(results_0)
## Linear mixed model fit by REML ['lmerMod']
## Formula: mathach ~ 1 + (1 | schoolid)
## Data: mlm
##
## REML criterion at convergence: 47116.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.0631 -0.7539 0.0267 0.7606 2.7426
##
## Random effects:
## Groups Name Variance Std.Dev.
## schoolid (Intercept) 8.614 2.935
## Residual 39.148 6.257
## Number of obs: 7185, groups: schoolid, 160
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 12.6370 0.2444 51.71
screenreg(results_0) # de library texreg
##
## ========================================
## Model 1
## ----------------------------------------
## (Intercept) 12.64 ***
## (0.24)
## ----------------------------------------
## AIC 47122.79
## BIC 47143.43
## Log Likelihood -23558.40
## Num. obs. 7185
## Num. groups: schoolid 160
## Var: schoolid (Intercept) 8.61
## Var: Residual 39.15
## ========================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
En este modelo es posible identificar \(\gamma_{00}\) (intercept - fixed effects), así como también \(\tau_{00}\) (Var:id) y \(\sigma^2\) (Var:Residual). Con estos últimos coeficientes aleatorios es posible calcular la correlación intraclase (\(\rho\)).
Correlación intra-clase = ICC = \(\rho=\frac{\tau_{00}}{\tau_{00}+\sigma^2}\)
# Cálculo directo en R:
8.614/(8.614+39.148) # calculo ICC
## [1] 0.1803526
# O de manera directa con funcion ICC de reghelper
reghelper::ICC(results_0)
## [1] 0.1803518
Model 1: individual level
results_1 = lmer(mathach ~ 1 + ses + female + (1 | schoolid), data = mlm)
screenreg(results_1, naive=TRUE)
##
## ========================================
## Model 1
## ----------------------------------------
## (Intercept) 13.28 ***
## (0.20)
## ses 2.36 ***
## (0.11)
## female -1.19 ***
## (0.17)
## ----------------------------------------
## AIC 46605.73
## BIC 46640.13
## Log Likelihood -23297.86
## Num. obs. 7185
## Num. groups: schoolid 160
## Var: schoolid (Intercept) 4.49
## Var: Residual 36.81
## ========================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
Model 2: variables nivel 2
results_2 = lmer(mathach ~ 1 + sector + mnses + (1 | schoolid), data = mlm)
screenreg(results_2)
##
## ========================================
## Model 1
## ----------------------------------------
## (Intercept) 12.13 ***
## (0.20)
## sector 1.23 ***
## (0.31)
## mnses 5.33 ***
## (0.37)
## ----------------------------------------
## AIC 46956.50
## BIC 46990.90
## Log Likelihood -23473.25
## Num. obs. 7185
## Num. groups: schoolid 160
## Var: schoolid (Intercept) 2.31
## Var: Residual 39.16
## ========================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
Model 3: individual y grupal
results_3 = lmer(mathach ~ 1 + ses + female + sector + mnses + (1 | schoolid), data = mlm)
screenreg(results_3)
##
## ========================================
## Model 1
## ----------------------------------------
## (Intercept) 12.74 ***
## (0.21)
## ses 2.15 ***
## (0.11)
## female -1.20 ***
## (0.16)
## sector 1.25 ***
## (0.30)
## mnses 3.07 ***
## (0.37)
## ----------------------------------------
## AIC 46515.61
## BIC 46563.77
## Log Likelihood -23250.80
## Num. obs. 7185
## Num. groups: schoolid 160
## Var: schoolid (Intercept) 2.15
## Var: Residual 36.80
## ========================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
Model 4: pendiente aleatoria
results_4 = lmer(mathach ~ 1 + ses + female + mnses + sector + (1 + ses | schoolid), data = mlm)
screenreg(results_4)
##
## ============================================
## Model 1
## --------------------------------------------
## (Intercept) 12.65 ***
## (0.21)
## ses 2.16 ***
## (0.12)
## female -1.19 ***
## (0.16)
## mnses 3.07 ***
## (0.38)
## sector 1.43 ***
## (0.30)
## --------------------------------------------
## AIC 46514.33
## BIC 46576.24
## Log Likelihood -23248.16
## Num. obs. 7185
## Num. groups: schoolid 160
## Var: schoolid (Intercept) 2.22
## Var: schoolid ses 0.42
## Cov: schoolid (Intercept) ses 0.27
## Var: Residual 36.59
## ============================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
Model 5: pendiente aleatoria e interacción entre niveles
\(mathach= \gamma_{00}(intercepto)+\gamma_{10}ses_{ij}+\gamma_{01}sector_j+\gamma_{11}ses_{ij} *sector{j}+\mu_{0j}(intercepto)+\mu_{1j}ses_{ij}+ r_{ij}\)
results_5 = lmer(mathach ~ 1 + female + ses*sector + mnses + (1 + ses | schoolid), data = mlm)
screenreg(results_5)
##
## ============================================
## Model 1
## --------------------------------------------
## (Intercept) 12.81 ***
## (0.21)
## female -1.18 ***
## (0.16)
## ses 2.73 ***
## (0.14)
## sector 1.29 ***
## (0.29)
## mnses 3.04 ***
## (0.37)
## ses:sector -1.31 ***
## (0.21)
## --------------------------------------------
## AIC 46482.91
## BIC 46551.71
## Log Likelihood -23231.45
## Num. obs. 7185
## Num. groups: schoolid 160
## Var: schoolid (Intercept) 2.11
## Var: schoolid ses 0.03
## Cov: schoolid (Intercept) ses 0.17
## Var: Residual 36.60
## ============================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
Generar regresiones para comparar
reg_ind=lm(mathach ~ ses + female + mnses + sector, data=mlm)
agg_mlm=mlm %>% group_by(schoolid) %>% summarise_all(funs(mean))
## Warning: funs() is soft deprecated as of dplyr 0.8.0
## please use list() instead
##
## # Before:
## funs(name = f(.))
##
## # After:
## list(name = ~ f(.))
## This warning is displayed once per session.
reg_agg=lm(mathach ~ ses + female + mnses + sector, data=agg_mlm)
Observar: ¿Qué sucede con los coeficientes y errores estándar cuando se comparan los coeficientes y los errores estándar?
screenreg(list(reg_ind, reg_agg, results_3))
##
## =================================================================
## Model 1 Model 2 Model 3
## -----------------------------------------------------------------
## (Intercept) 12.80 *** 13.13 *** 12.74 ***
## (0.13) (0.35) (0.21)
## ses 2.15 *** 5.19 *** 2.15 ***
## (0.11) (0.37) (0.11)
## female -1.34 *** -1.97 *** -1.20 ***
## (0.15) (0.56) (0.16)
## mnses 2.90 *** 3.07 ***
## (0.22) (0.37)
## sector 1.32 *** 1.25 *** 1.25 ***
## (0.16) (0.31) (0.30)
## -----------------------------------------------------------------
## R^2 0.18 0.67
## Adj. R^2 0.18 0.67
## Num. obs. 7185 160 7185
## RMSE 6.23 1.80
## AIC 46515.61
## BIC 46563.77
## Log Likelihood -23250.80
## Num. groups: schoolid 160
## Var: schoolid (Intercept) 2.15
## Var: Residual 36.80
## =================================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05
Generación de tabla para publicar en HTML
htmlreg(list(reg_ind, reg_agg, results_3),
custom.model.names = c("Individual","Agregado","Multinivel"),
custom.coef.names = c("Intercepto", "$SES_{ij}$","$Mujer_{ij}$", "$SES_{j}$", "$Sector_{j}$"),
custom.gof.names=c(NA,NA,NA,NA,NA,NA,NA,NA,
"Var:id ($\\tau_{00}$)","Var: Residual ($\\sigma^2$)"),
custom.note = "%stars. Errores estándar en paréntesis",
caption="Comparación de modelos Individual, Agregado y Multinivel",
caption.above=TRUE,
doctype = FALSE)
Individual | Agregado | Multinivel | ||
---|---|---|---|---|
Intercepto | 12.80*** | 13.13*** | 12.74*** | |
(0.13) | (0.35) | (0.21) | ||
\(SES_{ij}\) | 2.15*** | 5.19*** | 2.15*** | |
(0.11) | (0.37) | (0.11) | ||
\(Mujer_{ij}\) | -1.34*** | -1.97*** | -1.20*** | |
(0.15) | (0.56) | (0.16) | ||
\(SES_{j}\) | 2.90*** | 3.07*** | ||
(0.22) | (0.37) | |||
\(Sector_{j}\) | 1.32*** | 1.25*** | 1.25*** | |
(0.16) | (0.31) | (0.30) | ||
R2 | 0.18 | 0.67 | ||
Adj. R2 | 0.18 | 0.67 | ||
Num. obs. | 7185 | 160 | 7185 | |
RMSE | 6.23 | 1.80 | ||
AIC | 46515.61 | |||
BIC | 46563.77 | |||
Log Likelihood | -23250.80 | |||
Num. groups: schoolid | 160 | |||
Var:id (\(\tau_{00}\)) | 2.15 | |||
Var: Residual (\(\sigma^2\)) | 36.80 | |||
p < 0.001, p < 0.01, p < 0.05. Errores estándar en paréntesis |