En esta práctica:

1. Cargar/instalar librerías**

if (!require("pacman")) install.packages("pacman") # solo la primera vez
## Loading required package: pacman
pacman::p_load(lme4,
               reghelper,
               haven,
               stargazer,
               ggplot2, # gráficos
               dplyr, # manipulacion de datos
               texreg # tablas lme4
               ) # paquetes a cargar

Para esta práctica utilizaremos la librería lme4, en particular la función lmer, así como también la librería haven, para lectura de base de datos externa. Además stargazer y texreg son paquetes para visualizar outputs en procesadores de texto como \(Latex\) y también en Rmarkdown.

2. Leer datos base High School and Beyond (HSB)**

Como en sesión anterior

mlm = read_dta("http://www.stata-press.com/data/mlmus3/hsb.dta")

“mlm”" es el nombre que le daremos a la base de datos “High School and Beyond”

Variables relevantes para ejercicios:

  1. Nivel 1:
  1. Nivel 2
  1. Cluster variable: schoolid

Seleccionar variables

mlm=mlm %>% select(minority,female,ses,mathach,size,sector,pracad,disclim,himinty,mnses,schoolid) %>% as.data.frame()
names(mlm)
##  [1] "minority" "female"   "ses"      "mathach"  "size"     "sector"  
##  [7] "pracad"   "disclim"  "himinty"  "mnses"    "schoolid"
#Tabla estadisticos descriptivos
stargazer(mlm,title="Estadísticos descriptivos", type = "text")
## 
## Estadísticos descriptivos
## ===================================================================
## Statistic   N     Mean    St. Dev.   Min   Pctl(25) Pctl(75)  Max  
## -------------------------------------------------------------------
## minority  7,185   0.275     0.446     0       0        1       1   
## female    7,185   0.528     0.499     0       0        1       1   
## ses       7,185  0.0001     0.779   -3.758  -0.538   0.602   2.692 
## mathach   7,185  12.748     6.878   -2.832  7.275    18.317  24.993
## size      7,185 1,056.862  604.172   100     565     1,436   2,713 
## sector    7,185   0.493     0.500     0       0        1       1   
## pracad    7,185   0.534     0.251   0.000   0.320    0.700   1.000 
## disclim   7,185  -0.132     0.944   -2.416  -0.817   0.460   2.756 
## himinty   7,185   0.280     0.449     0       0        1       1   
## mnses     7,185  0.0001     0.414   -1.194  -0.323   0.327   0.825 
## schoolid  7,185 5,277.898 2,499.578 1,224   3,020    7,342   9,586 
## -------------------------------------------------------------------

3. Modelos

Modelo 0: Null model

results_0 = lmer(mathach ~ 1 + (1 | schoolid), data = mlm)
summary(results_0)
## Linear mixed model fit by REML ['lmerMod']
## Formula: mathach ~ 1 + (1 | schoolid)
##    Data: mlm
## 
## REML criterion at convergence: 47116.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.0631 -0.7539  0.0267  0.7606  2.7426 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  schoolid (Intercept)  8.614   2.935   
##  Residual             39.148   6.257   
## Number of obs: 7185, groups:  schoolid, 160
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  12.6370     0.2444   51.71
screenreg(results_0) # de library texreg
## 
## ========================================
##                            Model 1      
## ----------------------------------------
## (Intercept)                    12.64 ***
##                                (0.24)   
## ----------------------------------------
## AIC                         47122.79    
## BIC                         47143.43    
## Log Likelihood             -23558.40    
## Num. obs.                    7185       
## Num. groups: schoolid         160       
## Var: schoolid (Intercept)       8.61    
## Var: Residual                  39.15    
## ========================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

En este modelo es posible identificar \(\gamma_{00}\) (intercept - fixed effects), así como también \(\tau_{00}\) (Var:id) y \(\sigma^2\) (Var:Residual). Con estos últimos coeficientes aleatorios es posible calcular la correlación intraclase (\(\rho\)).

Correlación intra-clase = ICC = \(\rho=\frac{\tau_{00}}{\tau_{00}+\sigma^2}\)

# Cálculo directo en R:
8.614/(8.614+39.148) # calculo ICC 
## [1] 0.1803526
# O de manera directa con funcion ICC de reghelper
reghelper::ICC(results_0)
## [1] 0.1803518

Model 1: individual level

results_1 = lmer(mathach ~ 1 + ses + female + (1 | schoolid), data = mlm)
screenreg(results_1, naive=TRUE)
## 
## ========================================
##                            Model 1      
## ----------------------------------------
## (Intercept)                    13.28 ***
##                                (0.20)   
## ses                             2.36 ***
##                                (0.11)   
## female                         -1.19 ***
##                                (0.17)   
## ----------------------------------------
## AIC                         46605.73    
## BIC                         46640.13    
## Log Likelihood             -23297.86    
## Num. obs.                    7185       
## Num. groups: schoolid         160       
## Var: schoolid (Intercept)       4.49    
## Var: Residual                  36.81    
## ========================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

Model 2: variables nivel 2

results_2 = lmer(mathach ~ 1 + sector + mnses + (1 | schoolid), data = mlm)
screenreg(results_2)
## 
## ========================================
##                            Model 1      
## ----------------------------------------
## (Intercept)                    12.13 ***
##                                (0.20)   
## sector                          1.23 ***
##                                (0.31)   
## mnses                           5.33 ***
##                                (0.37)   
## ----------------------------------------
## AIC                         46956.50    
## BIC                         46990.90    
## Log Likelihood             -23473.25    
## Num. obs.                    7185       
## Num. groups: schoolid         160       
## Var: schoolid (Intercept)       2.31    
## Var: Residual                  39.16    
## ========================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

Model 3: individual y grupal

results_3 = lmer(mathach ~ 1 + ses + female + sector + mnses + (1 | schoolid), data = mlm)
screenreg(results_3)
## 
## ========================================
##                            Model 1      
## ----------------------------------------
## (Intercept)                    12.74 ***
##                                (0.21)   
## ses                             2.15 ***
##                                (0.11)   
## female                         -1.20 ***
##                                (0.16)   
## sector                          1.25 ***
##                                (0.30)   
## mnses                           3.07 ***
##                                (0.37)   
## ----------------------------------------
## AIC                         46515.61    
## BIC                         46563.77    
## Log Likelihood             -23250.80    
## Num. obs.                    7185       
## Num. groups: schoolid         160       
## Var: schoolid (Intercept)       2.15    
## Var: Residual                  36.80    
## ========================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

Model 4: pendiente aleatoria

results_4 = lmer(mathach ~ 1 + ses + female + mnses + sector + (1 + ses | schoolid), data = mlm)
screenreg(results_4)
## 
## ============================================
##                                Model 1      
## --------------------------------------------
## (Intercept)                        12.65 ***
##                                    (0.21)   
## ses                                 2.16 ***
##                                    (0.12)   
## female                             -1.19 ***
##                                    (0.16)   
## mnses                               3.07 ***
##                                    (0.38)   
## sector                              1.43 ***
##                                    (0.30)   
## --------------------------------------------
## AIC                             46514.33    
## BIC                             46576.24    
## Log Likelihood                 -23248.16    
## Num. obs.                        7185       
## Num. groups: schoolid             160       
## Var: schoolid (Intercept)           2.22    
## Var: schoolid ses                   0.42    
## Cov: schoolid (Intercept) ses       0.27    
## Var: Residual                      36.59    
## ============================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

Model 5: pendiente aleatoria e interacción entre niveles

\(mathach= \gamma_{00}(intercepto)+\gamma_{10}ses_{ij}+\gamma_{01}sector_j+\gamma_{11}ses_{ij} *sector{j}+\mu_{0j}(intercepto)+\mu_{1j}ses_{ij}+ r_{ij}\)

results_5 = lmer(mathach ~ 1 + female + ses*sector + mnses + (1 + ses | schoolid), data = mlm)
screenreg(results_5)
## 
## ============================================
##                                Model 1      
## --------------------------------------------
## (Intercept)                        12.81 ***
##                                    (0.21)   
## female                             -1.18 ***
##                                    (0.16)   
## ses                                 2.73 ***
##                                    (0.14)   
## sector                              1.29 ***
##                                    (0.29)   
## mnses                               3.04 ***
##                                    (0.37)   
## ses:sector                         -1.31 ***
##                                    (0.21)   
## --------------------------------------------
## AIC                             46482.91    
## BIC                             46551.71    
## Log Likelihood                 -23231.45    
## Num. obs.                        7185       
## Num. groups: schoolid             160       
## Var: schoolid (Intercept)           2.11    
## Var: schoolid ses                   0.03    
## Cov: schoolid (Intercept) ses       0.17    
## Var: Residual                      36.60    
## ============================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

Comparación individual, agregado y multinivel

Generar regresiones para comparar

reg_ind=lm(mathach ~ ses + female + mnses + sector, data=mlm)
agg_mlm=mlm %>% group_by(schoolid) %>% summarise_all(funs(mean))
## Warning: funs() is soft deprecated as of dplyr 0.8.0
## please use list() instead
## 
##   # Before:
##   funs(name = f(.))
## 
##   # After: 
##   list(name = ~ f(.))
## This warning is displayed once per session.
reg_agg=lm(mathach ~ ses + female + mnses + sector, data=agg_mlm)

Observar: ¿Qué sucede con los coeficientes y errores estándar cuando se comparan los coeficientes y los errores estándar?

screenreg(list(reg_ind, reg_agg, results_3))
## 
## =================================================================
##                            Model 1      Model 2     Model 3      
## -----------------------------------------------------------------
## (Intercept)                  12.80 ***   13.13 ***      12.74 ***
##                              (0.13)      (0.35)         (0.21)   
## ses                           2.15 ***    5.19 ***       2.15 ***
##                              (0.11)      (0.37)         (0.11)   
## female                       -1.34 ***   -1.97 ***      -1.20 ***
##                              (0.15)      (0.56)         (0.16)   
## mnses                         2.90 ***                   3.07 ***
##                              (0.22)                     (0.37)   
## sector                        1.32 ***    1.25 ***       1.25 ***
##                              (0.16)      (0.31)         (0.30)   
## -----------------------------------------------------------------
## R^2                           0.18        0.67                   
## Adj. R^2                      0.18        0.67                   
## Num. obs.                  7185         160           7185       
## RMSE                          6.23        1.80                   
## AIC                                                  46515.61    
## BIC                                                  46563.77    
## Log Likelihood                                      -23250.80    
## Num. groups: schoolid                                  160       
## Var: schoolid (Intercept)                                2.15    
## Var: Residual                                           36.80    
## =================================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

Generación de tabla para publicar en HTML

htmlreg(list(reg_ind, reg_agg, results_3), 
    custom.model.names = c("Individual","Agregado","Multinivel"),    
    custom.coef.names = c("Intercepto", "$SES_{ij}$","$Mujer_{ij}$", "$SES_{j}$", "$Sector_{j}$"), 
    custom.gof.names=c(NA,NA,NA,NA,NA,NA,NA,NA, 
                   "Var:id ($\\tau_{00}$)","Var: Residual ($\\sigma^2$)"),
    custom.note = "%stars. Errores estándar en paréntesis",
    caption="Comparación de modelos Individual, Agregado y Multinivel",
    caption.above=TRUE,
    doctype = FALSE)
Comparación de modelos Individual, Agregado y Multinivel
Individual Agregado Multinivel
Intercepto 12.80*** 13.13*** 12.74***
(0.13) (0.35) (0.21)
\(SES_{ij}\) 2.15*** 5.19*** 2.15***
(0.11) (0.37) (0.11)
\(Mujer_{ij}\) -1.34*** -1.97*** -1.20***
(0.15) (0.56) (0.16)
\(SES_{j}\) 2.90*** 3.07***
(0.22) (0.37)
\(Sector_{j}\) 1.32*** 1.25*** 1.25***
(0.16) (0.31) (0.30)
R2 0.18 0.67
Adj. R2 0.18 0.67
Num. obs. 7185 160 7185
RMSE 6.23 1.80
AIC 46515.61
BIC 46563.77
Log Likelihood -23250.80
Num. groups: schoolid 160
Var:id (\(\tau_{00}\)) 2.15
Var: Residual (\(\sigma^2\)) 36.80
p < 0.001, p < 0.01, p < 0.05. Errores estándar en paréntesis