Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust

James G. MacKinnon, Morten Ørregaard Nielsen, Matthew D. Webb

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper › Journal article › Research › peer-review

5 Citations (Scopus)
13 Downloads (Pure)

Abstract

We introduce a new command, summclust, that summarizes the cluster structure of the dataset for linear regression models with clustered disturbances. The key unit of observation for such a model is the cluster. We therefore propose cluster-level measures of leverage, partial leverage, and influence and show how to compute them quickly in most cases. The measures of leverage and partial leverage can be used as diagnostic tools to identify datasets and regression designs in which cluster–robust inference is likely to be challenging. The measures of influence can provide valuable information about how the results depend on the data in the various clusters. We also show how to calculate two jackknife variance matrix estimators efficiently as a by-product of our other computations. These estimators, which are already available in Stata, are generally more conservative than conventional variance matrix estimators. The summclust command computes all the quantities that we discuss.
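The quantities described in the abstract can be illustrated with a minimal NumPy sketch. This is not the summclust implementation or its Stata syntax. The function name `cluster_diagnostics` and its return keys are hypothetical. The sketch assumes that cluster leverage is the trace of the cluster's diagonal block of the hat matrix, that influence is assessed through delete-one-cluster coefficient estimates, and that the two jackknife estimators differ only in whether deviations are centered at the full-sample estimate or at the mean of the delete-one estimates.

```python
import numpy as np

def cluster_diagnostics(X, y, cluster_ids):
    """Illustrative cluster-level leverage, delete-one-cluster estimates,
    and two jackknife variance matrices for OLS (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster_ids = np.asarray(cluster_ids)

    # Full-sample cross-products, computed once and reused for every cluster.
    XtX = X.T @ X
    Xty = X.T @ y
    XtX_inv = np.linalg.inv(XtX)
    beta_hat = XtX_inv @ Xty

    labels = np.unique(cluster_ids)
    G = len(labels)
    leverage = np.empty(G)
    beta_del = np.empty((G, X.shape[1]))

    for i, g in enumerate(labels):
        mask = cluster_ids == g
        Xg, yg = X[mask], y[mask]
        XgtXg = Xg.T @ Xg
        # Assumed leverage measure: trace of X_g (X'X)^{-1} X_g'.
        leverage[i] = np.trace(XgtXg @ XtX_inv)
        # Estimate with cluster g omitted, obtained by subtracting the
        # cluster's cross-products instead of refitting from scratch.
        beta_del[i] = np.linalg.solve(XtX - XgtXg, Xty - Xg.T @ yg)

    scale = (G - 1) / G
    dev = beta_del - beta_hat                # centered at the full-sample estimate
    cv3 = scale * (dev.T @ dev)
    dev_j = beta_del - beta_del.mean(axis=0)  # centered at the delete-one mean
    cv3j = scale * (dev_j.T @ dev_j)

    return {"labels": labels, "beta_hat": beta_hat, "leverage": leverage,
            "beta_del": beta_del, "cv3": cv3, "cv3j": cv3j}
```

Under these assumptions the leverages sum to the number of regressors, so a cluster whose leverage is far above the average (regressors divided by clusters) is a candidate high-leverage cluster. Rows of `beta_del` that differ sharply from `beta_hat` point to influential clusters.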

Original language: English
Journal: Stata Journal
Volume: 23
Issue: 4
Pages (from-to): 942-982
Number of pages: 41
ISSN: 1536-867X
Publication status: Published - Dec 2023

Keywords

  • clustered data
  • cluster–robust variance estimator
  • CRVE
  • grouped data
  • high-leverage clusters
  • influential clusters
  • jackknife
  • partial leverage
  • robust inference
  • st0733
  • summclust
