Publication: Contribution to journal/Conference contribution in journal/Contribution to newspaper › Journal article › Research › peer review
Functional Sequential Treatment Allocation. / Kock, Anders Bredahl; Preinerstorfer, David; Veliyev, Bezirgen.
In: Journal of the American Statistical Association, Vol. 117, No. 539, 2022, p. 1311-1323.
TY - JOUR
T1 - Functional Sequential Treatment Allocation
AU - Kock, Anders Bredahl
AU - Preinerstorfer, David
AU - Veliyev, Bezirgen
PY - 2022
Y1 - 2022
N2 - Consider a setting in which a policy maker assigns subjects to treatments, observing each outcome before the next subject arrives. Initially, it is unknown which treatment is best, but the sequential nature of the problem permits learning about the effectiveness of the treatments. While the multi-armed-bandit literature has shed much light on the situation when the policy maker compares the effectiveness of the treatments through their mean, much less is known about other targets. This is restrictive, because a cautious decision maker may prefer to target a robust location measure such as a quantile or a trimmed mean. Furthermore, socio-economic decision making often requires targeting purpose-specific characteristics of the outcome distribution, such as its inherent degree of inequality, welfare or poverty. In the present article, we introduce and study sequential learning algorithms when the distributional characteristic of interest is a general functional of the outcome distribution. Minimax expected regret optimality results are obtained within the subclass of explore-then-commit policies, and for the unrestricted class of all policies. Supplementary materials for this article are available online.
KW - Distributional characteristics
KW - Minimax optimal expected regret
KW - Multi-armed bandits
KW - Randomized controlled trials
KW - Robustness
KW - Sequential treatment allocation
KW - REGRET TREATMENT CHOICE
KW - INEQUALITY
KW - SIZE
KW - MAXIMIZATION
KW - ECONOMETRICS
KW - STATISTICAL-INFERENCE
KW - POVERTY
KW - MODELS
U2 - 10.1080/01621459.2020.1851236
DO - 10.1080/01621459.2020.1851236
M3 - Journal article
VL - 117
SP - 1311
EP - 1323
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
SN - 0162-1459
IS - 539
ER -