Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing
Cambridge University Press, Apr 2, 2020 - 288 pages

Getting numbers is easy; getting numbers you can trust is hard. This practical guide by experimentation leaders at Google, LinkedIn, and Microsoft will teach you how to accelerate innovation using trustworthy online controlled experiments, or A/B tests. Based on practical experiences at companies that each run more than 20,000 controlled experiments a year, the authors share examples, pitfalls, and advice for students and industry professionals getting started with experiments, plus deeper dives into advanced topics for practitioners who want to improve the way they make data-driven decisions.

Learn how to:
• Use the scientific method to evaluate hypotheses using controlled experiments
• Define key metrics and ideally an Overall Evaluation Criterion
• Test for trustworthiness of the results and alert experimenters to violated assumptions
• Build a scalable platform that lowers the marginal cost of experiments close to zero
• Avoid pitfalls like carryover effects and Twyman's law
• Understand how statistical issues play out in practice
Contents
Introduction and Motivation
Necessary Ingredients for Running Useful Controlled Experiments
Examples of Interesting Online Controlled Experiments
Additional Reading
Designing the Experiment
Twyman's Law and Experimentation Trustworthiness
Threats to External Validity
Experimentation Platform and Culture
selected topics for everyone
Organizational Metrics
Metrics for Experimentation and the Overall
Institutional Memory and Meta-Analysis
techniques to controlled experiments
Observational Causal Studies
advanced topics for building
Instrumentation
Trading Off Speed
advanced topics for analyzing
Pitfalls
The A/A Test
Sample Ratio Mismatch and Other Trust-Related
Leakage and Interference between Variants
Measuring Long-Term Treatment Effects
Common terms and phrases
A/A tests analysis assignment assume assumption average Bing building called causal cause Chapter clicks client common compute consider controlled experiments create critical decision delta discuss distribution driver effect engagement engine ensure errors establish estimate et al evaluate example experimentation Figure focus goal Google happen hypothesis ideas identify impact implementation important improve increase indicate key metrics Kohavi lead learning long-term look mean measure method move multiple observational organization overall p-value performance phase platform practical query questions ramp randomization recommend requires risk sample scale sensitive server shared significant single specific statistical studies success teams term traffic Treatment Treatment effect triggered trustworthy typically understand unit user experience users usually validate variant