Short interpretation

15/10/12 The role of statistics
“Thus statistical methods are no
substitute for common sense and
objectivity. They should never aim to
confuse the reader, but instead should
be a major contributor to the clarity of
a scientific argument.”
The role of statistics. Pocock SJ. Br J Psychiat 1980; 137: 188-190
Population and Samples
[Figure: diagram with the labels Target Population, Population of the Study, Sample, Study Results, Inferential analysis (Statistical Tests, Confidence Intervals), Extrapolation and “Conclusions”]
P-value: an intuitive definition
• The p-value is the probability of observing data at least as extreme
as ours when the null hypothesis is true (no differences exist)
• Steps:
1)  Calculate the treatment differences in the sample (A-B)
2)  Assume that both treatments are equal (A=B) and then…
3)  …calculate the probability of obtaining differences at least as large
as those observed, given the assumption in step 2
4)  We conclude according to the probability:
a.  p<0.05: the differences are unlikely to be explained by chance,
we assume that the treatment explains the differences
b.  p>0.05: the differences could be explained by chance,
we assume that chance explains the differences

P-value
• The p-value is a “tool” to answer the question:
– Could the observed results have occurred by chance*?
– Remember:
•  Decision given the observed results in a SAMPLE
•  Extrapolating results to POPULATION
p < .05: “statistically significant”
*: accounts exclusively for the random error, not bias
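The steps listed for the p-value can be sketched as a permutation test. This is only an illustration under stated assumptions, not the method used in the slides: the function name `permutation_p_value` and the data are hypothetical, and a permutation test is just one of several ways to obtain the probability in step 3.

```python
import random

def permutation_p_value(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for the difference in means (A - B)."""
    rng = random.Random(seed)
    # Step 1: observed treatment difference in the sample.
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        # Step 2: assume A = B, so the group labels are exchangeable.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        diff = sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b)
        # Step 3: count differences at least as large as the observed one
        # (small tolerance so floating-point ties are counted).
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_perm + 1)

# Hypothetical data, invented for illustration only.
treatment_a = [5.1, 6.0, 5.8, 6.3, 5.9, 6.1]
treatment_b = [4.8, 5.0, 5.2, 4.9, 5.3, 5.1]
p = permutation_p_value(treatment_a, treatment_b)
# Step 4: compare with the conventional 0.05 threshold.
verdict = "statistically significant" if p < 0.05 else "not significant"
print(f"p = {p:.4f} ({verdict})")
```

As the footnote on the slide says, a small p-value here only addresses random error; it says nothing about bias in how the two samples were obtained.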
Confidence interval
Intuitively: the true value lies within the interval with 95% confidence
[Figure: 95% CI]

Confidence interval for evaluating superiority trials
[Figure: confidence intervals plotted against 0; labels: Superiority observed, Superiority not observed]

Superiority study
[Figure: scale of the difference d; labels: Control better, Test better, d<0 (- effect), d=0 (No differences), d>0 (+ effect)]

Factors influencing statistical significance
•  Signal: Difference
•  Noise (background): Variance (SD)
•  Quantity: Quantity of data
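One common way to see how signal, noise and quantity of data combine is the z statistic for comparing two means. The sketch below assumes two equal-sized groups and a known common SD; the function names and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(difference, sd, n_per_group):
    """Signal (difference) over noise (SD), scaled by the quantity of data."""
    standard_error = sd * sqrt(2 / n_per_group)
    return difference / standard_error

def two_sided_p(z):
    """Two-sided p-value against a standard normal reference distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical values, for illustration only.
for difference, sd, n in [(5, 20, 50), (10, 20, 50), (5, 10, 50), (5, 20, 200)]:
    z = z_statistic(difference, sd, n)
    print(f"difference={difference} sd={sd} n={n}: z={z:.2f} p={two_sided_p(z):.4f}")
```

In this sketch, doubling the difference, halving the SD, or quadrupling the sample size each doubles z.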
Observed difference
•  False:
–  Biases
•  E.g. in selection (non-representative sample)
–  Errors in measurement or data transcription
–  Chance
•  Real:
–  reflects a difference in the population

Random vs Systematic error
[Figure: two panels labelled Random and Bias, each annotated “↑ Sample size”]
Utility of Believing in the Existence of God (according to Pascal)
H0: God does not exist
H1: God exists

                        Reality
Pascal's decision       God exists          God does not exist
God exists              Correct             No penalty
God does not exist      Eternal damnation   Correct

Type I & II Error & Power

                        Reality (Population)
Conclusion (sample)     A=B                 A≠B
“A=B” p>0.05            OK                  Type II error (β)
A≠B p<0.05              Type I error (α)    OK
Type I & II Error & Power
• Type I Error (α)
–  False positive
–  Rejecting the null hypothesis when in fact it is true
–  Standard: α=0.05
–  In words, chance of finding statistical significance when in fact
there truly was no effect
• Type II Error (β)
–  False negative
–  Accepting the null hypothesis when in fact the alternative is true
–  Standard: β=0.20 or 0.10
–  In words, chance of not finding statistical significance when in
fact there was an effect

Sample Size
•  The planned number of participants is calculated on the basis of:
–  Expected effect of treatment(s): ↗ effect ↘ number
–  Variability of the chosen endpoint: ↗ variability ↗ number
–  Accepted risks in conclusion: ↗ risk ↘ number
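The three inputs to a sample size calculation can be illustrated with the usual normal-approximation formula for comparing two means, n per group = 2 (z_{1-α/2} + z_{1-β})² (SD / effect)². This is a sketch that assumes a two-sided test, equal group sizes and a continuous endpoint; the function name `n_per_group` and the numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect, sd, alpha=0.05, beta=0.20):
    """Approximate participants per group for a two-sided comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # accepted Type I risk
    z_beta = NormalDist().inv_cdf(1 - beta)        # accepted Type II risk
    return ceil(2 * (z_alpha + z_beta) ** 2 * (sd / effect) ** 2)

# Hypothetical scenarios, for illustration only.
print("baseline          :", n_per_group(effect=5, sd=10))
print("larger effect     :", n_per_group(effect=10, sd=10))
print("more variability  :", n_per_group(effect=5, sd=20))
print("lower beta (0.10) :", n_per_group(effect=5, sd=10, beta=0.10))
print("higher alpha (0.10):", n_per_group(effect=5, sd=10, alpha=0.10))
```

The output follows the arrows on the slide: a larger expected effect or a larger accepted risk lowers the number, while more variability raises it.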
Sample Size
[Figure: three histograms of ALTURA (height) against frequency, each with N = 2000; reported means 165.0, 165.1 and 165.1; reported standard deviations 25.54, 32.27 and 26.94]

Type I & II Error & Power

                        Reality (Population)
Conclusion (sample)     A=B                 A≠B
“A=B” p>0.05            OK                  Type II error (β)
A≠B p<0.05              Type I error (α)    POWER
Roland Garros Tournament 1999
1st Round
Carlos Moyá vs Markus Hipfl

                                   Moyá               Hipfl
Total games won                    22                 24
Total points won                   147                146
1st serve                          62%                69%
Aces                               5                  3
Double faults                      4                  5
% winners on 1st serve             63 of 95 = 66%     61 of 96 = 64%
% winners on 2nd serve             25 of 58 = 43%     20 of 44 = 45%
Winners (including serve)          30                 56
Unforced errors                    62                 75
Break points won                   6 of 21 = 29%      6 of 27 = 22%
Net approaches                     48 of 71 = 68%     29 of 41 = 71%
Fastest serve speed                200 KPH            193 KPH
Average 1st serve speed            157 KPH            141 KPH
Average 2nd serve speed            132 KPH            126 KPH

Set             1   2   3   4   5
Carlos Moyá     3   1   6   6   6
Markus Hipfl    6   6   4   4   4

MULTIPLICITY
To say it colloquially, torture the data until they speak...
Torturing data…
–  Investigators examine additional endpoints,
manipulate group comparisons, do many subgroup
analyses, and undertake repeated interim
analyses.
–  Investigators should report all analytical
comparisons implemented. Unfortunately, they
sometimes hide the complete analysis,
handicapping the reader’s understanding of the
results.
Lancet 2005; 365: 1591–95
Multiplicity
K independent hypotheses: $H_{01}, H_{02}, \dots, H_{0K}$
S significant results ($p < \alpha$)
$\Pr(S \ge 1 \mid H_{01} \cap H_{02} \cap \dots \cap H_{0K} = H_{0\cdot}) = 1 - \Pr(S = 0 \mid H_{0\cdot}) = 1 - (1 - \alpha)^{K}$
K     Pr(S>=1|H0.)
1     0.0500
2     0.0975
3     0.1426
4     0.1855
5     0.2262
10    0.4013
15    0.5367
20    0.6415
25    0.7226
30    0.7854
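The tabulated probabilities follow directly from 1 - (1 - α)^K with α = 0.05. A minimal sketch, assuming the K tests are independent as stated on the slide (the function name `familywise_error_rate` is hypothetical):

```python
def familywise_error_rate(k, alpha=0.05):
    """Pr(S >= 1 | all K null hypotheses true) for K independent tests."""
    return 1 - (1 - alpha) ** k

for k in (1, 2, 3, 4, 5, 10, 15, 20, 25, 30):
    print(f"K={k:2d}  Pr(S>=1|H0.)={familywise_error_rate(k):.4f}")
```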
Some examples

          Variables   Times   Subgroups   Comparisons   Total   False positive rate
case A    2           2       2           1             8       33.66%
case B    5           4       3           1             60      96.61%
case C    5           4       3           3             180     99.99%

Handling Multiplicity in Variables
•  Scenario 1: One Primary Variable
–  Identify one primary variable; other
variables are secondary
–  Trial is positive if and only if the primary
variable shows significant (p < 0.05),
positive results
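The totals in the table of cases A to C are the product of the number of variables, time points, subgroups and comparisons, and the false positive rate can then be approximated with 1 - (1 - α)^K. The sketch below assumes independent tests at α = 0.05; the function names are hypothetical.

```python
from math import prod

def total_tests(variables, times, subgroups, comparisons):
    """Number of tests when every combination is analysed."""
    return prod((variables, times, subgroups, comparisons))

def false_positive_rate(k, alpha=0.05):
    """Chance of at least one false positive among k independent tests."""
    return 1 - (1 - alpha) ** k

cases = {"case A": (2, 2, 2, 1), "case B": (5, 4, 3, 1), "case C": (5, 4, 3, 3)}
for name, dims in cases.items():
    k = total_tests(*dims)
    print(f"{name}: total={k}, false positive rate={false_positive_rate(k):.2%}")
```

Under this independence formula, case B (60 tests) comes out at about 95.4% rather than the 96.61% shown on the slide; cases A and C agree with the slide.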
Multiplicity
•  Bonferroni correction (simplified version)
–  K tests with a global significance level of α
–  Each test can be performed at the α/K level
•  Example:
–  5 independent tests
–  Global level of significance = 5%
–  Each test should be performed at the 1% level (5% / 5 => 1%)
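The Bonferroni example can be checked numerically. This is a minimal sketch, assuming independent tests; the function names and the list of p-values are hypothetical.

```python
def bonferroni_level(alpha, k):
    """Per-test significance level under the simplified Bonferroni rule."""
    return alpha / k

def familywise_error_rate(per_test_alpha, k):
    """Chance of at least one false positive among k independent tests."""
    return 1 - (1 - per_test_alpha) ** k

k, alpha = 5, 0.05
level = bonferroni_level(alpha, k)
print(f"per-test level = {level:.2%}")
print(f"uncorrected family-wise rate = {familywise_error_rate(alpha, k):.4f}")
print(f"corrected family-wise rate   = {familywise_error_rate(level, k):.4f}")

# Hypothetical p-values: only those below alpha / k are declared significant.
p_values = [0.003, 0.020, 0.049, 0.300, 0.008]
significant = [p < level for p in p_values]
print(significant)
```

Testing each hypothesis at α/K keeps the global false positive rate at or below α, at the cost of making each individual test more demanding.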
SUBGROUPS

Subgroups
•  Indiscriminate subgroup analyses pose
serious multiplicity concerns. Problems
reverberate throughout the medical
literature. Even after many warnings,
some investigators doggedly persist in
undertaking excessive subgroup
analyses.
Lancet 2000; 355: 1033–34
Lancet 2005; 365: 1657–61
Interaction
Confounding factors
[Figures: subgroup comparisons by age (>= 45 years, < 45 years) and by smoking status (smokers, non-smokers); differences labelled d=5%, d=6%, d=11.5%, d=0.7%, d=0% and d=0%]

Subgroups & Simpson’s Paradox

ALL
            Experimental   Control
            n (%)          n (%)
Success     70 (70%)       60 (60%)
Failure     30 (30%)       40 (40%)
Total       100            100

Subgroups & Simpson’s Paradox (cont.)
[The slide labels the two subgroups MALE and FEMALE]

Subgroup with 30 experimental and 60 control participants
            Experimental   Control
            n (%)          n (%)
Success     10 (33%)       24 (40%)
Failure     20 (67%)       36 (60%)
Total       30             60

Subgroup with 70 experimental and 40 control participants
            Experimental   Control
            n (%)          n (%)
Success     60 (86%)       36 (90%)
Failure     10 (14%)       4 (10%)
Total       70             40
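The reversal in the Simpson's paradox slides can be verified from the counts. The sketch below uses the success counts and totals read from the slide; the subgroup keys are placeholders because the transcript does not show which table is MALE and which is FEMALE.

```python
# (successes, total) per arm; subgroup names are placeholders.
subgroups = {
    "subgroup 1": {"experimental": (10, 30), "control": (24, 60)},
    "subgroup 2": {"experimental": (60, 70), "control": (36, 40)},
}

def rate(successes, total):
    """Success proportion."""
    return successes / total

def pooled(arm):
    """Successes and total for one arm, summed over all subgroups."""
    successes = sum(group[arm][0] for group in subgroups.values())
    total = sum(group[arm][1] for group in subgroups.values())
    return successes, total

for name, group in subgroups.items():
    print(name, {arm: f"{rate(*counts):.0%}" for arm, counts in group.items()})
print("all", {arm: f"{rate(*pooled(arm)):.0%}" for arm in ("experimental", "control")})
```

The control arm has the higher success rate within each subgroup, yet the experimental arm looks better once the subgroups are pooled, because the two arms have very different subgroup mixes.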
Subgroups
ISIS-2: Vascular death by Star signs

                    Gemini/Libra            Other Star Signs
                    Aspirin    Placebo      Aspirin    Placebo
Vascular Death      150        147          654        868
Total               1357       1442         7228       7157
Rate                11.1%      10.2%        9.0%       12.1%
Difference          d=-0.9                  d=3.1
p-value             p=0.42045               p<0.0001

Interaction p = 0.019
Lancet 1988; 2: 349–60.

Changes from ISIS-2 results
Lancet 2005; 365: 1657–61
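The rates and differences in the ISIS-2 star sign table can be recomputed from the counts. This sketch only reproduces the arithmetic (d taken as placebo minus aspirin, in percentage points) and does not attempt to reproduce the p-values on the slide; the variable and function names are hypothetical.

```python
# (vascular deaths, total) per arm, as read from the slide.
isis2 = {
    "Gemini/Libra": {"aspirin": (150, 1357), "placebo": (147, 1442)},
    "Other star signs": {"aspirin": (654, 7228), "placebo": (868, 7157)},
}

def death_rate(deaths, total):
    """Vascular death rate as a percentage."""
    return 100 * deaths / total

def difference(group):
    """Placebo minus aspirin death rate, in percentage points."""
    return death_rate(*group["placebo"]) - death_rate(*group["aspirin"])

for name, group in isis2.items():
    aspirin = death_rate(*group["aspirin"])
    placebo = death_rate(*group["placebo"])
    print(f"{name}: aspirin {aspirin:.1f}%, placebo {placebo:.1f}%, d={difference(group):.1f}")
```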
•  “The answer to a randomized controlled
trial that does not confirm one’s beliefs
is not the conduct of several
subanalyses until one can see what one
believes. Rather, the answer is to
reexamine one’s beliefs carefully.”
–  BMJ 1999; 318: 1008–09.
Lancet 2005; 365: 176–86
Let us be critical
•  Sometimes things are not what they
seem
Let us be critical
Do I trust the value?
•  Claims with no results specified
•  Percentages without the denominator
•  Means without a confidence interval

Let us be critical
One more example
•  A patient is advised to undergo surgery and asks about the
probability of surviving.
•  The surgeon replies that in the 30 operations he has performed,
no patient has died.
•  Which values of P(death) are compatible with this information,
with 95% confidence?

Let us be critical
Solution
•  The approximate solution is of no use.
•  Exact solution, based on the binomial: {0; 0.116}
•  Upper limit of the 95% CI for p=0 with n=30:
Pr(X=0, n=30, ps) = 0.025
•  Even if the mortality is 11.6%, no deaths will be observed in
30 operations with Pr=0.025

... And what about the denominator?
The famous fantastic dog

Let us be critical
•  If data are available...
•  ... they must not be wasted. Well-‘tortured’ data
end up singing.
p<0.05 !!!

Because afterwards what happens, happens
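The exact upper limit quoted for the surgeon example (0 deaths in 30 operations) comes from solving (1 - p)^30 = 0.025 for p. A minimal sketch, assuming a two-sided 95% interval with 2.5% in the upper tail; the function name `upper_limit_zero_events` is hypothetical.

```python
def upper_limit_zero_events(n, confidence=0.95):
    """Upper limit of the two-sided exact CI when 0 events are seen in n trials."""
    tail = (1 - confidence) / 2
    # With zero events Pr(X = 0) = (1 - p) ** n, so solve (1 - p) ** n = tail.
    return 1 - tail ** (1 / n)

p_upper = upper_limit_zero_events(30)
print(f"upper limit = {p_upper:.3f}")
print(f"Pr(X=0 | n=30, p={p_upper:.3f}) = {(1 - p_upper) ** 30:.3f}")
```

So a mortality as high as roughly 11.6% is still compatible with the surgeon's record of 30 operations without a death.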