to_explain_or_predict.Rmd
Lines changed: 47 additions & 19 deletions
@@ -114,13 +114,14 @@ ___Box (1976)___
 What is your question?
 </h2></center>

+???

+Throughout this, keep asking yourself: what is your question?

 ---

 # The two broad classes of DS/modelling question:

---

 ## Explain

@@ -142,8 +143,11 @@ __You can use many of the same models to fit in either context, but how you do i

 ???
 Prof. Shmueli's paper laments that statisticians had focused almost exclusively on 'explanatory' models.
-I'd like to suggest that, with the increasing accessibility of Data Science and Machine Learning, the focus of many
-modern practitioners has swung the other way. Some of you may always be approaching a model as a prediction question.
+
+With the increasing accessibility of Data Science and Machine Learning, the focus of many
+modern practitioners has swung the other way.
+
+Some of you may always be approaching a model as a prediction question.

 What I'm presenting here today is fairly agnostic to your approach, be it Bayesian / frequentist / whatever.

@@ -172,9 +176,10 @@ $$E(Y) = f(X)$$
 Shmueli, G. (2010), http://www.jstor.org/stable/41058949
 ]

+
 ???

-Firstly, don't be scared by the representation here, as I'll explain.
+...don't be scared, it's not that bad...

 We are trying to model how X causes something, without being constrained by what data we have.
 This can be concepts such as Y = depression, and F(x) could be things like: anxiety, past trauma, physical health, stress... etc.
@@ -197,7 +202,7 @@ We can't measure them directly, so
 What do I mean by 'causes'? It's not the same as 'associated with'. There is an 'exposure' to 'outcome' effect, and a temporal element: i.e. exposure before outcome.
 This DAG is hypothesising the causal relationship between chemotherapy and venous thromboembolism (VTE)

-The arrows indicate the direction of causal relationships. Age, sex, tumor site and tumour size are confounding this relationship and should be adjusted for in a model, but platelet count is a mediator and should not.
+The arrows indicate the direction of causal relationships. Age, sex, tumour site and tumour size are confounding this relationship and should be adjusted for in a model, but platelet count is a mediator and should not.
to_explain_or_predict.html
Lines changed: 48 additions & 35 deletions
@@ -65,13 +65,14 @@
 What is your question?
 </h2></center>

+???

+Throughout this, keep asking yourself: what is your question?

 ---

 # The two broad classes of DS/modelling question:

---

 ## Explain

@@ -93,8 +94,11 @@

 ???
 Prof. Shmueli's paper laments that statisticians had focused almost exclusively on 'explanatory' models.
-I'd like to suggest that, with the increasing accessibility of Data Science and Machine Learning, the focus of many
-modern practitioners has swung the other way. Some of you may always be approaching a model as a prediction question.
+
+With the increasing accessibility of Data Science and Machine Learning, the focus of many
+modern practitioners has swung the other way.
+
+Some of you may always be approaching a model as a prediction question.

 What I'm presenting here today is fairly agnostic to your approach, be it Bayesian / frequentist / whatever.

@@ -123,9 +127,10 @@
 Shmueli, G. (2010), http://www.jstor.org/stable/41058949
 ]

+
 ???

-Firstly, don't be scared by the representation here, as I'll explain.
+...don't be scared, it's not that bad...

 We are trying to model how X causes something, without being constrained by what data we have.
 This can be concepts such as Y = depression, and F(x) could be things like: anxiety, past trauma, physical health, stress... etc.
@@ -148,7 +153,7 @@
 What do I mean by 'causes'? It's not the same as 'associated with'. There is an 'exposure' to 'outcome' effect, and a temporal element: i.e. exposure before outcome.
 This DAG is hypothesising the causal relationship between chemotherapy and venous thromboembolism (VTE)

-The arrows indicate the direction of causal relationships. Age, sex, tumor site and tumour size are confounding this relationship and should be adjusted for in a model, but platelet count is a mediator and should not.
+The arrows indicate the direction of causal relationships. Age, sex, tumour site and tumour size are confounding this relationship and should be adjusted for in a model, but platelet count is a mediator and should not.