8 changes: 4 additions & 4 deletions _episodes/00-Introduction-to-module.md
@@ -3,12 +3,12 @@ title: "Module overview"
teaching: 10
exercises: 0
questions:
- "Who is this module for ?"
- "How can I get some help if I get stuck while solving an exercise or a question ?"
- "How can I validate this module ?"
- "Who is this module for?"
- "How can I get some help if I get stuck while solving an exercise or a question?"
- "How can I validate this module?"
objectives:
- "Teach neuroimagers about the statistical aspects of reproducibility"
- "Have a collaborative training enterprise: you can improve this module if you know how to do a pull request or raise an issue on github:github.com/repronim/module-stat. See module 'the informatics basics of reproducibility (module 0) on how to do this."
- "Have a collaborative training enterprise: you can improve this module if you know how to do a pull request or raise an issue on [github](https://github.com/repronim/module-stats). See the module 'The informatics basics of reproducibility' (module 0) on how to do this."
- "This module should give you a critical eye on most of the current literature and the knowledge to do statistically robust, and hence more replicable, work"
keypoints:
- Reproducible analysis is strongly impacted by statistical analyses.
23 changes: 12 additions & 11 deletions _episodes/01-lesson-statistical-basis.md
@@ -3,11 +3,11 @@ title: "Statistical basis for neuroimaging analyses: the basics"
teaching: "~ 60"
exercises: "~ 60"
questions:
- "Sampling, notion of estimation : estimates of mean and variances"
- "Sampling, notion of estimation: estimates of mean and variances"
- "Distributions, relation to frequency, PDF, CDF, SF, ISF "
- "Hypothesis testing: the basics H0 versus H1"
- "Confidence intervals "
- "Notion of model comparison : BIC/Akaike"
- "Notion of model comparison: BIC/Akaike"
- "Notion of bayesian statistics "
objectives:
- "After this lesson, you should have the statistical basis for understanding this course. You will know what sampling is, the fundamentals of statistical testing, etc. "
@@ -20,7 +20,7 @@ keypoints:
---


## Introduction: why do I need to know all this ?
## Introduction: Why do I need to know all this?

In our experience, you will not be able to think clearly about how "solid" a result you obtain is without a sufficient background in statistics. Sometimes, you will think that this is taking you "too far". We truly do not think so, and if you bear with us long enough we hope that you will find this rewarding and useful.

@@ -29,9 +29,10 @@ In our experience, you will not be able to think clearly about how "solid" the r
In this unit, we would like you to fully understand the concept of a sample as a subset of a population.

It is critical that you master this concept, as it is at the heart of most scientific studies.
-sample vs population
-generalizability
-notion of sufficient sample

- sample vs population
- generalizability
- notion of sufficient sample

Please read the sections on this webpage:
[Basics of sampling](http://www.socialresearchmethods.net/kb/sampling.php)
@@ -45,7 +46,7 @@ The key concepts are:
3. Say you sample 30 participants from a seemingly infinite population. Say you compute an average of some characteristic, for instance participant brain volume, $$M$$, across these 30 subjects. The *sampling distribution* of $$M$$ is the distribution you would obtain if you were to repeat the sampling of 30 participants a great number of times, for instance 10,000 times. For each sample of 30 values, you would compute its mean, and look at the distribution of this mean across all 10,000 samplings. These 10,000 values should be distributed around the true (unobserved) mean of the population.
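The sampling-distribution idea in point 3 can be checked with a short simulation. This is a sketch in Python with NumPy (the lesson does not prescribe a language); the population mean and standard deviation below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean, true_sd = 1200.0, 100.0  # hypothetical brain volumes, in cm^3

# Draw 10,000 samples of 30 participants each, and keep each sample's mean M
sample_means = np.array([
    rng.normal(true_mean, true_sd, size=30).mean()
    for _ in range(10_000)
])

# The 10,000 means cluster around the true (unobserved) population mean,
# with a spread of roughly true_sd / sqrt(30) (the standard error of the mean)
print(sample_means.mean())  # close to 1200
print(sample_means.std())   # close to 100 / sqrt(30), i.e. about 18.3
```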


> ## Questions on sampling. --->
> ## Questions on sampling.
>
> - If you sample a population two times and compute the means of the two samples, will you necessarily get two close values?
> - Is the difference between these two means always going to be smaller if the sample size increases to, say, 60?
@@ -70,7 +71,7 @@ See the [survival function](https://en.wikipedia.org/wiki/Survival_function) in

The cumulative distribution function of a random variable X is often noted $$F_X$$, or simply $$F$$ when there is no ambiguity.
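These four functions (PDF, CDF, SF, ISF) are available in most statistics libraries; here is a minimal sketch using SciPy (an assumption, since the lesson does not mandate a language):

```python
from scipy import stats

d = stats.norm(loc=0, scale=1)  # a standard normal random variable

print(d.pdf(0.0))    # probability density function at 0, about 0.399
print(d.cdf(1.96))   # CDF: P(X <= 1.96), about 0.975
print(d.sf(1.96))    # survival function: 1 - CDF, about 0.025
print(d.isf(0.025))  # inverse survival function, back to about 1.96
```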

> ## Questions on distribution and cumulative distribution function (CDF). --->
> ## Questions on distribution and cumulative distribution function (CDF).
>
> - Imagine that we have a normal distribution with a mean of -20 and a sigma of 1. Will the CDF at 0 be flat with a value of 0, or flat with a value of 1?
> - Same question as above with a mean of the normal distribution equal to 20?
@@ -121,11 +122,11 @@ The inertia of the community to adopt the bayesian framework is clear. One reas

### Questions on Bayesian statistics and comparison with the frequentist approach:

* Why are Bayesian statistics not very used in medical or life science journals ?
* What are the key advantage of Bayesian statistics ?
* Why are Bayesian statistics not widely used in medical or life science journals?
* What are the key advantages of Bayesian statistics?
* What do Bayesian statistics require before you can apply them? Do we usually have this?

## Notion of model comparison : BIC/Akaike
## Notion of model comparison: BIC/Akaike

Model comparison is **fundamental**. Here are a few links on this topic:

24 changes: 13 additions & 11 deletions _episodes/02-Effect-size.md
@@ -24,9 +24,11 @@ Here, we start with an explanation of the effect size using a t-test. First, hav

Effect sizes come in many flavors; Wikipedia also lists a series of types of effect sizes, such as correlation, variance explained, difference of means, etc. When normalized, they are supposed to capture, in some sense, the difficulty of detecting such an effect. When not normalized, they can give us a sense of the underlying biology. For instance, you would in general not believe that the volumes of the frontal lobes of a population diagnosed with autism are on average twice as big as in a control population, but a difference of a few cubic mm would be believable (although not necessarily true).

Is the overview clear to you? Just to give you a concrete example if it is not, say we want to test the difference of the means of two populations, for instance the brain activity in the visual cortex for the normal versus patient population. We *sample* 30 normals and 30 patients, and we compute *estimated* means of the two populations using our *samples* of 30 + 30 participants. Let's measure the BOLD response in the visual cortex for all participants. The average of the 30 participants in the control group (CG) is 5\%. The average of the 30 participants in the patient group (PG) is 8\%. The standard deviation of the data (not of the average that we just referred to) in the CG is 1\% and the standard deviation of the CG is 2\%.
$$\sqrt{\frac{a}{b}}$$

Now, let's say we are studying the activity of the visual cortex in the CG. You want to know if this is different from zero. The *estimated* (or sampled) effect size for the CG is 5%. The **normalized** effect size will be divided by the **standard deviation of the data**, so 5/1 = 5 for CG, and 8/2=4 for the PG, while the corresponding t-tests will be t=5/(1/sqrt(30-1)) and t=8/(2/sqrt(30-1)). The true effect size (the one Wikipedia would write in greek letter) would be the value of the BOLD responses for the **whole populations** of control and patients.
Is the overview clear to you? Just to give you a concrete example if it is not, say we want to test the difference of the means of two populations, for instance the brain activity in the visual cortex for the control versus patient population. We *sample* 30 controls and 30 patients, and we compute *estimated* means of the two populations using our *samples* of 30 + 30 participants. Let's measure the BOLD response in the visual cortex for all participants. The average of the 30 participants in the control group (CG) is 5%. The average of the 30 participants in the patient group (PG) is 8%. The standard deviation of the data (not of the average that we just referred to) in the CG is 1% and the standard deviation of the PG is 2%.

Now, let's say we are studying the activity of the visual cortex in the CG. You want to know if this is different from zero. The *estimated* (or sampled) effect size for the CG is 5%. The **normalized** effect size is divided by the **standard deviation of the data**, so $$\frac{5}{1} = 5$$ for the CG, and $$\frac{8}{2}=4$$ for the PG, while the corresponding t-tests will be $$t=\frac{5}{\frac{1}{\sqrt{30-1}}}$$ and $$t=\frac{8}{\frac{2}{\sqrt{30-1}}}$$. The true effect size (the one Wikipedia would write with a Greek letter) would be the value of the BOLD responses for the **whole populations** of controls and patients.

How would _you_ define the *estimated* effect size of the difference of the two populations? The "raw" / "not normalized" effect size would simply be (8-5)%. To define the **normalized** one, we need to estimate the variability of the data, and this can be done using the pooled standard deviation (something close to the average of the two standard deviations, weighted by the group sizes). See "Cohen's d" in the "Difference family: Effect sizes based on differences between means" section of [wikipedia](https://en.wikipedia.org/wiki/Effect_size). See also [Welch's t-test](https://en.wikipedia.org/wiki/Welch%27s_t-test) and its estimation of the statistic's degrees of freedom, which could provide another way to define the normalized effect size.
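As a sketch (in Python, which this lesson does not prescribe), here is Cohen's d with a pooled standard deviation for the made-up group summaries above:

```python
import numpy as np

# Hypothetical summary statistics from the example above (BOLD response, %)
n_cg, mean_cg, sd_cg = 30, 5.0, 1.0  # control group
n_pg, mean_pg, sd_pg = 30, 8.0, 2.0  # patient group

# Pooled standard deviation: each variance weighted by its degrees of freedom
sd_pooled = np.sqrt(((n_cg - 1) * sd_cg**2 + (n_pg - 1) * sd_pg**2)
                    / (n_cg + n_pg - 2))

raw_effect = mean_pg - mean_cg      # "raw" effect size: 3 percentage points
cohens_d = raw_effect / sd_pooled   # normalized effect size
print(sd_pooled)  # about 1.58
print(cohens_d)   # about 1.90
```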

@@ -46,8 +48,8 @@ This [article](http://staff.bath.ac.uk/pssiw/stats2/page2/page14/page14.html) is
---

> ## Questions
> - Is the coefficient of correlation an effect size ?
> - Can I always compare normalized effect sizes?
>- Is the coefficient of correlation an effect size?
>- Can I always compare normalized effect sizes?
>
{: .challenge}

@@ -65,9 +67,9 @@ Coming back to the effect size, we now can understand the Cohen's f2 effect size


> ## Exercise
> - Is the R2 a "normalized effect size" ?
> - If you have an experiment with three samples of different populations, and some other covariables. Can you write what would be the R2 for the part of the model that corresponds to the difference of the means of the three groups? Can you generalize the wikipedia page on R2 to account for the case of an F test ? See how this is explained in a non-simple linear model in [wikipedia](https://en.wikipedia.org/wiki/Coefficient_of_determination).
> - If I am computing an effect size from the general model with neuroimaging data, should it be a standardized one? Why?
>- Is the R2 a "normalized effect size"?
>- If you have an experiment with three samples from different populations, and some other covariates, can you write what the R2 would be for the part of the model that corresponds to the difference of the means of the three groups? Can you generalize the wikipedia page on R2 to account for the case of an F test? See how this is explained for a non-simple linear model in [wikipedia](https://en.wikipedia.org/wiki/Coefficient_of_determination).
>- If I am computing an effect size from the general model with neuroimaging data, should it be a standardized one? Why?
>
{: .challenge}

@@ -86,10 +88,10 @@ In particular, read the section "Effect size and confidence interval".
Please read it up to the paragraph "(2) Covariates, multiple regression, GLM and effect size calculations": this paper summarizes a lot of the information that we need to work efficiently. This should take you around one to two hours. Then try to answer the following questions.

> ## Questions
> - What was the meaning of "effect size" in the Wikipedia page?
> - Can you point to some neuroimaging work that make obvious what is the effect size of the results?
> - The authors refer to three type of effect sizes on page 595 in the section "How to obtain and interpret effect sizes". Can you think of a possibly missing effect size?
> - Why are the authors advocating for non-normalized effect (eg top of p 597, right column)?
>- What was the meaning of "effect size" in the Wikipedia page?
>- Can you point to some neuroimaging work that makes the effect size of the results obvious?
>- The authors refer to three types of effect sizes on page 595, in the section "How to obtain and interpret effect sizes". Can you think of a possibly missing effect size?
>- Why are the authors advocating for non-normalized effects (e.g., top of p. 597, right column)?
{: .challenge}


43 changes: 22 additions & 21 deletions _episodes/03-p-values.md
@@ -3,25 +3,26 @@ title: "P-values and their issues"
teaching: "~ 60"
exercises: "~ 60"
questions:
- "What is a p-value ?"
- "What should I be aware of when I see a 'significant' p-value ?"
- "What is a p-value?"
- "What should I be aware of when I see a 'significant' p-value?"
objectives:
- "After this lesson, you should know what is a p-value and interpret it appropriately. You will know about the caveats of p-values."
keypoints:
- "A p-value does not give you an idea of the importance of the result"
- "A p-value should always be complemented by other information (effect size, confidence interval)"
---

## Introduction: are p-values entirely evil ?
## Introduction: are p-values entirely evil?

As is often the case, a headline with a question mark is answered with a "no". But p-values have been seriously misused by scientists, especially in the life science and medical fields, such that they require specific attention, hence this lesson.

## P-value

### Starting with a little challenge !
### Starting with a little challenge!

> ## Can you answer these questions? Even if yes, you may want to read the p-value section --->
> ## Can you answer these questions? Even if yes, you may want to read the p-value section
>
> Which of the following statements is/are true?
> - A p-value is telling me that my alternative hypothesis is likely (H1 is probably true)
> - A p-value is telling me that my null hypothesis is unlikely (H0 is probably false)
> - A p-value is telling me that my null hypothesis is less likely than my alternative hypothesis
@@ -69,37 +70,36 @@ to the exercise is given [here](https://github.com/ReproNim/module-stats/blob/gh
interested in the mean of our samples.

* Test if the mean is significantly greater than zero with a type I error rate
of 5\%. If it is, what was the chance of this happening? If it is not
of 5%. If it is, what was the chance of this happening? If it is not
"significant", repeat the sampling and test again until you find something
significant. How many times did you need to sample again ? What would you have
expected ?
significant. How many times did you need to sample again? What would you have
expected?

* Now, say we have some signal. Simulate the case where the mean of our
sampling distribution is 1.64/$$\sqrt(30) $$ and the sigma is one in one
case, and the mean is .164/$$\sqrt(30) $$ and the sigma is .1 in another case.
How many times is the test significant in both cases if you do 100 simulations
? what would you expect ?
sampling distribution is $$\frac{1.64}{\sqrt{30}}$$ and the sigma is one in one
case, and the mean is $$\frac{.164}{\sqrt{30}}$$ and the sigma is .1 in another case.
How many times is the test significant in each case if you do 100 simulations? What would you expect?

* You should find that, roughly, the number of times these two tests are
"significant" is about the same, because the signal-to-noise ratio is the
same. But there is a fundamental difference: if the mean represented a
biological value, what would that difference be?
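A possible way to set up these simulations is sketched below (this is not the official solution linked above; it assumes Python with NumPy and SciPy):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n, n_sim, alpha = 30, 100, 0.05
signal = 1.64 / np.sqrt(30)  # mean of the sampling distribution, sigma = 1

# One-sided one-sample t-test against 0 for each simulated experiment
n_significant = 0
for _ in range(n_sim):
    sample = rng.normal(loc=signal, scale=1.0, size=n)
    result = stats.ttest_1samp(sample, 0.0, alternative="greater")
    n_significant += result.pvalue < alpha

# The count depends only on the signal-to-noise ratio, so the
# mean = 0.164/sqrt(30), sigma = 0.1 case should give a similar count
print(n_significant)
```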

### Multiple Comparison problem:

One of the best ways to understand the problem is to look at the [xkcd view of it](https://xkcd.com/882/). The cartoon is great not only because it exposes the issue, but because it also exposes the *consequence* of the issue in peer-reviewed publications.

Please also go through the wikipedia of [multiple comparisons](https://en.wikipedia.org/wiki/Multiple_comparisons_problem#Classification_of_multiple_hypothesis_tests)
Please also go through the wikipedia of [multiple comparisons](https://en.wikipedia.org/wiki/Multiple_comparisons_problem#Classification_of_multiple_hypothesis_tests).
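The scale of the problem is easy to simulate (a sketch assuming Python with NumPy and SciPy): test 100 datasets in which the null hypothesis is true everywhere, and count "significant" results with and without a Bonferroni correction.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_tests, n, alpha = 100, 30, 0.05

# 100 independent null datasets: there is no true effect anywhere
pvals = np.array([
    stats.ttest_1samp(rng.normal(size=n), 0.0).pvalue
    for _ in range(n_tests)
])

print((pvals < alpha).sum())            # around 5 false positives on average
print((pvals < alpha / n_tests).sum())  # Bonferroni-corrected: usually 0
```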

### Exercise on multiple comparison issue :
### Exercise on multiple comparison issue:

> ## Can you answer these questions? --->
> ## Can you answer these questions?
>
> - You look at the correlation between the size of the nose of individuals
> and the size of their car. You have data from 100 cities. Is it likely
> that you will find a significant correlation in at least one city?
> - If I do 10 statistical significance tests, to have a false positive rate
> of 5\%, I should use 5/10\% for each individual test
> of 5%, I should use $$\frac{5\%}{10}$$ for each individual test
> - If the 10 statistics tested (eg, 10 t-statistics) are positively
> correlated, is this correction too harsh?
{: .challenge}
@@ -125,21 +125,22 @@ Here is a great article about it:

This takes as examples the number of drugs tested on individuals, or the mammography test for cancer, but it generalizes easily to the number of voxels or ROIs tested. It introduces you to very important concepts; read carefully and make sure you understand what the base rate fallacy is. After reading, you should know more not only about Type I and Type II errors but, importantly, about the issue of having a low prior probability for the alternative hypothesis.
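The article's reasoning boils down to one application of Bayes' rule; here is a sketch with made-up numbers (the prevalence, power, and type I error rate below are illustrative assumptions, not values from the article):

```python
# Made-up numbers to illustrate the base rate fallacy
prior = 0.01        # prevalence: 1% of tested hypotheses are true effects
sensitivity = 0.80  # power: P(significant | true effect)
alpha = 0.05        # type I error rate: P(significant | no effect)

# Bayes' rule: probability that a "significant" result is a true effect
ppv = (sensitivity * prior) / (sensitivity * prior + alpha * (1 - prior))
print(round(ppv, 3))  # about 0.139: most significant results are false alarms
```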

> ## Can you answer these questions? --->
> ## Can you answer these questions?
>
> - Can you think of a situation where the base rate fallacy occurs in brain imaging ?
> - Can you propose a way to avoid the fallacy ?
> - Can you think of a situation where the base rate fallacy occurs in brain imaging?
> - Can you propose a way to avoid the fallacy?
{: .challenge}



## Going further : what is the distribution of a p-value?
## Going further: what is the distribution of a p-value?

You should now have a good idea of what a distribution is, and what the cumulative distribution function of a random variable is.

An interesting fact is that p-values are themselves random variables: they are just a function of the data, and the data are random (since you got these specific data by sampling, e.g., subjects).

So, say you sample from a normal N(0,1) distribution: what is the distribution of the p-value for a test T (for instance, the test T is simply a z-score for a sample of N(0,1) variables)? We will see that this distribution is uniform, where all values are equally probable (loosely speaking).
Here is a visualization to play with: [rpsychologist.com/d3/pdist/](https://rpsychologist.com/d3/pdist/).
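You can also check this with a quick simulation (a sketch assuming Python with NumPy and SciPy): under the null hypothesis the p-values fall roughly evenly into every bin of a histogram.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Under H0, each "experiment" is a single draw from N(0,1);
# the one-sided p-value is the survival function of the z-score
z = rng.normal(size=10_000)
pvals = stats.norm.sf(z)

print(pvals.mean())                       # close to 0.5, as for a uniform
counts, _ = np.histogram(pvals, bins=10)
print(counts)                             # roughly 1,000 p-values per bin
```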

> **Warning: this is *more advanced* material, you may want to skip it if you don't have some mathematical background**
>