Introduction
- Research involving data in a single group is usually a fact-finding exercise. This workshop is divided into 5 sections, dealing with the following subjects.
- Survey Research : To establish the **mean** of a measurement of interest in individuals from a sample
- Survey Research : To establish the **proportion** of positive cases in a property of interest in individuals from a sample
- **Correlation** : To study the relationship between two measurements in individuals from a sample
- Pearson's Correlation Coefficient for parametric data
- Spearman's Correlation Coefficient for nonparametric data
- **Regression** : To study how one measurement is predicted or controlled by another in individuals from a sample
- Exercises in these subjects
- The computer programs performing calculations for these subjects are
- StatPgm_2a_Survey.php for mean and proportion in a sample
- StatPgm_2b_CorReg.php for correlation and regression
- Contents with detailed discussions are
- Contents_2a_Survey_Mean_Prop.php : Single Group Survey Statistics for mean and proportion
- Contents_2b_Correlation_Regression.php : Correlation and Regression
Survey to find a mean
- Calculating sample size during planning
- parameters are
- Anticipated Standard Deviation
- Precision : 95% confidence interval of estimated mean
- Example from the program : Estimating mean birth weight
- Anticipated Standard Deviation = 450g
- Required precision of the 95% confidence interval = ±100g
- Sample size required = 81
- Calculating precision and 95% confidence interval after data is collected
- Data obtained are
- sample size
- mean and Standard Deviation
- Example from the program : Estimating mean birth weight
- sample size used = 81
- Standard Deviation found = 450g
- Precision for 95% confidence interval = 99.5g, rounded up to 100g
- Estimated mean birth weight = mean ±100g
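The two calculations above can be sketched in Python. This is a minimal sketch, not the code of StatPgm_2a_Survey.php: it assumes the precision is the half-width t × SD / √n of a confidence interval based on Student's t, which is consistent with the figures quoted above (81 and 99.5) but is not stated in these notes. The function names are hypothetical, and the t quantile is computed numerically so that only the standard library is needed.

```python
import math
from statistics import NormalDist


def t_cdf(x, df, steps=200):
    """P(T <= x) for x >= 0 under Student's t, by Simpson's rule on [0, x]."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(u * u / df))

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile of Student's t for p > 0.5, by bisection on t_cdf."""
    lo, hi = 0.0, 2.0
    while t_cdf(hi, df) < p:          # widen the bracket until it holds the quantile
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def precision_of_mean(sd, n, confidence=0.95):
    """Half-width of the confidence interval of a mean (assumed t-based)."""
    t = t_quantile(1 - (1 - confidence) / 2, n - 1)
    return t * sd / math.sqrt(n)


def sample_size_for_mean(sd, precision, confidence=0.95):
    """Smallest n whose half-width does not exceed the required precision."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # The normal approximation never exceeds the t-based answer, so start there.
    n = max(2, math.ceil((z * sd / precision) ** 2))
    while precision_of_mean(sd, n, confidence) > precision:
        n += 1
    return n


# Birth weight example: SD = 450g, required precision = 100g
n = sample_size_for_mean(sd=450, precision=100)
print(n, round(precision_of_mean(450, n), 1))
```

Under these assumptions the sketch reproduces the sample size of 81 and the precision of 99.5g. A purely normal approximation (z = 1.96) would give a slightly smaller sample size.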
Survey to find a proportion
- Calculations in StatPgm_2a_Survey.php
- Calculating sample size during planning
- parameters are
- Anticipated proportion
- Precision : 95% confidence interval of estimated proportion
- Example from the program : Estimating Caesarean Section rate
- Anticipated C.S. rate = 20% (0.2)
- Required precision of the 95% confidence interval = ±5% (0.05)
- Sample size required = 246 deliveries
- Calculating precision and 95% confidence interval after data is collected
- Data obtained are
- sample size
- proportion found
- Example from the program : Estimating Caesarean Section rate
- sample size used = 250
- C.S. rate found = 22% (0.22)
- precision for 95% confidence interval = 0.051 (5.1%)
- Estimated C.S. rate = 22% ± 5.1% = 16.9% to 27.1%
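The Caesarean Section example can be checked with a short Python sketch. It assumes the usual normal approximation for a proportion, precision = z × √(p(1 − p) / n), which reproduces the figures above. Whether StatPgm_2a_Survey.php uses exactly this formula is an assumption, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def z_value(confidence=0.95):
    """Two-sided standard normal quantile, about 1.96 for 95%."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_proportion(p, precision, confidence=0.95):
    """Sample size so that the CI half-width is at most `precision`."""
    z = z_value(confidence)
    return math.ceil(z * z * p * (1 - p) / precision ** 2)


def precision_of_proportion(p, n, confidence=0.95):
    """Half-width of the CI for a proportion p observed in n subjects."""
    return z_value(confidence) * math.sqrt(p * (1 - p) / n)


# Planning: anticipated rate 20%, required precision 5%
print(sample_size_for_proportion(0.20, 0.05))

# After data collection: 22% found in 250 deliveries
half_width = precision_of_proportion(0.22, 250)
print(round(half_width, 3), round(0.22 - half_width, 3), round(0.22 + half_width, 3))
```

With these inputs the sketch gives 246 deliveries at the planning stage and a precision of 0.051 after data collection, so the interval is 16.9% to 27.1%.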
Correlation Coefficient
- Relationship between two parametric measurements
- Sample size estimation during planning
- Program in StatPgm_2b_CorReg.php
- Requires the following parameters
- The Type I Error used to decide statistical significance. Default value p=0.05
- The power to detect the correlation. Default value 0.8 (80%)
- The size of the correlation coefficient that is meaningful to the researcher
- Results are the sample sizes required for the 1 tail and 2 tail models
- 1 tail model if the direction is already determined, and if only statistical significance is relevant
- 2 tail model if the upper and lower bounds of the 95% confidence interval are relevant, and if results are intended for use in future comparisons or meta-analysis
- Data input
- Two columns representing the pair of measurements
- Each row is a pair from one subject
- Result output
- Pearson's Correlation Coefficient ρ
- Standard test of statistical significance (old)
- Standard Error of ρ
- Degrees of freedom
- Student's t, and probability of t (Type I Error)
- 95% confidence interval of coefficient ρ
- Fisher's Z Transformation and the Standard Error of Z
- 95% confidence interval of Z
- reverse transformation of confidence interval to interval for ρ
- One tail to test if it overlaps null (0)
- Two tail for comparison and meta-analysis
- Example in the program
- 20 pairs of measurements : crown heel length and head circumference at birth
- Standard results r=0.6, SE=0.19, p (α)=0.002, statistically highly significant
- Fisher's Z Transformation = 0.70, SE=0.24, 95% CI (one tail) = 0.3 to ∞, converted back to 95% CI for correlation coefficient (one tail) = 0.29 to 1
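The Pearson results listed above can be illustrated with a short Python sketch that works from the rounded summary values (r = 0.6, n = 20) rather than the raw data, so small differences from the program's output are expected. The sample size function uses the common Fisher's Z approximation, n = ((z_α + z_β) / Z)² + 3. That StatPgm_2b_CorReg.php uses the same formula is an assumption, and all function names are hypothetical.

```python
import math
from statistics import NormalDist

norm = NormalDist()


def pearson_summary(r, n, confidence=0.95):
    """Standard test values and Fisher's Z confidence limits for r from n pairs."""
    se_r = math.sqrt((1 - r * r) / (n - 2))      # Standard Error of r
    t = r / se_r                                 # Student's t with n - 2 df
    z = math.atanh(r)                            # Fisher's Z Transformation
    se_z = 1 / math.sqrt(n - 3)                  # Standard Error of Z
    # One tail: lower bound only, converted back to the r scale with tanh
    one_tail_low = math.tanh(z - norm.inv_cdf(confidence) * se_z)
    # Two tail: both bounds, converted back to the r scale
    half = norm.inv_cdf(1 - (1 - confidence) / 2) * se_z
    two_tail = (math.tanh(z - half), math.tanh(z + half))
    return {"se_r": se_r, "t": t, "df": n - 2, "z": z, "se_z": se_z,
            "one_tail_low": one_tail_low, "two_tail": two_tail}


def sample_size_for_correlation(r, alpha=0.05, power=0.8, tails=2):
    """Approximate pairs needed to detect a correlation of size r."""
    z_alpha = norm.inv_cdf(1 - alpha / tails)
    z_beta = norm.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / math.atanh(abs(r))) ** 2 + 3)


result = pearson_summary(r=0.6, n=20)
print(round(result["se_r"], 2), round(result["se_z"], 2),
      round(result["one_tail_low"], 2))
```

With the rounded inputs this gives SE of r = 0.19, SE of Z = 0.24, and a one tail lower limit of 0.29 for the correlation coefficient, in line with the example. Z itself comes out as 0.69 rather than 0.70, presumably because the program's r is not exactly 0.6.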
- Relationship between two nonparametric measurements
- Nonparametric measurements are those where assumptions of normal distribution cannot be made
- Sample size estimation :
- no precise calculation exists as the distribution of data is unstated
- As a general estimate, nonparametric methods have about 95% of the power of parametric methods
- Rule of thumb : use sample size estimation for Pearson's Correlation, then add 10%
- Data input
- Two columns representing the pair of measurements
- Each row is a pair from one subject
- Result output
- Statistical significance in terms of probability of Type I Error (p, α)
- Example in program
- 8 pairs of Likert Scale measurements
- Table of counts
- r = 0.80 p<0.05
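As a rough illustration of the nonparametric calculation, Spearman's coefficient can be computed as Pearson's coefficient applied to the ranks of the data, with tied values sharing their average rank. The sketch below assumes this standard definition. The Likert scores in it are hypothetical and are not the 8 pairs used in the program, and the function names are also hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson's Correlation Coefficient of two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's Correlation Coefficient: Pearson's r on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def spearman_sample_size(pearson_n):
    """Rule of thumb from the text: Pearson sample size plus 10%, rounded up."""
    return (pearson_n * 11 + 9) // 10


# Hypothetical Likert scores (1 to 5) from 8 subjects
score_a = [1, 2, 2, 3, 4, 4, 5, 5]
score_b = [2, 1, 3, 3, 4, 5, 4, 5]
print(round(spearman_rho(score_a, score_b), 2))
```

The significance test (the p value reported by the program) is not included here, as the notes do not say which method the program uses.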
Regression Analysis
- Hierarchical relationship between two measurements in a formula
- y = a + bx, where a is the constant, and b the regression coefficient
- x = independent variable
- y = dependent variable as its value can be calculated from x
- Sample size estimation : same as for correlation coefficient between x and y
- Data input
- Two columns representing the pair of measurements
- Column 1 = x, and x must be at least ordinal
- Column 2 = y, and y must be parametric (normally distributed)
- Result output : y = a + bx
- a = constant, value of y when x = 0
- b = regression coefficient, change in y for each unit of change in x
- Standard Error, and 95% confidence interval of b
- Example in StatPgm_2b_CorReg.php
- 22 pairs of measurements
- Col 1 : x = gestation in weeks
- Col 2 : y = birth weight in grams
- Result : Birthweight in grams = -5585 + 230 (gestation in weeks)
- Within the data range (gestation between 33 and 41 weeks) birth weight increases by 230g per week
- The average birth weight can be estimated from gestation, e.g. at 40 weeks, birth weight = -5585 + 230 × 40 = 3615g
- Plot : Data and result regression line are plotted by Macroplot and can be edited
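The regression formula y = a + bx can be sketched in Python as an ordinary least squares fit. The 22 pairs from the program are not reproduced in these notes, so the sketch uses synthetic points generated from the published result (a = -5585, b = 230) over the stated range of 33 to 41 weeks. The data and the function names are hypothetical, and the Standard Error of b is not computed.

```python
def fit_line(xs, ys):
    """Least squares estimates of a (constant) and b (regression coefficient)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx                 # change in y for each unit change in x
    a = mean_y - b * mean_x       # value of y when x = 0
    return a, b


def predict(x, a, b):
    """Estimated y for a given x, using y = a + bx."""
    return a + b * x


# Synthetic data lying exactly on the published line (not the program's data)
weeks = list(range(33, 42))
weights = [-5585 + 230 * w for w in weeks]

a, b = fit_line(weeks, weights)
print(round(a), round(b), round(predict(40, a, b)))
```

Because the synthetic points lie exactly on the line, the fit returns the published coefficients, and the prediction at 40 weeks is 3615g. As in the text, predictions are only meaningful within the data range.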
Exercises for Estimating Mean or Proportion
Calculations require StatPgm_2a_Survey.php
We need 18, 65, and 249 babies
The 95% confidence intervals are 3489g-3911g, 3572g-3828g, 3611g-3789g, and 3638g-3762g
We need to observe 99, 13, and 7 visits
The 95% confidence intervals of waiting times are 16min-54min, 24min-46min, 28min-42min, and 31min-39min
We need 73, 289, 1801, and 7203 deliveries
- Hosp 1 : 1000 deliveries, with 180 CSs.
- Hosp 2 : 100 deliveries, with 40 CSs.
- Hosp 3 : 500 deliveries, with 100 CSs.
- Hosp 4 : 900 deliveries, with 225 CSs.
The 95% confidence intervals of CS rates are :
- Hosp 1 : 16%-20%
- Hosp 2 : 30%-50%
- Hosp 3 : 17%-24%
- Hosp 4 : 22%-28%
We need to observe :
- 7299, 1825, and 457 primiparous deliveries
- 1522, 381, and 96 multiparous deliveries
The 95% confidence intervals of the rate of torn perineum are
- 5.9%-9.1% for primiparas
- 0.5% to 1.9% in multiparas
Exercises for Correlation and Regression
Calculations for these exercises require StatPgm_2b_CorReg.php
More Exercises
- Perform the exercises in Exercises_1.php : One Group
- Perform the exercises in Exercises_2.php : Correlation / Regression