# Lab 4: Statistical Interpretation of Uncertainty {-}

Name:
    
## Skills
1. Use Python to plot statistical distributions and histograms.
2. Use Python to calculate the mean, standard deviation, and standard deviation of the mean for a data set.
3. Use Python to read data from a file.
4. Explain the difference between standard deviation and standard deviation of the mean.
5. Use statistical information to report uncertainties.



## Activity I - Pendulum Period (33 points)

### Equipment needed
1. 1.5-meter-long pendulum.
2. Stopwatch (the one on your phone will do)

### Goal (Overview)
Each student will measure the period of a 1.5-meter-long pendulum 10 times, giving a set of ~150 data points for the whole class. We will then calculate the acceleration due to gravity for each measurement, compute the i) mean, ii) standard deviation, and iii) standard deviation of the mean, and use these results to report an appropriate uncertainty.

### Procedure
1. A pendulum that is 1.5 meters long will be set up at the front of the class. Each person should measure the period of the pendulum with a stopwatch 10 times, recording each value to the nearest 0.01 s. For the best results, keep the pendulum's amplitude consistent for all of the measurements, restarting the pendulum as necessary. The amplitude should be relatively small (less than about $10^\circ$). The pendulum should be swinging (rather than at rest) when the time measurement begins.
2. Record your 10 measurements in the Google Sheets document linked below.
3. Once everyone has collected their measurements, someone needs to move all of the data into a single column. Then download the file to your computer, saving it as a .csv file, and use the `np.loadtxt()` function to read the data into Python. Do this on line 7 of the code cell below. (Remember: The file must be located in the same directory as this Jupyter notebook.)
4. The acceleration due to gravity can be calculated from $$ g = {4 \pi^2 L \over T^2}$$ On line 9 of the code cell below calculate the acceleration due to gravity for each period measurement in the data set. To include $\pi$ in your calculation use `np.pi`.
5. Use the equations given in the reading above to calculate the mean, standard deviation, and standard deviation of the mean.  Perform these calculations on lines 10, 11, and 12 of the code cell below.  Use `np.sum()` to perform summations and `np.sqrt()` to perform square root functions.
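6. On lines 14, 15, and 16 of the code cell below, use the functions `stats.mean()` and `stats.stdev()` to calculate the mean, standard deviation, and standard deviation of the mean. (The `statistics` module has no function for the standard deviation of the mean; divide the result of `stats.stdev()` by $\sqrt{N}$.)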
7. Construct a dataframe with the periods in the first column and the values of $g$ in the second column. Do this on line 18 of the code cell below.
8. A statistical summary of the data can be generated using `dataframe.describe()`, which creates a new dataframe containing often-used statistics. Do this on line 20.
9. Add a formatted print statement to report an appropriate value for $g$ with its uncertainty.
10. Plot a histogram of the $g$ values.
11. The normal distribution is given by $$ N(x) = {1\over \sigma \sqrt{2 \pi}} e^{-(x- \mu)^2\over 2 \sigma^2}$$ Plot this distribution on top of the histogram. Verify that the limiting distribution matches the histogram.
12. Calculate the p-value for this dataset using $g = 9.8$ as the expected value. State whether the deviation of your experimental result from the expected value is significant enough (at the $5\%$ level) to reject the experiment. Do this on line 22 in the code cell below.
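
The calculations in steps 4 through 6, 9, and 12 can be sketched as follows. This is only an illustration: the period values are made up rather than taken from the class data, the variable names are placeholders, and the hand-written standard deviation assumes the $N-1$ (sample) form, which is the form `stats.stdev()` uses. Check it against the equation in the reading before relying on it.

```python
import numpy as np
import statistics as stats
from scipy.stats import ttest_1samp

L = 1.5  # pendulum length in meters (from the lab description)

# Made-up period measurements in seconds, for illustration only
T = np.array([2.46, 2.44, 2.47, 2.45, 2.48, 2.43, 2.46, 2.45])
N = len(T)

# One value of g for each period measurement
g = 4 * np.pi**2 * L / T**2

# Hand-written formulas using np.sum and np.sqrt
mean_g = np.sum(g) / N
std_g = np.sqrt(np.sum((g - mean_g)**2) / (N - 1))  # sample standard deviation
sdom_g = std_g / np.sqrt(N)                          # standard deviation of the mean

# The same quantities from the statistics module
g_list = g.tolist()           # plain list of Python floats
mu = stats.mean(g_list)
sigma = stats.stdev(g_list)
sigma_mu = sigma / np.sqrt(N)

# Report the mean with the standard deviation of the mean as its uncertainty
print(f"g = {mean_g:.2f} ± {sdom_g:.2f} m/s^2")

# One-sample t-test against the expected value g = 9.8
p_value = ttest_1samp(g, popmean=9.8).pvalue
print(f"p-value = {p_value:.3f}")
```

The standard deviation describes the spread of individual measurements, while the standard deviation of the mean shrinks as $1/\sqrt{N}$ and is the appropriate uncertainty on the average. A p-value below 0.05 would indicate a deviation that is significant at the $5\%$ level.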

[Pendulum Data](https://docs.google.com/spreadsheets/d/1A2Xb7dDaDxjSH8YzRn0-Lb2er6qxWySYjmuLK-yiP4w/edit?usp=sharing)
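
Reading the file and building the dataframes (steps 3, 7, and 8) might look like the sketch below. The file name, column labels, and values are hypothetical; a small stand-in file is written first so that the example runs on its own.

```python
import os
import tempfile
import numpy as np
import pandas as pd

# Write a small stand-in file; the file name and values are made up
path = os.path.join(tempfile.mkdtemp(), "pendulum_example.csv")
with open(path, "w") as f:
    f.write("2.46\n2.44\n2.47\n2.45\n2.48\n")

# A single-column file loads as a 1-D array, one entry per row
T = np.loadtxt(path)
g = 4 * np.pi**2 * 1.5 / T**2

# Periods in the first column, g values in the second
df = pd.DataFrame({"T (s)": T, "g (m/s^2)": g})

# count, mean, std, min, quartiles, and max for each column
summary = df.describe()
print(summary)
```

If the downloaded file keeps a header row, `np.loadtxt(path, skiprows=1)` skips it; a file with several comma-separated columns also needs `delimiter=","`.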

In [None]:
#| eval: false
#| echo: true

import numpy as np
import statistics as stats
import math as mt
from scipy.stats import ttest_1samp
import pandas as pd  # needed to construct the dataframes below

%matplotlib inline

T =   # Line 7
N = len(T)     # Find the number of data points.
g =            # Line 9
meanG =        # Line 10
stdG =         # Line 11
stdMeanG =     # Line 12

μ =            # Line 14 
σ =            # Line 15
σ_μ =          # Line 16

df =  # Construct a dataframe with periods in the first column and g-values in the second. (Line 18)
display(df)
summary =  #Construct a summary dataframe containing often-used statistics (Line 20)
display(summary)
p_value =      # Line 22



# Plot the histogram and Normal distribution below.
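
One way to overlay the normal distribution on a histogram is sketched below, using synthetic $g$ values in place of the class data. The random seed, sample size, and bin count are arbitrary choices. Note the `density=True` argument: it rescales the bars so that their total area is 1, which puts the histogram on the same vertical scale as $N(x)$.

```python
import numpy as np
from matplotlib import pyplot as plt

rng = np.random.default_rng(0)
# Synthetic g values (m/s^2) standing in for the class data
g = rng.normal(loc=9.8, scale=0.15, size=150)

mu = np.mean(g)
sigma = np.std(g, ddof=1)  # ddof=1 gives the N-1 (sample) form

fig, ax = plt.subplots()
# density=True normalizes the histogram to unit area
heights, edges, _ = ax.hist(g, bins=15, density=True, alpha=0.6)

# Normal distribution evaluated on a fine grid across the data range
x = np.linspace(g.min(), g.max(), 200)
N_x = np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
ax.plot(x, N_x)

ax.set_xlabel("g (m/s$^2$)")
ax.set_ylabel("Probability density")
plt.show()
```

Without `density=True` the bars show raw counts, and the curve would have to be multiplied by the number of data points times the bin width to line up with them.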

## Activity II - Radioactivity (33 points)

### Equipment needed
1. Geiger Counter
2. Radioactive sample (don't worry, they aren't active enough to be dangerous)
3. LabQuest mini box with USB cable.
4. Logger Pro software. You can download a copy for your personal machine ([Windows](https://www.vernier.com/d/yxoaa) or [Mac](https://www.vernier.com/d/u7caf)) or use one of the lab computers, which already have Logger Pro installed.

### Goal (Overview)
We will use the electronic radiation monitor to measure the number of decays for a radioactive sample. We will calculate the mean and standard deviation of the data and make a histogram of the data. The Poisson distribution will be plotted on top of the histogram to show that it is the limiting distribution.

### Procedure
1. Using the Vernier Radiation Monitor, measure the decays of a radioactive sample. Instructions for connecting the Vernier Radiation Monitor to your laptop are given below. Use 10-second intervals and make 100 measurements.
2. Copy the data from Logger Pro to an Excel spreadsheet and save the worksheet. (The worksheet should have only one column of data. Don't include the time data in the Excel worksheet.)
3. Use the equation above for the mean ($\mu$) to calculate the mean of the data and the equation below to calculate the standard deviation of the data. $$\sigma = \sqrt{\mu}$$ Perform these calculations on lines 7 and 8 in the code cell below.
4. Add a formatted print statement to report the appropriate count number with its associated uncertainty.
5. On line 11 of the code cell below, produce a histogram just as we did together in Activity I. Choose `bins = ` so that the histogram displays a sufficient amount of detail.
6. The limiting distribution for this kind of data is called a Poisson distribution (equation given below).  It is only defined for integer values of the argument and it is hard to plot using standard plotting techniques.  I have provided the code to plot this function in the code cell below.  You may have to modify the numbers on line 14 to plot over the appropriate range.  $$P(x) = e^{-\mu} {\mu^x \over x!} $$
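
The mean and Poisson standard deviation from step 3 can be sketched as follows. The counts here are synthetic (drawn from a Poisson distribution with an arbitrary mean of 12), so the numbers will not resemble your sample. The sample standard deviation is computed as well, purely as a cross-check.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic counts per 10 s interval, standing in for 100 measurements
counts = rng.poisson(lam=12, size=100)
N = len(counts)

mu = np.sum(counts) / N   # mean number of counts per interval
sigma = np.sqrt(mu)       # Poisson standard deviation, sigma = sqrt(mu)

# Spread computed directly from the data, for comparison with sqrt(mu)
sample_std = np.std(counts, ddof=1)

print(f"Counts per interval: {mu:.1f} ± {sigma:.1f}")
print(f"Sample standard deviation: {sample_std:.1f}")
```

For data that really follow a Poisson distribution, the two spreads should be close; a large disagreement suggests something other than counting statistics is contributing to the scatter.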

>Instructions for using the Vernier Radiation Monitor:

>1. Connect the Vernier Radiation Monitor to the DIG 1 port of the LabQuest Mini.
>2. Connect the USB cord from the LabQuest Mini to your laptop.
>3. Within the Logger Pro software, do the following:
>   1. Select Experiment -> Data Collection and change the duration to 1000 seconds.
   
   
   
   <img src="https://github.com/lancejnelson/PH121/raw/gh-pages/files/LoggerProRadiation.png" alt="drawing" width="600px"/>

In [None]:
#| eval: false
#| echo: true
from scipy.stats import poisson,norm
from numpy import arange
from matplotlib import pyplot as plt

data =  #Load the data from file (Line 5)

μ =   # Calculate mean (Line 7)
σ =   # Calculate standard deviation (Line 8)

              
              # Construct Histogram (Line 11). Create the axes here (e.g. fig, ax = plt.subplots()) so that ax exists for the plot call below.

dist = poisson(μ)
x = arange(0,40)    # Line 14: May have to modify these numbers.
ax.plot(x,dist.pmf(x))
plt.show()
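
A complete version of the histogram with the Poisson overlay might look like the sketch below. It again uses synthetic counts rather than measured data. Two choices here are assumptions, not part of the template above: the bin edges sit at half-integers so that each bar covers exactly one integer count, and `density=True` rescales the bars so that their heights are comparable to the probabilities returned by `pmf`.

```python
import numpy as np
from scipy.stats import poisson
from matplotlib import pyplot as plt

rng = np.random.default_rng(1)
# Synthetic counts per interval, standing in for the measured data
data = rng.poisson(lam=12, size=100)
mu = np.mean(data)

fig, ax = plt.subplots()

# Bin edges at half-integers: one bin of width 1 per integer count
bins = np.arange(data.min() - 0.5, data.max() + 1.5)
heights, edges, _ = ax.hist(data, bins=bins, density=True, alpha=0.6)

# The Poisson distribution is defined only at integer values
x = np.arange(data.min(), data.max() + 1)
ax.plot(x, poisson(mu).pmf(x), "o-")

ax.set_xlabel("Counts per interval")
ax.set_ylabel("Probability")
plt.show()
```

Because every bin has width 1, the normalized bar heights sum to 1 and can be read directly as the observed fraction of intervals with each count.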


## Activity III - Projectile Range (33 points)

### Goal (Overview)
This activity is identical in nature to Activity I.  Instead of measuring the period of a pendulum, we will measure the range of a projectile.  The analysis of the data will be very similar to that performed in Activity I.  

### Setup
1. As a class, pick some launch conditions.  These must include the launch angle, launch power setting, and initial height. 
2. As a group, use your code from Activity I of Lab 3 to calculate the exit speed of the ball.
3. Write the launch conditions (initial height, launch angle, and exit speed) on the board at the front of the class.
4. Set up the cannon in a location where everyone in the class can access it and perform measurements.

### Procedure
1. Using the launch conditions chosen during setup, predict the range of the steel ball. Perform the calculations in the code cell below.
2. Using the projectile launcher, fire a steel ball 10 times per student and measure the range in meters.
3. Again, we will pool our data together as a class to increase the size of the dataset. Therefore, it is very important that every student launches **under the same conditions** and that every student's measurements are free of bias. Add your data to the Google Sheets document linked below.
4. When all the data is collected, have Python read the data set using `np.loadtxt()` just as you did in Activity I.
5. Calculate the mean, standard deviation, and standard deviation of the mean using your preferred calculation method (either with `stats.mean()` and `stats.stdev()`, using the equations given above, or using a Python dataframe).
6. Plot a histogram of the data.  Choose `bins =` appropriately so the histogram displays with sufficient detail.
7. Plot the normal distribution on top to verify that it is the limiting distribution.
8. Add a formatted print statement to report the projectile's range with its associated uncertainty.
9. Calculate the p-value for your data set using the predicted range from step 1 as your expected value. State whether the deviation of your experimental result from the predicted range is significant enough (at the $5\%$ level) to reject the experiment.
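
The range prediction in step 1 can be sketched as below, assuming no air resistance and a landing point at height zero (for example, the floor). The function name and the launch conditions are hypothetical placeholders; substitute the values written on the board.

```python
import numpy as np

def predicted_range(v0, theta_deg, h, g=9.8):
    """Horizontal distance traveled before landing at height zero.

    v0: exit speed (m/s), theta_deg: launch angle (degrees),
    h: initial height (m). Air resistance is neglected.
    """
    theta = np.radians(theta_deg)
    vx = v0 * np.cos(theta)
    vy = v0 * np.sin(theta)
    # Positive root of h + vy*t - g*t^2/2 = 0 gives the time of flight
    t_flight = (vy + np.sqrt(vy**2 + 2 * g * h)) / g
    return vx * t_flight

# Hypothetical launch conditions, for illustration only
R = predicted_range(v0=4.0, theta_deg=30.0, h=1.0)
print(f"Predicted range: {R:.2f} m")
```

With `h = 0` the expression reduces to the familiar level-ground result $R = v_0^2 \sin(2\theta)/g$, which makes a quick sanity check on the code.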

No template code cell will be provided for this activity (hopefully you are feeling more and more comfortable writing simple code on your own), but you can copy/paste from previous activities if needed.

[Projectile Data](https://docs.google.com/spreadsheets/d/1frC8Yy7TK7fDN_z7E-1BQEOr6knhIZweFHdOTky6TQM/edit?usp=sharing)


In [None]:
# Put code here
