In this lesson, students explore Jupyter notebook and a range of Python commands and functions. Students will choose a data set, plot histograms, change the bin width, and explore the shape of a distribution and the effect that outliers have on it.
In this lesson, you will:
Please note that you will need to have two tabs open for this lesson: one for Jupyter notebook and one for CT-STEM.
Open a separate tab and paste this URL into the address bar:
https://mybinder.org/v2/gh/CT-STEM/Descriptive-Statistics/1.0
Then, click on "STD_Part2"
Which data set did you choose to display and analyze? You will enter the name of this .csv file into the Jupyter notebook, in place of FILE NAME:
ds = pd.read_csv('FILE NAME')
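Here is a minimal sketch of what loading a data set looks like. The data and the column name `score` are made up for illustration, and `io.StringIO` only stands in for a real file so the example runs on its own. In the notebook you would pass your chosen file name as a string instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for a .csv file (made-up values, made-up column name).
csv_text = "score\n12\n15\n15\n18\n21\n40\n"

# In the notebook, replace io.StringIO(csv_text) with your file name in quotes.
ds = pd.read_csv(io.StringIO(csv_text))

print(ds.head())   # the first few rows of the data set
print(ds.shape)    # (number of rows, number of columns)
```

If the file name is misspelled or the file is not in the notebook's folder, `pd.read_csv` raises a `FileNotFoundError`, so check the spelling and the `.csv` ending first.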
Write the code necessary to display descriptive statistics (count, mean, standard deviation, etc.). If you have forgotten how to display this information, you may need to open the link from yesterday's pre-assessment (https://mybinder.org/v2/gh/CT-STEM/Descriptive-Statistics/1.0).
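One possible way to get these numbers is the pandas `describe()` method. The sketch below uses a small made-up data set with a hypothetical column named `score`. The pre-assessment notebook may have used a different approach.

```python
import pandas as pd

# Made-up data for illustration; in the notebook, ds comes from pd.read_csv.
ds = pd.DataFrame({"score": [12, 15, 15, 18, 21, 40]})

# describe() reports count, mean, std, min, quartiles (25%, 50%, 75%) and max.
summary = ds.describe()
print(summary)
```

The row labeled `50%` is the median, which is useful for the next question.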
Based on the relationship between the mean and median, make a prediction of the shape of this distribution. Explain your reasoning.
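As a rough illustration of the idea, the sketch below compares the mean and the median of a made-up list of values. The labels it prints are only a first guess based on a common rule of thumb, not a definitive test of shape, so you should still confirm the shape with a plot.

```python
import pandas as pd

# Made-up values for illustration (one unusually large value, 40).
scores = pd.Series([12, 15, 15, 18, 21, 40], name="score")

mean = scores.mean()
median = scores.median()

# Rule of thumb: a long tail tends to pull the mean toward it.
if mean > median:
    guess = "possibly skewed right"
elif mean < median:
    guess = "possibly skewed left"
else:
    guess = "possibly symmetric"

print("mean:", round(mean, 2), "median:", median, "->", guess)
```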
Plot the histogram for the data set you chose. Take a screenshot on your Chromebook and upload the image below.
Steps for taking a partial screenshot:
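A histogram can be drawn in several ways. The sketch below uses matplotlib's `hist` with 8 bins on made-up data. The column name `score` and the axis labels are assumptions for illustration, and the lesson notebook may use a different plotting command.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data for illustration; in the notebook, ds comes from pd.read_csv.
ds = pd.DataFrame({"score": [12, 15, 15, 18, 21, 40]})

fig, ax = plt.subplots()

# bins=8 splits the range from the minimum to the maximum into 8 equal bins.
counts, edges, _ = ax.hist(ds["score"], bins=8)

ax.set_xlabel("score")
ax.set_ylabel("frequency")
ax.set_title("Histogram of score (8 bins)")
# In a Jupyter notebook the figure is displayed below the cell.
```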
Describe the shape, center, spread and any possible outliers in the distribution (in context).
When you changed the number of bins to fewer than 8, what did you notice about the distribution? Did the shape change? Did you learn more or less about the data set? Write your observations below.
When you changed the number of bins to more than 8, what did you notice about the distribution? Did the shape change? Did you learn more or less about the data set? Write your observations below.
What is an appropriate number of bins for this data set? Why?
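To see how the number of bins relates to bin width, the sketch below counts the same made-up values with 3, 8 and 20 bins using NumPy's `histogram` function. The bin counts 3 and 20 are arbitrary choices for illustration, so try your own values in the notebook.

```python
import numpy as np

# Made-up values for illustration.
scores = [12, 15, 15, 18, 21, 40]

results = {}
widths = {}
for bins in (3, 8, 20):
    # counts: how many values fall in each bin; edges: the bin boundaries.
    counts, edges = np.histogram(scores, bins=bins)
    results[bins] = counts
    widths[bins] = edges[1] - edges[0]
    print(bins, "bins, width", round(widths[bins], 2), "counts", counts.tolist())
```

More bins means narrower bins: the range of the data stays the same, so the width of each bin is the range divided by the number of bins.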
Plot the boxplot for your data set. Take a screenshot and upload the image below.
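One possible way to draw a boxplot is matplotlib's `boxplot`, sketched below on made-up data with a hypothetical `score` column. The sketch also computes the quartiles and the 1.5 × IQR fences that matplotlib uses by default to mark points as outliers. Your notebook may use a different command.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data for illustration; in the notebook, ds comes from pd.read_csv.
ds = pd.DataFrame({"score": [12, 15, 15, 18, 21, 40]})

fig, ax = plt.subplots()
bp = ax.boxplot(ds["score"])   # box = quartiles, line = median, points = outliers
ax.set_ylabel("score")

# The same quantities computed directly.
q1 = ds["score"].quantile(0.25)
q3 = ds["score"].quantile(0.75)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values beyond the fences are drawn as separate points.
outliers = ds["score"][(ds["score"] < lower_fence) | (ds["score"] > upper_fence)]
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("possible outliers:", outliers.tolist())
```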
What information does the boxplot show that the histogram does not?
What are the pros and cons of boxplots?
Summarize what you did in today's lesson.
What did you like? What did you dislike?