Add univariate scatterplots
For small sample sizes, it is better to show all the data. This reflects a growing movement in how data are presented in the life sciences. Right now, it is hard to make this type of graph in Excel. Someone has made a template (http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128), but it would be great to have this as a general option in Excel (especially in the online version, which allows better real-time collaboration).
I’m not sure I understand. I read through the article, which speaks to the need for showing discrete data in the form of scatterplots, box plots, and histograms instead of bar and line graphs. We have always supported scatter charts in Excel and recently added box plot and histogram charts in Excel 2016. Could you further clarify your request?
Given your scenario, it seems like a box plot (https://en.wikipedia.org/wiki/Box_plot) with the data points turned on would serve your needs. However, let me know if I'm missing something.
In Excel 2016, we now natively support box plots and you can turn on the data points.
The difference with a univariate scatterplot is that the variable on the x-axis is categorical and NOT continuous. So let's say you had a bunch of students do 30 seconds of jumping jacks and you measured their heart rate afterwards (so heart rate is the dependent variable on the y-axis; it is a continuous, or measurement, variable that can take any value). 5 of the students had regular caffeinated coffee an hour before, and 5 of the students had decaffeinated coffee an hour before (so your independent variable is categorical with 2 groups: regular coffee and decaf coffee). You want to test whether their heart rates are significantly different, so you would do a statistical test and make a graph comparing the heart rates of these 2 groups.
The traditional way to graph this (with a continuous variable on the y-axis and a categorical variable on the x-axis) is a bar graph (= column graph in Excel). The bars would give the mean heart rate for each group, with error bars as well (often the standard deviation or standard error; we use the custom error bar option in Excel). However, when you have a very small sample size (usually 10 or fewer in each group), it is actually better to show ALL the values rather than just the mean (this is the point of the paper I recommended).
To show all the values rather than just the mean, you would use a univariate scatterplot, where you have a point for each value. The y-axis would still be the continuous variable (heart rate in this example) and the x-axis would be the independent categorical variable (in this case regular coffee and decaf coffee). If you look at the Excel templates that Weissgerber et al. made, you will see how they made Excel do this. It would be really nice to have something already in Excel that people could use to make their graphs rather than rely on the Excel template made by Weissgerber et al.
There are other graphing programs that can do this, but many are more expensive (GraphPad PRISM) or more difficult to learn (like R). The idea here is that you show all the data (all the dots) together with the summary value (mean or median). Yes, it IS possible to do this in Excel (again, see the template from Weissgerber et al.), but because doing so is currently complicated, or requires the Weissgerber et al. template, it is harder to get people to graph their data the best way.
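To make the idea concrete, here is a rough Python sketch of the layout a univariate scatterplot needs. This is only my illustration, not how the Weissgerber et al. template or Excel works internally. The function name `univariate_scatter_points`, the `spread` parameter, and the heart rate numbers are all made up for the example. Each category gets a numeric x position, the individual points are nudged sideways a little so they do not sit on top of each other, and the group mean is kept as a separate summary marker.

```python
from statistics import mean


def univariate_scatter_points(groups, spread=0.2):
    """Turn {category: [values]} into scatter-ready points (hypothetical helper).

    Each category is assigned an integer x position (1, 2, ...). The points
    in a category are spaced evenly across [pos - spread, pos + spread] so
    that equal or similar values do not overlap.
    """
    points = []   # (x, y, category) for every individual observation
    summary = []  # (x, mean, category) for the per-group summary marker
    for pos, (category, values) in enumerate(groups.items(), start=1):
        n = len(values)
        for i, y in enumerate(values):
            # Evenly spaced horizontal offsets centered on the category position.
            offset = 0.0 if n == 1 else spread * (2 * i / (n - 1) - 1)
            points.append((pos + offset, y, category))
        summary.append((pos, mean(values), category))
    return points, summary


# Made-up heart rates (beats per minute), for illustration only.
heart_rates = {
    "Regular coffee": [92, 88, 101, 95, 97],
    "Decaf coffee": [84, 90, 79, 86, 88],
}

points, summary = univariate_scatter_points(heart_rates)
for x, group_mean, category in summary:
    print(f"{category}: x position {x}, mean {group_mean:.1f}")
```

The resulting `points` list can be handed to any ordinary x-y scatter chart, with the x-axis tick labels at 1 and 2 replaced by the category names, and `summary` drawn on top as a short horizontal line or a distinct marker. A median could be swapped in for the mean by using `statistics.median` instead.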