Monthly Archives: July 2009
Well, I guess that grant season was a bit of an optimistic time to try to do a four-part series on frequency distributions, but I’ve got a few minutes before heading off to an all-day childbirth class, so I thought I’d see if I could squeeze in part 2.
OK, so you have some data and you’d like to get a rough visual idea of its frequency distribution. What do you do now? There are three basic approaches that I’ve seen used:
- Histograms. This is certainly the simplest and easiest approach to understand, and most of the time it is perfectly acceptable for visualizing frequency distributions. A histogram simply divides the range of possible values for your data into bins, counts the number of values in each bin, and plots this count on the y-axis against the center value of the bin on the x-axis. Any statistics program will be able to do this, or you can easily do it yourself. If all of the bins are of equal width (as is the default in stats packages) then you’re basically done. If you want to convert the y-axis into the probability that a value falls in a bin, just divide the counts by the total number of data points. If you want to convert it to a proper probability density estimate then you’ll also want to divide this number by the width of the bin (i.e., the upper edge of the bin minus the lower edge of the bin). If the bins are not of equal width (which includes cases where you have transformed the data in some way) you should divide by the linear width of the bin regardless of whether or not you are concerned about turning your y-axis into a probability density estimate. Otherwise wider bins collect more values simply because they are wider, and the plot no longer shows the distribution the way you are thinking about it (most people are thinking about the distribution of x). Of course there are good reasons for wanting to visualize the distributions of transformed data. Just make sure you have one if you’re not going to divide by the linear width of the bin.
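For anyone who wants to do it themselves, here is a minimal sketch of the calculations described above in plain Python. The function name `histogram_summary`, the example data, and the bin edges are all made up for illustration. The sketch also assumes that bins include their lower edge but not their upper edge (except for the last bin, which includes both), which is only one of several reasonable conventions.

```python
from bisect import bisect_right


def histogram_summary(values, edges):
    """Return (centers, counts, probabilities, densities) for the given bins.

    `edges` must be sorted in increasing order. Bins are [lower, upper),
    except the last bin, which also includes its upper edge. Values outside
    the edges are left out of the counts but still count toward the total
    number of data points (an arbitrary choice for this sketch).
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside the binned range
        # Index of the bin whose lower edge is the last edge <= v.
        i = min(bisect_right(edges, v) - 1, n_bins - 1)
        counts[i] += 1

    total = len(values)
    widths = [edges[i + 1] - edges[i] for i in range(n_bins)]
    centers = [(edges[i] + edges[i + 1]) / 2 for i in range(n_bins)]
    # Probability that a value falls in each bin: count / number of points.
    probabilities = [c / total for c in counts]
    # Density estimate: probability / linear width of the bin.
    densities = [p / w for p, w in zip(probabilities, widths)]
    return centers, counts, probabilities, densities


# Hypothetical data with two bins of unequal width (9 and 90).
data = [1, 2, 2, 3, 5, 8, 13, 40, 90]
centers, counts, probabilities, densities = histogram_summary(data, [1, 10, 100])
print(counts)  # [6, 3]
print(densities)
```

In this example the raw counts make the second bin look half as tall as the first, but once you divide by the bin widths the density in the second bin is 20 times lower, because that bin is ten times wider.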
Well, I’m out of time so I’ll go ahead and post this and come back with the other two options for visualization later.