
## Frequency Distributions and Histograms

A frequency distribution is often used to group quantitative data. Data values are grouped into classes of equal width. The smallest and largest values that can belong to each class are called class limits, while class boundaries are individual values chosen to separate classes (often the midpoints between the upper and lower class limits of adjacent classes).


For example, the table below gives a frequency distribution for the following data:

$$\textrm{Data values: } 11, 13, 15, 15, 18, 20, 21, 22, 24, 24, 25, 25, 25, 26, 28, 29, 29, 34$$

$$\begin{array}{c|c|c}
\textrm{Class Limits} & \textrm{Class Boundaries} & \textrm{Frequency}\\\hline
10 - 14 & 9.5 - 14.5 & 2\\\hline
15 - 19 & 14.5 - 19.5 & 3\\\hline
20 - 24 & 19.5 - 24.5 & 5\\\hline
25 - 29 & 24.5 - 29.5 & 7\\\hline
30 - 34 & 29.5 - 34.5 & 1\\\hline
\end{array}$$

Frequency distributions should typically have between 5 and 20 classes, all of equal width; the classes should be mutually exclusive, continuous, and exhaustive.
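The table above can be reproduced with a short script. The following is a minimal Python sketch, not a standard routine: the function name `frequency_distribution` and its parameters are hypothetical, and it assumes integer data so that each class boundary lies 0.5 beyond a class limit.

```python
def frequency_distribution(data, lower_limit, width, num_classes):
    """Return rows of (class limits, class boundaries, frequency).

    Assumes integer data, so adjacent class limits differ by 1 and each
    boundary sits halfway between them (0.5 beyond a limit).
    """
    rows = []
    for k in range(num_classes):
        lo = lower_limit + k * width      # lower class limit
        hi = lo + width - 1               # upper class limit
        freq = sum(lo <= x <= hi for x in data)
        rows.append(((lo, hi), (lo - 0.5, hi + 0.5), freq))
    return rows


data = [11, 13, 15, 15, 18, 20, 21, 22, 24, 24, 25, 25, 25, 26, 28, 29, 29, 34]
table = frequency_distribution(data, lower_limit=10, width=5, num_classes=5)

for limits, boundaries, freq in table:
    print(f"{limits[0]} - {limits[1]}   {boundaries[0]} - {boundaries[1]}   {freq}")
```

Because the classes are mutually exclusive and exhaustive, the frequencies should add up to the number of data values (18 here), which is a quick sanity check on any choice of classes.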

You should use nice "round" numbers for your class limits unless there is a compelling reason not to. Doing so makes your frequency distribution easier to read. For example, if your data starts with 43, 46, 48, 48, 52, 57, 58, ... you could choose a lower class limit of 40 and a class width of 5 (provided that a reasonable number of classes results).

A relative frequency distribution is very similar, except that instead of reporting how many data values fall in a class, it reports the fraction of data values that fall in a class. These fractions are called relative frequencies and can be given as fractions, decimals, or percents.

A cumulative frequency distribution is another variant of a frequency distribution. Here, instead of reporting how many data values fall in some class, it reports how many data values are contained in either that class or any class to its left.
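As an illustration, both variants can be derived directly from a list of class frequencies. The sketch below uses the frequencies from the first example's table (classes 10 - 14 through 30 - 34); the variable names are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import accumulate

# Frequencies for the classes 10-14, 15-19, 20-24, 25-29, 30-34.
frequencies = [2, 3, 5, 7, 1]
total = sum(frequencies)

# Relative frequency: the fraction of all data values in each class.
relative = [Fraction(f, total) for f in frequencies]

# Cumulative frequency: running total of this class and all classes to its left.
cumulative = list(accumulate(frequencies))

for f, r, c in zip(frequencies, relative, cumulative):
    print(f"{f:2d}   {str(r):>5}   {c:2d}")
```

The relative frequencies always sum to 1, and the last cumulative frequency always equals the total number of data values.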

The table below compares the values seen in a frequency distribution, a relative frequency distribution, and a cumulative frequency distribution, for the following sequence of dice rolls:

$$\textrm{Dice Rolls: } 7, 6, 7, 6, 7, 4, 4, 6, 10, 5, 6, 11, 4, 8, 2, 9, 6, 5, 3, 8, 3, 3, 12, 9, 10, 7, 6, 7, 4, 6$$

$$\begin{array}{c|c|c|c|c}
\textrm{Class Limits} & \textrm{Class Boundaries} & \textrm{Frequency} & \textrm{Relative Frequency} & \textrm{Cumulative Frequency}\\\hline
2 - 3 & 1.5 - 3.5 & 4 & 2/15 & 4\\\hline
4 - 5 & 3.5 - 5.5 & 6 & 1/5 & 10\\\hline
6 - 7 & 5.5 - 7.5 & 12 & 2/5 & 22\\\hline
8 - 9 & 7.5 - 9.5 & 4 & 2/15 & 26\\\hline
10 - 11 & 9.5 - 11.5 & 3 & 1/10 & 29\\\hline
12 - 13 & 11.5 - 13.5 & 1 & 1/30 & 30
\end{array}$$

A frequency histogram is a graphical version of a frequency distribution where the width and position of rectangles are used to indicate the various classes, with the heights of those rectangles indicating the frequency with which data fell into the associated class, as the example below suggests.

Frequency histograms should be labeled with either class boundaries (as shown below) or with class midpoints (in the middle of each rectangle).

One can, of course, similarly construct relative frequency and cumulative frequency histograms.

The purpose of these graphs is to "see" the distribution of the data. When using a calculator or software to plot histograms, experiment with different choices for boundaries, subject to the above restrictions, to find out which graphical properties (modality, skewness or symmetry, outliers, etc.) persist and which are just spurious effects of a particular choice of boundaries. Then use the boundaries that best reveal these persistent properties.
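One way to experiment with boundaries is to print a rough text histogram for several class widths and compare the shapes. The sketch below does this for the dice rolls above; the helper `class_counts` is a hypothetical name, and it assumes boundaries are chosen so that no data value falls exactly on one (here, boundaries ending in .5 for integer data).

```python
def class_counts(data, lower_boundary, width):
    """Return (lower boundary, upper boundary, frequency) for each class.

    Assumes no data value lands exactly on a class boundary.
    """
    # Enough classes to cover the largest observation.
    num_classes = int((max(data) - lower_boundary) // width) + 1
    rows = []
    for k in range(num_classes):
        lo = lower_boundary + k * width
        hi = lo + width
        rows.append((lo, hi, sum(lo < x < hi for x in data)))
    return rows


rolls = [7, 6, 7, 6, 7, 4, 4, 6, 10, 5, 6, 11, 4, 8, 2, 9, 6, 5,
         3, 8, 3, 3, 12, 9, 10, 7, 6, 7, 4, 6]

# Compare two choices of class width, both starting at boundary 1.5.
for width in (2, 3):
    print(f"class width {width}")
    for lo, hi, freq in class_counts(rolls, lower_boundary=1.5, width=width):
        print(f"  {lo:4.1f} - {hi:4.1f} | {'#' * freq}")
```

With so few rolls, only two widths keep the number of classes reasonable; with larger data sets the same loop can be run over more widths and starting boundaries.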

### Probability Histograms

A type of graph closely related to a frequency histogram is a probability histogram, which shows the probabilities associated with a probability distribution in a similar way.


Here, we have a rectangle for each value a random variable can assume, where the height of the rectangle indicates the probability of getting that associated value.

When the possible values the random variable can assume are consecutive integers, the left and right sides of the rectangles are taken to be the midpoints between these integers, which forces them all to end in $0.5$. Additionally, the width of each rectangle is then $1$, which means that not only does the height of the rectangle equal the probability of the corresponding value occurring, but the area of the rectangle does as well. (These observations become very important later when we apply a "continuity correction" to approximate a discrete probability distribution with a continuous one.)
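The equality of height and area can be checked numerically. The sketch below uses the sum of two fair six-sided dice as the random variable; that choice is an assumption made here for illustration, not something specified in the text above.

```python
from fractions import Fraction
from itertools import product

# Probability distribution of the sum of two fair six-sided dice
# (an assumed example; each of the 36 outcomes has probability 1/36).
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, 0) + Fraction(1, 36)

# One rectangle per value x: sides at x - 1/2 and x + 1/2, height P(X = x).
half = Fraction(1, 2)
rectangles = [(x - half, x + half, p) for x, p in sorted(pmf.items())]

# Width is 1, so each rectangle's area equals its height (the probability).
areas = [(right - left) * height for left, right, height in rectangles]

for (left, right, height), area in zip(rectangles, areas):
    print(f"[{float(left):4.1f}, {float(right):4.1f}]  height={height}  area={area}")
```

Since each area equals a probability, the total area of the histogram is 1, which is the property that a continuous approximation relies on.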