Individual Assignment

Unit: STA101 – Statistics for Business

Total Marks: This assignment is worth 40% of the total marks in the unit.

Due Date: 2nd June, 2017

Instructions:

1. Students are required to cover all stated requirements.

2. Your answer must be uploaded to Moodle as a Word file.

3. You need to support your answers with appropriate Harvard-style references where necessary.

4. Include a title/cover page containing the subject title and code, your name, and your student ID number.

5. Please save the document as STA101AT1_FirstName_Surname_StudentNumber, e.g. STA101AT1_John_Smith_NA20160000

Question 1: (7 Marks)

Below are sorted data showing average spending per customer (in dollars) at 74 Noodles & Company restaurants. (a) Construct a frequency distribution. Explain how you chose the number of bins and the bin limits. (b) Make a histogram and describe its appearance. (c) Repeat, using a larger number of bins and different bin limits. (d) Did your visual impression of the data change when you increased the number of bins? Explain.

6.54 6.58 6.58 6.62 6.66 6.70 6.71 6.73 6.75 6.75 6.76 6.76

6.76 6.77 6.77 6.79 6.81 6.81 6.82 6.84 6.85 6.89 6.90 6.91

6.91 6.92 6.93 6.93 6.94 6.95 6.95 6.95 6.96 6.96 6.98 6.99

7.00 7.00 7.00 7.02 7.03 7.03 7.03 7.04 7.05 7.05 7.07 7.07

7.08 7.11 7.11 7.13 7.13 7.16 7.17 7.18 7.21 7.25 7.28 7.28

7.30 7.33 7.33 7.35 7.37 7.38 7.45 7.56 7.57 7.58 7.64 7.65

7.87 7.97

Question 2: (8 marks)

The contingency table below shows the results of a survey of online video viewing by age. Find the following probabilities or percentages:

a. Probability that a viewer is aged 18–34.

b. Probability that a viewer prefers watching TV videos.

c. Percentage of viewers who are 18–34 and prefer watching user-created videos.

d. Percentage of viewers aged 18–34 who prefer watching user-created videos.

e. Percentage of viewers who are 35–54 or prefer user-created videos.

                    Type of Videos Preferred
Viewer Age        User Created     TV     Row Total
18–34                  39          30         69
35–54                  10          10         20
55+                     3           8         11
Column Total           52          48        100

Question 3: (6 marks)

Prof. Hardtack gave four Friday quizzes last semester in his 10-student senior tax accounting class.

a) Find the mean, median, and mode for each quiz.

b) Do these measures of center agree? Explain.

c) For each data set, note strengths or weaknesses of each statistic of center.

d) Are the data symmetric or skewed? If skewed, which direction?

e) Briefly describe and compare student performance on each quiz.

Quizzes:

• Quiz 1: 60, 60, 60, 60, 71, 73, 74, 75, 88, 99

• Quiz 2: 65, 65, 65, 65, 70, 74, 79, 79, 79, 79

• Quiz 3: 66, 67, 70, 71, 72, 72, 74, 74, 95, 99

• Quiz 4: 10, 49, 70, 80, 85, 88, 90, 93, 97, 98

Question 4: (6 Marks)

Calculate the test statistic and p-value for each sample. State the conclusion for the specified α.

a. H0: µ = 200 versus H1: µ ≠ 200, α = .025, x̄ = 203, s = 8, n = 16

b. H0: µ = 200 versus H1: µ 200, α = .05, x̄ = 198, s = 5, n = 25

c. H0: µ = 200 versus H1: µ 200, α = .05, x̄ = 205, s = 8, n = 36

Question 5: (6 marks)

Prof. Green gave three exams last semester. Scores were normally distributed on each exam. Below are scores for 10 randomly chosen students on each exam. (a) Find the 95 percent confidence interval for the mean score on each exam. (b) Do the confidence intervals overlap? What inference might you draw by comparing the three confidence intervals?

• Exam 1: 81, 79, 88, 90, 82, 86, 80, 92, 86, 86

• Exam 2: 87, 76, 81, 83, 100, 95, 93, 82, 99, 90

• Exam 3: 77, 79, 74, 75, 82, 69, 74, 80, 74, 76

Question 6: (7 marks)

Daily output of Marathon’s Garyville, Louisiana, refinery is normally distributed with a mean of 232,000 barrels of crude oil per day and a standard deviation of 7,000 barrels.

(a) What is the probability of producing at least 232,000 barrels?

(b) Between 232,000 and 239,000 barrels?

(c) Less than 239,000 barrels?

(d) Less than 245,000 barrels?

(e) More than 225,000 barrels?

___________________________

Question 1

(a) Frequency distribution

To construct a frequency distribution, we first need to determine the number of bins and the bin limits. A common method for selecting the number of bins is Sturges' rule, which suggests that the number of bins k should be approximately 1 + 3.3 * log10(n), where n is the sample size. For this data set, n = 74, so k = 1 + 3.3 * 1.87, or approximately 7.2, which we round up to 8 bins.
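As a quick check of the arithmetic, the rule can be evaluated in a few lines of Python. This is only a minimal sketch: the function name sturges_bins is illustrative, and it assumes the logarithm in the rule is base 10. The unrounded value is a little above 7, and rounding up to a whole number gives 8 bins.

```python
import math


def sturges_bins(n: int) -> float:
    """Sturges' rule in the form used above: k = 1 + 3.3 * log10(n)."""
    return 1 + 3.3 * math.log10(n)


k = sturges_bins(74)
print(f"unrounded k = {k:.2f}")           # suggestion before rounding
print(f"bins used   = {math.ceil(k)}")    # rounded up to a whole number of bins
```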

Next, we need to choose the bin limits. The minimum value in the data set is 6.54 and the maximum is 7.97, a range of 1.43. Dividing the range by 8 bins gives about 0.18, which we round up to a bin width of 0.2; this is a reasonable size to show the variability in the data without having too many empty bins. The first bin starts at 6.5 and ends at 6.7, the second starts at 6.7 and ends at 6.9, and so on, until the final bin, which starts at 7.9 and ends at 8.1.

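Each bin includes its lower limit and excludes its upper limit, so a value of 6.70 is counted in the 6.7 – 6.9 bin.

Spending (dollars) Frequency

6.5 – 6.7 5

6.7 – 6.9 17

6.9 – 7.1 27

7.1 – 7.3 11

7.3 – 7.5 7

7.5 – 7.7 5

7.7 – 7.9 1

7.9 – 8.1 1

Total 74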
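The tally for these eight bins can be reproduced from the raw data with a short script. The sketch below is one way to do it, under two assumptions that are ours rather than the assignment's: each bin includes its lower limit and excludes its upper limit, and values are converted to whole cents so that boundary values such as 6.70 are not misplaced by floating-point rounding. The name frequency_table and its parameters are illustrative.

```python
# The 74 sorted spending values from Question 1, in dollars.
data = [
    6.54, 6.58, 6.58, 6.62, 6.66, 6.70, 6.71, 6.73, 6.75, 6.75, 6.76, 6.76,
    6.76, 6.77, 6.77, 6.79, 6.81, 6.81, 6.82, 6.84, 6.85, 6.89, 6.90, 6.91,
    6.91, 6.92, 6.93, 6.93, 6.94, 6.95, 6.95, 6.95, 6.96, 6.96, 6.98, 6.99,
    7.00, 7.00, 7.00, 7.02, 7.03, 7.03, 7.03, 7.04, 7.05, 7.05, 7.07, 7.07,
    7.08, 7.11, 7.11, 7.13, 7.13, 7.16, 7.17, 7.18, 7.21, 7.25, 7.28, 7.28,
    7.30, 7.33, 7.33, 7.35, 7.37, 7.38, 7.45, 7.56, 7.57, 7.58, 7.64, 7.65,
    7.87, 7.97,
]


def frequency_table(values, start_cents=650, width_cents=20, n_bins=8):
    """Count values per bin; a bin includes its lower limit, not its upper."""
    counts = [0] * n_bins
    for v in values:
        cents = round(v * 100)  # whole cents avoid float issues at bin edges
        counts[(cents - start_cents) // width_cents] += 1
    return counts


counts = frequency_table(data)
for i, count in enumerate(counts):
    lower = (650 + 20 * i) / 100
    upper = (650 + 20 * (i + 1)) / 100
    print(f"{lower:.1f} - {upper:.1f}  {count}")
print("Total", sum(counts))
```

Changing start_cents, width_cents, and n_bins gives the tally for any other set of equal-width bins, which is useful when repeating the exercise with more bins.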

(b) Histogram

The histogram below shows the distribution of average spending per customer. The distribution is unimodal and skewed to the right: most values lie at the lower end of the range, below about 7.1 dollars, and the frequencies taper off toward 8 dollars.

Histogram with 8 bins

(c) Larger number of bins

To create a histogram with a larger number of bins, we can use the Freedman-Diaconis rule as a guide to the bin width. This rule takes the variability of the data into account and suggests a bin width of 2 * IQR / n^(1/3), where IQR is the interquartile range (the difference between the third and first quartiles) and n is the sample size. For this data set the quartiles are approximately 6.82 and 7.18, so the IQR is approximately 0.35; the exact value depends on the quartile convention used. With n = 74, n^(1/3) is approximately 4.2, so the rule gives a bin width of approximately 2 * 0.35 / 4.2 = 0.17, or about 10 bins over the range 6.5 to 8.1. Because part (c) asks for a noticeably larger number of bins, we use narrower bins than the rule suggests, keep the same starting and ending points as before, and create a histogram with 28 bins.
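The Freedman-Diaconis width can also be computed directly from the data. The following is a minimal sketch, not the only way to do it: it assumes the quartile convention of Python's statistics.quantiles with method="inclusive" (linear interpolation between data points), and other conventions give slightly different quartiles. The function name freedman_diaconis_width is illustrative.

```python
import math
import statistics

# The 74 sorted spending values from Question 1, in dollars.
data = [
    6.54, 6.58, 6.58, 6.62, 6.66, 6.70, 6.71, 6.73, 6.75, 6.75, 6.76, 6.76,
    6.76, 6.77, 6.77, 6.79, 6.81, 6.81, 6.82, 6.84, 6.85, 6.89, 6.90, 6.91,
    6.91, 6.92, 6.93, 6.93, 6.94, 6.95, 6.95, 6.95, 6.96, 6.96, 6.98, 6.99,
    7.00, 7.00, 7.00, 7.02, 7.03, 7.03, 7.03, 7.04, 7.05, 7.05, 7.07, 7.07,
    7.08, 7.11, 7.11, 7.13, 7.13, 7.16, 7.17, 7.18, 7.21, 7.25, 7.28, 7.28,
    7.30, 7.33, 7.33, 7.35, 7.37, 7.38, 7.45, 7.56, 7.57, 7.58, 7.64, 7.65,
    7.87, 7.97,
]


def freedman_diaconis_width(values):
    """Return (IQR, bin width) with width = 2 * IQR / n^(1/3)."""
    # Assumed quartile convention: "inclusive" linear interpolation.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return iqr, 2 * iqr / len(values) ** (1 / 3)


iqr, width = freedman_diaconis_width(data)
n_bins = math.ceil((8.1 - 6.5) / width)  # bins needed to span 6.5 to 8.1
print(f"IQR = {iqr:.4f}, width = {width:.3f}, bins = {n_bins}")
```

The rule is a guideline only; any narrower width still answers part (c), since the question asks simply for more bins and different limits.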

Spending (dollars) Frequency

6.50 – 6.58 1

6.58 – 6.62 2

6.62 – 6.66 2

6.66 – 6.70 2

6.70 – 6.71 1

6.71 – 6.73 1

6.73 – 6.75 2

6.75 – 6.76 3

6.76 – 6.77 2

6.77 – 6.79 2

6.79 – 6