PART II (Second, Third and Final Year)
MSCI 212 Statistical Methods for Business
WRITTEN COURSEWORK ASSIGNMENT (DUE WEEK 12)
Please read the following instructions carefully.
a) Your coursework must be submitted in an unsealed envelope in the coursework
submission box outside LUMS A51 by 4:00pm on Monday, 21st January 2019.
b) Your coursework will NOT be accepted unless you sign and return a declaration form
(available from the Web Board) that includes the statement that you have read and
understood the University regulations relating to plagiarism. Plagiarism includes:
• Collusion, where a piece of work prepared by a group is represented as if it
were the student’s own;
• The purchase of a paper from a commercial service, including Internet sites,
whether pre-written or specially prepared for the student concerned;
• The submission of a paper written by any other person, including a friend, a
fellow student or a person who is not a member of the university;
• The submission of another student’s work, whether with or without that other
student’s knowledge or consent.
Incidents of plagiarism are recorded on a student’s file. Penalties are in line with the
institutional framework of the University.
c) In accordance with University regulations, marks are deducted from any coursework
which is not submitted by the deadline. This penalty will apply for 3 days after the
deadline and then a mark of zero will be given to any work not submitted. However,
if an extension is given then the rule applies from the date of the extension.
d) This coursework has two questions and you must answer both questions in full.
e) Both questions carry equal marks (50% each). You should be able to begin Question 1
immediately; some of the tools needed for Question 2 will be covered later in the course.
f) Each of your answers should state clearly your reasoning. Please also state clearly
any assumptions that you have made in addition to those given in the questions.
g) You are allowed to submit handwritten answers but should include carefully selected
extracts of your SPSS output to justify your answers. Also, please write neatly: if
we cannot read your handwriting, your answer will NOT be marked.
Question 1 [Worth 50% of the marks]
As in Workshop 2, use <Transform><Random Number Generators> to set your unique starting
point for the SPSS random number generator. For this coursework question use the last four digits
of your library card PLUS 1; for example, if your library card ends ‘4321’ type in ‘4322’, and if your
library card ends ‘4329’ type in ‘4330’. Record these four digits at the top of your answer.
A large energy company provides a call centre to answer customers’ queries. The call centre
management is keen to understand the activity and performance levels that the call centre has
experienced over the last winter and what they might do to improve their experiences next year.
They know that activity levels vary considerably during the day and so data has been collected for
the morning, afternoon and evening ‘peak’ hours. They are hoping to understand their workloads
better, and also how they are related to the experience of their customers when calling the call
centre.
They are particularly interested in reducing the number of callers who abandon if this is possible.
Over the last winter they experimented with virtual hold technology (VHT). When VHT
is ‘on’, callers who cannot be dealt with immediately are offered the chance to be rung back without
losing their place in the queue. Is there any evidence that this reduces the chance that callers
abandon their call?
The SPSS data file ‘EnergyCallCentre.sav’ contains the following data for 504 ‘peak’ hours of operation
from last winter:
• Month – refers to three 2-month long periods of time;
• VHT – specifies whether Virtual Hold Technology was on or off;
• ToD – specifies which peak hour the data is from;
• Agents – an indication of the number of agents available to handle calls during the hour (not
• CallsOffered – number of callers calling during the hour;
• CallsAbandoned – number of calls arriving during the hour which rang off before speaking to
an agent;
• CallsHandled – number of calls arriving during the hour which spoke to an agent;
• ASA – average speed of answer (minutes), i.e. average time between the caller first ringing
the call centre and speaking to an agent;
• Avehandletime – the average time that calls require from an agent, including ‘wrap-up’ time.
Draw a random sample of size 100 from this population of one-hour periods and investigate your
sample using SPSS.
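The seed rule and the sampling step described above can be illustrated outside SPSS. The following is a minimal Python sketch, not part of the required SPSS workflow: the function names seed_from_card and draw_sample are hypothetical, the row numbers 1 to 504 simply stand for the 504 peak hours in the file, and Python's generator will not reproduce the sample that SPSS draws from the same seed. The sheet does not say what to do if the card ends ‘9999’, so the sketch just returns the number plus 1.

```python
import random


def seed_from_card(last_four):
    """Return the seed: the last four library-card digits PLUS 1."""
    if len(last_four) != 4 or not last_four.isdecimal():
        raise ValueError("expected exactly four digits")
    return int(last_four) + 1


def draw_sample(seed, population_size=504, sample_size=100):
    """Draw row numbers without replacement from 1..population_size."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


seed = seed_from_card("4321")  # the sheet's own example: '4321' gives 4322
rows = draw_sample(seed)
print(seed, len(rows))
```

Running draw_sample twice with the same seed returns the same rows, which is the property the unique starting point is meant to provide.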
a) In no more than 6 pages describe the main features of your sample as if to the call centre
management. You should include main features of individual variables and of the relationships
between them. You may include SPSS numerical and graphical output and/or you may
quote values from your SPSS output. (The clarity and content of your report are both important).
[Worth about 75% of the marks for Q1]
b) Using your sample as evidence, explain what you believe the impact of VHT was last
winter. [No more than 2 pages. Worth about 25% of the marks for Q1]
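For part b), one simple starting point is to compare the proportion of offered calls that were abandoned when VHT was on with the proportion when it was off. The Python sketch below only illustrates that arithmetic. The function name abandonment_rates is hypothetical, the labels "on" and "off" are an assumption about how VHT is coded, and the example rows are made-up placeholders, not values from EnergyCallCentre.sav.

```python
def abandonment_rates(rows):
    """Pool CallsAbandoned / CallsOffered within each VHT group.

    rows: iterable of (vht, calls_offered, calls_abandoned) tuples.
    """
    totals = {}
    for vht, offered, abandoned in rows:
        off_sum, ab_sum = totals.get(vht, (0, 0))
        totals[vht] = (off_sum + offered, ab_sum + abandoned)
    # Skip any group with no offered calls to avoid dividing by zero.
    return {vht: ab / off for vht, (off, ab) in totals.items() if off > 0}


# Made-up placeholder rows, NOT taken from EnergyCallCentre.sav.
example_rows = [
    ("on", 200, 10),
    ("on", 300, 20),
    ("off", 250, 25),
    ("off", 250, 35),
]
rates = abandonment_rates(example_rows)
print(rates)
```

A raw comparison like this ignores other variables in the file, such as ToD and Agents, which may differ between the hours when VHT was on and the hours when it was off.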
Question 2 [Worth 50% of the marks]
The numbers of nurses employed at hospitals in the northwest of England vary widely, for various
good reasons (e.g. different hospitals have different workloads) and some not so good reasons (e.g.
staffing levels at particular hospitals have historically been relatively low or high). Data from 120
hospitals have been gathered (see file NurseStaffing.sav) to see whether regression models can
be used to determine underlying relationships between nurse staffing levels and workloads. The
available data is as follows:
• Nurses – No. of nurse-days employed during 2017 (one nurse-day is equivalent to one nurse
on duty for 24 hours)
• SurgIP – Surgical inpatients treated during 2017
• MedIP – Medical inpatients treated during 2017
• TotalOP – Total outpatients treated during 2017
• Tests – Diagnostic tests undertaken on all patients during 2017
NOTE: all figures are in thousands (000s).
a) Carry out a preliminary analysis of the data using Scatterplots, Correlations, and anything
else you think appropriate. Report your preliminary findings.
b) Use multiple linear regression to investigate the relationship between Nurses and the 4 available
explanatory variables. Justify your choice of model.
c) Carry out a residuals analysis to check whether or not the usual regression assumptions
seem to hold for your preferred model. Carefully justify your conclusions, noting any reservations
you have about your equation. In particular, is there any evidence of hospitals which
may have misrecorded the number of nurse-days?
d) In light of your answer to c), make any further improvements to your model that you think
appropriate (if any), and explain why you believe each is an improvement.
e) Use your preferred model to comment carefully on the reported staffing levels of three further
hospitals (A, B & C), which were 220,000 nurse-days, 75,000 nurse-days and 250,000
nurse-days respectively. Their corresponding workload figures were:
Hospital   SurgIP   MedIP   TotalOP   Tests
A              22      14       370      50
B               5       3       100      10
C               5      19       350      55
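For part e), one way to frame the comparison is to compute each hospital's predicted nurse-days from the fitted equation and set the reported figure against it. The Python sketch below shows only the arithmetic. The intercept, coefficients and residual standard error are placeholder values chosen for illustration, not estimates from NurseStaffing.sav, and the names predict and compare are hypothetical. The workloads and reported staffing levels are those given in part e), in thousands.

```python
# HYPOTHETICAL placeholder values, not fitted to NurseStaffing.sav.
# Replace them with the estimates from your own preferred SPSS model.
INTERCEPT = 10.0
COEFFS = {"SurgIP": 2.0, "MedIP": 3.0, "TotalOP": 0.25, "Tests": 0.5}
RESIDUAL_SE = 20.0  # placeholder residual standard error (000s of nurse-days)

# Workloads and reported staffing from part e); all figures are in 000s.
HOSPITALS = {
    "A": ({"SurgIP": 22, "MedIP": 14, "TotalOP": 370, "Tests": 50}, 220),
    "B": ({"SurgIP": 5, "MedIP": 3, "TotalOP": 100, "Tests": 10}, 75),
    "C": ({"SurgIP": 5, "MedIP": 19, "TotalOP": 350, "Tests": 55}, 250),
}


def predict(workload):
    """Predicted nurse-days (000s) from the linear equation."""
    return INTERCEPT + sum(COEFFS[name] * workload[name] for name in COEFFS)


def compare(workload, reported):
    """Return the prediction, the gap, and the gap in residual-SE units."""
    predicted = predict(workload)
    gap = reported - predicted
    return predicted, gap, gap / RESIDUAL_SE


for name, (workload, reported) in HOSPITALS.items():
    predicted, gap, scaled = compare(workload, reported)
    print(f"Hospital {name}: reported {reported}, "
          f"predicted {predicted:.1f}, gap {gap:+.1f} ({scaled:+.2f} SE)")
```

Because the coefficients are placeholders, the printed figures say nothing about the three hospitals. Dividing the gap by the residual standard error is also only a rough yardstick: a proper prediction interval for a new hospital is wider, since it allows for uncertainty in the estimated coefficients as well.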
Page Limit: 17 pages total (1 cover page, 6 pages for Q1 part a), 2 pages for Q1 part b), 8 pages
for Q2).