Analysis of US Health Insurance data – Individual Assignment
Description
The purpose of this assignment is to investigate a dataset using the knowledge learned in Modules 1 and 2. This will enable conclusions to be drawn that ultimately assist in decision making. The assignment requires you to analyse a given dataset, interpret the results, and then draw conclusions so that you can answer the specific questions asked of you in the form of a business report. (These questions are asked in the following email.)
The aims of the assignment are to:
• provide you with some examples of the application of data analysis
• test your understanding of the material presented in the relevant topics
• test your ability to analyse data and interpret your results
• test your ability to effectively communicate your results to others
Before attempting the assignment, make sure that you have prepared yourself well by reading the relevant sections of the prescribed textbook and reviewing the materials provided in Modules 1 and 2 (i.e. Topics 1 to 7).
Specific Requirements
The UnitedHealth Group is America’s most prominent health insurance provider. They want to better understand certain population characteristics that might contribute to the high medical costs being billed to insurance providers. They have access to a random sample of US Health Insurance data containing 1338 insured personnel with their Age, Gender, Body Mass Index (BMI), Number of Children, Smoking status, Region and Charges. You are a Data Analyst working for UnitedHealth Group. Your Manager, Daisy Pearce, has asked you to conduct a preliminary analysis. In particular, you are expected to apply a series of statistical techniques and produce a report based on your findings. Daisy’s email is reproduced on the next page.
Q1. An Overall View of both “Charges” and “Smoking”
Can you provide me with overall summaries of
a) Individual medical cost billed by health insurance
b) Smoking status
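One way to approach Q1 is with descriptive statistics for the numeric variable and a frequency table for the categorical one. The sketch below is only illustrative: the function names and the sample values are hypothetical placeholders, not figures from the actual dataset.

```python
from collections import Counter
from statistics import mean, median, stdev, quantiles


def summarise_numeric(values):
    """Return common descriptive statistics for a numeric variable."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles (default 'exclusive' method)
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation
        "min": min(values),
        "q1": q1,
        "q3": q3,
        "max": max(values),
    }


def summarise_category(labels):
    """Return {label: (count, proportion)} for a categorical variable."""
    counts = Counter(labels)
    total = len(labels)
    return {label: (count, count / total) for label, count in counts.items()}


# Hypothetical placeholder values, NOT taken from the insurance data.
example_charges = [1200.0, 3400.0, 8800.0, 15400.0, 42000.0]
example_smoking = ["no", "no", "yes", "no", "yes"]

print(summarise_numeric(example_charges))
print(summarise_category(example_smoking))
```

In a report, these numbers would normally be paired with a histogram or box plot for Charges and a bar or pie chart for Smoking status.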
Q2. Relationships
a) Is there a relationship between the age of the primary beneficiary, their body mass index (BMI), number of children and medical cost?
b) We would also like to know whether there is a gender bias in the smoking behaviour of the beneficiaries.
c) Can you further analyse whether the beneficiary's residential area/region in the US affects how health insurance providers bill their medical costs?
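As a rough illustration of Q2 (a) and (b) only, the sketch below computes a Pearson correlation between two numeric variables and a chi-square test of independence for a 2×2 table such as Gender by Smoking status. The function names and all counts are hypothetical placeholders; part (c), which compares a numeric variable across several regions, would need a different technique and is not covered here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    With 1 degree of freedom the upper-tail p-value is erfc(sqrt(stat / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (table[i][j] - expected) ** 2 / expected
    return stat, math.erfc(math.sqrt(stat / 2))


# Hypothetical placeholder counts, NOT taken from the insurance data.
# Rows: female, male. Columns: smoker, non-smoker.
example_table = [[40, 160], [60, 140]]
stat, p_value = chi_square_2x2(example_table)
print(f"chi-square = {stat:.3f}, p-value = {p_value:.4f}")
```

A small p-value would suggest that smoking status and gender are not independent; a correlation near zero would suggest little linear relationship between the two numeric variables.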
I realise that the US Health Insurance data contain a random sample of 1338 insured personnel, and that this information can be used to draw inferences about the specific attributes of the whole insured population and the charges billed by health insurance providers. With that in mind, please provide me with answers to the following questions:
Q3. The UnitedHealth Group would like estimates of the following.
a) Average medical cost for an older beneficiary (older adulthood: 56 years and older)
b) Proportion of smokers who are obese (BMI of at least 30)
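The estimates in Q3 are typically reported as confidence intervals. The following is a minimal sketch, assuming a 95% confidence level and a large-sample normal approximation (a t-based interval may be preferred for the mean when the subgroup is small). The function names and example inputs are hypothetical placeholders.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_ci(values, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(values)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    centre = mean(values)
    margin = z * stdev(values) / math.sqrt(n)
    return centre - margin, centre + margin


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical placeholder inputs, NOT taken from the insurance data.
# (a) charges for beneficiaries aged 56 and older
older_charges = [9800.0, 12500.0, 14100.0, 11200.0, 30500.0, 13300.0]
# (b) among smokers, the number with BMI >= 30
obese_smokers, all_smokers = 50, 100

print(mean_ci(older_charges))
print(proportion_ci(obese_smokers, all_smokers))
```

In practice the inputs would come from filtering the dataset: rows with Age of 56 or more for part (a), and rows where Smoking status is "yes" for part (b).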
Q4. The UnitedHealth Group would like a comparison between this year’s medical cost and the industry average.
a) The industry average medical cost for a single adult (i.e. without children) is at least $10,000. Is there any evidence to support this assertion?
b) Based on the industry average, less than 50% of beneficiaries are female. Can this claim also be substantiated?
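Both parts of Q4 can be framed as one-sample hypothesis tests. The sketch below assumes one possible reading: for (a), the null hypothesis is that the mean is at least $10,000 and the alternative is that it is lower; for (b), the claim "less than 50% female" is the alternative hypothesis. Both are therefore lower-tailed tests, here using a large-sample z approximation. The function names and inputs are hypothetical placeholders.

```python
import math
from statistics import NormalDist


def z_test_mean_lower(sample_mean, sample_sd, n, mu0):
    """Lower-tailed one-sample z-test: H0: mu >= mu0 vs H1: mu < mu0.

    Returns (z_statistic, p_value).
    """
    z = (sample_mean - mu0) / (sample_sd / math.sqrt(n))
    return z, NormalDist().cdf(z)


def z_test_proportion_lower(successes, n, p0):
    """Lower-tailed one-sample z-test: H0: p >= p0 vs H1: p < p0.

    The standard error uses p0, the value assumed under H0.
    Returns (z_statistic, p_value).
    """
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    return z, NormalDist().cdf(z)


# Hypothetical placeholder inputs, NOT taken from the insurance data.
z_a, p_a = z_test_mean_lower(sample_mean=9000.0, sample_sd=5000.0, n=100, mu0=10000.0)
z_b, p_b = z_test_proportion_lower(successes=48, n=100, p0=0.5)

print(f"(a) z = {z_a:.3f}, p-value = {p_a:.4f}")
print(f"(b) z = {z_b:.3f}, p-value = {p_b:.4f}")
```

A p-value below the chosen significance level (commonly 0.05) would lead to rejecting the null hypothesis in favour of the lower-tailed alternative.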
Q5. Appropriate Sample Size
One of the company’s overall goals is to estimate the average medical cost for all insured personnel to within $1000 (±$1000) and the proportion of all insured smokers to within 3%. Will a sample size of 1338 be large enough? If not, what sample size should be taken? What other factors should be taken into account when sampling?
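The required sample sizes in Q5 follow from the usual margin-of-error formulas, n = (z·σ/E)² for a mean and n = z²·p(1−p)/E² for a proportion, rounded up. The sketch below assumes a 95% confidence level; the function names and the planning standard deviation are hypothetical placeholders, and in practice σ and p would be estimated from the sample.

```python
import math
from statistics import NormalDist


def sample_size_mean(sigma, margin, confidence=0.95):
    """Smallest n so that a mean is estimated to within +/- margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


def sample_size_proportion(margin, p=0.5, confidence=0.95):
    """Smallest n so that a proportion is estimated to within +/- margin.

    p = 0.5 is the most conservative choice (it maximises p * (1 - p)).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Hypothetical planning value for the standard deviation of charges,
# NOT taken from the insurance data.
planning_sd = 12_000.0

print(sample_size_mean(planning_sd, margin=1000.0))
print(sample_size_proportion(margin=0.03))  # conservative case, p = 0.5: 1068
```

Comparing each result with 1338 shows whether the current sample meets that target; the larger of the two requirements governs if both goals must be met with a single sample.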