Advanced Statistics Project: Analyze datasets using Inferential Statistics to address business questions on CMSU student demographics, asphalt shingle quality, and salary trends by education and occupation, supporting data-driven decisions and operational improvements.

SenSoumalya/Inferential-Statistics-Project

Inferential Statistics Project

Course: Advanced Statistics

Project Overview

This project analyzes three separate datasets with the following objectives:

  1. CMSU Student Survey: Use probability and conditional probability analysis to examine gender distribution, major choices, graduation intentions, and other factors among CMSU students.
  2. Shingle Moisture Analysis: Conduct hypothesis testing on moisture content in two types of asphalt shingles to ensure product quality.
  3. Salary Data Study: Use ANOVA to understand salary variations by education level and occupation type and test for interaction effects.

Data Files

The project uses the following datasets:

  • A+%26+B+shingles.csv: Contains moisture content data for two types of shingles (A and B).
  • SalaryData.csv: Records salary information, educational qualifications, and occupation levels.
  • Survey.csv: Responses from 62 CMSU undergraduate students to a 14-question survey.

Problem Descriptions and Objectives

Problem 1: CMSU Student Survey

  • Objective: Calculate probabilities for various student characteristics, such as:
    • Gender distribution and major choices among male and female students.
    • Graduation intentions, GPA distribution, and employment status.
  • Key Questions:
    • What is the probability that a CMSU student is male or female?
    • What is the conditional probability of specific majors among male and female students?
    • What are the probabilities associated with GPA, graduation intentions, and computer ownership?
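As a rough illustration of the probability questions above, the sketch below computes a marginal and a conditional probability by counting matching rows. The rows and the column names `Gender` and `Major` are hypothetical stand-ins, not values taken from Survey.csv, and the helper names are illustrative only.

```python
from fractions import Fraction

# Hypothetical survey rows; the real Survey.csv columns and values may differ.
rows = [
    {"Gender": "Female", "Major": "Accounting"},
    {"Gender": "Female", "Major": "Management"},
    {"Gender": "Male", "Major": "Accounting"},
    {"Gender": "Male", "Major": "Management"},
    {"Gender": "Male", "Major": "Management"},
]


def matches(row, conditions):
    """Return True if the row has the given value for every listed column."""
    return all(row[col] == value for col, value in conditions.items())


def probability(rows, event):
    """P(event): share of all rows that satisfy the event."""
    return Fraction(sum(matches(r, event) for r in rows), len(rows))


def conditional_probability(rows, event, given):
    """P(event | given): share of the rows satisfying `given` that also satisfy `event`."""
    subset = [r for r in rows if matches(r, given)]
    return Fraction(sum(matches(r, event) for r in subset), len(subset))


p_male = probability(rows, {"Gender": "Male"})
p_mgmt_given_male = conditional_probability(
    rows, {"Major": "Management"}, {"Gender": "Male"}
)
print(p_male, p_mgmt_given_male)
```

Using `Fraction` keeps the results exact, which makes it easy to check them against a hand-built contingency table.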

Problem 2: Shingle Moisture Content Analysis

  • Objective: Determine if the moisture content in shingles A and B meets quality standards.
  • Key Questions:
    • Is the average moisture content below the permissible limit (0.35 pounds/100 sq ft)?
    • Are the mean moisture levels in shingles A and B equal?
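The two questions above correspond to a one-sample test against the 0.35 limit and a two-sample comparison of means. The sketch below computes only the test statistics and degrees of freedom, using made-up moisture readings. The choice of Welch's unequal-variance form for the two-sample case is an assumption, since this README does not say which variant the notebook uses.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical moisture readings (pounds per 100 sq ft), not the real data.
shingle_a = [0.30, 0.32, 0.34, 0.28, 0.31]
shingle_b = [0.27, 0.29, 0.25, 0.31, 0.28]
LIMIT = 0.35  # permissible limit stated in the problem description


def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for testing the sample mean against mu0."""
    n = len(sample)
    standard_error = sqrt(variance(sample) / n)
    return (mean(sample) - mu0) / standard_error, n - 1


def welch_t(x, y):
    """Welch's t statistic and approximate degrees of freedom (unequal variances)."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# A strongly negative t supports "mean below the limit" in a left-tailed test.
t_a, df_a = one_sample_t(shingle_a, LIMIT)
t_ab, df_ab = welch_t(shingle_a, shingle_b)
print(t_a, df_a, t_ab, df_ab)
```

To get p-values, each statistic would be compared with a t distribution with the corresponding degrees of freedom, for example through a statistics library.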

Problem 3: Salary Data Study

  • Objective: Assess the effect of educational qualification and occupation on salary using statistical tests.
  • Key Questions:
    • Is there a significant difference in salary based on education levels?
    • Is there a difference in salary based on occupation?
    • Does an interaction exist between education and occupation affecting salary?
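The three questions above map onto the two main effects and the interaction term of a two-way ANOVA. The sketch below computes the F statistics by hand for a small, balanced, hypothetical dataset in which every education and occupation combination has the same number of observations. The real SalaryData.csv may well be unbalanced, in which case these formulas do not apply directly and a regression-based ANOVA from a statistics library would be the usual route.

```python
from statistics import mean

# Hypothetical balanced data: (education, occupation) -> salaries (arbitrary units).
cells = {
    ("HS", "Clerical"): [10, 12],
    ("HS", "Sales"): [14, 16],
    ("Bachelors", "Clerical"): [20, 22],
    ("Bachelors", "Sales"): [30, 32],
}


def two_way_anova_f(cells):
    """F statistics for both main effects and the interaction (balanced design only)."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # replicates per cell
    if len(cells) != len(a_levels) * len(b_levels) or any(
        len(v) != n for v in cells.values()
    ):
        raise ValueError("this sketch assumes a complete, balanced design")
    if n < 2:
        raise ValueError("at least two replicates per cell are needed")

    grand = mean(x for v in cells.values() for x in v)
    cell_mean = {k: mean(v) for k, v in cells.items()}
    a_mean = {a: mean(cell_mean[(a, b)] for b in b_levels) for a in a_levels}
    b_mean = {b: mean(cell_mean[(a, b)] for a in a_levels) for b in b_levels}

    # Sums of squares for each source of variation.
    ss_a = len(b_levels) * n * sum((m - grand) ** 2 for m in a_mean.values())
    ss_b = len(a_levels) * n * sum((m - grand) ** 2 for m in b_mean.values())
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    ss_err = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    ms_err = ss_err / (len(a_levels) * len(b_levels) * (n - 1))
    return {
        "education": (ss_a / df_a) / ms_err,
        "occupation": (ss_b / df_b) / ms_err,
        "interaction": (ss_ab / (df_a * df_b)) / ms_err,
    }


f_stats = two_way_anova_f(cells)
print(f_stats)
```

Each F statistic would then be compared with an F distribution to obtain a p-value. A large interaction F suggests that the salary gap between education levels differs by occupation.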

Analytical Methods

The project employs several statistical methods:

  • Probability & Conditional Probability: To explore gender, major choices, and other student characteristics.
  • Hypothesis Testing (t-tests): For testing mean moisture content against the permissible limit and for comparing the mean moisture content of shingles A and B.
  • ANOVA (Analysis of Variance): To analyze salary differences by education and occupation levels, and to assess any interaction effects.

Key Findings

Problem 1: CMSU Student Survey

  • Gender Distribution: Calculated probabilities for selecting a male or female student.
  • Major Preferences: Analyzed conditional probabilities for majors among male and female students.
  • Other Insights: Computed probabilities related to graduation intentions, GPA, and employment status among students.

Problem 2: Shingle Moisture Content Analysis

  • Moisture Content: Tested whether shingles A and B meet the moisture standard using hypothesis testing.
  • Comparison: Tested for equal means between shingles A and B to ensure quality consistency.

Problem 3: Salary Data Study

  • Salary by Education Level: Found significant differences in salary based on education.
  • Salary by Occupation: Observed variation in salaries across occupation types.
  • Interaction Effect: Discovered an interaction between education and occupation affecting salary outcomes, highlighting how education level impacts salary differently across occupations.

Files in This Repository

  • A+%26+B+shingles.csv: Moisture content data for shingles A and B.
  • SalaryData.csv: Data on salary, educational qualifications, and occupation.
  • Survey.csv: Survey responses from CMSU students.
  • AS_Extended_Project_Guided+_Template_Notebook+solution.ipynb: Jupyter notebook containing the data analysis and solution code.
  • AS_EXTENDED+PROJECT (1).pdf: Business report summarizing the analysis, results, and business implications.
