Data Analysis Tools – Testing a Potential Moderator

RESEARCH QUESTION

Does CO2 emission moderate the relationship between income per person and life expectancy from a global perspective? In other words, does the relationship between a country’s income per person and its citizens’ average life expectancy differ across levels of the country’s CO2 emission?

CO2 emission is selected as the moderating variable because CO2 affects health, which in turn is related to life expectancy. At the same time, wealthier populations are believed to produce more CO2. This study therefore examines whether CO2 emission moderates the relationship between income and life expectancy.

VARIABLES

Quantitative Explanatory Variable: incomeperperson (Income Per Person in US Dollars)

Quantitative Response Variable: lifeexpectancy (Average Number of Years a Newborn Child Would Live)

Categorical Moderating Variable: co2emissions (Cumulative CO2 emission in metric tons since 1751), recoded into three groups

Countries are divided into three groups based on levels of CO2 emission:

Group 1 (low) = 0 – 60,000,000 metric tons

Group 2 (moderate) = 60,000,001 – 1,000,000,000 metric tons

Group 3 (high) = 1,000,000,001 – 334,221,000,000 metric tons

SAS PROGRAM FOR TESTING MODERATION IN THE CONTEXT OF PEARSON CORRELATION COEFFICIENT

/* Start the data step */
LIBNAME mydata "/courses/d1406ae5ba27fe300 " access=readonly;
DATA new; set mydata.gapminder;

/* Assign label names for variables */
LABEL incomeperperson="Income Per Person" /*"Income Per Person – 2010 Gross Domestic Product Per Capita in Constant 2000 US$"*/
lifeexpectancy="Life Expectancy" /*"2011 Average Number of Years a Newborn Child Would Live"*/
co2emissions="2006 cumulative CO2 emission (metric tons) since 1751";

/* Group values of the third variable – moderator */
IF co2emissions LE 60000000 AND co2emissions GE 0 THEN co2emissionsgroup=1;
ELSE IF co2emissions LE 1000000000 AND co2emissions GT 60000000 THEN co2emissionsgroup=2;
ELSE IF co2emissions LE 334221000000 AND co2emissions GT 1000000000 THEN co2emissionsgroup=3;

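/* Sort by country, then by the moderator group; the BY statements below require data sorted by co2emissionsgroup */
PROC SORT; by COUNTRY;
PROC SORT; by co2emissionsgroup;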

/* Run Pearson Correlation Coefficient and Test Moderation with the Third Variable*/
PROC CORR; VAR lifeexpectancy incomeperperson; BY co2emissionsgroup;
RUN;

/* Create Scatter Plot */

PROC GPLOT; PLOT lifeexpectancy*incomeperperson; BY co2emissionsgroup;
RUN;

A post hoc test is not needed for a Pearson correlation coefficient. Post hoc tests apply only when a categorical explanatory variable has more than two levels. In this test, both income per person (explanatory variable) and life expectancy (response variable) are quantitative.
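
Because each correlation is computed within one CO2 emission group, it can help to check how many countries fall into each group before reading the results. The following is a minimal sketch, assuming the data set `new` created by the program above is still available in the session. The MISSING option is used so that countries with no CO2 value (and therefore no group) are counted as well.

```sas
/* Sketch: count the countries in each CO2 emission group.         */
/* Assumes the data set "new" from the program above.               */
/* MISSING also lists countries whose co2emissionsgroup is missing. */
PROC FREQ DATA=new;
    TABLES co2emissionsgroup / MISSING;
RUN;
```

A very small group would make its correlation estimate less stable, which is worth keeping in mind when comparing groups.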


OUTPUT 

[Images: SAS output screenshots (Moderator Result 001, 003, 004, 005, 006)]

INTERPRETATION 

For the low CO2 emission group (Group 1), the correlation between income per person and life expectancy is r = 0.46919 (p = 0.0011), which is statistically significant. The r value indicates a moderate, positive relationship between the two variables. The r² of 0.2201 means that income per person accounts for 22.01% of the variability in life expectancy in the low CO2 emission group.

For the moderate CO2 emission group (Group 2), the correlation between income per person and life expectancy is r = 0.48539 (p < 0.0001), which is statistically significant. The r value indicates a moderate, positive relationship between the two variables. The r² of 0.2356 means that income per person accounts for 23.56% of the variability in life expectancy in the moderate CO2 emission group.

For the high CO2 emission group (Group 3), the correlation between income per person and life expectancy is r = 0.69289 (p < 0.0001), which is statistically significant. The r value indicates a strong, positive relationship between the two variables. The r² of 0.4801 means that income per person accounts for 48.01% of the variability in life expectancy in the high CO2 emission group.
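
The r² values quoted above are the squares of the reported correlations. As a small sketch of that arithmetic, the DATA step below squares the three r values from the interpretation. The data set name `rsquared` and the variable names `r` and `rsq` are introduced here for illustration only and are not part of the original program.

```sas
/* Sketch: square each group's reported Pearson r to obtain r-squared. */
/* The data set and variable names are hypothetical.                    */
DATA rsquared;
    INPUT co2emissionsgroup r;
    rsq = r**2;   /* proportion of variability explained */
    DATALINES;
1 0.46919
2 0.48539
3 0.69289
;
RUN;

/* Print with four decimals: 0.2201, 0.2356, 0.4801 */
PROC PRINT DATA=rsquared NOOBS;
    FORMAT rsq 6.4;
RUN;
```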

The relationship between income per person and life expectancy is positive in all three CO2 emission groups, and its strength is broadly similar across them. Although the high CO2 emission group shows a somewhat stronger relationship, the level of CO2 emission does not appear to moderate the relationship between income and life expectancy. At every level of CO2 emission, higher income per person is associated with longer life expectancy.
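
One informal way to probe this conclusion is to look at confidence intervals for each group's correlation. The sketch below assumes the data set `new` from the program above, already sorted by `co2emissionsgroup`, and uses the FISHER option of PROC CORR, which reports confidence limits based on Fisher's z transformation. It is not a formal test of moderation: overlapping intervals would be consistent with the conclusion above, but would not prove it.

```sas
/* Sketch: confidence limits for r within each CO2 emission group. */
/* Assumes "new" is sorted by co2emissionsgroup (see PROC SORT).    */
PROC CORR DATA=new FISHER;
    WHERE co2emissionsgroup NE .;   /* skip countries with no group */
    VAR lifeexpectancy incomeperperson;
    BY co2emissionsgroup;
RUN;
```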