
When moving on to assumptions #3, #4, #5, #6 and #7, we suggest testing them in this order because it represents an order where, if a violation to the assumption is not correctable, you will no longer be able to use linear regression. In this guide, we show you the linear regression procedure and Stata output when both your dependent and independent variables were measured on a continuous level.įortunately, you can check assumptions #3, #4, #5, #6 and #7 using Stata. In case you are unsure, examples of categorical variables include gender (e.g., 2 groups: male and female), ethnicity (e.g., 3 groups: Caucasian, African American and Hispanic), physical activity level (e.g., 4 groups: sedentary, low, moderate and high), and profession (e.g., 5 groups: surgeon, doctor, nurse, dentist, therapist). However, if you have a categorical independent variable, it is more common to use an independent t-test (for 2 groups) or one-way ANOVA (for 3 groups or more).

#Stata tutor how to
In this guide, we show you how to carry out linear regression using Stata, as well as interpret and report the results from this test. We will refer to these as dependent and independent variables throughout this guide. Ultimately, whichever term you use, it is best to be consistent. Note: The dependent variable is also referred to as the outcome, target or criterion variable, whilst the independent variable is also referred to as the predictor, explanatory or regressor variable.

Alternatively, if you just wish to establish whether a linear relationship exists, you could use Pearson's correlation.

If you have two or more independent variables, rather than just one, you need to use multiple regression. Alternately, you could use linear regression to understand whether cigarette consumption can be predicted based on smoking duration (i.e., your dependent variable would be "cigarette consumption", measured in terms of the number of cigarettes consumed daily, and your independent variable would be "smoking duration", measured in days). For example, you could use linear regression to understand whether exam performance can be predicted based on revision time (i.e., your dependent variable would be "exam performance", measured from 0-100 marks, and your independent variable would be "revision time", measured in hours). Linear regression, also known as simple linear regression or bivariate linear regression, is used when we want to predict the value of a dependent variable based on the value of an independent variable. Linear regression analysis using Stata Introduction
