A scatterplot can reveal several different types of relationships between two variables. There is no correlation when the points on a scatterplot do not show any pattern. A relationship is non-linear when the points on a scatterplot follow a pattern that is not a straight line.
Because visual examinations are largely subjective, we need a more precise and objective measure to define the correlation between the two variables. To quantify the strength and direction of the linear relationship between two variables, we use the linear correlation coefficient, r, which ranges from -1 to +1.
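As a minimal sketch of how the linear correlation coefficient can be computed, the example below uses the usual sample formula: the sum of the products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the data values are assumptions made up for illustration; they do not come from the text.

```python
import math


def pearson_r(x, y):
    """Sample linear correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```

A value near +1 or -1 indicates a strong positive or negative linear relationship, while a value near 0 indicates little or no linear relationship.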
The difference between the observed data value and the predicted value (the value on the straight line) is the error or residual.
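The definition of a residual can be illustrated with a short sketch. It assumes the straight line is the ordinary least-squares line, which is the usual choice in simple linear regression but is not stated in this passage. The function name `fit_line` and the data values are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for the line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)

# Predicted value on the line for each observed x
predicted = [b0 + b1 * xi for xi in x]

# Residual = observed value minus predicted value
residuals = [yi - pi for yi, pi in zip(y, predicted)]

for xi, yi, pi, ei in zip(x, y, predicted, residuals):
    print(f"x={xi}  observed={yi}  predicted={pi:.2f}  residual={ei:+.2f}")
```

For a least-squares line with an intercept, the residuals sum to zero (up to rounding), so positive and negative errors balance out.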
A hydrologist creates a model to predict the volume flow for a stream at a bridge crossing with a predictor variable of daily rainfall in inches.
The larger the unexplained variation, the worse the model is at prediction. A quantitative measure of the explanatory power of a model is R², the coefficient of determination, which is the proportion of the total variation in y that is explained by the model.
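One common way to compute the coefficient of determination is as one minus the ratio of unexplained variation (the sum of squared residuals) to total variation (the sum of squared deviations of y from its mean). The sketch below follows that form. The function name `r_squared`, the data, and the line coefficients are assumptions for illustration; the coefficients are the least-squares values for this particular made-up data set.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - (unexplained variation / total variation)."""
    mean_y = sum(observed) / len(observed)
    # Unexplained variation: sum of squared residuals
    sse = sum((yi - pi) ** 2 for yi, pi in zip(observed, predicted))
    # Total variation: sum of squared deviations from the mean of y
    sst = sum((yi - mean_y) ** 2 for yi in observed)
    if sst == 0:
        raise ValueError("R^2 is undefined when the observed values are constant")
    return 1 - sse / sst


# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Assumed least-squares line for this data: y = 2.2 + 0.6 * x
b0, b1 = 2.2, 0.6
predicted = [b0 + b1 * xi for xi in x]

r2 = r_squared(y, predicted)
print(round(r2, 4))  # 0.6
```

Here about 60% of the variation in y is explained by the line. In simple linear regression with a least-squares line, this value equals the square of the linear correlation coefficient.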
Even though you have determined, using a scatterplot, correlation coefficient, and R², that x is useful in predicting the value of y, the results of a regression analysis are valid only when the data satisfy the necessary regression assumptions.
The response variable (y) is a random variable, while the predictor variable (x) is assumed to be non-random (fixed) and measured without error.