If you've collected data on a continuous variable (age, for example), it can be tempting to bin it before analysis, on the grounds that its relationship with the response may not be linear or otherwise easy to capture with traditional regression methods. Binning discards the information within each bin and forces the fitted response to jump at arbitrary cutpoints, which costs power. Consider regression splines instead: they give you a flexible fit to the response without that loss. Stephan Kolassa, in this r-help discussion about categorical variables, notes that Frank Harrell has an excellent page listing the problems associated with categorizing continuous variables.
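To make the comparison concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration: the data are simulated, the curved age effect is invented, and the cutpoints and knots are arbitrary. It fits a four-bin model and a linear regression spline with two knots, so both models use four parameters, and compares their residual sums of squares.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Simulated data (hypothetical): a smooth, curved effect of age plus noise.
n = 500
age = rng.uniform(20, 80, size=n)
y = 10 + 0.02 * (age - 45) ** 2 + rng.normal(0, 2, size=n)


def fit_rss(X, y):
    """Fit ordinary least squares and return the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


# Model 1: age cut into four bins, one indicator column per bin.
cuts = np.array([35.0, 50.0, 65.0])      # arbitrary cutpoints
bin_index = np.digitize(age, cuts)       # integer bin label, 0 to 3
X_binned = np.eye(4)[bin_index]          # one-hot coding, 4 parameters

# Model 2: linear regression spline with two knots.
# Columns: intercept, age, and one hinge term max(age - knot, 0) per knot.
knots = np.array([40.0, 60.0])           # arbitrary knot locations
hinges = np.maximum(age[:, None] - knots[None, :], 0.0)
X_spline = np.column_stack([np.ones(n), age, hinges])  # 4 parameters

rss_binned = fit_rss(X_binned, y)
rss_spline = fit_rss(X_spline, y)

print(f"RSS, four bins:       {rss_binned:.1f}")
print(f"RSS, linear spline:   {rss_spline:.1f}")
```

With a smooth underlying effect like this one, the spline should leave noticeably less unexplained variation than the bins for the same number of parameters. In practice you would more likely reach for a cubic or restricted cubic spline from a modelling package; the hinge basis is used here only because it is short enough to write out by hand.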