Predict who survives the Titanic disaster using Excel.
Logistic regression allows us to predict a categorical outcome using categorical and numeric data. For example, we might want to predict which college alumni will agree to make a donation based on their age, gender, graduation date, and prior history of donating. Or we might want to predict whether or not a loan will default based on credit score, purpose of the loan, geographic location, marital status, and income. Logistic regression will allow us to use the information we have to predict the likelihood of the event we're interested in. Linear regression helps us answer the question, "What value should we expect?" while logistic regression tells us, "How likely is it?"
Given a set of inputs, a logistic regression equation will return a value between 0 and 1, representing the probability that the event will occur. Based on that probability, we might then choose to either take or not take a particular action. For example, we might decide that if the likelihood that an alumnus will donate is below 5%, then we're not going to ask them for a donation. Or if the probability of default on a loan is above 20%, then we might refuse to issue the loan or offer it at a higher interest rate.
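To make this concrete, here is a minimal Python sketch of how a logistic regression equation turns inputs into a probability and then into a decision. It is an illustration only, not the Excel workflow: the coefficient values and the names `donation_probability` and `should_ask` are made up for this example, and a real model would estimate its coefficients from data.

```python
import math

def logistic(z):
    """Squash any real number into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, invented for illustration only.
INTERCEPT = -4.0
COEF_AGE = 0.03
COEF_PRIOR_DONOR = 1.5

def donation_probability(age, prior_donor):
    """Estimated probability that an alumnus donates, given two inputs."""
    z = INTERCEPT + COEF_AGE * age + COEF_PRIOR_DONOR * (1 if prior_donor else 0)
    return logistic(z)

def should_ask(probability, cutoff=0.05):
    """Ask for a donation unless the probability falls below the cutoff."""
    return probability >= cutoff

p = donation_probability(age=40, prior_donor=True)
print(f"Probability of donating: {p:.3f}, ask: {should_ask(p)}")
```

The linear part (`z`) can be any real number; the logistic function is what guarantees the result can be read as a probability. The 5% cutoff is the one from the alumni example above.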
How we choose the cutoff depends on a cost-benefit analysis. For example, even if there is only a 10% chance that an alumnus will donate, a call that takes only two minutes is probably worthwhile when the average donation is 100 dollars: the expected return is 0.10 × 100 = 10 dollars per call.
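The same cost-benefit reasoning can be sketched in a few lines of Python. The 10% probability and the 100 dollar average donation come from the example above; the cost of 1 dollar per two-minute call is an assumption added here purely for illustration, as are the function names.

```python
def expected_value_of_call(probability, average_donation, cost_per_call):
    """Expected net return of one call: chance of success times payoff, minus cost."""
    return probability * average_donation - cost_per_call

def break_even_probability(average_donation, cost_per_call):
    """Smallest donation probability at which a call pays for itself."""
    return cost_per_call / average_donation

# Assumed cost of a two-minute call (hypothetical figure).
COST_PER_CALL = 1.0

net = expected_value_of_call(0.10, 100.0, COST_PER_CALL)
cutoff = break_even_probability(100.0, COST_PER_CALL)
print(f"Expected net return per call: {net:.2f} dollars")
print(f"Break-even probability: {cutoff:.2%}")
```

Under these assumptions, any alumnus whose predicted probability is above the break-even value is worth calling, which is one way to arrive at a cutoff rather than picking it arbitrarily.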