CSC 423/324 -- Final Project
 
Grading Criteria
- Grading Breakdown for Final Project: Technical Report: 40%, Non-technical Summary: 
25%, PowerPoint Slides: 25%, Group Member Evaluation: 10%.
 
- Grading Criteria for Summary (Presentation, PowerPoint Slides, and Non-technical 
Summary): Accurate: 25%, Complete: 25%,  Concise: 25%,  Interesting: 25%
 
- Grading Criteria for Technical Report: Correct and Complete: 40%, Well-written: 20%, Appropriate Dataset: 20%, Interesting: 20%
 
- Group Member Evaluation Form
 
Due Dates
 
	- Final Project presentations will be on the last day of class, which is 
	August 16 (Day 10). Post the names of your group members and the project 
	title by Monday, August 14 (Day 9).
 
Groups for Final Project
- Final projects are to be completed in groups of two or three.  Online Learning students 
may work with in-class students.
 
- A discussion forum named FinalProject has been created on http://d2l.depaul.edu with two topics: 
FormProjectGroup, for inviting persons to join your group of two or three, and 
ProjectInfo, for posting the title of your final project and group members after you have formed 
your group.
 
Submission Items
- A non-technical summary, not more than one page. This 
summary should present your results to a non-technical audience, such as your 
boss, your friend, or your mother (assuming that they do not understand 
statistics in detail).
 
- A technical report (suggested length, 5 pages) with the details of your analysis, presented for a statistically literate audience.
This report must be clearly written in complete sentences with an introduction and conclusion.  You can include input datasets, source code, output, and graphs in an appendix.
 
- PowerPoint slides for your final presentation.  Online Learning students will not 
do an in-class presentation, but they will still create PowerPoint slides.
 
- The Group Member Evaluation Form.  Don't submit this form if you are working alone.
 
- The three files from the first three items above (summary, technical report, and 
slides) should be submitted in a single zip file whose name includes the names of 
the group members, for example finalproj-larsson-smith-suarez.zip. Only one group 
member needs to submit the zip file with the submission items listed in this 
section. However, the other member or members should submit a comment 
stating who is in the group and who is submitting the submission items.
 
Final Project Content
Your final project report should address these points:
- The dependent variable for your regression models must be a continuous 
variable unless you are using logistic regression.
 
- Your dataset should include at least 8 variables, and preferably at least 10 observations per variable. 
Ideally, you should set aside a random selection of the data for validation, so 
if you hold out 50% of the observations for validation, you will still be left 
with at least 5 observations per variable.
 
- The exploratory data analysis may suggest a model that is adequate for fitting the data.  Do the data show a nonlinear relationship, so that a transformation of the data is needed?
 
- Check for collinearity among the independent variables and adjust your model 
accordingly.
 
	- Did you use a variable selection procedure? If so, describe it.
 
	- If several models appear equally good for predicting the data, discuss 
	them all, including, perhaps, an intuitive choice of the model that makes 
	the most sense.
 
	- Analyze the residual plots. They might indicate violations of the 
	assumptions or inadequacies in the model.
 
	- Check for leverage points, influential points, and outliers in your 
	dataset. Decide if they should be deleted or if the model needs to be 
	modified to account for them.
 
	- Can your model be improved? Are you satisfied with the model you 
	selected?
 
	- Use your selected model to examine the relationships among the variables.
	Identify the strongest predictors among the independent variables.
 
	- Apply cross-validation techniques to evaluate how well your model does 
	for prediction.
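
The hold-out arithmetic in the dataset-size item above can be checked with a short sketch. This is only an illustration, not a required procedure: the helper name holdout_split, the seed, and the choice of exactly 8 variables with 10 observations each are assumptions made for the example, and it uses only the Python standard library.

```python
import random

def holdout_split(n_obs, holdout_frac=0.5, seed=0):
    """Randomly partition row indices 0..n_obs-1 into (train, validation)."""
    idx = list(range(n_obs))
    random.Random(seed).shuffle(idx)      # reproducible random order
    n_val = int(n_obs * holdout_frac)     # number of rows held out
    return sorted(idx[n_val:]), sorted(idx[:n_val])

n_vars = 8            # minimum number of variables asked for above
n_obs = 10 * n_vars   # 10 observations per variable -> 80 rows (assumed)

train_idx, val_idx = holdout_split(n_obs, holdout_frac=0.5)
per_var_after = len(train_idx) / n_vars   # observations per variable left for fitting

print(len(train_idx), len(val_idx), per_var_after)   # 40 40 5.0
```

With these assumed numbers, half of the 80 rows are held out and the fitting set still has 5 observations per variable, which matches the guideline above. Fit your models on the rows in train_idx and use the rows in val_idx only for validation.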
 
Note: even if you do not find a satisfactory regression model for your 
dataset, you can still explain what models you tried and why they were not 
satisfactory.  If you found two models that seemed equally good, compare 
them, using graphs and goodness-of-fit statistics.
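
One way to compare two candidate models on predictive performance is k-fold cross-validation. The sketch below is a minimal illustration under stated assumptions: the data are made up, the two "models" are a one-predictor least-squares line and an intercept-only baseline, and the helper names (fit_line, fit_mean, kfold_rmse) are hypothetical. In your project you would normally use your statistical software's own regression and cross-validation routines instead.

```python
import math
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def fit_mean(xs, ys):
    """Intercept-only baseline: predicts the mean of y (slope fixed at 0)."""
    return sum(ys) / len(ys), 0.0

def kfold_rmse(xs, ys, fit, k=5, seed=0):
    """Average out-of-fold RMSE of the model returned by `fit` over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # k disjoint folds covering all rows
    rmses = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        a, b = fit([xs[i] for i in train], [ys[i] for i in train])
        sq_err = [(ys[i] - (a + b * xs[i])) ** 2 for i in fold]
        rmses.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(rmses) / k

# Made-up data (an assumption): a linear trend plus a small deterministic wiggle
xs = [float(i) for i in range(30)]
ys = [1.0 + 2.0 * x + math.sin(x) for x in xs]

rmse_line = kfold_rmse(xs, ys, fit_line)
rmse_mean = kfold_rmse(xs, ys, fit_mean)
print(f"line: {rmse_line:.3f}  mean-only: {rmse_mean:.3f}")
```

The model with the lower out-of-fold RMSE predicts unseen observations better on this data. Because both models are scored on the same folds (same seed), the comparison is like for like. Report such a number alongside your graphs and goodness-of-fit statistics rather than in place of them.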