SYS6021: Statistical Modeling I Fall 2024
Basic Course Information
What are the contributing factors to the severity of train accidents? How do you predict if an e-mail is spam? How can you translate goal-directed problems such as these into actionable decisions and meaningful recommendations that can have vast societal implications? How can you harness multi-dimensional, heterogeneous data to analyze the problem? In this course, we will explore Evidence Informed Systems Engineering (EISE) practices and how they can be applied to difficult, open-ended problems.
The primary tools for EISE come from linear statistical models, and this course demonstrates the use of these models for problem understanding, prediction, and control. We will learn how to formulate hypotheses, build statistical models to test them, and make recommendations based on our findings. These steps can be laden with biases, for example in the data available to test the hypotheses or in the metrics used to assess success. We will learn how to identify and prevent these biases to ensure equitable outcomes.
The specific modeling tools we will cover include principal components analysis, multivariate linear regression, logistic regression, and time series analysis. In class, we will concentrate on the theory and practice of model construction, while weekly labs will assess your understanding of the theory and your ability to apply it in practice. Projects will provide open-ended problem-solving situations that illustrate the broad applicability of the methods in a setting similar to what you will encounter in the real world. We hope these projects illustrate the value of statistical modeling and that the course provides a foundation for future learning.
- Apply EISE approaches to solve real-world problems
- Formulate meaningful, testable hypotheses around those problems from associated data
- Identify appropriate statistical modeling technique(s) to test those hypotheses
- Assess the limitations of the information available to solve identified problems
- Uncover bias, errors, outliers, and influential observations in data and models
- Derive an actionable recommendation with statistical confidence using the evidential reasoning process
- Communicate the application of the EISE process to a problem through technical summaries/reports directed to a client and/or practicing engineer
- Recognize the limitations of methods learned in class
- Lay the foundation to learn more advanced modeling tools when those covered in class are insufficient
Prerequisites: SYS 3060, SYS 3034, and APMA 3012, or equivalent. It is recommended that students have a basic command of linear algebra, calculus, and statistics. We will use R for data analysis and RStudio for our programming sessions. Students are encouraged to familiarize themselves with R programming, R for Data Science, and RStudio.
No required textbook, but students are encouraged to read chapters from:
- Linear Models with R by Julian Faraway.
- An Introduction to Statistical Learning by Gareth James et al.
- Applied Linear Statistical Models by Michael H. Kutner et al.
- simpleR - Using R for Introductory Statistics, by John Verzani.
Schedule
Disclaimer: The instructor reserves the right to make changes to the syllabus, including weekly lab, project, and exam due dates. These changes will be announced as early as possible.
Date | Topic / Assignment |
---|---|
Aug. 28 (Wed) | Course Overview & Evidence-Informed Systems Engineering (EISE) |
Sep. 2 (Mon) | Visualization |
Sep. 4 (Wed) | Visualization |
Sep. 6 (Fri) | Release Lab 1 (Visualization), 9:00 am (ET) |
Sep. 9 (Mon) | Visualization |
Sep. 11 (Wed) | Visualization/Extremes |
Sep. 16 (Mon) | Principal Components Analysis (PCA) |
Sep. 18 (Wed) | Principal Components Analysis (PCA) |
Sep. 20 (Fri) | Release Lab 2 (PCA), 9:00 am (ET) |
Sep. 22 (Sun) | **Due** Lab 1 (Visualization), 11:59 pm (ET) |
Sep. 23 (Mon) | RDM Notebooks/PCA Exercise |
Sep. 25 (Wed) | Multiple Linear Regression (MLR) |
Sep. 30 (Mon) | Multiple Linear Regression (MLR) |
Oct. 2 (Wed) | Multiple Linear Regression (MLR) |
Oct. 4 (Fri) | Release Lab 3 (MLR), 9:00 am (ET) |
Oct. 6 (Sun) | **Due** Lab 2 (PCA), 11:59 pm (ET) |
Oct. 7 (Mon) | Multiple Linear Regression (MLR) / Release Group Project 1 |
Oct. 9 (Wed) | Multiple Linear Regression (MLR) |
Oct. 11 (Fri) | Release Lab 4 (Practice Midterm Review), 9:00 am (ET) |
Oct. 14 (Mon) | No Class -- Fall Break |
Oct. 16 (Wed) | Project 1 Group Time |
Oct. 20 (Sun) | **Due** Lab 3 (MLR), 11:59 pm (ET) |
Oct. 21 (Mon) | Midterm Review |
Oct. 23 (Wed, 2pm) - 27 (Sun, 11:59pm) | Midterm Exam (virtual) |
Oct. 27 (Sun) | **Due** Lab 4 (Practice Midterm Review), 11:59 pm (ET) |
Oct. 28 (Mon) | Generalized Linear Models (GLM) |
Oct. 30 (Wed) | Remembrance of Prof. Bill Scherer (GLM) |
Nov. 4 (Mon) | Generalized Linear Models (GLM) |
Nov. 6 (Wed) | Generalized Linear Models (GLM) |
Nov. 8 (Fri) | Release Lab 5 (GLM), 9:00 am (ET) |
Nov. 10 (Sun) | **Due** Project 1, 11:59 pm (ET) |
Nov. 11 (Mon) | Generalized Linear Models (GLM) |
Nov. 13 (Wed) | Generalized Linear Models (GLM) - Contest |
Nov. 17 (Sun) | **Due** Lab 5 (GLM), 11:59 pm (ET) |
Nov. 18 (Mon) | Time Series Analysis / Release Final Project |
Nov. 20 (Wed) | Time Series Analysis |
Nov. 22 (Fri) | Release Lab 6 (Time Series), 9:00 am (ET) |
Nov. 25 (Mon) | Time Series Analysis / Release Group Project 2 |
Nov. 27 (Wed) | No Class -- Thanksgiving Recess |
Dec. 2 (Mon) | Project 2 Group Time |
Dec. 4 (Wed) | Course Recap |
Dec. 8 (Sun) | **Due** Lab 6 (Time Series), 11:59 pm (ET) |
Dec. 15 (Sun) | **Due** Group Project 2 & Final Report, 11:59 pm (ET) |
Student Evaluation and Assessment
Grading:
- Bi-weekly labs: 20% (lowest dropped)
- Hands-On Activities: 10%
- Group Projects: 30%
- Midterm Exam: 20%
- Final Project: 20%
- Class Participation: +% (extra) -- includes Piazza + office hours participation.
Labs:
On Friday morning at 9 am ET, bi-weekly laboratory assignments based upon course and laboratory notes will be posted via Canvas.
These assignments provide exercises in R programming that supplement the material covered in class and provide the foundation for the projects.
Each lab assignment requires students to program in R and analyze a supplied data set.
The assignments are designed to assess your knowledge of statistical modeling techniques and their mechanics.
These assignments must be done individually, include an honor pledge, and be completed by Sunday night at 11:59 pm ET, typically two weeks later.
While there is no time limit for these assignments, they are designed not to take more than 50 minutes. Laboratory sessions are excellent practice for exams and real-world analysis under time constraints.
At the end of the semester, the lowest grade on the lab assignments will be dropped. Additionally, if you complete the Bonus Lab, it will replace your next lowest lab.
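The drop-and-replace rule above can be sketched as a small calculation. This is only an illustration in Python, not the official grading procedure: the function name `lab_average`, the sample scores, and the choice to apply the Bonus Lab only when it raises the average are all assumptions made for the example.

```python
def lab_average(lab_scores, bonus_score=None):
    """Average lab score after dropping the lowest lab and applying the Bonus Lab.

    Hypothetical helper for illustration; the instructors' actual
    computation may differ.
    """
    if len(lab_scores) < 2:
        raise ValueError("need at least two lab scores")

    # Drop the single lowest lab grade.
    kept = sorted(lab_scores)[1:]

    # Assumption: the Bonus Lab replaces the next-lowest lab only when
    # doing so improves the result.
    if bonus_score is not None and bonus_score > kept[0]:
        kept[0] = bonus_score

    return sum(kept) / len(kept)


# Made-up scores for six labs, on a 0-100 scale.
scores = [90, 70, 85, 100, 60, 95]
print(lab_average(scores))      # 88.0 (the 60 is dropped)
print(lab_average(scores, 80))  # 90.0 (the 60 is dropped, the 70 is replaced by 80)
```

Under these assumptions, the result would then count toward the 20% lab component of the course grade.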
Hands-On Activities:
Hands-on activities will be used throughout the course to allow you to practice the methods covered in class, recognize opportunities to apply them in your own work, and discover their shortcomings. Students attending online can still participate in the activities and will submit their work on Canvas under "Assignments" for pass/fail credit.
Projects:
The class will have two group projects on real-world topics and data sets. These exercises provide a real-world context for what we learn and are open-ended problem-solving experiences that illustrate the concepts of evidence-informed systems engineering. Hence, they provide the opportunity to demonstrate your understanding of class material using real data to solve a goal-directed problem. Projects are designed to teach students how to perform a detailed analysis and how to communicate results proficiently in a technical document or client report.
Exams:
Exams are based entirely on classroom notes and discussions, readings, projects, and laboratory assignments. Each exam will contain a closed-book section with short answer questions and an open-book section requiring analytical problem-solving.
Final Project:
The final project will be a detailed data analysis of a topic and dataset of your choosing. We encourage you to do something related to your research and are happy to work with you in selecting a data source and defining a project. You can also choose to extend any one of the projects. For instance, you could choose to do an extended analysis of train accidents. You must submit your topic description and data sources for your final project on Canvas by the specified date. In your final project, you must show competence in a subset of the topics discussed in class. Specifically, you must organize your work according to the principles of Evidence-Informed Systems Engineering and use two methods from the following topics: visualization, principal components, multiple linear regression, generalized linear models, time series analysis, and advanced topics.
Course Policies
Submission and Late Submission Policy:
On the day a project is due, you must submit an electronic copy in PDF (NOT doc or docx, etc.) along with source code on the Canvas site and pledge your submission. No late assignments will be accepted in this class unless the student has procured special accommodations for warranted circumstances. In many cases you will do better to submit an incomplete assignment rather than a late one.
Use of Generative Artificial Intelligence:
Generative artificial intelligence (AI) tools, meaning software that creates new text, images, computer code, audio, video, and other content, have become widely available. Well-known examples include ChatGPT for text and DALL-E for images. This policy governs all such tools, including those released during our semester together.
You may use generative AI tools on assignments in this course only when I explicitly permit you to do so. Otherwise, you should refrain from using such tools. If you do use generative AI tools on assignments in this class, you MUST properly document and credit the tools themselves. Cite the tool you used, following the pattern for computer software given in the specified style guide. Additionally, please include a brief description of how you used the tool.
If you choose to use generative AI tools, please remember that they are typically trained on limited datasets that may be out of date. Additionally, generative AI tools are trained on pre-existing material, including copyrighted material; therefore, relying on a generative AI tool may result in plagiarism or copyright violations. Finally, keep in mind that the goal of generative AI tools is to produce content that seems to have been produced by a human, not to produce accurate or reliable content; therefore, relying on a generative AI tool may result in your submission of inaccurate content. It is your responsibility, not the tool's, to assure the quality, integrity, and accuracy of work you submit in any college course.
If you use generative AI tools to complete assignments in this course in ways that I have not explicitly authorized, I will apply the UVA Honor Code as appropriate to your specific case. In addition, you must be wary of unintentional plagiarism or fabrication of data. Please act with integrity, for the sake of both your personal character and your academic record.
Recording of Lectures:
Every lecture will be recorded in order to accommodate students who are sick or cannot attend for some other reason. Because lectures include fellow students, you and they may be personally identifiable on the recordings. We might set aside some time at the end for questions that will not be recorded -- this will be announced when it takes place. These recordings may only be used for the purpose of individual or group study with other students enrolled in this class during this semester. You may not distribute them in whole or in part through any other platform or to any persons outside of this class, nor may you make your own recordings of this class unless written permission has been obtained from the Instructor and all participants in the class have been informed that recording will occur. If you want additional details on this, please see Provost Policy 008 and follow-up guidelines. If you notice that I have failed to activate the recording feature, please remind me!
Illness:
We try to create a safe environment, not only for our students, but also for our faculty and our staff. To that end, please stay home or in your dorm room if you are ill with or are symptomatic for any communicable disease. I would rather you stay home and work something out with me for making up work or taking an exam than for an illness to spread through the class. If you believe you are sick, please contact Student Health for appropriate treatment or testing.
Religious Accommodations:
It is the University's long-standing policy and practice to reasonably accommodate students so that they do not experience an adverse academic consequence when sincerely held religious beliefs or observances conflict with academic requirements.
Students who wish to request academic accommodation for a religious observance should submit their request to us by private message on Piazza as far in advance as possible. Students who have questions or concerns about academic accommodations for religious observance or religious beliefs may contact the University’s Office for Equal Opportunity and Civil Rights (EOCR) at UVAEOCR@virginia.edu or 434-924-3200.
Accessibility Statement:
It is our goal to create a learning experience that is as accessible as possible. If you anticipate any issues related to the format, materials, or requirements of this course, please meet with us outside of class so we can explore potential options. Students with disabilities may also wish to work with the Student Disability Access Center (SDAC) to discuss a range of options for removing barriers in this course, including official accommodations. We are fortunate to have an SDAC advisor, Courtney MacMasters, physically located in Engineering. You may reach her through sdac.studenthealth.virginia.edu. If you have already been approved for accommodations through SDAC, please send us your accommodation letter and meet with us so we can develop an implementation plan together.
Academic Integrity Statement:
"The School of Engineering and Applied Science relies upon and cherishes its community of trust.
We firmly endorse, uphold, and embrace the University’s Honor principle that students will not lie,
cheat, or steal, nor shall they tolerate those who do. We recognize that even one honor infraction can
destroy an exemplary reputation that has taken years to build. Acting in a manner consistent with the principles
of honor will benefit every member of the community both while enrolled in the Engineering School and in the future.
Students are expected to be familiar with the university honor code,
including the section on academic fraud."
In summary, if an assignment is individual, no two students should submit the same source code -- source code that is
sufficiently similar may be flagged as a failure to abide by the Honor Code. You can discuss the assignment and share resources,
but you may not share code, as this could constitute an Honor Code violation.
Regardless of circumstances, we will assume that any source code, text, or images submitted alongside reports or projects were authored
by the individual students unless explicitly stated otherwise through appropriate means. Any missing information regarding sources may be regarded
as a failure to abide by the academic integrity statement, even if that was not the intent. Please take care to state clearly
what is your original work and what is not in all assignments.
Additional Resources
Support for Career Development:
Engaging in your career development is an important part of your student experience. For example, presenting at a research conference, attending an interview for a job or internship, or participating in an extern/shadowing experience are not only necessary steps on your path but are also invaluable lessons in and of themselves. I wish to encourage and support you in activities related to your career development. To that end, please notify me by email as far in advance as possible to arrange for appropriate accommodations.
Student Support Team:
You have many resources available to you when you experience academic or personal stresses. In addition to your professors, the School of Engineering and Applied Science has staff members located in Thornton Hall who you can contact to help manage academic or personal challenges. Please do not wait until the end of the semester to ask for help!
Learning: You may schedule time with the CAPS counselors through Student Health. When scheduling, be sure to specify that you are an Engineering student. You are also urged to use TimelyCare for either scheduled or on-demand 24/7 mental health care.
Community and Identity: The Center for Diversity in Engineering (CDE) is a student space dedicated to advocating for underrepresented groups in STEM. It exists to connect students with the academic, financial, health, and community resources they need to thrive both at UVA and in the world. The CDE includes an open study area, event space, and staff members on site. Through this space, we affirm and empower equitable participation toward intercultural fluency and provide the resources necessary for students to be successful during their academic journey and future careers.
Harassment, Discrimination and Interpersonal Violence:
The University of Virginia is dedicated to providing a safe and equitable learning environment for all students. If you or someone you know has been affected by power-based personal violence, more information can be found on the UVA Sexual Violence website that describes reporting options and resources available - www.virginia.edu/sexualviolence.
The same resources and options for individuals who experience sexual misconduct are available for discrimination, harassment, and retaliation. UVA prohibits discrimination and harassment based on age, color, disability, family medical or genetic information, gender identity or expression, marital status, military status, national or ethnic origin, political affiliation, pregnancy (including childbirth and related conditions), race, religion, sex, sexual orientation, or veteran status. UVA policy also prohibits retaliation for reporting such behavior.
If you witness or are aware of someone who has experienced prohibited conduct, you are encouraged to submit a report to Just Report It (justreportit.virginia.edu) or contact the Office for Equal Opportunity and Civil Rights (EOCR).
If you would prefer to disclose such conduct to a confidential resource where what you share is not reported to the University, you can turn to Counseling & Psychological Services (“CAPS”) and Women’s Center Counseling Staff and Confidential Advocates (for students of all genders).
As your professors, know that we care about you and your well-being and stand ready to provide support and resources as we can. As faculty members, we are responsible employees, which means that we are required by University policy and by federal law to report certain kinds of conduct that you report to us to the University's Title IX Coordinator. The Title IX Coordinator's job is to ensure that the reporting student receives the resources and support that they need, while also determining whether further action is necessary to ensure survivor safety and the safety of the University community.