CSIS 344 Introduction to Data Science


Course Description

An introduction to foundational concepts in data science, including: information retrieval and storage, preprocessing, visualization, exploratory data analysis, applied machine learning, research methods, and experimental design. Students will develop solutions to computational problems spanning a variety of disciplines using state-of-the-art scientific programming tools and techniques, with an emphasis on the interpretation and presentation of experimental results.


Instructor

Brian R. Snider
Office hours: Wood-Mar 223 (see schedule)


Texts

optional


Resources


Objectives

Students will understand:

Students will gain practical experience processing, visualizing, and exploring data, and will design and implement solutions to computational problems spanning a variety of disciplines.


Course Organization

This course consists of lectures and hands-on programming and data visualization exercises. Assignments will be carried out in the Python programming language. Some instruction in the use of this language and its supporting packages will be provided during lecture; however, I expect that you will consult additional resources to supplement your knowledge.

The course will include regular homework and/or programming assignments. No credit will be given for late assignments without an excused absence, so turn in as much as you can by the deadline. Unless otherwise specified, no handwritten work will be accepted.

Any assigned reading should be completed before the lecture covering the material, per the provided schedule. Not all reading material will be covered in the lectures, but you are still responsible for it on homework and exams. Quizzes over the assigned reading may be given at any time.


Academic Integrity and Collaboration

See the university's policy on academic honesty. See also the university's policy on the use of generative AI and related tools in an academic setting. Any suspected incidents of academic integrity violations will be investigated and reported to the Academic Affairs Office as they arise.

Unless otherwise specified (e.g., for a group assignment or project), you are expected to do your own work. This also applies to the use of online resources (e.g., solution guides), help forums (e.g., StackOverflow), and generative models (e.g., ChatGPT). Put simply: if you are representing someone (or something) else's work as your own, you are being dishonest.

For any given assignment, project, or similar work, if I judge it more likely than not that some or all of your submitted materials are not primarily your own work, I may ask you to orally explain and defend the approach or techniques used, the relevant theory or foundational concepts, etc., as part of my investigative process. If the work is truly your own (or based on synthesis or application of outside resources resulting in actual learning), it should be very straightforward for you to explain things. However, if I find that you cannot sufficiently explain things, my only logical recourse will be to assume that the work is not your own. Put simply, again: you should be able to explain anything that you turn in for grading; if you cannot explain it, do not turn it in and represent it as your own work.

Most students would be surprised at how easy it is to detect inappropriate collaboration or other academic integrity violations such as plagiarism in programming, or over-reliance on generative models or similar tools without any understanding of the underlying concepts. Remember: you always have willing and legal collaborators in the faculty. I encourage you to ask questions before, during, or after class, ask for help in the CS lab, and visit my office hours for assistance.

Almost all of life is filled with collaboration (i.e., people working together). Yet in our academic system, we artificially limit collaboration. These limits are designed to force you to learn fundamental principles and build specific skills. It is very artificial, and you'll find that collaboration is a valuable skill in the working world. While some of you may be tempted to collaborate too much, others will collaborate too little. When appropriate, it's a good idea to work with others; the purpose here is to learn. Be sure to make the most of this opportunity, but do it earnestly and with integrity.


University Resources

Accessibility and Disability

If you have specific physical, psychiatric, or learning disabilities and require accommodations, please contact Disability & Accessibility Services (DAS) as early as possible so that your learning needs can be appropriately met. For more information, go to georgefox.edu/das or contact das@georgefox.edu.

My desire as a professor is for this course to be welcoming to, accessible to, and usable by everyone, including students who are English-language learners, have a variety of learning styles, have disabilities, or are new to online learning systems. Be sure to let me know immediately if you encounter a required element or resource in the course that is not accessible to you. Also, let me know of changes I can make to the course so that it is more welcoming to, accessible to, or usable by students who take this course in the future.

Academic Resource Center

The Academic Resource Center (ARC) on the Newberg campus provides all undergraduate students with free writing consultation, academic coaching, and learning strategy review (e.g., techniques to improve reading, note-taking, study, time management). The ARC offers in-person appointments; if necessary, Zoom appointments can be arranged by request. The ARC, located on the first floor of the Murdock Library, is open during the academic year from 1:00–9:00 p.m., Monday through Thursday, and 12:00–4:00 p.m. on Friday. To schedule an appointment, click on the TracCloud icon on the Canvas dashboard, go to traccloud.georgefox.edu, call 503-554-2327, email the_arc@georgefox.edu, or stop by the ARC. Visit arc.georgefox.edu for information about ARC Consultants' areas of study, instructions for scheduling an appointment, learning tips, and a list of other tutoring options on campus.

Student Support Network

George Fox University uses a robust referral and support system, Fox360, to learn about students who are experiencing various student success concerns. Students who are referred by a professor, other employee, or fellow student will be contacted by a member of our Student Support Network to explore the student's situation, develop a plan, and connect with relevant campus resources. GFU community members who have a concern about a student's well-being can submit an alert by going to fox360.georgefox.edu. Our goal is to provide 360° care for students as they navigate their college experience. For more information see ssn.georgefox.edu or contact Rick Muthiah, Director of Learning Support Services.


Grading

Grading Scale

The final course grade will be based on:

Graded course activities will be posted to Canvas. Take care to read the specifications carefully and proceed as directed. Failure to pay attention to detail will often result in few or no points being awarded on a given activity.

Grades will be updated as often as possible; you are encouraged to use the "What-If" functionality to calculate your total grade by entering hypothetical scores for various items.

Note that some graded activities in this course will be submitted via GitLab.


Tentative Schedule

Week 1 · Tue

Introduction; Environment Setup

References: Conda, PyCharm

Week 1 · Thu

Filesystem-Based Data

References: Filesystem, I/O, CSV

Week 2 · Tue

Python Lists, Tuples, Sets, and Dictionaries

References: Python structures

Week 2 · Thu

NumPy Arrays

References: numpy.ndarray, numpy.genfromtxt

Week 3 · Tue

Exploratory Data Analysis and Visualization

References: scipy.stats, matplotlib.pyplot

Week 3 · Thu

Plot Layout and Formatting; Plot Types

References: matplotlib guide, samples

Week 4 · Tue

Outliers and Missing Values

References: numpy.genfromtxt, sklearn.impute

Week 4 · Thu

Additional Data Formats and Tools

References: numpy, scipy.io, json, sqlite3, skimage, skvideo

Week 5 · Tue

Data Exploration presentations

Week 5 · Thu

Mid-semester break—no classes

Week 6 · Tue

Pandas DataFrame and Series

References: Pandas overview, structures, I/O

Week 6 · Thu

Pandas DataFrame and Series

References: Pandas overview, structures, I/O

Week 7 · Tue

Hypothesis Formulation

References: Statistical hypothesis testing

Week 7 · Thu

Midterm exam

Week 8 · Tue

Hypothesis Workshop

Week 8 · Thu

Hypothesis Testing; Statistical Assumptions

References: scipy.stats, matplotlib.pyplot.hist

Week 9 · Tue

Hypothesis presentations

Week 9 · Thu

Machine Learning

References: Iris dataset, sklearn.datasets.load_iris

Week 10 · Tue

Classification

References: sklearn.svm, sklearn.model_selection.train_test_split

Week 10 · Thu

Classification Metrics

References: sklearn.metrics, sklearn.metrics.ConfusionMatrixDisplay

Week 11 · *

Spring break—no classes

Week 12 · Tue

Feature Encoding and Discretization

References: sklearn.preprocessing, pandas.get_dummies

Week 12 · Thu

Feature Standardization and Normalization

References: sklearn.preprocessing

Week 13 · Tue

Cross-Validation; Class Imbalance

References: sklearn.model_selection

Week 13 · Thu

Feature Selection

References: sklearn.feature_selection

Week 14 · Tue

Hyper-Parameter Tuning

References: sklearn.model_selection

Week 14 · Thu

Best Practices

References: sklearn user guide

Week 15 · *

Project presentations

Week 16 · TBD

Final exam


This page was last modified on 2026-03-31 at 09:13:28.

Copyright © 2015–2026 George Fox University. All rights reserved.