Facilitating Student Analysis of Real Data by Creating a Flexible Python Notebook
Data Labs in the Classroom:
Teaching Tips from the Community
Dr. Tracy Quan, OOI Data Lab Fellow 2020
I am an Associate Professor in the Boone Pickens School of Geology (BPSoG) at Oklahoma State University. My current research focuses on stable isotope and organic geochemistry, but my background is in oceanography.
In the Fall of 2018, I developed Introduction to Oceanography Online, an upper-level undergraduate natural science general education course taught using a weekly asynchronous module format.
Enrollment is approximately 20-25 students per semester with a wide range of experiences and majors. Most students are juniors and seniors who are science majors but not Geology majors, though there are usually several non-science majors taking the class.
The schedule is split into three general sections: Physical/Geological Oceanography, Chemical Oceanography, and Biological Oceanography. In addition to the weekly modules, each section has a capstone project that incorporates the material from that section.
For the Chemical Oceanography section, I wanted to create a project that would require students to apply the concepts they learned in the course modules to interpret real water column data profiles.
I also wanted each student to have their own unique data set, both to add variety and to minimize academic integrity issues. I originally gave each student a GEOTRACES data profile, but switched to OOI data for Fall 2020 so that students could customize their own data sets.
Given the general education course designation, I decided to create a flexible Python notebook template for students to use to query the OOI database rather than teach them how to create their own programs.
- The notebook would allow students to request water column profile data and generate depth profiles for three parameters: temperature, salinity, and dissolved oxygen.
- Students would choose which OOI Array and Platform they wanted to investigate, as well as the date for which they requested data.
- The students would then write a short report to describe and explain the data trends in these graphs with respect to physical and chemical oceanography concepts such as circulation, stratification, and productivity.
The overall learning goals for the activity were to:
- Be able to read scientific metadata and parameter information
- Be able to use a given Python notebook to query the OOI database and plot the data with depth
- Be able to interpret the resulting data profiles in terms of geochemical and circulation patterns and processes
- Be able to use internet and scientific resources to do research
- Be able to convey scientific information in graphical and written formats
With these goals in mind, BPSoG PhD student Trenity Ford and I created a Python notebook using Google Colab.
- Students would enter the reference designator information for the Array, Platform, Node, and the CTD and dissolved oxygen instruments, along with their selected date.
- They would then get data plots for salinity, temperature, and dissolved oxygen with depth (as estimated from pressure data).
These plots could then be saved and included in the report. The notebook was designed in such a way that students only had to type in the correct reference designator codes and all other programming was hidden.
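The notebook's own pressure-to-depth conversion is not shown in this article. As a minimal sketch of how such an estimate can be made, the example below uses the UNESCO 1983 formula (Fofonoff and Millard), which takes pressure in decibars and latitude in degrees. The function name, the sample values, and the latitude are assumptions for illustration and are not taken from the notebook.

```python
import math


def depth_from_pressure(pressure_dbar, latitude_deg):
    """Estimate depth (m) from pressure (dbar) with the UNESCO 1983 formula."""
    x = math.sin(math.radians(latitude_deg)) ** 2
    p = pressure_dbar
    # Gravity (m/s^2) as a function of latitude, with a small pressure correction.
    gravity = 9.780318 * (1.0 + (5.2788e-3 + 2.36e-5 * x) * x) + 1.092e-6 * p
    # Fourth-order polynomial in pressure, evaluated in nested form.
    numerator = (((-1.82e-15 * p + 2.279e-10) * p - 2.2512e-5) * p + 9.72659) * p
    return numerator / gravity


# Hypothetical profiler samples: (pressure in dbar, temperature in degrees C).
samples = [(10.0, 18.2), (100.0, 12.5), (500.0, 6.1)]

# Pair each temperature with its estimated depth, assuming a site near 40 N.
profile = [(depth_from_pressure(p, 40.0), t) for p, t in samples]

for depth_m, temp_c in profile:
    print(f"{depth_m:7.1f} m  {temp_c:5.1f} C")
```

Since 1 dbar corresponds to roughly 1 m of seawater, plotting against pressure directly gives nearly the same picture for shallow profiles; the correction matters more at depth.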
Since students were asked to look at parameters versus depth, there were some limitations in the choice of Array and Platform based on the presence of a Wire-Following Profiler Node that allowed data collection over a large depth range.
This Python notebook template is available online in a GitHub repository.
Students were given detailed written and video instructions, along with a link to the notebook in Google Colab and a template for the final report. The instructions contained information on the OOI program and the available Arrays and Nodes, how to generate plots using the Python notebook, and what to include in the final report.
- Array/Node selection:
  - Students were directed to the OOI webpage to learn more about the locations of the different Arrays and Nodes and the profilers and instrumentation available at each location.
  - Students could pick any Array and Node provided there was a Wire-Following Profiler at that location; how they made their selections was entirely up to them.
  - Once a location was selected, the instructions indicated where to find the necessary reference designators for the Array, Node, and the CTD and dissolved oxygen instruments.
- Generating plots:
  - The Python notebook has prompts for text entries corresponding to the Array, Node, and Instrument reference designators, as well as a field for the date range.
  - Once the reference designators are entered, the program can be run. It generates plots of temperature, salinity, and dissolved oxygen with depth (selected via a dropdown menu), along with URLs so students can retrieve the data again without re-querying the OOI database.
- Final report:
  - The final report included the information on the Array/Node/date selected, the plots generated by the notebook, and the student’s interpretation of the plotted profile.
  - Students were asked to identify oceanographic features such as the mixed layer, thermocline/halocline, and oxygen minimum zone in their data and to annotate the plots.
  - Students were also expected to write a short paragraph describing the profile and explaining the trends they observed. Explanations might include seasonal air temperature, upwelling/downwelling, or different water masses.
- While I tried to make the Python notebook as simple as possible, students occasionally had issues getting the program to run. Most of the time they were using the wrong reference designators, but sometimes the OOI database was down, or did not contain data for that particular location on the selected date. Unfortunately, the notebook doesn’t generate descriptive error messages when these issues occur.
- The reference designators can be hard to find on the OOI webpage, so including screencaps or video with the instructions is highly recommended if the activity is not being done in an in-person class.
- In theory, this notebook is very flexible and can be used by instructors to generate plots for other OOI-based educational activities. For example, the date range can be expanded to include more data, though since the generated plots are with respect to depth, a true time series can’t be generated. Instructors could also request that students access specific Arrays/Nodes/dates to produce profiles that might correspond with areas or topics under discussion. With a little adaptation to the code behind the form, other instruments and parameters such as density or fluorometry data could be accessed. The simplicity of the text-based entry may also make it a nice gateway notebook to more complicated Python programming.
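One way to give students clearer feedback when a run fails, as described above, would be to check the form entries before any request is sent. The sketch below is not part of the notebook. It assumes the usual OOI reference designator layout (an 8-character site code, a 5-character node code, and a two-digit port followed by a 9-character instrument code) and that Wire-Following Profiler nodes start with `WFP`. The function name and the example values are hypothetical.

```python
import re
from datetime import date

# Assumed layout of the three reference designator parts (see lead-in).
SITE_RE = re.compile(r"^[A-Z0-9]{8}$")
NODE_RE = re.compile(r"^[A-Z0-9]{5}$")
INSTRUMENT_RE = re.compile(r"^\d{2}-[A-Z0-9]{9}$")


def check_inputs(site, node, instrument, start, end):
    """Return a list of readable problems; an empty list means the entries look plausible."""
    problems = []
    if not SITE_RE.match(site):
        problems.append(f"Site '{site}' should be 8 capital letters or digits.")
    if not NODE_RE.match(node):
        problems.append(f"Node '{node}' should be 5 capital letters or digits.")
    elif not node.startswith("WFP"):
        # Depth profiles in this activity need a Wire-Following Profiler.
        problems.append(f"Node '{node}' does not look like a Wire-Following Profiler.")
    if not INSTRUMENT_RE.match(instrument):
        problems.append(f"Instrument '{instrument}' should look like 'NN-XXXXXXXXX'.")
    try:
        start_day = date.fromisoformat(start)
        end_day = date.fromisoformat(end)
    except ValueError:
        problems.append("Dates must be written as YYYY-MM-DD.")
    else:
        if end_day < start_day:
            problems.append("The end date is before the start date.")
    return problems


# Illustrative entries with two mistakes: a one-digit port and reversed dates.
for message in check_inputs("CP02PMUO", "WFP01", "3-CTDPFK000",
                            "2020-09-01", "2020-08-01"):
    print(message)
```

A check like this only catches formatting mistakes. It cannot tell whether the OOI database is down or has no data for the chosen date, so those cases would still need a separate message when the request itself fails.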