Two years of temperature and salinity data from 30 m at Irminger Sea Flanking Mooring B, showing the annual cycle.

Data portals are great for navigating and finding useful datasets. But sometimes, the easiest way to access data is with a bit of code, especially when you want to make your own graphs or do a bit of custom processing (like the above example).

For about 3 years, I’ve been using Python notebooks to grab data from the OOI, and every time I start a new notebook, I find myself copying the same 90 lines of code to request and download the data. But now I’ve finally gotten around to making my own Python library, so my new notebooks will be much shorter. It turns out it’s not that difficult to make a library, but for a year I was confused by eggs and wheels (it’s a long story) before Tom Connolly at Cal State showed me a much simpler approach.

And so now I’m pleased to present ooilab, my (very simple, and probably not that robust) library to request and download data from the OOI Data Portal (aka OOINet).

You can find more information about the library on its GitHub page. And you can check out the example below for a quick introduction on how to use it.

I should note that the OOI recently released the OOI Data Explorer, a wonderful new tool for previewing and accessing OOI data. (The demo webinar is a great intro.)

The new site includes merged and averaged datasets that are also available via ERDDAP, which makes accessing data far easier than before. However, several moorings are not yet available on the new portal, and under the hood the new system still uses the old data system to generate its data products. So if you want the highest-resolution data, or if you need to splice together telemetered and recovered streams using your own method, grabbing the data yourself from OOINet with a bit of code might still be the better option for the foreseeable future.

And with this new library, that will hopefully be a bit easier.

Accessing OOI data with ooilab

By Sage Lichtenwalner, November 17, 2020

Accessing data from the OOI Data Portal (aka OOINet) is relatively straightforward once you know the details for the instruments you need.

Even better, with a bit of code, you can request, download and load high-res data without having to navigate through the Data Portal or rely on its limited plotting capability. This is especially useful if you want to create a script to download the latest data, or grab data from multiple instruments. Neither is very practical to do by hand.

In the past, the examples I've written in Python have been very lengthy. But I've finally gotten around to creating a new Python library, ooilab, which you can use to grab OOI data with just a few lines of code in your notebook.

In this notebook, we'll quickly demonstrate how you can use this new library.

Let's Get Started

First, as we normally do, let's set up our notebook for the Google Colab environment.

In [1]:
# Notebook Setup
!pip install netcdf4
import xarray as xr
import matplotlib.pyplot as plt

# Setup default plot styles
import seaborn as sns
sns.set()

# Suppress open_mfdataset warnings
import warnings
warnings.filterwarnings('ignore')
Requirement already satisfied: netcdf4 in /usr/local/lib/python3.6/dist-packages (1.5.4)
Requirement already satisfied: numpy>=1.9 in /usr/local/lib/python3.6/dist-packages (from netcdf4) (1.18.5)
Requirement already satisfied: cftime in /usr/local/lib/python3.6/dist-packages (from netcdf4) (1.3.0)

Importing ooilab

Our next step is to install and import the new ooilab library. You can find more information about how to use the library on my ooilab GitHub page.

In [2]:
!pip install git+https://github.com/seagrinch/ooilab.git
import ooilab
Collecting git+https://github.com/seagrinch/ooilab.git
  Cloning https://github.com/seagrinch/ooilab.git to /tmp/pip-req-build-x4po9qrc
  Running command git clone -q https://github.com/seagrinch/ooilab.git /tmp/pip-req-build-x4po9qrc
Requirement already satisfied (use --upgrade to upgrade): ooilab==0.2 from git+https://github.com/seagrinch/ooilab.git in /usr/local/lib/python3.6/dist-packages
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from ooilab==0.2) (1.18.5)
Building wheels for collected packages: ooilab
  Building wheel for ooilab (setup.py) ... done
  Created wheel for ooilab: filename=ooilab-0.2-cp36-none-any.whl size=2577 sha256=6255bbbc1688df16fe0e7c65b04f42d274b088d887967487b6e4c5e920305070
  Stored in directory: /tmp/pip-ephem-wheel-cache-_x99eu94/wheels/7f/c9/1d/60f91d31a29c5e03e8bc9dd7a58918a976a3f2ffe1ffcb3ea1
Successfully built ooilab

If you plan on using the library to make data requests, you will first need to specify your API username and token, which you can find at the bottom of your "User Profile" page after logging in. (Note that these are different from the username and password you use to log into the Data Portal.)

In [3]:
ooilab.API_USERNAME = ''
ooilab.API_TOKEN = ''
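
If you share your notebooks, you may prefer not to paste the token directly into a cell. Here is a minimal sketch of one alternative, reading the two values from environment variables. The helper load_credentials and the variable names OOI_API_USERNAME and OOI_API_TOKEN are hypothetical names made up for this example; they are not part of ooilab.

```python
import os


def load_credentials(user_var="OOI_API_USERNAME", token_var="OOI_API_TOKEN"):
    """Return (username, token) read from environment variables.

    The variable names are assumptions for this sketch, not an ooilab convention.
    """
    username = os.environ.get(user_var, "")
    token = os.environ.get(token_var, "")
    if not username or not token:
        raise RuntimeError(f"Please set both {user_var} and {token_var}")
    return username, token


# Hypothetical usage in the notebook:
# ooilab.API_USERNAME, ooilab.API_TOKEN = load_credentials()
```

The function raises an error rather than returning empty strings, so a missing token shows up right away instead of as a failed data request later on.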

Requesting Data from OOINet

Now that we have the library installed, we can use it to request data, using the request_data() function (go figure).

Let's grab the last 2 years of data from the 30m CTD at Global Irminger Sea Flanking Mooring B.

In [34]:
# Data Request - This only needs to be run once
ooilab.request_data('GI03FLMB-RIM01-02-CTDMOG060',
                    'recovered_inst',
                    'ctdmo_ghqr_instrument_recovered',
                    '2018-01-01T00:00:00.000Z',
                    '2021-01-01T00:00:00.000Z')
Out[34]:
'https://opendap.oceanobservatories.org/thredds/catalog/ooi/[email protected]/20201117T152256970Z-GI03FLMB-RIM01-02-CTDMOG060-recovered_inst-ctdmo_ghqr_instrument_recovered/catalog.html'
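
The first argument above is an OOI reference designator, which packs the site, node and instrument codes into one string. As a rough sketch of how that string breaks apart (the helper name split_reference_designator is made up for this example, and I am not claiming this is how ooilab handles it internally):

```python
def split_reference_designator(refdes):
    """Split a reference designator into (site, node, instrument).

    Assumes the usual SITE-NODE-PORT-SENSOR layout, where the instrument
    part keeps its own internal hyphen (for example '02-CTDMOG060').
    """
    parts = refdes.split("-", 2)  # split on the first two hyphens only
    if len(parts) != 3:
        raise ValueError(f"Unexpected reference designator: {refdes!r}")
    site, node, instrument = parts
    return site, node, instrument


site, node, instrument = split_reference_designator("GI03FLMB-RIM01-02-CTDMOG060")
print(site, node, instrument)
```

Splitting on only the first two hyphens keeps the port number and sensor code together as one instrument string.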

Let's save that URL for future reference, so we don't have to rerun this function if we restart the notebook. I would also recommend commenting out the request above to prevent the code from accidentally running again.

In [4]:
url = 'https://opendap.oceanobservatories.org/thredds/catalog/ooi/[email protected]/20201117T152256970Z-GI03FLMB-RIM01-02-CTDMOG060-recovered_inst-ctdmo_ghqr_instrument_recovered/catalog.html'

Loading Data Files

Once the Data Portal has generated your dataset (you should get an email), you can load it. This is a two-step process.

  1. First, we run get_filelist(url) to filter out just the files we need from the THREDDS URL we got earlier.
  2. Then we can pass that list to xarray.open_mfdataset() like any other list of files.
In [5]:
# Grab the list of files
flist = ooilab.get_filelist(url)
flist
Out[5]:
['https://opendap.oceanobservatories.org/thredds/dodsC/ooi/[email protected]/20201117T152256970Z-GI03FLMB-RIM01-02-CTDMOG060-recovered_inst-ctdmo_ghqr_instrument_recovered/deployment0004_GI03FLMB-RIM01-02-CTDMOG060-recovered_inst-ctdmo_ghqr_instrument_recovered_20180101T000001-20180611T193001.nc#fillmismatch',
 'https://opendap.oceanobservatories.org/thredds/dodsC/ooi/[email protected]/20201117T152256970Z-GI03FLMB-RIM01-02-CTDMOG060-recovered_inst-ctdmo_ghqr_instrument_recovered/deployment0005_GI03FLMB-RIM01-02-CTDMOG060-recovered_inst-ctdmo_ghqr_instrument_recovered_20180611T144501-20190810T120001.nc#fillmismatch',
 'https://opendap.oceanobservatories.org/thredds/dodsC/ooi/[email protected]/20201117T152256970Z-GI03FLMB-RIM01-02-CTDMOG060-recovered_inst-ctdmo_ghqr_instrument_recovered/deployment0006_GI03FLMB-RIM01-02-CTDMOG060-recovered_inst-ctdmo_ghqr_instrument_recovered_20190808T133001-20200825T154501.nc#fillmismatch']
In [6]:
# Load the dataset
data = xr.open_mfdataset(flist).swap_dims({'obs': 'time'}).sortby('time')
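
If you are curious what the file-list step involves, here is a minimal sketch of the general idea: pull the .nc paths out of the catalog page and turn them into OPeNDAP URLs like the ones listed above. This is not ooilab's actual code. The function name filelist_from_catalog is made up, it takes the page's HTML as a string instead of downloading it, and it assumes the catalog links look like catalog.html?dataset=ooi/... .

```python
import re

DODS_BASE = "https://opendap.oceanobservatories.org/thredds/dodsC"


def filelist_from_catalog(catalog_html, base=DODS_BASE):
    """Return sorted OPeNDAP URLs for the .nc files linked in a catalog page.

    Assumes each link has the form href="catalog.html?dataset=ooi/...".
    """
    # Keep dataset paths that end in exactly ".nc" (this skips .ncml, .txt, etc.)
    paths = re.findall(r'dataset=(ooi/[^"\'&\s]+\.nc)["\']', catalog_html)
    # Remove duplicates, sort by name, and add the #fillmismatch suffix
    # that appears on the URLs in the output above
    return [f"{base}/{path}#fillmismatch" for path in sorted(set(paths))]
```

Because the deployment number comes first in each file name, sorting by name also puts the files in deployment order.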

Plotting Fun

Let's make a quick plot of the data.

In [7]:
# Quickplot
fig,(ax1,ax2) = plt.subplots(2,1, sharex=True, figsize=(8,6))
data['ctdmo_seawater_temperature'].plot(ax=ax1)
data['practical_salinity'].plot(ax=ax2)
ax1.set_xlabel(None);
ax1.set_title('GI03FLMB 30m CTD');

Cleaning Your Data

As we can see, there are some spikes in the salinity dataset. Luckily, ooilab comes with two quick functions you can use to clean your data.

  • reject_outliers(data, sd=5)
  • clean_data(data, min=0, max=100, sd=5)

Let's use the clean_data function, which also calls reject_outliers, to clean this up using the default settings.

In [8]:
data['practical_salinity'] = ooilab.clean_data(data['practical_salinity'])
In [9]:
# Quickplot
fig,(ax1,ax2) = plt.subplots(2,1, sharex=True, figsize=(8,6))
data['ctdmo_seawater_temperature'].plot(ax=ax1)
data['practical_salinity'].plot(ax=ax2)
ax1.set_xlabel(None);
ax1.set_title('GI03FLMB 30m CTD');
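
For a sense of what this kind of cleaning does, here is a rough pure-Python sketch based only on the function signatures listed above: first blank out values outside a valid range, then blank out anything more than sd standard deviations from the mean. The _sketch functions are my own illustration and may well differ from what ooilab does internally.

```python
import math
import statistics


def reject_outliers_sketch(values, sd=5):
    """Replace values more than `sd` standard deviations from the mean with NaN."""
    valid = [v for v in values if not math.isnan(v)]
    if len(valid) < 2:
        return list(values)
    mean = statistics.fmean(valid)
    std = statistics.pstdev(valid)
    return [
        v if (not math.isnan(v) and abs(v - mean) <= sd * std) else math.nan
        for v in values
    ]


def clean_data_sketch(values, vmin=0, vmax=100, sd=5):
    """Apply a valid-range check first, then the standard-deviation check."""
    in_range = [v if vmin <= v <= vmax else math.nan for v in values]
    return reject_outliers_sketch(in_range, sd=sd)
```

Running the range check first matters: a wildly wrong value would otherwise inflate the standard deviation and let smaller spikes slip through.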

It's About Averaging

Finally, let's run a quick daily average of these two variables.

In [10]:
avg_data = data[['ctdmo_seawater_temperature','practical_salinity']].resample(time='1D').mean()
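
If you haven't used resample before, the line above groups the measurements by calendar day and averages each group. The same idea can be sketched in plain Python as follows. The function daily_mean is a made-up name for this illustration, and unlike xarray it simply leaves out days that have no data.

```python
from collections import defaultdict
from datetime import datetime


def daily_mean(times, values):
    """Average values by calendar day, skipping NaNs; returns {date: mean}."""
    buckets = defaultdict(list)
    for t, v in zip(times, values):
        if v == v:  # NaN is the only value not equal to itself
            buckets[t.date()].append(v)
    return {day: sum(vs) / len(vs) for day, vs in sorted(buckets.items())}


times = [datetime(2018, 1, 1, 0), datetime(2018, 1, 1, 12), datetime(2018, 1, 2, 6)]
temps = [4.0, 6.0, 5.0]
print(daily_mean(times, temps))
```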

And now let's make a final plot.

In [13]:
# Quickplot
fig,(ax1,ax2) = plt.subplots(2,1, sharex=True, figsize=(12,8),dpi=150)
data['ctdmo_seawater_temperature'].plot(ax=ax1,label='Raw Data')
data['practical_salinity'].plot(ax=ax2)
avg_data['ctdmo_seawater_temperature'].plot(ax=ax1,label='Daily Average')
avg_data['practical_salinity'].plot(ax=ax2)

ax1.legend(loc=1)
ax1.set_xlim([data.time.min(),data.time.max()])
ax1.set_xlabel(None);
ax1.set_ylabel('Temperature (C)')
ax2.set_ylabel('Salinity')
ax1.set_title('GI03FLMB 30m CTD',fontweight='bold',fontsize=16);
plt.savefig('Irminger.png');