4.9 Parallel I/O with NetCDF

Parallel I/O allows many processes to read/write netCDF data at the same time. Used properly, it allows users to overcome I/O bottlenecks in high performance computing environments.

4.9.1 Parallel I/O Choices for NetCDF Users

Parallel read-only access to netCDF files can be achieved using the netCDF C/Fortran library. Each process can run its own copy of the netCDF library and open and read any subset of the data in the file. This sort of “fseek parallelism” breaks down dramatically for any kind of writing.
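As a rough illustration of how each process might choose its own subset, the sketch below splits a dimension of length n into contiguous blocks, one per process. The function name block_range and the even-split policy are assumptions made for this example and are not part of the netCDF API. The resulting start and count values are the kind of arguments a process could pass to a netCDF read call such as nc_get_vara_int.

```c
#include <stddef.h>

/* Hypothetical helper (not part of netCDF): compute the contiguous block
 * of a dimension of length n owned by process `rank` out of `nprocs`.
 * The first (n % nprocs) ranks receive one extra element, so the blocks
 * cover the dimension exactly once with no overlap.
 * Returns 0 on success and -1 if the arguments are invalid. */
int block_range(size_t n, int nprocs, int rank, size_t *start, size_t *count)
{
    size_t base, extra, r;

    if (nprocs <= 0 || rank < 0 || rank >= nprocs || !start || !count)
        return -1;

    base  = n / (size_t)nprocs;   /* elements every rank receives */
    extra = n % (size_t)nprocs;   /* leftover elements to spread out */
    r     = (size_t)rank;

    *count = base + (r < extra ? 1 : 0);
    *start = r * base + (r < extra ? r : extra);
    return 0;
}
```

With n = 10 and four processes, ranks 0 and 1 would each read three elements and ranks 2 and 3 two elements each. Other decompositions (for example, by chunk boundaries) may suit a particular file better.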

There are two methods available to users for read/write parallel I/O: netCDF-4, or the parallel netCDF package from Argonne/Northwestern. Unfortunately, the two methods involve different APIs and different binary formats.

For parallel read/write access to classic and 64-bit offset data, users must use the parallel-netcdf library from Argonne/Northwestern University. This is not a Unidata software package, but it was developed using the Unidata netCDF C library as a starting point. For more information, see the parallel netcdf web site: http://www.mcs.anl.gov/parallel-netcdf.

For parallel read/write access to netCDF-4/HDF5 files, users must use the netCDF-4 API. The Argonne/Northwestern parallel netcdf package cannot read netCDF-4/HDF5 files.

4.9.2 Parallel I/O with NetCDF-4

NetCDF-4 provides access to HDF5 parallel I/O features for netCDF-4/HDF5 files. Files in netCDF classic and 64-bit offset format cannot be opened or created for use with parallel I/O. (They can still be opened and created, but parallel I/O is not available for them.)

A few functions have been added to the netCDF C API to handle parallel I/O. These functions are also available in the Fortran 90 and Fortran 77 APIs.

4.9.2.1 Building NetCDF-4 for Parallel I/O

You must build netCDF-4 properly to take advantage of parallel features.

For parallel I/O HDF5 must be built with --enable-parallel. Typically the CC environment variable is set to mpicc. You must build HDF5 and netCDF-4 with the same compiler and compiler options.

The netCDF configure script will detect the parallel capability of HDF5 and build the netCDF-4 parallel I/O features automatically. No options to the netCDF configure script are required. If the Fortran APIs are desired, set the environment variable FC to mpif90 (or some local variant).

4.9.2.2 Opening/Creating Files for Parallel I/O

The nc_create_par and nc_open_par functions are used to create or open a netCDF file for parallel I/O with the C API (or use nf_create_par/nf_open_par from Fortran 77).

For Fortran 90 users, the nf90_open and nf90_create calls have been modified to permit parallel I/O files to be opened/created using the optional parameters comm and info.

The parallel access associated with these functions is not a characteristic of the data file, but of the way it was opened.

4.9.2.3 Collective/Independent Access

Parallel file access is either collective (all processes must participate) or independent (any process may access the data without waiting for others).

All netCDF metadata writing operations are collective. This covers all creation of groups, types, variables, dimensions, and attributes.

Data reads and writes (e.g., calls to nc_put_vara_int and nc_get_vara_int) may be independent (the default) or collective. To make access to a variable collective, call the nc_var_par_access function (or nf_var_par_access for Fortran 77 users, or nf90_var_par_access for Fortran 90 users).

The example program below demonstrates simple parallel writing and reading of a netCDF file.