This section does not apply to netCDF-4 files.
Suppose we have a gigabyte-class file, something like this:
>> nc_dump ( 'test.nc' )
netcdf test.nc {
dimensions:
        lon = 16000 ;
        lat = 8000 ;
variables:
        double data(lat,lon), shape = [8000 16000]
//global attributes:
        :history = "16-Oct-2006 07:53:14: blah"
}
>> ls -al test.nc
-rw-r--r-- 1 jevans jevans 1024000156 Oct 16 07:55 test.nc
So it's about a gigabyte. Let's try to add an attribute to it.
>> tic; nc_addhist ( 'test.nc', 'test_attribute_data' ); toc
Elapsed time is 113.619506 seconds.
That's a considerable length of time for something as simple as
adding an attribute. So what's going on here? When you write a
global attribute, you usually exceed the amount of header space
set aside inside the file for attributes, and consequently the
entire file must be rewritten. The netCDF library handles all of
this for you, but with gigabyte-sized files, that's a lot of effort
for just a few extra bytes. Fortunately, there is a way to reserve
a large amount of header space, and we can do so with
nc_padheader.m. Let's add about 20000 bytes to the header, which
should be enough for any additional global attributes.
>> nc_padheader ( 'test.nc', 20000 );
>> tic; nc_addhist ( 'test.nc', 'test_attribute_data' ); toc
Elapsed time is 0.455081 seconds.
The call to nc_padheader itself takes a while, but as you can see,
subsequent attribute writes go much faster because enough space
has been reserved for them.
Note that the local filesystem here was XFS, which is ideal for really large files. Writing an attribute to a netCDF file with insufficient header space on an EXT3 filesystem took almost twice as long.
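If you know up front that a file will accumulate attributes, a cheaper pattern is to pad the header right after creating the file, while it is still tiny, rather than after it has grown to a gigabyte. The sketch below assumes the SNCTOOLS routines nc_create_empty, nc_add_dimension, and nc_addvar are on your path, and that nc_addvar accepts a struct with Name, Nctype, and Dimension fields; the filename big.nc and the struct field names are illustrative, so check your SNCTOOLS release if any of them differ.

>> % Create an empty classic-format file and define its shape.
>> nc_create_empty ( 'big.nc' );
>> nc_add_dimension ( 'big.nc', 'lon', 16000 );
>> nc_add_dimension ( 'big.nc', 'lat', 8000 );
>> % Hypothetical variable definition; field names may vary by release.
>> v.Name = 'data';
>> v.Nctype = 'double';
>> v.Dimension = { 'lat', 'lon' };
>> nc_addvar ( 'big.nc', v );
>> % Pad the header now, while the file is still small.  This costs
>> % almost nothing at this point, and later attribute writes will not
>> % trigger a full rewrite of the (eventually gigabyte-sized) file.
>> nc_padheader ( 'big.nc', 20000 );

The key point is ordering: nc_padheader rewrites whatever data follows the header, so calling it before the variable data is written keeps that rewrite trivially cheap.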