Chapter 11.  Writing Attributes To Really Big Files

This section does not apply to netCDF-4 files.

Suppose that we have a gigabyte-class file, something like this:

>> nc_dump ( 'test.nc' )
netcdf test.nc {

dimensions:
    lon = 16000 ;
    lat = 8000 ;

variables:
    double data(lat,lon), shape = [8000 16000]

//global attributes:
    :history = "16-Oct-2006 07:53:14:  blah"
}
>> ls -al test.nc
-rw-r--r-- 1 jevans jevans 1024000156 Oct 16 07:55 test.nc
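
If you'd like to build such a file to follow along, here is a rough sketch using SNCTOOLS' creation routines. Treat it as an illustration: the variable-definition field shown (v.Nctype in older SNCTOOLS, v.Datatype in newer releases) should be checked against your version.

% Build a gigabyte-class classic-format test file with SNCTOOLS.
nc_create_empty ( 'test.nc' );
nc_add_dimension ( 'test.nc', 'lon', 16000 );
nc_add_dimension ( 'test.nc', 'lat', 8000 );

v.Name      = 'data';
v.Nctype    = 'double';            % newer SNCTOOLS accepts v.Datatype
v.Dimension = { 'lat', 'lon' };
nc_addvar ( 'test.nc', v );

% Write in 1000-row strips so the full 8000x16000 double array (about
% 1 GB) never sits in memory at once.  SNCTOOLS start indices are
% zero-based.
for row = 0:1000:7000
    nc_varput ( 'test.nc', 'data', zeros(1000,16000), [row 0], [1000 16000] );
end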
    

So it's about a gigabyte: the data alone accounts for 16000 x 8000 x 8 = 1,024,000,000 bytes, and the remaining 156 bytes are the header. Let's try to add a global attribute to it.

>> tic; nc_addhist ( 'test.nc', 'test_attribute_data' ); toc
Elapsed time is 113.619506 seconds.
    

That's a considerable length of time for something as simple as adding an attribute. So what's going on here? In the netCDF classic format, the header sits at the front of the file, and only a small amount of free space is set aside for it. When you write a global attribute, you usually exceed that space, and the netCDF library must then shift all of the data toward the end of the file to enlarge the header, which effectively rewrites the entire file. The library handles all of this for you, but when we have gigabyte-sized files, that's a lot of effort for just a few extra bytes. Fortunately, there is a way of reserving a larger amount of header space up front, and we can do that with nc_padheader.m.
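
Under the hood, this kind of padding corresponds to the netCDF C library's nc__enddef call, which leaves define mode while asking for h_minfree bytes of free space at the end of the header. The sketch below shows how a routine like nc_padheader could be written on top of mexnc; the '_enddef' operation name, the nc_write_mode helper, and the alignment arguments are assumptions based on typical mexnc usage, not a transcription of the actual source.

% Illustrative only; real code should check each returned status.
[ncid, status] = mexnc ( 'open', 'test.nc', nc_write_mode );
status = mexnc ( 'redef', ncid );                    % re-enter define mode
status = mexnc ( '_enddef', ncid, 20000, 4, 0, 4 );  % h_minfree = 20000 bytes
status = mexnc ( 'close', ncid );

Let's add about 20000 bytes to the header; that should be enough for any additional global attributes.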

>> nc_padheader ( 'test.nc', 20000 );   
>> tic; nc_addhist ( 'test.nc', 'test_attribute_data' ); toc
Elapsed time is 0.455081 seconds.
    

The call to nc_padheader itself takes a while, since it forces the same wholesale rewrite one last time, but as you can see, subsequent attribute writes go much faster because enough header space has been reserved for them.
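
Better still, if you are creating the file yourself, pad the header right after defining the dimensions and variables and before writing any data; at that point there is no data to shift, so the call costs essentially nothing. In the hypothetical creation sketch above, that means one extra line right after nc_addvar:

nc_addvar ( 'test.nc', v );
nc_padheader ( 'test.nc', 20000 );   % essentially free: no data to shift yet
% ... now write the data with nc_varput ...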

Note that the local filesystem here was XFS, which is ideal for really large files. Writing an attribute to a netCDF file with insufficient header space on an EXT3 filesystem took almost twice as long.