
netCDF format





Jon Tischler writes:

> The netCDF standard can store all types of experimental data and has
> official methods for labeling each piece of data with important things that
> one needs to know (like the units).  Unfortunately, the pure netCDF format
> cannot group data into scans or entries.  In netCDF, every file is only one
> picture or one scan or one image.  Each scan would have to be stored in a
> separate file.  

This is NOT TRUE. A netCDF file can contain an unlimited number of variables,
each with its own data type and dimensions. As proof, I append a short IDL
program that creates a netCDF file containing two images (arrays): a 512x512
byte array and a 100x100 float array. The program writes the file, closes it,
then reopens it and reads both arrays back. The output of the IDL session
shows that both arrays were read back correctly.

; ************************************************************
; This program produces a netCDF file containing two images.
; Create two IDL variables to write to the file
a = bindgen(512, 512)  ; A byte image 512x512
b = findgen(100,100)   ; A float image, 100x100
; Open the netCDF file
file_id = ncdf_create('test.nc', /clobber)
; Define the dimension variables for the two arrays
byte_dim = ncdf_dimdef(file_id, 'Byte array size', 512)
float_dim = ncdf_dimdef(file_id, 'Float image size', 100)
; Define the array variables
byte_id = ncdf_vardef(file_id, 'Byte image', [byte_dim, byte_dim], /byte)
float_id = ncdf_vardef(file_id, 'Float image', [float_dim, float_dim], /float)
; Exit 'define mode', enter 'data mode'
ncdf_control, file_id, /endef
; Write the two variables to the file
ncdf_varput, file_id, byte_id, a
ncdf_varput, file_id, float_id, b
; Close the file
ncdf_close, file_id

; Now open the file and read it back in
new_id = ncdf_open('test.nc')
; Print out information about the file
file_info = ncdf_inquire(new_id)
print, 'Number of variables = ', file_info.nvars
print, 'Number of dimensions = ', file_info.ndims
print, 'Number of variables = ', file_info.nvars
var_info = ncdf_varinq(new_id, 0)
print, 'First variable: name=', var_info.name, ' type=', var_info.datatype
var_info = ncdf_varinq(new_id, 1)
print, 'Second variable: name=', var_info.name, ' type=', var_info.datatype
; Read in the first (byte) image and print out information about it
ncdf_varget, new_id, 'Byte image', var1
help, var1
print, 'Maximum value of var1=', max(var1)
; Read in the second (float) image and print out information about it
ncdf_varget, new_id, 'Float image', var2
help, var2
print, 'Maximum value of var2=', max(var2)
; Close the file
ncdf_close, new_id
end
;************************************************************

Here is the IDL output:

; IDL Version 3.5.1 (vms vax)
; Journal File for BNLX26::RIVERS
; Working directory: USER_DISK:[RIVERS.COMMITTEES.APS_COMPUTER]
; Date: Mon May 16 18:06:15 1994
 
.run netcdf_example
;Number of variables =            2
;Number of dimensions =            2
;Number of variables =            2
;First variable: name=Byte image type=BYTE
;Second variable: name=Float image type=FLOAT
;Maximum value of var1= 255
;Maximum value of var2=      9999.00

************************************************************


> HDF, on the other hand, has great ability to group and
> organize data through the use of Vgroups.  With the merging of the two
> standards, it becomes easy to organize data in the fashion that users
> expect and to include the information needed to plot and analyze the data
> at a later time.

It is true that HDF can group data more flexibly than netCDF. On the other
hand, this may create so much flexibility that very little is actually
standardized! At this point my recommendation would be to use netCDF for
everything it CAN be used for and to go to HDF when one outgrows netCDF. It is
nice that the NCSA HDF code can read/write both file formats.

> I have been designing an implementation in HDF 3.3r3 in order to see what
> requirements must be imposed to obtain useful data files.  Many of John
> Quintana's specifications listed above (2,3,&8) are automatically satisfied
> by the use of HDF.  I believe that with the netCDF features now available
> in HDF, the other specifications can also be met by using the Vgroup
> feature of HDF to organize multiple SDS's.  The remaining work is in
> deciding the best method for tagging the stored information.

I think SDS has some real limitations, even for very simple data, which I
will illustrate at the committee meeting next week.

> I will say here that I am trying to make heavy use of the attributes (a
> feature from netCDF) to identify parts of the data to automate plotting and
> analysis.  The presence of a 'units' attribute and the 'udunits.dat' file
> from netCDF is a wonderful thing for an experimentalist to see.

I will be building an IDL GUI file reader for netCDF data to allow
point-and-click specification of which variables to read and how.
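
For what it is worth, here is a minimal sketch of how a 'units' attribute
might be written and read back from IDL. The file name, the variable name
'mca_counts', and the attribute values are invented for illustration, and I
am assuming that ncdf_attput and ncdf_attget behave as described in the IDL
netCDF documentation. I have not run this against any particular data set.

; ************************************************************
; Sketch only: attach attributes to a variable and read one back.
; 'units_test.nc', 'mca_counts' and the attribute values are made up.
file_id = ncdf_create('units_test.nc', /clobber)
chan_dim = ncdf_dimdef(file_id, 'channel', 1024)
counts_id = ncdf_vardef(file_id, 'mca_counts', [chan_dim], /long)
; Attributes are written while the file is still in 'define mode'
ncdf_attput, file_id, counts_id, 'units', 'count'
ncdf_attput, file_id, counts_id, 'long_name', 'MCA spectrum'
; Exit 'define mode', write some stand-in data, and close the file
ncdf_control, file_id, /endef
ncdf_varput, file_id, counts_id, lindgen(1024)
ncdf_close, file_id

; Now open the file and read the 'units' attribute back
new_id = ncdf_open('units_test.nc')
ncdf_attget, new_id, 'mca_counts', 'units', units
; Character attributes normally come back as a byte array,
; so convert to a string before printing
print, 'units = ', string(units)
ncdf_close, new_id
end
;************************************************************

A reader program could use such attributes to label plot axes without the
user having to type anything.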

> With HDF 3.3r3 and a small number of additional standards, it should be
> easy to store multiple types of scans (EXAFS, MCA, crystallography,
> diffraction, status, CCD, ...) in one data file and to identify both the
> preferred way to plot and/or analyze that data.
> Writing such a standard data file should be easy to do.  And, reading and
> plotting (or analyzing) such a standard data file should also be
> straightforward.

Different groups have different philosophies on how much information should go
in a single data file. Does a 'stream of consciousness' for an entire shift go
in one file, or are there separate files for each scan?
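
If one does want many scans in a single file, one possible layout (a sketch,
not a proposed standard) is to give the data variable an unlimited 'scan'
dimension and write one record per scan. The names 'scans.nc', 'point',
'scan' and 'intensity' are hypothetical, and the sketch assumes every scan
has the same number of points. Scans of different lengths or types would
need separate variables, as in the two-image example above.

; ************************************************************
; Sketch only: several equal-length scans in one netCDF file.
npts = 200
nscans = 3
file_id = ncdf_create('scans.nc', /clobber)
pt_dim = ncdf_dimdef(file_id, 'point', npts)
; An unlimited dimension grows as records are written
scan_dim = ncdf_dimdef(file_id, 'scan', /unlimited)
; In IDL the unlimited dimension goes last in the dimension list
data_id = ncdf_vardef(file_id, 'intensity', [pt_dim, scan_dim], /float)
ncdf_control, file_id, /endef
; Write one record per scan
for i = 0, nscans-1 do begin
  scan = findgen(npts) * (i+1)   ; stand-in for measured data
  ncdf_varput, file_id, data_id, scan, offset=[0, i], count=[npts, 1]
endfor
ncdf_close, file_id

; Read back only the second scan (index 1)
new_id = ncdf_open('scans.nc')
ncdf_varget, new_id, 'intensity', scan2, offset=[0, 1], count=[npts, 1]
help, scan2
ncdf_close, new_id
end
;************************************************************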

I am glad to see the discussion has started!
    

                                   Mark