Tuesday, September 17, 2013

Tutorial: Using the fwrite command to save a subset of data.

At some point, you may find it useful to be able to grab a subset of data from a larger data set, or from multiple data sets, and put it all in a custom GrADS data file with a matching control file.  Luckily GrADS makes this possible through the use of the 'fwrite' function.  I have decided to do a tutorial on this after some experimentation prompted by a few recent discussions.  This tutorial should help you through the basics of using the 'fwrite' function in GrADS to save a subset of data into a new data file, and then write out a .ctl file to open your new data file.

Before we get started with the step by step instruction, lets take a quick look at the 'set fwrite' command in GrADS.  This command gives you the options regarding the binary output you are putting in your file.  I have played with these different options a little bit and so far I have not seen any difference in the result, so I have only been working with the fname option.  Computer scientists may know better how to use the additional options.  A full list of options can be found here.

Now, we can get into the step by step directions.  As always we are first going to start by opening a file. 

    'sdfopen http://nomads.ncep.noaa.gov:9090/dods/gfs/gfs20130915/gfs_00z'

Now this next bit is optional, but basically I want to set my dimensional boundaries dynamically, so I want to read this information directly from the parent .ctl file.  You can of course set all of these values manually and vary them to customize your data subset as you see fit.

  'q ctlinfo'
  xdef=sublin(result,5)
  ydef=sublin(result,6)
  zdef=sublin(result,7)

  xi=subwrd(xdef,4)          ; *Initial x
  xpts=subwrd(xdef,2)       ;*Total x points
  dx=subwrd(xdef,5)          ;*delta x

  yi=subwrd(ydef,4)          ; *Initial y
  ypts=subwrd(ydef,2)       ;*Total y points
  dy=subwrd(ydef,5)          ;*delta y

  zpts=subwrd(zdef,2)       ; *total z points

  x=1
  count=8
  while(x=1)
    check=sublin(result,count)
    if(subwrd(check,1) = 'tdef')
      x=0
      tdef=sublin(result,count)
    endif
    count=count+1
  endwhile
 
  tpts=subwrd(tdef,2)          ;*Total Time
 dt=subwrd(tdef,5)             ; *Delta T

  'q dims'
  timeline=sublin(result,5) 
  ti=subwrd(timeline,6)
     ;*Initial Time


The block of code above, while it looks long is just setting up the dimension limits to use when calling 'fwrite' and writing the matching .ctl file. The loop is in place to account for the fact that different data sets will have different numbers of levels and that the line corresponding to time will vary between data sets.

Now that our dimensions are set, we are ready to write our data file.  In this example, I will show you how to write two variables at varying levels for a number of time steps.  The variables chosen will be Relative Humidity, and Precipitation Rate.  These can obviously be changed.  The first thing we do is use the 'set fwrite' command to specify our output file.

  'set fwrite mydatafile.dat'

Now, we can set up our dimensions and display our data. 

  'set gxout fwrite'
  'set x 1 'xpts
  'set y 1 'ypts


This sets up our dimensions to encompass the entire horizontal domain.  Now we simply loop through all desired time steps, and all desired levels, setting our levels dynamically as we go for simplicity in writing the .ctl file.

  t=1
  while (i <= 5)
    'set t ' i
    z=1
   levs=''


   while (z <= zpts)  ;*Run through all z points
      'set z 'z
      'q dims'
      level=sublin(result,4)

      lev=subwrd(level,6)
      levs=levs''lev' '  ;*Build a list of levels from the dimensions.  This will go in the .ctl file.
     'd rhprs'             ;*write out relative humidity
     z=z+1
  endwhile
 

* Var 2 (pratesfc) is only a surface variable, so it is only run through once.
 
  z=1
  while (z <= 1)
     'set z 'z
      'd pratesfc'      ;*write out precipitation rate
      z=z+1
   endwhile
   t=t+1

  endwhile 
 
  'disable fwrite'
 
Okay, so the above block of code is pretty much all you need to write out your GrADS data file.  Because you used the 'set gxout fwrite' command, when your display your variable, it won't actually show it on screen, just write it to the file specified by 'set fwrite'.  The trick here is knowing what order to display things.  The way that I have found it to work is by setting time as the parent loop, then variables, and then levels.  For example:

T=1, Time
     Var=1, Vars
         Level=1, Levels
            'd var'
        endloop
    endloop
endloop

In the example here, I simply ran through the variables one at a time inside of the time loop, but the above sequence is good generically.  I used precipitation rate to show that you can also include variables that don't have data at all levels with variables that do.

So, now that that .dat file is made, you need to write out a corresponding .ctl file so you can work with it in GrADS.  This step is pretty simple if you know how to work with .ctl files (if you don't I recommend checking out this page).  The code below shows you how to write out a .ctl file for the example here.

  write (myctlfile.ctl,'dset   ^mydatafile ')
  write (
myctlfile.ctl,'undef  1e+20 ')
  write (
myctlfile.ctl,'title My Test Control File')
  write (
myctlfile.ctl,'xdef 'xpts' linear     'xi'  'dx)
  write (
myctlfile.ctl,'ydef 'ypts' linear   'yi'   'dy)
  write (
myctlfile.ctl,'zdef 'zpts' levels  'levs   )
  write (
myctlfile.ctl,'tdef 5 linear  'ti' 'dt   )
  write (
myctlfile.ctl,'vars  2 ')
  write (
myctlfile.ctl,'rh 'zpts' 11,100  Relative Humidity       [%] ')
  write (
myctlfile.ctl,'prate 1 11,100  Precip Rate       [kg m-2 s-1] ')
  write (
myctlfile.ctl,'endvars')

Basically, you need to tell it to point to your data file, define your undefined values (this can be checked with your parent .ctl file), and then list your dimensions and variables.  A lot of the variables in this example are set dynamically, so you don't have to fudge around with things too much, but usually when you write these out you need to make sure that your dimensions match up to what you actually write out, or you will see some really goofy things.  Be sure not to forget the number of levels after each variable in the vars section.

That should about do it for this tutorial, hopefully this gets you started.  I personally like to use this to grab online data and place it into a local .ctl file for more intensive processing as things move faster locally.  Thanks for reading!

Download Example Script

1 comment:

  1. Thank you very much for this tutorial. It really helps! However, there are some errors in this example. 1. In the loop, the index "t" is confusing with the time dimension. It should be "i"; 2. In the .ctl, "write (myctlfile.ctl,'rh 'zpts' 11,100 Relative Humidity [%] ')
    write (myctlfile.ctl,'prate 1 11,100 Precip Rate [kg m-2 s-1] ')" is supposed to be "write (myctlfile.ctl,'rh 'zpts' t,z,x,y Relative Humidity [%] ')
    write (myctlfile.ctl,'prate 0 t,x,y Precip Rate [kg m-2 s-1] ')".

    After the modification, the code is good to run.

    ReplyDelete