
Friday, June 7, 2013

Tutorial: Tips on handling more than one file at a time in GrADS

While it is possible to use one control file to work with multiple data or binary files by using a few options in the .ctl format (specifically for files with identical spatial dimensions but different time periods), I find it much easier to work with numerous .ctl files in GrADS.  This tutorial will show you how to open more than one file at a time and address the main hiccups encountered when handling multiple files.

To start out, we will need two files to work with.  For simplicity, we will use two files with the same spatial dimensions and variables.  Since we are dealing with more than one file in this tutorial, we will first use the 'reinit' command to close all of our current files so we don't get tripped up.  Then we'll open these two files.

    'reinit'
    'sdfopen http://nomads.ncep.noaa.gov:9090/dods/gfs_hd/gfs_hd20130604/gfs_hd_00z'
    'sdfopen http://nomads.ncep.noaa.gov:9090/dods/gfs_hd/gfs_hd20130604/gfs_hd_12z'

Once our two files are open, we will use the 'q files' command (not to be confused with 'q file') to list our open files.
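
    'q files'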
 
Output:

    File 1 : GFS half degree (0.5x0.5) fcst starting from 00Z04june2013, downloaded June 04 04:32 UTC
    Descriptor: http://nomads.ncep.noaa.gov:9090/dods/gfs_hd/gfs_hd20130604/gfs_hd_00z
    Binary: http://nomads.ncep.noaa.gov:9090/dods/gfs_hd/gfs_hd20130604/gfs_hd_00z

    File 2 : GFS half degree (0.5x0.5) fcst starting from 12Z04june2013, downloaded June 04 16:33 UTC
    Descriptor: http://nomads.ncep.noaa.gov:9090/dods/gfs_hd/gfs_hd20130604/gfs_hd_12z
    Binary: http://nomads.ncep.noaa.gov:9090/dods/gfs_hd/gfs_hd20130604/gfs_hd_12z

So basically, you get a little information about each open file, the most relevant piece being the file number.  Since GrADS operates on one default file at a time, you need to specify which file you are working with.  This is best done using the 'set dfile' command, which switches your default file.  So:

    'set dfile 1'  - Points to file 1, in this case 00z GFS
    'set dfile 2'  - Points to file 2, in this case 12z GFS


Once your dfile is set, you treat GrADS as you normally would with just one file.  We will continue this example by plotting CAPE from the GFS output, starting with file 1.

    'set dfile 1'
    'set gxout shaded'
    'set lat 22 50'
    'set lon 230 295'
    'set mpdset hires'
    'set t 5'
    'd capesfc'


Now, let's move on to file 2 and plot CAPE at the same valid time as a model comparison.  However, since the model initialization time has changed, when we work with file 2 we will point to the actual time, not the timestep as we did with file 1.  So our command will be 'set time', not 'set t'.  First, we need to find our current time, and we will do that with the 'q dims' command.
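
    'q dims'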

By doing that, we see that our time is equal to 12Z04JUN2013.  Now we simply switch files:

    'set dfile 2'
    'set time 12Z04JUN2013'
    'set gxout contour'
    'd capesfc'

CAPE at 12Z June 4th from the 00z GFS (shaded) and the 12z GFS (contoured)

Now we have our comparison map.  Since these files were spatially identical, we did not need to change the lat/lon dimensions.  However, I will mention that in general it is not a bad practice to reset all of your space and time dimensions every time you switch from one file to the next, just to ensure that you are plotting everything correctly.

Now, this brings me to a special case: if both of your files have the same spatial/time dimensions (for example, if you were comparing two runs of the WRF, or different GFS ensemble members), you can take a shortcut without having to go through the 'set dfile' commands.

Similar to the GrADS array syntax, you can simply put a '.filenumber' at the end of the variable you are plotting:

   'd var.1'  - Plots var from file 1
   'd var.2'  - Plots var from file 2


Again, only use this shortcut if your model dimensions are identical, e.g., grid spacing, timestep, etc.  The main problem you will run into when handling multiple files in GrADS is an "Entire Grid Undefined" error.  This simply means you didn't reset the dimensions to fit your current file.  If you get this error, run 'q dims', see where x, y, z, or t falls outside of your file's dimensions, and reset those dimensions.
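For example, a quick recovery sequence might look something like the sketch below; substitute whatever lat/lon/time values actually fall within your file's dimensions:

    'q dims'
    'set lat 22 50'
    'set lon 230 295'
    'set t 1'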

So that is all for the basics of file handling in GrADS.  Hope you found it useful!

Download Example Script 

Tuesday, May 7, 2013

Tutorial: Shapefile Basics

The capability of GrADS to read shapefiles brings a whole new dimension to the plotting software and a whole new range of plotting possibilities.  With shapefiles, you are no longer limited to generic maps with data plotted on top; you can plot everything from county boundaries, to roads and rivers, to congressional districts, weather watches and warnings, and much more.  This tutorial will give you a very basic introduction to using shapefiles in GrADS.

It is recommended (but not required) that you have both color.gs and cbar.gs in your script library so you can plot the RADAR color scale and associated colorbar.

For this tutorial, we will simply plot county boundaries on a map of the US.  Before we get started, we first need to download the proper shapefiles.  The county shapefiles are available from NOAA here.  That site provides a table of recent versions of the shapefile; download and unzip one.  The zipped folder contains four files; these are from the April 2013 version:
  • c_02ap13.prj
  • c_02ap13.shp
  • c_02ap13.shx
  • c_02ap13.dbf 

Most shapefiles I have encountered have some combination of these files, and because each shape tends to come with multiple files, I have found it useful to create a separate folder to store my shapefiles in, aptly named: Shapefiles.

Now that you have the county shapefiles in the proper folder, we can begin the tutorial.  The first, and possibly most important, thing to know about shapefiles is that you need to plot something from your GrADS data before you can draw a shapefile.  If you don't, you will get an error that says: "No Scaling Environment."

So, in order to plot our counties, we need to plot a variable first, which means we will need a file.  For this example we will open the RAP model using the 'sdfopen' command.

     'sdfopen http://nomads.ncep.noaa.gov:9090/dods/rap/rap20130507/rap_00z'


Now that our data file is open, the rest of the script is very short, and most of it is just setting up the domain and display options.  If you have read through any previous tutorials, the next block of code will look familiar.


    'clear'
    'set lat 22 51'
    'set lon -128 -65'
    'set gxout shaded'
    'set mpdset hires'
    'set map 0'     ;*Sets map outline color to Black


With the domain set up, we just need to plot the variable.  For this example, we will plot the model-predicted RADAR reflectivity.  I included in this block of code a good approximation of the NWS color scale for reflectivity using color.gs.

 *If you have color.gs*
     'color 5 75 5 -kind white->dodgerblue->blue->lime->green->darkgreen->yellow->goldenrod->orange->firebrick->red->darkred->fuchsia->indigo'
**********************
     'd refcclm'
     'cbar'

  
Now that the RADAR is plotted, it's time to draw the shapefile.  This is perhaps the easiest part of the tutorial.  You simply use the 'draw shp' command and point it to the file path.  In this example:

    'draw shp Shapefiles/c_02ap13.shp'

This command is all you need to draw the counties on the map.  If you wish to change the properties of the shapefile, you can do so using a few additional commands.  The outline color can be set using the 'set line' command.  The fill properties can be set using the 'set shpopts' command.  By default, shapefiles are not filled.

    'set line 0'        ;*Where the outline color is set to 0
    'set shpopts 15'    ;*Fills the shapefile, where 15 is the number of the desired fill color. A value of -1 turns off fill.

Depending on your exact output options, your final map should look similar to this.

RAP Reflectivity map with counties

Using shapefiles in GrADS can get much more complex.  However, for the purposes of this tutorial, I wanted to provide a very basic introduction to shapefiles in GrADS.  I may, at a later time, write up more advanced tutorials for shapefiles.  In the meantime, if you are interested in learning more about shapefiles in GrADS, I encourage you to check out this resource for a more comprehensive look.

Again, you can download shapefiles for pretty much anything you could imagine; here is just a small list of shapefile resources to get you started.
 
Download this Script

Thursday, May 2, 2013

Tutorial: Basics of Scripting

This tutorial is short and focuses on the very basics of GrADS scripting.  It is the goal of this tutorial to provide a little background on scripting and to help the budding GrADS user avoid various (and frustrating) syntax pitfalls often encountered while scripting with GrADS.

Why script?  Easy: because it's annoying to keep typing the same commands over and over in the console.  Scripting allows you to place all of your commands in one file and then execute a number of GrADS commands in sequence with a single command.  Additionally, scripting extends the capability of the GrADS software; for example, you can use tools such as mathematical functions, conditional statements, and loops within a script, none of which are available from the command line.  There are, however, a few important syntax issues that differ between scripting and the command console.

How to start a new GrADS script:  It is straightforward to start a new GrADS script.  All you need to do is open up a text file and start adding GrADS commands to it, in the order you wish to execute them.  When you want to run the script, save it with the suffix ".gs".  If you are using Windows, be sure to change the "save as" option from ".txt" to "all files" and use the ".gs" suffix.  To run the script, open GrADS and type the name of the script (be sure to include the path if the script is in a different directory than the one you are running GrADS from).
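For example, if you saved a script as plot_cape.gs (a hypothetical name) in your current working directory, running it would look something like this at the GrADS prompt:

    ga-> plot_cape.gs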

Scripting Basics: How to avoid syntax errors when writing your GrADS script:  The following list is designed to help you learn some of the basic syntax involved in using GrADS commands within a script.  Because scripting allows you to use functions and variables that fall outside of the basic GrADS command structure, you must be able to distinguish GrADS commands within your script.  Below is a list of several syntax quirks and methods used in the GrADS scripting format.
  • GrADS commands must be inside single quotes (')
  • Variables must be outside of the single quotes
  • NO indentation is allowed; avoid the TAB key at all costs
  • Lines can be commented out using the asterisk (*)
    • Note: When commenting, the * must be in the leftmost column of your text
  • Semi-colons (;) can be used in place of a new line
  • The results of queries are put into the default variable 'result'
  • Concatenating (adding) strings is easy, just type them in sequence
    • e.g., if str1='Hello' and str2='World', then str1' 'str2 outputs 'Hello World'
  • Mathematical operations involving control file variables occur inside single quotes (')
    • For example, converting temperature in C to F would look like 'temp_f=tmp_c*9/5+32'
  • Mathematical operations involving variables not within the control file occur outside of single quotes (')
  • Built in math functions on control file variables are inside single quotes
    • For example, 'cosZ=cos(Z)'
  • Built in math functions on variables not within the control file occur outside single quotes and usually require the prefix "math_"
    • For example, cosZ=math_cos(Z)

Example Scripts:
(Note: the file dates within the example scripts may need to be changed to be more recent for these scripts to work)

To run the following examples, copy and paste the text into a file and save it with the suffix ".gs".  Then simply type the name of the script in the GrADS console.
  • Query file dimensions and output date/time for the May 2, 2013 00z GFS
          'sdfopen http://nomads.ncep.noaa.gov:9090/dods/gfs_hd/gfs_hd20130502/gfs_hd_00z'
          'q dims'
          say result                    ;*The say command prints to the console
          timestring=sublin(result,5)   ;*5th line of the result
          date=subwrd(timestring,6)     ;*6th word of the time string
          say 'Date: 'date              ;*Note the string concatenation here

         The following output from this script will appear on the console:

          Default file number is: 1
          X is varying   Lon = 0 to 360   X = 1 to 721
          Y is varying   Lat = -90 to 90   Y = 1 to 361
          Z is fixed     Lev = 1000  Z = 1
          T is fixed     Time = 00Z02MAY2013  T = 1
          E is fixed     Ens = 1  E = 1

          Date: 00Z02MAY2013

 Download this script
  • Prompt user for a variable to plot and plot variable
        'sdfopen http://nomads.ncep.noaa.gov:9090/dods/gfs_hd/gfs_hd20130502/gfs_hd_00z' 

        say 'Please Enter a Variable to Plot: (1=CAPE, 2=PRECIPITABLE WATER, 3=CLOUD COVER)'

        check=1
        while(check=1)

         pull var       ;*Pull command prompts user to enter a variable
          if(var !=1 & var !=2 & var!=3);check=1; say 'Not an acceptable choice, please choose again';else
            if(var=1);plot='capesfc';endif
            if(var=2);plot='pwatclm';endif
            if(var=3);plot='tcdcclm';endif
            say 'Plotting Variable: 'plot
            check=0
          endif
        endwhile
        'set gxout shaded'
        'd 'plot

        'draw title 'plot
     
        'printim img_test.png'    ;*The printim command saves the image to a file


Download this script 
      
     The preceding script will prompt the user for a variable and output an image similar to the one here (depending on what variable you choose):

  
Output image from example 2


Hopefully these examples use enough of the basic scripting functions to give you an idea of how to use them.  You can see the use of the semicolon in place of new lines to save space, the use of the asterisk to denote comments, and the use of the say/pull commands.  Scripting can be tricky to get the hang of at first, but it's a skill that comes with practice.  Good luck!

This post was updated on March 13th 2014.

Friday, May 3, 2013

Tutorial: How to make a basemap using GrADS

Note: This tutorial was updated October 2019 to reflect the fact that the Grads Data Server has been discontinued, so the online access of the terrain data through monsoondata.org is no longer available.

This tutorial was first published in 2013, with a tutorial on how to create a basemap of the United States.  In updating the tutorial for 2019, I have decided to alter the tutorial to focus on a different region of the globe, Europe to be specific.  Functionally, the concepts are the same (see images below).

Introduction:
Basemaps are a great way to add some color and vibrance to your weather maps.  GrADS allows you to output data onto a pre-generated .png image (i.e., a basemap) instead of redrawing the map every time you want to make a figure.  This can be a big time saver if you're making several images in sequence.  This tutorial will show you how to make a basemap in GrADS using the 'basemap.gs' script and its accompanying polygon files to mask the ocean.  You will learn how to:
  1. Download and open netCDF topography data from NOAA 
  2. Use the "basemap" script from the GMU GrADS Script Library to mask out oceans. 
It is recommended that you check out the basemap overlay tutorial if you are unfamiliar with GrADS basemaps.
Finished Basemap: 2013 Tutorial
Finished Basemap: 2019 Tutorial

So, for this tutorial you must download the following files from the GMU GrADS Script Library.
  • basemap.gs
  • lpoly_lowres.asc
  • opoly_lowres.asc
  • lpoly_mres.asc
  • opoly_mres.asc
  • lpoly_hires.asc
  • opoly_hires.asc
In addition to these scripts, it is recommended that you have the color.gs and/or the colormaps script in your personal script folder so you can use the same color settings as this tutorial.

Where to get the data:
While the GDS has been discontinued, you can still access high-resolution (1 arc-minute) topography data in netCDF format from NOAA.  Simply access this link and download either the full ETOPO1 (netCDF) file, or use the NCEI "Grid Extract Tool" to subset the data to a geographic region of your choice.  If you're using the Grid Extract Tool, it will look similar to the screenshot below.

Screen shot of grid extract tool.

Once you have the netCDF (etopo1.nc) file locally, move it to your own data folder: (e.g., home/grads_data/terrain/).

How to make the basemap:
Basemaps are fairly easy to make; the process essentially boils down to three simple steps.
  1. Load the topography data and set a domain.
  2. Define a topography color scale (this is why we need color.gs)
  3. Fill in the water areas using the ".asc" files

Step 1: Load the topography data and set a domain:

This is probably the most important step in making a basemap.  In order to use the basemap later for your weather maps, you must know a) the lat/lon boundaries, b) the size of the image in pixels, and c) the GrADS page area of the map (set using the 'set parea' command).  I strongly recommend that you write down your lat/lon boundaries, your page area coordinates, and the x and y pixel sizes used when saving the image so that you know how to match data on top of the map later.

    'set display color white'
    'set mpdset hires'
    'clear'
    path = 'home/grads_data/terrain/'    ;*Or whatever your path to the etopo1.nc file is
    'sdfopen 'path'etopo1.nc'

You can run a "qfile" on the data to look at the metadata if you would like, but it's not necessary for this tutorial.

Now to set the actual domain.  To make the image above, we will set the domain over Europe.  We will do this using the 'set lat' and 'set lon' commands.

  'set lat 30 60'
  'set lon -16 35'

At this point, we have our file open and our domain set.  The last thing we need to do before moving on to step 2 is define our page area.  Remember, this is important so that you know how to scale your weather maps in GrADS to fit on this image.  We do this using the 'set parea' command.  For the above image, the page area is set by:

   'set parea 0.5 10.0 0.5 7.5'

Now that we have our area, we can move on to step two.

Step 2: Define a Topography Color Scale

This is very easy with the color.gs script; all you need to do is pick out a range of colors.  For the old US map, we used the color.gs script to make a color scale that transitioned from light green to brown, which fit the range of topography in the US (0 to ~4000 meters).

    int=(4100-200)/50       ;*int sets the interval used for topography, this scale sets 50 intervals.
    'color 200 4100 'int' -kind lightgreen->tan->brown'

For the 2019 update, I chose to use the colormaps script to define the "terrain2" color scale as follows:

  'colormaps_v2 -map terrain2 -levels 0 3500 100'

Choosing a color scale is entirely up to you, but the above options look nice to me.  Once the color scale is set, simply display the topography variable.
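For example, assuming the elevation variable in your etopo1.nc file is named z (the variable name varies depending on how the file was downloaded or subset, so check it first with 'q file'), the display step would simply be:

    'set gxout shaded'
    'd z'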


Now, our basemap is starting to take shape and we are set to move on to the final step.

Step 3: Fill in Water areas
This is the last step, and the only step that involves the use of basemap.gs.

To fill in the water areas, you will need to run the basemap.gs script as such:

    'basemap O 11 1 M'

In this command, the O stands for Ocean, so the script will fill in bodies of water.  The two numbers correspond to the fill and outline colors, respectively: 11 for medium blue and 1 for black.  Lastly, the M stands for medium resolution, so this will use the file opoly_mres.asc to fill in the ocean.  The reason to use mres instead of hires is that the hires ocean polygons only cover the oceans around North America; since this tutorial looks at Europe, you need to use mres.

Now that you have a nice looking basemap, all that's left is to save the image using the 'printim' command.  Be sure to specify the image dimensions so that you can match them later when you save GrADS output over your basemap.

    'printim img_name.png x800 y600 png'

And that's it, that's all there is to it!  You now have a nice, elegant basemap to serve as a background for your GrADS output.  To learn more about how to output GrADS data onto your basemap, check out this tutorial.

Note: This example script is for the 2013 tutorial with the broken GDS link.

Download Example script


Sunday, November 22, 2020

Python | Tutorial: Intro to Cartopy

Introduction:

There are currently two main Python libraries for plotting geographic data on a map: Cartopy and Basemap.  New users should use Cartopy, since Cartopy is replacing Basemap and support for Basemap is expected to wrap up in 2020.  Therefore, this tutorial will focus on Cartopy.  If you want to use Basemap (e.g., for plotting in "3D"), here are a couple of links to existing tutorials and examples to help get you started.  Furthermore, many concepts described here may be useful for understanding Basemap.

Cartopy:

Cartopy is a geospatial plotting library built on top of Numpy and Matplotlib that makes it easy to plot gridded data, shapefiles, and other geographic data on over 30 different map projections.  Furthermore, the Cartopy "transform" functionality makes it straightforward to convert data from one projection to another.  In this tutorial you will learn how to use Cartopy to:
  • Plot GFS surface temperature data on a map
  • Mask out land surfaces to plot a map of sea surface temperature and ice-cover from the GFS
  • Plot a regional map of surface temperature with US states
  • Use the transform function to plot GFS data on a different map projection

This tutorial accesses the NOMADS data server using the netCDF4 library; if you are unfamiliar with doing this, I recommend you see my tutorial on reading NetCDF data.


Getting Started - Grabbing GFS Data:
The following code gets you started with this tutorial: it imports all of the required modules, pulls the GFS data from the NOMADS data server, and loads it into arrays.  Once the data is loaded, the remainder of the tutorial will focus on Cartopy.

import numpy as np
from matplotlib import pyplot as plt
from cartopy import crs as ccrs
import cartopy.feature as cfeature
import netCDF4 as nc
import datetime as DT 
datetime=(DT.datetime.utcnow()-DT.timedelta(hours=6)).strftime('%Y%m%d')
fhour=20

nomads_path="https://nomads.ncep.noaa.gov:9090/dods/gfs_0p25/gfs{}/gfs_0p25_00z".format(datetime)

datafile=nc.Dataset(nomads_path)
lat=datafile.variables['lat'][:]
lon=datafile.variables['lon'][:] 
sfcT=datafile.variables['tmpsfc'] #Keep meta data for now!
sfcT_data=sfcT[fhour,:]

title=sfcT.long_name
time=datafile.variables['time']
tstart=DT.datetime.strptime(time.minimum,'%Hz%d%b%Y')
timestr=(tstart+DT.timedelta(hours=time.resolution*24.*fhour)).strftime('%c')

Now that your data is loaded, we initialize a figure and subplot using Matplotlib's "projection" keyword to link to the Cartopy library.  Once you've attached the Cartopy projection to your subplot, there are a number of additional objects and classes you can add to your figure.  In the first example, I'll demonstrate by calling upon the "coastlines" attribute to plot continents on the map.

fig=plt.figure(figsize=(11,5))
ax=plt.subplot(111,projection=ccrs.PlateCarree())

Now your figure and subplot are defined.  The Cartopy "PlateCarree()" projection is a basic cylindrical map projection that accepts coordinate information in the form of lat/lon coordinate pairs.  The lat/lon data can be either 1D or 2D, but essentially, if your coordinate information is lat/lon, your projection is PlateCarree().  You can pass two keywords into PlateCarree (central_longitude and globe), but in most instances you won't need to.  Occasionally, I reset the central_longitude to 180 instead of its default of zero.
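For instance, to center the map on the Pacific instead of the prime meridian, the only change is the keyword passed to the projection (a minimal sketch; the rest of this tutorial keeps the default of zero):

ax=plt.subplot(111,projection=ccrs.PlateCarree(central_longitude=180))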

The code to plot the map is:
Z=plt.pcolormesh(lon,lat,sfcT_data,cmap='jet')

pos=ax.get_position()

plt.title(title,fontweight='bold',loc='left')
plt.title("Valid:{} UTC".format(timestr),loc='right')
ax.coastlines()

cbar_ax=fig.add_axes([pos.x1+0.01,pos.y0,0.015,pos.height])
cbar=plt.colorbar(Z,cax=cbar_ax)
cbar.set_label(title)

datafile.close()

plt.show()

And you'll get an image similar to this one:
GFS surface temperature from Cartopy

Now, let's say you wanted to mask out land areas and add sea ice to the map to focus on the oceans.
Adding the sea ice is easy: all it requires is pulling the icecsfc variable from the GFS data and plotting it on top of the surface temperature map:

ice=datafile.variables['icecsfc'][fhour,:] #Pull the ice-cover field the same way we pulled surface temperature
plt.pcolormesh(lon,lat,np.ma.masked_less(ice,0.1),vmin=0,vmax=1,cmap='Greys_r',zorder=3)

We use the Numpy mask (ma) functionality to mask out grid cells where the ice cover is less than 10%.
To mask out land areas, we make use of the "Cartopy Feature" module, which handles shapefiles.  The Cartopy Feature module connects seamlessly to the Natural Earth dataset and allows anyone to plot a host of different GIS datasets without needing to download individual shapefiles beforehand.  The interface even has many pre-defined features, including land.  This makes masking out the land areas a breeze.

ax.add_feature(cfeature.LAND,zorder=4,color='gray')

In the above code, the "add_feature" function is used, and the pre-defined "LAND" feature from the Cartopy Feature package is supplied with a couple of basic keyword arguments.  Putting it all together:


fig=plt.figure(figsize=(11,5))
ax=plt.subplot(111,projection=ccrs.PlateCarree())
Z=plt.pcolormesh(lon,lat,sfcT_data-273.15,cmap='jet',vmin=-1,vmax=30.)
ice=datafile.variables['icecsfc'][fhour,:] #Ice-cover field (pulled as above)
plt.pcolormesh(lon,lat,np.ma.masked_less(ice,0.1),vmin=0,vmax=1,cmap='Greys_r',zorder=3)
ax.add_feature(cfeature.LAND,zorder=4,color='gray')
pos=ax.get_position()

plt.title(title,fontweight='bold',loc='left')
plt.title("Valid:{} UTC".format(timestr),loc='right')
ax.coastlines(color='k',zorder=5)

cbar_ax=fig.add_axes([pos.x1+0.01,pos.y0,0.015,pos.height])
cbar=plt.colorbar(Z,cax=cbar_ax)
cbar.set_label(title)

datafile.close()

plt.show()

Running the above code will give you an image like this:
GFS SST and Sea-ice cover


Now, let's zoom in over the United States and add states to the map.  To clip the extent, the "set_extent" function is used:

ax.set_extent([-125, -65, 25, 55],crs=ccrs.PlateCarree())

The list corresponds to lon/lat boundaries in the order [western-most longitude, eastern-most longitude, southern-most latitude, northern-most latitude], and the crs argument indicates which transform is being applied.  To add the states, you need to define a states feature from the Natural Earth database; the "scale" argument sets the resolution of the Natural Earth shapefile used:

states = cfeature.NaturalEarthFeature(category='cultural', scale='50m', facecolor='none',
                                      name='admin_1_states_provinces_shp', edgecolor='k')
ax.add_feature(states)  #draw the state/province boundaries on the map

By changing the extent and adding the states, you'll get a map that looks like this:

GFS SST centered over the United States


You'll notice that the "LAND" feature masks out the Great Lakes, which is unfortunate, and to my knowledge the only way to get both the land mask and the SST over lakes using GFS surface temperature data, is to use a Lakes shapefile with shapefile clipping, a technique that is beyond this simple introduction, but may be the subject of a future tutorial.

Finally, if you want to use a different map projection to plot the data, e.g., a North Polar Stereographic projection, you simply define a different projection when setting up your subplot and use the "transform" keyword in the plotting function to specify that you are transforming your lon/lat coordinates from the PlateCarree() projection:

ax=plt.subplot(111,projection=ccrs.NorthPolarStereo())
Z=plt.pcolormesh(lon,lat,sfcT_data-273.15,cmap='jet',vmin=-1,vmax=30.,transform=ccrs.PlateCarree())
plt.pcolormesh(lon,lat,np.ma.masked_less(ice,0.1),vmin=0,vmax=1,cmap='Greys_r',zorder=3,transform=ccrs.PlateCarree())
ax.set_extent([-180, 180, 45, 90],crs=ccrs.PlateCarree()) 

Note that we also reset the extent to focus on the northern latitudes.  You should get something that looks like this:
GFS Sea-ice and SST Polar Plot

That is the extent of this introductory tutorial on how to use Cartopy to work with atmospheric data.  You can learn more from the Cartopy website, and get a list of the different map projections on the Cartopy Map Projections page.  Future tutorials on Cartopy will focus on working with shapefiles, transforming individual points, working with data without lat/lon coordinates, and putting fine details on your map.

Python | Tutorial: Introduction to Numpy

While this post has (almost) nothing to do with atmospheric science, it is an important introduction to the Numpy package, which is how you will be storing and working with data in Python.  This tutorial is heavily geared towards beginners and those who want to strengthen their background understanding of Numpy.  If you're already comfortable using Numpy to work with data, there probably isn't a whole lot new in this tutorial for you to learn, but who knows.

As stated in the introduction to Python post, Python stores vector data in lists, tuples, and dictionaries.  Numpy is an all-encompassing package that transforms these types into numerical arrays and allows you to carry out efficient, vectorized, higher-order numerical functions.  To demonstrate with a simple example, suppose you wanted to add two vectors stored as Python lists:

list1=[5,5,5]
list2=[5,5,5]

list3=list1+list2

You end up with the result:

list3=[5, 5, 5, 5, 5, 5]

This is not addition, but rather list concatenation.  To actually add the numbers together and get a vector of length three with "10" for each value, you need to loop through the lists.  To demonstrate how to do this with lists, here is a simple function that uses the zip function to access both lists at once:

def list_add(list1,list2):
    list3=[]
    for i,j in zip(list1,list2):
        list3.append(i+j)
    return list3

In the above function, a third list (creatively named "list3") is initialized with no values, and a "for loop" is used to loop through the input lists and add the list values together, storing them in list3.


list3=list_add(list1,list2)

result: [10, 10, 10]

which is the result we were looking for.  Great, we can now add vectors together using lists.  Addition is a pretty simple operation, though, and you can probably imagine that the "list_add" approach gets really complex as you move to multidimensional arrays.  Enter Numpy.  Numpy allows you to define arrays, which behave much more intuitively for data analysis.  In fact, I don't think it's too much of a stretch to say that almost 90-95% of the data structures you will ever work with in Python will involve Numpy arrays.  Building off of the above example, we repeat the exercise, only replacing "list1" and "list2" with Numpy arrays, call them "array1" and "array2":


import numpy as np

array1=np.array([5,5,5])
array2=np.array([5,5,5])

print(array1+array2)

result: [10, 10, 10]

Much simpler.  In contrast to adding the two lists together, which required defining a function and using a for loop, simply adding the two arrays gives the correct result.  Additionally, since Numpy is optimized, Numpy array operations are far more efficient than loops.

Numpy array intrinsic attributes and functions

What's great about Numpy is that it takes full advantage of Python's object-oriented framework when storing data.  Specifically, when you define a Numpy array, it isn't only a block of numbers; it's an entire data object with attributes and functions attached to it that make it more intuitive to work with.  To briefly demonstrate a couple of the more useful attributes, we define a two-dimensional array using the np.ones function (which creates an array of ones with the specified dimensions):

ones=np.ones((10,5)) #Note passing the array dimensions as a list instead of a tuple works too

This defines an array of 10 rows and 5 columns; we can confirm that using the shape attribute:

print(ones.shape)
 
result: (10,5)

You can also find out what kind of data the array holds using the dtype attribute (e.g., whether it is an integer or floating-point array).

print(ones.dtype)
 
result: float64

You can also easily transpose the array:

ones_transpose=ones.T
print(ones.shape,ones_transpose.shape)
 
result:(10,5),(5,10)

Or define a new array with the same size and attributes:

twos=np.ones_like(ones)*2

This is just a small sampling of the features and functions intrinsic to any defined Numpy array that I happen to find most useful.  There are dozens more that I won't describe here.

Numpy: My Top 5 Numpy functions for Atmospheric Science Research.

Perhaps this heading should be at the very top of this blog post, since the previous sections mostly dealt with providing a cursory explanation of what makes Numpy distinct and powerful.  These are the 5 functions I use the most, not including the numpy.ma (masking) package, which is extremely useful but needs a dedicated blog post to describe adequately.
  1. Numpy linspace and arange

    Two functions for the price of one.  Basically, these functions allow you to quickly create vectors that span a range.  The difference between the two functions is best summarized as follows: 
    • arange takes up to three arguments: start, end, increment
    • linspace also takes up to three arguments: start, end, vector length 

    lines=np.linspace(0,10,11)
    arange=np.arange(0,10,2)
    
    print(lines)
    print(arange)
    
    result (linspace): [ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
    result (arange): [0 2 4 6 8]
    

    A few things to unpack from this example: note that the arange vector stops at 8, despite the second argument being 10.  This is because the end point is non-inclusive: arange increments by 2 and stops before it hits 10.  linspace, on the other hand, just builds a range of 11 evenly spaced values between 0 and 10.
  2. Numpy Meshgrid

    I use meshgrid a lot when I need a 2D spatial array for mapping purposes, but only have two 1D vectors that define my coordinate system.  I don't know of a much better way to describe it beyond that, so I'll just throw up an example.

    lon=np.linspace(0,360,100)
    lat=np.linspace(-90,90,50)
    
    lon,lat=np.meshgrid(lon,lat)
    
    print(lon.shape)
    
    result: (50,100)
    

    In this example, the 1D vectors are broadcast into 2D arrays that take on the shape of the original 1D vectors in each dimension.  Now you have the 2D coordinate information that is often required for certain functions (e.g., 2D interpolation, or regridding).  Be warned: when using this function, if you redefine lat/lon as 2D arrays (as in this example) and you for some reason run meshgrid again, for instance as part of a loop or in a notebook setting, it's very easy to run out of memory. 

  3. Numpy Concatenate

    Concatenate is a great way to build an array without knowing its dimensions up front.  For example, let's say you have a collection of netCDF files with gridded data, all with matching spatial coordinates, but each file has a different number of time steps and different time coordinates.  If you want to put all of this information into one array, you can't simply define the array size up front; you need to add to the time dimension as you loop through each file.  Concatenate allows you to simply take an array and append it to another array along a specified dimension.  To simplify the example here: 
    array1=np.ones((5,50,40))
    array2=np.ones((3,50,40))
    array3=np.ones((7,50,40))
    
    array_total=np.concatenate((array1,array2),axis=0)
    array_total=np.concatenate((array_total,array3),axis=0)
    
    print(np.shape(array_total))
    
    result: (15,50,40)
    

    See that the end result has 15 in the axis-0 dimension (5+3+7).  Note that ALL axes except the specified axis must match, or the function will crash.  While concatenate (and its related functions hstack, vstack, and dstack) are convenient, beware: it is still much better to define your entire array up front and load data into it if you know the array size.  Concatenate is relatively inefficient because it requires the entire array to be reallocated in memory every time you run it, which can get really slow for large arrays.
  4. Numpy Mean and Nanmean

    These are pretty self-explanatory.  These functions simply take the average of an array, allowing you to specify an axis as well.  The "axis" argument is really handy if you want to do a time average while preserving the spatial dimensions, for instance.

    arr1=np.linspace(0,20,5)
    print(np.mean(arr1))
    
    result: 10.0
    
    ## set the 3rd index to a nan
    arr1[3]=np.nan
    print(np.nanmean(arr1))
    
    result: 8.75  

    As you noticed, once you change the 3rd index value from 15 to a nan, the function takes the mean of 0, 5, 10, 20 (which is 8.75) instead of 0, 5, 10, 15, 20 (which is 10).  However, if you run plain "mean" on an array with nans in it, the returned value will also be a nan.
  5. Numpy diff and gradient


    These functions are pretty similar, in that they allow you to take the gradient of an array.  diff is the simpler of the two functions listed here, and just returns the difference between adjacent elements along a specified dimension.  This will reduce your array length along that dimension by 1.  gradient is similar, but comes with a few extra bells and whistles:
    • Gradient preserves the array length by using numerical approximations along the boundaries
    • Allows you to specify a sample distance for each axis (i.e., the dx for dy/dx)
    • If no "axis" argument is supplied will compute the gradient in ALL dimensions and return a tuple of length n-dimensions with each index corresponding to the gradient along that index.
    Most often, I use "gradient" especially if I need to multiply the gradient against something because I can be somewhat lazy, and I don't want to have to deal with different sized arrays/vectors. The example I will provide will be show how to use these functions to approximate the derivative of "sin(x)" which is analytically (of course) "cos(x)".  Note that this example uses matplotlib.

    import numpy as np
    from matplotlib import pyplot as plt
    
    x=np.linspace(0,2*np.pi,40)
    array=np.sin(x)
    dx=x[1]-x[0]
    
    xstag=(x[:-1]+x[1:])/2.
    
    plt.figure()
    plt.plot(x,np.gradient(array,dx),label='d/dx(sin(x)) (gradient)',lw=4.4)
    plt.plot(x,array,label='Sin(x)')
    plt.plot(xstag,np.diff(array)/dx,label="d/dx(sin(x)) (diff)",ls='--',color='r')
    plt.legend()
    plt.show()
    

    This will generate the following figure: 
    sin(x) and its approximated derivative (cos(x))





    I played with the line colors and thicknesses to make it easier to differentiate the "diff" approximation from the "gradient" approximation.  Note that to correctly plot the "diff" output I had to create a new version of "x" called "xstag", which shifts the x-center points by 0.5dx and shortens the array length (n) by one, simply by averaging the first n-1 array elements with the last n-1 array elements.  Also note that I obtained dx by simply differencing the first two array elements.


Final Notes:

While this post had little to do with atmospheric science, I hope it provided a useful introduction to the Numpy object class.  For more information on Numpy and its capabilities please visit the official site.