Sunday, November 22, 2020

Python | Tutorial: Introduction 1 - Introduction to Python

Introduction:

While I want to keep the focus of the Python section of this blog specifically on using Python for meteorological and atmospheric science applications, and not on a broad tutorial on Python in general, a few basic Python syntax and coding strategies will greatly help in understanding the tutorials presented here.  This post is very much geared towards people who are new to Python, so if you are already comfortable using Python, and/or you aren't into reading, you can probably skip this one.

The Basics (a little background on Python):

  • Python is NOT a free-form language; that is, it has fairly strict indentation and formatting rules that have to be followed in order for your program to work.  If you're new to Python (or to other programming languages with these restrictions), this can be extremely frustrating at first, but you get used to it over time.  Also, almost every programming-specific text editor (e.g., Atom) automatically indents your code to meet Python's requirements.
  • Like GrADS, Python "commands" can be run interactively from the command line, or they can be written in Python "scripts" with the suffix ".py".
  • Python is an interpreted language, rather than a compiled one like Fortran.  For the purposes of the tutorials and scripts presented in this blog, the only practical consequence is that Python scripts will generally run slower than compiled Fortran programs, making Python less ideal for applications such as numerical modeling.
  • Most data in Python is stored in lists, tuples, and dictionaries.  Many Python modules restructure these data types into "arrays", which is extremely useful for data analysis, but it's important to know that arrays are a separate data type provided by those modules (most commonly NumPy) rather than lists/tuples/dictionaries in disguise: an array is typically built from a list or tuple, but it stores elements of a single type in one block of memory.  I don't want to get too far into the weeds with respect to the differences between these data structures in this post, and there are numerous tutorials available via a quick Google search.  However, I use lists and dictionaries all the time when using Python for atmospheric science applications, so it is definitely worth taking the time to learn a little bit more about these data structures.  Here are a couple of links to help get you started: Lists, Dictionaries.  Tuples are everywhere in Python for a number of different reasons, and while I encounter them relatively often, I rarely have to be consciously aware that I'm working with a tuple.
  • NumPy, SciPy, and Matplotlib:  Most data and array operations and plotting will be performed using these three modules.  Unlike Fortran, which stores arrays in column-major order, NumPy stores arrays in row-major order by default.  This may have no practical bearing on your usage; however, when dealing with large datasets, it may be advantageous to structure your data with row-major ordering in mind.
  • Most geographic mapping is performed with either Matplotlib Basemap or Cartopy.  While many folks like Basemap, it's slowly being phased out in favor of Cartopy, so if you are new to Python, I strongly encourage you to start using Cartopy instead of Basemap.
  • Finally, Python is an extremely versatile and powerful language that is relatively easy to learn.  It has countless data analysis applications, and numerous packages developed specifically with atmospheric science in mind.  Refer to (Link to getting started page) for more information on how to get started with Python. 
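
To make the three built-in data structures mentioned in the list above a little more concrete, here is a minimal sketch.  The station name, coordinates, and observation values are all made up for illustration.

```python
# Hypothetical surface observations; every name and value here is made up.
temps_c = [12.5, 14.0, 9.8]             # list: ordered and mutable
temps_c.append(11.2)                    # lists can grow in place

station = ("STN1", 40.0, -105.0)        # tuple: ordered but immutable
name, lat, lon = station                # tuples unpack neatly into variables

obs = {"temp_c": 12.5, "wind_kt": 15}   # dictionary: values looked up by key
obs["pressure_hpa"] = 1013.2            # add a new key/value pair

print(temps_c)
print(name, lat, lon)
print(obs["wind_kt"], sorted(obs))

try:
    station[0] = "STN2"                 # tuples cannot be modified...
except TypeError as err:
    print("TypeError:", err)            # ...so this raises a TypeError
```

In practice, a list is a good fit for a sequence you plan to loop over or extend, a dictionary for values you want to look up by name, and a tuple for a small fixed group of values.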

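The row-major point above is easier to see with a small array.  The sketch below assumes NumPy is installed; the 2 x 3 grid is an arbitrary example, and the Fortran-ordered copy is included only for comparison.

```python
import numpy as np

# A NumPy array is one block of memory holding a single data type,
# unlike a list, which can mix types freely.
grid = np.arange(6, dtype=float).reshape(2, 3)   # 2 rows x 3 columns

print(grid.dtype)                     # element type shared by the whole array
print(grid.flags['C_CONTIGUOUS'])     # True: NumPy defaults to row-major (C) order
print(grid.ravel())                   # memory order: all of row 0, then row 1

# The same values stored column-major (Fortran order), for comparison.
grid_f = np.asfortranarray(grid)
print(grid_f.flags['F_CONTIGUOUS'])   # True
print(grid_f.ravel(order='K'))        # memory order now runs down the columns
```

Both arrays hold the same values and index the same way (grid[0, 1] equals grid_f[0, 1]); only the layout in memory differs.  With the default row-major layout, the last index varies fastest in memory, which is the opposite of Fortran.
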
Common Syntax:
This is by no means an exhaustive list, just a quick reference guide to some of the basic syntax used in Python.
  • Arrays and lists use square brackets ([]) and tuples use standard parentheses.  Array elements are referenced with square brackets, using ":" as the indexing wildcard (a "slice").  For example, array[:] selects every element.
  • The number sign, or hashtag (#), is used to denote a comment.
  • For loops and if statements do not require an "enddo" or "endif" statement at the end of the loop, simply return to the outer indent (see example below).
    • Important:  Python accomplishes this by requiring "smart indent" coding practices ... which will drive you nuts if you're unfamiliar with it, but you'll learn to love it since it forces you to write more readable code.
  • print statements follow this syntax: print("hello world").  print "hello world" (without parentheses) will also work in versions 2.x, but not in Python 3.
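
As a quick illustration of the indexing and indentation rules listed above, here is a minimal sketch.  The pressure levels and the 850 hPa cutoff are arbitrary values chosen only for the example.

```python
levels = [1000, 925, 850, 700, 500, 300]   # hypothetical pressure levels (hPa)

print(levels[:])      # every element (a copy of the whole list)
print(levels[0])      # first element; Python indexing starts at 0
print(levels[-1])     # last element
print(levels[1:4])    # elements 1, 2, and 3 (the stop index is excluded)
print(levels[::2])    # every other element

# Indentation defines the loop and if blocks; no "enddo" or "endif" is needed.
for p in levels:
    if p >= 850:
        print(p, "hPa is at or below the 850 hPa level")
print("done")         # back at the outer indent, so this runs only once
```
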
Example Code:

Simple syntax example:

## This simple script requires no imported modules and demonstrates the for loop,
## some basic list structure, and conditional statements.
my_list=[1,2,3,4,5]  ## named my_list to avoid shadowing Python's built-in list type

for i in my_list:
    print(i)

my_list.append(6)  ## add a 6 to the end of the list

print('--------------')
for idx in range(len(my_list)):
    print(idx, my_list[idx])
print('--------------')
my_list.append('Hello') ## add a string to the end of the list.

for i in my_list:
    if isinstance(i, str):  ## isinstance already returns True or False
        print(i, "is a string")
print('--------------')
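
The index loop and the string check in the example above can also be written a little more compactly.  This is just an alternative sketch using the built-in enumerate() function and a list comprehension; the variable name values is arbitrary.

```python
values = [1, 2, 3, 4, 5, 6, 'Hello']

# enumerate() yields (index, element) pairs, so range(len(...)) is not needed.
for idx, item in enumerate(values):
    print(idx, item)

# A list comprehension collects just the strings in a single line.
strings = [item for item in values if isinstance(item, str)]
print(strings)
```
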
 
Arrays and Plotting:

import numpy as np
from matplotlib import pyplot as plt

x=np.arange(0,20,1)*2.*np.pi/10  ## np.pi is more precise than a hand-typed 3.141592
y=np.sin(x)

plt.figure(figsize=(10,5))
ax=plt.subplot(1,2,1)  ## 1 = one row, 2 = 2 columns, 1 = set 1st plot.
plt.plot(x,y) 
plt.plot(x,y,ls='none',marker='o')

## Now a 2D Plot with Wind Barbs ##

x1, y1 = np.meshgrid(x, x) 
u, v = 1.5*x1, 1.5*y1 
Z=np.sqrt(u**2.+v**2.)  ## wind speed: magnitude of the (u, v) vector

ax=plt.subplot(1,2,2) ## 1 = one row, 2 = 2 columns, 2 = set 2nd plot.
ax.barbs(x1,y1,u,v,Z,length=4.5,cmap='jet') 
plt.show()
The above code should produce this image:
Figure produced by code above
 
def main():

As stated above, Python is an interpreted language, and operations are performed from top to bottom.
While this can be advantageous under many circumstances, one disadvantage is that functions and objects must be defined before the point in the code where they are first called or used.  Some may find it awkward to define functions either on the fly throughout the code or all in one place at the top.  There are two ways around this: 1) you can define all of your functions in a separate Python (.py) script and import it into the script you want to call the functions from, or 2) you can define a "main" function and direct the program to call the main function at the very end of the script, after everything else has been defined.

The below code will result in an error (a NameError, because batman() is called before it has been defined):

import numpy as np
 
batman()

def batman():
    for j in range(8):
        print(np.log(-1))
    print(" BATMAN!")

The below code will run just fine:
import numpy as np

def main():
    batman()

def batman():
    for j in range(8):
        print(np.log(-1))
    print(" BATMAN!")
 
if __name__ == '__main__':
    main() 


In some blog posts, I may use the above syntax, especially if I have a lot of functions to define, and in others I will use the simple top-to-bottom format, defining variables as I go.  It's just a matter of user preference and readability.
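
To see why the second version works, it helps to know that Python only looks up a function's name at the moment the call actually runs.  The sketch below is a stripped-down illustration with made-up function names (report and helper), not code from the examples above.

```python
def report():
    # helper() is looked up when report() runs, not when report() is defined,
    # so it is fine that helper() appears further down the file.
    return helper() + "BATMAN!"

try:
    report()                   # helper() does not exist yet at this point
except NameError as err:
    first_attempt = "NameError: " + str(err)
    print(first_attempt)

def helper():
    return "nan " * 8

second_attempt = report()      # works now that helper() has been defined
print(second_attempt)
```

This is the same reason the main() pattern works: by the time main() is called at the bottom of the script, every function it needs has already been defined.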

Final Notes:


While this post definitely has a lot of background text that might not be relevant to you as a scientist aiming to write "research grade" code to do cool analysis, hopefully this short intro to Python will help you better understand how Python works at a more basic level and familiarize you with some of the common vocabulary you'll encounter with the language.
