Useful python stuff
This page collects some useful, but unsorted, Python tips and tricks that do not fit in a section of the main JYP's recommended steps for learning python page
Reading/setting environment variables
>>> import os
>>> os.environ['TMPDIR']
'/data/jypmce/climafcache'
>>> os.environ.get('SCRATCHDIR', '/data/jypmce/some_scratch_stuff')
'/data/jypmce/some_scratch_stuff'
>>> os.environ['temporary_env_var_for_THIS_script'] = 'some value'
>>> os.environ['temporary_env_var_for_THIS_script']
'some value'
Generating (aka raising) an error
This will stop the script, unless the error is raised inside a function and the code calling that function explicitly catches and deals with the error
raise RuntimeError('\n\nOMG! An error! :-(\nAborting script...')
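The sketch below illustrates the case mentioned above, where the calling code catches the error and the script keeps running. The function names (load_settings, safe_load) and the file names are hypothetical and only serve the illustration.

```python
def load_settings(settings_path):
    """Hypothetical helper that raises an error when no path is supplied."""
    if not settings_path:
        raise RuntimeError('No settings file was specified')
    return {'path': settings_path}


def safe_load(settings_path):
    """Call load_settings and fall back to a default instead of stopping."""
    try:
        return load_settings(settings_path)
    except RuntimeError as err:
        # The error is caught here, so the script is NOT stopped
        print('Recovered from error:', err)
        return {'path': 'default.cfg'}


result = safe_load('')
print(result)
```

Without the try/except block in safe_load, the RuntimeError would propagate up to the top level and stop the script with a traceback.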
Stopping a script
A user can use CTRL-C or kill to stop a script, or CTRL-Z to suspend it temporarily (use fg to resume a suspended script). The code below can be used by the script itself to interrupt its execution, instead of raising an error
import sys
sys.exit('Some optional message about why we are stopping')
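Note that sys.exit works by raising the SystemExit exception. When it is given a string and nothing catches the exception, the string is printed on stderr and the exit status of the script is 1. The small sketch below (with a hypothetical stop_if_missing function) catches the exception only to show what sys.exit carries; a real script would usually let it propagate.

```python
import sys


def stop_if_missing(value):
    """Hypothetical check that stops the script when a value is missing."""
    if value is None:
        sys.exit('Missing value, stopping')
    return value


try:
    stop_if_missing(None)
except SystemExit as stop:
    # stop.code holds whatever was passed to sys.exit
    exit_info = stop.code
    print('sys.exit was called with:', exit_info)
```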
Checking if a file/directory is writable by the current user
>>> os.access('/', os.W_OK)
False
>>> os.access('/home/jypmce/.bashrc', os.W_OK)
True
Working with paths and filenames
If you are in a hurry, you can just use string functions to work with path and file names. But you will need some specific functions to check if a file exists, and for similar operations. All of these are available in 2 standard-library modules that offer similar functions, os.path (used in the examples below) and pathlib. Both can deal with Unix-type paths on Linux computers, and Windows-type paths on Windows computers
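As a rough comparison, assuming the 2 libraries meant here are os.path and pathlib (both in the standard library), the sketch below splits the same path with each of them. The directory and file names are made up for the illustration.

```python
import os
from pathlib import Path

# Hypothetical relative path, built in a portable way with os.path
file_path = os.path.join('some_dir', 'sub_dir', 'tas_test.nc')

# os.path: functions working on strings
dir_name = os.path.dirname(file_path)
base_name = os.path.basename(file_path)
root, ext = os.path.splitext(base_name)
exists_os = os.path.exists(file_path)

# pathlib: the same information, as attributes/methods of a Path object
p = Path('some_dir') / 'sub_dir' / 'tas_test.nc'
print(p.parent, p.name, p.stem, p.suffix, p.exists())
```

Both approaches give the same answers. Which one to use is mostly a matter of taste: os.path works directly on strings, while pathlib groups the operations on a single object.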
Example: getting the full path of the Python used
Note: the Python interpreter that is actually running may be different from the default python found in the PATH!
$ which python
/usr/bin/python
$ /modfs/modtools/miniconda3//envs/analyse_3.6_test/bin/python
>>> import sys, shutil
>>> shutil.which('python')
'/usr/bin/python'
>>> sys.executable
'/modfs/modtools/miniconda3//envs/analyse_3.6_test/bin/python'
Example: getting the full path of a script
>>> import os
>>> os.getcwd()
'/home/jypmce/PMIP4'
>>> os.path.exists('./argv_test.py')
True
>>> os.path.abspath('./argv_test.py')
'/home/jypmce/PMIP4/argv_test.py'
>>> os.path.exists('/home/jypmce/PMIP4/argv_test.py')
True
Example: getting the size(s) of all the files in a directory
$ cd /data/jypmce/TestDir
$ ls -l
total 72
-rw-r--r-- 1 jypmce ipsl 18147 Jun 25 2012 get_TS_cmip5.py
-rw-r--r-- 1 jypmce ipsl 16152 Jun 21 2012 get_TS_cmip5.py~
-rw-r--r-- 1 jypmce ipsl 13954 Jul 3 2012 get_TS_cmip5_regular.py
-rw-r--r-- 1 jypmce ipsl 16539 Jun 22 2012 get_TS_cmip5_regular.py~
>>> os.chdir('/data/jypmce/TestDir')
>>> print(os.getcwd())
/data/jypmce/TestDir
>>> files_list = os.listdir()
>>> files_list
['get_TS_cmip5.py~', 'get_TS_cmip5_regular.py', 'get_TS_cmip5_regular.py~', 'get_TS_cmip5.py']
>>> files_sizes = list(map(os.path.getsize, files_list))
>>> files_sizes
[16152, 13954, 16539, 18147]
>>> sum(files_sizes)
64792
Generating file names
Name depending on the current date/time
>>> import time
>>> plot_version = time.strftime('%Y%m%d_%H%M')
>>> f_name = 'test_%s.nc' % (plot_version,)
>>> f_name
'test_20210827_1334.nc'
Temporary file
>>> import tempfile, os
>>> f_tmp = tempfile.NamedTemporaryFile(mode='w', suffix='.nc', delete=False)
>>> f_tmp
<tempfile._TemporaryFileWrapper object at 0x2b5614743820>
>>> f_tmp.name
'/tmp/tmpi6uk9hre.nc'
>>> f_tmp.close()
>>> os.remove(f_tmp.name)
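If several temporary files are needed, a possible alternative (a sketch, not necessarily what the example above intends) is to create them in a temporary directory that is removed automatically, together with its content, at the end of a with block. The scratch.nc file name is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, created inside the temporary directory
    tmp_file = os.path.join(tmp_dir, 'scratch.nc')
    with open(tmp_file, 'w') as f_out:
        f_out.write('temporary content')
    existed_inside = os.path.exists(tmp_file)

# The directory and the file are gone once the 'with' block is finished
existed_after = os.path.exists(tmp_file)
print(existed_inside, existed_after)
```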
Using command-line arguments
The extremely easy but non-flexible way: sys.argv
The name of a script, the number of arguments (including the name of the script), and the arguments themselves (as strings) can be accessed through sys.argv, which is a list of strings
Simple argv_test.py test script:
#!/usr/bin/env python
import sys
nb_args = len(sys.argv)
print('Number of script arguments (including script name) =', nb_args)
for idx, val in enumerate(sys.argv):
    print(idx, val)
$ python argv_test.py
Number of script arguments (including script name) = 1
0 argv_test.py
$ python argv_test.py tas tas_tes.nc
Number of script arguments (including script name) = 3
0 argv_test.py
1 tas
2 tas_tes.nc
The C-style way: getopt
Use getopt (C-style parser for command line options)
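A minimal getopt sketch is given below. The option names (-h, -v/--var, -o/--output) and the parse_args function are hypothetical, and the arguments are passed as an explicit list instead of sys.argv[1:] so that the example can be run as is.

```python
import getopt


def parse_args(arg_list):
    """Parse a list of command-line arguments (hypothetical options)."""
    # 'hv:o:' means: -h takes no value, -v and -o each take one value
    opts, remaining = getopt.getopt(arg_list, 'hv:o:',
                                    ['help', 'var=', 'output='])
    settings = {'help': False, 'var': None, 'output': None}
    for opt, val in opts:
        if opt in ('-h', '--help'):
            settings['help'] = True
        elif opt in ('-v', '--var'):
            settings['var'] = val
        elif opt in ('-o', '--output'):
            settings['output'] = val
    return settings, remaining


# In a real script, you would use: parse_args(sys.argv[1:])
settings, remaining = parse_args(['-v', 'tas', '--output=tas_test.nc', 'extra_arg'])
print(settings, remaining)
```

An unknown option makes getopt.getopt raise getopt.GetoptError, which the script can catch in order to print a usage message.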
The deprecated Python way: optparse
optparse (parser for command line options) has been deprecated since Python version 3.2! You should now use argparse (check Upgrading optparse code for converting from optparse to argparse)
The current Python way: argparse
argparse (parser for command-line options, arguments and sub-commands) has been available since Python version 3.2
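A short argparse sketch follows. The argument names (variable, --output, --nb-years, --verbose) and their default values are hypothetical, and the arguments are passed as an explicit list so that the example can be run as is.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical script."""
    parser = argparse.ArgumentParser(description='Hypothetical example script')
    # Positional (mandatory) argument
    parser.add_argument('variable', help='name of the variable to process')
    # Optional arguments, with default values
    parser.add_argument('-o', '--output', default='out.nc',
                        help='output file name')
    parser.add_argument('-n', '--nb-years', type=int, default=10,
                        help='number of years to process')
    # Flag that is False unless it is specified
    parser.add_argument('--verbose', action='store_true',
                        help='print more details')
    return parser


# In a real script, you would just call parse_args() to use sys.argv
args = build_parser().parse_args(['tas', '-o', 'tas_test.nc', '--verbose'])
print(args.variable, args.output, args.nb_years, args.verbose)
```

argparse also generates the -h/--help message automatically from the help strings, and converts --nb-years to the nb_years attribute name.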
Using ordered dictionaries
An OrderedDict is guaranteed to keep its items in insertion order! Note that the usual Python dictionary also guarantees insertion order since version 3.7 (in CPython 3.6, this was only an implementation detail)
Check the OrderedDict class (from collections import OrderedDict) and the OrderedDict vs dict in Python: The Right Tool for the Job tutorial
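Since the usual dictionaries now keep the insertion order too, the sketch below focuses on a few things that remain specific to OrderedDict: order-sensitive comparison, move_to_end and popitem(last=...). The keys and values are arbitrary.

```python
from collections import OrderedDict

d1 = OrderedDict([('a', 1), ('b', 2)])
d2 = OrderedDict([('b', 2), ('a', 1)])

# Two OrderedDict are equal only if the items are in the same order...
ordered_equal = (d1 == d2)
# ... while the order does not matter when comparing usual dictionaries
plain_equal = (dict(d1) == dict(d2))

# Items can be moved to the end (or to the beginning, with last=False)
d1.move_to_end('a')
keys_after_move = list(d1)

# popitem can remove the last item (or the first one, with last=False)
last_item = d1.popitem(last=True)
print(ordered_equal, plain_equal, keys_after_move, last_item)
```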
Using sets
Python sets are groups of unique elements. They can be used to easily find all the unique elements of something and you can easily determine the intersection, union (and other similar operations) of sets.
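The operations mentioned above can be sketched as follows. The 2 lists of model names are only illustrative.

```python
# Illustrative lists of model names ('CESM2' appears twice in the first one)
models_a = ['IPSL-CM6A-LR', 'CESM2', 'AWI-ESM-1-1-LR', 'CESM2']
models_b = ['CESM2', 'MIROC-ES2L']

# Converting a list to a set removes the duplicates
unique_a = set(models_a)

common = unique_a & set(models_b)      # intersection
all_models = unique_a | set(models_b)  # union
only_a = unique_a - set(models_b)      # difference

# Sets are not ordered: sort them if you need a reproducible display
print(sorted(unique_a))
print(sorted(common), sorted(all_models), sorted(only_a))
```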
Printing a readable version of long lists or dictionaries
The pprint module can be used for pretty printing objects (lists, dictionaries, …). It will wrap long lines in a meaningful way
>>> import pprint
>>> test_dic = {'AWI-ESM-1-1-LR_AWI':{'r1i1p1f1': {'grid': 'gn'}}, 'CESM2_NCAR':{'r1i1p1f1': {'grid': 'gn'}}, 'IPSL-CM6A-LR_IPSL':{'r1i1p1f1': {'grid': 'gr'}, 'r1i1p1f2': {'grid': 'gr'}, 'r1i1p1f3': {'grid': 'gr'}, 'r1i1p1f4': {'grid': 'gr'}}}
>>> print(test_dic)
{'AWI-ESM-1-1-LR_AWI': {'r1i1p1f1': {'grid': 'gn'}}, 'CESM2_NCAR': {'r1i1p1f1': {'grid': 'gn'}}, 'IPSL-CM6A-LR_IPSL': {'r1i1p1f1': {'grid': 'gr'}, 'r1i1p1f2': {'grid': 'gr'}, 'r1i1p1f3': {'grid': 'gr'}, 'r1i1p1f4': {'grid': 'gr'}}}
>>> pprint.pprint(test_dic)
{'AWI-ESM-1-1-LR_AWI': {'r1i1p1f1': {'grid': 'gn'}},
 'CESM2_NCAR': {'r1i1p1f1': {'grid': 'gn'}},
 'IPSL-CM6A-LR_IPSL': {'r1i1p1f1': {'grid': 'gr'},
                       'r1i1p1f2': {'grid': 'gr'},
                       'r1i1p1f3': {'grid': 'gr'},
                       'r1i1p1f4': {'grid': 'gr'}}}
                       
>>> dir(test_dic)
['__class__', '__contains__', '__delattr__', [... lots of unreadable stuff removed...] 'setdefault', 'update', 'values']
>>> pprint.pprint(dir(test_dic))
['__class__',
 '__contains__',
[... lots of lines removed in this example ]
 'setdefault',
 'update',
 'values']
Sorting
- When dealing with numerical values, you should use the numpy sorting, searching, and counting routines!
- Example: sorting the keys and the values of a dictionary, and then using the key parameter to sort the keys of a dictionary according to the value associated with each key
- If we provide a key function, the sort function will sort the elements by the values returned by the function, instead of sorting by the initial values. The function used for generating the key below is very simple and we can use a lambda (i.e. in place) function
>>> demo_dic = {'a':10, 'b':5, 'c':-1, 'd':0}
>>> sorted(demo_dic.keys())
['a', 'b', 'c', 'd']
>>> sorted(demo_dic.values())
[-1, 0, 5, 10]
>>> sorted(demo_dic.keys(), key=lambda key_name:demo_dic[key_name])
['c', 'd', 'b', 'a']
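Two small variations on the dictionary example above, given as a sketch: sorting the (key, value) pairs by value, and getting the keys sorted by decreasing value with the reverse parameter.

```python
demo_dic = {'a': 10, 'b': 5, 'c': -1, 'd': 0}

# Sort the (key, value) pairs according to the value (second item of each pair)
items_by_value = sorted(demo_dic.items(), key=lambda item: item[1])

# demo_dic.get can be used directly as the key function,
# and reverse=True sorts in decreasing order
keys_by_value_desc = sorted(demo_dic, key=demo_dic.get, reverse=True)

print(items_by_value)
print(keys_by_value_desc)
```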
 
numpy related stuff
Finding and counting unique values
Use np.unique, do not try to use histogram related functions!
>>> import numpy as np
>>> vals = np.random.randint(2, 5, (10,)) * 0.5 # Get 10 discrete float values
>>> vals
array([1. , 2. , 1. , 2. , 2. , 1.5, 1. , 1.5, 2. , 1.5])
>>> np.unique(vals)
array([1. , 1.5, 2. ])
>>> unique_vals, nb_unique = np.unique(vals, return_counts=True)
>>> unique_vals
array([1. , 1.5, 2. ])
>>> nb_unique
array([3, 3, 4])
>>> sorted_vals = np.sort(vals) # Sorted copy, in order to check the result
>>> sorted_vals
array([1. , 1. , 1. , 1.5, 1.5, 1.5, 2. , 2. , 2. , 2. ])
Applying a ufunc over all the elements of an array
There are all sorts of ufuncs (Universal Functions). Below, we will just use add, one of the math operations, applied to the arrays defined in Finding and counting unique values
# Get the sum of all the elements of 'vals'
>>> np.add.reduce(vals)
15.5
>>> np.add.reduce(sorted_vals)
15.5
>>> vals.sum() # The usual and easy way to do it
15.5

# Compute the sum of the elements of 'nb_unique'
# AND keep (accumulate) the intermediate results
>>> nb_unique
array([3, 3, 4])
>>> np.add.accumulate(nb_unique)
array([ 3, 6, 10])

# The accumulated values can be used as indices to separate the different groups of sorted values!
>>> sorted_vals
array([1. , 1. , 1. , 1.5, 1.5, 1.5, 2. , 2. , 2. , 2. ])
>>> sorted_vals[0:3]
array([1., 1., 1.])
>>> sorted_vals[3:6]
array([1.5, 1.5, 1.5])
>>> sorted_vals[6:10]
array([2., 2., 2., 2.])

# Compute the sum of each equal-value group
>>> sorted_vals[0:3].sum(), sorted_vals[3:6].sum(), sorted_vals[6:10].sum()
(3.0, 4.5, 8.0)
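The manual slicing above can probably be automated. The sketch below is one possible way to do it, not necessarily the one intended here: it reuses the accumulated counts as split indices for np.split, and then gets the same group sums in one call with np.add.reduceat. The sorted values are typed in explicitly so that the example does not depend on random numbers.

```python
import numpy as np

# Same sorted values as in the example above
sorted_vals = np.array([1., 1., 1., 1.5, 1.5, 1.5, 2., 2., 2., 2.])
unique_vals, nb_unique = np.unique(sorted_vals, return_counts=True)

# The accumulated counts (without the last one) are the indices
# where each new group of equal values starts
split_idx = np.add.accumulate(nb_unique)[:-1]

# Split the sorted values into one array per group, and sum each group
groups = np.split(sorted_vals, split_idx)
group_sums = [float(group.sum()) for group in groups]

# Same sums in one call: reduceat applies 'add' between consecutive start indices
start_idx = np.concatenate(([0], split_idx))
reduceat_sums = np.add.reduceat(sorted_vals, start_idx)

print(group_sums)
print(reduceat_sums)
```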
