other:python:misc_by_jyp

Differences

This shows you the differences between two versions of the page.

other:python:misc_by_jyp [2022/12/12 12:11]
jypeter Added the "Storing objects and data in a file" section
other:python:misc_by_jyp [2023/04/26 15:50]
jypeter Started a Data representation section
Line 5: Line 5:
 </​WRAP>​ </​WRAP>​
  
-==== Reading/setting environment variables ====
  
 +===== Reading/setting environment variables =====
  
 <​code>>>>​ os.environ['​TMPDIR'​] <​code>>>>​ os.environ['​TMPDIR'​]
Line 17: Line 17:
 </​code>​ </​code>​
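A minimal sketch of reading a variable with a fallback value, then setting and removing one. The variable name ''MY_HYPOTHETICAL_VAR'' and the ''/tmp'' default are assumptions made for this illustration only.

```python
import os

# .get() avoids a KeyError when the variable is not defined
tmp_dir = os.environ.get('TMPDIR', '/tmp')

# Setting a variable affects this process and the child processes it starts
os.environ['MY_HYPOTHETICAL_VAR'] = 'some value'
value = os.environ['MY_HYPOTHETICAL_VAR']

# Removing the variable again
del os.environ['MY_HYPOTHETICAL_VAR']
is_defined = 'MY_HYPOTHETICAL_VAR' in os.environ
```

Note that the shell that started the script does not see these changes.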
  
-==== Generating (aka raising) an error ====+ 
 +===== Generating (aka raising) an error =====
  
 This will stop the script, unless it is called in a function, and the code calling the function explicitly catches and deals with errors This will stop the script, unless it is called in a function, and the code calling the function explicitly catches and deals with errors
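A minimal sketch of the behaviour described above: a function raises an error, and the calling code chooses to catch it instead of letting the script stop. The function name ''read_positive'' and the messages are hypothetical.

```python
def read_positive(value):
    """Return value as a float, or raise an error if it is not positive."""
    number = float(value)  # float() itself raises ValueError on bad input
    if number <= 0:
        raise ValueError('Expected a positive number, got %r' % (value,))
    return number


# The caller decides whether the error stops the script or is handled
try:
    result = read_positive('-3')
except ValueError as err:
    result = None
    message = str(err)
    print('Could not use the value:', message)
```

Without the ''try''/''except'' block, the ''ValueError'' would propagate up and stop the script with a traceback.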
Line 25: Line 26:
  
  
-==== Stopping a script ====+===== Stopping a script ​=====
  
 A user can use ''​CTRL-C''​ or ''​kill''​ to stop a script, or ''​CTRL-Z''​ to suspend it temporarily (use ''​fg''​ to resume a suspended script). The code below can be used by the script itself to interrupt its execution, instead of raising an error A user can use ''​CTRL-C''​ or ''​kill''​ to stop a script, or ''​CTRL-Z''​ to suspend it temporarily (use ''​fg''​ to resume a suspended script). The code below can be used by the script itself to interrupt its execution, instead of raising an error
Line 31: Line 32:
 <​code>​sys.exit('​Some optional message about why we are stopping'​)</​code>​ <​code>​sys.exit('​Some optional message about why we are stopping'​)</​code>​
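A short sketch showing that ''sys.exit'' works by raising ''SystemExit'', which is why calling code can still intercept it. The helper name ''stop_if_missing'' and the message are made up for this example.

```python
import sys


def stop_if_missing(path_list):
    """Stop the script with a message if no path was supplied."""
    if not path_list:
        # A string argument is printed to stderr and the exit status is 1
        sys.exit('No input file specified, stopping')
    return path_list[0]


# sys.exit() raises SystemExit, so it can be intercepted if really needed
try:
    stop_if_missing([])
except SystemExit as stop:
    exit_info = stop.code
```

Calling ''sys.exit()'' with no argument (or with ''0'') reports a successful exit to the shell.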
  
 +===== Data representation =====
  
-==== Checking if a file/directory is writable by the current user ====+A few notes for a future section or page about data representation (bits and bytes) on disk and in memory, vs data format
 + 
 +  * Binary data representation 
 +    * [[https://​en.wikipedia.org/​wiki/​Bit_numbering|Bit numbering]] 
 +    * [[https://​en.wikipedia.org/​wiki/​Endianness|Endianness]] 
 +    * [[https://​en.wikipedia.org/​wiki/​Integer_(computer_science)|Integers]] 
 +      * Using [[https://​en.wikipedia.org/​wiki/​Two%27s_complement|two'​s complement]] for negative integers 
 +      * Range: 
 +        * 4-byte integers: −2,​147,​483,​648 to 2,​147,​483,​647 
 +        * 8-byte integers: −9,​223,​372,​036,​854,​775,​808 to 9,​223,​372,​036,​854,​775,​807 
 +    * [[https://​en.wikipedia.org/​wiki/​IEEE_754|Floating point numbers]] (//IEEE 754// standard) 
 +      * Range: 
 +        * 4-byte float: ~7 significant decimal digits, magnitudes from ~10^-38 to ~10^38
 +          * See also [[https://en.wikipedia.org/wiki/Single-precision_floating-point_format|Single-precision floating-point format]]
 +        * 8-byte float: ~15 significant decimal digits, magnitudes from ~10^-308 to ~10^308
 + 
 +  * Array addressing 
 + 
 +  * disk and ram usage: how to check the usage (available ram and disk), best practice on multi-user systems (how much allowed?) 
 +    * ''​du'',​ ''​df'',​ ''​cat /​proc/​meminfo'',​ ''​top''​ 
 + 
 +  * understanding and reverse-engineering //binary// format 
 +    * ''​od'',​ ''​strings''​ 
 + 
 +  * binary vs text format: ascii, utf, raw 
 +    * text related functions in python: ''​str'',​ ''​int'',​ ''​float'',​ ''​ord'',​ ... 
 +      * lists conversion with ''​map''​ and ''​join''​ 
 + 
 +  * Misc : ''​md5sum''​ 
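Some of the notes above can be checked directly from Python. This is only a sketch using the standard library (''struct'' and ''sys''); the helper name ''int_range'' is hypothetical, and the float values assume the usual IEEE 754 platforms.

```python
import struct
import sys


def int_range(nb_bytes):
    """Range of signed integers stored on nb_bytes with two's complement."""
    nb_bits = 8 * nb_bytes
    return -2**(nb_bits - 1), 2**(nb_bits - 1) - 1


range_4 = int_range(4)
range_8 = int_range(8)

# Endianness: the same 4-byte integer, stored with both byte orders
little = struct.pack('<i', 1)
big = struct.pack('>i', 1)

# Two's complement: -1 is stored with all the bits set
minus_one = struct.pack('<i', -1)

# Python floats are 8-byte IEEE 754 values on the usual platforms
float_digits = sys.float_info.dig
float_max = sys.float_info.max

# Storing a value as a 4-byte float loses some precision
as_float32 = struct.unpack('<f', struct.pack('<f', 0.1))[0]

# Text vs numbers: ord, map and join
codes = list(map(ord, 'abc'))
joined = ','.join(map(str, codes))
```

The same byte order characters (''<'' and ''>'') are what you need when decoding a //binary// file written on another type of machine.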
 +===== Checking if a file/​directory is writable by the current user =====
  
 <​code>>>>​ os.access('/',​ os.W_OK) <​code>>>>​ os.access('/',​ os.W_OK)
Line 39: Line 70:
 True</​code>​ True</​code>​
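The check can be wrapped in a small helper. This is a sketch only: the function name ''is_writable'' and the non-existent path are assumptions made for the example.

```python
import os
import tempfile


def is_writable(path):
    """Return True if the current user may write to an existing path."""
    return os.path.exists(path) and os.access(path, os.W_OK)


# A freshly created temporary directory should be writable by its owner
with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_is_writable = is_writable(tmp_dir)

missing_is_writable = is_writable('/no/such/directory/hopefully')
```

Permissions can change between the check and the actual write, so it is still safer to wrap the write itself in a ''try''/''except OSError'' block.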
  
-==== Playing with strings ==== 
  
-=== Filenames, etc... ​===+===== Playing with strings =====
  
-Check [[other:​python:​misc_by_jyp#​working_with_paths_and_filenames|Working with paths and filenames]] and [[other:​python:​misc_by_jyp#​generating_file_names|Generating file names]] 
  
-=== Splitting strings ===+==== Splitting ​(complex) ​strings ​====
  
 It's easy to split a string with multiple blank delimiters, or a specific delimiter, but it can be harder to deal with sub-strings It's easy to split a string with multiple blank delimiters, or a specific delimiter, but it can be harder to deal with sub-strings
Line 64: Line 93:
 >>>​ shlex.split(complex_string) >>>​ shlex.split(complex_string)
 ['​-o',​ '​1',​ '​--long',​ 'A string with accented chars: \xc3\xa9 \xc3\xa8 \xc3\xa0 \xc3\xa7'​]</​code>​ ['​-o',​ '​1',​ '​--long',​ 'A string with accented chars: \xc3\xa9 \xc3\xa8 \xc3\xa0 \xc3\xa7'​]</​code>​
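For comparison, a Python 3 sketch of the same idea. The value of ''complex_string'' is reconstructed from the output shown above and is therefore an assumption; with Python 3, the accented characters are kept as regular characters rather than escaped bytes.

```python
import shlex

complex_string = '-o 1 --long "A string with accented chars: é è à ç"'

# str.split() cuts on every blank, and breaks the quoted sub-string
naive = complex_string.split()

# shlex.split() honors shell-like quoting
smart = shlex.split(complex_string)
```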
 +
 +
 ==== Working with paths and filenames ==== ==== Working with paths and filenames ====
  
Line 149: Line 180:
 >>>​ f_tmp.close() >>>​ f_tmp.close()
 >>>​ os.remove(f_tmp.name)</​code>​ >>>​ os.remove(f_tmp.name)</​code>​
-==== Using command-line arguments ==== 
  
-=== The extremely easy but non-flexible way: sys.argv ===+ 
 +===== Using command-line arguments ===== 
 + 
 +==== The extremely easy but non-flexible way: sys.argv ​====
  
 The name of the script and its arguments are available, as strings, in the ''sys.argv'' list; ''len(sys.argv)'' gives the number of arguments, including the name of the script The name of the script and its arguments are available, as strings, in the ''sys.argv'' list; ''len(sys.argv)'' gives the number of arguments, including the name of the script
Line 173: Line 206:
 2 tas_tes.nc</​code>​ 2 tas_tes.nc</​code>​
  
-=== The C-style way: getopt ===+ 
 +==== The C-style way: getopt ​====
  
 Use [[https://​docs.python.org/​3/​library/​getopt.html|getopt]] (//C-style parser for command line options//) Use [[https://​docs.python.org/​3/​library/​getopt.html|getopt]] (//C-style parser for command line options//)
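A minimal ''getopt'' sketch. The option names (''-v'', ''-o'', ''--output'') and the function name ''parse_args'' are hypothetical, and the argument list is hard-coded so that the example can run anywhere; a real script would pass ''sys.argv[1:]''.

```python
import getopt


def parse_args(argv):
    """Parse -v and -o FILE / --output=FILE, return (options, remaining)."""
    # 'vo:' means: -v takes no value, -o requires one
    opts, remaining = getopt.getopt(argv, 'vo:', ['output='])
    options = {'verbose': False, 'output': None}
    for name, value in opts:
        if name == '-v':
            options['verbose'] = True
        elif name in ('-o', '--output'):
            options['output'] = value
    return options, remaining


options, files = parse_args(['-v', '--output=out.nc', 'tas_tes.nc'])
```

An unknown option raises ''getopt.GetoptError'', which the script can catch to print a usage message.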
  
-=== The deprecated Python way: optparse ===+ 
 +==== The deprecated Python way: optparse ​====
  
 [[https://​docs.python.org/​3/​library/​optparse.html|optparse]] (//parser for command line options//) is **deprecated since Python version 3.2**! You should now use argparse (check [[https://​docs.python.org/​3/​library/​argparse.html#​upgrading-optparse-code|Upgrading optparse code]] for converting from ''​optparse''​ to ''​argparse''​) [[https://​docs.python.org/​3/​library/​optparse.html|optparse]] (//parser for command line options//) is **deprecated since Python version 3.2**! You should now use argparse (check [[https://​docs.python.org/​3/​library/​argparse.html#​upgrading-optparse-code|Upgrading optparse code]] for converting from ''​optparse''​ to ''​argparse''​)
  
-=== The current Python way: argparse ===+ 
 +==== The current Python way: argparse ​====
  
 [[https://​docs.python.org/​3/​library/​argparse.html|argparse]] (//parser for command-line options, arguments and sub-commands//​) is available since Python version 3.2 [[https://​docs.python.org/​3/​library/​argparse.html|argparse]] (//parser for command-line options, arguments and sub-commands//​) is available since Python version 3.2
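A minimal ''argparse'' sketch, with one positional argument and two options. All the argument names are hypothetical, and the argument list is hard-coded here so that the example does not depend on the real command line.

```python
import argparse

parser = argparse.ArgumentParser(description='Hypothetical example script')
parser.add_argument('infile', help='input file name')
parser.add_argument('-n', '--nb-repeat', type=int, default=1,
                    help='number of repetitions (default: %(default)s)')
parser.add_argument('-v', '--verbose', action='store_true',
                    help='print more information')

# In a real script, call parser.parse_args() to use sys.argv[1:]
args = parser.parse_args(['-n', '2', 'tas_tes.nc'])
```

The values are then available as ''args.infile'', ''args.nb_repeat'' and ''args.verbose'', and a ''-h''/''--help'' option is generated automatically.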
  
-==== Using ordered dictionaries ====+ 
 +===== Using ordered dictionaries ​=====
  
 **Dictionary order is guaranteed to be insertion order**! Note that the [[https://docs.python.org/3/library/stdtypes.html#dict|usual Python dictionary]] also guarantees the order since version **3.7** (in CPython **3.6**, this was only an implementation detail) **Dictionary order is guaranteed to be insertion order**! Note that the [[https://docs.python.org/3/library/stdtypes.html#dict|usual Python dictionary]] also guarantees the order since version **3.7** (in CPython **3.6**, this was only an implementation detail)
Line 191: Line 228:
 Check the [[https://​docs.python.org/​3/​library/​collections.html#​collections.OrderedDict|OrderedDict class]] (''​from collections import OrderedDict''​) and the [[https://​realpython.com/​python-ordereddict/​|OrderedDict vs dict in Python: The Right Tool for the Job]] tutorial Check the [[https://​docs.python.org/​3/​library/​collections.html#​collections.OrderedDict|OrderedDict class]] (''​from collections import OrderedDict''​) and the [[https://​realpython.com/​python-ordereddict/​|OrderedDict vs dict in Python: The Right Tool for the Job]] tutorial
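A short sketch of what ''OrderedDict'' still offers compared to a plain dictionary: explicit reordering, and order-sensitive comparison. The keys used here are arbitrary.

```python
from collections import OrderedDict

plain = {}
plain['tas'] = 1
plain['pr'] = 2
plain['psl'] = 3
plain_keys = list(plain)      # keys come back in insertion order

ordered = OrderedDict(plain)
ordered.move_to_end('tas')    # reordering is specific to OrderedDict
ordered_keys = list(ordered)

# Order matters when comparing two OrderedDict, but not two dict
same_as_dict = plain == dict(ordered)
same_as_ordered = OrderedDict(plain) == ordered
```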
  
-==== Using sets ====+ 
 +===== Using sets =====
  
 [[https://​docs.python.org/​3/​tutorial/​datastructures.html#​sets|Python sets]] are **groups of unique elements**. They can be used to easily find all the unique elements of //​something//​ and you can easily determine the **intersection**,​ **union** (and other similar operations) of sets. [[https://​docs.python.org/​3/​tutorial/​datastructures.html#​sets|Python sets]] are **groups of unique elements**. They can be used to easily find all the unique elements of //​something//​ and you can easily determine the **intersection**,​ **union** (and other similar operations) of sets.
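A small sketch of the operations mentioned above, using two made-up lists of variable names.

```python
model_vars = ['tas', 'pr', 'tas', 'psl', 'pr']
obs_vars = ['pr', 'psl', 'huss']

unique_model = set(model_vars)               # duplicates are dropped
common = unique_model & set(obs_vars)        # intersection
all_vars = unique_model | set(obs_vars)      # union
only_model = unique_model - set(obs_vars)    # difference

# Sets are not ordered: sort them when a reproducible order is needed
sorted_unique = sorted(unique_model)
```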
  
-==== Printing a readable version of long lists or dictionaries ====+ 
 +===== Printing a readable version of long lists or dictionaries ​=====
  
 The [[https://​docs.python.org/​3/​library/​pprint.html|pprint]] module can be used for //pretty printing// objects (lists, dictionaries,​ ...). It will wrap long lines in a meaningful way The [[https://​docs.python.org/​3/​library/​pprint.html|pprint]] module can be used for //pretty printing// objects (lists, dictionaries,​ ...). It will wrap long lines in a meaningful way
Line 229: Line 268:
 </​code>​ </​code>​
  
-==== Storing objects and data in a file (shelve and friends) ==== 
  
-The built-in [[other:​python:​jyp_steps?s[]=shelve#​the_shelve_package|shelve]] module can be **easily** used for storing temporary/​intermediate data+===== Storing objects and data in a file (shelve and friends) ===== 
 + 
 +The built-in [[other:​python:​jyp_steps#​the_shelve_package|shelve]] module can be **easily** used for storing temporary/​intermediate data
  
 More options: More options:
Line 237: Line 277:
   * Working with [[other:​python:​jyp_steps#​netcdf_filesusing_cdms2_xarray_and_netcdf4|NetCDF]] files   * Working with [[other:​python:​jyp_steps#​netcdf_filesusing_cdms2_xarray_and_netcdf4|NetCDF]] files
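A minimal ''shelve'' sketch, writing to a temporary directory so that it leaves nothing behind. The shelf name and the stored values are hypothetical; anything that can be pickled can be stored.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    shelf_name = os.path.join(tmp_dir, 'intermediate_results')

    # Store picklable objects under string keys
    with shelve.open(shelf_name) as shelf:
        shelf['levels'] = [1000, 850, 500]
        shelf['settings'] = {'model': 'hypothetical', 'nb_years': 10}

    # Re-open the shelf later (possibly from another script)
    with shelve.open(shelf_name) as shelf:
        stored_keys = sorted(shelf.keys())
        levels = shelf['levels']
```

The files created on disk depend on the ''dbm'' backend available on the system, so a shelf is not a good format for exchanging data between machines.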
  
-==== Sorting ====+ 
 +===== Using a configuration file ===== 
 + 
 +The built-in [[https://​docs.python.org/​3/​library/​configparser.html|configparser]] module can be easily used for reading (**and** writing!) text configuration files. 
 + 
 +Note: a configuration file is also an easy way to store and exchange text data!
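A minimal ''configparser'' sketch that writes a configuration and reads it back. The section names, keys and values are hypothetical, and an in-memory text buffer is used instead of a real file so that the example is self-contained.

```python
import configparser
import io

config = configparser.ConfigParser()
config['plot'] = {'title': 'Surface temperature', 'nb_levels': '12'}
config['paths'] = {'input_dir': '/some/hypothetical/dir'}

# Write to a text buffer (use open('settings.ini', 'w') for a real file)
buffer = io.StringIO()
config.write(buffer)
ini_text = buffer.getvalue()

# Read the text back: all the values are stored as strings
config_back = configparser.ConfigParser()
config_back.read_string(ini_text)
nb_levels = config_back.getint('plot', 'nb_levels')
title = config_back['plot']['title']
```

Use ''getint'', ''getfloat'' or ''getboolean'' when a value should not stay a string.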
 + 
 + 
 +===== Working with global variables ===== 
 + 
 +There is a good chance you don't actually want/need a //global// variable. Be sure to use the ''​global''​ statement correctly if you want to avoid side-effects... 
 + 
 +  * [[https://​docs.python.org/​3/​faq/​programming.html?​highlight=global#​why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value|Using (and changing) a global variable inside a script or module]] 
 +    * Simple module example\\ <​code>​_myvar = 10 
 + 
 +def set_myvar(new_val):​ 
 +    # Note: need to explicitly define a global variable (of a module) 
 +    # as '​global'​ BEFORE changing its value in a function! 
 +    # Otherwise, the value will not be REdefined outside the function 
 +    global _myvar 
 +    _myvar = new_val 
 + 
 +def get_myvar():​ 
 +    return _myvar 
 + 
 +def myfunc(nb_repeat = 10): 
 +    print(nb_repeat * _myvar)</​code>​ 
 +  * [[https://​docs.python.org/​3/​faq/​programming.html?​highlight=global#​how-do-i-share-global-variables-across-modules|Sharing global variables across modules]] 
 +===== Sorting ​=====
  
   * When dealing with **numerical values**, you should use the [[https://​numpy.org/​doc/​stable/​reference/​routines.sort.html|numpy sorting, searching, and counting routines]]!   * When dealing with **numerical values**, you should use the [[https://​numpy.org/​doc/​stable/​reference/​routines.sort.html|numpy sorting, searching, and counting routines]]!
Line 254: Line 322:
 ['​c',​ '​d',​ '​b',​ '​a'​]</​code>​ ['​c',​ '​d',​ '​b',​ '​a'​]</​code>​
  
-==== numpy related stuff ====+===== numpy related stuff =====
  
-=== Using a numpy array to store arbitrary objects ===+==== Using a numpy array to store arbitrary objects ​====
  
 The numpy arrays are usually used to store [[https://​numpy.org/​doc/​stable/​reference/​arrays.scalars.html|scalars]] of the same type (see also the [[https://​numpy.org/​doc/​stable/​reference/​arrays.dtypes.html|Data type objects (dtype)]]), very often numerical values. The numpy arrays are usually used to store [[https://​numpy.org/​doc/​stable/​reference/​arrays.scalars.html|scalars]] of the same type (see also the [[https://​numpy.org/​doc/​stable/​reference/​arrays.dtypes.html|Data type objects (dtype)]]), very often numerical values.
Line 275: Line 343:
        ​[<​cartopy.mpl.contour.GeoContourSet object at 0x2ab679e8bf10>,​        ​[<​cartopy.mpl.contour.GeoContourSet object at 0x2ab679e8bf10>,​
         None, None]], dtype=object)</​code>​         None, None]], dtype=object)</​code>​
 +
         ​         ​
-=== Dealing with a variable number of indices ===+==== Dealing with a variable number of indices ​====
  
 [[https://​numpy.org/​doc/​stable/​user/​basics.indexing.html#​dealing-with-variable-indices|Official reference]] [[https://​numpy.org/​doc/​stable/​user/​basics.indexing.html#​dealing-with-variable-indices|Official reference]]
Line 340: Line 409:
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.]])</​code>​        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.]])</​code>​
  
-=== Finding and counting unique values ===+ 
 +==== Finding and counting unique values ​====
  
 Use ''​np.unique'',​ do **not** try to use histogram related functions! Use ''​np.unique'',​ do **not** try to use histogram related functions!
Line 360: Line 430:
 array([1. , 1. , 1. , 1.5, 1.5, 1.5, 2. , 2. , 2. , 2. ])</​code>​ array([1. , 1. , 1. , 1.5, 1.5, 1.5, 2. , 2. , 2. , 2. ])</​code>​
  
-=== Applying a ufunc over all the elements of an array ===+ 
 +==== Applying a ufunc over all the elements of an array ====
  
 There are all sorts of //ufuncs// (Universal Functions). Below, we will just use ''add'' from the [[https://numpy.org/doc/stable/reference/ufuncs.html#math-operations|math operations]], applied to the arrays defined in [[#finding_and_counting_unique_values|Finding and counting unique values]] There are all sorts of //ufuncs// (Universal Functions). Below, we will just use ''add'' from the [[https://numpy.org/doc/stable/reference/ufuncs.html#math-operations|math operations]], applied to the arrays defined in [[#finding_and_counting_unique_values|Finding and counting unique values]]
Line 393: Line 464:
 (3.0, 4.5, 8.0)</​code>​ (3.0, 4.5, 8.0)</​code>​
  
-=== Applying a ufunc over specified sections of an array ===+ 
 +==== Applying a ufunc over specified sections of an array ====
  
 The [[https://​numpy.org/​doc/​stable/​reference/​generated/​numpy.ufunc.reduceat.html#​numpy.ufunc.reduceat|reduceat]] function can be used to avoid explicit python loops, and improve the speed (but not the readability...) of a script. The example below //​improves//​ what has been shown above The [[https://​numpy.org/​doc/​stable/​reference/​generated/​numpy.ufunc.reduceat.html#​numpy.ufunc.reduceat|reduceat]] function can be used to avoid explicit python loops, and improve the speed (but not the readability...) of a script. The example below //​improves//​ what has been shown above
Line 411: Line 483:
 >>>​ np.add.reduceat(np.sort(vals),​ slices_indices) >>>​ np.add.reduceat(np.sort(vals),​ slices_indices)
 array([3. , 4.5, 8. ])</​code>​ array([3. , 4.5, 8. ])</​code>​
 +
 +
 +===== matplotlib related stuff =====
 +
 +==== Working with time axes (and ticks) ====
 +
 +If you have problems setting the limits of a time axis, choosing the ticks' locations, or specifying the style of the labels, you should check the:
 +  * [[https://​matplotlib.org/​stable/​gallery/​index.html#​ticks|Ticks examples'​ gallery]]
 +  * [[https://​matplotlib.org/​stable/​gallery/​text_labels_and_annotations/​date.html|Date tick labels example]]
 +
  
 /* /*
-==== Tip template ====+===== Tip template ​=====
  
 <​code>​Some code</​code>​ <​code>​Some code</​code>​
other/python/misc_by_jyp.txt · Last modified: 2024/04/19 12:02 by jypeter