other:python:misc_by_jyp

Differences

This shows you the differences between two versions of the page.

other:python:misc_by_jyp [2023/05/03 15:56]
jypeter [Numerical values]
other:python:misc_by_jyp [2024/04/17 09:25]
jypeter [Useful python stuff] Added the Extra tutorials section
Line 5: Line 5:
 </​WRAP>​ </​WRAP>​
  
 +===== Extra tutorials =====
  
 +Only **when you have already read all the content of this page several times**, and you are looking for new ideas
 +
 +  * [[https://​medium.com/​@yaduvanshineelam09/​ultimate-python-cheat-sheet-practical-python-for-everyday-tasks-8a33abc0892f|Ultimate Python Cheat Sheet: Practical Python For Everyday Tasks]]
 ===== Reading/setting environment variables ===== ===== Reading/setting environment variables =====
  
Line 65: Line 69:
 ==== Working with paths and filenames ==== ==== Working with paths and filenames ====
  
-If you are in a hurry, you can just use string functions to work with path and file names. ​But you will need some specific functions to check if a file exists, and similar operations. All these are available in 2 libraries that have similar functions. Both of these libraries can deal with Unix-type paths on Linux computers, and Windows-type paths on Windows computers+If you are in a hurry, you can just use string functions to work with paths and file names.
  
-  ​* [[https://​docs.python.org/​3/​library/​os.path.html|os.path]] //Common ​pathname manipulations//​+ 
 +You will need some specific objects and functions to check if a file exists, and for similar operations. Check the libraries listed below, which automatically deal with Unix-type paths on Linux and macOS computers, and Windows-type paths on Windows computers.
 + 
 +  * [[https://docs.python.org/3/library/os.path.html|os.path]]: //common pathname manipulations//
     * Available since... a long time! Use this if you want to avoid backward compatibility problems     * Available since... a long time! Use this if you want to avoid backward compatibility problems
     * Some functions are directly in [[https://​docs.python.org/​3/​library/​os.html|os]] //​Miscellaneous operating system interfaces//​\\ e.g. [[https://​docs.python.org/​3/​library/​os.html#​os.remove|os.remove]] and [[https://​docs.python.org/​3/​library/​os.html#​os.rmdir|os.rmdir]]     * Some functions are directly in [[https://​docs.python.org/​3/​library/​os.html|os]] //​Miscellaneous operating system interfaces//​\\ e.g. [[https://​docs.python.org/​3/​library/​os.html#​os.remove|os.remove]] and [[https://​docs.python.org/​3/​library/​os.html#​os.rmdir|os.rmdir]]
-  * [[https://​docs.python.org/​3/​library/​pathlib.html|pathlib]] //Object-oriented filesystem paths//+  * [[https://​docs.python.org/​3/​library/​pathlib.html|pathlib]]: a **more recent** ​//object-oriented// way to deal with //filesystem paths//
     * Available since Python version 3.4     * Available since Python version 3.4
     * [[https://​docs.python.org/​3/​library/​pathlib.html#​correspondence-to-tools-in-the-os-module|Matching pathlib, and os or os.path functions]]     * [[https://​docs.python.org/​3/​library/​pathlib.html#​correspondence-to-tools-in-the-os-module|Matching pathlib, and os or os.path functions]]
 -  * [[https://docs.python.org/3/library/shutil.html|High-level file operations]]+  * [[https://docs.python.org/3/library/shutil.html|shutil]]: High-level file operations, e.g. copy/move a file or directory tree
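
The libraries listed above overlap a lot. The following sketch (not part of the original page) builds the same file name with ''os.path'' and with ''pathlib'', checks if the file exists, and splits the name into its components. The ''data'' directory and the ''notes.txt'' file are hypothetical names, created in a temporary directory so that no real file is touched.

```python
import os
import os.path
import tempfile
from pathlib import Path

# Work in a temporary directory so the sketch does not touch real files
with tempfile.TemporaryDirectory() as tmp_dir:
    # Build the same (hypothetical) file name with both libraries
    name_os = os.path.join(tmp_dir, 'data', 'notes.txt')
    name_pl = Path(tmp_dir) / 'data' / 'notes.txt'
    same_name = (name_os == str(name_pl))

    # Nothing exists yet
    exists_before = (os.path.exists(name_os), name_pl.exists())

    # Create the parent directory and an empty file
    name_pl.parent.mkdir(parents=True)
    name_pl.touch()
    exists_after = (os.path.isfile(name_os), name_pl.is_file())

    # Split the path into its components
    base_os = os.path.basename(name_os)
    root_os, ext_os = os.path.splitext(base_os)
    base_pl, stem_pl, ext_pl = name_pl.name, name_pl.stem, name_pl.suffix

print(same_name, exists_before, exists_after)
print(base_os, root_os, ext_os)
print(base_pl, stem_pl, ext_pl)
```

Both approaches give the same answers here. ''pathlib'' keeps everything attached to a single object, while ''os.path'' works on plain strings.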
  
  
-=== Example: getting the full path of the Python used ===+=== Example: getting the full path of the Python ​executable ​used ===
  
 Note: the actual python may be different from the default python! Note: the actual python may be different from the default python!
Line 83: Line 90:
 /​usr/​bin/​python /​usr/​bin/​python
  
-$ /modfs/modtools/miniconda3//envs/analyse_3.6_test/bin/python+$ /home/share/unix_files/cdat/​miniconda3_21-02/envs/cdatm_py3/bin/python
 >>>​ import sys, shutil >>>​ import sys, shutil
 >>>​ shutil.which('​python'​) >>>​ shutil.which('​python'​)
 '/​usr/​bin/​python'​ '/​usr/​bin/​python'​
 >>>​ sys.executable >>>​ sys.executable
-'/modfs/modtools/miniconda3//envs/analyse_3.6_test/​bin/​python'</​code>​+'/home/share/unix_files/cdat/​miniconda3_21-02/envs/cdatm_py3/​bin/​python'</​code>​
  
  
Line 104: Line 111:
 </​code>​ </​code>​
  
 +
 +=== Example: system independent paths with pathlib ===
 +
 +Note: the following example was generated on a Linux server and uses a <wrap em>/</​wrap>​ character as a path separator
 +
 +<code>>>> from pathlib import Path
 +>>> my_home = Path.home()
 +>>>​ my_home
 +PosixPath('/​home/​users/​my_login'​)
 +>>>​ my_conf = my_home / '​.config'​ / '​evince'​
 +>>>​ my_conf
 +PosixPath('/​home/​users/​my_login/​.config/​evince'​)
 +>>>​ my_conf.is_dir()
 +True
 +>>>​ my_conf.is_file()
 +False
 +>>>​ list(my_conf.glob('​*'​))
 +[PosixPath('/home/users/my_login/.config/evince/evince_toolbar.xml'), PosixPath('/home/users/my_login/.config/evince/accels')]
 +>>>​ [ ff.name for ff in my_conf.glob('​*'​) ]
 +['​evince_toolbar.xml',​ '​accels'​]
 +</​code>​
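
To see what //system independent// means without switching computers, the sketch below (an addition, not taken from the original page) uses the //pure// path classes of ''pathlib'', which only manipulate names and never access the disk. The ''C:/Users/my_login'' location is a hypothetical Windows home directory, used for illustration only.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical sub-directories and file name, used for illustration only
parts = ('.config', 'evince', 'accels')

# The same join operation, with Unix-type and Windows-type conventions
posix_conf = PurePosixPath('/home/users/my_login').joinpath(*parts)
win_conf = PureWindowsPath('C:/Users/my_login').joinpath(*parts)

print(posix_conf)                    # '/' is used as the separator
print(win_conf)                      # '\' is used as the separator
print(posix_conf.name, win_conf.name)
print(win_conf.parent.as_posix())    # Windows path written with '/'
```

In a normal script you would just use ''Path'', which picks the right convention for the computer the script is running on.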
  
 === Example: getting the size(s) of all the files in a directory === === Example: getting the size(s) of all the files in a directory ===
Line 290: Line 317:
 ['​c',​ '​d',​ '​b',​ '​a'​]</​code>​ ['​c',​ '​d',​ '​b',​ '​a'​]</​code>​
  
 +
 +===== Efficient looping with numpy, map, itertools and list comprehension =====
 +
 +<wrap hi>Big, nested, explicit ''for'' loops should be avoided at all costs</wrap>, in order to reduce a script's execution time!
 +
 +  * **''​numpy''​ arrays** should be used when dealing with //numerical data//
 +    * **Masked arrays** can be used to deal with //special cases// and remove tests from loops
 +
 +  * The built-in [[https://​docs.python.org/​3/​library/​functions.html?​highlight=map#​map|map]] function (and similar functions like [[https://​docs.python.org/​3/​library/​functions.html?​highlight=zip#​zip|zip]],​ [[https://​docs.python.org/​3/​library/​functions.html?​highlight=filter#​filter|filter]],​ ...) can be used to efficiently apply a function (possibly a //simple// [[https://​docs.python.org/​3/​tutorial/​controlflow.html#​lambda-expressions|lambda]] function) to all the elements of a list
 +    * <code>>>> my_ints = [1, 2, 3]
 +
 +>>> list(map(str, my_ints)) # map returns an iterator in Python 3
 +['1', '2', '3']
 +
 +>>> list(map(lambda ii: str(10*ii + 5), my_ints))
 +['15', '25', '35']</code>
 +
 +  * The [[https://​docs.python.org/​3/​library/​itertools.html|itertools]] module defines many more fancy iterators that can be used for efficient looping
 +    * Example: replacing nested loops with [[https://​docs.python.org/​3/​library/​itertools.html#​itertools.product|product]]
 +      * <code>>>> import itertools as it
 +>>> it.product('AB', '01')
 +<​itertools.product object at 0x2b35a7b5f100>​
 +
 +>>>​ list(it.product('​AB',​ '​01'​))
 +[('​A',​ '​0'​),​ ('​A',​ '​1'​),​ ('​B',​ '​0'​),​ ('​B',​ '​1'​)]
 +
 +>>>​ for c1, c2 in it.product('​AB',​ '​01'​):​
 +...   ​print(c1 + c2)
 +...
 +A0
 +A1
 +B0
 +B1
 +
 +>>>​ for c1, c2 in it.product(['​A',​ '​B'​],​ ['​0',​ '​1'​]):​
 +...   ​print(c1 + c2)
 +...
 +A0
 +A1
 +B0
 +B1
 +
 +>>>​ for c1, c2, c3 in it.product('​AB',​ '​01',​ '​$!'​):​
 +...   ​print(c1 + c2 + c3, end=', ')
 +...
 +A0$, A0!, A1$, A1!, B0$, B0!, B1$, B1!,</​code>​
 +
 +  * The [[https://​docs.python.org/​3/​tutorial/​datastructures.html?​highlight=comprehension#​list-comprehensions|list comprehension]] (aka //implicit loops//) can also be used to generate lists from lists
 +    * Example: converting a list of integers to a list of strings\\ Note: in that case, you should rather use the ''​map''​ function detailed above
 +      * <​code>>>>​ my_ints = [1, 2, 3]
 +
 +>>>​ [ str(ii) for ii in my_ints ]
 +['​1',​ '​2',​ '​3'​]</​code>​
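
The ''zip'' and ''filter'' functions mentioned in this section can be sketched as follows, together with a list comprehension that includes a condition. This is only an illustration: the ''names'' and ''scales'' lists are hypothetical values, not taken from the original page.

```python
# Hypothetical variable names and scaling factors, for illustration only
names = ['tas', 'pr', 'psl']
scales = [1.0, 86400.0, 0.01]

# zip pairs the elements of several sequences, without index arithmetic
pairs = list(zip(names, scales))

# filter keeps only the elements for which the function returns True
big_scales = list(filter(lambda ss: ss > 0.5, scales))

# A list comprehension can combine the mapping and the filtering steps
labels = [f'{nn}*{ss:g}' for nn, ss in zip(names, scales) if ss > 0.5]

# In Python 3, map, zip and filter return lazy iterators:
# the values are only computed when they are requested
lazy = map(str, scales)
first = next(lazy)

print(pairs)
print(big_scales)
print(labels)
print(first)
```

Wrapping the result in ''list()'' is only needed when you want to see or store all the values at once. In a ''for'' loop, the iterator can be used directly.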
 ===== numpy related stuff ===== ===== numpy related stuff =====
  
Line 452: Line 531:
 array([3. , 4.5, 8. ])</​code>​ array([3. , 4.5, 8. ])</​code>​
  
 +==== Exercise your brain with numpy ====
 +
 +Have a look at [[https://​github.com/​rougier/​numpy-100/​blob/​master/​100_Numpy_exercises.ipynb|100 numpy exercises]]
  
 ===== matplotlib related stuff ===== ===== matplotlib related stuff =====
Line 467: Line 549:
  
 FIXME Add parts (pages 28 to 37) of this [[https://​wiki.lsce.ipsl.fr/​pmip3/​doku.php/​other:​python:​jyp_steps#​part_2|old tutorial]] to this section FIXME Add parts (pages 28 to 37) of this [[https://​wiki.lsce.ipsl.fr/​pmip3/​doku.php/​other:​python:​jyp_steps#​part_2|old tutorial]] to this section
 +
 +==== Base notions ====
 +
 +  * **Never forget** that all the bits and pieces of information we use are coded in [[https://​en.wikipedia.org/​wiki/​Binary_number#​Counting_in_binary|base 2]] (''​0''​s and ''​1''​s ...), grouped in bytes!
 +    * Some things can be stored exactly (integers, characters, ...)
 +    * In other cases (**//real// numbers** that we work with all the time, compressed images/videos/music) we only store **//a good enough approximation//**
 +
 +  * 1 byte <=> 8 bits
 +    * ''​REAL*4''​ <=> 4 bytes <=> 32 bits
 +    * For easier written/​displayed representation,​ 1 byte is usually split into 2 groups of 4 bits, and displayed using base 16 and [[https://​en.wikipedia.org/​wiki/​Hexadecimal|hexadecimal representation]] (characters ''​0'',​ ''​1'',​ ..., ''​A'',​ ''​B'',​ ..., ''​F''​)
 +      * ''0000'' <=> ''0'',\\ ''0001'' <=> ''1'', ...,\\ ''1111'' <=> ''F''
 +      * ''​1101''​ <=> ''​D''​ in hexadecimal <=> ''​13''​ in decimal (''​**1** * 8 + **1** * 4 + **0** * 2 + **1** * 1''​)
 +      * ''​11111101''​ in //base 2// <=> ''​1111 1101''​ <=> ''​FD''​ in //​hexadecimal//​ <=> ''​253''​ (''​15 * 16 + 13''​) in //decimal//
 +
 +  * Base conversion with Python
 +    * <​code>>>>​ hex(13) # Decimal to Hexadecimal conversion
 +'​0xd'​
 +>>>​ hex(253)
 +'​0xfd'​
 +>>>​ hex(256)
 +'​0x100'​
 +>>>​ int('​0x100',​ 16) # Hexadecimal to Decimal conversion
 +256
 +>>>​ int('​1111',​ 2) # Binary to Decimal conversion
 +15
 +>>>​ int('​11111101',​ 2) # '​11111101'​ <=> '1111 1101' <=> '​FD'​ <=> 15 * 16 + 13 = 253
 +253
 +>>> 0o13 # DANGER! Octal literals need the 0o prefix in Python 3 (a bare leading 0, as in 013, meant OCTAL in Python 2 and is now a SyntaxError)
 +11
 +>>>​ int('​13',​ 8) # 1*8 + 3
 +11</​code>​
 +
 +  * More technical topics
 +    * [[https://en.wikipedia.org/wiki/Bit_numbering|Bit numbering]]: the art of ordering bits, everything about MSB (Most Significant Bit) and LSB (Least Significant Bit)
 +    * [[https://​en.wikipedia.org/​wiki/​Endianness|Endianness]]:​ the art of ordering bytes
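
The notions above (exact integers, approximated real numbers, byte ordering) can be checked with the standard library only. The following is a minimal sketch and not part of the original page: it stores the integer ''256'' on 4 bytes with both byte orders, then packs ''0.1'' into a 4-byte float (the equivalent of ''REAL*4'') with the ''struct'' module.

```python
import struct

# An integer is stored exactly: 256 on 4 bytes, with both byte orders
big = (256).to_bytes(4, byteorder='big')
little = (256).to_bytes(4, byteorder='little')
print(big.hex(), little.hex())                    # 00000100 00010000

# Reading the bytes back requires knowing which byte order was used
print(int.from_bytes(big, byteorder='big'))       # 256
print(int.from_bytes(big, byteorder='little'))    # 65536

# A real number is only approximated: 0.1 packed as a big-endian 4-byte float
raw = struct.pack('>f', 0.1)
back = struct.unpack('>f', raw)[0]
print(len(raw), raw.hex())                        # 4 3dcccccd
print(back == 0.1)                                # False
print(abs(back - 0.1) < 1e-8)                     # True: good enough
```

Reading the same 4 bytes with the wrong byte order silently gives a different integer, which is why endianness matters when exchanging binary files between computers.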
 ==== Numerical values ==== ==== Numerical values ====
  
-  * Binary data representation of some numbers (not everythin is listed here):+  * Binary data representation of some numbers (only some common types are listed here): 
 +    * Languages and packages **references** used below: 
 +      * Python: [[https://​numpy.org/​doc/​stable/​reference/​arrays.scalars.html#​sized-aliases|NumPy Sized aliases]] 
 +      * NetCDF: [[https://​docs.unidata.ucar.edu/​nug/​current/​md_types.html|Data Types]], [[https://​docs.unidata.ucar.edu/​netcdf-fortran/​current/​f90-variables.html#​f90-language-types-corresponding-to-netcdf-external-data-types|Fortran related Data Types]], [[https://​docs.unidata.ucar.edu/​nug/​current/​_c_d_l.html#​cdl_data_types|CDL Data Types]] 
 +      * Fortran: Intel Fortran Compiler [[https://​www.intel.com/​content/​www/​us/​en/​docs/​fortran-compiler/​developer-guide-reference/​2023-1/​intrinsic-data-types.html|Intrinsic Data Types]]
     * [[https://​en.wikipedia.org/​wiki/​Integer_(computer_science)|Integers]]     * [[https://​en.wikipedia.org/​wiki/​Integer_(computer_science)|Integers]]
       * Range:       * Range:
-        * 4-byte integers: −2,​147,​483,​648 to 2,​147,​483,​647+        * 4-byte ​//​signed// ​integers: ​''​−2,​147,​483,​648'' ​to ''​2,​147,​483,​647''​
           * Python: ''​numpy.int32''​           * Python: ''​numpy.int32''​
-          * [[https://​docs.unidata.ucar.edu/​nug/​current/​md_types.html|NetCDF]], [[https://​docs.unidata.ucar.edu/​netcdf-fortran/​current/​f90-variables.html#​f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]: ''​int'',​ ''​NC_INT64'',​ ''​NF90_INT''​ +          * NetCDF: ''​int'',​ ''​NC_INT''​ or ''​NC_LONG'',​ ''​NF90_INT''​ 
-          * Fortran: +          * Fortran: ​''​INTEGER*4''​ 
-        * 8-byte integers: −9,​223,​372,​036,​854,​775,​808 to 9,​223,​372,​036,​854,​775,​807+        * 8-byte ​//​signed// ​integers: ​''​−9,​223,​372,​036,​854,​775,​808'' ​to ''​9,​223,​372,​036,​854,​775,​807''​
           * Python: ''​numpy.int64''​           * Python: ''​numpy.int64''​
-          * [[https://​docs.unidata.ucar.edu/​nug/​current/​md_types.html|NetCDF]]: ''​int64'',​ ''​NC_INT64''​ +          * NetCDF: ''​int64'',​ ''​NC_INT64''​ 
-          * Fortran:+          * Fortran: ​''​INTEGER*8''​
       * Tech note: signed integers use [[https://​en.wikipedia.org/​wiki/​Two%27s_complement|two'​s complement]] for coding negative integers       * Tech note: signed integers use [[https://​en.wikipedia.org/​wiki/​Two%27s_complement|two'​s complement]] for coding negative integers
     * [[https://en.wikipedia.org/wiki/IEEE_754|Floating point numbers]] (//IEEE 754// standard aka //IEEE Standard for Binary Floating-Point Arithmetic//)     * [[https://en.wikipedia.org/wiki/IEEE_754|Floating point numbers]] (//IEEE 754// standard aka //IEEE Standard for Binary Floating-Point Arithmetic//)
       * Range:       * Range:
 -        * 4-byte float: ~8 significant digits * 10E±38+        * 4-byte float: ''~7 significant digits * 10E±38''
           * Python: ''​numpy.float32''​           * Python: ''​numpy.float32''​
 -          * [[https://docs.unidata.ucar.edu/nug/current/md_types.html|NetCDF]], [[https://docs.unidata.ucar.edu/netcdf-fortran/current/f90-variables.html#f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]: +          * NetCDF: ''float'', ''NC_FLOAT'', ''NF90_FLOAT''
 -          * Fortran:+          * Fortran: ''REAL*4''
           * See also [[https://​en.wikipedia.org/​wiki/​Single-precision_floating-point_format|Single-precision floating-point format]]           * See also [[https://​en.wikipedia.org/​wiki/​Single-precision_floating-point_format|Single-precision floating-point format]]
-        * 8-byte float: ~15 significant digits * 10E±308+        * 8-byte float: ​''​~15 significant digits * 10E±308''​
           * Python: ''​numpy.float64''​           * Python: ''​numpy.float64''​
-          * [[https://​docs.unidata.ucar.edu/​nug/​current/​md_types.html|NetCDF]], [[https://​docs.unidata.ucar.edu/​netcdf-fortran/​current/​f90-variables.html#​f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]:  +          * NetCDF: ​''​double'',​ ''​NC_DOUBLE'',​ ''​NF90_DOUBLE''​ 
-          * Fortran: +          * Fortran: ​''​REAL*8''​ 
-      * Special values: +      ​* **Special values**
-        * [[https://​en.wikipedia.org/​wiki/​NaN|NaN]] ​(''​numpy.nan''​): //Not a Number// +        * [[https://​en.wikipedia.org/​wiki/​NaN|NaN]]:​ //Not a Number// 
-        * Infinity ​(''​-numpy.inf''​ and ''​numpy.inf''​) +          * Python: ''​numpy.nan''​ 
-        * Note: it is cleaner to use masks (and [[https://​numpy.org/​doc/​stable/​reference/​maskedarray.generic.html|Numpy masked arrays]]) than NaNs, when you have to deal with missing values ! +        * Infinity 
-    * [[https://​en.wikipedia.org/​wiki/​Bit_numbering|Bit numbering]] +          * Python: ​''​-numpy.inf''​ and ''​numpy.inf''​ 
-    * [[https://​en.wikipedia.org/​wiki/​Endianness|Endianness]]+        * Note: it is cleaner to use masks (and [[https://​numpy.org/​doc/​stable/​reference/​maskedarray.generic.html|Numpy masked arrays]]) ​rather ​than ''​NaN''​s, when you have to deal with missing values ! 
 +      * <wrap hi>The RISKS of working with (the wrong) floats</​wrap>:​ 
 +        ​* [[https://​en.wikipedia.org/​wiki/​Round-off_error|Round-off error]] 
 +        * [[https://​en.wikipedia.org/​wiki/​Catastrophic_cancellation|Catastrophic cancellation]] 
 +          * [[https://​docs.oracle.com/​cd/​E19957-01/​806-3568/​ncg_goldberg.html|What Every Computer Scientist Should Know About Floating-Point Arithmetic]]
     * A rather technical example: we //play// with a numpy 4-byte integer scalar     * A rather technical example: we //play// with a numpy 4-byte integer scalar
       * <​code>>>>​ one_int32 = np.int32(1)       * <​code>>>>​ one_int32 = np.int32(1)
other/python/misc_by_jyp.txt · Last modified: 2024/04/19 12:02 by jypeter