| other:python:misc_by_jyp [2023/05/04 17:25]  – [Numerical values] Lots of changes jypeter | other:python:misc_by_jyp [2025/08/29 11:17] (current)  – [Extra tutorials] Added the "Stats stuff" category jypeter | 
|---|
| </WRAP> | </WRAP> | 
|  |  | 
|  | ===== Extra tutorials ===== | 
|  |  | 
|  | Only **when you have already read all the content of this page several times**, and you are looking for new ideas | 
|  |  | 
|  | * [[https://medium.com/data-science/calculating-distance-between-two-geolocations-in-python-26ad3afe287b|Calculating distance between two geo-locations in Python]]: | 
|  | * ''[[https://github.com/mapado/haversine|haversine]]'', ''[[https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.haversine_distances.html|haversine_distances]] @ scikit-learn'' and [[https://en.wikipedia.org/wiki/Haversine_formula|Haversine formula]] | 
|  | * Looking at table data with ''pandas'' | 
|  | * [[https://blog.devgenius.io/data-profiling-in-python-common-ways-to-explore-your-data-part-1-0efd0dedff75|Summary information]] | 
|  | * [[https://blog.devgenius.io/data-profiling-in-python-common-ways-to-explore-your-data-part-2-396384522e91|More detailed information]] | 
|  | * [[https://blog.devgenius.io/data-cleansing-in-python-common-ways-to-clean-your-data-3459a256dd85|Table data cleaning]] | 
|  | * Stats stuff | 
|  | * [[https://medium.com/@tubelwj/python-outlier-detection-iqr-method-and-z-score-implementation-8e825edf4b32|Python Outlier Detection: IQR Method and Z-score Implementation]] | 
|  | * [[https://medium.com/pythons-gurus/clean-code-in-python-good-vs-bad-practices-examples-2df344bddacc|Clean Code in Python: Good vs. Bad Practices Examples]] | 
|  | * [[https://peps.python.org/pep-0008/|PEP 8 – Style Guide for Python Code]] | 
|  | * [[https://realpython.com/python-pep8/|How to Write Beautiful Python Code With PEP 8]] | 
|  | * [[https://www.datacamp.com/tutorial/pep8-tutorial-python-code|PEP-8 Tutorial: Code Standards in Python]] | 
|  | * Some checkers/linters: [[https://docs.astral.sh/ruff/|ruff]], [[https://flake8.pycqa.org/en/stable/|flake8]] | 
|  | * [[https://medium.com/@yaduvanshineelam09/ultimate-python-cheat-sheet-practical-python-for-everyday-tasks-8a33abc0892f|Ultimate Python Cheat Sheet: Practical Python For Everyday Tasks]] | 
|  | * [[https://medium.com/pythoneers/16-hacks-that-will-take-your-python-skills-to-the-next-level-12e7a9b97421|16 Hacks That Will Take Your Python Skills to the Next Level]] | 
|  | * [[https://levelup.gitconnected.com/modular-coding-in-python-finally-solve-your-import-errors-af2fd172fcf7|Modular Coding in Python: Finally Solve your Import Errors]] (understanding and fixing ModuleNotFoundError and ImportError) | 
|  | * [[https://medium.com/@moraneus/understanding-multithreading-and-multiprocessing-in-python-1ed39bb078d5|Understanding Multithreading and Multiprocessing in Python]] | 
| ===== Reading/setting environment variables ===== | ===== Reading/setting environment variables ===== | 
|  |  | 
|  |  | 
|  |  | 
|  | ===== Using log files (aka logging) ===== | 
|  |  | 
|  | It is always possible to display information messages using the ''print()'' function, but it is more efficient to use //logging// tools when you want to **correctly display a lot of information about a script's progress** | 
|  |  | 
|  | * [[https://loguru.readthedocs.io/|Loguru]] is a library which aims to bring enjoyable logging in Python | 
|  | * See also [[https://betterstack.com/community/guides/logging/loguru/|A Complete Guide to Logging in Python with Loguru]] | 
|  | * More on [[https://betterstack.com/community/guides/logging/#python|logging with python]] | 
|  | * The default (but not easy to use) Python ''[[https://docs.python.org/3/library/logging.html|logging]]'' module | 
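The following is a minimal sketch of what logging can look like with the standard ''logging'' module, assuming a script that processes a list of years. The logger name ''my_script'' and the ''process_years'' function are hypothetical and only used for illustration.

```python
import logging

# Configure the root logger once, at the beginning of the script
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s %(levelname)-8s %(message)s',
)

# 'my_script' is a hypothetical logger name
logger = logging.getLogger('my_script')


def process_years(years):
    """Hypothetical processing loop that reports its progress."""
    results = []
    for year in years:
        # DEBUG messages are hidden when the level is INFO
        logger.debug('Starting year %i', year)
        if year < 0:
            logger.warning('Skipping invalid year %i', year)
            continue
        results.append(year)
        logger.info('Processed year %i', year)
    return results


processed = process_years([1984, -1, 2001])
```

By default the messages go to the standard error output. Passing a ''filename'' argument to ''logging.basicConfig()'' should send them to a log file instead.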
| ===== Stopping a script ===== | ===== Stopping a script ===== | 
|  |  | 
| ===== Playing with strings ===== | ===== Playing with strings ===== | 
|  |  | 
|  | ==== String formatting ==== | 
|  |  | 
|  | * Knowing how to display/print a string correctly is always useful for information and debugging purposes | 
|  | * There are lots of different ways to display strings | 
|  |  | 
|  | === String formatting examples === | 
|  |  | 
|  | You will find below some examples of //quick printing//, as well as using //old style formatting//, //formatted string literals (f-strings)// and the //String ''format()'' Method//. More details in the next section | 
|  |  | 
|  | <code python> | 
|  | >>> # Basic (but quick and efficient) printing | 
|  |  | 
|  | >>> year = 1984 | 
|  | >>> print(year) | 
|  | 1984 | 
|  | >>> print('[', year, 'is a famous book ]') | 
|  | [ 1984 is a famous book ] | 
|  |  | 
|  | >>> # Old style formatting | 
|  |  | 
|  | >>> print('[ %i is a famous book ]' % (year,)) | 
|  | [ 1984 is a famous book ] | 
|  | >>> print('[ %10i is a famous book ]' % (year,)) | 
|  | [       1984 is a famous book ] | 
|  | >>> print('[ %-10i is a famous book ]' % (year,)) | 
|  | [ 1984       is a famous book ] | 
|  | >>> print('[ %010i is a famous book ]' % (year,)) | 
|  | [ 0000001984 is a famous book ] | 
|  |  | 
|  | >>> # Formatted string literals (f-strings) | 
|  |  | 
|  | >>> print(f'[ {year} is a famous book ]') | 
|  | [ 1984 is a famous book ] | 
|  | >>> print(f'[ {year=} is a famous book ]') | 
|  | [ year=1984 is a famous book ] | 
|  | >>> print(f'[ {year:10} is a famous book ]') | 
|  | [       1984 is a famous book ] | 
|  | >>> print(f'[ {year:<10} is a famous book ]') | 
|  | [ 1984       is a famous book ] | 
|  | >>> print(f'[ {year:010} is a famous book ]') | 
|  | [ 0000001984 is a famous book ] | 
|  | >>> print(f'[ {year:10.2f} is a famous book (yes, {year}!) ]') | 
|  | [    1984.00 is a famous book (yes, 1984!) ] | 
|  |  | 
|  | >>> # The String format() Method | 
|  |  | 
|  | >>> print('[ {} is a famous book ]'.format(year)) | 
|  | [ 1984 is a famous book ] | 
|  | >>> print('[ {:10} is a famous book ]'.format(year)) | 
|  | [       1984 is a famous book ] | 
|  | >>> print('[ {:<10} is a famous book ]'.format(year)) | 
|  | [ 1984       is a famous book ] | 
|  | >>> print('[ {:010} is a famous book ]'.format(year)) | 
|  | [ 0000001984 is a famous book ] | 
|  | >>> print('[ {:10.2f} is a famous book  (yes, {}!) ]'.format(year, year)) | 
|  | [    1984.00 is a famous book  (yes, 1984!) ] | 
|  | >>> print('[ {title:10.2f} is a famous book  (yes, {title}!) ]'.format(title=year)) | 
|  | [    1984.00 is a famous book  (yes, 1984!) ] | 
|  | >>> print('[ {title:10.2e} is a famous book ]'.format(title=year)) | 
|  | [   1.98e+03 is a famous book ]</code> | 
|  |  | 
|  | === String formatting references === | 
|  |  | 
|  | * [[https://docs.python.org/3/tutorial/inputoutput.html#formatted-string-literals|Formatted String Literals]] (//f-strings//) | 
|  | * Available in Python >= 3.6 | 
|  | * [[https://docs.python.org/3/reference/lexical_analysis.html#f-strings|More documentation]] | 
|  | * [[https://docs.python.org/3/library/string.html#formatspec|Format Specification Mini-Language]] | 
|  | * See also the [[https://pyformat.info/|PyFormat site]] | 
|  |  | 
|  | * [[https://docs.python.org/3/tutorial/inputoutput.html#the-string-format-method|The String format() Method]] | 
|  | * [[https://docs.python.org/3/library/string.html#formatspec|Format Specification Mini-Language]] | 
|  | * See also the [[https://pyformat.info/|PyFormat site]] | 
|  |  | 
|  | * [[https://pyformat.info/|PyFormat site]]: string formatting using the //old style// and the //String ''format()'' method// | 
|  | * <wrap hi>Hint</wrap>: this can also be used as an **easy documentation for //f-strings// format**! | 
|  |  | 
|  | * [[https://docs.python.org/3/tutorial/inputoutput.html#old-string-formatting|Old string formatting]] | 
| ==== Splitting (complex) strings ==== | ==== Splitting (complex) strings ==== | 
|  |  | 
| ==== Working with paths and filenames ==== | ==== Working with paths and filenames ==== | 
|  |  | 
| If you are in a hurry, you can just use string functions to work with path and file names. But you will need some specific functions to check if a file exists, and similar operations. All these are available in 2 libraries that have similar functions. Both of these libraries can deal with Unix-type paths on Linux computers, and Windows-type paths on Windows computers | If you are in a hurry, you can just use string functions to work with paths and file names. | 
|  |  | 
| * [[https://docs.python.org/3/library/os.path.html|os.path]] //Common pathname manipulations// |  | 
|  | You will need some specific objects and functions to check if a file exists, and for similar operations. Check the libraries listed below, which can automatically deal with Unix-type paths on Linux and macOS computers, and Windows-type paths on Windows computers | 
|  |  | 
|  | * [[https://docs.python.org/3/library/os.path.html|os.path]]: //common pathname manipulations// | 
| * Available since... a long time! Use this if you want to avoid backward compatibility problems | * Available since... a long time! Use this if you want to avoid backward compatibility problems | 
| * Some functions are directly in [[https://docs.python.org/3/library/os.html|os]] //Miscellaneous operating system interfaces//\\ e.g. [[https://docs.python.org/3/library/os.html#os.remove|os.remove]] and [[https://docs.python.org/3/library/os.html#os.rmdir|os.rmdir]] | * Some functions are directly in [[https://docs.python.org/3/library/os.html|os]] //Miscellaneous operating system interfaces//\\ e.g. [[https://docs.python.org/3/library/os.html#os.remove|os.remove]] and [[https://docs.python.org/3/library/os.html#os.rmdir|os.rmdir]] | 
| * [[https://docs.python.org/3/library/pathlib.html|pathlib]] //Object-oriented filesystem paths// | * [[https://docs.python.org/3/library/pathlib.html|pathlib]]: a **more recent** //object-oriented// way to deal with //filesystem paths// | 
| * Available since Python version 3.4 | * Available since Python version 3.4 | 
| * [[https://docs.python.org/3/library/pathlib.html#correspondence-to-tools-in-the-os-module|Matching pathlib, and os or os.path functions]] | * [[https://docs.python.org/3/library/pathlib.html#correspondence-to-tools-in-the-os-module|Matching pathlib, and os or os.path functions]] | 
| * [[https://docs.python.org/3/library/shutil.html|High-level file operations]] | * [[https://docs.python.org/3/library/shutil.html|shutil]]: High-level file operations, e.g. copy/move a file or directory tree | 
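As a rough illustration of how the ''os.path'' functions and the ''pathlib'' objects match, the sketch below performs the same path manipulations with both libraries. The directory and file names are hypothetical, and the file does not need to exist for pure path manipulations.

```python
import os.path
from pathlib import Path

# Hypothetical directory and file names, used for illustration only
data_dir = os.path.join('data', 'run_01')
file_path = os.path.join(data_dir, 'tas_monthly.nc')

# os.path: functions working on plain strings
base_name = os.path.basename(file_path)        # 'tas_monthly.nc'
stem, extension = os.path.splitext(base_name)  # ('tas_monthly', '.nc')

# pathlib: the same information as attributes of a Path object
p = Path('data') / 'run_01' / 'tas_monthly.nc'
assert p.name == base_name
assert p.stem == stem
assert p.suffix == extension
assert str(p.parent) == data_dir

# Both libraries can check if a file exists
print(os.path.exists(file_path), p.exists())
```

The ''pathlib'' version keeps everything attached to a single object, which is usually easier to read when several operations are chained.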
|  |  | 
|  |  | 
| === Example: getting the full path of the Python used === | === Example: getting the full path of the Python executable used === | 
|  |  | 
| Note: the actual python may be different from the default python! | Note: the actual python may be different from the default python! | 
| /usr/bin/python | /usr/bin/python | 
|  |  | 
| $ /modfs/modtools/miniconda3//envs/analyse_3.6_test/bin/python | $ /home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/bin/python | 
| >>> import sys, shutil | >>> import sys, shutil | 
| >>> shutil.which('python') | >>> shutil.which('python') | 
| '/usr/bin/python' | '/usr/bin/python' | 
| >>> sys.executable | >>> sys.executable | 
| '/modfs/modtools/miniconda3//envs/analyse_3.6_test/bin/python'</code> | '/home/share/unix_files/cdat/miniconda3_21-02/envs/cdatm_py3/bin/python'</code> | 
|  |  | 
|  |  | 
| </code> | </code> | 
|  |  | 
|  |  | 
|  | === Example: system independent paths with pathlib === | 
|  |  | 
|  | Note: the following example was generated on a Linux server and uses a <wrap em>/</wrap> character as a path separator | 
|  |  | 
|  | <code>>>> from pathlib import Path | 
|  | >>> my_home = Path.home() | 
|  | >>> my_home | 
|  | PosixPath('/home/users/my_login') | 
|  | >>> my_conf = my_home / '.config' / 'evince' | 
|  | >>> my_conf | 
|  | PosixPath('/home/users/my_login/.config/evince') | 
|  | >>> my_conf.is_dir() | 
|  | True | 
|  | >>> my_conf.is_file() | 
|  | False | 
|  | >>> list(my_conf.glob('*')) | 
|  | [PosixPath('/home/users/my_login/.config/evince/evince_toolbar.xml'), PosixPath('/home/users/my_login/.config/evince/accels')] | 
|  | >>> [ ff.name for ff in my_conf.glob('*') ] | 
|  | ['evince_toolbar.xml', 'accels'] | 
|  | </code> | 
|  |  | 
| === Example: getting the size(s) of all the files in a directory === | === Example: getting the size(s) of all the files in a directory === | 
| ['c', 'd', 'b', 'a']</code> | ['c', 'd', 'b', 'a']</code> | 
|  |  | 
|  |  | 
|  | ===== Efficient looping with numpy, map, itertools and list comprehension ===== | 
|  |  | 
|  | <wrap hi>Big, nested, explicit ''for'' loops should be avoided at all costs</wrap>, in order to reduce a script's execution time! | 
|  |  | 
|  | * **''numpy'' arrays** should be used when dealing with //numerical data// | 
|  | * **Masked arrays** can be used to deal with //special cases// and remove tests from loops | 
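The two points above can be sketched as follows, assuming a hypothetical array of temperatures where ''-999.'' flags a missing value. The explicit loop needs a test for each value, while the masked array version needs neither a loop nor a test.

```python
import numpy as np

# Hypothetical temperatures, where -999. flags a missing value
temperatures = np.array([12.5, -999., 14.0, 15.5, -999., 13.0])

# Explicit loop, with a test for each value
total, count = 0., 0
for value in temperatures:
    if value != -999.:
        total += value
        count += 1
loop_mean = total / count

# Same computation with a masked array: no loop and no test
masked_temperatures = np.ma.masked_values(temperatures, -999.)
masked_mean = masked_temperatures.mean()

print(loop_mean, masked_mean)
```

Both means are computed from the 4 valid values only. On large arrays, the masked array version is expected to be much faster than the explicit loop.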
|  |  | 
|  | * The built-in [[https://docs.python.org/3/library/functions.html?highlight=map#map|map]] function (and similar functions like [[https://docs.python.org/3/library/functions.html?highlight=zip#zip|zip]], [[https://docs.python.org/3/library/functions.html?highlight=filter#filter|filter]], ...) can be used to efficiently apply a function (possibly a //simple// [[https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions|lambda]] function) to all the elements of a list | 
|  | * <code>>>> my_ints = [1, 2, 3] | 
|  |  | 
|  | >>> # In Python 3, map() returns an iterator: use list() to get the values | 
|  | >>> list(map(str, my_ints)) | 
|  | ['1', '2', '3'] | 
|  |  | 
|  | >>> list(map(lambda ii: str(10*ii + 5), my_ints)) | 
|  | ['15', '25', '35']</code> | 
|  |  | 
|  | * The [[https://docs.python.org/3/library/itertools.html|itertools]] module defines many more fancy iterators that can be used for efficient looping | 
|  | * Example: replacing nested loops with [[https://docs.python.org/3/library/itertools.html#itertools.product|product]] | 
|  | * <code>>>> import itertools as it | 
|  |  | 
|  | >>> it.product('AB', '01') | 
|  | <itertools.product object at 0x2b35a7b5f100> | 
|  |  | 
|  | >>> list(it.product('AB', '01')) | 
|  | [('A', '0'), ('A', '1'), ('B', '0'), ('B', '1')] | 
|  |  | 
|  | >>> for c1, c2 in it.product('AB', '01'): | 
|  | ...   print(c1 + c2) | 
|  | ... | 
|  | A0 | 
|  | A1 | 
|  | B0 | 
|  | B1 | 
|  |  | 
|  | >>> for c1, c2 in it.product(['A', 'B'], ['0', '1']): | 
|  | ...   print(c1 + c2) | 
|  | ... | 
|  | A0 | 
|  | A1 | 
|  | B0 | 
|  | B1 | 
|  |  | 
|  | >>> for c1, c2, c3 in it.product('AB', '01', '$!'): | 
|  | ...   print(c1 + c2 + c3, end=', ') | 
|  | ... | 
|  | A0$, A0!, A1$, A1!, B0$, B0!, B1$, B1!,</code> | 
|  |  | 
|  | * The [[https://docs.python.org/3/tutorial/datastructures.html?highlight=comprehension#list-comprehensions|list comprehension]] (aka //implicit loops//) can also be used to generate lists from lists | 
|  | * Example: converting a list of integers to a list of strings\\ Note: in that case, you should rather use the ''map'' function detailed above | 
|  | * <code>>>> my_ints = [1, 2, 3] | 
|  |  | 
|  | >>> [ str(ii) for ii in my_ints ] | 
|  | ['1', '2', '3']</code> | 
| ===== numpy related stuff ===== | ===== numpy related stuff ===== | 
|  |  | 
| array([3. , 4.5, 8. ])</code> | array([3. , 4.5, 8. ])</code> | 
|  |  | 
|  | ==== Exercise your brain with numpy ==== | 
|  |  | 
|  | Have a look at [[https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises.ipynb|100 numpy exercises]] | 
|  |  | 
| ===== matplotlib related stuff ===== | ===== matplotlib related stuff ===== | 
| A few notes for a future section or page about //data representation// (bits and bytes) on disk and in memory, vs //data format// | A few notes for a future section or page about //data representation// (bits and bytes) on disk and in memory, vs //data format// | 
|  |  | 
| FIXME Add parts (pages 28 to 37) of this [[https://wiki.lsce.ipsl.fr/pmip3/doku.php/other:python:jyp_steps#part_2|old tutorial]] to this section | FIXME Add parts (pages 28 to 37) of this [[http://www.lsce.ipsl.fr/Phocea/file.php?class=page&file=5/pythonCDAT_jyp_2sur2_070306.pdf|old tutorial]] to this section | 
|  |  | 
| ==== Base notions ==== | ==== Base notions ==== | 
| * Infinity | * Infinity | 
| * Python: ''-numpy.inf'' and ''numpy.inf'' | * Python: ''-numpy.inf'' and ''numpy.inf'' | 
| * Note: it is cleaner to use masks (and [[https://numpy.org/doc/stable/reference/maskedarray.generic.html|Numpy masked arrays]]) than NaNs, when you have to deal with missing values! | * Note: it is cleaner to use masks (and [[https://numpy.org/doc/stable/reference/maskedarray.generic.html|Numpy masked arrays]]) rather than ''NaN''s, when you have to deal with missing values! | 
| * <wrap hi>The RISKS of working with (the wrong) floats</wrap>: | * <wrap hi>The RISKS of working with (the wrong) floats</wrap>: | 
| * [[https://en.wikipedia.org/wiki/Round-off_error|Round-off error]] | * [[https://en.wikipedia.org/wiki/Round-off_error|Round-off error]] |