This shows you the differences between two versions of the page.
other:python:misc_by_jyp [2022/05/23 15:36] jypeter [numpy related stuff] Added the "variable number of indices" section |
other:python:misc_by_jyp [2023/05/04 15:25] jypeter [Numerical values] Lots of changes |
||
---|---|---|---|
Line 5: | Line 5: | ||
</WRAP> | </WRAP> | ||
- | ==== Reading/setting environment variables ==== | ||
+ | ===== Reading/setting environment variables ===== | ||
<code>>>> os.environ['TMPDIR'] | <code>>>> os.environ['TMPDIR'] | ||
Line 17: | Line 17: | ||
</code> | </code> | ||
- | ==== Generating (aka raising) an error ==== | + | |
+ | ===== Generating (aka raising) an error ===== | ||
This will stop the script, unless it is called in a function, and the code calling the function explicitly catches and deals with errors | This will stop the script, unless it is called in a function, and the code calling the function explicitly catches and deals with errors | ||
Line 25: | Line 26: | ||
- | ==== Stopping a script ==== | + | ===== Stopping a script ===== |
A user can use ''CTRL-C'' or ''kill'' to stop a script, or ''CTRL-Z'' to suspend it temporarily (use ''fg'' to resume a suspended script). The code below can be used by the script itself to interrupt its execution, instead of raising an error | A user can use ''CTRL-C'' or ''kill'' to stop a script, or ''CTRL-Z'' to suspend it temporarily (use ''fg'' to resume a suspended script). The code below can be used by the script itself to interrupt its execution, instead of raising an error | ||
<code>sys.exit('Some optional message about why we are stopping')</code> | <code>sys.exit('Some optional message about why we are stopping')</code> | ||
- | + | ===== Checking if a file/directory is writable by the current user ===== | |
- | + | ||
- | ==== Checking if a file/directory is writable by the current user ==== | + | |
<code>>>> os.access('/', os.W_OK) | <code>>>> os.access('/', os.W_OK) | ||
Line 39: | Line 38: | ||
True</code> | True</code> | ||
- | ==== Playing with strings ==== | ||
- | === Filenames, etc... === | + | ===== Playing with strings ===== |
- | Check [[other:python:misc_by_jyp#working_with_paths_and_filenames|Working with paths and filenames]] and [[other:python:misc_by_jyp#generating_file_names|Generating file names]] | ||
- | === Splitting strings === | + | ==== Splitting (complex) strings ==== |
It's easy to split a string with multiple blank delimiters, or a specific delimiter, but it can be harder to deal with sub-strings | It's easy to split a string with multiple blank delimiters, or a specific delimiter, but it can be harder to deal with sub-strings | ||
Line 64: | Line 61: | ||
>>> shlex.split(complex_string) | >>> shlex.split(complex_string) | ||
['-o', '1', '--long', 'A string with accented chars: \xc3\xa9 \xc3\xa8 \xc3\xa0 \xc3\xa7']</code> | ['-o', '1', '--long', 'A string with accented chars: \xc3\xa9 \xc3\xa8 \xc3\xa0 \xc3\xa7']</code> | ||
+ | |||
+ | |||
==== Working with paths and filenames ==== | ==== Working with paths and filenames ==== | ||
Line 149: | Line 148: | ||
>>> f_tmp.close() | >>> f_tmp.close() | ||
>>> os.remove(f_tmp.name)</code> | >>> os.remove(f_tmp.name)</code> | ||
- | ==== Using command-line arguments ==== | ||
- | === The extremely easy but non-flexible way: sys.argv === | + | |
+ | ===== Using command-line arguments ===== | ||
+ | |||
+ | ==== The extremely easy but non-flexible way: sys.argv ==== | ||
The name of a script, the number of arguments (including the name of the script), and the arguments themselves (as strings) can be obtained from ''sys.argv'', which is a list of strings | The name of a script, the number of arguments (including the name of the script), and the arguments themselves (as strings) can be obtained from ''sys.argv'', which is a list of strings | ||
Line 173: | Line 174: | ||
2 tas_tes.nc</code> | 2 tas_tes.nc</code> | ||
- | === The C-style way: getopt === | + | |
+ | ==== The C-style way: getopt ==== | ||
Use [[https://docs.python.org/3/library/getopt.html|getopt]] (//C-style parser for command line options//) | Use [[https://docs.python.org/3/library/getopt.html|getopt]] (//C-style parser for command line options//) | ||
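A minimal ''getopt'' sketch, for illustration only. The option names (''-v'', ''-o'') and the file names are hypothetical, and the argument list is passed explicitly instead of using ''sys.argv[1:]'' so that the example can be run anywhere.

```python
import getopt


def parse_args(argv):
    """Parse a hypothetical '-v' / '-o FILE' / '--output=FILE' command line."""
    # 'vo:' means: -v takes no value, -o requires a value
    opts, remaining = getopt.getopt(argv, 'vo:', ['verbose', 'output='])
    settings = {'verbose': False, 'output': None}
    for opt, val in opts:
        if opt in ('-v', '--verbose'):
            settings['verbose'] = True
        elif opt in ('-o', '--output'):
            settings['output'] = val
    return settings, remaining


# In a real script, you would probably use: parse_args(sys.argv[1:])
settings, files = parse_args(['-v', '--output=plot.png', 'tas_tes.nc'])
print(settings, files)
```

''getopt.getopt'' raises ''getopt.GetoptError'' when it finds an unknown option, so a real script would usually catch that error and print a usage message.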
- | === The deprecated Python way: optparse === | + | |
+ | ==== The deprecated Python way: optparse ==== | ||
[[https://docs.python.org/3/library/optparse.html|optparse]] (//parser for command line options//) is **deprecated since Python version 3.2**! You should now use argparse (check [[https://docs.python.org/3/library/argparse.html#upgrading-optparse-code|Upgrading optparse code]] for converting from ''optparse'' to ''argparse'') | [[https://docs.python.org/3/library/optparse.html|optparse]] (//parser for command line options//) is **deprecated since Python version 3.2**! You should now use argparse (check [[https://docs.python.org/3/library/argparse.html#upgrading-optparse-code|Upgrading optparse code]] for converting from ''optparse'' to ''argparse'') | ||
- | === The current Python way: argparse === | + | |
+ | ==== The current Python way: argparse ==== | ||
[[https://docs.python.org/3/library/argparse.html|argparse]] (//parser for command-line options, arguments and sub-commands//) is available since Python version 3.2 | [[https://docs.python.org/3/library/argparse.html|argparse]] (//parser for command-line options, arguments and sub-commands//) is available since Python version 3.2 | ||
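A short ''argparse'' sketch, assuming a hypothetical script that takes one input file and two optional settings. The option names and default values are only examples, and the arguments are passed as a list rather than read from the command line.

```python
import argparse


def build_parser():
    """Return a parser for a hypothetical data processing script."""
    parser = argparse.ArgumentParser(description='Hypothetical processing script')
    # Positional (mandatory) argument
    parser.add_argument('input_file', help='file to process')
    # Optional arguments, with automatic type conversion and default values
    parser.add_argument('-n', '--nb-levels', type=int, default=10,
                        help='number of levels (default: %(default)s)')
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='print more information')
    return parser


# Calling parse_args() without arguments would use sys.argv[1:]
args = build_parser().parse_args(['--nb-levels', '5', 'tas_tes.nc'])
print(args.input_file, args.nb_levels, args.verbose)
```

Note that ''argparse'' generates the ''-h''/''--help'' output automatically, which is one reason to prefer it to ''sys.argv'' or ''getopt''.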
- | ==== Using ordered dictionaries ==== | + | |
+ | ===== Using ordered dictionaries ===== | ||
**Dictionary order is guaranteed to be insertion order**! Note that the [[https://docs.python.org/3/library/stdtypes.html#dict|usual Python dictionary]] also guarantees the order since version **3.7** (in CPython 3.6, this was only an implementation detail) | **Dictionary order is guaranteed to be insertion order**! Note that the [[https://docs.python.org/3/library/stdtypes.html#dict|usual Python dictionary]] also guarantees the order since version **3.7** (in CPython 3.6, this was only an implementation detail) | ||
Line 191: | Line 196: | ||
Check the [[https://docs.python.org/3/library/collections.html#collections.OrderedDict|OrderedDict class]] (''from collections import OrderedDict'') and the [[https://realpython.com/python-ordereddict/|OrderedDict vs dict in Python: The Right Tool for the Job]] tutorial | Check the [[https://docs.python.org/3/library/collections.html#collections.OrderedDict|OrderedDict class]] (''from collections import OrderedDict'') and the [[https://realpython.com/python-ordereddict/|OrderedDict vs dict in Python: The Right Tool for the Job]] tutorial | ||
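The small sketch below illustrates two differences that may still justify using an ''OrderedDict'': comparisons take the order into account, and there are methods for reordering the keys. The keys and values are arbitrary.

```python
from collections import OrderedDict

# A plain dict keeps the insertion order
plain = {}
plain['z'] = 1
plain['a'] = 2
plain['m'] = 3
keys_plain = list(plain)
print(keys_plain)

# ... but the order is ignored when comparing plain dictionaries,
# whereas 2 OrderedDict are equal only if the order is the same
first = OrderedDict([('z', 1), ('a', 2)])
second = OrderedDict([('a', 2), ('z', 1)])
same_as_ordered = (first == second)
same_as_plain = (dict(first) == dict(second))
print(same_as_ordered, same_as_plain)

# An OrderedDict can also be reordered in place
first.move_to_end('z')
keys_reordered = list(first)
print(keys_reordered)
```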
- | ==== Using sets ==== | + | |
+ | ===== Using sets ===== | ||
[[https://docs.python.org/3/tutorial/datastructures.html#sets|Python sets]] are **groups of unique elements**. They can be used to easily find all the unique elements of //something// and you can easily determine the **intersection**, **union** (and other similar operations) of sets. | [[https://docs.python.org/3/tutorial/datastructures.html#sets|Python sets]] are **groups of unique elements**. They can be used to easily find all the unique elements of //something// and you can easily determine the **intersection**, **union** (and other similar operations) of sets. | ||
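A quick illustration of the operations mentioned above. The model names are just sample strings.

```python
# Two lists of (hypothetical) model names, the first one with a duplicate
models_a = ['IPSL', 'CNRM', 'MPI', 'IPSL']
models_b = ['MPI', 'MIROC', 'CNRM']

unique_a = set(models_a)             # Duplicates are removed
common = unique_a & set(models_b)    # Intersection
everything = unique_a | set(models_b)  # Union
only_in_a = unique_a - set(models_b)   # Difference

# A set has no order: sort the elements if you need a reproducible output
print(sorted(unique_a))
print(sorted(common))
print(sorted(everything))
print(sorted(only_in_a))
```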
- | ==== Printing a readable version of long lists or dictionaries ==== | + | |
+ | ===== Printing a readable version of long lists or dictionaries ===== | ||
The [[https://docs.python.org/3/library/pprint.html|pprint]] module can be used for //pretty printing// objects (lists, dictionaries, ...). It will wrap long lines in a meaningful way | The [[https://docs.python.org/3/library/pprint.html|pprint]] module can be used for //pretty printing// objects (lists, dictionaries, ...). It will wrap long lines in a meaningful way | ||
Line 229: | Line 236: | ||
</code> | </code> | ||
- | ==== Sorting ==== | + | |
+ | ===== Storing objects and data in a file (shelve and friends) ===== | ||
+ | |||
+ | The built-in [[other:python:jyp_steps#the_shelve_package|shelve]] module can be **easily** used for storing temporary/intermediate data | ||
+ | |||
+ | More options: | ||
+ | * Some [[other:python:jyp_steps#data_file_formats|non-NetCDF]] file formats | ||
+ | * Working with [[other:python:jyp_steps#netcdf_filesusing_cdms2_xarray_and_netcdf4|NetCDF]] files | ||
+ | |||
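A minimal ''shelve'' sketch, assuming you just want to save a few intermediate Python objects and read them back later. The file name and the stored values are hypothetical, and a temporary directory is used so that the example leaves nothing behind.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # shelve may add an extension (or create several files), depending on the system
    shelf_path = os.path.join(tmp_dir, 'intermediate_results')

    # Store some objects: a shelf behaves like a dictionary with string keys
    with shelve.open(shelf_path) as shelf:
        shelf['levels'] = [0, 10, 20]
        shelf['metadata'] = {'variable': 'tas', 'units': 'K'}

    # Later (possibly in another script): read the objects back
    with shelve.open(shelf_path) as shelf:
        restored_levels = shelf['levels']
        restored_keys = sorted(shelf.keys())

print(restored_levels, restored_keys)
```

The stored objects are pickled, so a shelf should only be opened when it comes from a trusted source.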
+ | |||
+ | ===== Using a configuration file ===== | ||
+ | |||
+ | The built-in [[https://docs.python.org/3/library/configparser.html|configparser]] module can be easily used for reading (**and** writing!) text configuration files. | ||
+ | |||
+ | Note: a configuration file is also an easy way to store and exchange text data! | ||
+ | |||
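A short ''configparser'' sketch showing both writing and reading. The section and option names are hypothetical, and the configuration is written to a string buffer instead of a real file to keep the example self-contained.

```python
import configparser
import io

# Build a configuration (all values are stored as strings)
config = configparser.ConfigParser()
config['plot'] = {'title': 'Surface temperature', 'nb_levels': '12'}
config['paths'] = {'output_dir': '/tmp/plots'}

# Write it: with a real file, use 'with open(some_path, "w") as f: config.write(f)'
buffer = io.StringIO()
config.write(buffer)
config_text = buffer.getvalue()
print(config_text)

# Read it back: with a real file, use 'restored.read(some_path)'
restored = configparser.ConfigParser()
restored.read_string(config_text)
title = restored['plot']['title']
nb_levels = restored.getint('plot', 'nb_levels')  # Explicit conversion to int
print(title, nb_levels)
```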
+ | |||
+ | ===== Working with global variables ===== | ||
+ | |||
+ | There is a good chance you don't actually want/need a //global// variable. Be sure to use the ''global'' statement correctly if you want to avoid side-effects... | ||
+ | |||
+ | * [[https://docs.python.org/3/faq/programming.html?highlight=global#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value|Using (and changing) a global variable inside a script or module]] | ||
+ | * Simple module example\\ <code>_myvar = 10 | ||
+ | |||
+ | def set_myvar(new_val): | ||
+ | # Note: need to explicitly define a global variable (of a module) | ||
+ | # as 'global' BEFORE changing its value in a function! | ||
+ | # Otherwise, the value will not be REdefined outside the function | ||
+ | global _myvar | ||
+ | _myvar = new_val | ||
+ | |||
+ | def get_myvar(): | ||
+ | return _myvar | ||
+ | |||
+ | def myfunc(nb_repeat = 10): | ||
+ | print(nb_repeat * _myvar)</code> | ||
+ | * [[https://docs.python.org/3/faq/programming.html?highlight=global#how-do-i-share-global-variables-across-modules|Sharing global variables across modules]] | ||
+ | ===== Sorting ===== | ||
* When dealing with **numerical values**, you should use the [[https://numpy.org/doc/stable/reference/routines.sort.html|numpy sorting, searching, and counting routines]]! | * When dealing with **numerical values**, you should use the [[https://numpy.org/doc/stable/reference/routines.sort.html|numpy sorting, searching, and counting routines]]! | ||
Line 246: | Line 290: | ||
['c', 'd', 'b', 'a']</code> | ['c', 'd', 'b', 'a']</code> | ||
- | ==== numpy related stuff ==== | + | ===== numpy related stuff ===== |
- | === Dealing with a variable number of indices === | + | ==== Using a numpy array to store arbitrary objects ==== |
+ | |||
+ | The numpy arrays are usually used to store [[https://numpy.org/doc/stable/reference/arrays.scalars.html|scalars]] of the same type (see also the [[https://numpy.org/doc/stable/reference/arrays.dtypes.html|Data type objects (dtype)]]), very often numerical values. | ||
+ | |||
+ | It is also possible to store **arbitrary** Python objects in an array, rather than using nested lists or dictionaries! | ||
+ | |||
+ | <code>>>> some_array = np.empty((2, 3), dtype=object) | ||
+ | >>> some_array | ||
+ | array([[None, None, None], | ||
+ | [None, None, None]], dtype=object) | ||
+ | >>> some_array.shape | ||
+ | (2, 3) | ||
+ | >>> print(some_array[-1, -1]) | ||
+ | None | ||
+ | >>> some_array[-1, 0] = filled_contour # e.g. save an existing cartopy filled contour object | ||
+ | >>> some_array | ||
+ | array([[None, None, None], | ||
+ | [<cartopy.mpl.contour.GeoContourSet object at 0x2ab679e8bf10>, | ||
+ | None, None]], dtype=object)</code> | ||
+ | |||
+ | |||
+ | ==== Dealing with a variable number of indices ==== | ||
[[https://numpy.org/doc/stable/user/basics.indexing.html#dealing-with-variable-indices|Official reference]] | [[https://numpy.org/doc/stable/user/basics.indexing.html#dealing-with-variable-indices|Official reference]] | ||
Line 291: | Line 356: | ||
(4, 10) | (4, 10) | ||
- | >>> # WARNING! WARNING! A slice is a VIEW and NOT A COPY | + | >>> # WARNING! DANGERRRR! NEVER forget that a VIEW is NOT A COPY |
- | >>> i10[my_fancy_slices] = -1 | + | >>> # and that you can change the content of the original array by mistake |
+ | >>> my_view = i10[my_slices] | ||
+ | >>> my_view[:, :] = -1 | ||
+ | >>> my_view | ||
+ | array([[-1., -1.], | ||
+ | [-1., -1.], | ||
+ | [-1., -1.], | ||
+ | [-1., -1.]]) | ||
>>> i10 | >>> i10 | ||
array([[ 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.], | array([[ 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.], | ||
[ 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.], | [ 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.], | ||
[ 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.], | [ 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.], | ||
- | [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.], | + | [ 0., 0., 0., 1., -1., -1., 0., 0., 0., 0.], |
- | [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.], | + | [ 0., 0., 0., 0., -1., -1., 0., 0., 0., 0.], |
- | [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.], | + | [ 0., 0., 0., 0., -1., -1., 0., 0., 0., 0.], |
- | [-1., -1., -1., -1., -1., -1., -1., -1., -1., -1.], | + | [ 0., 0., 0., 0., -1., -1., 1., 0., 0., 0.], |
[ 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.], | [ 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.], | ||
[ 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.], | [ 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.], | ||
- | [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]]) | + | [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])</code> |
- | </code> | + | |
- | === Finding and counting unique values === | + | |
+ | ==== Finding and counting unique values ==== | ||
Use ''np.unique'', do **not** try to use histogram related functions! | Use ''np.unique'', do **not** try to use histogram related functions! | ||
Line 326: | Line 398: | ||
array([1. , 1. , 1. , 1.5, 1.5, 1.5, 2. , 2. , 2. , 2. ])</code> | array([1. , 1. , 1. , 1.5, 1.5, 1.5, 2. , 2. , 2. , 2. ])</code> | ||
- | === Applying a ufunc over all the elements of an array === | + | |
+ | ==== Applying a ufunc over all the elements of an array ==== | ||
There are all sorts of //ufuncs// (Universal Functions), and we will just use below ''add'' from the [[https://numpy.org/doc/stable/reference/ufuncs.html#math-operations|math operations]], applied on the arrays defined in [[#finding_and_counting_unique_values|Finding and counting unique values]] | There are all sorts of //ufuncs// (Universal Functions), and we will just use below ''add'' from the [[https://numpy.org/doc/stable/reference/ufuncs.html#math-operations|math operations]], applied on the arrays defined in [[#finding_and_counting_unique_values|Finding and counting unique values]] | ||
Line 359: | Line 432: | ||
(3.0, 4.5, 8.0)</code> | (3.0, 4.5, 8.0)</code> | ||
- | === Applying a ufunc over specified sections of an array === | + | |
+ | ==== Applying a ufunc over specified sections of an array ==== | ||
The [[https://numpy.org/doc/stable/reference/generated/numpy.ufunc.reduceat.html#numpy.ufunc.reduceat|reduceat]] function can be used to avoid explicit python loops, and improve the speed (but not the readability...) of a script. The example below //improves// what has been shown above | The [[https://numpy.org/doc/stable/reference/generated/numpy.ufunc.reduceat.html#numpy.ufunc.reduceat|reduceat]] function can be used to avoid explicit python loops, and improve the speed (but not the readability...) of a script. The example below //improves// what has been shown above | ||
Line 377: | Line 451: | ||
>>> np.add.reduceat(np.sort(vals), slices_indices) | >>> np.add.reduceat(np.sort(vals), slices_indices) | ||
array([3. , 4.5, 8. ])</code> | array([3. , 4.5, 8. ])</code> | ||
+ | |||
+ | |||
+ | ===== matplotlib related stuff ===== | ||
+ | |||
+ | ==== Working with time axes (and ticks) ==== | ||
+ | |||
+ | If you have problems setting the limits of a time axis, choosing the ticks' locations, or specifying the style of the labels, you should check the: | ||
+ | * [[https://matplotlib.org/stable/gallery/index.html#ticks|Ticks examples' gallery]] | ||
+ | * [[https://matplotlib.org/stable/gallery/text_labels_and_annotations/date.html|Date tick labels example]] | ||
+ | |||
+ | |||
+ | ===== Data representation ===== | ||
+ | |||
+ | A few notes for a future section or page about //data representation// (bits and bytes) on disk and in memory, vs //data format// | ||
+ | |||
+ | FIXME Add parts (pages 28 to 37) of this [[https://wiki.lsce.ipsl.fr/pmip3/doku.php/other:python:jyp_steps#part_2|old tutorial]] to this section | ||
+ | |||
+ | ==== Base notions ==== | ||
+ | |||
+ | * **Never forget** that all the bits and pieces of information we use are coded in [[https://en.wikipedia.org/wiki/Binary_number#Counting_in_binary|base 2]] (''0''s and ''1''s ...), grouped in bytes! | ||
+ | * Some things can be stored exactly (integers, characters, ...) | ||
+ | * In other cases (**//real// numbers** that we work with all the time, compressed images/videos/music) we only store a **//good enough approximation//** | ||
+ | |||
+ | * 1 byte <=> 8 bits | ||
+ | * ''REAL*4'' <=> 4 bytes <=> 32 bits | ||
+ | * For easier written/displayed representation, 1 byte is usually split into 2 groups of 4 bits, and displayed using base 16 and [[https://en.wikipedia.org/wiki/Hexadecimal|hexadecimal representation]] (characters ''0'', ''1'', ..., ''A'', ''B'', ..., ''F'') | ||
+ | * ''0000'' <=> ''0'',\\ ''0001'' <=> ''1'', ...,\\ ''1111'' <=> ''F'' | ||
+ | * ''1101'' <=> ''D'' in hexadecimal <=> ''13'' in decimal (''**1** * 8 + **1** * 4 + **0** * 2 + **1** * 1'') | ||
+ | * ''11111101'' in //base 2// <=> ''1111 1101'' <=> ''FD'' in //hexadecimal// <=> ''253'' (''15 * 16 + 13'') in //decimal// | ||
+ | |||
+ | * Base conversion with Python | ||
+ | * <code>>>> hex(13) # Decimal to Hexadecimal conversion | ||
+ | '0xd' | ||
+ | >>> hex(253) | ||
+ | '0xfd' | ||
+ | >>> hex(256) | ||
+ | '0x100' | ||
+ | >>> int('0x100', 16) # Hexadecimal to Decimal conversion | ||
+ | 256 | ||
+ | >>> int('1111', 2) # Binary to Decimal conversion | ||
+ | 15 | ||
+ | >>> int('11111101', 2) # '11111101' <=> '1111 1101' <=> 'FD' <=> 15 * 16 + 13 = 253 | ||
+ | 253 | ||
+ | >>> 0o13 # DANGER! The 0o prefix means OCTAL base (a literal with a plain leading 0, such as 013, was octal in Python 2 and is a SyntaxError in Python 3) | ||
+ | 11 | ||
+ | >>> int('13', 8) # 1*8 + 3 | ||
+ | 11</code> | ||
+ | |||
+ | * More technical topics | ||
+ | * [[https://en.wikipedia.org/wiki/Bit_numbering|Bit numbering]]: the art of ordering bits, everything about MSB (Most Significant Bit) and LSB (Least Significant Bit) | ||
+ | * [[https://en.wikipedia.org/wiki/Endianness|Endianness]]: the art of ordering bytes | ||
+ | ==== Numerical values ==== | ||
+ | |||
+ | * Binary data representation of some numbers (only some common types are listed here): | ||
+ | * Languages and packages **references** used below: | ||
+ | * Python: [[https://numpy.org/doc/stable/reference/arrays.scalars.html#sized-aliases|NumPy Sized aliases]] | ||
+ | * NetCDF: [[https://docs.unidata.ucar.edu/nug/current/md_types.html|Data Types]], [[https://docs.unidata.ucar.edu/netcdf-fortran/current/f90-variables.html#f90-language-types-corresponding-to-netcdf-external-data-types|Fortran related Data Types]], [[https://docs.unidata.ucar.edu/nug/current/_c_d_l.html#cdl_data_types|CDL Data Types]] | ||
+ | * Fortran: Intel Fortran Compiler [[https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/2023-1/intrinsic-data-types.html|Intrinsic Data Types]] | ||
+ | * [[https://en.wikipedia.org/wiki/Integer_(computer_science)|Integers]] | ||
+ | * Range: | ||
+ | * 4-byte //signed// integers: ''−2,147,483,648'' to ''2,147,483,647'' | ||
+ | * Python: ''numpy.int32'' | ||
+ | * NetCDF: ''int'', ''NC_INT'' or ''NC_LONG'', ''NF90_INT'' | ||
+ | * Fortran: ''INTEGER*4'' | ||
+ | * 8-byte //signed// integers: ''−9,223,372,036,854,775,808'' to ''9,223,372,036,854,775,807'' | ||
+ | * Python: ''numpy.int64'' | ||
+ | * NetCDF: ''int64'', ''NC_INT64'' | ||
+ | * Fortran: ''INTEGER*8'' | ||
+ | * Tech note: signed integers use [[https://en.wikipedia.org/wiki/Two%27s_complement|two's complement]] for coding negative integers | ||
+ | * [[https://en.wikipedia.org/wiki/IEEE_754|Floating point numbers]] (//IEEE 754// standard aka //IEEE Standard for Binary Floating-Point Arithmetic//) | ||
+ | * Range: | ||
+ | * 4-byte float: ''~7 significant decimal digits'', largest magnitude ''~3.4 * 10^38'' | ||
+ | * Python: ''numpy.float32'' | ||
+ | * NetCDF: ''float'', ''NC_FLOAT'', ''NF90_FLOAT'' | ||
+ | * Fortran: ''REAL*4'' | ||
+ | * See also [[https://en.wikipedia.org/wiki/Single-precision_floating-point_format|Single-precision floating-point format]] | ||
+ | * 8-byte float: ''~15 significant decimal digits'', largest magnitude ''~1.8 * 10^308'' | ||
+ | * Python: ''numpy.float64'' | ||
+ | * NetCDF: ''double'', ''NC_DOUBLE'', ''NF90_DOUBLE'' | ||
+ | * Fortran: ''REAL*8'' | ||
+ | * **Special values**: | ||
+ | * [[https://en.wikipedia.org/wiki/NaN|NaN]]: //Not a Number// | ||
+ | * Python: ''numpy.nan'' | ||
+ | * Infinity | ||
+ | * Python: ''-numpy.inf'' and ''numpy.inf'' | ||
+ | * Note: when you have to deal with missing values, it is cleaner to use masks (and [[https://numpy.org/doc/stable/reference/maskedarray.generic.html|Numpy masked arrays]]) than NaNs! | ||
+ | * <wrap hi>The RISKS of working with (the wrong) floats</wrap>: | ||
+ | * [[https://en.wikipedia.org/wiki/Round-off_error|Round-off error]] | ||
+ | * [[https://en.wikipedia.org/wiki/Catastrophic_cancellation|Catastrophic cancellation]] | ||
+ | * [[https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html|What Every Computer Scientist Should Know About Floating-Point Arithmetic]] | ||
+ | * A rather technical example: we //play// with a numpy 4-byte integer scalar | ||
+ | * <code>>>> one_int32 = np.int32(1) | ||
+ | >>> one_int32 | ||
+ | 1 | ||
+ | >>> type(one_int32) | ||
+ | <class 'numpy.int32'> | ||
+ | >>> one_int32.dtype | ||
+ | dtype('int32') | ||
+ | >>> one_int32.shape # A numpy SCALAR, is an ARRAY WITH NO SHAPE ! | ||
+ | () | ||
+ | >>> one_int32[0] | ||
+ | Traceback (most recent call last): | ||
+ | File "<stdin>", line 1, in <module> | ||
+ | IndexError: invalid index to scalar variable. | ||
+ | >>> one_int32[()] # Note how to access the single element, when there is NO SHAPE | ||
+ | 1 | ||
+ | >>> one_int32.ndim # NO SHAPE means no dimensions, but there is ONE element | ||
+ | 0 | ||
+ | >>> one_int32.size | ||
+ | 1 | ||
+ | >>> one_int32.nbytes # The element requires 4 bytes of storage | ||
+ | 4 | ||
+ | >>> hex(one_int32) # We can print the hexadecimal representation of INTEGER scalars | ||
+ | '0x1' | ||
+ | >>> hex(one_int32 * 15) | ||
+ | '0xf' | ||
+ | >>> hex(one_int32 * 16) | ||
+ | '0x10' | ||
+ | |||
+ | # 'Serialize' the data (i.e. change the data to a series of bytes) | ||
+ | # Note: on a little-endian machine (e.g. x86), the least significant byte is stored first, | ||
+ | # so the serialized bytes are printed in the reverse order of 'hex(one_int32)' | ||
+ | >>> one_int32_serialized = one_int32.tobytes() | ||
+ | >>> type(one_int32_serialized) | ||
+ | <class 'bytes'> | ||
+ | >>> len(one_int32_serialized) | ||
+ | 4 | ||
+ | >>> one_int32_serialized | ||
+ | b'\x01\x00\x00\x00' | ||
+ | >>> one_int32_serialized.hex(' ') # Another way to print the hexadecimal values | ||
+ | '01 00 00 00' | ||
+ | |||
+ | # Use the following in the unlikely case where you need to change the endianness (bytes ordering) | ||
+ | >>> one_int32_reversed_endian = one_int32.byteswap() | ||
+ | >>> one_int32_reversed_endian # Same bytes in a different order represent a different number (of course) | ||
+ | 16777216 | ||
+ | >>> hex(one_int32_reversed_endian) # Compare to the output of hex(one_int32) above | ||
+ | '0x1000000' | ||
+ | >>> one_int32_reversed_endian.tobytes() | ||
+ | b'\x00\x00\x00\x01'</code> | ||
+ | * Another technical example: we use an array of 2 integers\\ When using ''byteswap()'', notice how bytes are swapped by groups of 4 bytes, because each int32 uses 4 bytes | ||
+ | * <code>>>> array_example = np.asarray((3, 17), dtype=np.int32) | ||
+ | >>> array_example | ||
+ | array([ 3, 17], dtype=int32) | ||
+ | >>> array_example.shape, array_example.ndim, array_example.size, array_example.nbytes | ||
+ | ((2,), 1, 2, 8) | ||
+ | >>> array_example.tobytes().hex(' ', 4) | ||
+ | '03000000 11000000' | ||
+ | >>> array_example.byteswap().tobytes().hex(' ', 4) | ||
+ | '00000003 00000011' | ||
+ | </code> | ||
+ | |||
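The integer ranges and the floating point risks listed above can be checked with the standard library only. This is a small sketch: the ''struct'' module is used here as a stand-in for 4-byte storage (it is not what numpy or NetCDF use internally), and the sample values are arbitrary.

```python
import math
import struct
import sys

# Range of 4-byte signed integers (two's complement)
int32_min, int32_max = -2**31, 2**31 - 1
print(int32_min, int32_max)

# A value outside of this range does not fit in 4 bytes
try:
    struct.pack('<i', int32_max + 1)
    overflow_detected = False
except struct.error:
    overflow_detected = True

# Characteristics of the 8-byte floats used by Python
print(sys.float_info.max, sys.float_info.dig)

# Round-off error: 0.1, 0.2 and 0.3 can't be stored exactly in base 2
exact_equality = (0.1 + 0.2 == 0.3)
close_enough = math.isclose(0.1 + 0.2, 0.3)
print(exact_equality, close_enough)

# Storing a value in a 4-byte float loses significant digits
as_float32 = struct.unpack('<f', struct.pack('<f', 0.1))[0]
print(as_float32)

# Cancellation: the '+ 1.0' is lost because 1e16 needs all the available digits
lost_increment = (1e16 + 1.0) - 1e16
print(lost_increment)
```

In practice, this means that floats should be compared with a tolerance (''math.isclose'', ''numpy.isclose''), and that 4-byte floats should be used with care when accumulating many values.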
+ | * Manipulating binary data with [[https://docs.python.org/3/library/stdtypes.html#binary-sequence-types-bytes-bytearray-memoryview|bytes, bytearray, memoryview]] | ||
+ | |||
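A brief sketch of the three binary sequence types, reusing the idea of the little-endian int32 examples above. The byte values are chosen by hand for the example.

```python
# bytes: an IMMUTABLE sequence of bytes (here: the int32 values 1 and 17, little-endian)
raw = bytes([0x01, 0x00, 0x00, 0x00, 0x11, 0x00, 0x00, 0x00])
first_value = int.from_bytes(raw[:4], byteorder='little')
second_value = int.from_bytes(raw[4:], byteorder='little')
print(first_value, second_value)

# bytearray: a MUTABLE copy of the same bytes
buffer = bytearray(raw)
buffer[0] = 0x03  # Changes the first integer, but not the 'raw' object

# memoryview: a VIEW (not a copy!) of the bytearray
view = memoryview(buffer)
view[4:8] = (255).to_bytes(4, byteorder='little')

# The bytearray has been changed through the view
updated_first = int.from_bytes(buffer[:4], byteorder='little')
updated_second = int.from_bytes(buffer[4:8], byteorder='little')
print(updated_first, updated_second)
print(buffer.hex())
```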
+ | * Array addressing | ||
+ | * [[https://www.geeksforgeeks.org/calculation-of-address-of-element-of-1-d-2-d-and-3-d-using-row-major-and-column-major-order/|Calculation of address of element of 1-D, 2-D, and 3-D using row-major and column-major order]] | ||
+ | * In other words: //using indices to go from 1-D to n-dimensional data// | ||
+ | * The [[https://en.wikipedia.org/wiki/Array_(data_structure)|array]] structure | ||
+ | * python/C vs Fortran... | ||
+ | |||
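The addressing computation mentioned above can be sketched in pure Python. The two helper functions are hypothetical names written for this example, and indices are 0-based in both cases (Fortran code would usually use 1-based indices).

```python
def flat_index_row_major(indices, shape):
    """Position of an element when the LAST index varies fastest (C, Python/numpy default)."""
    offset = 0
    for idx, dim in zip(indices, shape):
        offset = offset * dim + idx
    return offset


def flat_index_column_major(indices, shape):
    """Position of an element when the FIRST index varies fastest (Fortran)."""
    offset = 0
    for idx, dim in zip(reversed(indices), reversed(shape)):
        offset = offset * dim + idx
    return offset


shape = (2, 3, 4)
element = (1, 0, 2)

# Row-major: (1 * 3 + 0) * 4 + 2
row_major_position = flat_index_row_major(element, shape)
# Column-major: 1 + 2 * (0 + 3 * 2)
column_major_position = flat_index_column_major(element, shape)
print(row_major_position, column_major_position)
```

With numpy, ''np.ravel_multi_index'' and the ''order='C''' or ''order='F''' arguments should give the same kind of result without writing the loops.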
+ | * disk and ram usage: how to check the usage (available ram and disk), best practice on multi-user systems (how much allowed?) | ||
+ | * ''du'', ''df'', ''cat /proc/meminfo'', ''top'' | ||
+ | |||
+ | * understanding and reverse-engineering //binary// format | ||
+ | * ''od'', ''strings'' | ||
+ | |||
+ | * binary vs text format: ascii, utf, raw | ||
+ | * text related functions in python: ''str'', ''int'', ''float'', ''ord'', ... | ||
+ | * lists conversion with ''map'' and ''join'' | ||
+ | |||
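A few of the text related functions listed above, in a minimal example with arbitrary values.

```python
# Numbers to text: convert each value with str, then join the strings
values = [1.5, 2, 3.25]
line = ' '.join(map(str, values))
print(repr(line))

# Text to numbers: split the string, then convert each item with float
values_back = list(map(float, line.split()))
print(values_back)

# Converting single items
answer = int('42')
small = float('1e-3')
print(answer, small)

# Characters and their integer code
code_of_a = ord('A')
char_65 = chr(65)
print(code_of_a, char_65)
```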
+ | * Misc : ''md5sum'' | ||
+ | |||
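The checksum printed by ''md5sum'' can also be computed from Python with the built-in ''hashlib'' module. This is only a sketch: ''md5_of_file'' is a hypothetical helper, and a small temporary file is created just for the demonstration.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hexadecimal digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, 'rb') as f_in:
        # Read by chunks, in order to deal with files bigger than the memory
        for chunk in iter(lambda: f_in.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


# Create a small temporary file for the test
with tempfile.NamedTemporaryFile(delete=False) as f_tmp:
    f_tmp.write(b'hello\n')

checksum = md5_of_file(f_tmp.name)
os.remove(f_tmp.name)
print(checksum)
```

The returned string should be the same as the first column of the ''md5sum some_file'' output, which makes it easy to check that a file was not changed or corrupted during a transfer.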
+ | ==== Strings ==== | ||
+ | |||
+ | * Encoding, [[https://en.wikipedia.org/wiki/ASCII|ASCII]], [[https://en.wikipedia.org/wiki/Unicode|unicode]], [[https://en.wikipedia.org/wiki/UTF-8|UTF-8]], ... | ||
+ | |||
+ | * Getting the binary representation of a string | ||
+ | * <code>>>> test_string = 'A B 0 1 à µ' | ||
+ | >>> type(test_string) | ||
+ | <class 'str'> | ||
+ | >>> len(test_string) | ||
+ | 11 | ||
+ | >>> test_string_bin = test_string.encode('utf-8') | ||
+ | >>> test_string_bin | ||
+ | b'A B 0 1 \xc3\xa0 \xc2\xb5' | ||
+ | >>> type(test_string_bin) | ||
+ | <class 'bytes'> | ||
+ | >>> len(test_string_bin) | ||
+ | 13 | ||
+ | >>> test_string_bin.hex('-') | ||
+ | '41-20-42-20-30-20-31-20-c3-a0-20-c2-b5' | ||
+ | </code> | ||
+ | |||
/* | /* | ||
- | ==== Tip template ==== | + | ===== Tip template ===== |
<code>Some code</code> | <code>Some code</code> |