other:python:misc_by_jyp [2021/10/27 16:05] – Added sets jypeter | other:python:misc_by_jyp [2023/05/04 11:46] – [Data representation] Added the Base notions section jypeter |
---|
</WRAP> | </WRAP> |
| |
==== Reading/setting environment variables ==== | |
| |
| ===== Reading/setting environment variables ===== |
| |
<code>>>> os.environ['TMPDIR'] | <code>>>> os.environ['TMPDIR'] |
</code> | </code> |
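A minimal sketch of reading **and** setting a variable, assuming a hypothetical ''MY_DEMO_VAR'' name and an arbitrary ''/tmp'' fallback (neither comes from the original page):

```python
import os

# Read a variable, falling back to a default when it is not defined
# ('/tmp' is an arbitrary fallback chosen for this example)
tmp_dir = os.environ.get('TMPDIR', '/tmp')

# Set a variable for the current process (and the child processes it starts).
# Values must be strings. 'MY_DEMO_VAR' is a hypothetical name
os.environ['MY_DEMO_VAR'] = '42'
demo_value = int(os.environ['MY_DEMO_VAR'])

# Remove the variable; the second argument avoids a KeyError if it is missing
os.environ.pop('MY_DEMO_VAR', None)
is_still_defined = 'MY_DEMO_VAR' in os.environ
```

Using ''os.environ.get'' rather than ''os.environ[...]'' avoids a ''KeyError'' when the variable is not defined.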
| |
==== Generating (aka raising) an error ==== | |
| ===== Generating (aka raising) an error ===== |
| |
This will stop the script, unless it is called in a function, and the code calling the function explicitly catches and deals with errors | This will stop the script, unless it is called in a function, and the code calling the function explicitly catches and deals with errors |
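The sketch below illustrates the sentence above with a hypothetical ''safe_ratio'' helper (not part of the original page): the function raises an error, and the calling code chooses to catch it instead of letting the script stop.

```python
def safe_ratio(numerator, denominator):
    """Hypothetical helper that refuses to divide by zero."""
    if denominator == 0:
        raise ValueError('denominator must not be zero')
    return numerator / denominator


try:
    result = safe_ratio(1, 0)
except ValueError as err:
    # The caller deals with the error, so the script keeps running
    result = None
    message = str(err)
```

Without the ''try'' / ''except'' block, the ''ValueError'' would propagate up and stop the script with a traceback.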
| |
| |
==== Stopping a script ==== | ===== Stopping a script ===== |
| |
A user can use ''CTRL-C'' or ''kill'' to stop a script, or ''CTRL-Z'' to suspend it temporarily (use ''fg'' to resume a suspended script). The code below can be used by the script itself to interrupt its execution, instead of raising an error | A user can use ''CTRL-C'' or ''kill'' to stop a script, or ''CTRL-Z'' to suspend it temporarily (use ''fg'' to resume a suspended script). The code below can be used by the script itself to interrupt its execution, instead of raising an error |
| |
<code>sys.exit('Some optional message about why we are stopping')</code> | <code>sys.exit('Some optional message about why we are stopping')</code> |
| ===== Checking if a file/directory is writable by the current user ===== |
| |
==== Checking if a file/directory is writable by the current user ==== | |
| |
<code>>>> os.access('/', os.W_OK) | <code>>>> os.access('/', os.W_OK) |
>>> os.access('/home/jypmce/.bashrc', os.W_OK) | >>> os.access('/home/jypmce/.bashrc', os.W_OK) |
True</code> | True</code> |
| |
| |
| ===== Playing with strings ===== |
| |
| |
| ==== Splitting (complex) strings ==== |
| |
| It's easy to split a string with multiple blank delimiters, or a specific delimiter, but it can be harder to deal with sub-strings |
| |
| <code>>>> str_with_blanks = 'one two\t3\t\tFOUR' |
| >>> str_with_blanks.split() |
| ['one', 'two', '3', 'FOUR'] |
| |
| >>> str_with_simple_delimiters = '1,2,3.14, 4' |
| >>> str_with_simple_delimiters.split(',') |
| ['1', '2', '3.14', ' 4'] |
| |
| >>> complex_string='-o 1 --long "A string with accented chars: é è à ç"' |
| >>> complex_string.split() |
| ['-o', '1', '--long', '"A', 'string', 'with', 'accented', 'chars:', 'é', 'è', 'à', 'ç"'] |
| |
| >>> import shlex |
| >>> shlex.split(complex_string) |
| ['-o', '1', '--long', 'A string with accented chars: é è à ç']</code> |
| |
| |
==== Working with paths and filenames ==== | ==== Working with paths and filenames ==== |
>>> f_tmp.close() | >>> f_tmp.close() |
>>> os.remove(f_tmp.name)</code> | >>> os.remove(f_tmp.name)</code> |
==== Using command-line arguments ==== | |
| |
=== The extremely easy but non-flexible way: sys.argv === | |
| ===== Using command-line arguments ===== |
| |
| ==== The extremely easy but non-flexible way: sys.argv ==== |
| |
The name of a script, the number of arguments (including the name of the script), and the arguments (as strings) can be accessed through the ''sys.argv'' list of strings | The name of a script, the number of arguments (including the name of the script), and the arguments (as strings) can be accessed through the ''sys.argv'' list of strings |
2 tas_tes.nc</code> | 2 tas_tes.nc</code> |
| |
=== The C-style way: getopt === | |
| ==== The C-style way: getopt ==== |
| |
Use [[https://docs.python.org/3/library/getopt.html|getopt]] (//C-style parser for command line options//) | Use [[https://docs.python.org/3/library/getopt.html|getopt]] (//C-style parser for command line options//) |
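A short sketch of what a ''getopt'' call can look like. The option names and the ''fake_argv'' list are made up for this example; a real script would pass ''sys.argv[1:]'' instead.

```python
import getopt

# Hypothetical command line: a short flag, a short option with a value,
# a long option with a value, and one positional argument
fake_argv = ['-v', '-o', 'out.nc', '--long', 'some text', 'tas_test.nc']

# 'vo:'   : -v takes no value, -o requires one (trailing colon)
# 'long=' : --long requires a value (trailing equal sign)
opts, remaining_args = getopt.getopt(fake_argv, 'vo:', ['long='])

# opts is a list of (option, value) pairs, in the order they were given
opts_as_dict = dict(opts)
```

The parsing stops at the first non-option argument, which ends up in ''remaining_args''.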
| |
=== The deprecated Python way: optparse === | |
| ==== The deprecated Python way: optparse ==== |
| |
[[https://docs.python.org/3/library/optparse.html|optparse]] (//parser for command line options//) is **deprecated since Python version 3.2**! You should now use argparse (check [[https://docs.python.org/3/library/argparse.html#upgrading-optparse-code|Upgrading optparse code]] for converting from ''optparse'' to ''argparse'') | [[https://docs.python.org/3/library/optparse.html|optparse]] (//parser for command line options//) is **deprecated since Python version 3.2**! You should now use argparse (check [[https://docs.python.org/3/library/argparse.html#upgrading-optparse-code|Upgrading optparse code]] for converting from ''optparse'' to ''argparse'') |
| |
=== The current Python way: argparse === | |
| ==== The current Python way: argparse ==== |
| |
[[https://docs.python.org/3/library/argparse.html|argparse]] (//parser for command-line options, arguments and sub-commands//) is available since Python version 3.2 | [[https://docs.python.org/3/library/argparse.html|argparse]] (//parser for command-line options, arguments and sub-commands//) is available since Python version 3.2 |
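A minimal ''argparse'' sketch, assuming a hypothetical script that takes one input file, an optional number of steps and a verbosity flag (these names are only illustrations):

```python
import argparse

parser = argparse.ArgumentParser(description='Hypothetical demo script')
parser.add_argument('input_file', help='file to process')
parser.add_argument('-n', '--nb-steps', type=int, default=1,
                    help='number of steps (default: %(default)s)')
parser.add_argument('-v', '--verbose', action='store_true',
                    help='print more details')

# An explicit list is parsed here to keep the example self-contained.
# In a real script, call parser.parse_args() without argument
# in order to use sys.argv[1:]
args = parser.parse_args(['-n', '2', 'tas_test.nc'])
```

Note that ''--nb-steps'' is accessed as ''args.nb_steps'' (the dash becomes an underscore), and that ''type=int'' takes care of the conversion from string.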
| |
==== Using ordered dictionaries ==== | |
| ===== Using ordered dictionaries ===== |
| |
**Dictionary order is guaranteed to be insertion order**! Note that the [[https://docs.python.org/3/library/stdtypes.html#dict|usual Python dictionary]] also guarantees the order since version **3.7** (this was already the behavior of CPython **3.6**, but only as an implementation detail) | **Dictionary order is guaranteed to be insertion order**! Note that the [[https://docs.python.org/3/library/stdtypes.html#dict|usual Python dictionary]] also guarantees the order since version **3.7** (this was already the behavior of CPython **3.6**, but only as an implementation detail) |
Check the [[https://docs.python.org/3/library/collections.html#collections.OrderedDict|OrderedDict class]] (''from collections import OrderedDict'') and the [[https://realpython.com/python-ordereddict/|OrderedDict vs dict in Python: The Right Tool for the Job]] tutorial | Check the [[https://docs.python.org/3/library/collections.html#collections.OrderedDict|OrderedDict class]] (''from collections import OrderedDict'') and the [[https://realpython.com/python-ordereddict/|OrderedDict vs dict in Python: The Right Tool for the Job]] tutorial |
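The following sketch (with made-up keys) shows the insertion order of a plain dictionary, and two things that remain specific to ''OrderedDict'': ''move_to_end'' and order-sensitive equality.

```python
from collections import OrderedDict

plain = {}
plain['z'] = 1
plain['a'] = 2
plain_keys = list(plain)      # Insertion order, NOT alphabetical order

ordered = OrderedDict(plain)
ordered.move_to_end('z')      # Only available for an OrderedDict
ordered_keys = list(ordered)

# Comparing two dict ignores the order, comparing two OrderedDict does not
same_as_dict = ({'a': 2, 'z': 1} == plain)
same_as_ordered = (OrderedDict(a=2, z=1) == OrderedDict(z=1, a=2))
```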
| |
==== Using sets ==== | |
| ===== Using sets ===== |
| |
[[https://docs.python.org/3/tutorial/datastructures.html#sets|Python sets]] are **groups of unique elements**. They can be used to easily find all the unique elements of //something// and you can easily determine the **intersection**, **union** (and other similar operations) of sets. | [[https://docs.python.org/3/tutorial/datastructures.html#sets|Python sets]] are **groups of unique elements**. They can be used to easily find all the unique elements of //something// and you can easily determine the **intersection**, **union** (and other similar operations) of sets. |
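A quick illustration of these operations, using hypothetical lists of model names:

```python
# Hypothetical list with repeated elements
models = ['IPSL', 'CNRM', 'IPSL', 'MPI', 'CNRM']
other_models = {'MPI', 'MIROC'}

unique_models = set(models)                   # Duplicates are removed
common = unique_models & other_models         # Intersection
all_models = unique_models | other_models     # Union
only_in_first = unique_models - other_models  # Difference

# Sets are NOT ordered: sort them when a reproducible order is needed
sorted_unique = sorted(unique_models)
```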
| |
==== Printing a readable version of long lists or dictionaries ==== | |
| ===== Printing a readable version of long lists or dictionaries ===== |
| |
The [[https://docs.python.org/3/library/pprint.html|pprint]] module can be used for //pretty printing// objects (lists, dictionaries, ...). It will wrap long lines in a meaningful way | The [[https://docs.python.org/3/library/pprint.html|pprint]] module can be used for //pretty printing// objects (lists, dictionaries, ...). It will wrap long lines in a meaningful way |
</code> | </code> |
| |
==== Sorting ==== | |
| |
| ===== Storing objects and data in a file (shelve and friends) ===== |
| |
| The built-in [[other:python:jyp_steps#the_shelve_package|shelve]] module can be **easily** used for storing temporary/intermediate data |
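A minimal ''shelve'' sketch. The keys and values are made up, and the shelf is created in a temporary directory only to keep the example self-contained; a real script would use a fixed path.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    shelf_path = os.path.join(tmp_dir, 'demo_shelf')

    # Store some objects (keys must be strings, values must be picklable)
    with shelve.open(shelf_path) as shelf:
        shelf['levels'] = [1000, 850, 500]
        shelf['settings'] = {'model': 'demo', 'nb_years': 10}

    # Re-open the shelf (e.g. in another script) and read the objects back
    with shelve.open(shelf_path) as shelf:
        stored_keys = sorted(shelf.keys())
        levels = shelf['levels']
        nb_years = shelf['settings']['nb_years']
```

A shelf behaves like a dictionary stored on disk. The actual file name(s) may get an extension that depends on the ''dbm'' backend available on the system.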
| |
| More options: |
| * Some [[other:python:jyp_steps#data_file_formats|non-NetCDF]] file formats |
| * Working with [[other:python:jyp_steps#netcdf_filesusing_cdms2_xarray_and_netcdf4|NetCDF]] files |
| |
| |
| ===== Using a configuration file ===== |
| |
| The built-in [[https://docs.python.org/3/library/configparser.html|configparser]] module can be easily used for reading (**and** writing!) text configuration files. |
| |
| Note: a configuration file is also a way to easily store and exchange text data! |
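A short ''configparser'' sketch with hypothetical section and option names. The configuration is written to a text buffer and read back; with a real file, use ''config.write(open_file)'' and ''config.read(file_name)''.

```python
import configparser
import io

config = configparser.ConfigParser()
config['paths'] = {'input_dir': '/some/input/dir'}
config['plot'] = {'nb_levels': '12', 'use_colorbar': 'yes'}

# Write the configuration to a text buffer (stands in for a real file)
buffer = io.StringIO()
config.write(buffer)
config_text = buffer.getvalue()

# Read the text back. Values are strings, unless a typed getter is used
config_read = configparser.ConfigParser()
config_read.read_string(config_text)
input_dir = config_read['paths']['input_dir']
nb_levels = config_read.getint('plot', 'nb_levels')
use_colorbar = config_read.getboolean('plot', 'use_colorbar')
```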
| |
| |
| ===== Working with global variables ===== |
| |
| There is a good chance you don't actually want/need a //global// variable. Be sure to use the ''global'' statement correctly if you want to avoid side-effects... |
| |
| * [[https://docs.python.org/3/faq/programming.html?highlight=global#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value|Using (and changing) a global variable inside a script or module]] |
| * Simple module example\\ <code>_myvar = 10 |
| |
| def set_myvar(new_val): |
|     # Note: need to explicitly define a global variable (of a module) |
|     # as 'global' BEFORE changing its value in a function! |
|     # Otherwise, the value will not be REdefined outside the function |
|     global _myvar |
|     _myvar = new_val |
| |
| def get_myvar(): |
|     return _myvar |
| |
| def myfunc(nb_repeat = 10): |
|     print(nb_repeat * _myvar)</code> |
| * [[https://docs.python.org/3/faq/programming.html?highlight=global#how-do-i-share-global-variables-across-modules|Sharing global variables across modules]] |
| ===== Sorting ===== |
| |
| * When dealing with **numerical values**, you should use the [[https://numpy.org/doc/stable/reference/routines.sort.html|numpy sorting, searching, and counting routines]]! |
* [[https://docs.python.org/3/howto/sorting.html|Sorting HOW TO]] | * [[https://docs.python.org/3/howto/sorting.html|Sorting HOW TO]] |
* Example: sorting the keys and the values of a dictionary, and then using the ''key'' parameter to sort the keys of a dictionary according to the value associated with the key | * Example: sorting the keys and the values of a dictionary, and then using the ''key'' parameter to sort the keys of a dictionary according to the value associated with the key |
>>> sorted(demo_dic.keys(), key=lambda key_name:demo_dic[key_name]) | >>> sorted(demo_dic.keys(), key=lambda key_name:demo_dic[key_name]) |
['c', 'd', 'b', 'a']</code> | ['c', 'd', 'b', 'a']</code> |
| |
| ===== numpy related stuff ===== |
| |
| ==== Using a numpy array to store arbitrary objects ==== |
| |
| The numpy arrays are usually used to store [[https://numpy.org/doc/stable/reference/arrays.scalars.html|scalars]] of the same type (see also the [[https://numpy.org/doc/stable/reference/arrays.dtypes.html|Data type objects (dtype)]]), very often numerical values. |
| |
| It is also possible to store **arbitrary** Python objects in an array, rather than using nested lists or dictionaries! |
| |
| <code>>>> some_array = np.empty((2, 3), dtype=object) |
| >>> some_array |
| array([[None, None, None], |
| [None, None, None]], dtype=object) |
| >>> some_array.shape |
| (2, 3) |
| >>> print(some_array[-1, -1]) |
| None |
| >>> some_array[-1, 0] = filled_contour # e.g. save an existing cartopy filled contour object |
| >>> some_array |
| array([[None, None, None], |
| [<cartopy.mpl.contour.GeoContourSet object at 0x2ab679e8bf10>, |
| None, None]], dtype=object)</code> |
| |
| |
| ==== Dealing with a variable number of indices ==== |
| |
| [[https://numpy.org/doc/stable/user/basics.indexing.html#dealing-with-variable-indices|Official reference]] |
| |
| <code>>>> i10 = np.identity(10) |
| >>> i10 |
| array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.], |
| [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.], |
| ... |
| [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]]) |
| >>> i10.shape |
| (10, 10) |
| |
| >>> i10[3:7, 4:6] |
| array([[0., 0.], |
| [1., 0.], |
| [0., 1.], |
| [0., 0.]]) |
| |
| >>> s0 = slice(3, 7) |
| >>> s1 = slice(4, 6) |
| >>> i10[s0, s1] |
| array([[0., 0.], |
| [1., 0.], |
| [0., 1.], |
| [0., 0.]]) |
| |
| >>> my_slices = (s0, s1) |
| >>> i10[my_slices] |
| array([[0., 0.], |
| [1., 0.], |
| [0., 1.], |
| [0., 0.]]) |
| |
| >>> my_fancy_slices = (s0, Ellipsis) |
| >>> i10[my_fancy_slices] |
| array([[0., 0., 0., 1., 0., 0., 0., 0., 0., 0.], |
| [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.], |
| [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.], |
| [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.]]) |
| >>> i10[my_fancy_slices].shape |
| (4, 10) |
| |
| >>> # WARNING! DANGERRRR! NEVER forget that a VIEW is NOT A COPY |
| >>> # and that you can change the content of the original array by mistake |
| >>> my_view = i10[my_slices] |
| >>> my_view[:, :] = -1 |
| >>> my_view |
| array([[-1., -1.], |
| [-1., -1.], |
| [-1., -1.], |
| [-1., -1.]]) |
| >>> i10 |
| array([[ 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.], |
| [ 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.], |
| [ 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.], |
| [ 0., 0., 0., 1., -1., -1., 0., 0., 0., 0.], |
| [ 0., 0., 0., 0., -1., -1., 0., 0., 0., 0.], |
| [ 0., 0., 0., 0., -1., -1., 0., 0., 0., 0.], |
| [ 0., 0., 0., 0., -1., -1., 1., 0., 0., 0.], |
| [ 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.], |
| [ 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.], |
| [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])</code> |
| |
| |
| ==== Finding and counting unique values ==== |
| |
| Use ''np.unique'', do **not** try to use histogram related functions! |
| |
| <code>>>> vals = np.random.randint(2, 5, (10,)) * 0.5 # Get 10 discrete float values |
| >>> vals |
| array([1. , 2. , 1. , 2. , 2. , 1.5, 1. , 1.5, 2. , 1.5]) |
| |
| >>> np.unique(vals) |
| array([1. , 1.5, 2. ]) |
| >>> unique_vals, nb_unique = np.unique(vals, return_counts=True) |
| >>> unique_vals |
| array([1. , 1.5, 2. ]) |
| >>> nb_unique |
| array([3, 3, 4]) |
| |
| >>> sorted_vals = np.sort(vals) # Sorted copy, in order to check the result |
| >>> sorted_vals |
| array([1. , 1. , 1. , 1.5, 1.5, 1.5, 2. , 2. , 2. , 2. ])</code> |
| |
| |
| ==== Applying a ufunc over all the elements of an array ==== |
| |
| There are all sorts of //ufuncs// (Universal Functions). We will just use ''add'' below, from the [[https://numpy.org/doc/stable/reference/ufuncs.html#math-operations|math operations]], applied to the arrays defined in [[#finding_and_counting_unique_values|Finding and counting unique values]] |
| |
| <code># Get the sum of all the elements of 'vals' |
| >>> np.add.reduce(vals) |
| 15.5 |
| >>> np.add.reduce(sorted_vals) |
| 15.5 |
| >>> vals.sum() # The usual and easy way to do it |
| 15.5 |
| |
| # Compute the sum of the elements of 'nb_unique' |
| # AND keep (accumulate) the intermediate results |
| >>> nb_unique |
| array([3, 3, 4]) |
| >>> np.add.accumulate(nb_unique) |
| array([ 3, 6, 10]) |
| |
| # The accumulated values can be used as indices to separate the different groups of sorted values! |
| >>> sorted_vals |
| array([1. , 1. , 1. , 1.5, 1.5, 1.5, 2. , 2. , 2. , 2. ]) |
| >>> sorted_vals[0:3] |
| array([1., 1., 1.]) |
| >>> sorted_vals[3:6] |
| array([1.5, 1.5, 1.5]) |
| >>> sorted_vals[6:10] |
| array([2., 2., 2., 2.]) |
| |
| # Compute the sum of each equal-value group |
| >>> sorted_vals[0:3].sum(), sorted_vals[3:6].sum(), sorted_vals[6:10].sum() |
| (3.0, 4.5, 8.0)</code> |
| |
| |
| ==== Applying a ufunc over specified sections of an array ==== |
| |
| The [[https://numpy.org/doc/stable/reference/generated/numpy.ufunc.reduceat.html#numpy.ufunc.reduceat|reduceat]] function can be used to avoid explicit Python loops, and improve the speed (but not the readability...) of a script. The example below //improves// what has been shown above |
| |
| <code># Define a list with the boundaries of the intervals we want to apply the 'add' function to |
| # We need to add the beginning index (0), AND remove the last index |
| # (reduceat will automatically go to the end of the input array) |
| >>> nb_unique |
| array([3, 3, 4]) |
| >>> slices_indices = [0] + list(np.add.accumulate(nb_unique)) |
| >>> slices_indices.pop() # Remove last element |
| 10 |
| >>> slices_indices |
| [0, 3, 6] |
| |
| # Compute the sums over the selected intervals with just one call |
| >>> np.add.reduceat(np.sort(vals), slices_indices) |
| array([3. , 4.5, 8. ])</code> |
| |
| |
| ===== matplotlib related stuff ===== |
| |
| ==== Working with time axes (and ticks) ==== |
| |
| If you have problems setting the limits of a time axis, choosing the ticks' locations, or specifying the style of the labels, you should check the: |
| * [[https://matplotlib.org/stable/gallery/index.html#ticks|Ticks examples' gallery]] |
| * [[https://matplotlib.org/stable/gallery/text_labels_and_annotations/date.html|Date tick labels example]] |
| |
| |
| ===== Data representation ===== |
| |
| A few notes for a future section or page about //data representation// (bits and bytes) on disk and in memory, vs //data format// |
| |
| FIXME Add parts (pages 28 to 37) of this [[https://wiki.lsce.ipsl.fr/pmip3/doku.php/other:python:jyp_steps#part_2|old tutorial]] to this section |
| |
| ==== Base notions ==== |
| |
| * **Never forget** that all the bits and pieces of information we use are coded in [[https://en.wikipedia.org/wiki/Binary_number#Counting_in_binary|base 2]] (''0''s and ''1''s), grouped in bytes! |
| * Some things can be stored exactly (integers, characters, ...) |
| * In other cases (**//real// numbers** that we work with all the time, compressed images/videos/music) we only store a **//good enough approximation//** |
| |
| * 1 byte <=> 8 bits |
| * ''REAL*4'' <=> 4 bytes <=> 32 bits |
| * For easier written/displayed representation, 1 byte is usually split into 2 groups of 4 bits, using base 16 and [[https://en.wikipedia.org/wiki/Hexadecimal|hexadecimal representation]] |
| * ''0000'' <=> ''0'', ''0001'' <=> ''1'', ..., ''1111'' <=> ''F'' |
| * ''1101'' <=> ''D'' in hexadecimal <=> ''13'' in decimal (''**1** * 8 + **1** * 4 + **0** * 2 + **1** * 1'') |
| * ''11111101'' <=> ''1111 1101'' <=> ''FD'' in hexadecimal <=> ''253'' in decimal (''15 * 16 + 13'') |
| |
| * Conversion with Python |
| * <code>>>> hex(13) # Decimal to Hexadecimal conversion |
| '0xd' |
| >>> hex(255) |
| '0xff' |
| >>> hex(256) |
| '0x100' |
| >>> int('0x100', 16) # Hexadecimal to Decimal conversion |
| 256 |
| >>> int('11', 2) |
| 3 |
| >>> int('1111', 2) # Binary to Decimal conversion |
| 15 |
| >>> int('11111101', 2) |
| 253 |
| >>> 15 * 16 + 13 |
| 253 |
| >>> 013 # DANGER! Python considers an integer to be in OCTAL base if it starts with a 0 |
| 11 |
| >>> int('13', 8) # 1*8 + 3 |
| 11</code> |
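The conversions above can also be done with ''bin'', ''oct'' and ''format''. The sketch below reuses the ''253'' example, and the variable names are only illustrations:

```python
value = 253

as_bin = bin(value)              # String with a '0b' prefix
as_hex = format(value, '02X')    # 2 upper case hexadecimal digits, no prefix
as_bits = format(value, '08b')   # 8 binary digits, padded with zeros

# Split the byte into 2 groups of 4 bits, one per hexadecimal digit
nibbles = as_bits[:4] + ' ' + as_bits[4:]

as_oct = oct(11)                 # String with a '0o' prefix
back_to_int = int('FD', 16)
```

''format'' is handy when a fixed number of digits is needed, e.g. to display all the bits of a byte.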
| ==== Numerical values ==== |
| |
| * Binary data representation of some numbers (not everything is listed here): |
| * [[https://en.wikipedia.org/wiki/Integer_(computer_science)|Integers]] |
| * Range: |
| * 4-byte integers: −2,147,483,648 to 2,147,483,647 |
| * Python: ''numpy.int32'' |
| * [[https://docs.unidata.ucar.edu/nug/current/md_types.html|NetCDF]], [[https://docs.unidata.ucar.edu/netcdf-fortran/current/f90-variables.html#f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]: ''int'', ''NC_INT'', ''NF90_INT'' |
| * Fortran: |
| * 8-byte integers: −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
| * Python: ''numpy.int64'' |
| * [[https://docs.unidata.ucar.edu/nug/current/md_types.html|NetCDF]]: ''int64'', ''NC_INT64'' |
| * Fortran: |
| * Tech note: signed integers use [[https://en.wikipedia.org/wiki/Two%27s_complement|two's complement]] for coding negative integers |
| * [[https://en.wikipedia.org/wiki/IEEE_754|Floating point numbers]] (//IEEE 754// standard aka //IEEE Standard for Binary Floating-Point Arithmetic//) |
| * Range: |
| * 4-byte float: ~7 significant digits * 1E±38 |
| * Python: ''numpy.float32'' |
| * [[https://docs.unidata.ucar.edu/nug/current/md_types.html|NetCDF]], [[https://docs.unidata.ucar.edu/netcdf-fortran/current/f90-variables.html#f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]: |
| * Fortran: |
| * See also [[https://en.wikipedia.org/wiki/Single-precision_floating-point_format|Single-precision floating-point format]] |
| * 8-byte float: ~15 significant digits * 1E±308 |
| * Python: ''numpy.float64'' |
| * [[https://docs.unidata.ucar.edu/nug/current/md_types.html|NetCDF]], [[https://docs.unidata.ucar.edu/netcdf-fortran/current/f90-variables.html#f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]: |
| * Fortran: |
| * Special values: |
| * [[https://en.wikipedia.org/wiki/NaN|NaN]] (''numpy.nan''): //Not a Number// |
| * Infinity (''-numpy.inf'' and ''numpy.inf'') |
| * Note: it is cleaner to use masks (and [[https://numpy.org/doc/stable/reference/maskedarray.generic.html|Numpy masked arrays]]) than NaNs, when you have to deal with missing values! |
| * [[https://en.wikipedia.org/wiki/Bit_numbering|Bit numbering]] |
| * [[https://en.wikipedia.org/wiki/Endianness|Endianness]] |
| * A rather technical example: we //play// with a numpy 4-byte integer scalar |
| * <code>>>> one_int32 = np.int32(1) |
| >>> one_int32 |
| 1 |
| >>> type(one_int32) |
| <class 'numpy.int32'> |
| >>> one_int32.dtype |
| dtype('int32') |
| >>> one_int32.shape # A numpy SCALAR, is an ARRAY WITH NO SHAPE ! |
| () |
| >>> one_int32[0] |
| Traceback (most recent call last): |
| File "<stdin>", line 1, in <module> |
| IndexError: invalid index to scalar variable. |
| >>> one_int32[()] # Note how to access the single element, when there is NO SHAPE |
| 1 |
| >>> one_int32.ndim # NO SHAPE means no dimensions, but there is ONE element |
| 0 |
| >>> one_int32.size |
| 1 |
| >>> one_int32.nbytes # The element requires 4 bytes of storage |
| 4 |
>>> hex(one_int32) # We can print the hexadecimal representation of INTEGER scalars
| '0x1' |
| >>> hex(one_int32 * 15) |
| '0xf' |
| >>> hex(one_int32 * 16) |
| '0x10' |
| |
| # 'Serialize' the data (i.e. change the data to a series of bytes) |
# Note: the bytes are printed in the reverse order of 'hex(one_int32)' on a little-endian machine (least significant byte first)
| >>> one_int32_serialized = one_int32.tobytes() |
| >>> type(one_int32_serialized) |
| <class 'bytes'> |
| >>> len(one_int32_serialized) |
| 4 |
| >>> one_int32_serialized |
| b'\x01\x00\x00\x00' |
| >>> one_int32_serialized.hex(' ') # Another way to print the hexadecimal values |
| '01 00 00 00' |
| |
| # Use the following in the unlikely case where you need to change the endianness (bytes ordering) |
| >>> one_int32_reversed_endian = one_int32.byteswap() |
| >>> one_int32_reversed_endian # Same bytes in a different order represent a different number (of course) |
| 16777216 |
| >>> hex(one_int32_reversed_endian) # Compare to the output of hex(one_int32) above |
| '0x1000000' |
| >>> one_int32_reversed_endian.tobytes() |
| b'\x00\x00\x00\x01'</code> |
| * Another technical example: we use an array of 2 integers\\ When using ''byteswap()'', notice how bytes are swapped by groups of 4 bytes, because int32 values use 4 bytes |
| * <code>>>> array_example = np.asarray((3, 17), dtype=np.int32) |
| >>> array_example |
| array([ 3, 17], dtype=int32) |
| >>> array_example.shape, array_example.ndim, array_example.size, array_example.nbytes |
| ((2,), 1, 2, 8) |
| >>> array_example.tobytes().hex(' ', 4) |
| '03000000 11000000' |
| >>> array_example.byteswap().tobytes().hex(' ', 4) |
| '00000003 00000011' |
| </code> |
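The notions listed above (endianness, two's complement, approximated real numbers) can also be explored without numpy. The sketch below uses the standard ''struct'' module, with arbitrary example values:

```python
import struct

# '<' = little-endian, '>' = big-endian, 'i' = 4-byte signed integer
one_little = struct.pack('<i', 1).hex(' ')
one_big = struct.pack('>i', 1).hex(' ')

# Two's complement: -1 is coded with all the bits set to 1
minus_one = struct.pack('>i', -1).hex()

# 0.1 has no exact binary representation: storing it as a 4-byte float ('f')
# and reading it back does NOT return exactly the Python (8-byte) 0.1
tenth_float32 = struct.unpack('<f', struct.pack('<f', 0.1))[0]
tenth_is_exact = (tenth_float32 == 0.1)

# 0.5 is a power of 2, and it is stored exactly
half_float32 = struct.unpack('<f', struct.pack('<f', 0.5))[0]
half_is_exact = (half_float32 == 0.5)
```

The explicit ''<'' and ''>'' prefixes make the result independent of the endianness of the machine running the code.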
| |
| * Manipulating binary data with [[https://docs.python.org/3/library/stdtypes.html#binary-sequence-types-bytes-bytearray-memoryview|bytes, bytearray, memoryview]] |
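A small sketch of the 3 types, assuming 8 bytes that hold two little-endian 4-byte integers (the values are arbitrary):

```python
# bytes: IMMUTABLE sequence of bytes
raw = bytes([0x03, 0x00, 0x00, 0x00, 0x11, 0x00, 0x00, 0x00])
first_int = int.from_bytes(raw[:4], byteorder='little')
second_int = int.from_bytes(raw[4:], byteorder='little')

# bytearray: MUTABLE sequence of bytes (this is a copy of 'raw')
buf = bytearray(raw)
buf[0] = 0xFF

# memoryview: access to the SAME memory as 'buf', without copying it
view = memoryview(buf)
view[4] = 0x12

changed = buf.hex(' ', 4)   # 'buf' was changed through the view
unchanged = raw.hex(' ', 4) # The original bytes object is not affected
```

As with numpy views, changing the content of a ''memoryview'' changes the underlying object.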
| |
| * Array addressing |
| * [[https://www.geeksforgeeks.org/calculation-of-address-of-element-of-1-d-2-d-and-3-d-using-row-major-and-column-major-order/|Calculation of address of element of 1-D, 2-D, and 3-D using row-major and column-major order]] |
| * In other words: //using indices to go from 1-D to n-dimensional data// |
| * The [[https://en.wikipedia.org/wiki/Array_(data_structure)|array]] structure |
| * python/C vs Fortran... |
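The difference between the two orders can be sketched in pure Python. The two helper functions below are hypothetical (they are not numpy or standard library functions), and they return the 0-based position of an element in the flattened 1-D data:

```python
def flat_index_row_major(indices, shape):
    """C / Python (numpy default) order: the LAST index varies fastest."""
    offset = 0
    for idx, dim_size in zip(indices, shape):
        offset = offset * dim_size + idx
    return offset


def flat_index_column_major(indices, shape):
    """Fortran order: the FIRST index varies fastest."""
    offset = 0
    for idx, dim_size in zip(reversed(indices), reversed(shape)):
        offset = offset * dim_size + idx
    return offset


# Element [1, 0] of an array with 2 rows and 3 columns
row_major_2d = flat_index_row_major((1, 0), (2, 3))        # 1 * 3 + 0
column_major_2d = flat_index_column_major((1, 0), (2, 3))  # 1 + 0 * 2

# Element [1, 0, 2] of an array of shape (2, 3, 4)
row_major_3d = flat_index_row_major((1, 0, 2), (2, 3, 4))
column_major_3d = flat_index_column_major((1, 0, 2), (2, 3, 4))
```

Multiplying the result by the size of one element (e.g. 4 bytes for an int32) gives the offset in bytes from the beginning of the array.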
| |
| * disk and ram usage: how to check the usage (available ram and disk), best practice on multi-user systems (how much allowed?) |
| * ''du'', ''df'', ''cat /proc/meminfo'', ''top'' |
| |
| * understanding and reverse-engineering //binary// format |
| * ''od'', ''strings'' |
| |
| * binary vs text format: ascii, utf, raw |
| * text related functions in python: ''str'', ''int'', ''float'', ''ord'', ... |
| * lists conversion with ''map'' and ''join'' |
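A few of these text related functions in action, with arbitrary example values:

```python
# Text to numbers: split a string, then convert each sub-string
text_values = '1,2,3.14, 4'.split(',')
float_values = list(map(float, text_values))  # float() ignores the blanks

# Numbers to text: convert each value, then join the sub-strings
back_to_text = ' '.join(map(str, float_values))

# Character <=> integer code (Unicode code point)
code_of_a = ord('A')
char_from_code = chr(0xE0)

# Text to integer, with an explicit base when needed
answer = int('42')
from_binary_text = int('101010', 2)
```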
| |
| * Misc : ''md5sum'' |
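The checksum printed by ''md5sum'' can also be computed in Python with the standard ''hashlib'' module. ''file_md5'' below is a hypothetical helper, and reading the file by chunks is a choice made here to avoid loading big files in memory:

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the MD5 checksum of a file, as a hexadecimal string."""
    md5 = hashlib.md5()
    with open(path, 'rb') as f:
        # Read the file chunk by chunk, until read() returns no more bytes
        for chunk in iter(lambda: f.read(chunk_size), b''):
            md5.update(chunk)
    return md5.hexdigest()


# Checksum of some bytes that are already in memory
hello_digest = hashlib.md5(b'hello').hexdigest()
```

Two files with different checksums are different. MD5 is fine for detecting corrupted or modified files, but it should not be used for security purposes.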
| |
| ==== Strings ==== |
| |
| * Encoding, [[https://en.wikipedia.org/wiki/ASCII|ASCII]], [[https://en.wikipedia.org/wiki/Unicode|unicode]], [[https://en.wikipedia.org/wiki/UTF-8|UTF-8]], ... |
| |
| * Getting the binary representation of a string |
| * <code>>>> test_string = 'A B 0 1 à µ' |
| >>> type(test_string) |
| <class 'str'> |
| >>> len(test_string) |
| 11 |
| >>> test_string_bin = test_string.encode('utf-8') |
| >>> test_string_bin |
| b'A B 0 1 \xc3\xa0 \xc2\xb5' |
| >>> type(test_string_bin) |
| <class 'bytes'> |
| >>> len(test_string_bin) |
| 13 |
| >>> test_string_bin.hex('-') |
| '41-20-42-20-30-20-31-20-c3-a0-20-c2-b5' |
| </code> |
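Going the other way (from bytes to a string) requires knowing the encoding that was used. The sketch below reuses the bytes of the example above, and shows what can happen when the wrong encoding is chosen:

```python
raw = b'A B 0 1 \xc3\xa0 \xc2\xb5'

# Correct decoding: the bytes were generated with UTF-8
text = raw.decode('utf-8')

# The same text needs only 1 byte per character in Latin-1
text_latin1 = text.encode('latin-1')

# WRONG decoding: no error, but each byte becomes 1 (wrong) character
garbled = raw.decode('latin-1')

# ASCII can not store accented characters at all
try:
    text.encode('ascii')
    ascii_is_enough = True
except UnicodeEncodeError:
    ascii_is_enough = False
```

The number of **characters** of a string does not depend on the encoding, but the number of **bytes** does.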
| |
| |
/* | /* |
==== Tip template ==== | ===== Tip template ===== |
| |
<code>Some code</code> | <code>Some code</code> |