--- other:python:misc_by_jyp [2022/02/21 16:15] – [numpy related stuff] jypeter
+++ other:python:misc_by_jyp [2022/12/12 13:35] – Added config file section jypeter
@@ Line 39: / Line 39: @@
 True</code>
+==== Playing with strings ====
+=== Filenames, etc... ===
+Check [[other:python:misc_by_jyp#working_with_paths_and_filenames|Working with paths and filenames]] and [[other:python:misc_by_jyp#generating_file_names|Generating file names]]
+=== Splitting strings ===
+It's easy to split a string on runs of blanks, or on a specific delimiter, but it is harder to deal with quoted sub-strings that contain blanks themselves.
+<code>>>> str_with_blanks = 'one    two\t3\t\tFOUR'
+>>> str_with_blanks.split()
+['one', 'two', '3', 'FOUR']
+>>> str_with_simple_delimiters = '1,2,3.14,  4'
+>>> str_with_simple_delimiters.split(',')
+['1', '2', '3.14', '  4']
+>>> complex_string='-o 1 --long "A string with accented chars: é è à ç"'
+>>> complex_string.split()
+['-o', '1', '--long', '"A', 'string', 'with', 'accented', 'chars:', '\xc3\xa9', '\xc3\xa8', '\xc3\xa0', '\xc3\xa7"']
+>>> import shlex
+>>> shlex.split(complex_string)
+['-o', '1', '--long', 'A string with accented chars: \xc3\xa9 \xc3\xa8 \xc3\xa0 \xc3\xa7']</code>
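The escaped bytes (''\xc3\xa9'', etc.) in the session above suggest that it was recorded with Python 2. The sketch below replays the same comparison as a plain script, assuming Python 3, where the accented characters are kept as ordinary text. The variable names ''naive_tokens'' and ''shell_tokens'' are only illustrative.

```python
import shlex

# Same example string as in the session above
complex_string = '-o 1 --long "A string with accented chars: é è à ç"'

# str.split() breaks on every run of blanks, even inside the quotes
naive_tokens = complex_string.split()

# shlex.split() follows shell-like quoting rules and keeps the quoted part whole
shell_tokens = shlex.split(complex_string)

print(len(naive_tokens))  # 12 tokens, the quoted text is cut into pieces
print(shell_tokens)       # 4 tokens, the quoted text is one item (quotes removed)
```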
 ==== Working with paths and filenames ====
@@ Line 204: / Line 229: @@
 </code>
+==== Storing objects and data in a file (shelve and friends) ====
+The built-in [[other:python:jyp_steps#the_shelve_package|shelve]] module can be **easily** used for storing temporary/intermediate data.
+More options:
+  * Some [[other:python:jyp_steps#data_file_formats|non-NetCDF]] file formats
+  * Working with [[other:python:jyp_steps#netcdf_filesusing_cdms2_xarray_and_netcdf4|NetCDF]] files
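A minimal sketch of the shelve approach, assuming Python 3 and picklable objects. The shelf location, the keys and the stored values are hypothetical and only serve as an illustration.

```python
import os
import shelve
import tempfile

# Hypothetical location for the shelf; any writable path would do
shelf_path = os.path.join(tempfile.mkdtemp(), 'intermediate_results')

# Store a few picklable objects under string keys
with shelve.open(shelf_path) as db:
    db['station_names'] = ['A', 'B', 'C']
    db['settings'] = {'threshold': 0.5, 'nb_iter': 10}

# Later (possibly in another script), read the objects back
with shelve.open(shelf_path) as db:
    station_names = db['station_names']
    settings = db['settings']

print(station_names, settings)
```

A shelf behaves like a persistent dictionary, so it is convenient for intermediate results, but the files it creates depend on the underlying ''dbm'' backend and are not meant to be exchanged between different systems.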
+==== Using a configuration file ====
+The built-in [[https://docs.python.org/3/library/configparser.html|configparser]] module can be easily used for reading (**and** writing!) text configuration files.
+Note: a configuration file is also an easy way to store and exchange text data!
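The following sketch writes a small configuration file and reads it back with ''configparser'', assuming Python 3. The file name, the section names (''paths'', ''plot'') and the option names are made up for the example.

```python
import configparser
import os
import tempfile

# Hypothetical configuration file, created in a temporary directory
config_path = os.path.join(tempfile.mkdtemp(), 'example.ini')

# Writing: sections behave like dictionaries of strings
config_out = configparser.ConfigParser()
config_out['paths'] = {'input_dir': '/some/input/dir',
                       'output_dir': '/some/output/dir'}
config_out['plot'] = {'nb_levels': '12', 'use_log_scale': 'no'}
with open(config_path, 'w') as config_file:
    config_out.write(config_file)

# Reading: values are stored as text, typed getters do the conversion
config_in = configparser.ConfigParser()
config_in.read(config_path)
input_dir = config_in['paths']['input_dir']
nb_levels = config_in.getint('plot', 'nb_levels')
use_log_scale = config_in.getboolean('plot', 'use_log_scale')

print(input_dir, nb_levels, use_log_scale)
```

All values are stored as strings in the file, which is why ''getint'' and ''getboolean'' are used when a typed value is needed.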
 ==== Sorting ====
@@ Line 222: / Line 260: @@
 ==== numpy related stuff ====
+=== Using a numpy array to store arbitrary objects ===
+Numpy arrays are usually used to store [[https://numpy.org/doc/stable/reference/arrays.scalars.html|scalars]] of the same type, very often numerical values (see also the [[https://numpy.org/doc/stable/reference/arrays.dtypes.html|Data type objects (dtype)]]).
+It is also possible to store **arbitrary** Python objects in an array, rather than using nested lists or dictionaries!
+<code>>>> some_array = np.empty((2, 3), dtype=object)
+>>> some_array
+array([[None, None, None],
+       [None, None, None]], dtype=object)
+>>> some_array.shape
+(2, 3)
+>>> print(some_array[-1, -1])
+None
+>>> some_array[-1, 0] = filled_contour # e.g. save an existing cartopy filled contour object
+>>> some_array
+array([[None, None, None],
+       [<cartopy.mpl.contour.GeoContourSet object at 0x2ab679e8bf10>,
+        None, None]], dtype=object)</code>
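The session above relies on an existing cartopy object. The self-contained sketch below shows the same idea with standard Python objects only; the array name ''grid'' and its content are hypothetical.

```python
import datetime

import numpy as np

# 2x3 array of Python object references, initially filled with None
grid = np.empty((2, 3), dtype=object)

# Each cell can hold an object of a different type
grid[0, 0] = {'title': 'first panel'}
grid[-1, 0] = datetime.date(2022, 12, 12)

# Boolean mask of the cells that have been filled so far
is_filled = np.array([cell is not None for cell in grid.flat]).reshape(grid.shape)

print(is_filled)
```

The usual shape and indexing tools still work, but numerical operations on an object array fall back to slow Python-level calls, so this is a way to organize objects on a grid rather than a way to compute.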
+=== Dealing with a variable number of indices ===
+[[https://numpy.org/doc/stable/user/basics.indexing.html#dealing-with-variable-indices|Official reference]]
+<code>>>> i10 = np.identity(10)
+>>> i10
+array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
+       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
+...
+       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])
+>>> i10.shape
+(10, 10)
+>>> i10[3:7, 4:6]
+array([[0., 0.],
+       [1., 0.],
+       [0., 1.],
+       [0., 0.]])
+>>> s0 = slice(3, 7)
+>>> s1 = slice(4, 6)
+>>> i10[s0, s1]
+array([[0., 0.],
+       [1., 0.],
+       [0., 1.],
+       [0., 0.]])
+>>> my_slices = (s0, s1)
+>>> i10[my_slices]
+array([[0., 0.],
+       [1., 0.],
+       [0., 1.],
+       [0., 0.]])
+>>> my_fancy_slices = (s0, Ellipsis)
+>>> i10[my_fancy_slices]
+array([[0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
+       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
+       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
+       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.]])
+>>> i10[my_fancy_slices].shape
+(4, 10)
+>>> # WARNING! DANGERRRR! NEVER forget that a VIEW is NOT A COPY
+>>> # and that you can change the content of the original array by mistake
+>>> my_view = i10[my_slices]
+>>> my_view[:, :] = -1
+>>> my_view
+array([[-1., -1.],
+       [-1., -1.],
+       [-1., -1.],
+       [-1., -1.]])
+>>> i10
+array([[ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
+       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
+       [ 0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
+       [ 0.,  0.,  0.,  1., -1., -1.,  0.,  0.,  0.,  0.],
+       [ 0.,  0.,  0.,  0., -1., -1.,  0.,  0.,  0.,  0.],
+       [ 0.,  0.,  0.,  0., -1., -1.,  0.,  0.,  0.,  0.],
+       [ 0.,  0.,  0.,  0., -1., -1.,  1.,  0.,  0.,  0.],
+       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],
+       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
+       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.]])</code>
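One possible way to guard against the problem shown above is to request an explicit copy, and to use ''np.shares_memory'' when in doubt. This is a minimal sketch that rebuilds the same ''i10'' array and slices; ''my_copy'' is an illustrative name.

```python
import numpy as np

i10 = np.identity(10)
my_slices = (slice(3, 7), slice(4, 6))

my_view = i10[my_slices]          # basic slicing returns a view
my_copy = i10[my_slices].copy()   # explicit, independent copy

print(np.shares_memory(i10, my_view))  # True: same underlying data
print(np.shares_memory(i10, my_copy))  # False: separate data

# Changing the copy leaves the original array untouched
my_copy[:, :] = -1
print(i10.min())  # 0.0
```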
 === Finding and counting unique values ===
@@ Line 253: / Line 375: @@
 .5
 >>> vals.sum() # The usual and easy way to do it
-.5</code>
+.5
+# Compute the sum of the elements of 'nb_unique'
+# AND keep (accumulate) the intermediate results
+>>> nb_unique
+array([3, 3, 4])
+>>> np.add.accumulate(nb_unique)
+array([ 3,  6, 10])
+# The accumulated values can be used as indices to separate the different groups of sorted values!
+>>> sorted_vals
+array([1. , 1. , 1. , 1.5, 1.5, 1.5, 2. , 2. , 2. , 2. ])
+>>> sorted_vals[0:3]
+array([1., 1., 1.])
+>>> sorted_vals[3:6]
+array([1.5, 1.5, 1.5])
+>>> sorted_vals[6:10]
+array([2., 2., 2., 2.])
+# Compute the sum of each equal-value group
+>>> sorted_vals[0:3].sum(), sorted_vals[3:6].sum(), sorted_vals[6:10].sum()
+(3.0, 4.5, 8.0)</code>
+=== Applying a ufunc over specified sections of an array ===
+The [[https://numpy.org/doc/stable/reference/generated/numpy.ufunc.reduceat.html#numpy.ufunc.reduceat|reduceat]] method of a ufunc can be used to avoid explicit Python loops and to improve the speed (but not the readability...) of a script. The example below //improves// on the group sums computed above.
+<code># Define a list with the boundaries of the intervals we want to apply the 'add' function to
+# We need to add the beginning index (0), AND remove the last index
+# (reduceat will automatically go to the end of the input array)
+>>> nb_unique
+array([3, 3, 4])
+>>> slices_indices = [0] + list(np.add.accumulate(nb_unique))
+>>> slices_indices.pop() # Remove last element (pop also returns it)
+10
+>>> slices_indices
+[0, 3, 6]
+# Compute the sums over the selected intervals with just one call
+>>> np.add.reduceat(np.sort(vals), slices_indices)
+array([3. , 4.5, 8. ])</code>
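The steps above can also be written as a short standalone script. In this sketch the content of ''vals'' is an assumption, chosen only so that it reproduces the ''sorted_vals'' array displayed earlier, and ''np.unique(..., return_counts=True)'' is used as one possible way to get the group sizes.

```python
import numpy as np

# Assumed input values (hypothetical, they match the sorted array shown above)
vals = np.array([2., 1., 1.5, 2., 1., 1.5, 2., 1., 1.5, 2.])
sorted_vals = np.sort(vals)

# Unique values and the number of occurrences of each one
unique_vals, nb_unique = np.unique(sorted_vals, return_counts=True)

# Start index of each group: 0, then the accumulated counts without the last one
start_indices = np.concatenate(([0], np.cumsum(nb_unique)[:-1]))

# Sum of each equal-value group, with a single call
group_sums = np.add.reduceat(sorted_vals, start_indices)

print(start_indices)  # [0 3 6]
print(group_sums)     # [3.  4.5 8. ]
```

Building the start indices with ''np.concatenate'' and a slice avoids the list conversion and the ''pop()'' call, at the cost of one more numpy function to remember.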
 /*