other:python:misc_by_jyp

Differences

This shows you the differences between two versions of the page.

other:python:misc_by_jyp [2022/02/21 15:15]
jypeter [numpy related stuff]
other:python:misc_by_jyp [2022/07/08 14:00] (current)
jypeter [numpy related stuff] Added the arbitrary object array
Line 39: Line 39:
 True</​code>​ True</​code>​
  
 +==== Playing with strings ====
 +
 +=== Filenames, etc... ===
 +
 +Check [[other:python:misc_by_jyp#working_with_paths_and_filenames|Working with paths and filenames]] and [[other:python:misc_by_jyp#generating_file_names|Generating file names]].
 +
 +=== Splitting strings ===
 +
 +It's easy to split a string on runs of blanks, or on a specific delimiter, but it is harder to deal with quoted sub-strings: a plain split() breaks them apart, whereas shlex.split() keeps them together.
 +
 +<code>>>> str_with_blanks = 'one    two\t3\t\tFOUR'
 +>>> str_with_blanks.split()
 +['one', 'two', '3', 'FOUR']
 +
 +>>> str_with_simple_delimiters = '1,2,3.14,  4'
 +>>> str_with_simple_delimiters.split(',')
 +['1', '2', '3.14', '  4']
 +
 +# The '\xc3\xa9' style escapes below are UTF-8 bytes displayed by Python 2
 +# Python 3 displays the accented characters directly
 +>>> complex_string = '-o 1 --long "A string with accented chars: é è à ç"'
 +>>> complex_string.split()
 +['-o', '1', '--long', '"A', 'string', 'with', 'accented', 'chars:', '\xc3\xa9', '\xc3\xa8', '\xc3\xa0', '\xc3\xa7"']
 +
 +>>> import shlex
 +>>> shlex.split(complex_string)
 +['-o', '1', '--long', 'A string with accented chars: \xc3\xa9 \xc3\xa8 \xc3\xa0 \xc3\xa7']</code>
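The session above appears to come from Python 2 (the accented characters show up as UTF-8 byte escapes). The sketch below repeats the same comparison as a plain Python 3 script; the variable names ''naive_tokens'' and ''shell_tokens'' are only illustrative and are not part of the original page.

```python
import shlex

complex_string = '-o 1 --long "A string with accented chars: é è à ç"'

# str.split() knows nothing about quotes: the quoted text is cut at every blank
naive_tokens = complex_string.split()

# shlex.split() follows shell-like quoting rules: the quoted text stays in one piece
# and the surrounding double quotes are removed
shell_tokens = shlex.split(complex_string)

print(len(naive_tokens), naive_tokens)
print(len(shell_tokens), shell_tokens)
```

With Python 3 strings, the last element of ''shell_tokens'' is the whole quoted sentence, accented characters included.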
 ==== Working with paths and filenames ==== ==== Working with paths and filenames ====
  
Line 222: Line 247:
  
 ==== numpy related stuff ==== ==== numpy related stuff ====
 +
 +=== Using a numpy array to store arbitrary objects ===
 +
 +numpy arrays usually store [[https://numpy.org/doc/stable/reference/arrays.scalars.html|scalars]] of a single type, most often numerical values (see also [[https://numpy.org/doc/stable/reference/arrays.dtypes.html|Data type objects (dtype)]]).
 +
 +It is also possible to store **arbitrary** Python objects in an array created with dtype=object, rather than using nested lists or dictionaries!
 +
 +<code>>>> import numpy as np
 +>>> some_array = np.empty((2, 3), dtype=object)
 +>>> some_array
 +array([[None, None, None],
 +       [None, None, None]], dtype=object)
 +>>> some_array.shape
 +(2, 3)
 +>>> print(some_array[-1, -1])
 +None
 +>>> some_array[-1, 0] = filled_contour # e.g. save an existing cartopy filled contour object
 +>>> some_array
 +array([[None, None, None],
 +       [<cartopy.mpl.contour.GeoContourSet object at 0x2ab679e8bf10>,
 +        None, None]], dtype=object)</code>
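The example above needs an existing cartopy object. The following sketch shows the same idea with objects from the standard library only, so that it can be run as is. The cell contents and the name ''filled_cells'' are assumptions chosen for illustration.

```python
import datetime

import numpy as np

# np.empty() with dtype=object initializes every cell to None
grid = np.empty((2, 3), dtype=object)

# Each cell can hold a different kind of Python object (illustrative values)
grid[0, 0] = {'title': 'panel A', 'levels': 10}
grid[0, 1] = 'just a string'
grid[1, 2] = datetime.date(2022, 7, 8)

# np.ndindex() visits every index tuple, whatever the number of dimensions,
# which makes it easy to find the cells that have been filled
filled_cells = [idx for idx in np.ndindex(grid.shape) if grid[idx] is not None]
for idx in filled_cells:
    print(idx, type(grid[idx]).__name__)
```

The array keeps its usual shape and indexing behaviour, but numerical operations on the whole array are not meaningful for mixed contents like these.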
 +        ​
 +=== Dealing with a variable number of indices ===
 +
 +[[https://​numpy.org/​doc/​stable/​user/​basics.indexing.html#​dealing-with-variable-indices|Official reference]]
 +
 +<​code>>>>​ i10 = np.identity(10)
 +>>>​ i10
 +array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
 +       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
 +...
 +       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])
 +>>>​ i10.shape
 +(10, 10)
 +
 +>>>​ i10[3:7, 4:6]
 +array([[0., 0.],
 +       [1., 0.],
 +       [0., 1.],
 +       [0., 0.]])
 +       
 +>>>​ s0 = slice(3, 7)
 +>>>​ s1 = slice(4, 6)
 +>>>​ i10[s0, s1]
 +array([[0., 0.],
 +       [1., 0.],
 +       [0., 1.],
 +       [0., 0.]])
 +       
 +>>>​ my_slices = (s0, s1)
 +>>>​ i10[my_slices]
 +array([[0., 0.],
 +       [1., 0.],
 +       [0., 1.],
 +       [0., 0.]])
 +       
 +>>>​ my_fancy_slices = (s0, Ellipsis)
 +>>>​ i10[my_fancy_slices]
 +array([[0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
 +       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
 +       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
 +       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.]])
 +>>>​ i10[my_fancy_slices].shape
 +(4, 10)
 +
 +>>> # WARNING! DANGER! NEVER forget that a VIEW is NOT A COPY:
 +>>> # changing the content of a view also changes the original array, possibly by mistake
 +>>> # Use i10[my_slices].copy() if you need an independent array
 +>>>​ my_view = i10[my_slices]
 +>>>​ my_view[:, :] = -1
 +>>>​ my_view
 +array([[-1.,​ -1.],
 +       [-1., -1.],
 +       [-1., -1.],
 +       [-1., -1.]])
 +>>>​ i10
 +array([[ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
 +       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
 +       [ 0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
 +       [ 0.,  0.,  0.,  1., -1., -1.,  0.,  0.,  0.,  0.],
 +       [ 0.,  0.,  0.,  0., -1., -1.,  0.,  0.,  0.,  0.],
 +       [ 0.,  0.,  0.,  0., -1., -1.,  0.,  0.,  0.,  0.],
 +       [ 0.,  0.,  0.,  0., -1., -1.,  1.,  0.,  0.,  0.],
 +       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],
 +       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
 +       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.]])</​code>​
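A tuple of slices is mostly useful when the number of dimensions is not known in advance. The sketch below builds such a tuple programmatically; ''slice_along_axis'' is a hypothetical helper written for this example, not a numpy function. It also uses ''np.shares_memory'' to check whether a result is a view or a copy.

```python
import numpy as np


def slice_along_axis(arr, axis, start, stop):
    """Hypothetical helper: return arr[start:stop] taken along one axis,
    whatever the number of dimensions of arr."""
    index = [slice(None)] * arr.ndim   # equivalent to ':' on every axis
    index[axis] = slice(start, stop)   # restrict only the requested axis
    return arr[tuple(index)]           # the index must be a tuple, not a list


i10 = np.identity(10)
block = slice_along_axis(i10, axis=1, start=4, stop=6)
print(block.shape)

cube = np.zeros((2, 3, 4))
print(slice_along_axis(cube, axis=2, start=1, stop=3).shape)

# Basic slicing returns a view that shares its memory with the original array
print(np.shares_memory(block, i10))

# An explicit copy can be modified without touching the original array
safe = i10[3:7, 4:6].copy()
safe[:, :] = -1
print(np.shares_memory(safe, i10), i10.min())
```

Working on ''safe'' leaves ''i10'' untouched, unlike the view in the example above.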
  
 === Finding and counting unique values === === Finding and counting unique values ===
Line 253: Line 362:
 15.5 15.5
 >>>​ vals.sum() # The usual and easy way to do it >>>​ vals.sum() # The usual and easy way to do it
-15.5</​code>​+15.5 
 + 
 +# Compute the sum of the elements of '​nb_unique'​ 
 +# AND keep (accumulate) the intermediate results 
 +>>>​ nb_unique 
 +array([3, 3, 4]) 
 +>>>​ np.add.accumulate(nb_unique) 
 +array([ 3,  6, 10]) 
 + 
 +# The accumulated values can be used as indices to separate the different groups of sorted values! 
 +>>>​ sorted_vals 
 +array([1. , 1. , 1. , 1.5, 1.5, 1.5, 2. , 2. , 2. , 2. ]) 
 +>>>​ sorted_vals[0:​3] 
 +array([1., 1., 1.]) 
 +>>>​ sorted_vals[3:​6] 
 +array([1.5, 1.5, 1.5]) 
 +>>>​ sorted_vals[6:​10] 
 +array([2., 2., 2., 2.]) 
 + 
 +# Compute the sum of each equal-value group 
 +>>>​ sorted_vals[0:​3].sum(),​ sorted_vals[3:​6].sum(),​ sorted_vals[6:​10].sum() 
 +(3.0, 4.5, 8.0)</​code>​ 
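The group boundaries do not have to be typed by hand. A minimal sketch, assuming ''sorted_vals'' holds the values displayed above: ''np.cumsum'' gives the same result as ''np.add.accumulate'', and ''np.split'' cuts the array at the accumulated indices. The names ''boundaries'', ''groups'' and ''group_sums'' are illustrative.

```python
import numpy as np

# Values copied from the session above
sorted_vals = np.array([1., 1., 1., 1.5, 1.5, 1.5, 2., 2., 2., 2.])

# Number of occurrences of each unique value
unique_vals, nb_unique = np.unique(sorted_vals, return_counts=True)

# np.cumsum(x) is equivalent to np.add.accumulate(x)
boundaries = np.cumsum(nb_unique)

# Split at every boundary except the last one (which is the end of the array)
groups = np.split(sorted_vals, boundaries[:-1])

group_sums = [float(group.sum()) for group in groups]
print(boundaries)
print(group_sums)
```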
 + 
 +=== Applying a ufunc over specified sections of an array === 
 + 
 +The [[https://numpy.org/doc/stable/reference/generated/numpy.ufunc.reduceat.html#numpy.ufunc.reduceat|reduceat]] method of a ufunc can be used to avoid explicit Python loops, and improve the speed (but not the readability...) of a script. The example below //improves// on what has been shown above: it computes the sum of each equal-value group with a single call.
 + 
 +<code># Define a list with the start indices of the intervals we want to apply the 'add' function to
 +# We need to add the beginning index (0), AND remove the last index
 +# (reduceat will automatically go to the end of the input array)
 +>>>​ nb_unique 
 +array([3, 3, 4]) 
 +>>>​ slices_indices = [0] + list(np.add.accumulate(nb_unique)) 
 +>>>​ slices_indices.pop() # Remove last element 
 +10 
 +>>>​ slices_indices 
 +[0, 3, 6] 
 + 
 +# Compute the sums over the selected intervals with just one call 
 +>>>​ np.add.reduceat(np.sort(vals),​ slices_indices) 
 +array([3. , 4.5, 8. ])</​code>​
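The start indices can also be computed without building a list and popping its last element. The sketch below is one possible variant, assuming the same ''sorted_vals'' and ''nb_unique'' values as above: subtracting each count from the accumulated counts gives the start of each group. It also checks the result against an explicit loop and shows that other ufuncs work the same way.

```python
import numpy as np

# Values copied from the session above
sorted_vals = np.array([1., 1., 1., 1.5, 1.5, 1.5, 2., 2., 2., 2.])
nb_unique = np.array([3, 3, 4])

# End index of each group, then start index of each group
ends = np.cumsum(nb_unique)
starts = ends - nb_unique

# One call computes the sum of every group
sums = np.add.reduceat(sorted_vals, starts)

# Same computation with an explicit Python loop, for comparison
loop_sums = np.array([sorted_vals[b:e].sum() for b, e in zip(starts, ends)])

# reduceat is available for other binary ufuncs, e.g. the maximum of each group
maxima = np.maximum.reduceat(sorted_vals, starts)

print(starts, sums, maxima)
```

The loop version is easier to read; the ''reduceat'' version is usually the better choice when there are many groups.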
  
 /* /*
other/python/misc_by_jyp.1645456556.txt.gz · Last modified: 2022/02/21 15:15 by jypeter