other:python:misc_by_jyp

Differences

This shows you the differences between two versions of the page.

other:python:misc_by_jyp [2023/05/04 09:46]
jypeter [Data representation] Added the Base notions section
other:python:misc_by_jyp [2023/05/04 15:25]
jypeter [Numerical values] Lots of changes
Line 470: Line 470:
 ==== Base notions ==== ==== Base notions ====
  
-  * **Never forget** that all the bits and pieces of information we use are coded in [[https://​en.wikipedia.org/​wiki/​Binary_number#​Counting_in_binary|base 2]] (''​0''​s and ''​1''​s),​ grouped in bytes!+  * **Never forget** that all the bits and pieces of information we use are coded in [[https://​en.wikipedia.org/​wiki/​Binary_number#​Counting_in_binary|base 2]] (''​0''​s and ''​1''​s ​...), grouped in bytes!
     * Some things can be stored exactly (integers, characters, ...)     * Some things can be stored exactly (integers, characters, ...)
     * In other cases (**//real// numbers** that we work with all the time, compressed images/videos/music) we only store a **//good enough approximation//**     * In other cases (**//real// numbers** that we work with all the time, compressed images/videos/music) we only store a **//good enough approximation//**
Line 476: Line 476:
   * 1 byte <=> 8 bits   * 1 byte <=> 8 bits
     * ''​REAL*4''​ <=> 4 bytes <=> 32 bits     * ''​REAL*4''​ <=> 4 bytes <=> 32 bits
-    * For easier written/​displayed representation,​ 1 byte is usually split into 2 groups of 4 bits, using base 16 and [[https://​en.wikipedia.org/​wiki/​Hexadecimal|hexadecimal representation]] +    * For easier written/​displayed representation,​ 1 byte is usually split into 2 groups of 4 bits, and displayed ​using base 16 and [[https://​en.wikipedia.org/​wiki/​Hexadecimal|hexadecimal representation]] ​(characters ''​0'',​ ''​1'',​ ..., ''​A'',​ ''​B'',​ ..., ''​F''​) 
-      * ''0000'' <=> ''0'', ''0001'' <=> ''1'', ..., ''1111'' <=> ''F''+      * ''0000'' <=> ''0'',\\ ''0001'' <=> ''1'', ...,\\ ''1111'' <=> ''F''
       * ''​1101''​ <=> ''​D''​ in hexadecimal <=> ''​13''​ in decimal (''​**1** * 8 + **1** * 4 + **0** * 2 + **1** * 1''​)       * ''​1101''​ <=> ''​D''​ in hexadecimal <=> ''​13''​ in decimal (''​**1** * 8 + **1** * 4 + **0** * 2 + **1** * 1''​)
-      * ''​11111101''​ <=> ''​1111 1101''​ <=> ''​FC''​ in hexadecimal <=> ''​253'' ​in decimal ​(''​15 * 16 + 13''​)+      * ''​11111101'' ​in //base 2// <=> ''​1111 1101''​ <=> ''​FD''​ in //hexadecimal// <=> ''​253''​ (''​15 * 16 + 13''​) ​in //decimal//
  
-  * Conversion ​with Python+  * Base conversion ​with Python
     * <​code>>>>​ hex(13) # Decimal to Hexadecimal conversion     * <​code>>>>​ hex(13) # Decimal to Hexadecimal conversion
 '​0xd'​ '​0xd'​
->>> hex(255)+>>> hex(253)
-'0xff'+'0xfd'
 >>>​ hex(256) >>>​ hex(256)
 '​0x100'​ '​0x100'​
 >>>​ int('​0x100',​ 16) # Hexadecimal to Decimal conversion >>>​ int('​0x100',​ 16) # Hexadecimal to Decimal conversion
 256 256
->>>​ int('​11',​ 2) 
-3 
 >>>​ int('​1111',​ 2) # Binary to Decimal conversion >>>​ int('​1111',​ 2) # Binary to Decimal conversion
 15 15
->>>​ int('​11111101',​ 2) +>>>​ int('​11111101',​ 2) # '​11111101'​ <='1111 1101' <='​FD'​ <=> 15 * 16 + 13 = 253
-253 +
->>>​ 15 * 16 + 13+
 253 253
 >>> 013 # DANGER! Python 2 reads an integer literal starting with 0 as OCTAL; Python 3 rejects it and expects 0o13 >>> 013 # DANGER! Python 2 reads an integer literal starting with 0 as OCTAL; Python 3 rejects it and expects 0o13
Line 502: Line 498:
 >>>​ int('​13',​ 8) # 1*8 + 3 >>>​ int('​13',​ 8) # 1*8 + 3
 11</​code>​ 11</​code>​
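The conversions above can also be sketched as a small Python 3 script. The helper name ''to_nibbles'' and its ''n_bytes'' parameter are hypothetical (they are not part of the original page), and the helper assumes a non-negative integer that fits in the requested number of bytes.

```python
def to_nibbles(value, n_bytes=1):
    """Return the binary digits of a non-negative integer, in groups of 4 bits.

    Hypothetical helper: assumes 0 <= value < 2 ** (8 * n_bytes).
    """
    bits = format(value, '0{}b'.format(8 * n_bytes))
    return ' '.join(bits[i:i + 4] for i in range(0, len(bits), 4))


print(to_nibbles(253))                  # 1111 1101
print(format(253, '02X'))               # FD
print(bin(253), hex(253))               # 0b11111101 0xfd
print(int('FD', 16), int('11111101', 2), 0b11111101, 0xFD)  # 253 253 253 253
print(0o13, int('13', 8))               # 11 11 (Python 3 octal literals use the 0o prefix)
```

Each group of 4 bits printed by ''to_nibbles'' maps to exactly one hexadecimal character, which is why ''1111 1101'' reads directly as ''FD''.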
 +
 +  * More technical topics
 +    * [[https://en.wikipedia.org/wiki/Bit_numbering|Bit numbering]]: the art of ordering bits, everything about MSB (Most Significant Bit) and LSB (Least Significant Bit)
 +    * [[https://​en.wikipedia.org/​wiki/​Endianness|Endianness]]:​ the art of ordering bytes
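As a rough illustration of bit numbering and endianness, the following sketch uses only the Python 3 standard library. The value ''0x0A0B0C0D'' and all variable names are arbitrary choices made for this example.

```python
import struct
import sys

# Bit numbering inside one byte: bit 7 is the most significant bit (MSB),
# bit 0 the least significant bit (LSB)
byte = 0b11111101
msb = (byte >> 7) & 1     # 1
bit_1 = (byte >> 1) & 1   # 0 (the only bit that is not set in this byte)
lsb = byte & 1            # 1
print(msb, bit_1, lsb)

# Endianness: the same 4-byte integer stored with two different byte orders
value = 0x0A0B0C0D
big = value.to_bytes(4, byteorder='big')        # most significant byte first
little = value.to_bytes(4, byteorder='little')  # least significant byte first
print(big.hex())      # 0a0b0c0d
print(little.hex())   # 0d0c0b0a
print(sys.byteorder)  # native byte order of the machine running the script

# Reading the bytes with the wrong byte order silently gives another number
wrong = int.from_bytes(big, byteorder='little')
print(hex(wrong))     # 0xd0c0b0a

# The struct module uses '>' for big-endian and '<' for little-endian
assert struct.pack('>i', value) == big
assert struct.pack('<i', value) == little
```

This is the kind of mismatch that can show up when a binary file written on one machine is read on another without declaring the byte order.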
 ==== Numerical values ==== ==== Numerical values ====
  
-  * Binary data representation of some numbers (not everythin is listed here):+  * Binary data representation of some numbers (only some common types are listed here): 
 +    * Languages and packages **references** used below: 
 +      * Python: [[https://​numpy.org/​doc/​stable/​reference/​arrays.scalars.html#​sized-aliases|NumPy Sized aliases]] 
 +      * NetCDF: [[https://​docs.unidata.ucar.edu/​nug/​current/​md_types.html|Data Types]], [[https://​docs.unidata.ucar.edu/​netcdf-fortran/​current/​f90-variables.html#​f90-language-types-corresponding-to-netcdf-external-data-types|Fortran related Data Types]], [[https://​docs.unidata.ucar.edu/​nug/​current/​_c_d_l.html#​cdl_data_types|CDL Data Types]] 
 +      * Fortran: Intel Fortran Compiler [[https://​www.intel.com/​content/​www/​us/​en/​docs/​fortran-compiler/​developer-guide-reference/​2023-1/​intrinsic-data-types.html|Intrinsic Data Types]]
     * [[https://​en.wikipedia.org/​wiki/​Integer_(computer_science)|Integers]]     * [[https://​en.wikipedia.org/​wiki/​Integer_(computer_science)|Integers]]
       * Range:       * Range:
-        * 4-byte integers: −2,​147,​483,​648 to 2,​147,​483,​647+        * 4-byte ​//​signed// ​integers: ​''​−2,​147,​483,​648'' ​to ''​2,​147,​483,​647''​
           * Python: ''​numpy.int32''​           * Python: ''​numpy.int32''​
-          * [[https://​docs.unidata.ucar.edu/​nug/​current/​md_types.html|NetCDF]], [[https://​docs.unidata.ucar.edu/​netcdf-fortran/​current/​f90-variables.html#​f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]: ''​int'',​ ''​NC_INT64'',​ ''​NF90_INT''​ +          * NetCDF: ''​int'',​ ''​NC_INT''​ or ''​NC_LONG'',​ ''​NF90_INT''​ 
-          * Fortran: +          * Fortran: ​''​INTEGER*4''​ 
-        * 8-byte integers: −9,​223,​372,​036,​854,​775,​808 to 9,​223,​372,​036,​854,​775,​807+        * 8-byte ​//​signed// ​integers: ​''​−9,​223,​372,​036,​854,​775,​808'' ​to ''​9,​223,​372,​036,​854,​775,​807''​
           * Python: ''​numpy.int64''​           * Python: ''​numpy.int64''​
-          * [[https://​docs.unidata.ucar.edu/​nug/​current/​md_types.html|NetCDF]]: ''​int64'',​ ''​NC_INT64''​ +          * NetCDF: ''​int64'',​ ''​NC_INT64''​ 
-          * Fortran:+          * Fortran: ​''​INTEGER*8''​
       * Tech note: signed integers use [[https://​en.wikipedia.org/​wiki/​Two%27s_complement|two'​s complement]] for coding negative integers       * Tech note: signed integers use [[https://​en.wikipedia.org/​wiki/​Two%27s_complement|two'​s complement]] for coding negative integers
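The ranges and the two's complement coding listed above can be checked with a minimal sketch in plain Python 3. The helper names ''signed_range'' and ''twos_complement_bits'' are hypothetical and only serve this example; with NumPy, ''numpy.iinfo(numpy.int32)'' should report the same limits.

```python
def signed_range(n_bytes):
    """Smallest and largest values of a signed integer stored on n_bytes bytes."""
    bits = 8 * n_bytes
    return -2 ** (bits - 1), 2 ** (bits - 1) - 1


def twos_complement_bits(value, n_bytes=4):
    """Bit pattern used to store a signed integer in two's complement."""
    bits = 8 * n_bytes
    # Masking with 2**bits - 1 maps a negative value to its two's complement pattern
    return format(value & (2 ** bits - 1), '0{}b'.format(bits))


print(signed_range(4))           # (-2147483648, 2147483647)
print(signed_range(8))           # (-9223372036854775808, 9223372036854775807)
print(twos_complement_bits(1))   # 31 zeros followed by a single 1
print(twos_complement_bits(-1))  # 32 ones

# The same 4 bytes give different numbers when read as signed or unsigned
raw = (-2).to_bytes(4, byteorder='big', signed=True)
print(raw.hex())                                        # fffffffe
print(int.from_bytes(raw, byteorder='big', signed=True))   # -2
print(int.from_bytes(raw, byteorder='big', signed=False))  # 4294967294
```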
     * [[https://en.wikipedia.org/wiki/IEEE_754|Floating point numbers]] (//IEEE 754// standard aka //IEEE Standard for Binary Floating-Point Arithmetic//)     * [[https://en.wikipedia.org/wiki/IEEE_754|Floating point numbers]] (//IEEE 754// standard aka //IEEE Standard for Binary Floating-Point Arithmetic//)
       * Range:       * Range:
-        * 4-byte float: ~7 significant digits * 10^±38+        * 4-byte float: ''~7 significant digits * 10^±38''
           * Python: ''​numpy.float32''​           * Python: ''​numpy.float32''​
-          * [[https://docs.unidata.ucar.edu/nug/current/md_types.html|NetCDF]], [[https://docs.unidata.ucar.edu/netcdf-fortran/current/f90-variables.html#f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]: +          * NetCDF: ''float'', ''NC_FLOAT'', ''NF90_FLOAT''
-          * Fortran:+          * Fortran: ''REAL*4''
           * See also [[https://​en.wikipedia.org/​wiki/​Single-precision_floating-point_format|Single-precision floating-point format]]           * See also [[https://​en.wikipedia.org/​wiki/​Single-precision_floating-point_format|Single-precision floating-point format]]
-        * 8-byte float: ~15 significant digits * 10^±308+        * 8-byte float: ''~15 significant digits * 10^±308''
           * Python: ''​numpy.float64''​           * Python: ''​numpy.float64''​
-          * [[https://docs.unidata.ucar.edu/nug/current/md_types.html|NetCDF]], [[https://docs.unidata.ucar.edu/netcdf-fortran/current/f90-variables.html#f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]: +          * NetCDF: ''double'', ''NC_DOUBLE'', ''NF90_DOUBLE''
-          * Fortran: +          * Fortran: ​''​REAL*8''​ 
-      * Special values: +      ​* **Special values**
-        * [[https://​en.wikipedia.org/​wiki/​NaN|NaN]] ​(''​numpy.nan''​): //Not a Number// +        * [[https://​en.wikipedia.org/​wiki/​NaN|NaN]]:​ //Not a Number// 
-        * Infinity ​(''​-numpy.inf''​ and ''​numpy.inf''​)+          * Python: ''​numpy.nan''​ 
 +        * Infinity 
 +          * Python: ​''​-numpy.inf''​ and ''​numpy.inf''​
         * Note: when dealing with missing values, it is cleaner to use masks (and [[https://numpy.org/doc/stable/reference/maskedarray.generic.html|NumPy masked arrays]]) than NaNs!         * Note: when dealing with missing values, it is cleaner to use masks (and [[https://numpy.org/doc/stable/reference/maskedarray.generic.html|NumPy masked arrays]]) than NaNs!
-    ​* [[https://​en.wikipedia.org/​wiki/​Bit_numbering|Bit numbering]] +      * <wrap hi>The RISKS of working with (the wrong) floats</​wrap>:​ 
-    * [[https://​en.wikipedia.org/​wiki/​Endianness|Endianness]]+        ​* [[https://​en.wikipedia.org/​wiki/​Round-off_error|Round-off error]] 
 +        * [[https://​en.wikipedia.org/​wiki/​Catastrophic_cancellation|Catastrophic cancellation]] 
 +          * [[https://​docs.oracle.com/​cd/​E19957-01/​806-3568/​ncg_goldberg.html|What Every Computer Scientist Should Know About Floating-Point Arithmetic]]
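The risks listed above can be reproduced in a few lines. This is only a sketch using the Python 3 standard library: the helper ''as_float32'' is hypothetical and uses ''struct'' to mimic what happens when a value is stored in a 4-byte float, and the numbers are arbitrary examples.

```python
import math
import struct
import sys


def as_float32(x):
    """Round a Python float (8 bytes) to the nearest 4-byte float and back."""
    return struct.unpack('f', struct.pack('f', x))[0]


# Limits of the 8-byte float used by Python itself
print(sys.float_info.dig, sys.float_info.max)  # 15 1.7976931348623157e+308

# Round-off error: 0.1, 0.2 and 0.3 have no exact binary representation
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True: compare floats with a tolerance

# A 4-byte float keeps only about 7 significant decimal digits
print(as_float32(0.1))               # 0.10000000149011612
print(as_float32(16777217.0))        # 16777216.0 (2**24 + 1 cannot be stored)

# Cancellation: the 1.0 is lost to round-off when added to a huge number,
# and the subtraction then exposes the loss
big = 1.0e16
print((big + 1.0) - big)             # 0.0 instead of 1.0

# Special values
nan = float('nan')
print(nan == nan, math.isnan(nan))   # False True: NaN is not even equal to itself
print(math.inf > sys.float_info.max) # True
```

Because ''nan == nan'' is false, a test such as ''math.isnan'' (or a mask) is needed to detect missing values stored as NaN.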
     * A rather technical example: we //play// with a numpy 4-byte integer scalar     * A rather technical example: we //play// with a numpy 4-byte integer scalar
       * <​code>>>>​ one_int32 = np.int32(1)       * <​code>>>>​ one_int32 = np.int32(1)
other/python/misc_by_jyp.txt · Last modified: 2024/04/19 12:02 by jypeter