This shows you the differences between two versions of the page.
other:python:misc_by_jyp [2023/04/28 14:16] jypeter Moved the new Data Representation to the end |
other:python:misc_by_jyp [2023/05/04 11:19] jypeter [Base notions] |
||
---|---|---|---|
Line 466: | Line 466: | ||
A few notes for a future section or page about //data representation// (bits and bytes) on disk and in memory, vs //data format// | A few notes for a future section or page about //data representation// (bits and bytes) on disk and in memory, vs //data format// | ||
+ | FIXME Add parts (pages 28 to 37) of this [[https://wiki.lsce.ipsl.fr/pmip3/doku.php/other:python:jyp_steps#part_2|old tutorial]] to this section | ||
+ | ==== Base notions ==== | ||
+ | |||
+ | * **Never forget** that all the bits and pieces of information we use are coded in [[https://en.wikipedia.org/wiki/Binary_number#Counting_in_binary|base 2]] (''0''s and ''1''s), grouped in bytes! | ||
+ | * Some things can be stored exactly (integers, characters, ...) | ||
+ | * In other cases (**//real// numbers** that we work with all the time, compressed images/videos/music) we only store a **//good enough approximation//** | ||
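The difference between exact and approximate storage can be checked directly in Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not part of the original notes.

```python
from decimal import Decimal
from fractions import Fraction

# 0.1 has no finite base-2 expansion, so the stored value is only close to 0.1.
# Decimal(float) shows the exact value that is really held in memory.
stored = Decimal(0.1)
print(stored)

# Integers are stored exactly
exact_int = int("11111111", 2)
print(exact_int)

# A visible consequence of the approximation
sum_is_exact = (0.1 + 0.2 == 0.3)
print(sum_is_exact)

# Fractions whose denominator is a power of 2 are stored exactly, others are not
half_is_exact = (Fraction(0.5) == Fraction(1, 2))
tenth_is_exact = (Fraction(0.1) == Fraction(1, 10))
print(half_is_exact, tenth_is_exact)
```

In practice this is why real numbers should be compared with a tolerance (for example with ''math.isclose'') rather than with ''==''.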
+ | |||
+ | * 1 byte <=> 8 bits | ||
+ | * ''REAL*4'' <=> 4 bytes <=> 32 bits | ||
+ | * For easier written/displayed representation, 1 byte is usually split into 2 groups of 4 bits, and displayed using base 16 and [[https://en.wikipedia.org/wiki/Hexadecimal|hexadecimal representation]] (characters ''0'', ''1'', ..., ''A'', ''B'', ..., ''F'') | ||
+ | * ''0000'' <=> ''0'',\\ ''0001'' <=> ''1'', ...,\\ ''1111'' <=> ''F'' | ||
+ | * ''1101'' <=> ''D'' in hexadecimal <=> ''13'' in decimal (''**1** * 8 + **1** * 4 + **0** * 2 + **1** * 1'') | ||
+ | * ''11111101'' <=> ''1111 1101'' <=> ''FD'' in hexadecimal <=> ''253'' in decimal (''15 * 16 + 13'') | ||
+ | |||
+ | * Conversion with Python | ||
+ | * <code>>>> hex(13) # Decimal to Hexadecimal conversion | ||
+ | '0xd' | ||
+ | >>> hex(253) | ||
+ | '0xfd' | ||
+ | >>> hex(256) | ||
+ | '0x100' | ||
+ | >>> int('0x100', 16) # Hexadecimal to Decimal conversion | ||
+ | 256 | ||
+ | >>> int('11', 2) | ||
+ | 3 | ||
+ | >>> int('1111', 2) # Binary to Decimal conversion | ||
+ | 15 | ||
+ | >>> int('11111101', 2) | ||
+ | 253 | ||
+ | >>> 15 * 16 + 13 | ||
+ | 253 | ||
+ | >>> 0o13 # OCTAL literal: Python 3 requires the 0o prefix (Python 2 silently read 013 as octal, Python 3 rejects 013 with a SyntaxError) | ||
+ | 11 | ||
+ | >>> int('13', 8) # 1*8 + 3 | ||
+ | 11</code> | ||
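The splitting of one byte into 2 groups of 4 bits described above can be sketched as follows. The helper name ''byte_to_nibbles'' is hypothetical, chosen only for this illustration; it relies on the built-in ''format'' and ''divmod'' functions.

```python
def byte_to_nibbles(value):
    """Return the bits of one byte in 2 groups of 4, and its 2 hexadecimal digits."""
    if not 0 <= value <= 255:
        raise ValueError("value does not fit in one byte")
    bits = format(value, "08b")           # 8 binary digits, zero-padded
    grouped = bits[:4] + " " + bits[4:]   # split into 2 groups of 4 bits
    return grouped, format(value, "02X")  # 2 uppercase hexadecimal digits

grouped, hexa = byte_to_nibbles(253)
print(grouped, hexa)

# The two hexadecimal digits are the quotient and remainder of a division by 16
high, low = divmod(253, 16)
print(high, low)
```

Each group of 4 bits maps to exactly one hexadecimal digit, which is why a byte is always written with 2 hexadecimal characters.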
==== Numerical values ==== | ==== Numerical values ==== | ||
- | * Binary data representation of some numbers: | + | * Binary data representation of some numbers (not everything is listed here): |
* [[https://en.wikipedia.org/wiki/Integer_(computer_science)|Integers]] | * [[https://en.wikipedia.org/wiki/Integer_(computer_science)|Integers]] | ||
* Range: | * Range: | ||
- | * 4-byte integers (''numpy.int32''): −2,147,483,648 to 2,147,483,647 | + | * 4-byte integers: −2,147,483,648 to 2,147,483,647 |
- | * 8-byte integers (''numpy.int64''): −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | + | * Python: ''numpy.int32'' |
+ | * [[https://docs.unidata.ucar.edu/nug/current/md_types.html|NetCDF]], [[https://docs.unidata.ucar.edu/netcdf-fortran/current/f90-variables.html#f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]: ''int'', ''NC_INT'', ''NF90_INT'' | ||
+ | * Fortran: | ||
+ | * 8-byte integers: −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | ||
+ | * Python: ''numpy.int64'' | ||
+ | * [[https://docs.unidata.ucar.edu/nug/current/md_types.html|NetCDF]]: ''int64'', ''NC_INT64'' | ||
+ | * Fortran: | ||
* Tech note: signed integers use [[https://en.wikipedia.org/wiki/Two%27s_complement|two's complement]] for coding negative integers | * Tech note: signed integers use [[https://en.wikipedia.org/wiki/Two%27s_complement|two's complement]] for coding negative integers | ||
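The integer ranges listed above follow directly from the number of bits and from two's complement coding. The sketch below uses only the standard library (''struct'' stands in for a fixed-size integer type, which is an assumption of this example); ''signed_range'' is a hypothetical helper name.

```python
import struct

def signed_range(n_bytes):
    """Smallest and largest values of a two's complement integer of n_bytes."""
    bits = 8 * n_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

int32_min, int32_max = signed_range(4)
int64_min, int64_max = signed_range(8)
print(int32_min, int32_max)
print(int64_min, int64_max)

# Two's complement: -1 is stored with all bits set,
# and the most negative value has only the highest bit set
minus_one_hex = struct.pack(">i", -1).hex()
int32_min_hex = struct.pack(">i", int32_min).hex()
print(minus_one_hex, int32_min_hex)

# A value outside the range cannot be stored in 4 bytes
try:
    struct.pack(">i", int32_max + 1)
    overflow_detected = False
except struct.error:
    overflow_detected = True
print(overflow_detected)
```

The ''">i"'' format asks for a big-endian 4-byte signed integer, so the hexadecimal strings can be read from the most significant byte to the least significant one.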
* [[https://en.wikipedia.org/wiki/IEEE_754|Floating point numbers]] (//IEEE 754// standard aka //IEEE Standard for Binary Floating-Point Arithmetic//) | * [[https://en.wikipedia.org/wiki/IEEE_754|Floating point numbers]] (//IEEE 754// standard aka //IEEE Standard for Binary Floating-Point Arithmetic//) | ||
* Range: | * Range: | ||
- | * 4-byte float (''numpy.float32''): ~7 significant digits * 10^±38 | + | * 4-byte float: ~7 significant digits * 10^±38 |
+ | * Python: ''numpy.float32'' | ||
+ | * [[https://docs.unidata.ucar.edu/nug/current/md_types.html|NetCDF]], [[https://docs.unidata.ucar.edu/netcdf-fortran/current/f90-variables.html#f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]: ''float'', ''NC_FLOAT'', ''NF90_FLOAT'' | ||
+ | * Fortran: | ||
* See also [[https://en.wikipedia.org/wiki/Single-precision_floating-point_format|Single-precision floating-point format]] | * See also [[https://en.wikipedia.org/wiki/Single-precision_floating-point_format|Single-precision floating-point format]] | ||
- | * 8-byte float (''numpy.float64''): ~15 significant digits * 10^±308 | + | * 8-byte float: ~15 significant digits * 10^±308 |
+ | * Python: ''numpy.float64'' | ||
+ | * [[https://docs.unidata.ucar.edu/nug/current/md_types.html|NetCDF]], [[https://docs.unidata.ucar.edu/netcdf-fortran/current/f90-variables.html#f90-language-types-corresponding-to-netcdf-external-data-types|NetCDF-Fortran]]: ''double'', ''NC_DOUBLE'', ''NF90_DOUBLE'' | ||
+ | * Fortran: | ||
* Special values: | * Special values: | ||
* [[https://en.wikipedia.org/wiki/NaN|NaN]] (''numpy.nan''): //Not a Number// | * [[https://en.wikipedia.org/wiki/NaN|NaN]] (''numpy.nan''): //Not a Number// |
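The precision of 4-byte and 8-byte floats and the behaviour of NaN can be observed with the standard library alone. In this sketch, ''struct'' is used as a stand-in for ''numpy.float32'' (an assumption of the example, so that it runs without numpy), and ''to_float32'' is a hypothetical helper name.

```python
import math
import struct
import sys

def to_float32(x):
    """Round a Python float (8 bytes) to the nearest 4-byte float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Storing pi on 4 bytes keeps only about 7 significant digits
pi32 = to_float32(math.pi)
error32 = abs(pi32 - math.pi)
print(math.pi, pi32, error32)

# Properties of the 8-byte float used by Python itself
digits64 = sys.float_info.dig   # decimal digits that are always preserved
max64 = sys.float_info.max      # largest representable value
print(digits64, max64)

# NaN is not equal to anything, not even to itself: test it with isnan
nan = math.nan
nan_equals_itself = (nan == nan)
nan_detected = math.isnan(nan)
print(nan_equals_itself, nan_detected)
```

With numpy arrays, the equivalent NaN test would be ''numpy.isnan'', since ''array == numpy.nan'' is never true.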