other:python:jyp_steps

Differences

This shows you the differences between two versions of the page.

other:python:jyp_steps [2021/09/22 14:23]
jypeter Added statsmodels, scikit-learn and scikit-image
other:python:jyp_steps [2024/03/07 10:15] (current)
jypeter Added a Protocol Buffers section to the file formats
Line 13: Line 13:
 You can start using python by reading the {{:other:python:python_intro_ipsl_oct2013_v2.pdf|Bien démarrer avec python}} tutorial that was used during a 2013 IPSL python class:
   * this tutorial is in French (my apologies for the lack of translation, but it should be easy to understand)
-    * If you have too much trouble understanding this French Tutorial, you can read the first 6 chapters of the **Tutorial** in [[#the_official_python_documentation|the official Python documentation]] and chapters 1.2.1 to 1.2.5 in the [[#scipy_lecture_notes|Scipy Lecture Notes]]. Once you have read these, you can try to read the French tutorial again
+    * If you have too much trouble understanding this French Tutorial, you can read the first 6 chapters of the **Tutorial** in [[#the_official_python_documentation|the official Python documentation]] and chapters 1.2.1 to 1.2.5 in the [[#scientific_python_lectures|Scientific Python Lectures]]. Once you have read these, you can try to read the French tutorial again
   * it's an introduction to python (and programming) for the climate scientist: after reading this tutorial, you should be able to do most of the things you usually do in a shell script
     * python types, tests, loops, reading a text file
Line 44: Line 44:
  
 [[https://docs.python.org/3/|html]] - [[https://docs.python.org/3/download.html|pdf (in a zip file)]]
 +
 +
 +===== Scientific Python Lectures =====
 +
 +Summary: //One document to learn numerics, science, and data with Python//
 +
 +Note: this used to be called //Scipy Lecture Notes//
 +
 +Where: [[https://​lectures.scientific-python.org/​_downloads/​ScientificPythonLectures-simple.pdf|pdf]] - [[https://​lectures.scientific-python.org/​|html]]
 +
 +This is **a really nice and useful document** that is regularly updated and used for the [[https://​www.euroscipy.org/​|EuroScipy]] tutorials.
 +
 +This document will teach you lots of things about python, numpy and matplotlib, debugging and optimizing scripts, and about using python for statistics, image processing, machine learning, washing dishes (this is just to check if you have read this page), etc...
 +  * Example: the [[https://​lectures.scientific-python.org/​packages/​statistics/​index.html|Statistics in Python]] tutorial that combines [[other:​python:​jyp_steps#​pandas|Pandas]],​ [[http://​statsmodels.sourceforge.net/​|Statsmodels]] and [[http://​seaborn.pydata.org/​|Seaborn]]
  
  
Line 64: Line 78:
     - Numpy Reference Guide
     - Scipy Reference Guide
 +  - read [[https://​github.com/​rougier/​numpy-100/​blob/​master/​100_Numpy_exercises.ipynb|100 numpy exercises]]
  
 ==== Beware of the array view side effects ====
Line 123: Line 138:
  
 ==== Extra numpy information ====
 +
 +<WRAP center round tip 60%>
 +You can also check the [[other:​python:​misc_by_jyp#​numpy_related_stuff|numpy section]] of the //Useful python stuff// page
 +</​WRAP>​
 +
  
   * More information about **array indexing**:\\ <wrap em>Always check what you are doing on a simple test case, when you use advanced/fancy indexing!</wrap>
Line 128: Line 148:
       * {{ :other:python:indirect_indexing_2.py.txt |}}: Take a vertical slice in a 3D zyx array, along a varying y 'path'
     * [[https://numpy.org/doc/stable/user/basics.indexing.html|Array indexing basics (user guide)]] (//index arrays//, //boolean index arrays//, //np.newaxis//, //Ellipsis//, //variable numbers of indices//, ...)
-    * [[https://numpy.org/doc/stable/reference/arrays.indexing.html|Array indexing (reference manual)]]
+    * [[https://numpy.org/doc/stable/reference/arrays.indexing.html|Indexing routines (reference manual)]]
     * [[https://numpy.org/doc/stable/user/quickstart.html#advanced-indexing-and-index-tricks|Advanced indexing and index tricks]] and [[https://numpy.org/doc/stable/user/quickstart.html#the-ix-function|the ix_() function]]
-    * [[https://​numpy.org/​doc/​stable/​reference/​routines.indexing.html#​routines-indexing|Indexing routines]] ​ 
   * More information about arrays:
     * [[https://numpy.org/doc/stable/reference/routines.array-creation.html|Array creation routines]]
     * [[https://numpy.org/doc/stable/reference/routines.array-manipulation.html|Array manipulation routines]]
 +    * [[https://​numpy.org/​doc/​stable/​reference/​routines.sort.html|Sorting,​ searching, and counting routines]]
     * [[https://numpy.org/doc/stable/reference/maskedarray.html|Masked arrays]]
       * [[https://numpy.org/doc/stable/reference/routines.ma.html|Masked array operations]]
   * [[https://numpy.org/doc/stable/user/misc.html#ieee-754-floating-point-special-values|Dealing with special numerical values]] (//Nan//, //inf//)
     * If you know that your data has missing values, it is cleaner and safer to handle them with [[https://numpy.org/doc/stable/reference/maskedarray.html|masked arrays]]!
 +    * If you know that some of your data //may// have masked values, play safe by explicitly using ''​np.ma.some_function()''​ rather than just ''​np.some_function()''​
 +      * More details in the [[https://​github.com/​numpy/​numpy/​issues/​18675|Why/​when does np.something remove the mask of a np.ma array ?]] discussion
     * [[https://numpy.org/doc/stable/user/misc.html#how-numpy-handles-numerical-exceptions|Handling numerical exceptions]]
     * [[https://numpy.org/doc/stable/reference/routines.err.html|Floating point error handling]]
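
Two pieces of advice from the list above (check fancy indexing on a simple test case, and use the ''np.ma'' functions when data may be masked) can be tried on tiny arrays. This is only a minimal sketch: the array sizes, the ''y_path'' values and the ''-999.'' missing-value marker are assumptions made up for the illustration, and it is not the content of the ''indirect_indexing_2.py.txt'' script.

```python
import numpy as np

# --- Fancy indexing: check the result on a tiny array first ---
# Hypothetical 3D array with (z, y, x) axes; each value encodes its position
nz, ny, nx = 2, 3, 4
data_zyx = np.arange(nz * ny * nx).reshape(nz, ny, nx)

# One y index for each x position (the varying y 'path')
y_path = np.array([0, 1, 2, 1])

# The two index arrays are broadcast together: the result has shape (nz, nx)
section = data_zyx[:, y_path, np.arange(nx)]
print(section.shape)
print(section)

# --- Masked arrays: keep the mask by using the np.ma functions ---
# -999. is an assumed missing-value marker
temperatures = np.ma.masked_values([1.0, 2.0, -999.0, 4.0], -999.0)

masked_mean = np.ma.mean(temperatures)      # uses only the 3 valid values
masked_median = np.ma.median(temperatures)  # uses only the 3 valid values

# What you would get if the mask were lost: the raw data still holds -999.
unmasked_mean = temperatures.data.mean()

print(masked_mean, masked_median, unmasked_mean)
```

Because each value of ''data_zyx'' encodes its own position, it is easy to verify by eye that ''section'' picked the expected elements before applying the same indexing to real data.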
  
-===== NetCDF files: using cdms2, xarray and netCDF4 =====
+===== Using NetCDF files with Python =====
 
-There is a good chance that your input array data will come from a file in the [[other:newppl:starting#netcdf_and_file_formats|NetCDF format]].
 
-Depending on which [[other:python:starting#some_python_distributions|python distribution]] you are using, you can use the //cdms2//, //xarray// or //netCDF4// modules to read the data.
+==== What is NetCDF? ====
 
-==== cdms2 ====
+  * If you are working with climate model output data, there is a good chance that your input array data will be stored in a NetCDF file!
 
-Summary: cdms2 can read/write netCDF files (and read //grads// dat+ctl files) and provides a higher level interface than netCDF4. cdms2 is available in the [[other:python:starting#cdat|CDAT distribution]], and can theoretically be installed independently of CDAT (e.g. it will be installed when you install [[https://cmor.llnl.gov/mydoc_cmor3_conda/|CMOR in conda)]]. When you can use cdms2, you also have access to //cdtime//, that is very useful for handling time axis data.
+  * Read the [[other:newppl:starting#netcdf_and_related_conventions|NetCDF and related Conventions]] for more information
 + 
 +  * There may be different ways of dealing with NetCDF ​files, depending on which [[other:​python:​starting#​some_python_distributions|python ​distribution]] ​you have access to 
 + 
 + 
 +==== CliMAF ​and C-ESM-EP ==== 
 + 
 +People using **//CMIPn// and model data on the IPSL servers** ​can easily search and process NetCDF files using: 
 + 
 +  * the [[https://climaf.readthedocs.io/|Climate Model Assessment Framework (CliMAF)]] environment 
 + 
 +  * and the [[https://github.com/jservonnat/C-ESM-EP/​wiki|CliMAF Earth System Evaluation Platform (C-ESM-EP)]]
  
-How to get started: 
-  - read [[http://​www.lsce.ipsl.fr/​Phocea/​file.php?​class=page&​file=5/​pythonCDAT_jyp_2sur2_070306.pdf|JYP'​s cdms tutorial]], starting at page 54 
-    - the tutorial is in French (soooorry!) 
-    - you have to replace //cdms// with **cdms2**, and //MV// with **MV2** (sooorry about that, the tutorial was written when CDAT was based on //Numeric// instead of //numpy// to handle array data) 
-  - read the [[http://​cdms.readthedocs.io/​en/​docstanya/​index.html|official cdms documentation]] (link may change) 
  
 ==== xarray ====
  
-Summary: [[http://xarray.pydata.org/en/stable/|xarray]] is an open source project and Python package that makes working with labelled multi-dimensional arrays simple, efficient, and fun! [...] It is particularly tailored to working with netCDF files
+[[https://docs.xarray.dev/|xarray]] makes working with labelled multi-dimensional arrays in Python simple, efficient, and fun! [...] It is particularly tailored to working with netCDF files
 + 
 +=== Some xarray related resources === 
 + 
 +Note: more packages (than listed below) may be listed in the [[other:​uvcdat:​cdat_conda:​cdat_8_2_1#​extra_packages_list|Extra packages list]] page 
 + 
 +  * [[https://​docs.xarray.dev/​en/​stable/​generated/​xarray.tutorial.load_dataset.html|xarray test datasets]] 
 + 
 +  * **[[https://​xcdat.readthedocs.io/​|xCDAT]]:​ ''​xarray''​ extended with Climate Data Analysis Tools** 
 + 
 +  * [[https://​xoa.readthedocs.io/​en/​latest/​|xoa]]:​ xarray-based ocean analysis library 
 + 
 +  * [[https://​uxarray.readthedocs.io/​|uxarray]]:​ provide xarray styled functionality for unstructured grid datasets following [[https://​ugrid-conventions.github.io/​ugrid-conventions/​|UGRID Conventions]]
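
To give an idea of what //labelled// multi-dimensional arrays mean in practice, here is a minimal sketch that builds a small array in memory (no NetCDF file or internet access needed). The variable name ''tas'', the grid values and the selected coordinates are assumptions chosen for the illustration.

```python
import numpy as np
import xarray as xr

# Hypothetical small temperature field on a (time, lat, lon) grid
temperature = xr.DataArray(
    np.arange(24, dtype=float).reshape(2, 3, 4),
    dims=("time", "lat", "lon"),
    coords={
        "time": [0, 1],
        "lat": [-45.0, 0.0, 45.0],
        "lon": [0.0, 90.0, 180.0, 270.0],
    },
    name="tas",
)

# Select by coordinate value rather than by index position
equator = temperature.sel(lat=0.0)

# Nearest-neighbour lookup when the requested point is not on the grid
point = temperature.sel(lat=40.0, lon=100.0, method="nearest")

# Reduce along a named dimension instead of an axis number
time_mean = temperature.mean(dim="time")

print(equator.dims, point.values, time_mean.shape)
```

With a real file, the same selections should work on a variable of a dataset returned by ''xr.open_dataset()'', provided a NetCDF backend is installed.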
  
  
 ==== netCDF4 ====
  
-Summary: //netCDF4 can read/write netCDF files and is available in most python distributions//
+[[http://unidata.github.io/netcdf4-python/|netCDF4]] is a Python interface to the netCDF C library
  
-Where: [[http://​unidata.github.io/​netcdf4-python/​]] 
  
-===== CDAT-related resources =====
+==== cdms2 ====
 
-Some links, in case they can't be found easily on the [[https://cdat.llnl.gov|CDAT]] web site...
+<note important>
+  * ''cdms2'' is unfortunately not maintained anymore and is slowly being **phased out in favor of a combination of [[#xarray|xarray]] and [[https://xcdat.readthedocs.io/|xCDAT]]**
 
-  * [[https://cdat.llnl.gov/tutorials.html|Tutorials in ipython notebooks]]
-  * [[http://cdat-vcs.readthedocs.io/en/latest/|VCS: Visualization Control System]]
-    * [[https://github.com/CDAT/vcs/issues/238|Colormaps in vcs examples]]
-  * [[https://github.com/CDAT/cdat-site/blob/master/eztemplate.md|EzTemplate Documentation]]
+  * ''cdms2'' will [[https://github.com/CDAT/cdms/issues/449|not be compatible with numpy after numpy 1.23.5]] :-(
+</note>
+
+[[https://cdms.readthedocs.io/en/docstanya/|cdms2]] can read/write netCDF files (and read //grads// dat+ctl files) and provides a higher level interface than netCDF4. ''cdms2'' is available in the [[other:python:starting#cdat|CDAT distribution]], and can theoretically be installed independently of CDAT (e.g. it will be installed when you install [[https://cmor.llnl.gov/mydoc_cmor3_conda/|CMOR in conda)]]. When you can use cdms2, you also have access to //cdtime//, that is very useful for handling time axis data.
+
+How to get started:
+  - read [[http://www.lsce.ipsl.fr/Phocea/file.php?class=page&file=5/pythonCDAT_jyp_2sur2_070306.pdf|JYP's cdms tutorial]], starting at page 54
+    - the tutorial is in French (soooorry!)
+    - you have to replace //cdms// with **cdms2**, and //MV// with **MV2** (sooorry about that, the tutorial was written when CDAT was based on //Numeric// instead of //numpy// to handle array data)
+  - read the [[http://cdms.readthedocs.io/en/docstanya/index.html|official cdms documentation]] (link may change)
  
 ===== Matplotlib =====
Line 270: Line 315:
  
  
-===== 3D resources =====
+===== 3D plots resources =====
  
   * [[https://ipyvolume.readthedocs.io/en/latest/|Ipyvolume]]
   * [[https://zulko.wordpress.com/2012/09/29/animate-your-3d-plots-with-pythons-matplotlib/|Animate your 3D plots with Python’s Matplotlib]]
   * [[https://stackoverflow.com/questions/26796997/how-to-get-vertical-z-axis-in-3d-surface-plot-of-matplotlib|How to get vertical Z axis in 3D surface plot of Matplotlib?]]
 +
 +===== Data analysis =====
 +
 +==== EDA (Exploratory Data Analysis) ? ====
 +
 +<note tip>
 +The //EDA concept// seems to apply to **time series** (and tabular data), which is not exactly the case of full climate model output data</​note>​
 +
 +  * [[https://​www.geeksforgeeks.org/​what-is-exploratory-data-analysis/​|What is Exploratory Data Analysis ?]]
 +    * //The method of studying and exploring record sets to apprehend their predominant traits, discover patterns, locate outliers, and identify relationships between variables. EDA is normally carried out as a preliminary step before undertaking extra formal statistical analyses or modeling.//
 +
 +  * [[https://​medium.com/​codex/​automate-the-exploratory-data-analysis-eda-to-understand-the-data-faster-not-better-2ed6ff230eed|Automate the exploratory data analysis (EDA) to understand the data faster and easier]]: a nice comparison of some Python libraries listed below ([[#​ydata_profiling|YData Profiling]],​ [[#​d-tale|D-Tale]],​ [[#​sweetviz|sweetviz]],​ [[#​autoviz|AutoViz]])
 +
 +  * [[https://​www.geeksforgeeks.org/​exploratory-data-analysis-in-python/​|EDA in Python]]
 +
 +
 +==== Easy to use datasets ====
 +
 +If you need standard datasets for tests, examples, demos, ...
 +
 +  * [[https://​docs.xarray.dev/​en/​stable/​generated/​xarray.tutorial.load_dataset.html|Tutorial datasets]] from [[#​xarray|xarray]] (requires internet)
 +    * Example: [[https://​docs.xarray.dev/​en/​stable/​examples/​visualization_gallery.html|Using the 'air temperature'​ dataset]]
 +
 +  * [[https://​scikit-learn.org/​stable/​datasets.html|Toy,​ real-world and generated datasets]] from [[#​scikit-learn]]
 +    * Example: [[https://​lectures.scientific-python.org/​packages/​scikit-learn/​index.html#​a-simple-example-the-iris-dataset|using the '​iris'​ dataset]]
 +
 +  * [[https://​scikit-image.org/​docs/​stable/​api/​skimage.data.html|Test images and datasets]] from [[#​scikit-image]]
 +    * Example: [[https://​lectures.scientific-python.org/​packages/​scikit-image/​index.html#​data-types|Using the '​camera'​ dataset]]
 +
 +  * [[https://​esgf-node.ipsl.upmc.fr/​search/​cmip6-ipsl/​|CMIP6 data]] on ESGF
 +    * Example : ''​orog_fx_IPSL-CM6A-LR_piControl_r1i1p1f1_gr.nc'':​
 +      * [[http://​vesg.ipsl.upmc.fr/​thredds/​fileServer/​cmip6/​CMIP/​IPSL/​IPSL-CM6A-LR/​piControl/​r1i1p1f1/​fx/​orog/​gr/​v20200326/​orog_fx_IPSL-CM6A-LR_piControl_r1i1p1f1_gr.nc|HTTP]] download link
 +      * [[http://​vesg.ipsl.upmc.fr/​thredds/​dodsC/​cmip6/​CMIP/​IPSL/​IPSL-CM6A-LR/​piControl/​r1i1p1f1/​fx/​orog/​gr/​v20200326/​orog_fx_IPSL-CM6A-LR_piControl_r1i1p1f1_gr.nc.dods|OpenDAP]] download link
 +
 +  * [[https://​github.com/​xCDAT/​xcdat/​issues/​277|xCDAT test data GH discussion]]
 +
 +
 +==== Pandas ====
 +
 +Summary: //pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool//
 +
 +Where: [[http://​pandas.pydata.org|Pandas web site]]
 +
 +JYP's comment: pandas is supposed to be quite good for loading, processing and plotting time series, without writing custom code. It is **very convenient for processing tables in xlsx files** (or csv, etc...). You should at least have a quick look at:
 +
 +  * Some //Cheat Sheets//:
 +    - Basics: [[https://​github.com/​fralfaro/​DS-Cheat-Sheets/​blob/​main/​docs/​files/​pandas_cs.pdf|Pandas Basics Cheat Sheet]] (associated with the [[https://​www.datacamp.com/​cheat-sheet/​pandas-cheat-sheet-for-data-science-in-python#​python-for-data-science-cheat-sheet:​-pandas-basics-useth|Pandas basics]] //​datacamp//​ introduction page)
 +    - Intermediate:​ [[https://​github.com/​pandas-dev/​pandas/​blob/​main/​doc/​cheatsheet/​Pandas_Cheat_Sheet.pdf|Data Wrangling with pandas Cheat Sheet]]
 +  * Some tutorials:
 +    * [[http://​pandas.pydata.org/​docs/​user_guide/​10min.html|10 minutes to pandas]]
 +    * The [[https://​lectures.scientific-python.org/​packages/​statistics/​index.html|Statistics in Python]] tutorial that combines Pandas, [[#​statsmodels|statsmodels]] and [[http://​seaborn.pydata.org/​|Seaborn]]
 +    * More [[http://​pandas.pydata.org/​docs/​getting_started/​tutorials.html|Community tutorials]]...
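
As a quick illustration of the comment above about loading and processing time series without custom code, here is a minimal sketch. The CSV content and the column names (''date'', ''temp'') are made up for the example; a real file path could be passed to ''pd.read_csv()'' instead of the in-memory text.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file
csv_text = """date,temp
2024-01-01,1.0
2024-01-02,3.0
2024-02-01,5.0
"""

# Parse the 'date' column as dates and use it as the index
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

# Monthly means, labelled with the first day of each month ("MS" = month start)
monthly_mean = df["temp"].resample("MS").mean()

print(monthly_mean)
```

For xlsx tables, ''pd.read_excel()'' plays the same role as ''pd.read_csv()'', assuming an Excel reader such as //openpyxl// is installed.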
 +
 +
 +==== statsmodels ====
 +
 +[[https://​www.statsmodels.org/​|statsmodels]] is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests, and statistical data exploration.
 +
 +Note: check the example in the [[https://​lectures.scientific-python.org/​packages/​statistics/​index.html|Statistics in Python]] tutorial
 +
 +
 +==== scikit-learn ====
 +
 +[[http://scikit-learn.org/|scikit-learn]] is a Python library for machine learning, and is one of the most widely used tools for supervised and unsupervised machine learning. Scikit-learn provides an easy-to-use, consistent interface to a large collection of machine learning models, as well as tools for model evaluation and data preparation
 +
 +Note: check the example in [[https://​lectures.scientific-python.org/​packages/​scikit-learn/​index.html|scikit-learn:​ machine learning in Python]]
 +
 +
 +==== scikit-image ====
 +
 +[[https://​scikit-image.org/​|scikit-image]] is a collection of algorithms for image processing in Python
 +
 +Note: check the example in [[https://​lectures.scientific-python.org/​packages/​scikit-image/​index.html|scikit-image:​ image processing]]
 +
 +
 +==== YData Profiling ====
 +
 +[[https://​docs.profiling.ydata.ai/​|YData Profiling]]:​ a leading package for data profiling, that automates and standardizes the generation of detailed reports, complete with statistics and visualizations.
 +
 +
 +==== D-Tale ====
 +
 +[[https://​github.com/​man-group/​dtale|D-Tale]] brings you an easy way to view & analyze Pandas data structures. It integrates seamlessly with ipython notebooks & python/​ipython terminals.
 +
 +
 +==== Sweetviz ====
 +
 +[[https://github.com/fbdesignpro/sweetviz|Sweetviz]] is a pandas-based Python library that generates beautiful, high-density visualizations to kickstart EDA (Exploratory Data Analysis) with just two lines of code.
 +
 +
 +==== AutoViz ====
 +
 +[[https://​github.com/​AutoViML/​AutoViz|AutoViz]]:​ the One-Line Automatic Data Visualization Library. Automatically Visualize any dataset, any size with a single line of code
 +
  
 =====  Data file formats =====
  
-We list here some resources about non-NetCDF data formats that can be useful
+  * We list below some resources about **non-NetCDF data formats** that can be useful
 + 
 +  * Check the [[#​using_netcdf_files_with_python|Using NetCDF files with Python]] section otherwise
  
 ==== The shelve package ====
Line 320: Line 461:
   * [[https://github.com/LibraryOfCongress/bagger|Bagger]] (BagIt GUI)
   * [[https://github.com/LibraryOfCongress/bagit-python|bagit-python]]
-===== Pandas =====
 
-Summary: //pandas is a library providing high-performance, easy-to-use data structures and data analysis tools//
+==== Protocol Buffers ====
 
-Where: [[http://pandas.pydata.org|Pandas web site]]
+//Protocol Buffers are (Google's) language-neutral, platform-neutral extensible mechanisms for serializing structured data//
 
-JYP's comment: pandas is supposed to be quite good for loading, processing and plotting time series, without writing custom code. It is **very convenient for processing tables in xlsx files** (or csv, etc...). You should at least have a quick look at:
+  * https://protobuf.dev/
-
+  * [[https://protobuf.dev/getting-started/pythontutorial/|Protocol Buffer Basics: Python]]
-  * Some //Cheat Sheets// (in the following order):
+    ''mamba install protobuf''
-    - Basics: [[http://datacamp-community-prod.s3.amazonaws.com/dbed353d-2757-4617-8206-8767ab379ab3|Pandas basics]] (associated with the [[https://www.datacamp.com/community/blog/python-pandas-cheat-sheet|Pandas Cheat Sheet for Data Science in Python]] pandas introduction page)
-    - Intermediate: [[https://github.com/pandas-dev/pandas/tree/master/doc/cheatsheet|github Pandas doc page]]
-    - Advanced: the cheat sheet on the [[https://www.enthought.com/services/training/pandas-mastery-workshop/|Enthought workshops advertising page]]
-  * Some tutorials:
-    * [[https://www.datacamp.com/community/blog/python-pandas-cheat-sheet|Pandas Cheat Sheet for Data Science in Python]] pandas introduction page
-    * The [[http://www.scipy-lectures.org/packages/statistics/index.html|Statistics in Python]] tutorial that combines Pandas, [[http://statsmodels.sourceforge.net/|Statsmodels]] and [[http://seaborn.pydata.org/|Seaborn]]
-
-===== Scipy Lecture Notes =====
-
-Summary: //One document to learn numerics, science, and data with Python//
-
-Where: [[http://www.scipy-lectures.org/_downloads/ScipyLectures-simple.pdf|pdf]] - [[http://www.scipy-lectures.org/|html]]
-
-This is **a really nice and useful document** that is regularly updated and used for the [[https://www.euroscipy.org/|EuroScipy]] tutorials.
-
-This document will teach you even more things about python, numpy and matplotlib, debugging and optimizing scripts, and about using python for statistics, image processing, machine learning, washing dishes (this is just to check if you have read this page), etc...
-  * Example: the [[http://www.scipy-lectures.org/packages/statistics/index.html|Statistics in Python]] tutorial that combines [[other:python:jyp_steps#pandas|Pandas]], [[http://statsmodels.sourceforge.net/|Statsmodels]] and [[http://seaborn.pydata.org/|Seaborn]]
-
-===== statsmodels =====
-
-[[https://www.statsmodels.org/|statsmodels ]] is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests, and statistical data exploration.
-
-===== scikit-learn =====
-
-[[http://scikit-learn.org/|scikit-learn]] is an open source machine learning library that supports supervised and unsupervised learning. It also provides various tools for model fitting, data preprocessing, model selection and evaluation, and many other utilities.
-
-===== scikit-image =====
-
-[[https://scikit-image.org/|scikit-image]] is a collection of algorithms for image processing in Python
  
 ===== Quick Reference and cheat sheets =====
Line 465: Line 576:
  
 You can do a lot more with python! But if you have read at least a part of this page, you should be able to find and use the modules you need. Make sure you do not reinvent the wheel! Use existing packages when possible, and make sure to report bugs or errors in the documentation when you find some
 +
 +
 +===== Out-of-date stuff =====
 +
 +
 +==== CDAT-related resources ====
 +
 +Some links, in case they can't be found easily on the [[https://​cdat.llnl.gov|CDAT]] web site...
 +
 +  * [[https://​cdat.llnl.gov/​tutorials.html|Tutorials in ipython notebooks]]
 +  * [[http://​cdat-vcs.readthedocs.io/​en/​latest/​|VCS:​ Visualization Control System]]
 +    * [[https://​github.com/​CDAT/​vcs/​issues/​238|Colormaps in vcs examples]]
 +  * [[https://​github.com/​CDAT/​cdat-site/​blob/​master/​eztemplate.md|EzTemplate Documentation]]
 +
  
 /* standard page footer */
other/python/jyp_steps.1632320607.txt.gz · Last modified: 2021/09/22 14:23 by jypeter