other:python:jyp_steps
Differences
This shows you the differences between two versions of the page.
other:python:jyp_steps [2016/01/19 17:51] – jypeter | other:python:jyp_steps [2017/03/08 10:48] – Rewrote the netcdf part to add a link to the tuto for new people jypeter | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== | + | ====== JYP' |
+ | |||
+ | <note tip>If you don't know which python distribution to use and how to start the python interpreter, | ||
As can be expected, there is **a lot** of online python documentation available, and it's easy to get lost. You can always use google to find an answer to your problem, and you will probably end up looking at lots of answers on [[http:// | As can be expected, there is **a lot** of online python documentation available, and it's easy to get lost. You can always use google to find an answer to your problem, and you will probably end up looking at lots of answers on [[http:// | ||
- | This page tries to list some //python for the scientist// related resources, in a suggested reading order. | + | This page tries to list some //python for the scientist// related resources, in a suggested reading order. |
+ | |||
+ | ===== JYP's introduction to python ===== | ||
+ | |||
+ | ==== Part 1 ==== | ||
+ | |||
+ | You can start using python by reading the {{: | ||
+ | * this tutorial is in French (my apologies for the lack of translation, | ||
+ | * If you have too much trouble understanding this French Tutorial, you can read the first 6 chapters of the **Tutorial** in [[# | ||
+ | * it's an introduction to python (and programming) for the climate scientist: after reading this tutorial, you should be able to do most of the things you usually do in a shell script | ||
+ | * python types, tests, loops, reading a text file | ||
+ | * the tutorial is very detailed about string handling, because strings offer an easy way to practice working with indices (indexing and slicing), before indexing numpy arrays. And our usual pre/ | ||
+ | * after reading this tutorial, you should practice with the following: | ||
+ | * [[https:// | ||
+ | * {{: | ||
+ | * {{: | ||
+ | |||
+ | ==== Part 2 ==== | ||
+ | |||
+ | Once you have done your first steps, you should read [[http:// | ||
+ | * this tutorial is in French (sorry again) | ||
+ | * after reading this tutorial, you will be able to do more than you can do in a shell script, in an easier way | ||
+ | * advanced string formatting | ||
+ | * creating functions and using modules | ||
+ | * working with file paths and handling files without calling external Linux programs\\ (e.g. using '' | ||
+ | * using command-line options for scripts, or using configuration files | ||
+ | * calling external programs | ||
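The topics listed for Part 2 can be sketched with the standard library alone. This is a minimal sketch, not taken from the tutorial itself: the function names (`parse_args`, `copy_into`), the file name and the `--outdir` option are hypothetical and only serve the illustration.

```python
import argparse
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def parse_args(argv):
    """Parse the (hypothetical) command-line options of a small script."""
    parser = argparse.ArgumentParser(description="Copy a file into an output directory")
    parser.add_argument("source", help="file to copy")
    parser.add_argument("--outdir", default="out", help="destination directory")
    return parser.parse_args(argv)


def copy_into(source, outdir):
    """Copy source into outdir without calling external programs such as cp or mkdir."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    target = outdir / Path(source).name
    shutil.copy(source, target)
    return target


# Work in a temporary directory so that nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "data_1850.txt"
    src.write_text("1850 13.7\n")
    args = parse_args([str(src), "--outdir", str(Path(tmp) / "results")])
    copied = copy_into(args.source, args.outdir)
    copied_exists = copied.exists()
    # Advanced string formatting: right-aligned name, fixed-width integer
    summary = "{name:>15s} : {size:4d} bytes".format(name=copied.name,
                                                     size=copied.stat().st_size)

# Calling an external program (here, the python interpreter itself)
completed = subprocess.run([sys.executable, "-c", "print('hello')"],
                           capture_output=True, text=True, check=True)
external_output = completed.stdout.strip()
```

In a real script, `parse_args(sys.argv[1:])` would read the options typed on the command line instead of the hard-coded list used here.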
+ | |||
+ | ===== The official python documentation ===== | ||
+ | |||
+ | You do not need to read all the python documentation at this stage, but it is really well made and you should at least have a look at it. The **Tutorial** is very good, and you should have a look at the table of contents of the **Python Standard Library**. There is a lot in the default library that can make your life easier | ||
+ | |||
+ | ==== Python 2.7 ==== | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | ==== Python 3 ==== | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | |||
+ | ===== Numpy and Scipy ===== | ||
+ | |||
+ | Summary: Python provides //ordered// objects (e.g. lists, strings, basic arrays, ...) and some math operators, but you can't do real heavy computation with these. **Numpy** makes it possible to work with multi-dimensional data arrays, and using array syntax and masks (instead of explicit nested loops and tests) and the appropriate numpy functions will allow you to get performance similar to what you would get with a compiled program! **Scipy** adds more scientific functions | ||
+ | |||
+ | Where: [[http:// | ||
+ | |||
+ | ==== Getting started ==== | ||
+ | |||
+ | - always remember that indices start at '' | ||
+ | - if you are a Matlab user (but the references are interesting for others as well), you can read the following: | ||
+ | - [[https:// | ||
+ | - [[http:// | ||
+ | - read the [[https:// | ||
+ | - have a quick look at the full documentation to know where things are | ||
+ | - Numpy User Guide | ||
+ | - Numpy Reference Guide | ||
+ | - Scipy Reference Guide | ||
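A minimal sketch of what "array syntax and masks instead of explicit nested loops and tests" can look like. The temperature field, its shape and the 290 K threshold are made-up values for illustration only.

```python
import numpy as np

# Hypothetical temperature field with dimensions (time, lat, lon), in Kelvin
rng = np.random.default_rng(0)
temp = 273.15 + 30.0 * rng.random((4, 3, 5))

# Python indices start at 0, and negative indices count from the end
first_step = temp[0]          # first time step, shape (lat, lon)
last_lat = temp[:, -1, :]     # last latitude row, shape (time, lon)

threshold = 290.0

# Loop version: explicit nested loops and tests (slow on big arrays)
count_loop = 0
for t in range(temp.shape[0]):
    for j in range(temp.shape[1]):
        for i in range(temp.shape[2]):
            if temp[t, j, i] > threshold:
                count_loop += 1

# Array version: a boolean mask replaces the loops and the test
warm = temp > threshold
count_mask = int(warm.sum())

# Masks can also be used to build a new array without any loop
capped = np.where(warm, threshold, temp)
```

Both versions count the same points, but the array version leaves the looping to compiled numpy code.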
+ | |||
+ | ==== Beware of the array view side effects ==== | ||
+ | |||
+ | <note warning> | ||
+ | |||
+ | That is not a problem when you only read the values, but **if you change the values of the //View//, you change the values of the first array** (and vice-versa)! If that is not what you want, do not forget to **make a copy** of the data before working on it! | ||
+ | |||
+ | //Views// are a good thing most of the time, so only make a copy of your data when needed, because otherwise copying a big array will just be a waste of CPU and computer memory. Anyway, it is always better to understand what you are doing... :-P | ||
+ | |||
+ | Check the example below and the [[https:// | ||
+ | |||
+ | <code python> | ||
+ | >>> | ||
+ | >>> | ||
+ | >>> | ||
+ | array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], | ||
+ | [10, 11, 12, 13, 14, 15, 16, 17, 18, 19], | ||
+ | [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]]) | ||
+ | |||
+ | >>> | ||
+ | >>> | ||
+ | array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]) | ||
+ | |||
+ | >>> | ||
+ | >>> | ||
+ | array([10, 11, 12, 0, 0, 0, 0, 17, 18, 19]) | ||
+ | |||
+ | >>> | ||
+ | array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], | ||
+ | [10, 11, 12, 0, 0, 0, 0, 17, 18, 19], | ||
+ | [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]]) | ||
+ | |||
+ | >>> | ||
+ | >>> | ||
+ | array([[ 0, 1, -1, -1, 4, 5, 6, 7, 8, 9], | ||
+ | [10, 11, -1, -1, 0, 0, 0, 17, 18, 19], | ||
+ | [20, 21, -1, -1, 24, 25, 26, 27, 28, 29]]) | ||
+ | |||
+ | >>> | ||
+ | array([10, 11, -1, -1, 0, 0, 0, 17, 18, 19]) | ||
+ | |||
+ | >>> | ||
+ | >>> | ||
+ | array([10, 11, -1, -1, 0, 0, 0, 17, 18, 19]) | ||
+ | |||
+ | >>> | ||
+ | >>> | ||
+ | array([9, 9, 9, 9, 9, 9, 9, 9, 9, 9]) | ||
+ | |||
+ | >>> | ||
+ | array([10, 11, -1, -1, 0, 0, 0, 17, 18, 19]) | ||
+ | |||
+ | >>> | ||
+ | array([[ 0, 1, -1, -1, 4, 5, 6, 7, 8, 9], | ||
+ | [10, 11, -1, -1, 0, 0, 0, 17, 18, 19], | ||
+ | [20, 21, -1, -1, 24, 25, 26, 27, 28, 29]]) | ||
+ | </ | ||
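A minimal sketch of the view-versus-copy behaviour described in this section. The variable names (`a`, `b`, `c`) and the exact statements are assumptions, chosen so that the arrays match the ones printed in the example.

```python
import numpy as np

a = np.arange(30).reshape((3, 10))

# Basic slicing returns a view: b shares its data with the second row of a
b = a[1, :]
b_is_view = np.shares_memory(a, b)

# Changing the view changes the original array...
b[3:7] = 0
row_after_view_change = a[1, :].tolist()

# ...and changing the original array changes the view
a[:, 2:4] = -1
view_after_array_change = b.tolist()

# An explicit copy is independent of the original array
c = a[1, :].copy()
c_is_view = np.shares_memory(a, c)
c[:] = 9
row_after_copy_change = a[1, :].tolist()
```

`np.shares_memory` is a convenient way to check whether two arrays are linked before modifying one of them.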
+ | |||
+ | ===== cdms2 and netCDF4 ===== | ||
+ | |||
+ | There is a good chance that your input array data will come from a file in the [[other: | ||
+ | |||
+ | Depending on which [[other: | ||
+ | |||
+ | ==== cdms2 ==== | ||
+ | |||
+ | Summary: cdms2 can read/write netCDF files (and read //grads// dat+ctl files) and provides a higher level interface than netCDF4. Unfortunately, | ||
+ | |||
+ | How to get started: | ||
+ | - read [[http:// | ||
+ | - the tutorial is in French (soooorry!) | ||
+ | - you have to replace //cdms// with **cdms2**, and //MV// with **MV2** (sooorry about that, the tutorial was written when CDAT was based on //Numeric// instead of //numpy// to handle array data) | ||
+ | - read the [[http:// | ||
+ | - ask questions and get answers on the [[http:// | ||
+ | |||
+ | |||
+ | ==== netCDF4 ==== | ||
+ | |||
+ | Summary: netCDF4 can read/write netCDF files and is available in most python distributions | ||
+ | |||
+ | Where: [[http:// | ||
+ | |||
+ | |||
+ | ===== Matplotlib ===== | ||
+ | |||
+ | Summary: there are lots of python libraries that you can use for plotting, but Matplotlib has become a //de facto// standard | ||
+ | |||
+ | Where: [[http:// | ||
+ | |||
+ | The documentation is good, but not always easy to use. A good way to start with matplotlib is to: | ||
+ | - Look at the [[http:// | ||
+ | - Use the free hints provided by JY! | ||
+ | - a Matplotlib //Figure// is a graphical window in which you make your plots... | ||
+ | - a Matplotlib //Axes// is a plot inside a Figure... [[http:// | ||
+ | - some examples are more // | ||
+ | - sometimes the results of the python/ | ||
+ | - the documentation may mention [[http:// | ||
+ | - Read the [[http:// | ||
+ | - Download the [[http:// | ||
+ | |||
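A minimal sketch of the Figure and Axes objects mentioned in the hints above, assuming matplotlib and numpy are installed. The `Agg` backend is used so that the script also runs without a display. The data and labels are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no graphical window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 100)

# One Figure containing two Axes (two plots), stacked vertically
fig, (ax_top, ax_bottom) = plt.subplots(nrows=2, ncols=1, figsize=(6, 6))

ax_top.plot(x, np.sin(x), label="sin(x)")
ax_top.set_title("First plot")
ax_top.legend()

ax_bottom.plot(x, np.cos(x), color="red")
ax_bottom.set_xlabel("x (radians)")

fig.tight_layout()
n_axes = len(fig.axes)

# Uncomment to write the figure to a file (hypothetical file name)
# fig.savefig("two_axes.png")
```

Calling methods on the Axes objects (`ax_top.plot`, `ax_bottom.set_xlabel`) rather than on `plt` makes it explicit which plot is being modified.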
+ | ===== Basemap ===== | ||
+ | |||
+ | <note warning> | ||
+ | </ | ||
+ | |||
+ | Summary: Basemap is an extension of Matplotlib that you can use for plotting maps, using different projections | ||
+ | |||
+ | Where: [[http:// | ||
+ | |||
+ | How to use basemap? | ||
+ | - look at the [[http:// | ||
+ | - check the [[http:// | ||
+ | - read some documentation! | ||
+ | - the **really nice** [[http:// | ||
+ | - look at the [[http:// | ||
+ | |||
+ | ===== Cartopy ===== | ||
+ | |||
+ | Summary: //Cartopy makes use of the powerful PROJ.4, numpy and shapely libraries and has a simple and intuitive drawing interface to matplotlib for creating publication quality maps// | ||
+ | |||
+ | Where: [[http:// | ||
+ | |||
+ | ===== Pandas ===== | ||
+ | |||
+ | Summary: //pandas is a library providing high-performance, | ||
+ | |||
+ | Where: [[http:// | ||
+ | |||
+ | JYP's comment: pandas is supposed to be quite good for loading, processing and plotting time series, without writing custom code. You should at least have a quick look at: | ||
+ | * The [[http:// | ||
+ | * the cheat sheet on the [[https:// | ||
+ | * the cheat sheet on the [[https:// | ||
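A minimal sketch of handling a time series with pandas, assuming pandas and numpy are installed. The monthly temperature values are synthetic and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Two years of synthetic monthly temperatures, indexed by date
dates = pd.date_range("2000-01-01", periods=24, freq="MS")
values = 10.0 + 8.0 * np.sin(2.0 * np.pi * (np.arange(24) - 3) / 12.0)
temps = pd.Series(values, index=dates, name="temperature")

# Annual means, grouping on the year of the date index
annual_mean = temps.groupby(temps.index.year).mean()

# 3-month centered running mean (the first and last values are undefined)
smoothed = temps.rolling(window=3, center=True).mean()

# Selecting the summer months with a boolean mask on the index
summer = temps[temps.index.month.isin([6, 7, 8])]
```

`temps.plot()` would then draw the series with matplotlib, using the dates for the horizontal axis.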
===== Scipy Lecture Notes ===== | ===== Scipy Lecture Notes ===== | ||
Line 11: | Line 202: | ||
Where: [[http:// | Where: [[http:// | ||
- | This is a really nice document that is regularly updated and used for the [[https:// | + | This is **a really nice and useful |
===== Quick Reference ===== | ===== Quick Reference ===== | ||
- | * The nice Python 2.7 Quick Reference: [[http:// | + | * The nice and convenient |
===== Some good coding tips ===== | ===== Some good coding tips ===== | ||
Line 22: | Line 213: | ||
* [[http:// | * [[http:// | ||
+ | |||
+ | ===== Debugging your code ===== | ||
+ | |||
+ | There is only so much you can do by staring at your code in your favorite text editor, and adding '' | ||
+ | |||
+ | ==== Debugging in text mode ==== | ||
+ | |||
+ | - Start the script with: '' | ||
+ | - Type '' | ||
+ | - Type '' | ||
+ | - Use '' | ||
+ | - Type '' | ||
+ | - Use '' | ||
+ | - Type '' | ||
+ | - Use '' | ||
+ | * '' | ||
+ | * '' | ||
+ | - Check the [[https:// | ||
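A minimal sketch of using the standard `pdb` module from a script. The functions `anomaly` and `debug_anomaly` and the script name in the comments are hypothetical. The debugger is not started when this block runs, because it would wait for keyboard input.

```python
import pdb


def anomaly(values):
    """Return each value minus the mean of all the values."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def debug_anomaly(values):
    """Run anomaly() under the debugger (interactive: waits for pdb commands)."""
    return pdb.runcall(anomaly, values)


result = anomaly([1.0, 2.0, 3.0])

# To debug interactively from a terminal, either call
#     debug_anomaly([1.0, 2.0, 3.0])
# or run a whole script under the debugger with
#     python -m pdb myscript.py
#
# Some pdb commands:
#     l        list the source code around the current line
#     n        execute the next line
#     s        step into a function call
#     b 8      set a breakpoint at line 8
#     c        continue until the next breakpoint
#     p mean   print the value of an expression (here the variable mean)
#     q        quit the debugger
```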
+ | |||
+ | ==== Using pydebug ==== | ||
+ | |||
+ | Depending on the distribution, | ||
===== Improving the performance of your code ===== | ===== Improving the performance of your code ===== | ||
Line 29: | Line 242: | ||
* **make sure that your script is not using too much memory** (the amount depends on the computer you are using)! Your script should be scalable (i.e. it keeps on working even when your data gets bigger), so it's a good idea to load only the data you need in memory (e.g. not all the time steps), and learn how to load chunks of data | * **make sure that your script is not using too much memory** (the amount depends on the computer you are using)! Your script should be scalable (i.e. it keeps on working even when your data gets bigger), so it's a good idea to load only the data you need in memory (e.g. not all the time steps), and learn how to load chunks of data | ||
- | * **make sure that you are using array/ | + | * **make sure that you are using array/ |
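A minimal sketch of processing data chunk by chunk, so that only a few time steps are in memory at any time. The function `load_time_steps` is a hypothetical stand-in for reading a slice of a variable from a file. The sizes are made up.

```python
import numpy as np


def load_time_steps(start, stop, nlat=4, nlon=8):
    """Hypothetical loader: return time steps start..stop-1 as a (time, lat, lon) array.

    Each time step is filled with its own index, which makes the result easy to check.
    """
    steps = np.arange(start, stop, dtype=float)
    return steps[:, None, None] * np.ones((1, nlat, nlon))


def chunked_time_mean(n_steps, chunk_size):
    """Compute the time mean without loading all the time steps at once."""
    total = None
    for start in range(0, n_steps, chunk_size):
        stop = min(start + chunk_size, n_steps)
        chunk = load_time_steps(start, stop)
        partial = chunk.sum(axis=0)  # array operation, no loop over lat/lon
        total = partial if total is None else total + partial
    return total / n_steps


mean_field = chunked_time_mean(n_steps=100, chunk_size=16)
```

The chunk size is a trade-off: bigger chunks mean fewer read operations, smaller chunks mean less memory.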
If your script is still not fast enough, there is a lot you can do to improve it, without resorting to parallelization (that may introduce extra bugs rather than extra performance). See the sections below | If your script is still not fast enough, there is a lot you can do to improve it, without resorting to parallelization (that may introduce extra bugs rather than extra performance). See the sections below | ||
+ | |||
+ | Hint: before optimizing your script, you should spend some time // | ||
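One way to find out where a script spends its time, before trying to optimize it, is the standard library `cProfile` module. This is a minimal sketch. The function `slow_sum_of_squares` is a made-up example of code to measure.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop, used as the code to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Collect timing information only around the code of interest
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum_of_squares(200000)
profiler.disable()

# Format the statistics, sorted by cumulative time, into a string
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

A whole script can also be profiled without modifying it, with `python -m cProfile -s cumulative myscript.py` (hypothetical script name).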
+ | |||
+ | ==== Useful packages ==== | ||
+ | |||
+ | * [[https:// | ||
+ | * [[http:// | ||
==== Tutorials by Ian Ozsvald ==== | ==== Tutorials by Ian Ozsvald ==== | ||
Line 41: | Line 261: | ||
The official [[https:// | The official [[https:// | ||
+ | |||
+ | ===== What now? ===== | ||
+ | |||
+ | You can do a lot more with python! But if you have read at least a part of this page, you should be able to find and use the modules you need. Make sure you do not reinvent the wheel! Use existing packages when possible, and make sure to report bugs or errors in the documentation when you find them | ||
/* standard page footer */ | /* standard page footer */ |
other/python/jyp_steps.txt · Last modified: 2025/02/26 11:40 by jypeter