Data Analysis with Open Source Tools is an excellent book for experienced data analysts. The author is obviously enthusiastic about his topic and presents the information in a lucid, readable format. The book is not, however, a cookbook or a step-by-step guide to using a tool or suite for delving into data, and it would not be suitable for a neophyte.
Within the first two chapters I had jotted down a number of ideas to apply to my own work (I am a DBA who primarily applies analyses as a data quality tool), and I continued this pattern throughout the book. However, if you are math-phobic, as I am, you may quickly find yourself skimming portions that delve into formulas and mathematical concepts. The author does indicate in his introduction that none of the math involved is particularly difficult; it’s just that if you don’t view the world in mathematical terms, you’ll have some work to do in order to fully grasp the concepts presented. References are provided in most chapters for further reading.
“Workshops” using open source products are provided in the majority of chapters; these primarily use various Python libraries but also use other tools where appropriate (e.g., R and gnuplot). To get the most value from the text, you should have a working Python environment and be able to install Python libraries within it.