Python, NumPy and Pandas

 


When I decided to dive into Machine Learning, my first challenge was mastering Python, the language that powers much of it. With so many resources available, from MOOCs to books, I had to choose where to start. I've always believed that books are the best learning tools because they tend to be exhaustive. Pick a good one, and you’ll learn almost everything you need to get going! Initially, I thought MOOCs diluted their content to fit into a short format, but I've started to change my mind. More on that later!

So, I began by making a list of books that suited my learning needs and added them to my learning plan. These books helped me get my Python skills up to speed and learn Pandas and NumPy, two of the most widely used packages in ML. They are:

  • Head First Python by Paul Barry
  • Python Crash Course by Eric Matthes
  • Learning Python by Mark Lutz
  • Python for Data Analysis by Wes McKinney
  • Fluent Python by Luciano Ramalho

I’m still reading the last one and finding it delightful. In the next sections, I’ll highlight each book’s strengths and appeal. I’ll also try to assess what kind of readers each book is best suited for.

Head First Python


Head First Python is a book meant for readers without any programming experience. For some, it may bring back cringeworthy college memories (come on, I studied all this ages ago!). But this post is for readers with or without a coding background.

Without being intimidating, this book covers the basics of Python, from data structures like lists, tuples, sets, and dictionaries to connecting to and querying a MySQL database. It also introduces classes and object-oriented programming. Considering its target audience, that’s a tall order, and the book pulls it off deftly.

The book maintains a casual tone and avoids heavy jargon. It leverages various visual aids, interesting anecdotes, and learning tips to explain several concepts. While it doesn’t delve into deeper topics like lambda functions, advanced data structure capabilities, or how functions can return multiple items and be passed as arguments, for an introductory book, it doesn’t need to. Overall, it keeps its readers engaged and motivated.
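For readers curious about the topics mentioned above, here is a minimal sketch of a function returning multiple items, a function taking another function as an argument, and a lambda. The names `min_max` and `apply_twice` are made up for illustration and do not come from the book.

```python
def min_max(values):
    """Return two results at once; Python packs them into a tuple."""
    return min(values), max(values)


def apply_twice(func, value):
    """Accept a function as an argument and call it twice on the value."""
    return func(func(value))


# Tuple unpacking splits the returned pair into two names.
lowest, highest = min_max([4, 1, 9])

# A lambda is a small anonymous function, passed here as an argument.
doubled_twice = apply_twice(lambda x: x * 2, 5)

print(lowest, highest)   # 1 9
print(doubled_twice)     # 20
```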

Where the book comes up short is the lack of supplementary resources, such as a code repository for readers to experiment with, cheat sheets, glossaries, or further reading suggestions.

I recommend this book to anyone who hasn’t coded before and wants to grasp the language. It provides a good starting point and helps build comfort with Python.

Python Crash Course

  

Python Crash Course also claims to be for people of any age who have never programmed in Python, or at all. Even so, I feel it does a better job than Head First Python in terms of sheer topic coverage.

Contrary to its name, the book doesn't feel rushed and covers a wide range of functionalities that Python’s basic data structures offer. For a beginner-level book, that is impressive, providing significant value to its target readers.

The book’s content is well-structured, offering a clear learning path from basic concepts to fairly advanced ones. It has entire chapters dedicated to if statements and dictionaries. It also introduces inheritance and its application through classes with practical examples and clear code snippets, allowing readers to follow along and refine their understanding through hands-on coding.
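To give a flavor of what inheritance through classes looks like, here is a minimal sketch. The `Vehicle` and `ElectricVehicle` classes are hypothetical names chosen for illustration, not examples taken from the book.

```python
class Vehicle:
    """Base class holding attributes shared by all vehicles."""

    def __init__(self, make, model):
        self.make = make
        self.model = model

    def describe(self):
        return f"{self.make} {self.model}"


class ElectricVehicle(Vehicle):
    """Child class: inherits from Vehicle and adds a battery."""

    def __init__(self, make, model, battery_kwh):
        # Let the parent class set up the shared attributes.
        super().__init__(make, model)
        self.battery_kwh = battery_kwh

    def describe(self):
        # Override the parent method, reusing its result.
        return f"{super().describe()} ({self.battery_kwh} kWh battery)"


ev = ElectricVehicle("Acme", "Spark", 60)
print(ev.describe())  # Acme Spark (60 kWh battery)
```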

However, the book doesn't rely much on visual aids and illustrations beyond code snippets, which might be a downside. It also lacks supplementary resources and reading suggestions. Moreover, because it is printed entirely in black and white, it can sometimes feel disengaging and a drag to read. Nevertheless, these points are minor for serious learners, considering the volume of content packed into a limited number of pages.

Overall, I recommend this book to anyone, regardless of their background, who is looking to learn more than just the basic syntax. It not only builds familiarity with the language but also provides a deeper understanding of Python’s capabilities.


Learning Python 


Alright, now we come to Learning Python. This book is one of the most extensive and exhaustive resources available. This massive work can be compared to The Complete Reference series of books!

The book comprises eight major sections, each with 5-7 chapters. It explains Python’s strengths, execution, and how to launch code in just three chapters in section one. Entire chapters are dedicated to concepts like dynamic typing, string fundamentals, and loops in subsequent sections. As you progress, the sections cover deeper topics like managed attributes and decorators. I regard this book highly for its coverage and content structure.
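For anyone wondering what decorators and managed attributes are, here is a minimal sketch of both. The names `count_calls`, `greet`, and `Temperature` are my own illustrations and are not taken from the book.

```python
import functools


def count_calls(func):
    """Decorator: wrap a function and count how often it is called."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def greet(name):
    return f"Hello, {name}"


class Temperature:
    """Managed attribute: 'celsius' is validated on every assignment."""

    def __init__(self, celsius):
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value


greet("Ada")
greet("Guido")
print(greet.calls)  # 2

reading = Temperature(21.5)
reading.celsius = 30.0  # validated by the setter
print(reading.celsius)  # 30.0
```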

However, this book can often be disengaging because it covers every minute capability of Python. It lacks useful tips and relies purely on code snippets for illustration. Thus, it feels more like a reference guide than something to read end-to-end. Struggling to understand numeric data types or comprehensions? Just look it up in Mark Lutz's book.

I’d say this book should be used as a supplementary read while learning the language from another primary source to further refine your understanding. It may very well act as your own Python Bhagavad Gita and won’t disappoint.


Python for Data Analysis

 

We finally come to my favorite book for learning Python, NumPy, and Pandas: Python for Data Analysis! This is my go-to recommendation for anyone who wants to get started with these tools as quickly as possible. True to its name, it focuses on Python for data analysis, making for an engaging and non-intimidating read.

Chapters 2 and 3 act as a primer for Python, and the book quickly dives into NumPy basics and Pandas. It gives ample attention to all the important functionalities of these packages. Topics like data loading, data cleaning, data wrangling, and visualization each get their own chapters. Later chapters cover time-series, advanced Pandas, advanced NumPy, and popular modeling libraries in Python like scikit-learn and statsmodels. Finally, it offers a chapter focused on practical data analysis using real-world datasets.
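To give a sense of the kind of work these packages make easy, here is a minimal sketch of a vectorized NumPy calculation followed by basic Pandas cleaning and aggregation. The data is invented for illustration and is not one of the book's datasets; the sketch assumes NumPy and Pandas are installed.

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic applies to the whole array at once, no loop needed.
temps_c = np.array([21.5, 23.0, 19.5, 25.0])
temps_f = temps_c * 9 / 5 + 32

# Pandas: a small table with one missing reading (made-up data).
df = pd.DataFrame(
    {
        "city": ["Pune", "Delhi", "Pune", "Delhi", "Pune"],
        "temp_c": [21.5, 23.0, np.nan, 25.0, 19.5],
    }
)

# Data cleaning: drop rows that have a missing value.
clean = df.dropna()

# Data wrangling: average temperature per city.
mean_by_city = clean.groupby("city")["temp_c"].mean()

print(temps_f)
print(mean_by_city)
```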

The book is very engaging, using visual aids, code snippets, and tips to keep readers’ attention. It relies heavily on a hands-on learning approach, minimizing verbosity and illustrating the packages' capabilities through actual code and outputs using real-world datasets.

Moreover, the datasets used in each chapter are available in a GitHub repository, which also contains notebooks for each chapter with the code snippets. This makes the learning experience smoother, allowing readers to execute the code and examine the output while learning.

With focused, valuable content presented lucidly in 14 concise chapters, this book ensures every page is packed with information without distracting readers. Total value for money!


Fluent Python

 

I recently discovered this gem of a book through my colleagues: Fluent Python. The author says it is written for practicing Python programmers who want to become proficient in Python 3, so it is not for beginners. The book aims to help readers fully utilize Python 3's capabilities. For example, the first chapter teaches how to make a custom class mimic the behavior of built-in Python sequences by implementing special methods like __len__ and __getitem__. This powerful feature opens up many possibilities for developers.
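As a rough illustration of that idea, here is a minimal sketch of a class that behaves like a sequence by implementing only those two special methods. The `Playlist` class is a hypothetical example of my own, not the one used in the book.

```python
class Playlist:
    """A custom class that acts like a built-in sequence."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        # Called by len(playlist).
        return len(self._tracks)

    def __getitem__(self, position):
        # Called by playlist[i] and playlist[i:j]; delegating to the
        # underlying list gives indexing and slicing for free.
        return self._tracks[position]


playlist = Playlist(["Intro", "Verse", "Chorus", "Outro"])

print(len(playlist))         # 4
print(playlist[0])           # Intro
print(playlist[-1])          # Outro
print(playlist[1:3])         # ['Verse', 'Chorus']
print("Chorus" in playlist)  # True

# Iteration also works, because Python falls back to __getitem__.
for track in playlist:
    print(track)
```

With just these two methods, the class supports `len()`, indexing, slicing, the `in` operator, and `for` loops, without inheriting from `list`.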

I must admit, I have only recently begun reading and grasping the lessons in this book. As I write this post, I am still only in chapter 2. Therefore, it would be unfair to make definitive assessments about the book's utility. However, my initial impression is promising. I plan to write more once I have finished it and applied its teachings in my day-to-day work.


Conclusion

So, armed with these resources, I was able to bring my Python, NumPy, and Pandas skills up to date and begin my ML journey in earnest. I hope that some or all of the books mentioned above become valuable resources in your own Python learning journey. In my subsequent blog posts, I will cover more valuable resources in the domains of ML and Linear Algebra.


 






 


