This book provides insight into the Pythonic way of writing programs: the best way to use python. It builds on a fundamental understanding of the language that I assume you already have. Novice programmers will learn the best practices of Python’s capabilities. Experienced programmers will learn how to embrace the strangeness of a new tool with confidence
Chapter 1. Pythonic Thinking
PEP 8 is commonly referred to the style guide for how to format Python code. I personally like use the Ruff Formatter so I don’t have to worry about if I am properly formatting my code
You don’t need to explicitly test if a container data type length is 0 or > than 1 when using in an if statement because an empty container with result in False and a container with a length > 0 will result in True
bytes contains sequences of 8-bit values, and str contains sequences of Unicode code points.
Why go through the trouble of assigning the default value for the .get() to [''] instead of 0?
Ah I see, it appears the default return value for parse_qs if there is no value for the color is a list with an empty string. So that’s why we index into the list and if it’s '' which also means False we can use the or 0 to default it to 0
Make use of helper functions in Python. Here is an example of how to solve the above problem with a helper function
I starting from the beginning of this book again it’s been awhile since I put the the book down.
Chapter 1. Pythonic Thinking
isinstance function allows you to compare variable to a specific type and will return True or False
When working with bytes and strings make sure to convert them to the same type because they don’t work well with each other. For example you can’t add bytes to string and you can’t compare them
All things talk about bytes and strings must of been caused by the Python 2 to Python 3 migration because I rarely have to use bytes for anything.
Interesting I could use this parse_qs for my Harlequin ADBC parameter to help parse some of the optional kwargs that the user can enter
from urllib.parse import parse_qsmy_values = parse_qs("red=5&blue=0&green=", keep_blank_values=True)print(my_values)
{'red': ['5'], 'blue': ['0'], 'green': ['']}
I know I wrote about it the first time but I think the inplace swap with unpacking is so clean in python. Eliminates the need of a temporary variable. The bubble sort algorithm is a great example of when you would use something like this.
When iterating ove a list and you also want access to the index use the enumerate function
zip consumes the iterators it wraps one time at a time, which means it can be used with infinitely long inputs without risk of a program using too much memory and crashing
zips output is as long as the shortest input, you can use the itertools.zip_longest() instead if you want to go the longest input
Chapter 2. Lists and Dictionaries
The dict type is often called an associative array or a hash table
Python provides a stable sorting algorithm so when you first sort on something and then sort on another thing. If there is a tie in the second sort it will preserve the order from the first sort. You just need to execute the sorts in the opposite sequence of what you want the final list to contain.
I don’t understand how votes.get works here with out the use of a lambda function
votes = {"otter": 1281,"polar bear": 587,"fox": 863,}names =list(votes.keys())print(names)names.sort(key=votes.get, reverse=False)print(names)# I would have thought that you would have to do thisnames.sort(key=lambda x: votes.get(x,0), reverse=True)print(names)
It appears key automatically applies any function to all elements in the list so in a way I should think of key kind of like a mapping function where I can specify any function to be applied to all elements in the list that return a scalar value that I want to sort on
I am not familiar with this iter function
ranks = {}for i, name inenumerate(names, 1): ranks[name] = iprint(ranks)print(next(iter(ranks)))# I would have done something like this, so good to know about next(iter())print(list(ranks.keys())[0])
Okay I get it now, so essentially this is an easier way to iterate over the keys in a dict since we don’t want to iterate over all the items we can create and iter object and call next on it. IIRC, this is what for in does under the hood. My method would have been to turn rank.keys() into list and subscript first element. This is a a little more verbose and has the done sides have having to store the whole list of keys in memory whereas I think the iter(ranks) is lazily evaluated.
Not sure if I follow why the SortedDict Class is returning the wrong value when calling the get winner
from collections.abc import MutableMappingclass SortedDict(MutableMapping):def__init__(self):self.data = {}def__getitem__(self, key):returnself.data[key]def__setitem__(self, key, value):self.data[key] = valuedef__delitem__(self, key):delself.data[key]def__iter__(self): keys =list(self.data.keys()) keys.sort()for key in keys:yield keydef__len__(self):returnlen(self.data)def populate_ranks(votes, ranks): names =list(votes.keys()) names.sort(key=votes.get, reverse=True)for i, name inenumerate(names, 1): ranks[name] = idef get_winner(ranks):returnnext(iter(ranks)) regular_ranks = {}sorted_ranks = SortedDict()populate_ranks(votes, regular_ranks)populate_ranks(votes, sorted_ranks)print(regular_ranks)print(sorted_ranks.data)regular_winner = get_winner(regular_ranks)sorted_winner = get_winner(sorted_ranks)print(regular_winner)# returns the first alphabetic result because of we implemented __iter__print(sorted_winner)
I understand now, the winner results wasn’t being return properly because in this new SortedDict class we implemented the __iter__ to sort the keys alphabetically
the setdefault method is very handy and I always forget about it
votes = {"baguette": ["Bob", "Alice"],"ciabatta": ["Coco", "Deb"],}key ="brioche"who ="Elmer"names = votes.setdefault(key, [])names.append(who)print(votes)# another way to do it (preferred by author)key ="wheat"who ="Tyler"if (names := votes.get(key)) isNone:# never seen this triple assignment first votes[key] = names = []names.append(who)print(votes)
Interesting, after reading further on the Author other actually discourages the use of setdefault because it hurts readability
the default value passed to setdefault is assigned directly into the dictionary when the key is missing instead of being copied…This means that you need to make sure that you are always constructing a new default value for each key I access with setdefault. This leads to significant performance overhead in this example because I have to allocate a list instance for each call.
I am not sure I follow here. This is making more confused and now I am starting to wonder why I need setdefault when I could always use get and pass in a default value when the key doesn’t exists?
Also I don’t get the get method in the counter example only requires one access and one assignment while the `setdefault requires one access and two assignments? Does the default value always get assigned no matter what?
The author states is best to only use the setdefault when the default values are cheap to construct, mutable and there’s no potential for raising exceptions. In most cases it’s better to use the defaultdict over the setdefault
The __missing__ dunder method can be helpful when you can’t use setdefault because it always creates the default object regardless if the key is already present which can be problematic and you can’t use defaultdict because you can’t pass in a parameter into the function reference.
pictures = {}path ="profile.png"# setdefault method (doesn't work)try:# open function always gets called here even when path is# already present in the dictionary. This results in an# additional file handle that may conflict with existing# open handles in the same program handle = pictures.setdefault(path, open(path, "a+b"))exceptOSError:print(f"Failed to open path {path}")raiseelse: handle.seek(0) image_data = handle_read()# defaultdict methodfrom collections import defaultdictdef open_picture(profile_path):try:returnopen(profile_path, "a+b")exceptOSError:print(f"Failed to open path {path}")raise# will error out because open_picture func ref requires# argument profile_picturepictures = defaultdict(open_picture)handle = pictures[path]handle.seek(0)image_data = handle.read()# __missing__class Pictures(dict):def__missing__(self, key): value = open_picture(key)self[key] = valuereturn valuepictures = Pictures()handle = pictures[path]handle.seek(0)image_data = handle.read()
Chapter 3. Functions
I finally know what closures are, functions that refer to variables from the scope in why there were defined. In the below example this is why the helper function is able to access the group argument for sort_priority
def sort_priority(values, group):def helper(x):# here I am inside the helper function and yet I can# reference group without having to pass it in as an# argument to the helper functionif x in group:return (0, x)return (1, x) values.sort(key=helper)numbers = [8,3,1,2,5,4,7,6]group = {2,3,5,7}sort_priority(numbers, group)print(numbers)
[2, 3, 5, 7, 1, 4, 6, 8]
When someone says Functions are first-class objects in Python it means you can do the following
Refer to them directly
Assign them to variables
pass them as arguments to other functions
compare them in expressions and if statements
When comparing sequences in python if first compares the items as index zero and keeps moving on to each sequence if there are equal. This is why the (0, x) and (1, x) creates two distinct groups
One thing I not a big fan of is how this function modifies the original object. I always thought that was an anti-patter. Shouldn’t we instead prefer creating a copy instead and return the sorted list?
Being careful with scoping bugs, python has a specific order when it goes through when referencing a variable before you hit the NameError. Assignments work different in closures then referring to a variable directly. Use the nonlocal variable to assign data outside the closure.
When looking to extend a functions capabilities look at using a key word argument with a default to maintain backwards capability to existing callers.
Caution
A default argument value is evaluated once per module load, which usually happens when a program starts. This trips people up a lot so you have to be careful with the default values you provide kwargs. Instead use the default value of None and add a conditional if to check if is None then call assign the default value.
Especially important for default arguments are mutual e.g. a list or it’s a function e.g. datetime.now()
To define key-word only arguments you can force the caller to make the spell out the key word by adding a * to indicate the end of all the positional arguments and the beginning of the keyword arguments.
Very interesting, I have heard of key-word only arguments but this is the first time of hearing positional-only arguments where arguments can only be supplied by their position. The / indicates where the positional-only arguments ends.
Python decorators has the ability to run additional code before and after each call to a function it wraps.
Chapter 4. Comprehensions and Generators
I was not aware of generator expressions, which are like list comprehensions for generators
You have to be caution when using generators as they are stateful so if you iterator over an entire generator it’s done. It’s best way to prevent this if you want to create a function that requires iteratoring over a generator multiple times is to provide a new container class that implements the iterator protocol by implementing the __iter__ method as a generator.
I’ll be honest I having a hard time understanding some of this more advanced generator topics 😅
Chapter 5. Classes and Interfaces
I need understood why I would use a named tuple over a dict. I guess if you want to keep it immutable?
I am not sure if I am completely grokking what polymorphism is? My mental model as right now is that it enables a class to inherit a method from a parent class but change the functionality of that method. Unsure if that’s the right way to think about it.
mix-in is a class that defines only a small set of additional methods for its child classes to provide.
If you defining your own class that you want to behave similar to a container type in Python but you don’t want to inherit from that class you can use the collections.abc module which stands for Abstract Base Class. This will provided errors if you don’t implement all the necessary methods and attributes that the abc you are inheriting from expects. Creating your own abc is a common technique for when you want others to adhere to building a class with the same methods and attributes (e.g. Harlequin DatabaseAdapter uses and abc to make sure other adapter authors implement the adapter correctly)
Chapter 6. Metaclasses and Attributes
Descriptor protocol defines how attribute access is interpreted by the language.
Not many notes in this chapter simply because much of it went over my head. I am starting to feel a little detached from the book and having a hard time imagining how I can use some of these tips in my day to day work.
Chapter 7. Concurrency and Parallelism
Concurrency enables a compute to do many different things seemingly at the same time
Parallelism involves actually doing many different things at the same time
The key difference between concurrency and parallelism is speedup.
When two distinct paths of execution in a program make forward progress in parallel, the time it takes to do the total work is cut in half; the speed of execution is faster by a fact of two
Concurrent programs may run thousands of separate paths of execution seemingly in parallel but provide no speed for the total work
The standard implementation of python is called CPython. CPython runs a Python program into two steps. First, it parses and compiles the source text into bytecode, which is low-level representation of the program as 8-bit instructions…The, CPython runs the bytecode using a stack-based interpreter. The bytecode interpreter has state that must be maintained and coherent while the Python program executes. CPython enforces coherence with a mechanism called global interpreter lock (GIL).
Essentially, the GIL is a mutual-exclusion lock (mutex) that prevents CPython from being affected by preemptive multithreading where one takes control of a program by interrupting another thread. Such an interruption could corrupt interpreter state (e.g. garbage collection reference counts).
The GIL has an important negative side effect. With programs written in languages like C++ or Java, having multiple threads of execution means that a program could utilize multiple CPU cores at the same time. Although Python supports multiple threads of execution, the GIL causes only one of them to ever make forward progress at a time. This means that when you reach for threads to do parallel computation and speed up your Python programs, you will be sorely disappointed.
Why does python support threads at all if it has the GIL?
Makes it easy to for a program to seem like it’s doing multiple things at the same time.
To do deal with blocking I/O, which happens when Python does certain types of system calls. Examples include reading/writing files, interacting with networks, communicating with devices like displays etc.
Fan-out is the process of spawning a concurrent line of execution for each unit of work (generating new units of concurrency)
Fan-in is the process of waiting for all of those concurrent units of work to finish before moving on to the next phase in a coordinated process (waiting for existing units of concurrency to complete)
Chapter 8. Robustness and Performance
Because of how python represents floating point numbers as IEEE 754 that can lead to interesting results
When debugging or logging make sure to use the repr version of an object so you can tell the differences between types e.g. '5' or 5.
When using f strings you can use !r to get the repr version
Chapter 10. Collaboration
No notes
Review
I was shocked to learn about much there really is to Python. I considered myself to be an intermediate python programmer but after reading this book I realized I have plenty to learn. I can confidently recommend this book to anyone looking to improve their python knowledge. I envision myself rereading this book again in the future.