Python Tricks - Common Data Structures in Python(1)

Dictionaries, Maps, and Hashtables

In Python, dictionaries (or “dicts” for short) are a central data structure. Dicts store an arbitrary number of objects, each identified by a unique dictionary key.

Dictionaries are also often called maps, hashmaps, lookup tables, or associative arrays. They allow for the efficient lookup, insertion, and deletion of any object associated with a given key.

What does this mean in practice? It turns out that phone books make a decent real-world analog for dictionary objects:

Phone books allow you to quickly retrieve the information (phone number) associated with a given key (a person’s name). So, instead of having to read a phone book front to back in order to find someone’s number, you can jump more or less directly to a name and look up the associated information.

Note: in a phone book you look someone up directly by name.

This analogy breaks down somewhat when it comes to how the information is organized in order to allow for fast lookups. But the fundamental performance characteristics hold: Dictionaries allow you to quickly find the information associated with a given key.
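
The phone book analogy can be sketched roughly as follows. The list-based `entries` and the helper `scan_for_number` are hypothetical names introduced only for this illustration; they contrast reading the book front to back with jumping straight to a key.

```python
# A phone book as a list of (name, number) pairs: finding a number
# means checking entries one by one until the name matches.
entries = [("bob", 7387), ("alice", 3719), ("jack", 7052)]


def scan_for_number(name):
    """Hypothetical helper: linear search through the list."""
    for entry_name, number in entries:
        if entry_name == name:
            return number
    raise KeyError(name)


# The same data as a dict: the key leads (more or less) directly
# to the value, without visiting the other entries.
phonebook = dict(entries)

print(scan_for_number("jack"))  # 7052
print(phonebook["jack"])        # 7052
```

Both calls return the same number; the difference is how much of the data has to be inspected to find it.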

In summary, dictionaries are one of the most frequently used and most important data structures in computer science.

So, how does Python handle dictionaries?

Let’s take a tour of the dictionary implementations available in core
Python and the Python standard library.

dict – Your Go-To Dictionary

Because of their importance, Python features a robust dictionary
implementation that’s built directly into the core language: the dict data type.

Python also provides some useful “syntactic sugar” for working with dictionaries in your programs. For example, the curly-braces dictionary expression syntax and dictionary comprehensions allow you to conveniently define new dictionary objects:

phonebook = {
  'bob': 7387,
  'alice': 3719,
  'jack': 7052,
}

squares = {x: x * x for x in range(6)}

>>> phonebook['alice']
3719

>>> squares
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

Note: the syntactic sugar here is the dictionary expression (literal) and the dictionary comprehension.
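
A few more variations on the same syntax, as a small sketch. The variable names `by_number`, `high_numbers`, and `same_phonebook` are made up for this example.

```python
phonebook = {"bob": 7387, "alice": 3719, "jack": 7052}

# Invert the mapping (the numbers are unique here, so nothing is lost).
by_number = {number: name for name, number in phonebook.items()}

# Keep only the entries that satisfy a condition.
high_numbers = {name: n for name, n in phonebook.items() if n > 7000}

# dict() also accepts keyword arguments (or an iterable of pairs).
same_phonebook = dict(bob=7387, alice=3719, jack=7052)

print(by_number[3719])              # alice
print(high_numbers)                 # {'bob': 7387, 'jack': 7052}
print(same_phonebook == phonebook)  # True
```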

There are some restrictions on which objects can be used as valid keys.

Python’s dictionaries are indexed by keys that can be of any hashable type: A hashable object has a hash value which never changes during its lifetime (see __hash__), and it can be compared to other objects (see __eq__). In addition, hashable objects which compare as equal must have the same hash value.
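
To make the `__hash__` / `__eq__` contract concrete, here is a minimal sketch of a user-defined key type. The `Point` class is hypothetical and not part of the original text; it assumes its coordinates are never changed after construction, so the hash stays stable.

```python
class Point:
    """Hypothetical example class used only to illustrate hashability."""

    def __init__(self, x, y):
        # Treated as read-only after construction so the hash never changes.
        self._x = x
        self._y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self._x, self._y) == (other._x, other._y)

    def __hash__(self):
        # Equal points must produce equal hashes, so hash the same
        # fields that __eq__ compares.
        return hash((self._x, self._y))


labels = {Point(0, 0): "origin"}

# A different but equal Point object finds the same entry.
print(labels[Point(0, 0)])  # origin
```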

Immutable types like strings and numbers are hashable and work well as dictionary keys. You can also use tuple objects as dictionary keys, as long as they contain only hashable types themselves.
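
A quick way to see which objects qualify as keys is to try hashing them. The `board` dictionary and the `is_hashable` helper below are assumptions made for this sketch.

```python
# Tuples of hashable values work fine as keys.
board = {(0, 0): "rook", (0, 1): "knight"}
print(board[(0, 1)])  # knight


def is_hashable(obj):
    """Hypothetical helper: True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable((1, "a")))    # True: tuple of hashable items
print(is_hashable([1, "a"]))    # False: lists are mutable
print(is_hashable((1, ["a"])))  # False: tuple containing a list
```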

For most use cases, Python’s built-in dictionary implementation will do everything you need. Dictionaries are highly optimized and underlie many parts of the language. For example, class attributes and a module’s global variables are both stored internally in dictionaries.
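
You can peek at some of these internal dictionaries yourself. The `Dog` class below is a made-up example; `vars()` and `globals()` are built-ins.

```python
class Dog:
    kind = "canine"  # class attribute

    def __init__(self, name):
        self.name = name  # instance attribute


rufus = Dog("Rufus")

# Instance attributes live in a plain dict.
print(vars(rufus))          # {'name': 'Rufus'}

# Class attributes are exposed through a read-only mapping.
print("kind" in vars(Dog))  # True

# Module-level names are stored in a dict as well.
print(isinstance(globals(), dict))  # True
```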

Python dictionaries are based on a well-tested and finely tuned hash
table implementation that provides the performance characteristics
you’d expect: O(1) time complexity for lookup, insert, update, and
delete operations in the average case.
Note: put simply, lookups, insertions, updates, and deletions all take constant time on average, so they are fast.
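
For reference, the four operations mentioned above look like this in code. The `inventory` dictionary is just an illustrative name.

```python
inventory = {"apples": 3}

inventory["pears"] = 5   # insert a new key
inventory["apples"] = 4  # update an existing key
print(inventory["pears"])  # lookup: 5

del inventory["pears"]   # delete a key
print("pears" in inventory)  # False

# get() looks a key up without raising KeyError when it is missing.
print(inventory.get("plums", 0))  # 0
```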

There’s little reason not to use the standard dict implementation included with Python. However, specialized third-party dictionary implementations exist, for example skip lists or B-tree based dictionaries.

Besides “plain” dict objects, Python’s standard library also includes a
number of specialized dictionary implementations. These specialized
dictionaries are all based on the built-in dictionary class (and share its performance characteristics), but add some convenience features
on top of that.

Let’s take a look at them.

collections.OrderedDict – Remember the Insertion Order of Keys

Python includes a specialized dict subclass that remembers the insertion
order of keys added to it: collections.OrderedDict.
Note: this special dictionary remembers the order in which its items were inserted.

Standard dict instances also preserve the insertion order of keys: in CPython 3.6 this was an implementation detail, and since Python 3.7 it is guaranteed by the language specification. OrderedDict still offers extras on top of that, such as move_to_end() and order-sensitive equality comparisons. So, if key order is important for your algorithm to work, you can still communicate this clearly by explicitly using the OrderedDict class.

By the way, OrderedDict is not a built-in part of the core language and must be imported from the collections module in the standard library.

>>> import collections
>>> d = collections.OrderedDict(one=1, two=2, three=3)

>>> d
OrderedDict([('one', 1), ('two', 2), ('three', 3)])

>>> d['four'] = 4
>>> d
OrderedDict([('one', 1), ('two', 2),
('three', 3), ('four', 4)])

>>> d.keys()
odict_keys(['one', 'two', 'three', 'four'])
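
Beyond remembering order, OrderedDict has a few order-aware features that plain dicts lack. The following is a small sketch of three of them; the variable names are arbitrary.

```python
from collections import OrderedDict

d = OrderedDict(one=1, two=2, three=3)

# Move an existing key to the end of the ordering.
d.move_to_end("one")
print(list(d))  # ['two', 'three', 'one']

# Pop from the front instead of the back.
first = d.popitem(last=False)
print(first)  # ('two', 2)

# Equality between two OrderedDicts takes order into account...
a = OrderedDict(x=1, y=2)
b = OrderedDict(y=2, x=1)
print(a == b)  # False

# ...whereas plain dicts with the same items compare equal.
print(dict(a) == dict(b))  # True
```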

collections.defaultdict – Return Default Values for Missing Keys

The defaultdict class is another dictionary subclass that accepts a callable in its constructor whose return value will be used if a requested key cannot be found.

This can save you some typing and make the programmer’s intentions clearer, as compared to using the get() method or catching a KeyError exception in regular dictionaries.

>>> from collections import defaultdict
>>> dd = defaultdict(list)

# Accessing a missing key creates it and
# initializes it using the default factory,
# i.e. list() in this example:
>>> dd['dogs'].append('Rufus')
>>> dd['dogs'].append('Kathrin')
>>> dd['dogs'].append('Mr Sniffles')
>>> dd['dogs']
['Rufus', 'Kathrin', 'Mr Sniffles']
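
For comparison, here is the same kind of bookkeeping done once with a regular dict and get(), and once with defaultdict. The word list and variable names are made up for this sketch.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# Regular dict: every update has to handle the missing-key case.
plain_counts = {}
for word in words:
    plain_counts[word[0]] = plain_counts.get(word[0], 0) + 1

# defaultdict(int): missing keys start at int(), i.e. 0.
counts = defaultdict(int)
for word in words:
    counts[word[0]] += 1

print(counts == plain_counts)  # True
print(counts["a"])             # 2

# Caveat: merely reading a missing key inserts it.
print("z" in counts)  # False
counts["z"]
print("z" in counts)  # True
```

The caveat at the end is worth keeping in mind: if you only want to test for a key, use `in` or get() rather than indexing.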

collections.ChainMap – Search Multiple Dictionaries as a Single Mapping

The collections.ChainMap data structure groups multiple dictionaries into a single mapping. Lookups search the underlying mappings one by one until a key is found. Insertions, updates, and deletions only affect the first mapping added to the chain.
Note: a ChainMap is a collection of dictionaries, but insertions, updates, and deletions only affect the first one.

>>> from collections import ChainMap
>>> dict1 = {'one': 1, 'two': 2}
>>> dict2 = {'three': 3, 'four': 4}
>>> chain = ChainMap(dict1, dict2)

>>> chain
ChainMap({'one': 1, 'two': 2}, {'three': 3, 'four': 4})
# ChainMap searches each collection in the chain
# from left to right until it finds the key (or fails):
>>> chain['three']
3
>>> chain['one']
1
>>> chain['missing']
KeyError: 'missing'
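
The write behavior described above can be seen in a short sketch. The `defaults` / `overrides` settings scenario is an assumption chosen for illustration.

```python
from collections import ChainMap

defaults = {"color": "blue", "debug": False}
overrides = {"debug": True}
settings = ChainMap(overrides, defaults)

# Lookups check overrides first, then fall back to defaults.
print(settings["debug"])  # True
print(settings["color"])  # blue

# Writes go to the first mapping only.
settings["color"] = "red"
print(overrides)  # {'debug': True, 'color': 'red'}
print(defaults)   # {'color': 'blue', 'debug': False}

# Deleting removes the key from the first mapping, which
# uncovers the value in the next one again.
del settings["color"]
print(settings["color"])  # blue
```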

types.MappingProxyType – A Wrapper for Making Read-Only Dictionaries

MappingProxyType is a wrapper around a standard dictionary that provides a read-only view into the wrapped dictionary’s data. This class was added in Python 3.3, and it can be used to create immutable proxy versions of dictionaries.
Note: in short, a read-only version of a dictionary.

For example, this can be helpful if you’d like to return a dictionary carrying internal state from a class or module, while discouraging write access to this object. Using MappingProxyType allows you to put these restrictions in place without first having to create a full copy of the dictionary.

>>> from types import MappingProxyType
>>> writable = {'one': 1, 'two': 2}
>>> read_only = MappingProxyType(writable)

# The proxy is read-only:
>>> read_only['one']
1
>>> read_only['one'] = 23
TypeError: 'mappingproxy' object does not support item assignment

# Updates to the original are reflected in the proxy:
>>> writable['one'] = 42
>>> read_only
mappingproxy({'one': 42, 'two': 2})

Note: the proxy itself is read-only, but changes made to the original dictionary it wraps still show through it.
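
The use case of exposing internal state from a class might look roughly like this. The `Registry` class and its attribute names are hypothetical and only meant as a sketch.

```python
from types import MappingProxyType


class Registry:
    """Hypothetical class that exposes its state as a read-only view."""

    def __init__(self):
        self._plugins = {}
        # The proxy is created once and always reflects the current
        # contents of self._plugins; no copy is made.
        self.plugins = MappingProxyType(self._plugins)

    def register(self, name, handler):
        self._plugins[name] = handler


registry = Registry()
registry.register("upper", str.upper)
print(registry.plugins["upper"]("abc"))  # ABC

# Callers cannot modify the registry through the view.
try:
    registry.plugins["lower"] = str.lower
    write_allowed = True
except TypeError:
    write_allowed = False
print(write_allowed)  # False
```

Note that this only discourages writes: anyone who reaches for `_plugins` directly can still change the underlying dictionary.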

Dictionaries in Python: Conclusion

All of the Python dictionary implementations listed in this chapter are valid implementations that are built into the Python standard library.

If you’re looking for a general recommendation on which mapping type to use in your programs, I’d point you to the built-in dict data type. It’s a versatile and optimized hash table implementation that’s built directly into the core language.

I would only recommend that you use one of the other data types listed here if you have special requirements that go beyond what’s provided by dict.

Yes, I still believe all of them are valid options, but usually your code will be clearer and easier for other developers to maintain if it relies on standard Python dictionaries most of the time.

Key Takeaways
  • Dictionaries are the central data structure in Python.
  • The built-in dict type will be “good enough” most of the time.
  • Specialized implementations, like read-only or ordered dicts, are available in the Python standard library.

Reposted from: https://www.jianshu.com/p/b3827ca473a8

    Original author: weixin_33813128
    Original URL: https://blog.csdn.net/weixin_33813128/article/details/91186809