A Brief Analysis of Django Middleware

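Tutorials on Django often say that when Django handles a request, the middleware runs first, and only if no middleware returns a response does the request reach the view we wrote (either a function-based view or a class-based view, the latter for example built on Django REST framework).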

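So when and how is Django's middleware loaded, and what does it actually do?
First, a middleware is a class with a set of methods that have fixed names (the process_* family). Since Django 1.10, the built-in middleware classes inherit from MiddlewareMixin, defined in django/utils/deprecation.py, which makes each instance a callable object. Its code is as follows: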

class MiddlewareMixin(object):
    def __init__(self, get_response=None):
        self.get_response = get_response
        super(MiddlewareMixin, self).__init__()

    def __call__(self, request):
        response = None
        if hasattr(self, 'process_request'):
            response = self.process_request(request)
        if not response:
            response = self.get_response(request)
        if hasattr(self, 'process_response'):
            response = self.process_response(request, response)
        return response

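Other middleware classes can inherit from this class and implement the fixed-name methods they need, which is how you write your own middleware.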
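As a rough sketch of what such a subclass might look like, here is an example that runs without Django. The mixin is the same logic as the listing above; BlockPathMiddleware, fake_view, and the dict-based request and response are hypothetical stand-ins made up for illustration, not Django APIs.

```python
class MiddlewareMixin(object):
    # Same logic as the Django mixin quoted above.
    def __init__(self, get_response=None):
        self.get_response = get_response
        super(MiddlewareMixin, self).__init__()

    def __call__(self, request):
        response = None
        if hasattr(self, 'process_request'):
            response = self.process_request(request)
        if not response:
            response = self.get_response(request)
        if hasattr(self, 'process_response'):
            response = self.process_response(request, response)
        return response


class BlockPathMiddleware(MiddlewareMixin):
    """Hypothetical middleware: short-circuits requests to /blocked/."""

    def process_request(self, request):
        if request['path'] == '/blocked/':
            # Returning a response here skips get_response (the view).
            return {'status': 403, 'body': 'forbidden', 'headers': {}}
        return None  # fall through to the next handler

    def process_response(self, request, response):
        # Runs for every response, including short-circuited ones.
        response['headers']['X-Checked'] = 'yes'
        return response


def fake_view(request):
    # Stand-in for a Django view; plain dicts replace HttpResponse.
    return {'status': 200, 'body': 'hello', 'headers': {}}


middleware = BlockPathMiddleware(fake_view)
ok = middleware({'path': '/'})
blocked = middleware({'path': '/blocked/'})
```

In this sketch ok comes from fake_view, while blocked is produced by process_request and never reaches the view; both pass through process_response.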

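Now let's walk through how Django handles a request from the very beginning, and see where the middleware comes in along the way.
Start with the WSGIHandler class: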

class WSGIHandler(base.BaseHandler):
    request_class = WSGIRequest

    def __init__(self, *args, **kwargs):
        super(WSGIHandler, self).__init__(*args, **kwargs)
        self.load_middleware()

    def __call__(self, environ, start_response):
        set_script_prefix(get_script_name(environ))
        signals.request_started.send(sender=self.__class__, environ=environ)
        try:
            request = self.request_class(environ)
            # Debug prints added for this walkthrough (Python 2 syntax); not part of Django's source.
            print "request.COOKIES: ", request.COOKIES
            print "request.HTTP_AUTHORIZATION: ", request.META.get('HTTP_AUTHORIZATION','No HTTP_AUTHORIZATION')
        except UnicodeDecodeError:
            logger.warning(
                'Bad Request (UnicodeDecodeError)',
                exc_info=sys.exc_info(),
                extra={
                    'status_code': 400,
                }
            )
            response = http.HttpResponseBadRequest()
        else:
            response = self.get_response(request)

        response._handler_class = self.__class__

        status = '%d %s' % (response.status_code, response.reason_phrase)
        response_headers = [(str(k), str(v)) for k, v in response.items()]
        for c in response.cookies.values():
            response_headers.append((str('Set-Cookie'), str(c.output(header=''))))
        start_response(force_str(status), response_headers)
        if getattr(response, 'file_to_stream', None) is not None and environ.get('wsgi.file_wrapper'):
            response = environ['wsgi.file_wrapper'](response.file_to_stream)
        
        # Debug print added for this walkthrough (Python 2 syntax); not part of Django's source.
        print "type(response), response: ", type(response), response
        return response
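For context, WSGIHandler.__call__(environ, start_response) follows the standard WSGI calling convention. The sketch below is only an illustration of that convention: tiny_app is a hypothetical application, not Django code, and it is driven by hand with wsgiref from the standard library in place of a real server.

```python
from wsgiref.util import setup_testing_defaults


def tiny_app(environ, start_response):
    """Hypothetical WSGI app with the same call signature as WSGIHandler.__call__."""
    body = ('You asked for ' + environ['PATH_INFO']).encode('utf-8')
    headers = [('Content-Type', 'text/plain'),
               ('Content-Length', str(len(body)))]
    start_response('200 OK', headers)   # status line and headers go here
    return [body]                       # the body is an iterable of byte strings


# Drive the app by hand, the way a WSGI server would.
environ = {'PATH_INFO': '/hello/'}
setup_testing_defaults(environ)         # fills in the other required keys
captured = {}


def start_response(status, headers):
    captured['status'] = status
    captured['headers'] = headers


result = tiny_app(environ, start_response)
```

WSGIHandler does the same thing in spirit: it builds a request from environ, obtains a response, reports the status and headers through start_response, and returns the response as the body iterable.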

Here we focus on two things: self.load_middleware() in __init__, and response = self.get_response(request) in __call__.
The source of load_middleware is as follows:

    def load_middleware(self):
        """
        Populate middleware lists from settings.MIDDLEWARE (or the deprecated
        MIDDLEWARE_CLASSES).

        Must be called after the environment is fixed (see __call__ in subclasses).
        """
        self._request_middleware = []
        self._view_middleware = []
        self._template_response_middleware = []
        self._response_middleware = []
        self._exception_middleware = []

        if settings.MIDDLEWARE is None:
            warnings.warn(
                "Old-style middleware using settings.MIDDLEWARE_CLASSES is "
                "deprecated. Update your middleware and use settings.MIDDLEWARE "
                "instead.", RemovedInDjango20Warning
            )
            handler = convert_exception_to_response(self._legacy_get_response)
            for middleware_path in settings.MIDDLEWARE_CLASSES:
                mw_class = import_string(middleware_path)
                try:
                    mw_instance = mw_class()
                except MiddlewareNotUsed as exc:
                    if settings.DEBUG:
                        if six.text_type(exc):
                            logger.debug('MiddlewareNotUsed(%r): %s', middleware_path, exc)
                        else:
                            logger.debug('MiddlewareNotUsed: %r', middleware_path)
                    continue

                if hasattr(mw_instance, 'process_request'):
                    self._request_middleware.append(mw_instance.process_request)
                if hasattr(mw_instance, 'process_view'):
                    self._view_middleware.append(mw_instance.process_view)
                if hasattr(mw_instance, 'process_template_response'):
                    self._template_response_middleware.insert(0, mw_instance.process_template_response)
                if hasattr(mw_instance, 'process_response'):
                    self._response_middleware.insert(0, mw_instance.process_response)
                if hasattr(mw_instance, 'process_exception'):
                    self._exception_middleware.insert(0, mw_instance.process_exception)
        else:
            handler = convert_exception_to_response(self._get_response)
            for middleware_path in reversed(settings.MIDDLEWARE):
                middleware = import_string(middleware_path)
                try:
                    mw_instance = middleware(handler)
                except MiddlewareNotUsed as exc:
                    if settings.DEBUG:
                        if six.text_type(exc):
                            logger.debug('MiddlewareNotUsed(%r): %s', middleware_path, exc)
                        else:
                            logger.debug('MiddlewareNotUsed: %r', middleware_path)
                    continue

                if mw_instance is None:
                    raise ImproperlyConfigured(
                        'Middleware factory %s returned None.' % middleware_path
                    )

                if hasattr(mw_instance, 'process_view'):
                    self._view_middleware.insert(0, mw_instance.process_view)
                if hasattr(mw_instance, 'process_template_response'):
                    self._template_response_middleware.append(mw_instance.process_template_response)
                if hasattr(mw_instance, 'process_exception'):
                    self._exception_middleware.append(mw_instance.process_exception)

                handler = convert_exception_to_response(mw_instance)

        # We only assign to this when initialization is complete as it is used
        # as a flag for initialization being complete.
        self._middleware_chain = handler

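The main job of self.load_middleware() is to read the middleware configured in settings and initialize the middleware-related attributes on the handler. These are lists of middleware methods such as self._request_middleware, self._view_middleware, and self._response_middleware. Note that in the settings.MIDDLEWARE branch only _view_middleware, _template_response_middleware, and _exception_middleware are filled in; _request_middleware and _response_middleware stay empty, because process_request and process_response are reached through the middleware's own __call__.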
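The ordering in the legacy MIDDLEWARE_CLASSES branch (append for process_request, insert(0, ...) for process_response) can be illustrated with a simplified sketch. build_legacy_lists and Named are hypothetical names invented for this example; only the append/insert pattern is taken from the source above.

```python
def build_legacy_lists(middleware_instances):
    """Mimic how the MIDDLEWARE_CLASSES branch fills two of its lists."""
    request_methods = []
    response_methods = []
    for mw in middleware_instances:
        if hasattr(mw, 'process_request'):
            request_methods.append(mw.process_request)       # settings order
        if hasattr(mw, 'process_response'):
            response_methods.insert(0, mw.process_response)  # reverse order
    return request_methods, response_methods


class Named(object):
    """Hypothetical old-style middleware that records when it runs."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def process_request(self, request):
        self.log.append(self.name + '.process_request')

    def process_response(self, request, response):
        self.log.append(self.name + '.process_response')
        return response


log = []
req_methods, resp_methods = build_legacy_lists([Named('A', log), Named('B', log)])

# The handler walks the lists at fixed points in the flow.
for method in req_methods:
    method('request')
response = 'response'
for method in resp_methods:
    response = method('request', response)
```

After this runs, log shows the request methods in settings order (A, then B) and the response methods in reverse (B, then A).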

Next is response = self.get_response(request) in WSGIHandler.__call__, which is the entry point for Django's request handling:

    def get_response(self, request):
        """Return an HttpResponse object for the given HttpRequest."""
        # Setup default url resolver for this thread
        set_urlconf(settings.ROOT_URLCONF)

        response = self._middleware_chain(request)

        # This block is only needed for legacy MIDDLEWARE_CLASSES; if
        # MIDDLEWARE is used, self._response_middleware will be empty.
        try:
            # Apply response middleware, regardless of the response
            for middleware_method in self._response_middleware:
                response = middleware_method(request, response)
                # Complain if the response middleware returned None (a common error).
                if response is None:
                    raise ValueError(
                        "%s.process_response didn't return an "
                        "HttpResponse object. It returned None instead."
                        % (middleware_method.__self__.__class__.__name__))
        except Exception:  # Any exception should be gathered and handled
            signals.got_request_exception.send(sender=self.__class__, request=request)
            response = self.handle_uncaught_exception(request, get_resolver(get_urlconf()), sys.exc_info())

        response._closable_objects.append(request)

        # If the exception handler returns a TemplateResponse that has not
        # been rendered, force it to be rendered.
        if not getattr(response, 'is_rendered', True) and callable(getattr(response, 'render', None)):
            response = response.render()

        if response.status_code == 404:
            logger.warning(
                'Not Found: %s', request.path,
                extra={'status_code': 404, 'request': request},
            )

        return response

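In get_response, the line to focus on is response = self._middleware_chain(request). self._middleware_chain is set up by self.load_middleware(), which WSGIHandler calls from __init__. When the middleware is configured through MIDDLEWARE_CLASSES, _middleware_chain is simply the _legacy_get_response method wrapped by convert_exception_to_response. When it is configured through MIDDLEWARE, _middleware_chain is the outermost middleware object (also wrapped by convert_exception_to_response), and that object's get_response is in turn the chain of all the middleware loaded before it, ending with _get_response (this is my own way of putting it). See the load_middleware source above for details.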
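A minimal sketch of how the settings.MIDDLEWARE branch nests the handlers, assuming simplified function-style middleware. make_middleware and build_chain are hypothetical helpers, and the convert_exception_to_response wrapping that Django applies at each layer is left out to keep the example short.

```python
def make_middleware(name, log):
    """Return a hypothetical middleware factory that records call order."""
    def factory(get_response):
        def middleware(request):
            log.append(name + ' before')
            response = get_response(request)   # next layer inward
            log.append(name + ' after')
            return response
        return middleware
    return factory


def build_chain(view, factories):
    """Wrap `view` the way load_middleware wraps self._get_response."""
    handler = view
    for factory in reversed(factories):   # the last entry wraps the view first
        handler = factory(handler)
    return handler                        # outermost layer = first entry


log = []


def view(request):
    log.append('view')
    return 'response'


MIDDLEWARE = [make_middleware('A', log), make_middleware('B', log)]
chain = build_chain(view, MIDDLEWARE)
result = chain('request')
```

Because the list is walked in reverse, the first entry ends up outermost: A sees the request first and the response last.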

Now look at _get_response, the function that actually handles the request. Once you understand it, you basically understand how Django processes a request:

    def _get_response(self, request):
        """
        Resolve and call the view, then apply view, exception, and
        template_response middleware. This method is everything that happens
        inside the request/response middleware.
        """
        response = None

        if hasattr(request, 'urlconf'):
            urlconf = request.urlconf
            set_urlconf(urlconf)
            resolver = get_resolver(urlconf)
        else:
            resolver = get_resolver()

        resolver_match = resolver.resolve(request.path_info)
        callback, callback_args, callback_kwargs = resolver_match
        request.resolver_match = resolver_match

        # Apply view middleware
        for middleware_method in self._view_middleware:
            response = middleware_method(request, callback, callback_args, callback_kwargs)
            if response:
                break

        if response is None:
            wrapped_callback = self.make_view_atomic(callback)
            try:
                response = wrapped_callback(request, *callback_args, **callback_kwargs)
            except Exception as e:
                response = self.process_exception_by_middleware(e, request)

        # Complain if the view returned None (a common error).
        if response is None:
            if isinstance(callback, types.FunctionType):    # FBV
                view_name = callback.__name__
            else:                                           # CBV
                view_name = callback.__class__.__name__ + '.__call__'

            raise ValueError(
                "The view %s.%s didn't return an HttpResponse object. It "
                "returned None instead." % (callback.__module__, view_name)
            )

        # If the response supports deferred rendering, apply template
        # response middleware and then render the response
        elif hasattr(response, 'render') and callable(response.render):
            for middleware_method in self._template_response_middleware:
                response = middleware_method(request, response)
                # Complain if the template response middleware returned None (a common error).
                if response is None:
                    raise ValueError(
                        "%s.process_template_response didn't return an "
                        "HttpResponse object. It returned None instead."
                        % (middleware_method.__self__.__class__.__name__)
                    )

            try:
                response = response.render()
            except Exception as e:
                response = self.process_exception_by_middleware(e, request)

        return response

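In _get_response, Django first resolves the requested URL to find the view written by the developer, which is the callback in callback, callback_args, callback_kwargs = resolver_match. The callback is wrapped by wrapped_callback = self.make_view_atomic(callback) and actually called in response = wrapped_callback(request, *callback_args, **callback_kwargs). The order of execution in _get_response shows that the developer's view runs only if none of the middleware has produced a response first. This is what was meant at the start: request handling begins with the middleware and reaches the view last.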
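The short-circuit in the process_view loop can be reduced to a few lines. This is a simplified sketch, not Django's implementation: get_response here takes the view directly instead of resolving a URL, and require_login and profile_view are hypothetical.

```python
def get_response(request, view, view_middleware):
    """Simplified sketch of the view-middleware step in _get_response."""
    response = None
    for process_view in view_middleware:
        response = process_view(request, view)
        if response:
            break                      # the first middleware response wins
    if response is None:
        response = view(request)       # reached only if nobody short-circuited
    if response is None:
        raise ValueError("The view didn't return a response.")
    return response


def require_login(request, view):
    # Hypothetical process_view hook: reject anonymous requests.
    if not request.get('user'):
        return 'redirect to /login/'
    return None


def profile_view(request):
    return 'profile of ' + request['user']


anonymous = get_response({}, profile_view, [require_login])
logged_in = get_response({'user': 'alice'}, profile_view, [require_login])
```

For the anonymous request the view is never called; for the logged-in one the hook returns None and the view runs.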
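In the Django 1.10 source, the new-style path never explicitly calls the individual middleware methods such as process_request. So who calls them? The key is _middleware_chain. As mentioned above, before Django 1.10 load_middleware put each middleware's methods into fixed lists, and those lists were then run at fixed points in the flow. From 1.10 on, there is no explicit call site. Instead, _middleware_chain has become a special middleware object: each time load_middleware loads another middleware, the previously built handler is passed in as its get_response, so the newest object contains everything loaded before it. Calling _middleware_chain therefore calls that middleware's __call__ method, which in turn calls the __call__ of the middleware loaded before it, and so on recursively, so that every middleware's process_* methods get called. This is a hard point to grasp; studying convert_exception_to_response, which load_middleware uses, makes it clear.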

def convert_exception_to_response(get_response):
    """
    Wrap the given get_response callable in exception-to-response conversion.

    All exceptions will be converted. All known 4xx exceptions (Http404,
    PermissionDenied, MultiPartParserError, SuspiciousOperation) will be
    converted to the appropriate response, and all other exceptions will be
    converted to 500 responses.

    This decorator is automatically applied to all middleware to ensure that
    no middleware leaks an exception and that the next middleware in the stack
    can rely on getting a response instead of an exception.
    """
    @wraps(get_response, assigned=available_attrs(get_response))
    def inner(request):
        try:
            response = get_response(request)
        except Exception as exc:
            response = response_for_exception(request, exc)
        return response
    return inner
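To see the exception-to-response conversion on its own, here is a cut-down sketch that runs without Django. NotFound and the dict-based responses are hypothetical stand-ins, and response_for_exception is a simplified imitation of the real helper, which handles more exception types.

```python
from functools import wraps


class NotFound(Exception):
    """Hypothetical stand-in for django.http.Http404."""


def response_for_exception(request, exc):
    # Simplified: one known error maps to 404, everything else to 500.
    if isinstance(exc, NotFound):
        return {'status': 404}
    return {'status': 500}


def convert_exception_to_response(get_response):
    @wraps(get_response)
    def inner(request):
        try:
            response = get_response(request)
        except Exception as exc:
            response = response_for_exception(request, exc)
        return response
    return inner


@convert_exception_to_response
def missing_page(request):
    raise NotFound(request)


@convert_exception_to_response
def broken_view(request):
    raise RuntimeError('boom')


@convert_exception_to_response
def healthy_view(request):
    return {'status': 200}


results = [missing_page('/x/'), broken_view('/'), healthy_view('/')]
```

Whatever the wrapped callable does, the caller always gets a response back instead of an exception, which is what lets each middleware layer rely on its get_response.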

When a piece of code is hard to understand, it helps to write a small example and experiment with it.

from functools import wraps

def available_attrs(fn):
    """
    Return the list of functools-wrappable attributes on a callable.
    This is required as a workaround for http://bugs.python.org/issue3445
    under Python 2.
    """
    WRAPPER_ASSIGNMENTS = ('__module__', '__name__', '__doc__')

    return tuple(a for a in WRAPPER_ASSIGNMENTS if hasattr(fn, a))

def convert_exception_to_response(get_response):
    """
    Wrap the given get_response callable in exception-to-response conversion.

    All exceptions will be converted. All known 4xx exceptions (Http404,
    PermissionDenied, MultiPartParserError, SuspiciousOperation) will be
    converted to the appropriate response, and all other exceptions will be
    converted to 500 responses.

    This decorator is automatically applied to all middleware to ensure that
    no middleware leaks an exception and that the next middleware in the stack
    can rely on getting a response instead of an exception.
    """
    @wraps(get_response, assigned=available_attrs(get_response))
    def inner():
        response = None
        try:
            response = get_response()
        except Exception as exc:
            print(exc)
        return response
    return inner

def get_response():
    # Stands in for the innermost handler (_get_response in Django).
    print("xxxxx")

class A1(object):
    def __init__(self, f):
        self.f = f
        print("A1 init")

    def __call__(self, *args, **kwargs):
        self.f()
        print("A1 call")

class A2(object):
    def __init__(self, f):
        self.f = f
        print("A2 init")

    def __call__(self, *args, **kwargs):
        self.f()
        print("A2 call")

class A3(object):
    def __init__(self, f):
        self.f = f
        print("A3 init")

    def __call__(self, *args, **kwargs):
        self.f()
        print("A3 call")


# Build the chain the same way load_middleware does.
f = convert_exception_to_response(get_response)
f = convert_exception_to_response(A1(f))
f = convert_exception_to_response(A2(f))
# wraps() copies the A2 instance's __dict__ onto inner, so f.f is the
# previous handler (the wrapped A1 instance); this call skips A2.
f.f()
f = convert_exception_to_response(A3(f))
# Calling the outermost handler runs the whole chain.
f()

The output is:

A1 init
A2 init
xxxxx
A1 call
A3 init
xxxxx
A1 call
A2 call
A3 call

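The small example makes it fairly clear what convert_exception_to_response does: each call wraps the previous handler, so calling the outermost one runs the whole chain.

For example, process_request in django.contrib.auth.middleware.AuthenticationMiddleware, which handles authentication, is called in exactly this way.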

class AuthenticationMiddleware(MiddlewareMixin):
    def process_request(self, request):
        assert hasattr(request, 'session'), (
            "The Django authentication middleware requires session middleware "
            "to be installed. Edit your MIDDLEWARE%s setting to insert "
            "'django.contrib.sessions.middleware.SessionMiddleware' before "
            "'django.contrib.auth.middleware.AuthenticationMiddleware'."
        ) % ("_CLASSES" if settings.MIDDLEWARE is None else "")
        request.user = SimpleLazyObject(lambda: get_user(request))

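According to material online, middleware has inherited from MiddlewareMixin only since Django 1.10; earlier versions had no base class to inherit from, which is the traditional style of middleware (legacy middleware).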

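To summarize: before Django 1.10, all middleware methods were added to specific lists, and the methods in those lists were called in turn to process the request and the response. From 1.10 on, a middleware is a callable object: calling it runs its __call__ method, which calls process_request, get_response, and process_response in that order. AuthenticationMiddleware, which handles user authentication, for example uses process_request to initialize request.user.
Here is a picture borrowed from the web: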

[Image from the original post: image.png]

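Use cases for middleware
Because middleware runs before and after the view function (much like a decorator applied to every view), it is well suited to processing all requests, or a subset of them, in one place.

1. IP restriction
Put it in the middleware list to block requests from certain IP addresses.

2. URL access filtering
If the user requests the login view, let the request through.
If the user requests any other view, check for an existing session: let the request through if there is one, otherwise send the user back to login. This saves writing a decorator on many view functions.

3. Caching (remember CDNs?)
When a client request arrives, the middleware checks the cache for the data. On a hit it returns the data directly to the user; on a miss it goes on to the logic layer and runs the view function.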
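The three use cases can be sketched together as function-style middleware. Everything here is hypothetical and runs without Django: the dict-based request and response, BLOCKED_IPS, PUBLIC_PATHS, and the per-process dict cache, which is deliberately naive (it is keyed only by path and never expires).

```python
BLOCKED_IPS = {'10.0.0.66'}          # hypothetical blocklist
PUBLIC_PATHS = {'/login/'}           # paths that need no session


def ip_filter(get_response):
    def middleware(request):
        if request['ip'] in BLOCKED_IPS:
            return {'status': 403, 'body': 'forbidden'}
        return get_response(request)
    return middleware


def login_required(get_response):
    def middleware(request):
        if request['path'] not in PUBLIC_PATHS and not request.get('session'):
            return {'status': 302, 'body': '/login/'}   # send back to login
        return get_response(request)
    return middleware


def simple_cache(get_response):
    cache = {}                       # naive: keyed by path only, never expires
    def middleware(request):
        key = request['path']
        if key not in cache:
            cache[key] = get_response(request)   # miss: run the view
        return cache[key]                        # hit: skip the view
    return middleware


view_calls = []


def view(request):
    view_calls.append(request['path'])
    return {'status': 200, 'body': 'page ' + request['path']}


# Outermost first, mirroring the order of settings.MIDDLEWARE.
handler = view
for factory in reversed([ip_filter, login_required, simple_cache]):
    handler = factory(handler)

blocked = handler({'ip': '10.0.0.66', 'path': '/home/'})
anonymous = handler({'ip': '1.2.3.4', 'path': '/home/'})
first = handler({'ip': '1.2.3.4', 'path': '/home/', 'session': 'abc'})
second = handler({'ip': '1.2.3.4', 'path': '/home/', 'session': 'abc'})
```

Only the third request reaches the view; the fourth is served from the cache, so view_calls holds a single entry.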

References:
https://docs.djangoproject.com/en/2.0/topics/http/middleware/
https://code.ziqiangxuetang.com/django/django-middleware.html
http://www.cnblogs.com/huchong/p/7819296.html
http://daoluan.net/%E5%AD%A6%E4%B9%A0%E6%80%BB%E7%BB%93/2013/09/13/decode-django-have-look-at-middleware.html

    Original author: llicety
    Original URL: https://www.jianshu.com/p/4cc8935ebe0b