Linux study notes 042: excluding static files from the access log, access log rotation, and static asset expiry

Published 2018-03-05, 399 views


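Excluding static files from the access log

Most elements of a website are static files such as images, CSS, and JS, and requests for them usually do not need to be logged.

If static files are logged too, the constant stream of requests drives up disk I/O and wastes server resources, so unless you need those entries, it is advisable to turn off logging for static file requests.

To see which elements a page loads, press F12 in the browser, refresh the page, and look at the Network tab. Every image, CSS file, and JS file is a separate request, and each one is written to the server log even though most of them are not worth recording.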

Example:

[root@am-01:~#] vim /usr/local/apache2.4/conf/extra/httpd-vhosts.conf

  <VirtualHost *:80>

      DocumentRoot "/data/wwwroot/111.com"

      ServerName 111.com

      ServerAlias www.example.com

      ErrorLog "logs/111.com-error_log"

      SetEnvIf Request_URI ".*\.gif$" img

      SetEnvIf Request_URI ".*\.jpg$" img

      SetEnvIf Request_URI ".*\.png$" img

      SetEnvIf Request_URI ".*\.bmp$" img

      SetEnvIf Request_URI ".*\.swf$" img

      SetEnvIf Request_URI ".*\.js$" img

      SetEnvIf Request_URI ".*\.css$" img

      CustomLog "logs/111.com-access_log" combined env=!img

  #    <Directory /data/wwwroot/111.com>

  #    <FilesMatch 123.php>

  #        AllowOverride AuthConfig

  #        AuthName "111.com user auth"

  #        AuthType Basic

  #        AuthUserFile /data/.htpasswd

  #        require valid-user

  #    </filesMatch>

  #    </Directory>

      <IfModule mod_rewrite.c>

          RewriteEngine on

          RewriteCond %{HTTP_HOST} !^111\.com$

          RewriteRule ^/(.*)$ http://111.com/$1 [R=301,L]

      </IfModule>

  </VirtualHost>

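# Edit the virtual host configuration: each SetEnvIf line tags any request whose URI ends in .gif, .jpg, and so on with the environment variable img. Appending env=!img to CustomLog means that only requests without the img variable are logged; in other words, requests tagged img are not logged.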
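To preview which URIs the rules above would tag, the same suffix patterns can be tried out in the shell. This is only a rough sketch: is_static_uri is a hypothetical helper, and it mimics the seven SetEnvIf patterns with a single grep regular expression rather than asking Apache itself.

```shell
# Hypothetical helper: succeeds (exit 0) when the URI ends in one of the
# suffixes that the SetEnvIf lines tag as "img".
is_static_uri() {
    printf '%s\n' "$1" | grep -Eq '\.(gif|jpg|png|bmp|swf|js|css)$'
}

# Sample URIs, some taken from the tests in this note.
for uri in /1.jpg /2.gif /1 /123.php /style.css; do
    if is_static_uri "$uri"; then
        echo "$uri -> tagged img, not logged"
    else
        echo "$uri -> logged"
    fi
done
```

Note that a suffix such as .jpeg or .ico is not covered by the list, so those requests would still be logged.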

Test:

[root@am-01:~#] curl -x172.17.1.240:80 111.com/1.jpg -I

HTTP/1.1 404 Not Found

Date: Mon, 05 Mar 2018 14:43:13 GMT

Server: Apache/2.4.29 (Unix) PHP/7.1.6

Content-Type: text/html; charset=iso-8859-1

[root@am-01:~#] tail /usr/local/apache2.4/logs/111.com-access_log

172.17.1.1 - - [03/Mar/2018:00:48:47 +0800] "GET /123.php HTTP/1.1" 200 75087 "https://www.itwordsweb.com/276.html" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.119 Safari/537.36"

172.17.1.1 - - [03/Mar/2018:00:48:47 +0800] "GET /robots.txt HTTP/1.1" 404 208 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.119 Safari/537.36"

172.17.1.1 - - [03/Mar/2018:00:48:48 +0800] "GET /favicon.ico HTTP/1.1" 404 209 "http://111.com/123.php" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.119 Safari/537.36"

172.17.1.240 - - [03/Mar/2018:00:55:39 +0800] "HEAD HTTP://111.com/hjhjh HTTP/1.1" 404 - "-" "curl/7.29.0"

172.17.1.240 - - [03/Mar/2018:00:56:03 +0800] "HEAD HTTP://www.example.com/hjhjh HTTP/1.1" 301 - "-" "curl/7.29.0"

172.17.1.240 - - [03/Mar/2018:00:56:06 +0800] "HEAD HTTP://111.com/hjhjh HTTP/1.1" 404 - "-" "curl/7.29.0"

172.17.1.240 - - [03/Mar/2018:00:57:50 +0800] "HEAD HTTP://111.com/hjhjh HTTP/1.1" 404 - "-" "curl/7.29.0"

172.17.1.240 - - [03/Mar/2018:01:09:14 +0800] "HEAD HTTP://111.com/hjhjh HTTP/1.1" 404 - "-" "curl/7.29.0"

172.17.1.240 - - [03/Mar/2018:01:09:22 +0800] "HEAD HTTP://www.example.com/hjhjh HTTP/1.1" 301 - "-" "curl/7.29.0"

172.17.1.240 - - [05/Mar/2018:22:43:13 +0800] "HEAD HTTP://111.com/1.jpg HTTP/1.1" 404 - "-" "curl/7.29.0"

# After editing the virtual host configuration but before reloading it, requests ending in .jpg are still logged (see the last entry above)
[root@am-01:~#] /usr/local/apache2.4/bin/apachectl -t

Syntax OK

[root@am-01:~#] /usr/local/apache2.4/bin/apachectl graceful

[root@am-01:~#] curl -x172.17.1.240:80 111.com/1.jpg -I

HTTP/1.1 404 Not Found

Date: Mon, 05 Mar 2018 14:44:53 GMT

Server: Apache/2.4.29 (Unix) PHP/7.1.6

Content-Type: text/html; charset=iso-8859-1

[root@am-01:~#] curl -x172.17.1.240:80 111.com/2.gif -I

HTTP/1.1 404 Not Found

Date: Mon, 05 Mar 2018 14:45:13 GMT

Server: Apache/2.4.29 (Unix) PHP/7.1.6

Content-Type: text/html; charset=iso-8859-1

[root@am-01:~#] curl -x172.17.1.240:80 111.com/1 -I

HTTP/1.1 404 Not Found

Date: Mon, 05 Mar 2018 14:46:59 GMT

Server: Apache/2.4.29 (Unix) PHP/7.1.6

Content-Type: text/html; charset=iso-8859-1

[root@am-01:~#] tail /usr/local/apache2.4/logs/111.com-access_log

172.17.1.1 - - [03/Mar/2018:00:48:47 +0800] "GET /robots.txt HTTP/1.1" 404 208 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.119 Safari/537.36"

172.17.1.1 - - [03/Mar/2018:00:48:48 +0800] "GET /favicon.ico HTTP/1.1" 404 209 "http://111.com/123.php" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.119 Safari/537.36"

172.17.1.240 - - [03/Mar/2018:00:55:39 +0800] "HEAD HTTP://111.com/hjhjh HTTP/1.1" 404 - "-" "curl/7.29.0"

172.17.1.240 - - [03/Mar/2018:00:56:03 +0800] "HEAD HTTP://www.example.com/hjhjh HTTP/1.1" 301 - "-" "curl/7.29.0"

172.17.1.240 - - [03/Mar/2018:00:56:06 +0800] "HEAD HTTP://111.com/hjhjh HTTP/1.1" 404 - "-" "curl/7.29.0"

172.17.1.240 - - [03/Mar/2018:00:57:50 +0800] "HEAD HTTP://111.com/hjhjh HTTP/1.1" 404 - "-" "curl/7.29.0"

172.17.1.240 - - [03/Mar/2018:01:09:14 +0800] "HEAD HTTP://111.com/hjhjh HTTP/1.1" 404 - "-" "curl/7.29.0"

172.17.1.240 - - [03/Mar/2018:01:09:22 +0800] "HEAD HTTP://www.example.com/hjhjh HTTP/1.1" 301 - "-" "curl/7.29.0"

172.17.1.240 - - [05/Mar/2018:22:43:13 +0800] "HEAD HTTP://111.com/1.jpg HTTP/1.1" 404 - "-" "curl/7.29.0"

172.17.1.240 - - [05/Mar/2018:22:46:59 +0800] "HEAD HTTP://111.com/1 HTTP/1.1" 404 - "-" "curl/7.29.0"

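# After the configuration is reloaded, requests ending in .jpg, .gif, and so on are no longer logged: the log gained an entry for /1, but none for the second 1.jpg request or for 2.gif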
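Rather than reading the tail of the log by eye, the check can be scripted. The sketch below assumes the combined log format shown above; count_static_hits is a hypothetical helper, and the two sample lines are copied from the log output in this note.

```shell
# Hypothetical helper: count logged requests whose path ends in a static suffix.
# "|| true" keeps the exit status at 0 when grep finds no match (count 0).
count_static_hits() {
    grep -cE '"[A-Z]+ [^ "]*\.(gif|jpg|png|bmp|swf|js|css) HTTP/' "$1" || true
}

# Demo on a temporary file holding two entries from the log above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
172.17.1.240 - - [05/Mar/2018:22:43:13 +0800] "HEAD HTTP://111.com/1.jpg HTTP/1.1" 404 - "-" "curl/7.29.0"
172.17.1.240 - - [05/Mar/2018:22:46:59 +0800] "HEAD HTTP://111.com/1 HTTP/1.1" 404 - "-" "curl/7.29.0"
EOF
hits=$(count_static_hits "$sample")
echo "static requests in log: $hits"
rm -f "$sample"
```

Entries written before the reload stay in the file, so on the real log the count only stops growing; it does not drop to zero.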

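Access log rotation

A log that grows forever will eventually fill the disk, so it should be rotated automatically and old log files deleted.

The date command shows the current system time and time zone. Systems set to China time report CST (China Standard Time, UTC+8), while many servers default to UTC.

Example: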

[root@am-01:~#] vim /usr/local/apache2.4/conf/extra/httpd-vhosts.conf

  CustomLog "|/usr/local/apache2.4/bin/rotatelogs -l logs/111.com-access_%Y%m%d_log 86400" combined env=!img

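# Change this line to pipe the log through rotatelogs, the log rotation tool that ships with Apache. -l uses the local system time as the basis for rotation (without it, UTC is used). The file name is built from the date, and the final argument is the rotation interval in seconds: 86400 seconds means one log file per day.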
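The effect of -l is easiest to see with a timestamp shortly after local midnight. The sketch below does not run rotatelogs; it only formats the same file name pattern with date, once in UTC and once in UTC+8. It assumes GNU date (for -d "@epoch"), and the epoch value 1520269200 is an illustrative instant chosen for this example (2018-03-05 17:00:00 UTC, which is 2018-03-06 01:00:00 in UTC+8).

```shell
# Illustrative timestamp: 2018-03-05 17:00:00 UTC = 2018-03-06 01:00:00 UTC+8.
ts=1520269200

# POSIX TZ strings: UTC0 is UTC; CST-8 is a zone named CST, 8 hours east of UTC.
utc_name=$(TZ=UTC0 date -d "@$ts" '+111.com-access_%Y%m%d_log')
local_name=$(TZ=CST-8 date -d "@$ts" '+111.com-access_%Y%m%d_log')

echo "UTC-based name:        $utc_name"
echo "local-time (-l) name:  $local_name"
```

The two names differ by one day, which is why entries from the early hours of the local day would otherwise end up in a file dated the day before.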

Test:

[root@am-01:~#] vim /usr/local/apache2.4/conf/extra/httpd-vhosts.conf

[root@am-01:~#] /usr/local/apache2.4/bin/apachectl -t

Syntax OK

[root@am-01:~#] ls /usr/local/apache2.4/logs/

111.com-access_20180305_log  111.com-error_log   abc.com-error_log  error_log

111.com-access_log           abc.com-access_log  access_log         httpd.pid

[root@am-01:~#] cat /usr/local/apache2.4/logs/111.com-access_20180305_log

127.0.0.1 - - [05/Mar/2018:23:11:06 +0800] "HEAD HTTP://111.com HTTP/1.1" 200 - "-" "curl/7.29.0"

127.0.0.1 - - [05/Mar/2018:23:12:03 +0800] "HEAD HTTP://111.com/123.php HTTP/1.1" 200 - "-" "curl/7.29.0"

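# Log rotation is working: once the site is accessed, a log file for the current day appears in the logs directory

# It is also advisable to add a scheduled task (cron job) that deletes logs older than a chosen age, so that old log files do not pile up and fill the disk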
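One possible shape for that cleanup job is sketched below. purge_old_logs is a hypothetical helper, the 30-day retention is an arbitrary choice, and the script path in the cron comment is made up for illustration; only the log directory and file name pattern come from the configuration above. It relies on the -maxdepth and -delete options of GNU find.

```shell
# Hypothetical helper: delete rotated access logs older than N days.
# $1 = log directory, $2 = number of days to keep.
purge_old_logs() {
    log_dir=$1
    keep_days=$2
    # -mtime +N matches files last modified more than N full days ago;
    # -print lists each file as it is deleted.
    find "$log_dir" -maxdepth 1 -type f -name '111.com-access_*_log' \
        -mtime +"$keep_days" -print -delete
}

# Intended use, for example from a script run by cron once a day:
#   purge_old_logs /usr/local/apache2.4/logs 30
# Hypothetical crontab entry (the script path is an assumption):
#   0 3 * * * /bin/sh /usr/local/sbin/purge_apache_logs.sh
```

Running the find command once with -print but without -delete is a cheap way to confirm that the pattern matches only the files you expect.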

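Static asset expiry

When a browser loads a site's images, it caches these static files on the local machine, so the next visit does not have to download them again.

Example: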

[root@am-01:~#] cd /data/wwwroot/111.com/

[root@am-01:/data/wwwroot/111.com#] ls

123.php  index.php

[root@am-01:/data/wwwroot/111.com#] wget http://cdn.itwordsweb.com/wp-content/uploads/2018/02/%E6%9C%AA%E6%A0%87%E9%A2%98-1-1-1.png

--2018-03-05 23:22:09--  http://cdn.itwordsweb.com/wp-content/uploads/2018/02/%E6%9C%AA%E6%A0%87%E9%A2%98-1-1-1.png

Resolving cdn.itwordsweb.com (cdn.itwordsweb.com)... 115.231.71.234, 115.231.71.231, 115.231.71.233, ...

Connecting to cdn.itwordsweb.com (cdn.itwordsweb.com)|115.231.71.234|:80... connected.

HTTP request sent, awaiting response... 200 OK

Length: 5381 (5.3K) [image/png]

Saving to: '未标题-1-1-1.png'

100%[====================================================================>] 5,381       --.-K/s   in 0s

2018-03-05 23:22:14 (136 MB/s) - '未标题-1-1-1.png' saved [5381/5381]

[root@am-01:/data/wwwroot/111.com#] mv 未标题-1-1-1.png 1.png

[root@am-01:/data/wwwroot/111.com#] ls

123.php  1.png  index.php

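# Download an image into the site's document root on the server

Try it from a client browser: the first request returns status 200, and the second returns 304 (Not Modified, meaning the content has not changed since it was cached, so it is not downloaded again).

However, if no expiry time is defined, the server gives the browser no guidance on how long the cached copy stays valid. The browser has to guess, and it may keep showing a stale copy for a while after the image changes on the server. Setting an explicit expiry time makes this behaviour predictable.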

[root@am-01:~#] vim /usr/local/apache2.4/conf/extra/httpd-vhosts.conf

    <IfModule mod_expires.c>

        ExpiresActive on

# Turn on the expiry feature

        ExpiresByType image/gif  "access plus 1 days"

        ExpiresByType image/jpeg "access plus 24 hours"

        ExpiresByType image/png "access plus 24 hours"

        ExpiresByType text/css "now plus 2 hour"

        ExpiresByType application/x-javascript "now plus 2 hours"

        ExpiresByType application/javascript "now plus 2 hours"

        ExpiresByType application/x-shockwave-flash "now plus 2 hours"

        ExpiresDefault "now plus 0 min"

    </IfModule>

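# Add the settings above to the virtual host configuration: they switch on the expiry feature and set a cache lifetime for each static content type (gif, jpeg, css, and so on)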
[root@am-01:~#] /usr/local/apache2.4/bin/apachectl -t

Syntax OK

[root@am-01:~#] vim /usr/local/apache2.4/conf/httpd.conf

  LoadModule expires_module modules/mod_expires.so

[root@am-01:~#] /usr/local/apache2.4/bin/apachectl graceful

[root@am-01:~#] /usr/local/apache2.4/bin/apachectl -M | grep expires

 expires_module (shared)

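# Check the configuration syntax, enable the mod_expires.so module in the main configuration file, reload the configuration, and confirm that the module is loaded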
[root@am-01:~#] curl -x127.0.0.1:80 111.com/1.png -I

HTTP/1.1 200 OK

Date: Mon, 05 Mar 2018 15:41:39 GMT

Server: Apache/2.4.29 (Unix) PHP/7.1.6

Last-Modified: Mon, 05 Mar 2018 15:22:14 GMT

ETag: "1505-566abe402ca24"

Accept-Ranges: bytes

Content-Length: 5381

Cache-Control: max-age=86400

Expires: Tue, 06 Mar 2018 15:41:39 GMT

Content-Type: image/png

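# The curl test shows new Cache-Control and Expires headers: max-age=86400 sets the cache lifetime to 86400 seconds (24 hours), matching the image/png rule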
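To check the lifetime without reading the headers by hand, the max-age value can be pulled out of the response and converted to hours. This is a small sketch: max_age_seconds is a hypothetical helper, and the sample headers are copied from the curl output above.

```shell
# Hypothetical helper: read response headers on stdin and print the
# Cache-Control max-age value in seconds (nothing if the header is absent).
max_age_seconds() {
    tr -d '\r' |
        sed -n 's/^[Cc]ache-[Cc]ontrol:.*max-age=\([0-9][0-9]*\).*/\1/p' |
        head -n 1
}

# Sample headers taken from the curl output above.
headers='HTTP/1.1 200 OK
Cache-Control: max-age=86400
Expires: Tue, 06 Mar 2018 15:41:39 GMT
Content-Type: image/png'

secs=$(printf '%s\n' "$headers" | max_age_seconds)
echo "max-age: $secs seconds = $((secs / 3600)) hours"
```

Against the live server the same helper can be fed directly, for example: curl -sI -x127.0.0.1:80 111.com/1.png | max_age_seconds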

Further reading

Logging the proxy IP and the real client IP in Apache:

http://ask.apelearn.com/question/960

Logging only specified URIs in Apache:

http://ask.apelearn.com/question/981

Logging the domain name requested by the client in Apache:

http://ask.apelearn.com/question/1037

Apache log rotation issues:

http://ask.apelearn.com/question/566