Analyzing nginx Logs with Shell Scripts: A Tutorial with Examples

Published: 2021-01-21. Editor: 脚本学堂

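This article shows how to analyze the contents of nginx log files with shell scripts. Use it as a reference if you need to do the same.

How can an nginx access log be analyzed from the shell?

Assume the log format is as follows: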

178.255.215.86 - - [04/Jul/2013:00:00:31 +0800] "GET /tag/316/PostgreSQL HTTP/1.1" 200 4779 "-" "Mozilla/5.0 (compatible; Exabot/3.0 (BiggerBetter); +http://www.exabot.com/go/robot)" "-"-
178.255.215.86 - - [04/Jul/2013:00:00:34 +0800] "GET /tag/317/edit HTTP/1.1" 303 5 "-" "Mozilla/5.0 (compatible; Exabot/3.0 (BiggerBetter); +http://www.exabot.com/go/robot)" "-"-
103.29.134.200 - - [04/Jul/2013:00:00:34 +0800] "GET /code-snippet/2022/edit HTTP/1.0" 303 0 "-" "Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/17.0 Firefox/17.0" "-"-
103.29.134.200 - - [04/Jul/2013:00:00:35 +0800] "GET /user/login?url=http%3A//outofmemory.cn/code-snippet/2022/edit HTTP/1.0" 200 4748 "-" "Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/17.0 Firefox/17.0" "-"-

The scripts below all assume the log format shown above. If your log format is different, adjust the field numbers passed to awk.
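Before adjusting field numbers for a different format, it can help to print every quote-delimited field of one log line together with its index. The following is only a sketch: the helper name `show_fields` and the temporary sample file are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: list the quote-delimited fields of the first log line with their indexes.
# show_fields is a hypothetical helper name, not part of the original scripts.
show_fields() {
    # $1: path to the log file
    head -n 1 "$1" |
        awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d (NF-%d): %s\n", i, NF - i, $i }'
}

# One sample line in the format shown above, written to a temporary file.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
103.29.134.200 - - [04/Jul/2013:00:00:34 +0800] "GET /code-snippet/2022/edit HTTP/1.0" 303 0 "-" "Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/17.0 Firefox/17.0" "-"-
EOF

show_fields "$sample"
```

For the sample format this prints nine fields: the request line is field 2 and the User-Agent is field 6, that is, `$(NF-3)`.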

1. Analyze the User-Agents in the log

Code example:

awk -F'"' '{print $(NF-3)}' access_20130704.log | sort | uniq -c | sort -nr | head -20

This command lists the 20 most frequent User-Agent strings in the log file.

2. Find the IPs that make the most requests

Code example:

awk '{print $1}' access_20130704.log | sort | uniq -c | sort -nr | head -20
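On large logs the counting can also be done inside awk with an associative array, so that only the distinct IPs have to be sorted. This is a minimal sketch, assuming the client IP is the first whitespace-separated field as in the sample format. The name `top_ips` and the sample file are illustrative.

```shell
#!/bin/bash
# Sketch: count requests per client IP in a single awk pass.
# top_ips is a hypothetical helper; $1 = log file, $2 = number of rows (default 20).
top_ips() {
    awk '{ count[$1]++ } END { for (ip in count) print count[ip], ip }' "$1" |
        sort -k1,1nr -k2,2 | head -n "${2:-20}"
}

# Three lines taken from the sample log above.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
178.255.215.86 - - [04/Jul/2013:00:00:31 +0800] "GET /tag/316/PostgreSQL HTTP/1.1" 200 4779 "-" "Mozilla/5.0 (compatible; Exabot/3.0 (BiggerBetter); +http://www.exabot.com/go/robot)" "-"-
178.255.215.86 - - [04/Jul/2013:00:00:34 +0800] "GET /tag/317/edit HTTP/1.1" 303 5 "-" "Mozilla/5.0 (compatible; Exabot/3.0 (BiggerBetter); +http://www.exabot.com/go/robot)" "-"-
103.29.134.200 - - [04/Jul/2013:00:00:34 +0800] "GET /code-snippet/2022/edit HTTP/1.0" 303 0 "-" "Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/17.0 Firefox/17.0" "-"-
EOF

top_ips "$sample" 10
```

The output format is "count IP", without the leading padding that `uniq -c` adds.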

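3. Find the most frequently requested URLs

With the sample format above, the request line (method, URL, protocol) is the second quote-delimited field.

Code example:

awk -F'"' '{print $2}' access_20130704.log | sort | uniq -c | sort -nr | head -20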
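To count by path only, the URL can be cut out of the request line and its query string dropped. The following sketch assumes the sample format, where the second quote-delimited field holds "METHOD URL PROTOCOL". The helper name `top_paths` is hypothetical, and the second sample line is made up for illustration.

```shell
#!/bin/bash
# Sketch: count requests per URL path, ignoring query strings.
# top_paths is a hypothetical helper; $1 = log file, $2 = number of rows (default 20).
top_paths() {
    awk -F'"' '{
        split($2, req, " ")         # req[1] = method, req[2] = URL, req[3] = protocol
        sub(/\?.*/, "", req[2])     # drop "?..." so variants of one page are counted together
        print req[2]
    }' "$1" | sort | uniq -c | sort -nr | head -n "${2:-20}"
}

# Sample lines in the format shown above (the second one is invented).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
103.29.134.200 - - [04/Jul/2013:00:00:35 +0800] "GET /user/login?url=http%3A//outofmemory.cn/code-snippet/2022/edit HTTP/1.0" 200 4748 "-" "Mozilla/5.0" "-"-
103.29.134.200 - - [04/Jul/2013:00:00:36 +0800] "GET /user/login?url=http%3A//outofmemory.cn/tag/316 HTTP/1.0" 200 4748 "-" "Mozilla/5.0" "-"-
178.255.215.86 - - [04/Jul/2013:00:00:31 +0800] "GET /tag/316/PostgreSQL HTTP/1.1" 200 4779 "-" "Mozilla/5.0" "-"-
EOF

top_paths "$sample" 10
```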

Part 2: analyzing nginx logs with a shell script

In the first case, nginx is the front-end load balancer in an Nginx+Keepalived cluster. The script is:

Code example:

log-nginx.sh
#!/bin/bash
# Print the busiest client IPs, minutes and referers found in an nginx access log.
if [ $# -eq 0 ]; then
    echo "Error: please specify logfile."
    exit 1
else
    LOG=$1
fi
if [ ! -f "$LOG" ]; then
    echo "Sorry, sir, I can't find this nginx log file, pls try again!"
    exit 1
fi
################################
# Top 10 client IPs (field 1)
echo "Most of the ip:"
echo "-------------------------------------------"
awk '{ print $1 }' "$LOG" | sort | uniq -c | sort -nr | head -10
echo
echo
################################
# Top 10 minutes: characters 14-18 of field 4 are HH:MM
echo "Most of the time:"
echo "--------------------------------------------"
awk '{ print $4 }' "$LOG" | cut -c 14-18 | sort | uniq -c | sort -nr | head -10
echo
echo
################################
# Field 11 is the quoted referer in the sample format; use $7 for the requested path
echo "Most of the page:"
echo "--------------------------------------------"
awk '{ print $11 }' "$LOG" | sed 's/^.*\(.cn*\)"/\1/g' | sort | uniq -c | sort -rn | head -10
echo
echo
################################
# For each of the 10 busiest minutes, list the top 10 client IPs
echo "Most of the time / Most of the ip:"
echo "--------------------------------------------"
awk '{ print $4 }' "$LOG" | cut -c 14-18 | sort -n | uniq -c | sort -nr | head -10 > timelog
for i in $(awk '{ print $2 }' timelog)
do
    num=$(grep "$i" timelog | awk '{ print $1 }')
    echo "$i $num"
    # ":$i:" matches the HH:MM part only, never the MM:SS part of a timestamp
    ip=$(grep ":$i:" "$LOG" | awk '{ print $1 }' | sort -n | uniq -c | sort -nr | head -10)
    echo "$ip"
    echo
done
rm -f timelog
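The per-minute breakdown can also be computed by comparing the HH:MM part of the timestamp directly, which avoids matching the same digits in the MM:SS part or elsewhere in the line. This is a sketch under the assumption that the timestamp is the fourth whitespace-separated field and looks like `[04/Jul/2013:00:00:31`. The helper name `ips_in_minute` is hypothetical, and the last sample line is made up.

```shell
#!/bin/bash
# Sketch: top client IPs within one given minute, matched on the timestamp field only.
# ips_in_minute is a hypothetical helper; $1 = log file, $2 = minute as HH:MM.
ips_in_minute() {
    # Characters 14-18 of field 4 ("[dd/Mon/yyyy:HH:MM:SS") are HH:MM.
    awk -v m="$2" 'substr($4, 14, 5) == m { print $1 }' "$1" |
        sort | uniq -c | sort -nr | head -10
}

sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
178.255.215.86 - - [04/Jul/2013:00:00:31 +0800] "GET /tag/316/PostgreSQL HTTP/1.1" 200 4779 "-" "-" "-"-
178.255.215.86 - - [04/Jul/2013:00:00:34 +0800] "GET /tag/317/edit HTTP/1.1" 303 5 "-" "-" "-"-
103.29.134.200 - - [04/Jul/2013:00:00:34 +0800] "GET /code-snippet/2022/edit HTTP/1.0" 303 0 "-" "-" "-"-
192.0.2.1 - - [04/Jul/2013:09:00:00 +0800] "GET / HTTP/1.1" 200 10 "-" "-" "-"-
EOF

ips_in_minute "$sample" 00:00
```

The last sample line was logged at 09:00:00, so its timestamp contains the text `00:00` but falls in a different minute. The field comparison leaves it out.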

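In the second case, nginx runs as the web server behind LVS. Here the LVS IP addresses have to be excluded, for example the public IP addresses of the LVS servers (such as 203.93.236.141 and 203.93.236.145).

The script from the first case needs only a small adjustment:

Code example: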

#!/bin/bash
# Same report as the first script, with requests from the LVS addresses removed.
if [ $# -eq 0 ]; then
    echo "Error: please specify logfile."
    exit 1
fi
if [ ! -f "$1" ]; then
    echo "Sorry, sir, I can't find this nginx log file, pls try again!"
    exit 1
fi
# Drop lines whose client IP is an LVS address; the filtered copy is written to ./LOG
grep -Ev '^203\.93\.236\.(141|145) ' "$1" > LOG
###################################################
echo "Most of the ip:"
echo "-------------------------------------------"
awk '{ print $1 }' LOG | sort | uniq -c | sort -nr | head -10
echo
echo
####################################################
echo "Most of the time:"
echo "--------------------------------------------"
awk '{ print $4 }' LOG | cut -c 14-18 | sort | uniq -c | sort -nr | head -10
echo
echo
####################################################
# Field 11 is the quoted referer in the sample format; use $7 for the requested path
echo "Most of the page:"
echo "--------------------------------------------"
awk '{ print $11 }' LOG | sed 's/^.*\(.cn*\)"/\1/g' | sort | uniq -c | sort -rn | head -10
echo
echo
####################################################
echo "Most of the time / Most of the ip:"
echo "--------------------------------------------"
awk '{ print $4 }' LOG | cut -c 14-18 | sort -n | uniq -c | sort -nr | head -10 > timelog
for i in $(awk '{ print $2 }' timelog)
do
    num=$(grep "$i" timelog | awk '{ print $1 }')
    echo "$i $num"
    # ":$i:" matches the HH:MM part only, never the MM:SS part of a timestamp
    ip=$(grep ":$i:" LOG | awk '{ print $1 }' | sort -n | uniq -c | sort -nr | head -10)
    echo "$ip"
    echo
done
rm -f timelog LOG
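When excluding addresses, note that a pattern of the form `203.93.236.141|145` also matches the text "145" anywhere in a line, for example in a byte count or a URL. Comparing the first field exactly avoids that. Below is a minimal sketch; the helper name `drop_ips` and the sample lines are assumptions for illustration.

```shell
#!/bin/bash
# Sketch: remove log lines whose client IP (field 1) is in a given list.
# drop_ips is a hypothetical helper; $1 = log file, remaining arguments = IPs to exclude.
drop_ips() {
    local log=$1
    shift
    awk -v list="$*" '
        BEGIN { n = split(list, a, " "); for (i = 1; i <= n; i++) skip[a[i]] = 1 }
        !($1 in skip)
    ' "$log"
}

# Made-up sample: two requests from LVS addresses, one from a real client
# whose line happens to contain "145" in the URL and in the byte count.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
203.93.236.141 - - [31/Mar/2011:15:31:02 +0800] "GET / HTTP/1.1" 200 100 "-" "-" "-"-
203.93.236.145 - - [31/Mar/2011:15:31:03 +0800] "GET / HTTP/1.1" 200 100 "-" "-" "-"-
117.34.91.54 - - [31/Mar/2011:15:31:05 +0800] "GET /item/145 HTTP/1.1" 200 1450 "-" "-" "-"-
EOF

drop_ips "$sample" 203.93.236.141 203.93.236.145
```

Only the line from 117.34.91.54 remains, even though it contains "145" twice.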


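This script can be used to analyze a file named www_tomcat_20110331.log:

[root@localhost 03]# sh counter_nginx.sh www_tomcat_20110331.log

Pay attention to the first two sections of the output: the IPs that visit the site most often, and the times at which the most requests arrive. For example: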

Most of the ip:
-------------------------------------------
   5440 117.34.91.54
      9 119.97.226.226
      4 210.164.156.66
      4 173.19.0.240
      4 109.230.251.35
      2 96.247.52.15
      2 85.91.140.124
      2 74.168.71.253
      2 71.98.41.114
      2 70.61.253.194

Most of the time:
--------------------------------------------
     12 15:31
     11 09:45
     10 23:55
     10 21:45
     10 21:37
     10 20:29
     10 19:54
     10 19:44
     10 19:32
     10 19:13

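If your log analysis needs are modest, Awk and Sed are enough to analyze Linux logs directly (Perl also works if you are fluent in it). For detailed analysis you can use Awstats, which is particularly well suited to web servers and mail servers.
If you have special requirements for logs, you can also set up a dedicated log server to collect the logs from your Linux servers.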

Page link: http://www.jb200.com/article/24256.html
