python进行文件比较的脚本

发布时间:2021-01-14编辑:脚本学堂
前言
经常爱收集一些电子书,而且为了不让自己这些电子书丢失,我不仅在公司的电脑中有一份,家里的移动硬盘中也保留了一份。
但电子书通常都是慢慢收集来的,今天一书,

前言
经常爱收集一些电子书,而且为了不让自己这些电子书丢失,我不仅在公司的电脑中有一份,家里的移动硬盘中也保留了一份。
但电子书通常都是慢慢收集来的,今天一书,明天一本。时间久了,公司电脑里的保存的电子书跟移动硬盘上的电子书不一致了,公司里的电子书在慢慢增长,而且分布在不同的目录之下,很难知道哪些是新增的,哪些是原来就有的。为了检查在两个目录中,有哪些文件是不一致的,所以有了以下的脚本
 

复制代码 代码如下:

#Platform: WinXp + python2.5.2
#File compare
#Author Jerry Chou
#Date Aug-9-2008
#Revison: 0.1
"""
    This program compare the source directory file with the destination directory file
    and print the different file to screen.
   
    ----------------Module Variable Introduction-----------------
    [SrcDic], Source dicrectory, default is 'D:Jeffreyebook'
    [DestDir], Destination directory, default is 'I:ebook'
    [ExtList], which file extensions shoud be checked.
        default is ['.doc','.chm','.pdf','.zip','.7z']
    -------------------------------------------------------------
"""
import os
SrcDir = r'D:Jeffreyebook'
DestDir = r'I:ebook'
ExtList = ['.doc','.chm','.pdf','.zip','.7z']
_outlist=[]
def _getfilelist(dir):
    filelist = [os.path.join(dir,fname) for fname in os.listdir(dir)]
    basefilelist = [file for file in filelist if os.path.isfile(file) and str.lower(os.path.splitext(file)[1]) in ExtList]
    global _outlist
    _outlist = _outlist + basefilelist
    dirlist = [file for file in filelist if os.path.isdir(file)]
    for file_dir in dirlist:
        _getfilelist(file_dir)
def main():
    _getfilelist(SrcDir)
    global _outlist
    srclist = _outlist
    _outlist = []
    #
    _getfilelist(DestDir)
    #global _outlist
    destlist = _outlist
    _outlist = []
    #base file
    srcfilelist = [os.path.split(file)[1] for file in srclist]
    destfilelist = [os.path.split(file)[1] for file in destlist]
    #
    difflist = [file for file in srcfilelist if file not in destfilelist] +
               [file for file in destfilelist if file not in srcfilelist]
    #query dict
    dictSrc = dict(zip(srcfilelist,srclist))
    dictDest = dict(zip(destfilelist,destlist))
    for item in difflist:
        if dictSrc.has_key(item):
            print r'[Source]' + dictSrc[item]
        elif dictDest.has_key(item):
            print r'[Destination]' + dictDest[item]
        else:
            print '-'*20 + 'Key Not Found' + '_'*20

if __name__ == '__main__':
    main()

用法示例:
>>> import comparefile as cfile
>>> print cfile.__doc__
    This program compare the source directory file with the destination directory file
    and print the different file to screen.
  
    ----------------Module Variable Introduction-----------------
    [SrcDic], Source dicrectory, default is 'D:Jeffreyebook'
    [DestDir], Destination directory, default is 'I:ebook'
    [ExtList], which file extensions shoud be checked.
        default is ['.doc','.chm','.pdf','.zip','.7z']
    -------------------------------------------------------------

>>> cfile.SrcDir = r'D:BackupELMELM_Project0219'
>>> cfile.DestDir =r'D:BackupELMELM_Project'
>>> cfile.ExtList = ['.config','.cs','.vb']
>>> cfile.main()
[Source]D:BackupELMELM_Project0219ELM_UploaderFrmUploader.Designer.vb
[Source]D:BackupELMELM_Project0219ELM_UploaderFrmUploader.vb
[Destination]D:BackupELMELM_ProjectELMSProjectInstaller.cs
[Destination]D:BackupELMELM_ProjectELMSProjectInstaller.Designer.cs
[Destination]D:BackupELMELM_ProjectELMSServerCommon.vb
[Destination]D:BackupELMELM_ProjectELMSServerProgram.vb
[Destination]D:BackupELMELM_ProjectELMSServerbinDebugELMSServer.exe.config
[Destination]D:BackupELMELM_ProjectELMSServerbinDebugELMSServer.vshost.exe.config
[Destination]D:BackupELMELM_ProjectELM_FERTELM_COMMON.vb
[Destination]D:BackupELMELM_ProjectELM_FERTELM_FERT_1.vb
[Destination]D:BackupELMELM_ProjectELM_FERTELM_ROH.vb
[Destination]D:BackupELMELM_ProjectELM_FERTbinDebugELM.dll.config
 
上面的红色部分,给出了两个目录([D:BackupELMELM_Project0219]和[D:BackupELMELM_Project])下面的vb及cs源文件存在的异同。