Recipe 23.5. Deleting Files That Match a Regular Expression

Credit: Matthew Palmer

Problem

You have a directory full of files and you need to remove some of them. The patterns you want to match are too complex to represent as file globs, but you can represent them as a regular expression.

Solution

The Dir.entries method gives you an array of all files in a directory, and you can iterate over this array with #each. A method to delete the files matching a regular expression might look like this:

	def delete_matching_regexp(dir, regex)
	  Dir.entries(dir).each do |name|
	    path = File.join(dir, name)
	    if name =~ regex
	      ftype = File.directory?(path) ? Dir : File
	      begin
	        ftype.delete(path)
	      rescue SystemCallError => e
	        $stderr.puts e.message
	      end
	    end
	  end
	end

Here's an example. Let's create a bunch of files and directories beneath a temporary directory:


	require 'fileutils'
	tmp_dir = 'tmp_buncha_files'
	files = ['A', 'A.txt', 'A.html', 'p.html', 'A.html.bak']
	directories = ['text.dir', 'Directory.for.html']

	Dir.mkdir(tmp_dir) unless File.directory? tmp_dir
	files.each { |f| FileUtils.touch(File.join(tmp_dir,f)) }
	directories.each { |d| Dir.mkdir(File.join(tmp_dir, d)) }

Now let's delete some of those files and directories. We'll delete a file or directory if its name starts with a capital letter, and if its extension (the string after its last period) is at least four characters long. This corresponds to the regular expression /^[A-Z].*\.[^.]{4,}$/:

	Dir.entries(tmp_dir)
	# => [".", "..", "A", "A.txt", "A.html", "p.html", "A.html.bak",
	#     "text.dir", "Directory.for.html"]

	delete_matching_regexp(tmp_dir, /^[A-Z].*\.[^.]{4,}$/)

	Dir.entries(tmp_dir)
	# => [".", "..", "A", "A.txt", "p.html", "A.html.bak", "text.dir"]
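Before pointing a regular expression at real files, it can help to check which names it would select. The following is a minimal sketch that touches nothing on disk: it runs the same pattern over the example names with Array#grep. The variable names are illustrative only.

```ruby
# Names taken from the example above; no files are touched here.
names = ['A', 'A.txt', 'A.html', 'p.html', 'A.html.bak',
         'text.dir', 'Directory.for.html']
pattern = /^[A-Z].*\.[^.]{4,}$/

# Array#grep keeps the elements for which pattern === element is true.
matching = names.grep(pattern)
p matching
# => ["A.html", "Directory.for.html"]
```

Those are exactly the two entries missing from the second Dir.entries listing.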

Discussion

Dir.entries returns an array containing the name of every file and subdirectory in a directory and, like most good things in Ruby, Array#each takes a code block. Our particular code block uses the regular expression match operator =~ to match every name (files and subdirectories alike) against the regular expression, and File.delete or Dir.delete to remove the offenders.

File.delete won't delete directories; for that, you need Dir.delete. So delete_matching_regexp uses the File.directory? predicate to check whether a path is a directory. We also rescue SystemCallError and print its message, to report cases when we don't have permission to delete a file, or a directory isn't empty.
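To see the error reporting at work, here is a small sketch that tries to delete a non-empty directory inside a throwaway directory created with Dir.mktmpdir. The directory and file names are made up for the illustration, and the exact error class and message depend on the platform (on most systems it is Errno::ENOTEMPTY, a subclass of SystemCallError).

```ruby
require 'tmpdir'
require 'fileutils'

caught = nil
Dir.mktmpdir do |tmp|
  # Build a directory that contains one file, so it is not empty.
  full = File.join(tmp, 'Full.directory')
  Dir.mkdir(full)
  FileUtils.touch(File.join(full, 'leftover'))

  begin
    Dir.delete(full)            # fails: the directory is not empty
  rescue SystemCallError => e
    caught = e
    $stderr.puts e.message      # same reporting style as the recipe
  end
end
```

The directory survives the failed delete, and the script carries on instead of crashing.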

Of course, once we've got this basic "find matching files" thing going, there's no reason why we have to limit ourselves to deleting the matched files. We can move them to somewhere new:

	def move_matching_regexp(src, dest, regex)
	  Dir.entries(src).each do |name|
	    File.rename(File.join(src, name), File.join(dest, name)) if name =~ regex
	  end
	end

Or we can append a suffix to them:

	def append_matching_regexp(dir, suffix, regex)
	  Dir.entries(dir).each do |name|
	    if name =~ regex
	      File.rename(File.join(dir, name), File.join(dir, name+suffix))
	    end
	  end
	end

Note the common code in both of those implementations. We can factor it out into yet another method that takes a block:

	def each_matching_regexp(dir, regex)
	  Dir.entries(dir).each { |name| yield name if name =~ regex }
	end

We no longer have to spell out how to match the files we want in every method; we just need to tell each_matching_regexp what to do with them:

	def append_matching_regexp(dir, suffix, regex)
	  each_matching_regexp(dir, regex) do |name|
	    File.rename(File.join(dir, name), File.join(dir, name+suffix))
	  end
	end
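The move operation can be rebuilt on the same helper. The sketch below repeats each_matching_regexp so that it runs on its own, and tries it out in a temporary directory; the src and dest directory names and the sample files are assumptions made for the demonstration.

```ruby
require 'tmpdir'
require 'fileutils'

def each_matching_regexp(dir, regex)
  Dir.entries(dir).each { |name| yield name if name =~ regex }
end

# Move every matching entry of src into dest.
def move_matching_regexp(src, dest, regex)
  each_matching_regexp(src, regex) do |name|
    File.rename(File.join(src, name), File.join(dest, name))
  end
end

moved = nil
remaining = nil
Dir.mktmpdir do |tmp|
  src  = File.join(tmp, 'src')
  dest = File.join(tmp, 'dest')
  Dir.mkdir(src)
  Dir.mkdir(dest)
  %w[A.html p.html A.txt].each { |f| FileUtils.touch(File.join(src, f)) }

  move_matching_regexp(src, dest, /^[A-Z].*\.[^.]{4,}$/)

  # Drop the "." and ".." entries before comparing.
  moved     = (Dir.entries(dest) - %w[. ..]).sort
  remaining = (Dir.entries(src)  - %w[. ..]).sort
end

p moved      # => ["A.html"]
p remaining  # => ["A.txt", "p.html"]
```

File.rename only works when both directories are on the same filesystem; FileUtils.mv is the usual choice when that can't be guaranteed.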

This is all well and good, but these methods only manipulate files directly beneath the directory you specify. "I've got a whole tree full of files I want to get rid of!" I hear you cry. For that, you should use Find.find (from the standard find library) instead of Dir.entries. Apart from that change, the implementation is nearly identical to delete_matching_regexp:

	require 'find'

	def delete_matching_regexp_recursively(dir, regex)
	  Find.find(dir) do |path|
	    # Match against the last path component only, as before.
	    name = File.basename(path)
	    if name =~ regex
	      ftype = File.directory?(path) ? Dir : File
	      begin
	        ftype.delete(path)
	      rescue SystemCallError => e
	        $stderr.puts e.message
	      end
	    end
	  end
	end

If you want to recursively delete the contents of directories that match the regular expression (even if the contents themselves don't match), use FileUtils.rm_rf instead of Dir.delete.
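One way that variation might look is sketched below. The method name delete_matching_trees is hypothetical, and skipping the starting directory and calling Find.prune after each removal are choices made for this sketch rather than part of the recipe. Find.prune tells Find.find not to descend into a directory that has just been removed.

```ruby
require 'find'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: remove every matching file or directory tree
# beneath dir, whether or not the directory's contents match.
def delete_matching_trees(dir, regex)
  Find.find(dir) do |path|
    next if path == dir                        # never remove the starting point
    next unless File.basename(path) =~ regex
    FileUtils.rm_rf(path)                      # removes files and whole trees
    Find.prune                                 # don't walk into what we removed
  end
end

remaining = nil
Dir.mktmpdir do |tmp|
  FileUtils.mkdir_p(File.join(tmp, 'Build.output'))
  FileUtils.mkdir_p(File.join(tmp, 'src', 'Build.output'))
  FileUtils.touch(File.join(tmp, 'Build.output', 'notes'))
  FileUtils.touch(File.join(tmp, 'src', 'Build.output', 'x'))
  FileUtils.touch(File.join(tmp, 'src', 'main.rb'))

  delete_matching_trees(tmp, /^Build\.output$/)

  # List what is left, relative to the temporary directory.
  remaining = Find.find(tmp).reject { |p| p == tmp }
                  .map { |p| p[(tmp.size + 1)..-1] }.sort
end

p remaining  # => ["src", "src/main.rb"]
```

Note that FileUtils.rm_rf suppresses errors, so unlike the methods above, this sketch reports nothing when a removal fails.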

See Also

  • Dir.delete will only remove an empty directory; see Recipe 6.18 for information on how to remove one that's not empty

  • Recipe 6.20, "Finding the Files You Want"

