Rename files using regular expressions

General

This tool renames files using a regular expression, which lets you carry out complex renaming operations quickly and simply. The latest version is dated 2008-07-27. It does the same job as the original version, but also shows some diagnostics. If you don't know what regular expressions are, learn about them first; you need this knowledge to use the command. For the individual pattern codes, scroll down to the last chapter of this document.

Installation

Download regexprename.zip, unzip it and put the .vbs file somewhere inside your search path, for example in the Windows folder. The .htm file is a short regular expression cheat sheet. Alternatively you can put the .vbs program file into the same folder where you want to rename files. The program will not rename itself, even if the pattern matches.

Use

This command requires the Windows Script Host, either WScript or CScript. Windows XP and later have it installed by default. The command requires exactly two parameters. If a parameter contains one or more spaces, it has to be enclosed in quotes (" "). Parameters:
  1. Pattern. The names of all files in the current folder are searched for the pattern.
  2. Replacement string. If a pattern match is found, the file is renamed: the matched part of its name is replaced with the replacement string.
For information on the regular expression pattern codes double-click on the documentation file: regexprename.htm
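
The source of the .vbs file is not reproduced in this document, so the following Python sketch only illustrates the behaviour described above; it is not the program itself. It works on a plain list of names and touches no files. The function name plan_renames and the skip parameter are hypothetical, and Python's re module stands in for the VBScript RegExp object.

```python
import re

def plan_renames(names, pattern, replacement, skip=("regexprename.vbs",)):
    """Return (old, new) pairs for names whose first pattern match changes them."""
    regex = re.compile(pattern)
    plan = []
    for name in names:
        if name in skip:
            continue  # the tool does not rename itself
        if regex.search(name) is None:
            continue  # no match: the file keeps its name
        # Replace only the first match; the rest of the name stays as it is.
        new_name = regex.sub(replacement, name, count=1)
        if new_name != name:
            plan.append((name, new_name))
    return plan

names = ["P0101021.JPG", "notes.txt", "regexprename.vbs"]
for old, new in plan_renames(names, r"P", "Xmas 2008 "):
    print(f"{old} -> {new}")
```

Returning a list of planned renames instead of renaming directly is a choice made for this sketch, so that a pattern can be tried out safely before any file is touched.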

Details and hints

Putting a group into the replacement string

Groups from the pattern, i.e. parenthesized expressions, are copied into the replacement string with $1, $2, etc., rather than with \1, \2, etc., as in sed and similar Linux/Unix tools.
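
If you want to try out a replacement string in another regular expression engine first, the group codes may need translating. As a minimal sketch, assuming Python's re module as the test bed, the hypothetical helper dollar_to_python below converts $1, $2, etc. into Python's \g<1>, \g<2>, etc. It is an illustration only and not part of regexprename.

```python
import re

def dollar_to_python(replacement):
    """Translate $1, $2, ... into Python's \\g<1>, \\g<2>, ... form."""
    # Double any literal backslashes so Python does not read them as escapes.
    escaped = replacement.replace("\\", "\\\\")
    return re.sub(r"\$(\d+)", r"\\g<\1>", escaped)

template = dollar_to_python("$2-$1")
print(template)
# Swap two words by referring to the two groups in reverse order.
print(re.sub(r"(\w+) (\w+)", template, "hello world", count=1))
```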

Shorter command

If you need the program often and want to make the commands shorter, you can rename it, for example from regexprename.vbs to: rr.vbs

Case sensitivity

The pattern is case-sensitive. The program can be changed, though: the RegExp object has a property, "IgnoreCase", that makes matching case-insensitive. To enable it, add the following line to the program just after the regEx object is created:
regEx.IgnoreCase = True
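
The effect of this setting can be previewed outside the tool. The following sketch uses Python's re.IGNORECASE flag as a stand-in for IgnoreCase; the function name rename_first and the sample file name are made up for the illustration.

```python
import re

def rename_first(name, pattern, replacement, ignore_case=False):
    """Replace the first match of pattern in name, optionally ignoring case."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.sub(pattern, replacement, name, count=1, flags=flags)

name = "p0101021.jpg"
# Case-sensitive: the upper case P does not occur, so the name is unchanged.
print(rename_first(name, r"P", "Xmas 2008 "))
# Case-insensitive: the leading lower case p now matches.
print(rename_first(name, r"P", "Xmas 2008 ", ignore_case=True))
```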

Global replacement of multiple pattern occurrences

By default, if the pattern occurs more than once in a file name, only the first occurrence is replaced. The program can be changed, though: the RegExp object has a property, "Global", that makes it replace every occurrence. To enable it, add the following line to the program just after the regEx object is created:
regEx.Global = True
This is normally not necessary, as you can put more than one occurrence of a string in the pattern.
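
The difference between the two modes can be sketched in Python, where the count argument of re.sub plays the role of the Global property (count=1 for the first occurrence only, the default for every occurrence). The file name and the replacement letter Q are arbitrary examples.

```python
import re

name = "P0101021.JPG"

# Like Global = False: only the first P is replaced.
first_only = re.sub(r"P", "Q", name, count=1)

# Like Global = True: every P is replaced, including the one in JPG.
every = re.sub(r"P", "Q", name)

print(first_only)
print(every)
```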

Please extend the program and distribute it here

Of course you could also add more parameters to control these properties from the command line. If you do this, please upload the extended program here for the benefit of others. Create a new forum topic to be able to attach a file.

Simple examples

Picture files from a digital camera

Picture files coming from a digital camera should be renamed. The original file names look like this: P0101021.JPG. Enter the following command as one line. You can copy it from here and paste it into the command line window, using the right mouse button:
regexprename P "Xmas 2008 "
The resulting file is: Xmas 2008 0101021.JPG. Note the intended space after 2008. Also note that the renaming affects only the first P, not the second one in JPG.

Change to lower case and remove spaces

Taking the preceding result file as an example, we want to have a lower case X, spaces replaced with underscores, and JPG changed to the lower case jpg. Enter the following command as one line. You can copy it from here and paste it into the command line window, using the right mouse button:
regexprename "Xmas 2008 (.*\.)JPG$" xmas_2008_$1jpg
The resulting file is: xmas_2008_0101021.jpg. The parenthesized group captures the long number and the period and is put into the result with the code $1. Note that the literal period in the pattern, the first parameter, has to be escaped with a preceding backslash: \. This is not necessary in the replacement string, the second parameter. An alternative, more flexible command that yields exactly the same result here would be:
regexprename X(.*?)\s+(.*?)\s+(.*\.)JPG$ x$1_$2_$3jpg
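You could drop the $ from the pattern string if you were sure that all files ended in: .JPG The JPG itself has to stay in the pattern, because any part of the file name that the pattern does not match is left unchanged.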
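
To check that the two commands in this section really give the same result, they can be replayed in Python. This is only a sketch: Python's re module stands in for the VBScript RegExp object, and the group codes $1, $2, $3 are written as \g<1>, \g<2>, \g<3>, which is Python's notation.

```python
import re

name = "Xmas 2008 0101021.JPG"

# First command: fixed text plus one group for the number and the period.
fixed = re.sub(r"Xmas 2008 (.*\.)JPG$", r"xmas_2008_\g<1>jpg", name, count=1)

# Second command: three groups separated by runs of white space.
flexible = re.sub(r"X(.*?)\s+(.*?)\s+(.*\.)JPG$",
                  r"x\g<1>_\g<2>_\g<3>jpg", name, count=1)

print(fixed)
print(flexible)
```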

Complex example and tutorial

Intention

We want to rename 128 files such that they are named: x000.txt, x001.txt, x002.txt, ..., x127.txt

Test preparation

  1. Use or create an empty folder. Create or copy 128 test files into it. A simple method to achieve this is:
    1. Create one file named: y.txt
    2. Press: Ctrl + a
    3. Press: Ctrl + c
    4. Press: Ctrl + v
    5. Repeat the preceding three steps another 6 times. You can keep holding down the Ctrl key throughout this procedure.
    You should now have 128 files with inconsistent names.
  2. We will use a little trick to rename and renumber all of them.
    1. Select all files. Press: Ctrl + a
    2. Press [F2] to edit the file name.
    3. Rename the file(s) to: x.txt
    4. Press the return key.
    The files should now have names like x.txt, x (1).txt, x (2).txt, etc.
  3. Note that the first file, which you renamed, has no number. Select only this single file x.txt and rename it to x (0).txt to have consistent file names.
Open a command line window and navigate to the test folder. The renaming procedure will be done in that window. Enter the dir command to make absolutely sure that the test folder is the current folder and you're not renaming any important files.
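
As an alternative to the Explorer procedure, the test files could also be generated by a small script. The following Python sketch is an assumption of this write-up, not part of the tool: the function name make_test_files is hypothetical, and the files are created in a temporary folder that is deleted again at the end, so nothing important can be touched.

```python
import tempfile
from pathlib import Path

def make_test_files(folder, count=128):
    """Create empty files named 'x (0).txt' ... 'x (count-1).txt' in folder."""
    folder = Path(folder)
    for i in range(count):
        (folder / f"x ({i}).txt").touch()
    return sorted(p.name for p in folder.iterdir())

with tempfile.TemporaryDirectory() as tmp:
    names = make_test_files(tmp)
    print(len(names), "files created, for example:", names[0])
```

To keep the files for a real test run, pass the path of an empty folder of your own instead of the temporary one.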

Remove the space and the parentheses and add leading zeros

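Note that each of the two parameters has to be enclosed in quotes (" ") if it contains one or more spaces. Since we don't have any spaces in this case, we don't need quotes. Be aware, though, that the Windows command interpreter treats an unquoted ^ as its own escape character and removes it before the program sees the pattern. The command below still works without it, because the leading .* is greedy; enclose the pattern in quotes if you need the ^ to be passed through. Enter the following command as one line. You can copy it from here and paste it into the command line window, using the right mouse button: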
regexprename ^.*\((.*)\)(\..*) x00$1$2
The files should now be named like this: x000.txt, x001.txt, x002.txt, ..., x00127.txt. Note that we didn't need the end-of-string marker $ at the end of the pattern, because the .* pattern is "greedy" and covers the end anyway. However, the pattern would work just as well like this:
^.*\((.*)\)(\..*)$
i.e. with the end-of-string marker $.
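
The effect of this first step on names with one, two and three digits can be previewed with the following Python sketch. Python's re module is used as a stand-in for the VBScript RegExp object, the function name step_one is made up, and $1 and $2 are written as \g<1> and \g<2> in Python's notation.

```python
import re

def step_one(name):
    """Drop the space and parentheses and put 'x00' in front of the number."""
    return re.sub(r"^.*\((.*)\)(\..*)", r"x00\g<1>\g<2>", name, count=1)

for name in ["x (0).txt", "x (12).txt", "x (127).txt"]:
    print(name, "->", step_one(name))
```

The names now carry between three and five digits, which is why a second step is needed.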

Remove the superfluous leading zeros, leaving exactly three digits

Enter the following command as one line. You can copy it from here and paste it into the command line window, using the right mouse button:
regexprename x.*(\d{3}\..*) x$1
The files should now be named like this: x000.txt, x001.txt, x002.txt, ..., x127.txt. This is what we wanted to achieve. Note that we needed neither the beginning-of-string marker ^, because the x is unambiguous, nor the end-of-string marker $, because the .* expression is "greedy". However, the pattern would work just as well with beginning-of-string and end-of-string markers:
^x.*(\d{3}\..*)$
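
This second step can be previewed in the same way. In the Python sketch below, the function name step_two is hypothetical, $1 is written as \g<1>, and the input names are the ones the first step is expected to produce. The greedy .* gives back just enough characters for the group to find three digits directly in front of the period.

```python
import re

def step_two(name):
    """Keep only the last three digits in front of the extension."""
    return re.sub(r"x.*(\d{3}\..*)", r"x\g<1>", name, count=1)

for name in ["x000.txt", "x0012.txt", "x00127.txt"]:
    print(name, "->", step_two(name))

# Run the step over all 128 intermediate names and collect the results.
results = [step_two(f"x00{i}.txt") for i in range(128)]
print(results[0], results[-1], len(set(results)))
```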

Visual Basic Script Regular Expressions

Character     Description
\             Marks the next character as either a special character or a literal. For example, "n" matches the character "n". "\n" matches a newline character. The sequence "\\" matches "\" and "\(" matches "(".
^             Matches the beginning of input.
$             Matches the end of input.
*             Matches the preceding character zero or more times. For example, "zo*" matches either "z" or "zoo".
+             Matches the preceding character one or more times. For example, "zo+" matches "zoo" but not "z".
?             Matches the preceding character zero or one time. For example, "a?ve?" matches the "ve" in "never".
.             Matches any single character except a newline character.
(pattern)     Matches pattern and remembers the match. The matched substring can be retrieved from the resulting Matches collection, using Item [0]...[n]. To match parentheses characters ( ), use "\(" or "\)".
x|y           Matches either x or y. For example, "z|wood" matches "z" or "wood". "(z|w)oo" matches "zoo" or "woo".
{n}           n is a nonnegative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob", but matches the first two o's in "foooood".
{n,}          n is a nonnegative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob" and matches all the o's in "foooood". "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*".
{n,m}         m and n are nonnegative integers. Matches at least n and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood". "o{0,1}" is equivalent to "o?".
[xyz]         A character set. Matches any one of the enclosed characters. For example, "[abc]" matches the "a" in "plain".
[^xyz]        A negative character set. Matches any character not enclosed. For example, "[^abc]" matches the "p" in "plain".
[a-z]         A range of characters. Matches any character in the specified range. For example, "[a-z]" matches any lowercase alphabetic character in the range "a" through "z".
[^m-z]        A negative range of characters. Matches any character not in the specified range. For example, "[^m-z]" matches any character not in the range "m" through "z".
\b            Matches a word boundary, that is, the position between a word and a space. For example, "er\b" matches the "er" in "never" but not the "er" in "verb".
\B            Matches a non-word boundary. "ea*r\B" matches the "ear" in "never early".
\d            Matches a digit character. Equivalent to [0-9].
\D            Matches a non-digit character. Equivalent to [^0-9].
\f            Matches a form-feed character.
\n            Matches a newline character.
\r            Matches a carriage return character.
\s            Matches any white space including space, tab, form-feed, etc. Equivalent to "[ \f\n\r\t\v]".
\S            Matches any non-white-space character. Equivalent to "[^ \f\n\r\t\v]".
\t            Matches a tab character.
\v            Matches a vertical tab character.
\w            Matches any word character including underscore. Equivalent to "[A-Za-z0-9_]".
\W            Matches any non-word character. Equivalent to "[^A-Za-z0-9_]".
\num          Matches num, where num is a positive integer. A reference back to remembered matches. For example, "(.)\1" matches two consecutive identical characters.
\n (octal n)  Matches n, where n is an octal escape value. Octal escape values must be 1, 2, or 3 digits long. For example, "\11" and "\011" both match a tab character. "\0011" is the equivalent of "\001" & "1". Octal escape values must not exceed 256. If they do, only the first two digits comprise the expression. Allows ASCII codes to be used in regular expressions.
\xn           Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" & "1". Allows ASCII codes to be used in regular expressions.
$n            Inserts the nth group into the replacement string (\n in sed and similar Unix/Linux tools). Use only in the replacement string, not in the pattern.
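
A few of the examples from this table can be tried out with the short Python sketch below. Python's re module is only a stand-in here: it agrees with the VBScript engine on these particular codes, but the two engines are not identical in every detail. The helper name first_match and the sample word "letter" are made up for the illustration.

```python
import re

def first_match(pattern, text):
    """Return the first substring of text that matches pattern, or None."""
    match = re.search(pattern, text)
    return match.group(0) if match else None

examples = [
    (r"zo*", "z"),            # zero or more o's
    (r"zo+", "zoo"),          # one or more o's
    (r"er\b", "never"),       # er at a word boundary
    (r"er\b", "verb"),        # er inside a word: no match
    (r"[^abc]", "plain"),     # first character that is not a, b or c
    (r"o{1,3}", "fooooood"),  # at most three o's
    (r"(.)\1", "letter"),     # two consecutive identical characters
    (r"\x41", "A"),           # hexadecimal escape for A
]
for pattern, text in examples:
    print(pattern, "in", repr(text), "->", first_match(pattern, text))
```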