July 07, 2005


Regular expression snippet for datetime format

Because of line wrapping, you may miss the embedded spaces in this expression. To make them explicit: there's a space after the date portion (which ends with "(?:0[5-9]|[1-2][0-9])"), and there's a space followed by a question mark right before the "AM/PM" stuff.

Here you go, this is a long one...
/^(?:0?[1-9]|1[0-2])[-\/](?:0?[1-9]|[1-2][0-9]|3[0-1])[-\/](?:20)?(?:0[5-9]|[1-2][0-9]) (?:(?:(?:0|0?[1-9]|1[0-1]):[0-5][0-9]:[0-5][0-9] ?(?:[AP]M|[ap]m))|(?:(?:0|0?[1-9]|1[0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]))$/

If you're wondering what the "?:" is all about, it marks a non-capturing group: it tells the regex engine to group the expression without keeping track of a backreference for it (which saves a little work). So "(?:expression)" means that whatever the group matches won't go into any \1 or $1 reference, which is what you want when you don't need the captured text.
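
To make the difference concrete, here is a small sketch in Python rather than Perl (the "(?:...)" syntax is the same in both). It only uses the month part of the pattern, and the variable names are just for illustration:

```python
import re

# Capturing group: the matched month is stored as group 1.
capturing = re.compile(r"^(0?[1-9]|1[0-2])-")
# Non-capturing group: matches the same text, but nothing is stored.
non_capturing = re.compile(r"^(?:0?[1-9]|1[0-2])-")

m1 = capturing.match("07-07-05")
m2 = non_capturing.match("07-07-05")

print(m1.groups())  # ('07',)
print(m2.groups())  # ()
```

Both patterns match the same text; only the first one remembers the month.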

In Perl for example you'd use this expression to validate a variable holding a datetime value (making sure the variable is left- and right-trimmed of whitespace; or you can modify the regular expression to account for possible left- or right-whitespace), like this:
if ($dt_str !~ m/^{mess}$/)

which means you're checking to see if the "$dt_str" variable value does not match the "{mess}" pattern (where {mess} is the body of the long regular expression above, without its leading "/^" and trailing "$/").
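
If you'd rather see the whole check end to end, here is a rough Python version. It's a sketch, not the one true way to do it: the names DATETIME_RE and is_valid_datetime are made up for this example, and it assumes the two time formats (AM/PM and military) are alternatives joined with "|", as described in this post.

```python
import re

# Python rendering of the long pattern, split across lines for readability.
# Assumption: the AM/PM form and the military form are alternatives ("|").
DATETIME_RE = re.compile(
    r"^(?:0?[1-9]|1[0-2])[-/](?:0?[1-9]|[1-2][0-9]|3[0-1])[-/]"  # mm and dd
    r"(?:20)?(?:0[5-9]|[1-2][0-9]) "                             # yy or 20yy, then a space
    r"(?:"
    r"(?:0|0?[1-9]|1[0-1]):[0-5][0-9]:[0-5][0-9] ?(?:[AP]M|[ap]m)"  # AM/PM time
    r"|"
    r"(?:0|0?[1-9]|1[0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]"           # military time
    r")$"
)


def is_valid_datetime(dt_str: str) -> bool:
    """Return True if dt_str (after trimming whitespace) matches the pattern."""
    return DATETIME_RE.match(dt_str.strip()) is not None


print(is_valid_datetime("07/07/2005 10:15:30 AM"))  # True
print(is_valid_datetime("7-7-05 23:59:59"))         # True
print(is_valid_datetime("13/01/2005 10:00:00"))     # False (no month 13)
print(is_valid_datetime("07/07/2004 10:00:00"))     # False (year before 2005)
```

One thing this makes visible: the AM/PM branch as written takes hours 0 through 11, so a string like "07/07/2005 12:30:00 PM" is rejected. If you need 12 o'clock in AM/PM form, changing "1[0-1]" to "1[0-2]" in that branch should do it.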

Note that the year part as I've written it here validates years from 2005 to 2029 (and the "20" is optional), which fits what I need it to do, so you may need to change the year range for your own needs. As written, it also allows mixed delimiters in a date, like "mm-dd/yy" or "mm/dd-yy". Forcing consistency in the delimiters would have made the expression just so much longer, and I didn't feel like that was very important, because I can still get the relevant data out either way. In Perl, you could force consistency on the delimiters like this:
$dt_str =~ s/^(\d?\d)[-\/](\d?\d)[-\/]((?:\d\d)?\d\d)/$1-$2-$3/;

which forces the date delimiters to be dashes. Note also that the expression here allows for either AM/PM format or for military time (00:00:00 to 23:59:59), and it allows for upper- or lowercase AM/PM.
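
For comparison, the same delimiter cleanup might look like this in Python. Again, this is only a sketch, and the function name normalize_date_delimiters is mine, not something from a library:

```python
import re


def normalize_date_delimiters(dt_str: str) -> str:
    """Rewrite the leading mm/dd/yy part so both delimiters are dashes."""
    return re.sub(
        r"^(\d?\d)[-/](\d?\d)[-/]((?:\d\d)?\d\d)",  # month, day, 2- or 4-digit year
        r"\1-\2-\3",                                # same parts, dashes between them
        dt_str,
    )


print(normalize_date_delimiters("07/07-2005 10:15:30 AM"))  # 07-07-2005 10:15:30 AM
print(normalize_date_delimiters("7/7/05 23:59:59"))         # 7-7-05 23:59:59
```

Strings that don't start with something date-shaped come back unchanged, so it's safe to run this before the validation step.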

I hope this is useful to you.
