Capture your Regex

If coding is about manipulating strings, then one of its essential tools is the regular expression (regex). Up until now, I have been traversing and parsing strings via .split, .each, .select, .find_index, .unshift, .push, .join, .collect, all of which are fascinating ways to rearrange data, but regexes can be a better alternative, especially once basic iteration techniques are mastered.

The ideal case for employing regexes presented itself yesterday while working on a team project to parse a SubRip (.srt) data set. .srt files are used to synchronize video subtitles with specific intervals of time. Because their timecodes always follow the same fixed format, .srt files are a great candidate for capturing data with regexes. Each entry consists of a timecode interval in the format hours:minutes:seconds,milliseconds, plus the subtitle text. You can apply the same regex to any .srt file and it will capture the right data in the right place.

A big challenge is understanding how to access captured data, which is returned in a MatchData object that has its own set of accessor methods. The example below uses several methods for retrieving data from the MatchData object.

The example below uses regular expressions to parse a string in timecode format hours:minutes:seconds,milliseconds. This format is used in SubRip (.srt) files to denote the interval of time during which a subtitle should appear in a video. Note that this example builds on an already great tutorial from rubylearning.com.
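Here is a minimal sketch of that kind of parsing. The sample line and the names `timecode`, `interval_pattern` and `to_milliseconds` are made up for illustration and are not taken from the rubylearning.com tutorial. It shows a few ways to read a MatchData object: by index, with `captures`, with `pre_match` and `post_match`, and by named group.

```ruby
# Hypothetical sample line in the SubRip interval format.
line = "00:01:20,500 --> 00:01:23,750"

# One timecode: hours:minutes:seconds,milliseconds, each part captured.
timecode = /(\d{2}):(\d{2}):(\d{2}),(\d{3})/

match = timecode.match(line)    # MatchData for the first timecode (nil if no match)

whole     = match[0]            # the entire matched text: "00:01:20,500"
hours     = match[1]            # first capture group: "00"
parts     = match.captures      # all groups: ["00", "01", "20", "500"]
remainder = match.post_match    # text after the match: " --> 00:01:23,750"

# Named groups make the start and end of the interval easier to tell apart.
interval_pattern = /(?<start>\d{2}:\d{2}:\d{2},\d{3}) --> (?<finish>\d{2}:\d{2}:\d{2},\d{3})/
interval = interval_pattern.match(line)

# Convert a timecode string to milliseconds using the captured groups.
to_milliseconds = lambda do |text|
  h, m, s, ms = timecode.match(text).captures.map(&:to_i)
  ((h * 60 + m) * 60 + s) * 1000 + ms
end

start_ms  = to_milliseconds.call(interval[:start])   # 80500
finish_ms = to_milliseconds.call(interval[:finish])  # 83750

puts "Subtitle is shown for #{finish_ms - start_ms} ms"
```

Because `match` returns nil when the pattern does not match, real parsing code should check for nil before calling `captures`.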

Need to polish up on your regex? Here are some great resources.

1. In his book “The Bastards Book of Regular Expressions,” Dan Nguyen defines regexes as “a way to describe patterns in text, either to find or to replace.”

2. Rubular is a great way to learn regex through trial and error.

3. Regex Crossword is fun!

Fun fact: If at least one regex matches a particular string, then infinitely many other regexes also match it.
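One way to see why, sketched with an illustrative timecode string and hypothetical variable names: wrapping a pattern in one more non-capturing group gives a different regex that matches the same text, and that step can be repeated indefinitely.

```ruby
text = "00:01:20,500"

# Three differently written patterns that all match the same string.
patterns = [
  /\d{2}:\d{2}:\d{2},\d{3}/,
  /[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}/,
  /(?:\d{2}:){2}\d{2},\d{3}/
]
all_match = patterns.all? { |pattern| pattern.match?(text) }

# Each extra layer of (?: ... ) produces a new, distinct pattern source
# that still matches; nothing stops n from growing.
base = "\\d{2}:\\d{2}:\\d{2},\\d{3}"
wrapped = (1..5).map { |n| Regexp.new("(?:" * n + base + ")" * n) }
wrapped_match = wrapped.all? { |pattern| pattern.match?(text) }

puts all_match && wrapped_match
```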
