Capture groups, lookaheads, and lookbehinds provide a powerful way to filter and retrieve data according to advanced regular expression matching logic. This article explains capture groups, lookaheads, and lookbehinds, along with the fundamental syntax you need to know in order to write them.

This is the fourth article in a series about regular expressions:

Part 1: A beginner's guide to regular expressions with grep
Part 2: Regex how-to: Quantifiers, pattern collections, and word boundaries
Part 3: Filter content in HTML using regular expressions in grep

In those articles, you learned about regular characters, metacharacters, quantifiers, pattern collections, and word groups.

As in the previous articles in the series, the sample commands here execute regular expressions by piping string output from an echo command to the grep utility. The grep utility uses a regular expression to filter content. The benefit of demonstrating regular expressions using grep is that you don't need to set up any special programming environment. You can execute an example immediately by copying and pasting the code directly into a terminal window on a computer running Linux.

Capture groups

A capture group, as the name implies, is a regular expression that matches and returns groups of characters according to a pattern. The regular expression logic for a capture group is written between opening and closing parentheses. Consider the following capture group:

(...)

This capture group represents the following logic: Match any of the characters in a string and return the matches in groups of three characters. (Remember, the . metacharacter means any character.)

Consider the following command set, which is an echo command that pipes a string to a grep command that executes the regular expression:

$ echo "abcdef" | grep -Po '(...)'

The commands shown above return the following result:

abc
def

The following regular expression returns capture groups in which each group is made up of three numeric characters. The regular expression uses the \d metacharacters, which indicate any numeric digit:

(\d\d\d)

Again, we feed a string to grep that executes the regular expression like so:

$ echo "My telephone number is 2" | grep -Po '(\d\d\d)'

The command returns the following output:

212

The following capture group matches and groups together any 12 characters in a string of text. The example uses the quantifier metacharacters {12} to set the number of characters in each group. In this case, the text is echoed like so:

$ echo "John Lennon and Mick Jagger" | grep -Po '(.{12})'

The regular expression returns the following output:

John Lennon
and Mick Jag

The following regular expression uses the \w metacharacters to capture a group starting with the character J and followed by zero or more word characters:

$ echo "John Lennon and Mick Jagger" | grep -Po '(J\w*)'

Remember, the * metacharacter means: Find zero or more of the preceding character. In this case, the expression \w* means: Find zero or more word characters. (A word character is an uppercase or lowercase letter, a numeric character, or the underscore character. Other punctuation and white space characters are not word characters.) Thus, matching stops when it encounters a space character or any other non-word character.

The following regular expression uses the \w metacharacters to capture occurrences of the character J followed by zero or more word characters, which are then followed by a space character. Finally, the regular expression captures a set of characters that match text in which the uppercase L character is followed by zero or more word characters:

$ echo "John Lennon and Mick Jagger" | grep -Po '(J\w*\sL\w*)'

A set of characters that match the logic is returned as a capture group.

The following regular expression is similar to the previous one. The difference in this example is that the pattern declaration captures the groups with words that begin with uppercase M, followed by a space character, and then words that begin with uppercase J:

$ echo "John Lennon and Mick Jagger" | grep -Po '(M\w*\sJ\w*)'

The next example declares a capture group that executes the following logic: Process the text from the file named regex-content-01.html. Find a group of characters that starts with the regular characters bgcolor=" followed by any character zero or more times, but stop after encountering the first " character. The extra ? after the * character in the pattern makes sure that the capture group stops the first time it encounters the terminating " character, and doesn't look for more such characters in the line.
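The capture groups discussed in this article can be collected into one short script. The following is a minimal sketch, assuming GNU grep built with PCRE support (the -P option). The telephone number and the HTML snippet are hypothetical stand-ins (the snippet takes the place of regex-content-01.html, whose contents are not shown here), and the bgcolor pattern is an assumed reading of the logic described in the text rather than the article's exact command.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU grep with PCRE support (-P).
# -o prints each match on its own line instead of the whole input line.

text="John Lennon and Mick Jagger"

# Three digits per group. The phone number is a hypothetical example.
digits=$(printf '%s\n' "Call 212-555-0123" | grep -Po '(\d\d\d)')

# Any 12 characters per group, using the {12} quantifier.
twelve=$(printf '%s\n' "$text" | grep -Po '(.{12})')

# Words that begin with an uppercase J.
j_words=$(printf '%s\n' "$text" | grep -Po '(J\w*)')

# A J word, one whitespace character, then an L word.
j_then_l=$(printf '%s\n' "$text" | grep -Po '(J\w*\sL\w*)')

# Hypothetical HTML line standing in for regex-content-01.html.
html='<body bgcolor="#ffffff" text="#000000">'

# Lazy quantifier: .*? stops at the first closing quote.
lazy=$(printf '%s\n' "$html" | grep -Po '(bgcolor=".*?")')

# Greedy quantifier, for comparison: .* runs on to the last quote in the line.
greedy=$(printf '%s\n' "$html" | grep -Po '(bgcolor=".*")')

printf '%s\n' "$digits" "$twelve" "$j_words" "$j_then_l" "$lazy" "$greedy"
```

Comparing the last two results shows why the extra ? matters: the lazy form returns only bgcolor="#ffffff", while the greedy form also swallows the text attribute that follows it on the same line.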