Matching Characters in Unix Shell

RobertX
Level 4
Posts: 259
Joined: Thu Apr 12, 2012 6:09 pm

Post by RobertX »

Hi,

I've been working through the old textbooks I bought in college. They have exercises that I'm having difficulty with, believe it or not. The classroom work was not as hard, but that's probably because I'm now attempting every question instead of skipping some. I would like to use this thread to collect the answers I need and make the most of my time here. Don't worry, I'll try my best not to ask "dumb" questions like "how do I program in UNIX" or "how do I turn my computer on."

Let's start with this:

Here are the contents of a sample file:

Code:

123
142
19988
11111111111
Here's a command that I used:

Code:

grep '[0-9]\{5,5\}' numbers
My expectation was that only 19988 would show up, not the last line which has all ones. Is this normal, or have I done something wrong?
Pilosopong Tasyo
Level 6
Posts: 1432
Joined: Mon Jun 22, 2009 3:26 am
Location: Philippines

Post by Pilosopong Tasyo »

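RobertX wrote: My expectation was that only 19988 would show up, not the last line which has all ones. Is this normal, or have I done something wrong?
It worked as expected. The regex is not anchored, so it matches any five consecutive digits anywhere on a line, and the last line contains several such runs (each possible match is shown in brackets), to wit:
  • [11111]111111
    1[11111]11111
    11[11111]1111
    ...
    111111[11111]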
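To make the list above concrete, here is a rough sketch that slides a five-character window across the last line of the sample file and tests each window on its own. This is only an illustration of why the line qualifies, not a description of how grep works internally (grep only needs one match to select a line). The variable names are my own.

```shell
# Illustration only: enumerate every 5-character window of the long line
# and check each one against the anchored form of the pattern.
line=11111111111
width=5
len=${#line}
count=0
i=1
while [ $((i + width - 1)) -le "$len" ]; do
    # cut -c takes 1-based character positions, e.g. 1-5, 2-6, ...
    window=$(printf '%s' "$line" | cut -c "$i-$((i + width - 1))")
    if printf '%s\n' "$window" | grep -q '^[0-9]\{5\}$'; then
        count=$((count + 1))
        printf 'offset %d: %s\n' "$i" "$window"
    fi
    i=$((i + 1))
done
printf 'windows that match: %d\n' "$count"
```

Any one of those windows is enough for an unanchored pattern to select the line.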
You need to include some options to alter the default behavior. From the grep man page:
-w, --word-regexp

Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character. Similarly, it must be either at the end of the line or followed by a non-word constituent character. Word-constituent characters are letters, digits, and the underscore.
-o, --only-matching

Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
Adding these options ought to do it:

Code:

grep -w -o '[0-9]\{5\}' numbers
There's no need for the extra ,5. It's redundant: \{x,y\} is the same as \{x\} when y = x.

The -o is optional, however. It comes in handy when a single line contains a mix of matching and non-matching items.
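Here is a small sketch you could run to compare the variants side by side. It recreates the sample file in a temporary directory (the directory and the extra "mixed" line are my own assumptions, not part of the exercise) and also shows an anchored pattern as an alternative to -w when each number sits alone on its line. Note that -w and -o are available in GNU and BSD grep but are not required by POSIX.

```shell
# Recreate the sample file from the first post in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 123 142 19988 11111111111 > "$tmpdir/numbers"

# Unanchored: any line containing five consecutive digits is selected.
unanchored=$(grep '[0-9]\{5\}' "$tmpdir/numbers")

# -w: the five digits must form a whole word.
whole_word=$(grep -w '[0-9]\{5\}' "$tmpdir/numbers")

# Alternative without -w: require the entire line to be exactly five digits.
anchored=$(grep '^[0-9]\{5\}$' "$tmpdir/numbers")

# -o with -w on a hypothetical line mixing matching and non-matching items.
mixed=$(printf 'id 12345 and 678 and 54321\n' | grep -w -o '[0-9]\{5\}')

printf 'unanchored:\n%s\n\n' "$unanchored"
printf 'whole word:\n%s\n\n' "$whole_word"
printf 'anchored:\n%s\n\n' "$anchored"
printf 'only matching parts:\n%s\n' "$mixed"

rm -r "$tmpdir"
```

The anchored form is the stricter of the two: -w still accepts a five-digit number surrounded by other text on the same line, while ^...$ does not.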

HTH.
o Give a man a fish and he will eat for a day. Teach him how to fish and he will eat for a lifetime!
o If an issue has been fixed, please edit your first post and add the word [SOLVED].
RobertX

Post by RobertX »

Thanks for the quick reply.

Here's another problem I've been struggling with.

The regular expression:

Code:

'^\(.\).*\1$'
Here is the file that is subject to the match.

Code:

Tony
Barbara
Harry
Dick
The book says that this regular expression matches all lines in which the first character on the line is the same as the last character. However, the expression on its own shows nothing when used with grep. When I changed the 1 in 1$, it listed the lines whose last character matched. So if I replace 1$ with d$, a line with the word Rod shows up because it ends with a d. When I changed the d$ to b$, all of the file's contents were shown, no matter what.

What is going on?
Pilosopong Tasyo

Post by Pilosopong Tasyo »

RobertX wrote: ...this regular expression matches all lines in which the first character on the line is the same as the last character.
Pay attention to your test data. None of the samples matches the regex; hence, no output. However, if you include, say:
Tony
Abba
Barbara
Harry
RadaR
Dick
The regex will match RadaR. Note that Abba doesn't make it even though its first and last letters are alphabetically the same; the only difference is the case.
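If you want to check this yourself, here is a minimal sketch that runs the same pattern against both versions of the test data. The file names names_original and names_extended are placeholders I made up for the example.

```shell
# Build both data sets in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' Tony Barbara Harry Dick > "$tmpdir/names_original"
printf '%s\n' Tony Abba Barbara Harry RadaR Dick > "$tmpdir/names_extended"

# \(.\) captures the first character; \1 requires that same character
# again immediately before the end of the line.
pattern='^\(.\).*\1$'

# grep exits with status 1 when nothing matches; "|| true" keeps the
# script going if it is run under "set -e".
original_hits=$(grep "$pattern" "$tmpdir/names_original" || true)
extended_hits=$(grep "$pattern" "$tmpdir/names_extended" || true)

printf 'original data: [%s]\n' "$original_hits"
printf 'extended data: [%s]\n' "$extended_hits"

# -i asks grep to ignore case. Whether that also applies to the
# back-reference comparison may depend on your grep implementation,
# so try it and see what your system prints.
grep -i "$pattern" "$tmpdir/names_extended" || true

rm -r "$tmpdir"
```

The single quotes around the pattern matter: they stop the shell from touching the backslashes and the $ before grep sees them.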