I have a list whose entries I want to split at the point where the leading digits end. For example:

From:

103Ru
103mRh
104
1041

To:

103
Ru
103
mRh
104
1041

I would like to use a regex with sed, or maybe awk, to achieve this result, but most of my attempts have failed. I would appreciate some advice or a possible solution. Thanks

2 Answers

$ sed -r 's/([0-9])([^0-9])/\1\n\2/g' filename
103
Ru
103
mRh
104
1041

The above regex looks for a digit followed by a non-digit. If found, it inserts a newline between them.

In more detail, sed commands of the form s/old/new/ look for old and replace it with new. In our case, old consists of two characters: ([0-9]) matches any single digit and, because it is enclosed in parens, it saves the value. ([^0-9]) matches any single character other than a digit and saves it also. Those two characters, if found, are replaced with \1\n\2, which means the first match (the digit), a newline, and the second match (the non-digit). The trailing g applies the substitution to every such pair on the line, not just the first.
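To try the command on the sample list from the question without creating a file, something like the following should do. This is only a sketch: the function name break_after_digits is made up for illustration, and it assumes GNU sed, since -r and \n in the replacement are not guaranteed elsewhere.

```shell
# Hypothetical helper name, not part of the answer above.
# Reads lines on stdin; assumes GNU sed (-r and \n in the replacement).
break_after_digits() {
    # Insert a newline between a digit and the non-digit that follows it.
    sed -r 's/([0-9])([^0-9])/\1\n\2/g'
}

# Feed the sample list from the question through the helper.
printf '103Ru\n103mRh\n104\n1041\n' | break_after_digits
```

Lines that consist only of digits, such as 104 and 1041, contain no digit/non-digit pair and pass through unchanged.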

MORE: If we want to break at the beginnings of numbers as well as at the end, then we add one more substitution command:

$ echo xyz541wpk | sed -r 's/([0-9])([^0-9])/\1\n\2/g; s/([^0-9])([0-9])/\1\n\2/g'
xyz
541
wpk

The second substitution command is just like the first but it looks for the reverse pattern: not-a-number followed by a number.
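Since the question also mentions awk, here is a rough sketch of the same two-way split there. It is not taken from either answer, and the function name split_runs is made up for illustration. The idea is to treat each line as a sequence of runs (all digits or all non-digits) and put a newline after each run.

```shell
# Hypothetical helper name; reads lines on stdin.
split_runs() {
    awk '{
        gsub(/[0-9]+|[^0-9]+/, "&\n")   # append a newline to every run
        sub(/\n$/, "")                  # drop the extra newline after the last run
        print
    }'
}

echo xyz541wpk | split_runs
```

Because the pattern matches whole runs rather than single characters, one gsub covers both the digit-to-letter and the letter-to-digit boundaries.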

Here are two more choices:

  1. grep

    grep -oP '\d+|.*' file

    Explanation:

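    • -P : activates Perl Compatible Regular Expressions, which lets us use \d for digits. In the pattern, the | symbol is alternation (logical OR): at each position grep first tries to match one or more (+) digits and, if that fails, the rest of the line (.*). This suits the sample input, where the digits always come first.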
    • -o : This causes grep to only print the matching part of the input line. A side effect is that if a line has multiple matches, it will print each of them on a new line, so it will produce the desired output.
  2. Perl

    perl -lne 's/(\d+)(\D+)/$1\n$2/; print;' file

    Explanation:

    • -n : read the file line by line and apply the script given by -e to each line. -l does two things: it removes the trailing newline (\n) from each input line, and it adds a \n to each print.
    • s/pattern/replacement/ : replaces pattern with replacement.
    • (\d+)(\D+) : Match one or more digits (\d) followed by one or more non-digits (\D). The parentheses () mean that the matches are captured so we can then refer to them as $1 and $2.
    • Taken together, the substitution inserts a newline between a string of digits and the non-digits that follow. Without a g flag, only the first such boundary on each line is split, which is enough for the sample input. The print just prints the line.