I have the following list that I want to break on the digit. For example:
From:
103Ru
103mRh
104
1041
To:
103
Ru
103
mRh
104
1041
I would like to use a regex with sed, or maybe awk, to achieve this result, but most of my approaches have failed. I need some advice or possibly a solution. Thanks
2 Answers
$ sed -r 's/([0-9])([^0-9])/\1\n\2/g' filename
103
Ru
103
mRh
104
1041
The above regex looks for a digit followed by a non-digit character. If found, it inserts a newline between them.
In more detail, sed commands of the form s/old/new/ look for old and replace it with new. In our case, old consists of two characters: ([0-9]) matches any single digit and, because it is enclosed in parentheses, saves what it matched. ([^0-9]) matches any character other than a digit and saves it as well. Those two characters, if found, are replaced with \1\n\2, which means the first match (the digit), a newline, and the second match (the non-digit). The trailing g applies the substitution to every such pair on the line, and -r turns on extended regular expressions so that the parentheses need no backslashes.
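As far as I know, treating \n in the replacement as a newline is a GNU sed behaviour, and not every sed does it. A minimal sketch of the same substitution that writes the newline literally instead; the function name split_after_digit is made up for illustration:

```shell
# Hypothetical helper: insert a newline after every digit that is
# directly followed by a non-digit character.
split_after_digit() {
  # -E enables extended regular expressions (GNU and BSD sed accept it).
  # A backslash followed by a real line break puts a newline into the
  # replacement text without relying on \n being interpreted.
  sed -E 's/([0-9])([^0-9])/\1\
\2/g' "$@"
}

# Sample input from the question.
printf '103Ru\n103mRh\n104\n1041\n' | split_after_digit
```

Lines that contain only digits, such as 104 and 1041, pass through unchanged because no digit in them is followed by a non-digit.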
MORE: If we want to break at the beginning of each number as well as at the end, we add one more substitution command:
$ echo xyz541wpk | sed -r 's/([0-9])([^0-9])/\1\n\2/g; s/([^0-9])([0-9])/\1\n\2/g'
xyz
541
wpk
The second substitution command is just like the first but it looks for the reverse pattern: a non-digit followed by a digit.
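Since the question also mentions awk, the two-way split can be sketched there as well. This is only an illustration, not part of the sed answer; the function name split_runs_awk is invented:

```shell
# Hypothetical helper: print every maximal run of digits or of
# non-digits on its own line.
split_runs_awk() {
  awk '{
    # "&" stands for the matched text, so each run is rewritten as
    # itself followed by a newline.
    gsub(/[0-9]+|[^0-9]+/, "&\n")
    # Every run already ends in a newline, so print without adding one.
    printf "%s", $0
  }' "$@"
}

printf 'xyz541wpk\n103Ru\n104\n' | split_runs_awk
```

One difference from the sed version: empty input lines produce no output here, because there is no run to print.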
Here are two more choices:
grep
grep -oP '\d+|.*' file
Explanation:
- -P: activates Perl Compatible Regular Expressions, which lets us use \d for digits. The | symbol, logical OR, means that grep will first try to match one or more (+) digits, and then everything else (.*).
- -o: This causes grep to only print the matching part of the input line. A side effect is that if a line has multiple matches, it will print each of them on a new line, so it will produce the desired output.
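Note that \d+|.* relies on each line starting with digits: on a line such as xyz541wpk, \d+ cannot match at the first character, so .* takes the whole line. A hedged variant, assuming a grep that supports -o and -E (GNU and BSD grep both do), matches runs of digits or runs of non-digits instead, so it needs no PCRE support. The function name split_with_grep is made up:

```shell
# Hypothetical helper: print each run of digits and each run of
# non-digits as a separate match, one per output line.
split_with_grep() {
  # -o prints only the matched parts; -E enables the | alternation.
  grep -oE '[0-9]+|[^0-9]+' "$@"
}

printf '103Ru\n103mRh\n104\n1041\nxyz541wpk\n' | split_with_grep
```

For the sample input this gives the same result as the command above, and it additionally splits xyz541wpk into xyz, 541 and wpk.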
Perl
perl -lne 's/(\d+)(\D+)/$1\n$2/; print;' file
Explanation:
- The -n means: read the file line by line and apply the script given by -e to each line.
- -l: i) removes newlines (\n) from the end of the line and ii) adds a \n to each print.
- s/pattern/replacement/: replaces pattern with replacement.
- (\d+)(\D+): Match one or more digits (\d) followed by one or more non-digits (\D). The parentheses () mean that the matches are captured so we can then refer to them as $1 and $2.
- Taken together, the substitution will simply insert a newline between a string of digits and the following non-digits.
- The print just prints the line.