Saturday, September 29, 2018

regex - how to use sed, awk, or gawk to print only what is matched?




I see lots of examples and man pages on how to do things like search-and-replace using sed, awk, or gawk.



But in my case, I have a regular expression that I want to run against a text file to extract a specific value. I don't want to do search-and-replace. This is being called from bash. Let's use an example:



Example regular expression:



.*abc([0-9]+)xyz.*



Example input file:



a
b
c
abc12345xyz
a
b
c



As simple as this sounds, I cannot figure out how to call sed/awk/gawk correctly. What I was hoping to have, from within my bash script, is:



myvalue=$( sed <...something...> example.txt )


Things I've tried include:



sed -e 's/.*([0-9]).*/\\1/g' example.txt # extracts the entire input file
sed -n 's/.*([0-9]).*/\\1/g' example.txt # extracts nothing


Answer



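My sed (Mac OS X) didn't work with +, most likely because sed uses basic regular expressions by default, where + is a literal character and capture groups are written \( \). I used * instead and added the p flag, which together with -n prints only the lines where the substitution matched: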



sed -n 's/^.*abc\([0-9]*\)xyz.*$/\1/p' example.txt
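
To show how this fits the bash script from the question, here is a minimal sketch that wraps the command in a command substitution. The temporary file stands in for example.txt and holds the sample input from the question; the variable name input is only illustrative.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' a b c abc12345xyz a b c > "$input"

# -n suppresses sed's default printing of every line;
# the trailing p prints only lines where the substitution succeeded.
myvalue=$(sed -n 's/^.*abc\([0-9]*\)xyz.*$/\1/p' "$input")

echo "$myvalue"
rm -f "$input"
```

With the sample input, only the abc12345xyz line matches, so myvalue ends up holding 12345.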


For matching at least one numeric character without +, I would use:



sed -n 's/^.*abc\([0-9][0-9]*\)xyz.*$/\1/p' example.txt
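
Two alternatives may also be worth trying; treat them as sketches rather than the definitive approach. The first assumes your sed accepts the -E option for extended regular expressions (recent macOS and GNU versions do, but check your man page), which lets + and unescaped parentheses work as in the question's regex. The second uses awk's match() function, since the question also asks about awk. The variable names and the temporary file are illustrative.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' a b c abc12345xyz a b c > "$input"

# Assumption: this sed supports -E (extended regular expressions),
# so + and plain ( ) behave as in the question's original pattern.
value_ere=$(sed -n -E 's/^.*abc([0-9]+)xyz.*$/\1/p' "$input")

# awk: match() sets RSTART and RLENGTH for the matched text;
# skip the 3-character "abc" prefix and drop the 3-character "xyz" suffix.
value_awk=$(awk 'match($0, /abc[0-9]+xyz/) {
    print substr($0, RSTART + 3, RLENGTH - 6)
}' "$input")

echo "$value_ere" "$value_awk"
rm -f "$input"
```

Both variables should hold 12345 for the sample input. The awk version does not need a capture group, which keeps it portable to awk implementations that lack gawk's extensions.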

