Issue
I have a urlwatch .yaml file that has this format:
name: 01_urlwatch update released
url: "https://github.com/thp/urlwatch/releases"
filter:
- xpath:
    path: '(//div[contains(@class,"release-timeline-tags")]//h4)[1]/a'
- html2text: re
---
name: 02_urlwatch webpage
url: "https://thp.io/2008/urlwatch/"
filter:
- html2text: re
- grep: (?i)current\sversion #\s Matches a whitespace character
- strip # Strip leading and trailing whitespace
---
name: 04_RansomWhere? Objective-See
url: "https://objective-see.com/products/ransomwhere.html"
filter:
- html2text: re
- grep: (?i)current\sversion #\s Matches a whitespace character
- strip #Strip leading and trailing whitespace
---
name: 05_BlockBLock Objective-See
url: "https://objective-see.com/products/blockblock.html"
filter:
- html2text: re
- grep: (?i)current\sversion #(?i) \s
- strip #Strip leading and trailing whitespace
---
I need to "re-index" the two-digit number depending on the occurrence of name: in the file. In this example the first and second occurrences of name: are followed by the correct index numbers, but the third and fourth are not.
The third and fourth occurrences of name: should be re-indexed to have 03_ and 04_ before the text string. That is: a two-digit index number and an underscore.
Also, there are instances of the string #name: which should not be counted in the re-indexing. (They have been commented out so that urlwatch does not act on those lines.)
I tried using sed but had trouble generating an index number based on the occurrence of the string. I don't have GNU sed but can install it if that is the only method.
Solution
awk '/^name/{sub(/[0-9]{2}/,sprintf("%02d", ++c))}1' file
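For any line starting with "name" we replace the first two-digit number with our counter, which increments on every occurrence. The sprintf function (standard awk, not a GNU extension) prints the counter with a leading zero when needed. Commented-out #name: lines do not start with "name", so they are neither changed nor counted. The trailing 1 is an always-true pattern, so every line is printed whether or not it was modified.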
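A slightly more defensive variant of the same idea is sketched below. It is not the original answer's command: the function name reindex_names is hypothetical, and the sketch assumes every job name is unquoted, starts at column 0, and already carries an NN_ prefix. It writes [0-9][0-9] instead of [0-9]{2} because some older awk builds do not support the {2} interval syntax, and it only touches lines that actually begin with "name: NN_", so a two-digit number elsewhere in a name is not rewritten by accident.

```shell
# reindex_names is a hypothetical helper name, not part of urlwatch.
# Usage: reindex_names FILE   (prints the re-indexed file to stdout)
reindex_names() {
  awk '
    # Match only uncommented names at column 0 that already have an NN_ prefix.
    /^name: [0-9][0-9]_/ {
      # Replace the leftmost two digits with the zero-padded running counter.
      sub(/[0-9][0-9]/, sprintf("%02d", ++c))
    }
    # Print every line, modified or not.
    { print }
  ' "$1"
}
```

awk does not edit the file in place, so write the result to a new file and then move it over the original, for example: reindex_names urls.yaml > urls.yaml.new && mv urls.yaml.new urls.yaml (urls.yaml is an assumed filename). Redirecting straight onto the input file would truncate it before awk reads it.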
Answered By - thanasisp Answer Checked By - Terry (WPSolving Volunteer)