Issue
There have been dozens of similar questions asked, but mine is about a specific selection between the tags. I don't want the entire selection from <a href to </a>; I only need to target the "> between those tags.
I am trying to convert a href links into wikilinks. For example, if the sample text has:
<a href="./light.html">Light</a> is light.
<div class="reasons">
I want to edit the file itself and change <a href="link.html">Link</a> into [[link.html|Link]]. The basic idea that I have right now uses 3 sed edits, as follows:
<a href="link.html">Link</a> -> <a href="link.html|Link</a>
<a href="link.html|Link</a> -> [[link.html|Link</a>
[[link.html|Link</a> -> [[link.html|Link]]
My problem lies with the first step; I can't find the regex that only targets "> between <a href and </a>.
I understand that the basic idea would be to match the search target between a lookbehind and a lookahead, but trying that on regexr failed. I also tried using a conditional regex. I can't find the syntax I used, but it either returned an error or it worked but also captured the div class.
Edit: I'm on Ubuntu, using sed in a bash script to do the text manipulation.
Solution
The basic idea that I have right now uses 3 sed edits
Assuming you've also read the answers underneath those dozens of similar questions, you'd know that it's a bad idea to parse HTML with sed (regex).
With an HTML-parser like xidel this would be as simple as:
$ xidel -s '<a href="link.html">Link</a>' -e 'concat("[[",//a/@href,"|",//a,"]]")'
$ xidel -s '<a href="link.html">Link</a>' -e '"[["||//a/@href||"|"||//a||"]]"'
$ xidel -s '<a href="link.html">Link</a>' -e 'x"[[{//a/@href}|{//a}]]"'
[[link.html|Link]]
Three different queries to concatenate strings, all producing the same output. The 1st query uses the XPath concat() function, the 2nd query uses the XPath || operator and the 3rd uses xidel's extended string syntax.
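If an HTML parser is not an option and the input is known to be simple, the three sed edits from the question can be collapsed into a single substitution with two capture groups. This is only a minimal sketch under loud assumptions: href is the anchor's only attribute, its value is double-quoted, the whole anchor sits on one line, and the link text contains no nested tags. The function name convert_links is hypothetical, chosen here for illustration. Anything more complex is exactly the case where regex breaks down.

```shell
#!/usr/bin/env bash

# Hypothetical helper: rewrite simple <a href="target">text</a> anchors
# as [[target|text]]. Reads stdin, writes stdout.
# Assumes href is the only attribute, is double-quoted, and that the
# anchor is on a single line with no nested tags in the link text.
convert_links() {
  # \1 = href value (anything up to the closing quote)
  # \2 = link text (anything up to the next '<')
  sed -E 's#<a href="([^"]*)">([^<]*)</a>#[[\1|\2]]#g'
}

# The sample text from the question; the div line must pass through untouched.
printf '%s\n' '<a href="./light.html">Light</a> is light.' '<div class="reasons">' \
  | convert_links
```

Because the pattern requires the literal prefix <a href=", the <div class="reasons"> line does not match and is left alone. To edit a file in place with GNU sed (as shipped on Ubuntu), the same expression can be passed to sed -E -i, ideally after testing it on a copy.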
Answered By - Reino Answer Checked By - Robin (WPSolving Admin)