Issue
How can I remove the beginning of a word using grep? For example, I have a file that contains this:
www.abc.com
abc.com
Sorry for the basic question, but I have no experience with Linux.
Solution
You don't edit strings with grep. In the Unix shell, grep is usually used to find lines in a text, or to filter lines out of it. Use sed instead:
$ echo www.example.com | sed 's/^[^\.]\+\.//'
example.com
You'll need to learn regular expressions to use it effectively.
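Note that the pattern above removes everything up to the first dot, so a bare abc.com would be reduced to com. If the goal is only to drop a leading www. (an assumption about what the question wants), a narrower pattern may fit better. A minimal sketch, where strip_www is a hypothetical helper name:

```shell
# Hypothetical helper: remove a literal leading "www." from each input line.
# Lines without that prefix pass through unchanged.
strip_www() {
    sed 's/^www\.//'
}

printf '%s\n' www.abc.com abc.com | strip_www
```

With the two lines from the question as input, both output lines should read abc.com.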
sed can also edit a file in place (modifying the file itself) if you pass the -i option, but be careful: you can easily lose data if you write the wrong sed command and use the -i flag.
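One way to reduce that risk is to give -i a backup suffix, so the original file is kept next to the edited one. The sketch below works on a scratch copy; the file name domains.txt and the pattern are assumptions chosen for illustration, not part of the question:

```shell
# Work on a scratch copy so the demonstration cannot damage real data.
tmpdir=$(mktemp -d)
printf '%s\n' www.abc.com abc.com > "$tmpdir/domains.txt"

# -i.bak edits domains.txt in place and keeps the original as domains.txt.bak.
sed -i.bak 's/^www\.//' "$tmpdir/domains.txt"

cat "$tmpdir/domains.txt"      # edited file
cat "$tmpdir/domains.txt.bak"  # untouched backup
```

If the result is wrong, the .bak file can simply be copied back over the edited one.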
An example
From your comments, I guess you have a TeX document and you want to remove the first part of all .com domain names. If this is your document test.tex:
\documentclass{article}
\begin{document}
www.example.com
example.com www.another.domain.com
\end{document}
then you can transform it with this sed command (redirect the output to a file, or edit in place with -i):
$ sed 's/\([a-z0-9-]\+\.\)\(\([a-z0-9-]\+\.\)\+com\)/\2/gi' test.tex
\documentclass{article}
\begin{document}
example.com
example.com another.domain.com
\end{document}
Please note that:
- A sequence of allowed characters followed by a dot is matched by [a-z0-9-]\+\.
- I used groups in the regular expression (the parts within \( and \)) to mark the first and the second part of the domain name, and I replace the entire match with its second group (\2 in the substitution pattern).
- The domain must be at least a third-level .com domain (every \+ repetition means at least one match).
- The search is case-insensitive (the i flag at the end).
- It can make more than one match per line (the g flag at the end).
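The same substitution may be easier to read with extended regular expressions, where groups and + need no backslashes. This is only a sketch, assuming GNU sed (the I flag for case-insensitive matching is a GNU extension); strip_first_label is a hypothetical helper name:

```shell
# Hypothetical helper: drop the first label of any .com name that has
# at least three levels, using -E (extended regular expressions).
# Group 1 is the first label, group 2 is the rest of the name, and the
# whole match is replaced by group 2.
strip_first_label() {
    sed -E 's/([a-z0-9-]+\.)(([a-z0-9-]+\.)+com)/\2/gI'
}

echo 'example.com www.another.domain.com WWW.Example.COM' | strip_first_label
```

Here example.com is left alone because it has only two levels, while the other two names lose their first label, including the upper-case one.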
Answered By - sastanin Answer Checked By - Marie Seifert (WPSolving Admin)