Friday, October 7, 2022

[SOLVED] grep from first part of a line before delimiter

Issue

I have to grep values out of this data (test1.txt):

1 - Billing_Type
604 - Customer_Name
2 - Contact_Name
3 - Customer_Phone_Number
4 - Contact_Phone_Number
5 - Customer_Type
6 - Reason_Code
7 - CALLE 1
8 - CALLE 2
9 - NUMERO
10 - ID
11 - Service Address
1700001031 - Serial_Number
1700001008 - STB_REF_AP_ID
1700001027 - Smart_Card_ID

I am comparing the first part of each line (e.g. 1700001031, 1, 8) against values read in a loop from a file, and then copying the second part of the line (e.g. Serial_Number, Billing_Type, CALLE 2) into a variable.

This is a sample of the statement I have used:

grep -w 1 test1.txt | cut -d'-' -f2 |tr -d ' '

but the problem with this statement is that for the values 1 and 2 it outputs two lines. For 1 as the ID, it prints:

Billing_Type
CALLE 1

because the ATTR_NAME 'CALLE 1' also contains 1 as a separate word.

How do I search in the first part only and get the second part, without creating any extra files?


Solution

You really want to use awk, not grep, for this:

$ awk -F' - ' '$1==1{print $2}' file
Billing_Type

$ awk -F' - ' '$1==7{print $2}' file
CALLE 1

$ awk -F' - ' '$1==1700001031{print $2}' file
Serial_Number

This uses ' - ' (space, dash, space) as the field separator, does a numeric equality test against the first field $1, and, if the line matches, prints the second field $2.
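Since the question asks for the result in a variable inside a loop, the awk test can be wrapped in a small function and parameterized with -v. This is only a sketch: the helper name lookup_attr, the variable names, and the temporary sample file are assumptions made for illustration, not part of the original answer.

```shell
# Sample rows from the question, written to a temporary file for the demo.
data_file=$(mktemp)
cat > "$data_file" <<'EOF'
1 - Billing_Type
7 - CALLE 1
10 - ID
1700001031 - Serial_Number
EOF

# lookup_attr ID FILE (hypothetical helper): print the second field of the
# first line whose first field equals ID, then stop reading.
lookup_attr() {
    awk -F' - ' -v id="$1" '$1 == id { print $2; exit }' "$2"
}

# Loop over a few IDs and capture each name in a variable.
for id in 1 7 1700001031; do
    attr_name=$(lookup_attr "$id" "$data_file")
    printf '%s -> %s\n' "$id" "$attr_name"
done
```

Passing the ID with -v keeps it out of the awk program text, so no shell quoting tricks are needed inside the single-quoted script. An ID that is not present simply leaves attr_name empty.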


With GNU grep you could do the following, but the awk approach is definitely the way to go:

$ grep -Po '^1\s+-\s+\K.*' file
Billing_Type

$ grep -Po '^7\s+-\s+\K.*' file
CALLE 1

$ grep -Po '^1700001031\s+-\s+\K.*' file
Serial_Number

This matches the start of the line ^, then the given number, followed by one or more whitespace characters, a dash and more whitespace \s+-\s+. \K is part of Perl-compatible regular expressions (enabled by the -P option), so don't count on it being widely available; it makes everything matched so far be forgotten. Finally we match the rest of the line .* and, thanks to the -o option and the \K, only this part is printed.


The approach with sed would be to match the line, then substitute the start of the line with an empty string:

$ sed -rn '/^1\s/{s/^[0-9]+\s+-\s+//p}' file
Billing_Type

$ sed -rn '/^7\s/{s/^[0-9]+\s+-\s+//p}' file
CALLE 1

$ sed -rn '/^1700001031\s/{s/^[0-9]+\s+-\s+//p}' file
Serial_Number
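
If GNU extensions such as -r and \s are not available, the same idea can probably be expressed with POSIX sed alone by anchoring the ID at the start of the line and doing the match and the removal in a single substitution. This is a sketch under that assumption; the id and data_file variables and the temporary sample file are made up for illustration.

```shell
# A few sample rows from the question, written to a temporary file.
data_file=$(mktemp)
printf '%s\n' '1 - Billing_Type' '7 - CALLE 1' '1700001031 - Serial_Number' > "$data_file"

id=7
# ^ anchors the ID to the start of the line; -n plus the p flag prints only
# lines where the substitution succeeded, i.e. lines that begin with "ID -".
sed -n "s/^${id}[[:space:]]*-[[:space:]]*//p" "$data_file"
# prints: CALLE 1
```

Because the dash must follow the ID (with only optional whitespace in between), an id of 1 does not match the line starting with 1700001031.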


Answered By - Chris Seymour
Answer Checked By - Mildred Charles (WPSolving Admin)