Issue
I want to do some post-processing on an output file.
Running the command `grep ' MACROSCOPIC STATIC DIELECTRIC TENSOR' -A6 OUTCAR` in the terminal prints three matrices (the matrix numbering (i) was added by me for clarity and is not part of the output):
MACROSCOPIC STATIC DIELECTRIC TENSOR (including local field effects in DFT)
------------------------------------------------------
21.231535 -0.000000 -0.000000 ---- matrix (1)
-0.000000 21.231535 -0.000000
-0.000000 -0.000000 21.231535
------------------------------------------------------
--
MACROSCOPIC STATIC DIELECTRIC TENSOR (including local field effects in DFT)
------------------------------------------------------
21.231535 -0.000000 -0.000000 ---- matrix (2)
-0.000000 21.231535 -0.000000
-0.000000 -0.000000 21.231535
------------------------------------------------------
--
MACROSCOPIC STATIC DIELECTRIC TENSOR IONIC CONTRIBUTION
------------------------------------------------------
4.671391 -0.000004 0.000000 ---- matrix (3)
-0.000004 4.671584 0.000855
0.000000 0.000855 4.670146
------------------------------------------------------
Below is what I want.
First, I want to convert the second matrix (the first and second are identical, so only one of them needs converting) into this form:
- [21.231535, -0.000000, -0.000000] ---- matrix (4). This is the same as matrix (2), but each row is enclosed in `- []` with a `,` after the first and second columns.
- [-0.000000, 21.231535, -0.000000]
- [-0.000000, -0.000000, 21.231535]
Second, I want to add the second and third matrices; the resulting output should be
- [25.902926, -0.000004, 0.000000] ---- matrix (5). This is the sum of matrices (2) and (3), with each row enclosed in `- []` and a `,` after the first and second columns.
- [-0.000004, 25.903119, 0.000855]
- [0.000000, 0.000855, 25.901681]
As a novice, I was not able to attempt this myself, so I am posting without an attempt of my own.
Solution
Here's an `awk` solution that makes use of `getline`. That is something of an anti-pattern, but it makes it possible to define a function for reading the matrices, so the main code is easier to understand, IMHO.
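To illustrate the `getline` pattern on its own, here is a minimal sketch that is separate from the full solution. Once a header line matches, a plain `getline` pulls the next line from the same input and re-splits the fields. The three-line input and the `HEADER` marker are invented for the illustration.

```shell
# Hypothetical three-line input; "HEADER" is an invented marker.
getline_demo=$(printf 'HEADER\n1 2 3\nignored\n' | awk '
/HEADER/ {
    # getline returns 1 on success, 0 at end of input, -1 on error.
    # On success it replaces $0 and updates NF and NR.
    if ( (getline) > 0 )
        print "next line has " NF " fields: " $0
}')
printf '%s\n' "$getline_demo"
```

The line consumed by `getline` is not run through the pattern rules again, which is part of why the technique needs care.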
I define a matrix as a single-dimensional array keyed by the `x,y` coordinates, plus `"dimX"` and `"dimY"` keys that hold its dimensions. The `get_matrix` and `print_matrix` custom functions work with arrays in that format.
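The storage scheme can be sketched in isolation as follows. In awk, `m[x,y]` is an ordinary one-dimensional array element whose key is `x SUBSEP y`, so the `"dimX"` and `"dimY"` entries can share the array with the cells without clashing. The array name `m` and the 2x2 values are invented for the illustration.

```shell
matrix_demo=$(awk '
BEGIN {
    # Store a hypothetical 2x2 matrix; x is the column, y is the row.
    m[1,1] = 1; m[2,1] = 2
    m[1,2] = 3; m[2,2] = 4
    m["dimX"] = 2
    m["dimY"] = 2

    # m[2,1] is shorthand for m[2 SUBSEP 1].
    if ( (2 SUBSEP 1) in m )
        print "cell (2,1) = " m[2,1]

    # Walk the cells row by row using the stored dimensions.
    for ( y = 1; y <= m["dimY"]; y++ ) {
        row = ""
        for ( x = 1; x <= m["dimX"]; x++ )
            row = row (x > 1 ? " " : "") m[x,y]
        print row
    }
}')
printf '%s\n' "$matrix_demo"
```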
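You can select the matrices that you want to print or sum in the `BEGIN` block. As written, matrix 1 is printed and matrices 1 and 3 are summed; since matrices 1 and 2 are identical, this matches what was asked for.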
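The selection mechanism relies on two awk behaviours: merely referencing an array element creates its key (with an empty value), and `key in array` tests for a key without creating it. A minimal sketch, assuming an invented four-header input and a hypothetical `wanted` array that selects headers 2 and 3:

```shell
select_demo=$(printf 'TENSOR a\nTENSOR b\nTENSOR c\nTENSOR d\n' | awk '
BEGIN {
    # Referencing an element is enough to create its key.
    wanted[2]
    wanted[3]
}
/TENSOR/ {
    ++id
    # "in" checks for the key without creating it.
    if ( !(id in wanted) )
        next
    print id ": " $2
}')
printf '%s\n' "$select_demo"
```

In the same way, the keys placed in `toPrint` and `toSum` decide which matrices, counted in order of appearance, are printed or summed.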
Here's the code:
awk '
BEGIN {
    # Select matrices by their order of appearance in the file.
    toPrint[1]              # print matrix 1 as read
    toSum[1]                # sum matrices 1 and 3
    toSum[3]
}
/MACROSCOPIC STATIC DIELECTRIC TENSOR/ {
    ++id
    if ( ! (id in toPrint || id in toSum) )
        next
    # Read the matrix that follows this header line.
    ok = get_matrix( matx )
    if ( !ok )
        exit 1
    if ( id in toPrint )
        print_matrix( matx )
    if ( id in toSum ) {
        # The first summed matrix fixes the expected dimensions.
        if ( ! ("dimX" in _sum) ) {
            _sum["dimX"] = matx["dimX"]
            _sum["dimY"] = matx["dimY"]
        }
        if ( matx["dimX"] != _sum["dimX"] || matx["dimY"] != _sum["dimY"] )
            exit 1
        for ( x = 1; x <= matx["dimX"]; x++ )
            for ( y = 1; y <= matx["dimY"]; y++ )
                _sum[x,y] += matx[x,y]
    }
}
END { print_matrix( _sum ) }

# Print one "- [a,b,c]" line per matrix row.
function print_matrix( matrix,    x,y ) {
    for ( y = 1; y <= matrix["dimY"]; y++ ) {
        printf( "- [" )
        for ( x = 1; x <= matrix["dimX"]; x++ ) {
            printf( "%s%f", (x > 1 ? "," : ""), matrix[x,y] )
        }
        print "]"
    }
}

# Read the rows between the next two dashed lines into "matrix".
# Returns 1 on success, 0 on a malformed or truncated block.
function get_matrix( matrix,    x,y,ok ) {
    ok = 0
    # Skip ahead to the opening dashed line.
    while ( !ok && getline > 0 )
        ok = /---/
    if ( !ok )
        return 0
    delete matrix
    x = y = ok = 0
    # Read rows until the closing dashed line.
    while ( !ok && getline > 0 )
        if ( /---/ )
            ok = 1
        else if ( NF ) {
            # Every row must have the same number of columns.
            if ( ! y )
                x = NF
            else if ( x != NF )
                return 0
            ++y
            for ( x = 1; x <= NF; x++ )
                matrix[x,y] = $x
            --x
        }
    if ( !ok )
        return 0
    matrix["dimX"] = x
    matrix["dimY"] = y
    return 1
}
' OUTCAR
Output:
- [21.231535,-0.000000,-0.000000]
- [-0.000000,21.231535,-0.000000]
- [-0.000000,-0.000000,21.231535]
- [25.902926,-0.000004,0.000000]
- [-0.000004,25.903119,0.000855]
- [0.000000,0.000855,25.901681]
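The question asked for `, ` (comma plus space) between columns, whereas the output above uses a bare `,`. One possible adjustment, sketched under the assumption that only the separator needs to change, is a variant of the printing function that takes the separator as a parameter. The name `print_matrix_sep` and the 2x2 values are made up for this sketch.

```shell
format_demo=$(awk '
# Variant of the printing function with a configurable separator.
# The name print_matrix_sep is hypothetical.
function print_matrix_sep( matrix, sep,    x, y ) {
    for ( y = 1; y <= matrix["dimY"]; y++ ) {
        printf( "- [" )
        for ( x = 1; x <= matrix["dimX"]; x++ )
            printf( "%s%f", (x > 1 ? sep : ""), matrix[x,y] )
        print "]"
    }
}
BEGIN {
    # Invented 2x2 values, stored in the x,y plus dimX/dimY layout.
    m[1,1] = 21.231535; m[2,1] = 0
    m[1,2] = 0.000855;  m[2,2] = 4.670146
    m["dimX"] = m["dimY"] = 2
    print_matrix_sep( m, ", " )
}')
printf '%s\n' "$format_demo"
```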
Answered By - Fravadona Answer Checked By - Senaida (WPSolving Volunteer)