Tuesday, November 16, 2021

[SOLVED] Mysql INTO OUTFILE nested unions with NULL causing a formatting issue with integer

Issue

I have strange behavior with SELECT INTO OUFILE and I didn't find an answer.

To resume I have a table like this :

CREATE TABLE `mytable` (
  `id` int(11) NOT NULL auto_increment,
  `field1` decimal(10,2) default NULL,
  `field2` int(11) default NULL,
  `field3` tinyint(4) default NULL,
  PRIMARY KEY  (`id`),
) ENGINE=MyISAM AUTO_INCREMENT=2 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

When I execute a query like this :

SELECT * INTO OUTFILE '/tmp/output.dat' 
FIELDS TERMINATED BY '|' 
OPTIONALLY ENCLOSED BY '\"' 
ESCAPED BY '' 
LINES TERMINATED BY '\n' 
FROM mytable;

I obtain a result like this in the output.dat :

"12345678"|"20.00^@^@^@^@^@^@^@"|"1426513906^@"|"0^@^@^@"
"95863214"|"20.00^@^@^@^@^@^@^@"|"1426514075^@"|"1^@^@^@"

without ESCAPED BY :

SELECT * INTO OUTFILE '/tmp/output.dat' 
FIELDS TERMINATED BY '|' 
OPTIONALLY ENCLOSED BY '\"' 
LINES TERMINATED BY '\n' 
FROM mytable;

I have a result like this :

"12345678"|"20.00\0\0\0\0\0\0\0"|"1426513906\0"|"0\0\0\0"
"95863214"|"20.00\0\0\0\0\0\0\0"|"1426514075\0"|"1\0\0\0"

Mysql environment:

"protocol_version";"10"
"version";"5.0.67-community-log"
"version_comment";"MySQL Community Edition (GPL)"
"version_compile_machine";"x86_64"
"version_compile_os";"redhat-linux-gnu"

It seem like mysql try to fill with this special characters to have the size set in the structure of mytble. With TRIM I don't have this characters. But I want to know if it's normal to have this behavior only with integer and decimal or it's bug ? I want to know also if there are another solution to avoid using TRIM with each field ? Because I have a lot of queries more complex then this one.

Thank you for your help and the time that you took to read my post

Regards

EDIT: I thought that the problem probably comes from INTO OUFILE and type fields only. I didn't think to mention nested unions in this case. So I changed the title and I'm responding to my question.


Solution

Last EDIT : To resume, the best solution found is to force escape with backslash and not leave it empty :

SELECT * INTO OUTFILE '/tmp/output1.dat' 
FIELDS TERMINATED BY '|' 
ENCLOSED BY '\"' 
ESCAPED BY '\\'
LINES TERMINATED BY '\n' 
FROM 
    ((SELECT
    id_test,
    field2,
    field3
    FROM mytable1)

    UNION

    (SELECT 
    * 
    FROM 
        ((SELECT
        id_test,
        field2,
        NULL AS field3
        FROM mytable2)

        UNION

        (SELECT
        id_test,
        NULL AS field2,
        NULL AS field3
        FROM mytable3)
        ) test)
    ) tmptable;

Notes from mysql documentation :

If the FIELDS ESCAPED BY character is empty, no characters are escaped and NULL is output as NULL, not \N. It is probably not a good idea to specify an empty escape character, particularly if field values in your data contain any of the characters in the list just given.

....

Also note that if you specify an empty ESCAPED BY value, it is possible to inadvertently generate output that cannot be read properly by LOAD DATA INFILE.

Previous solutions with explanation :

I found the answer to my problem. It's come from the nested unions with NULL values. I'll explain with example it's better than long speach.

Here the mysql context :

DROP TABLE IF EXISTS `mytable1`;

CREATE TABLE `mytable1` (
  `id_test` INT(11) NOT NULL AUTO_INCREMENT,
  `field1` DECIMAL(10,2) DEFAULT NULL,
  `field2` INT(11) DEFAULT NULL,
  `field3` TINYINT(4) DEFAULT NULL,
  PRIMARY KEY  (`id_test`)
) ENGINE=MYISAM AUTO_INCREMENT=95863215 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

/*Data for the table `mytable` */

INSERT  INTO `mytable1`(`id_test`,`field1`,`field2`,`field3`) VALUES (12345678,20.00,1426513906,0),(95863214,20.00,1426514075,1);

/*Table structure for table `mytable2` */

DROP TABLE IF EXISTS `mytable2`;

CREATE TABLE `mytable2` (
  `id_test` INT(11) NOT NULL AUTO_INCREMENT,
  `field1` DECIMAL(10,2) DEFAULT NULL,
  `field2` INT(11) DEFAULT NULL,
  PRIMARY KEY  (`id_test`)
) ENGINE=MYISAM AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

/*Data for the table `mytable2` */

INSERT  INTO `mytable2`(`id_test`,`field1`,`field2`) VALUES (1,25.00,12345),(2,11.00,52146);

/*Table structure for table `mytable3` */

DROP TABLE IF EXISTS `mytable3`;

CREATE TABLE `mytable3` (
  `id_test` INT(11) NOT NULL AUTO_INCREMENT,
  `field1` DECIMAL(10,2) DEFAULT NULL,
  `field3` TINYINT(4) DEFAULT NULL,
  PRIMARY KEY  (`id_test`)
) ENGINE=MYISAM AUTO_INCREMENT=5 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

/*Data for the table `mytable3` */

INSERT  INTO `mytable3`(`id_test`,`field1`,`field3`) VALUES (2,12.00,2),(4,23.00,31);

And with environment :

"protocol_version";"10"
"version";"5.0.67-community-log"
"version_comment";"MySQL Community Edition (GPL)"
"version_compile_machine";"x86_64"
"version_compile_os";"redhat-linux-gnu"

Or with this one :

"protocol_version";"10"
"version";"5.0.95-log"
"version_bdb";"Sleepycat Software: Berkeley DB 4.1.24: (December 16, 2011)"
"version_comment";"Source distribution"
"version_compile_machine";"x86_64"
"version_compile_os";"redhat-linux-gnu"

We reproduce the case with this kind of query :

SELECT * INTO OUTFILE '/tmp/output1.dat' 
FIELDS TERMINATED BY '|' 
ENCLOSED BY '\"' 
ESCAPED BY ''
LINES TERMINATED BY '\n' 
FROM 
    ((SELECT
    id_test,
    field2,
    field3
    FROM mytable1)

    UNION

    (SELECT 
    * 
    FROM 
        ((SELECT
        id_test,
        field2,
        NULL AS field3
        FROM mytable2)

        UNION

        (SELECT
        id_test,
        NULL AS field2,
        NULL AS field3
        FROM mytable3)
        ) test)
    ) tmptable;

We obtain this result :

"12345678"|"1426513906"|"0^@^@^@"
"95863214"|"1426514075"|"1^@^@^@"
"1"|"12345"|NULL
"2"|"52146"|NULL
"2"|NULL|NULL
"4"|NULL|NULL

When we execute the same query with replacing the last NULL with 1 for example or with removing the last union :

SELECT * INTO OUTFILE '/tmp/output1.dat' 
FIELDS TERMINATED BY '|' 
ENCLOSED BY '\"' 
ESCAPED BY ''
LINES TERMINATED BY '\n' 
FROM 
    ((SELECT
    id_test,
    field2,
    field3
    FROM mytable1)

    UNION

    (SELECT 
    * 
    FROM 
        ((SELECT
        id_test,
        field2,
        NULL AS field3
        FROM mytable2)

        UNION

        (SELECT
        id_test,
        NULL AS field2,
        1 AS field3
        FROM mytable3)
        ) test)
    ) tmptable;

We obtain the expected rigth result :

"12345678"|"1426513906"|"0"
"95863214"|"1426514075"|"1"
"1"|"12345"|NULL
"2"|"52146"|NULL
"2"|NULL|1
"4"|NULL|1

This is just a simple example without WHERE to explain. In my case, the nested union is used because we have different values in different conditions.

So be careful with nested UNION with NULL values in the case of export with INTO OUTFILE. For my case the solution was the fuse nested unions in one query without UNION.

Note that I didn't reproduce this case with Mysql 5.6.23 MySQL Community Server on Windows 7.



Answered By - Mansour