Hi all
I have an application that, among other things, has to store PDF files in a
MySQL database. It has been running for almost 10 years now.
After upgrading DBD-mysql to 4.42.0, the PDF files get corrupted when
storing them to the db. It appears that they are somehow encoded in a
character set (presumably utf8), even though the column definition is
"mediumblob".
4.41.0 and earlier versions do not show that behaviour; downgrading
DBD::mysql without any other changes restores the correct behaviour.
Here is how the db is connected:
my $dbh = DBI->connect(${dsn},
                       ${username},
                       ${passwd},
                       { RaiseError           => 1,
                         AutoCommit           => 1,
                         AutoInactiveDestroy  => 1,
                         mysql_auto_reconnect => 1,
                         mysql_enable_utf8    => 1,
                       }
                      )
    or die("DB connect failed: $DBI::errstr");
The code then goes on to insert data into tables like this one:
+------------+---------------------+------+-----+---------+-------+
| Field      | Type                | Null | Key | Default | Extra |
+------------+---------------------+------+-----+---------+-------+
| id         | bigint(20) unsigned | NO   | PRI | NULL    |       |
| name       | text                | NO   |     | NULL    |       |
| data       | mediumblob          | NO   |     | NULL    |       |
| doctype    | text                | NO   |     | NULL    |       |
+------------+---------------------+------+-----+---------+-------+
like this:
my $sql = "INSERT INTO $table (id, name, data, doctype)
           VALUES (?, ?, ?, ?)";
my $sth = $dbh->prepare($sql);

# $data_in is the raw PDF file data
$logger->log("File $name_in has " . length($data_in)
    . " bytes and hash " . sha256_hex($data_in));

$sth->execute($id_in, $name_in, $data_in, "application/pdf");
The log entry shows the correct size and hash, identical to those of the
file on disk.
May 28 14:54:17 dev middleware[26817]: File testdoc.pdf has 493392 bytes and hash 403da58f84328365c8bdb646bfa008f31b44f2c391dc5d40eefa6963bc49c991
But after retrieving the blob (either via the app or via CLI), the file
size is much larger (700562), the hash is different, and the file is
corrupt and cannot be opened by any reader.
MariaDB [pdfdb]> select data from $table where id = 48 into dumpfile "/tmp/48.dump";
# ls -l /tmp/48.dump
-rw-rw-rw- 1 mysql mysql 700562 May 28 14:56 /tmp/48.dump
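The size increase would be consistent with each byte of the PDF being treated as a Latin-1 character and then written out as UTF-8, which turns every byte in the range 0x80-0xFF into two bytes. The following is a minimal sketch of that hypothesis only; it does not touch the database, and the sample bytes in `$raw` are made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for raw PDF bytes (illustrative only):
# 6 ASCII bytes followed by 4 bytes >= 0x80.
my $raw = "%PDF-1" . "\xE2\xE3\xCF\xD3";

# Simulate the suspected corruption: every byte is read as a Latin-1
# character and the string is then serialised as UTF-8.
my $stored = $raw;
utf8::encode($stored);

# Count the bytes that need two bytes in UTF-8.
my $high_bytes = () = $raw =~ /[\x80-\xFF]/g;

# prints: raw=10 stored=14 high=4
printf "raw=%d stored=%d high=%d\n",
    length($raw), length($stored), $high_bytes;

# Under the same assumption, the sizes from the report would imply this
# many bytes >= 0x80 in testdoc.pdf:
my $implied_high_bytes = 700562 - 493392;
print "implied high bytes: $implied_high_bytes\n";   # prints 207170
```

If the assumption holds, counting the bytes >= 0x80 in the file on disk should give the same number as the difference between the two sizes.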
I've tried re-encoding the file (with vim) to latin1, which results in
the original file size 493392, but still leaves the PDF corrupted.
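An editor may alter more than the encoding, so a byte-exact check in Perl might be more reliable than the vim conversion. The sketch below assumes the stored data is exactly one Latin-1 to UTF-8 conversion of the original bytes; `undo_utf8_upgrade` is a hypothetical helper name, and whether it recovers a real dump depends on that assumption being true.

```perl
use strict;
use warnings;

# Hypothetical helper: undo a single Latin-1 -> UTF-8 conversion of
# binary data. Returns the recovered bytes, or undef if the input does
# not fit that pattern.
sub undo_utf8_upgrade {
    my ($stored) = @_;
    my $bytes = $stored;
    return undef unless utf8::decode($bytes);        # must be well-formed UTF-8
    return undef unless utf8::downgrade($bytes, 1);  # all characters must be <= 0xFF
    return $bytes;
}

# Demonstration on in-memory sample data (not a real PDF).
my $original = "%PDF-1\xE2\xE3\xCF\xD3";
my $damaged  = $original;
utf8::encode($damaged);

my $recovered = undo_utf8_upgrade($damaged);
print defined $recovered && $recovered eq $original
    ? "round trip ok\n"
    : "no match\n";
```

For a real check, the dump would be read with the `<:raw` layer, passed through the helper, and its `sha256_hex` compared with the hash from the log entry.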
I assume that this is a bug in DBD::mysql.
System: Gentoo Linux ~amd64, kernel 4.10.5-gentoo
Perl 5.24.1
DBD::mysql 4.42.0
Thanks for looking into this.
Markus
On Sun May 28 09:22:04 2017, [email protected] wrote:
> [...]
Duplicate of:
https://rt.cpan.org/Public/Bug/Display.html?id=120953
https://github.com/perl5-dbi/DBD-mysql/issues/107
See also for more details:
https://rt.cpan.org/Ticket/Display.html?id=25590
https://rt.cpan.org/Ticket/Display.html?id=60987
https://rt.cpan.org/Ticket/Display.html?id=53130
https://rt.cpan.org/Ticket/Display.html?id=87428
I just wanted to ask what direction you have decided to go with the utf8 support of DBD::mysql.
I see that 4.48.0 is on CPAN.
My installed version on Gentoo is now 4.44.0. The old behaviour still seems to be there: I have not yet changed the code, and it still works.
Hi @markuswernig! Based on the discussion in #117, this issue will never be fixed in DBD::mysql. It is also the reason why I forked DBD::mysql into DBD::MariaDB (https://metacpan.org/pod/DBD::MariaDB), where the problem with Unicode and binary parameters is fixed.
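For readers migrating, a rough sketch of the insert from the original report using an explicit binary bind might look like the following. It assumes, as I understand the DBD::MariaDB documentation, that a parameter bound with the `SQL_BINARY` type is sent byte for byte while other parameters are treated as character strings. `insert_pdf` is a hypothetical helper, and `$dbh` is assumed to be a handle opened with the DBD::MariaDB driver.

```perl
use strict;
use warnings;
use DBI qw(:sql_types);   # exports SQL_BINARY

# Hypothetical helper: insert one PDF, marking the blob parameter as binary.
# $dbh is assumed to come from something like
#   DBI->connect("DBI:MariaDB:database=pdfdb;host=localhost", $user, $pass, ...)
sub insert_pdf {
    my ($dbh, $table, $id, $name, $data) = @_;

    my $quoted_table = $dbh->quote_identifier($table);
    my $sth = $dbh->prepare(
        "INSERT INTO $quoted_table (id, name, data, doctype) VALUES (?, ?, ?, ?)"
    );

    $sth->bind_param(1, $id);
    $sth->bind_param(2, $name);               # text column: character string
    $sth->bind_param(3, $data, SQL_BINARY);   # mediumblob column: raw bytes
    $sth->bind_param(4, 'application/pdf');

    return $sth->execute();
}
```

It would be called as `insert_pdf($dbh, $table, $id_in, $name_in, $data_in)`. Please check the current DBD::MariaDB documentation before relying on the exact binding behaviour.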
Migrated from rt.cpan.org#121921 (status was 'open')
Requestors:
Attachments:
From [email protected] on 2017-05-28 13:22:04:
From [email protected] on 2017-06-16 13:03:10: