Comment #0 by bioinfornatics — 2013-02-13T01:33:51Z
Dear all,
When I try to read gzip-compressed data with a byChunk modified to decompress each chunk, the second loop iteration raises an error:
zlib.d(59): data error
code: http://dpaste.1azy.net/0d8f6eac
EXetoC pointed me to http://d.puremagic.com/issues/show_bug.cgi?id=3191
but that bug is really old. Surely it cannot be the cause?
$ ./zreader file.bam
--> popFront()
--> front()
BAMP�
--> popFront()
--> front()
Comment #1 by bioinfornatics — 2013-02-13T07:15:41Z
I wrote a small piece of code using etc.c.zlib so that gzipped files can be read as a Phobos
range: http://dpaste.dzfl.pl/5b8db0a2
I would like your opinion on this code, and on whether it could replace the old std.zlib.Uncompress.
If so, I could keep working on it to provide compression and decompression in a modern D style.
The decision is yours.
Comment #2 by bioinfornatics — 2013-02-13T07:55:27Z
To fit better with std.stdio: http://dpaste.dzfl.pl/683c053b
I added:
- a ZFile type
- a rawRead method on ZFile
- byZChunk, which now uses ZFile / ZFile.rawRead
Comment #3 by monarchdodra — 2013-02-13T09:25:07Z
Both implementations share a fatal flaw: they close the file on the first destruction.
This makes passing around a byZChunk (first case) or a ZFile (second case) a dangerous operation.
Look into the "File" implementation: the handle should be reference counted, and the file closed only when the last copy is destroyed.
Nitpick:
Once you've named your type ZFile, calling "byChunk" byZChunk is redundant. Just leave it at byChunk:
auto r1 = File ("SRR077487_1.filt.fastq" ).byChunk();
auto r2 = ZFile("SRR077487_1.filt.fastq.gz").byChunk();
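The reference-counting idea can be sketched roughly as follows. This is only an illustration, not the actual std.stdio.File code: the names State, Handle, open and survivesCopy are hypothetical, and the place where a real ZFile would call gzclose is marked with a comment. The shared state is GC-allocated here for brevity; a real implementation would more likely allocate it manually.

```d
// Hypothetical shared state: one instance per opened resource.
private struct State
{
    size_t refs;    // number of live handles pointing here
    bool closed;    // set when the last handle is destroyed
}

// Hypothetical handle type; a real ZFile would keep a gzFile in State.
struct Handle
{
    private State* _s;

    static Handle open()
    {
        Handle h;
        h._s = new State(1, false);
        return h;
    }

    // Postblit: every copy shares the same state and bumps the count.
    this(this)
    {
        if (_s !is null) ++_s.refs;
    }

    ~this()
    {
        if (_s is null) return;
        if (--_s.refs == 0)
            _s.closed = true;   // a real implementation would call gzclose here
        _s = null;
    }

    size_t refs() const { return _s is null ? 0 : _s.refs; }
    bool isOpen() const { return _s !is null && !_s.closed; }
}

// Copies a handle, lets the copy die, and reports whether the original survived.
bool survivesCopy()
{
    auto a = Handle.open();
    {
        auto b = a;                     // count goes to 2
        if (a.refs != 2) return false;
    }                                   // b destroyed, count back to 1
    return a.isOpen && a.refs == 1;
}
```

With this shape, passing a range or a file object by value no longer closes the underlying handle when the first copy goes out of scope.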
Comment #4 by monarchdodra — 2013-02-13T09:41:39Z
Oh yeah, also, gzread returns -1 in case of an I/O error, and you don't check for that.
You'd probably want to use:
_numberRead = gzread( _file, _buffer.ptr, cast(uint)_buffer.length );
errnoEnforce(_numberRead >= 0);
I can only guess that a failed decompression sets errno? Not sure.
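On the errno question: zlib also provides gzerror, which returns a message and an error number for the last failure on a gzFile, and reports Z_ERRNO when the failure came from the operating system (in which case errno is meaningful). A minimal sketch of a checked read under that assumption follows; readChunk is a hypothetical helper name, not part of the posted code.

```d
import etc.c.zlib : gzFile, gzread, gzerror, Z_ERRNO;
import std.exception : errnoEnforce;
import std.string : fromStringz;

// Hypothetical helper: reads up to buffer.length decompressed bytes.
// Returns the number of bytes read (0 at end of file), throws on error.
size_t readChunk(gzFile file, ubyte[] buffer)
{
    immutable n = gzread(file, buffer.ptr, cast(uint) buffer.length);
    if (n < 0)
    {
        int errnum;
        auto msg = gzerror(file, &errnum);
        // Z_ERRNO means the OS reported the failure, so errno is meaningful.
        if (errnum == Z_ERRNO)
            errnoEnforce(false, "gzread failed");
        // Otherwise the message comes from zlib itself (e.g. corrupt data).
        throw new Exception("gzread failed: " ~ fromStringz(msg).idup);
    }
    return cast(size_t) n;
}
```

Splitting the two cases keeps the system error text for I/O failures and the zlib message for decompression failures.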
Comment #5 by bioinfornatics — 2013-02-13T10:34:22Z
Thanks for your much appreciated comments :-)
I will try to improve this. It was only a snippet to be able to read gz files in D, since std.zlib and std.zip are unusable, unmaintained, and not in keeping with D philosophy (classes instead of structs, no Phobos ranges, and many bugs).
Comment #6 by bioinfornatics — 2013-02-14T03:36:31Z
I guess that's because you are using winbits=15 instead of -15. BAM files contain GZIP blocks with custom headers, so processing them is not so straightforward.
(I've developed a library for BAM files during last summer - github.com/lomereiter/biod)
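To illustrate what the sign of windowBits changes: in zlib, a value of 15 makes inflate expect a zlib header, while -15 makes it treat the input as a raw deflate stream with no header. The sketch below only demonstrates that difference on a small in-memory buffer. It is not BGZF handling (skipping the custom gzip header is left out), and rawDeflate and tryInflate are hypothetical helper names.

```d
import etc.c.zlib;

// Compress src as a raw deflate stream (no zlib or gzip header).
ubyte[] rawDeflate(const(ubyte)[] src)
{
    z_stream zs;
    // Negative windowBits asks zlib for raw deflate output.
    auto rc = deflateInit2(&zs, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
                           -15, 8, Z_DEFAULT_STRATEGY);
    assert(rc == Z_OK);
    scope(exit) deflateEnd(&zs);

    // Generous output buffer, sized for small demo inputs.
    auto dst = new ubyte[](src.length + src.length / 8 + 64);
    zs.next_in = cast(typeof(zs.next_in)) src.ptr;
    zs.avail_in = cast(uint) src.length;
    zs.next_out = dst.ptr;
    zs.avail_out = cast(uint) dst.length;

    rc = deflate(&zs, Z_FINISH);
    assert(rc == Z_STREAM_END);
    return dst[0 .. dst.length - zs.avail_out];
}

// Inflate src with the given windowBits; returns zlib's status code
// and stores whatever was decompressed in result.
int tryInflate(const(ubyte)[] src, int windowBits, ref ubyte[] result)
{
    z_stream zs;
    auto rc = inflateInit2(&zs, windowBits);
    if (rc != Z_OK) return rc;
    scope(exit) inflateEnd(&zs);

    auto dst = new ubyte[](4096);   // enough for the demo input
    zs.next_in = cast(typeof(zs.next_in)) src.ptr;
    zs.avail_in = cast(uint) src.length;
    zs.next_out = dst.ptr;
    zs.avail_out = cast(uint) dst.length;

    rc = inflate(&zs, Z_FINISH);
    result = dst[0 .. dst.length - zs.avail_out];
    return rc;
}
```

Feeding the output of rawDeflate to tryInflate with -15 should end with Z_STREAM_END, while 15 fails the header check with Z_DATA_ERROR, which is the same kind of "data error" reported in comment #0.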
(In reply to comment #0)
Comment #8 by bioinfornatics — 2013-02-14T09:49:19Z
(In reply to comment #7)
> I guess that's because you are using winbits=15 instead of -15. BAM files
> contain GZIP blocks with custom headers, so processing them is not so
> straightforward.
>
> (I've developed a library for BAM files during last summer -
> github.com/lomereiter/biod)
Thanks a lot for your lib, I will take a look. Why not create a bioinformatics group so we can work together?
The problem shown here is not about BGZF blocks. I have written a new zlib module that fits the current D way of processing data, as a Phobos range.
In any case, I think this code should replace the old one and be reviewed, since the current zlib module is D1 code.
Comment #9 by bioinfornatics — 2013-02-15T06:37:25Z