
source: rtems/cpukit/zlib/examples/zlib_how.html @ f1c8de9


Last change on this file since f1c8de9 was f1c8de9, checked in by Ralf Corsepius <ralf.corsepius@…>, on 03/18/11 at 10:11:00
Import from zlib-1.2.4
Property mode set to `100644`
File size: 29.1 KB

Rev	Line
[959f7df2]	1	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
	2	"http://www.w3.org/TR/REC-html40/loose.dtd">
	3	<html>
	4	<head>
	5	<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
	6	<title>zlib Usage Example</title>
[f1c8de9]	7	<!-- Copyright (c) 2004, 2005 Mark Adler. -->
[959f7df2]	8	</head>
	9	<body bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#00A000">
	10	<h2 align="center"> zlib Usage Example </h2>
	11	We often get questions about how the <tt>deflate()</tt> and <tt>inflate()</tt> functions should be used.
	12	Users wonder when they should provide more input, when they should use more output,
	13	what to do with a <tt>Z_BUF_ERROR</tt>, how to make sure the process terminates properly, and
	14	so on. So for those who have read <tt>zlib.h</tt> (a few times), and
	15	would like further edification, below is an annotated example in C of simple routines to compress and decompress
	16	from an input file to an output file using <tt>deflate()</tt> and <tt>inflate()</tt> respectively. The
	17	annotations are interspersed between lines of the code. So please read between the lines.
	18	We hope this helps explain some of the intricacies of <em>zlib</em>.
	19	<p>
	20	Without further ado, here is the program <a href="zpipe.c"><tt>zpipe.c</tt></a>:
	21	<pre><b>
	22	/* zpipe.c: example of proper use of zlib's inflate() and deflate()
	23	   Not copyrighted -- provided to the public domain
[f1c8de9]	24	   Version 1.4  11 December 2005  Mark Adler */
[959f7df2]	25
	26	/* Version history:
	27	   1.0  30 Oct 2004  First version
	28	   1.1   8 Nov 2004  Add void casting for unused return values
	29	                     Use switch statement for inflate() return values
	30	   1.2   9 Nov 2004  Add assertions to document zlib guarantees
[f1c8de9]	31	   1.3   6 Apr 2005  Remove incorrect assertion in inf()
	32	   1.4  11 Dec 2005  Add hack to avoid MSDOS end-of-line conversions
	33	                     Avoid some compiler warnings for input and output buffers
[959f7df2]	34	 */
	35	</b></pre><!-- -->
	36	We now include the header files for the required definitions. From
	37	<tt>stdio.h</tt> we use <tt>fopen()</tt>, <tt>fread()</tt>, <tt>fwrite()</tt>,
	38	<tt>feof()</tt>, <tt>ferror()</tt>, and <tt>fclose()</tt> for file i/o, and
	39	<tt>fputs()</tt> for error messages. From <tt>string.h</tt> we use
	40	<tt>strcmp()</tt> for command line argument processing.
	41	From <tt>assert.h</tt> we use the <tt>assert()</tt> macro.
	42	From <tt>zlib.h</tt>
	43	we use the basic compression functions <tt>deflateInit()</tt>,
	44	<tt>deflate()</tt>, and <tt>deflateEnd()</tt>, and the basic decompression
	45	functions <tt>inflateInit()</tt>, <tt>inflate()</tt>, and
	46	<tt>inflateEnd()</tt>.
	47	<pre><b>
	48	#include <stdio.h>
	49	#include <string.h>
	50	#include <assert.h>
	51	#include "zlib.h"
	52	</b></pre><!-- -->
[f1c8de9]	53	This is an ugly hack required to avoid corruption of the input and output data on
	54	Windows/MS-DOS systems. Without this, those systems would assume that the input and output
	55	files are text, and try to convert the end-of-line characters from one standard to
	56	another. That would corrupt binary data, and in particular would render the compressed data unusable.
	57	This sets the input and output to binary which suppresses the end-of-line conversions.
	58	<tt>SET_BINARY_MODE()</tt> will be used later on <tt>stdin</tt> and <tt>stdout</tt>, at the beginning of <tt>main()</tt>.
	59	<pre><b>
	60	#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(__CYGWIN__)
	61	#  include <fcntl.h>
	62	#  include <io.h>
	63	#  define SET_BINARY_MODE(file) setmode(fileno(file), O_BINARY)
	64	#else
	65	#  define SET_BINARY_MODE(file)
	66	#endif
	67	</b></pre><!-- -->
[959f7df2]	68	<tt>CHUNK</tt> is simply the buffer size for feeding data to and pulling data
	69	from the <em>zlib</em> routines. Larger buffer sizes would be more efficient,
	70	especially for <tt>inflate()</tt>. If the memory is available, buffer sizes
	71	on the order of 128K or 256K bytes should be used.
	72	<pre><b>
	73	#define CHUNK 16384
	74	</b></pre><!-- -->
	75	The <tt>def()</tt> routine compresses data from an input file to an output file. The output data
	76	will be in the <em>zlib</em> format, which is different from the <em>gzip</em> or <em>zip</em>
	77	formats. The <em>zlib</em> format has a very small header of only two bytes to identify it as
	78	a <em>zlib</em> stream and to provide decoding information, and a four-byte trailer with a fast
	79	check value to verify the integrity of the uncompressed data after decoding.
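<p>
To make those six bytes of framing concrete, here is a minimal sketch based on our reading of RFC 1950, which defines the <em>zlib</em> format. The first function checks whether two bytes could be a <em>zlib</em> header, and the second computes the Adler-32 value that the trailer carries for the uncompressed data. The names <tt>looks_like_zlib_header()</tt> and <tt>adler32_simple()</tt> are made up for this illustration and are not part of <em>zlib</em>, which provides its own, much faster <tt>adler32()</tt>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the two bytes at p could be a zlib header per RFC 1950:
   compression method 8 (deflate), a window size code of at most 7, and
   the two bytes read as a big-endian 16-bit number divisible by 31.
   (Illustrative helper, not a zlib function.) */
int looks_like_zlib_header(const unsigned char *p)
{
    unsigned cmf, flg;

    assert(p != NULL);
    cmf = p[0];
    flg = p[1];
    if ((cmf & 0x0f) != 8)                  /* CM: 8 means deflate */
        return 0;
    if ((cmf >> 4) > 7)                     /* CINFO: log2(window) - 8 */
        return 0;
    return ((cmf << 8) | flg) % 31 == 0;    /* FCHECK makes this hold */
}

/* Plain, unoptimized Adler-32 of len bytes at buf.
   (Illustrative helper, zlib's own adler32() is the one to use.) */
uint32_t adler32_simple(const unsigned char *buf, size_t len)
{
    uint32_t a = 1, b = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        a = (a + buf[i]) % 65521;   /* 65521: largest prime below 2^16 */
        b = (b + a) % 65521;
    }
    return (b << 16) | a;
}
```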
	80	<pre><b>
	81	/* Compress from file source to file dest until EOF on source.
	82	   def() returns Z_OK on success, Z_MEM_ERROR if memory could not be
	83	   allocated for processing, Z_STREAM_ERROR if an invalid compression
	84	   level is supplied, Z_VERSION_ERROR if the version of zlib.h and the
	85	   version of the library linked do not match, or Z_ERRNO if there is
	86	   an error reading or writing the files. */
	87	int def(FILE *source, FILE *dest, int level)
	88	{
	89	</b></pre>
	90	Here are the local variables for <tt>def()</tt>. <tt>ret</tt> will be used for <em>zlib</em>
	91	return codes. <tt>flush</tt> will keep track of the current flushing state for <tt>deflate()</tt>,
	92	which is either no flushing, or flush to completion after the end of the input file is reached.
	93	<tt>have</tt> is the amount of data returned from <tt>deflate()</tt>. The <tt>strm</tt> structure
	94	is used to pass information to and from the <em>zlib</em> routines, and to maintain the
	95	<tt>deflate()</tt> state. <tt>in</tt> and <tt>out</tt> are the input and output buffers for
	96	<tt>deflate()</tt>.
	97	<pre><b>
	98	    int ret, flush;
	99	    unsigned have;
	100	    z_stream strm;
[f1c8de9]	101	    unsigned char in[CHUNK];
	102	    unsigned char out[CHUNK];
[959f7df2]	103	</b></pre><!-- -->
	104	The first thing we do is to initialize the <em>zlib</em> state for compression using
	105	<tt>deflateInit()</tt>. This must be done before the first use of <tt>deflate()</tt>.
	106	The <tt>zalloc</tt>, <tt>zfree</tt>, and <tt>opaque</tt> fields in the <tt>strm</tt>
	107	structure must be initialized before calling <tt>deflateInit()</tt>. Here they are
	108	set to the <em>zlib</em> constant <tt>Z_NULL</tt> to request that <em>zlib</em> use
	109	the default memory allocation routines. An application may also choose to provide
	110	custom memory allocation routines here. <tt>deflateInit()</tt> will allocate on the
	111	order of 256K bytes for the internal state.
	112	(See <a href="zlib_tech.html"><em>zlib Technical Details</em></a>.)
	113	<p>
	114	<tt>deflateInit()</tt> is called with a pointer to the structure to be initialized and
	115	the compression level, which is an integer in the range of -1 to 9. Lower compression
	116	levels result in faster execution, but less compression. Higher levels result in
	117	greater compression, but slower execution. The <em>zlib</em> constant <tt>Z_DEFAULT_COMPRESSION</tt>,
	118	equal to -1,
	119	provides a good compromise between compression and speed and is equivalent to level 6.
	120	Level 0 actually does no compression at all, and in fact expands the data slightly to produce
	121	the <em>zlib</em> format (it is not a byte-for-byte copy of the input).
	122	More advanced applications of <em>zlib</em>
	123	may use <tt>deflateInit2()</tt> here instead. Such an application may want to reduce how
	124	much memory will be used, at some price in compression. Or it may need to request a
	125	<em>gzip</em> header and trailer instead of a <em>zlib</em> header and trailer, or raw
	126	encoding with no header or trailer at all.
	127	<p>
	128	We must check the return value of <tt>deflateInit()</tt> against the <em>zlib</em> constant
	129	<tt>Z_OK</tt> to make sure that it was able to
	130	allocate memory for the internal state, and that the provided arguments were valid.
	131	<tt>deflateInit()</tt> will also check that the version of <em>zlib</em> that the <tt>zlib.h</tt>
	132	file came from matches the version of <em>zlib</em> actually linked with the program. This
	133	is especially important for environments in which <em>zlib</em> is a shared library.
	134	<p>
	135	Note that an application can initialize multiple, independent <em>zlib</em> streams, which can
	136	operate in parallel. The state information maintained in the structure allows the <em>zlib</em>
	137	routines to be reentrant.
	138	<pre><b>
	139	    /* allocate deflate state */
	140	    strm.zalloc = Z_NULL;
	141	    strm.zfree = Z_NULL;
	142	    strm.opaque = Z_NULL;
	143	    ret = deflateInit(&strm, level);
	144	    if (ret != Z_OK)
	145	        return ret;
	146	</b></pre><!-- -->
	147	With the pleasantries out of the way, now we can get down to business. The outer <tt>do</tt>-loop
	148	reads all of the input file and exits at the bottom of the loop once end-of-file is reached.
	149	This loop contains the only call of <tt>deflate()</tt>. So we must make sure that all of the
	150	input data has been processed and that all of the output data has been generated and consumed
	151	before we fall out of the loop at the bottom.
	152	<pre><b>
	153	    /* compress until end of file */
	154	    do {
	155	</b></pre>
	156	We start off by reading data from the input file. The number of bytes read is put directly
	157	into <tt>avail_in</tt>, and a pointer to those bytes is put into <tt>next_in</tt>. We also
	158	check to see if end-of-file on the input has been reached. If we are at the end of file, then <tt>flush</tt> is set to the
	159	<em>zlib</em> constant <tt>Z_FINISH</tt>, which is later passed to <tt>deflate()</tt> to
	160	indicate that this is the last chunk of input data to compress. We need to use <tt>feof()</tt>
	161	to check for end-of-file as opposed to seeing if fewer than <tt>CHUNK</tt> bytes have been read. The
	162	reason is that if the input file length is an exact multiple of <tt>CHUNK</tt>, we will miss
	163	the fact that we got to the end-of-file, and not know to tell <tt>deflate()</tt> to finish
	164	up the compressed stream. If we are not yet at the end of the input, then the <em>zlib</em>
	165	constant <tt>Z_NO_FLUSH</tt> will be passed to <tt>deflate()</tt> to indicate that we are still
	166	in the middle of the uncompressed data.
	167	<p>
	168	If there is an error in reading from the input file, the process is aborted with
	169	<tt>deflateEnd()</tt> being called to free the allocated <em>zlib</em> state before returning
	170	the error. We wouldn't want a memory leak, now would we? <tt>deflateEnd()</tt> can be called
	171	at any time after the state has been initialized. Once that's done, <tt>deflateInit()</tt> (or
	172	<tt>deflateInit2()</tt>) would have to be called to start a new compression process. There is
	173	no point here in checking the <tt>deflateEnd()</tt> return code. The deallocation can't fail.
	174	<pre><b>
	175	        strm.avail_in = fread(in, 1, CHUNK, source);
	176	        if (ferror(source)) {
	177	            (void)deflateEnd(&strm);
	178	            return Z_ERRNO;
	179	        }
	180	        flush = feof(source) ? Z_FINISH : Z_NO_FLUSH;
	181	        strm.next_in = in;
	182	</b></pre><!-- -->
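The end-of-file behavior that this code relies on can be seen without <em>zlib</em> at all. In the small sketch below, <tt>reads_until_eof()</tt> (a hypothetical helper, not part of <tt>zpipe.c</tt>) counts how many <tt>fread()</tt> calls it takes before <tt>feof()</tt> reports true. On the stdio implementations we are aware of, the end-of-file indicator is set only when a read runs past the end of the file, so a file whose length is an exact multiple of the chunk size takes one extra, empty read. In <tt>def()</tt>, that empty read is the one that sets <tt>flush</tt> to <tt>Z_FINISH</tt>.

```c
#include <assert.h>
#include <stdio.h>

/* Count the fread() calls needed before feof() is true on f, reading
   chunk bytes at a time (chunk must be between 1 and 64). The size of
   the final read is stored in *last. Returns -1 on a read error.
   (Hypothetical helper for this sketch.) */
int reads_until_eof(FILE *f, size_t chunk, size_t *last)
{
    unsigned char buf[64];
    int calls = 0;

    assert(f != NULL && last != NULL);
    assert(chunk > 0 && chunk <= sizeof(buf));
    do {
        *last = fread(buf, 1, chunk, f);
        if (ferror(f))
            return -1;
        calls++;
    } while (!feof(f));     /* same test def() uses to pick Z_FINISH */
    return calls;
}
```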
	183	The inner <tt>do</tt>-loop passes our chunk of input data to <tt>deflate()</tt>, and then
	184	keeps calling <tt>deflate()</tt> until it is done producing output. Once there is no more
	185	new output, <tt>deflate()</tt> is guaranteed to have consumed all of the input, i.e.,
	186	<tt>avail_in</tt> will be zero.
	187	<pre><b>
	188	        /* run deflate() on input until output buffer not full, finish
	189	           compression if all of source has been read in */
	190	        do {
	191	</b></pre>
	192	Output space is provided to <tt>deflate()</tt> by setting <tt>avail_out</tt> to the number
	193	of available output bytes and <tt>next_out</tt> to a pointer to that space.
	194	<pre><b>
	195	            strm.avail_out = CHUNK;
	196	            strm.next_out = out;
	197	</b></pre>
	198	Now we call the compression engine itself, <tt>deflate()</tt>. It takes as many of the
	199	<tt>avail_in</tt> bytes at <tt>next_in</tt> as it can process, and writes as many as
	200	<tt>avail_out</tt> bytes to <tt>next_out</tt>. Those counters and pointers are then
	201	updated past the input data consumed and the output data written. It is the amount of
	202	output space available that may limit how much input is consumed.
	203	Hence the inner loop to make sure that
	204	all of the input is consumed by providing more output space each time. Since <tt>avail_in</tt>
	205	and <tt>next_in</tt> are updated by <tt>deflate()</tt>, we don't have to mess with those
	206	between <tt>deflate()</tt> calls until it's all used up.
	207	<p>
	208	The parameters to <tt>deflate()</tt> are a pointer to the <tt>strm</tt> structure containing
	209	the input and output information and the internal compression engine state, and a parameter
	210	indicating whether and how to flush data to the output. Normally <tt>deflate()</tt> will consume
	211	several K bytes of input data before producing any output (except for the header), in order
	212	to accumulate statistics on the data for optimum compression. It will then put out a burst of
	213	compressed data, and proceed to consume more input before the next burst. Eventually,
	214	<tt>deflate()</tt>
	215	must be told to terminate the stream, complete the compression with provided input data, and
	216	write out the trailer check value. <tt>deflate()</tt> will continue to compress normally as long
	217	as the flush parameter is <tt>Z_NO_FLUSH</tt>. Once the <tt>Z_FINISH</tt> parameter is provided,
	218	<tt>deflate()</tt> will begin to complete the compressed output stream. However, depending on how
	219	much output space is provided, <tt>deflate()</tt> may have to be called several times until it
	220	has provided the complete compressed stream, even after it has consumed all of the input. The flush
	221	parameter must continue to be <tt>Z_FINISH</tt> for those subsequent calls.
	222	<p>
	223	There are other values of the flush parameter that are used in more advanced applications. You can
	224	force <tt>deflate()</tt> to produce a burst of output that encodes all of the input data provided
	225	so far, even if it wouldn't have otherwise, for example to control data latency on a link with
	226	compressed data. You can also ask that <tt>deflate()</tt> do that as well as erase any history up to
	227	that point so that what follows can be decompressed independently, for example for random access
	228	applications. Both requests will degrade compression by an amount depending on how often such
	229	requests are made.
	230	<p>
	231	<tt>deflate()</tt> has a return value that can indicate errors, yet we do not check it here. Why
	232	not? Well, it turns out that <tt>deflate()</tt> can do no wrong here. Let's go through
	233	<tt>deflate()</tt>'s return values and dispense with them one by one. The possible values are
	234	<tt>Z_OK</tt>, <tt>Z_STREAM_END</tt>, <tt>Z_STREAM_ERROR</tt>, or <tt>Z_BUF_ERROR</tt>. <tt>Z_OK</tt>
	235	is, well, ok. <tt>Z_STREAM_END</tt> is also ok and will be returned for the last call of
	236	<tt>deflate()</tt>. This is already guaranteed by calling <tt>deflate()</tt> with <tt>Z_FINISH</tt>
	237	until it has no more output. <tt>Z_STREAM_ERROR</tt> is only possible if the stream is not
	238	initialized properly, but we did initialize it properly. There is no harm in checking for
	239	<tt>Z_STREAM_ERROR</tt> here, for example to check for the possibility that some
	240	other part of the application inadvertently clobbered the memory containing the <em>zlib</em> state.
	241	<tt>Z_BUF_ERROR</tt> will be explained further below, but
	242	suffice it to say that this is simply an indication that <tt>deflate()</tt> could not consume
	243	more input or produce more output. <tt>deflate()</tt> can be called again with more output space
	244	or more available input, which it will be in this code.
	245	<pre><b>
	246	            ret = deflate(&strm, flush);    /* no bad return value */
	247	            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
	248	</b></pre>
	249	Now we compute how much output <tt>deflate()</tt> provided on the last call, which is the
	250	difference between how much space was provided before the call, and how much output space
	251	is still available after the call. Then that data, if any, is written to the output file.
	252	We can then reuse the output buffer for the next call of <tt>deflate()</tt>. Again if there
	253	is a file i/o error, we call <tt>deflateEnd()</tt> before returning to avoid a memory leak.
	254	<pre><b>
	255	            have = CHUNK - strm.avail_out;
	256	            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
	257	                (void)deflateEnd(&strm);
	258	                return Z_ERRNO;
	259	            }
	260	</b></pre>
	261	The inner <tt>do</tt>-loop is repeated until the last <tt>deflate()</tt> call fails to fill the
	262	provided output buffer. Then we know that <tt>deflate()</tt> has done as much as it can with
	263	the provided input, and that all of that input has been consumed. We can then fall out of this
	264	loop and reuse the input buffer.
	265	<p>
	266	The way we tell that <tt>deflate()</tt> has no more output is by seeing that it did not fill
	267	the output buffer, leaving <tt>avail_out</tt> greater than zero. However, suppose that
	268	<tt>deflate()</tt> has no more output, but just so happened to exactly fill the output buffer!
	269	<tt>avail_out</tt> is zero, and we can't tell that <tt>deflate()</tt> has done all it can.
	270	As far as we know, <tt>deflate()</tt>
	271	has more output for us. So we call it again. But now <tt>deflate()</tt> produces no output
	272	at all, and <tt>avail_out</tt> remains unchanged as <tt>CHUNK</tt>. That <tt>deflate()</tt> call
	273	wasn't able to do anything, either consume input or produce output, and so it returns
	274	<tt>Z_BUF_ERROR</tt>. (See, I told you I'd cover this later.) However, this is not a problem at
	275	all. Now we finally have the desired indication that <tt>deflate()</tt> is really done,
	276	and so we drop out of the inner loop to provide more input to <tt>deflate()</tt>.
	277	<p>
	278	With <tt>flush</tt> set to <tt>Z_FINISH</tt>, this final set of <tt>deflate()</tt> calls will
	279	complete the output stream. Once that is done, subsequent calls of <tt>deflate()</tt> would return
	280	<tt>Z_STREAM_ERROR</tt> if the flush parameter is not <tt>Z_FINISH</tt>, and do no more processing
	281	until the state is reinitialized.
	282	<p>
	283	Some applications of <em>zlib</em> have two loops that call <tt>deflate()</tt>
	284	instead of the single inner loop we have here. The first loop would call
	285	without flushing and feed all of the data to <tt>deflate()</tt>. The second loop would call
	286	<tt>deflate()</tt> with no more
	287	data and the <tt>Z_FINISH</tt> parameter to complete the process. As you can see from this
	288	example, that can be avoided by simply keeping track of the current flush state.
	289	<pre><b>
	290	        } while (strm.avail_out == 0);
	291	        assert(strm.avail_in == 0);     /* all input will be used */
	292	</b></pre><!-- -->
	293	Now we check to see if we have already processed all of the input file. That information was
	294	saved in the <tt>flush</tt> variable, so we see if that was set to <tt>Z_FINISH</tt>. If so,
	295	then we're done and we fall out of the outer loop. We're guaranteed to get <tt>Z_STREAM_END</tt>
	296	from the last <tt>deflate()</tt> call, since we ran it until the last chunk of input was
	297	consumed and all of the output was generated.
	298	<pre><b>
	299	        /* done when last data in file processed */
	300	    } while (flush != Z_FINISH);
	301	    assert(ret == Z_STREAM_END);        /* stream will be complete */
	302	</b></pre><!-- -->
	303	The process is complete, but we still need to deallocate the state to avoid a memory leak
	304	(or rather more like a memory hemorrhage if you didn't do this). Then
	305	finally we can return with a happy return value.
	306	<pre><b>
	307	    /* clean up and return */
	308	    (void)deflateEnd(&strm);
	309	    return Z_OK;
	310	}
	311	</b></pre><!-- -->
	312	Now we do the same thing for decompression in the <tt>inf()</tt> routine. <tt>inf()</tt>
	313	decompresses what is hopefully a valid <em>zlib</em> stream from the input file and writes the
	314	uncompressed data to the output file. Much of the discussion above for <tt>def()</tt>
	315	applies to <tt>inf()</tt> as well, so the discussion here will focus on the differences between
	316	the two.
	317	<pre><b>
	318	/* Decompress from file source to file dest until stream ends or EOF.
	319	   inf() returns Z_OK on success, Z_MEM_ERROR if memory could not be
	320	   allocated for processing, Z_DATA_ERROR if the deflate data is
	321	   invalid or incomplete, Z_VERSION_ERROR if the version of zlib.h and
	322	   the version of the library linked do not match, or Z_ERRNO if there
	323	   is an error reading or writing the files. */
	324	int inf(FILE *source, FILE *dest)
	325	{
	326	</b></pre>
	327	The local variables have the same functionality as they do for <tt>def()</tt>. The
	328	only difference is that there is no <tt>flush</tt> variable, since <tt>inflate()</tt>
	329	can tell from the <em>zlib</em> stream itself when the stream is complete.
	330	<pre><b>
	331	    int ret;
	332	    unsigned have;
	333	    z_stream strm;
[f1c8de9]	334	    unsigned char in[CHUNK];
	335	    unsigned char out[CHUNK];
[959f7df2]	336	</b></pre><!-- -->
	337	The initialization of the state is the same, except that there is no compression level,
	338	of course, and two more elements of the structure are initialized. <tt>avail_in</tt>
	339	and <tt>next_in</tt> must be initialized before calling <tt>inflateInit()</tt>. This
	340	is because the application has the option to provide the start of the <em>zlib</em> stream in
	341	order for <tt>inflateInit()</tt> to have access to information about the compression
	342	method to aid in memory allocation. In the current implementation of <em>zlib</em>
	343	(up through versions 1.2.x), the method-dependent memory allocations are deferred to the first call of
	344	<tt>inflate()</tt> anyway. However, those fields must be initialized since later versions
	345	of <em>zlib</em> that provide more compression methods may take advantage of this interface.
	346	In any case, no decompression is performed by <tt>inflateInit()</tt>, so the
	347	<tt>avail_out</tt> and <tt>next_out</tt> fields do not need to be initialized before calling.
	348	<p>
	349	Here <tt>avail_in</tt> is set to zero and <tt>next_in</tt> is set to <tt>Z_NULL</tt> to
	350	indicate that no input data is being provided.
	351	<pre><b>
	352	    /* allocate inflate state */
	353	    strm.zalloc = Z_NULL;
	354	    strm.zfree = Z_NULL;
	355	    strm.opaque = Z_NULL;
	356	    strm.avail_in = 0;
	357	    strm.next_in = Z_NULL;
	358	    ret = inflateInit(&strm);
	359	    if (ret != Z_OK)
	360	        return ret;
	361	</b></pre><!-- -->
	362	The outer <tt>do</tt>-loop decompresses input until <tt>inflate()</tt> indicates
	363	that it has reached the end of the compressed data and has produced all of the uncompressed
	364	output. This is in contrast to <tt>def()</tt> which processes all of the input file.
	365	If end-of-file is reached before the compressed data self-terminates, then the compressed
	366	data is incomplete and an error is returned.
	367	<pre><b>
	368	    /* decompress until deflate stream ends or end of file */
	369	    do {
	370	</b></pre>
	371	We read input data and set the <tt>strm</tt> structure accordingly. If we've reached the
	372	end of the input file, then we leave the outer loop and report an error, since the
	373	compressed data is incomplete. Note that we may read more data than is eventually consumed
	374	by <tt>inflate()</tt>, if the input file continues past the <em>zlib</em> stream.
	375	For applications where <em>zlib</em> streams are embedded in other data, this routine would
	376	need to be modified to return the unused data, or at least indicate how much of the input
	377	data was not used, so the application would know where to pick up after the <em>zlib</em> stream.
	378	<pre><b>
	379	        strm.avail_in = fread(in, 1, CHUNK, source);
	380	        if (ferror(source)) {
	381	            (void)inflateEnd(&strm);
	382	            return Z_ERRNO;
	383	        }
	384	        if (strm.avail_in == 0)
	385	            break;
	386	        strm.next_in = in;
	387	</b></pre><!-- -->
	388	The inner <tt>do</tt>-loop has the same function it did in <tt>def()</tt>, which is to
	389	keep calling <tt>inflate()</tt> until it has generated all of the output it can with the
	390	provided input.
	391	<pre><b>
	392	        /* run inflate() on input until output buffer not full */
	393	        do {
	394	</b></pre>
	395	Just like in <tt>def()</tt>, the same output space is provided for each call of <tt>inflate()</tt>.
	396	<pre><b>
	397	            strm.avail_out = CHUNK;
	398	            strm.next_out = out;
	399	</b></pre>
	400	Now we run the decompression engine itself. There is no need to adjust the flush parameter, since
	401	the <em>zlib</em> format is self-terminating. The main difference here is that there are
	402	return values that we need to pay attention to. <tt>Z_DATA_ERROR</tt>
	403	indicates that <tt>inflate()</tt> detected an error in the <em>zlib</em> compressed data format,
	404	which means that either the data is not a <em>zlib</em> stream to begin with, or that the data was
	405	corrupted somewhere along the way since it was compressed. The other error to be processed is
	406	<tt>Z_MEM_ERROR</tt>, which can occur since memory allocation is deferred until <tt>inflate()</tt>
	407	needs it, unlike <tt>deflate()</tt>, whose memory is allocated at the start by <tt>deflateInit()</tt>.
	408	<p>
	409	Advanced applications may use
	410	<tt>deflateSetDictionary()</tt> to prime <tt>deflate()</tt> with a set of likely data to improve the
	411	first 32K or so of compression. This is noted in the <em>zlib</em> header, so <tt>inflate()</tt>
	412	requests that that dictionary be provided before it can start to decompress. Without the dictionary,
	413	correct decompression is not possible. For this routine, we have no idea what the dictionary is,
	414	so the <tt>Z_NEED_DICT</tt> indication is converted to a <tt>Z_DATA_ERROR</tt>.
	415	<p>
	416	<tt>inflate()</tt> can also return <tt>Z_STREAM_ERROR</tt>, which should not be possible here,
	417	but could be checked for as noted above for <tt>def()</tt>. <tt>Z_BUF_ERROR</tt> does not need to be
	418	checked for here, for the same reasons noted for <tt>def()</tt>. <tt>Z_STREAM_END</tt> will be
	419	checked for later.
	420	<pre><b>
	421	            ret = inflate(&strm, Z_NO_FLUSH);
	422	            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
	423	            switch (ret) {
	424	            case Z_NEED_DICT:
	425	                ret = Z_DATA_ERROR;     /* and fall through */
	426	            case Z_DATA_ERROR:
	427	            case Z_MEM_ERROR:
	428	                (void)inflateEnd(&strm);
	429	                return ret;
	430	            }
	431	</b></pre>
	432	The output of <tt>inflate()</tt> is handled identically to that of <tt>deflate()</tt>.
	433	<pre><b>
	434	            have = CHUNK - strm.avail_out;
	435	            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
	436	                (void)inflateEnd(&strm);
	437	                return Z_ERRNO;
	438	            }
	439	</b></pre>
	440	The inner <tt>do</tt>-loop ends when <tt>inflate()</tt> has no more output as indicated
[8198f69]	441	by not filling the output buffer, just as for <tt>deflate()</tt>. In this case, we cannot
	442	assert that <tt>strm.avail_in</tt> will be zero, since the deflate stream may end before the file
	443	does.
[959f7df2]	444	<pre><b>
	445	        } while (strm.avail_out == 0);
	446	</b></pre><!-- -->
	447	The outer <tt>do</tt>-loop ends when <tt>inflate()</tt> reports that it has reached the
	448	end of the input <em>zlib</em> stream, has completed the decompression and integrity
	449	check, and has provided all of the output. This is indicated by the <tt>inflate()</tt>
	450	return value <tt>Z_STREAM_END</tt>. The inner loop is guaranteed to leave <tt>ret</tt>
	451	equal to <tt>Z_STREAM_END</tt> if the last chunk of the input file read contained the end
	452	of the <em>zlib</em> stream. So if the return value is not <tt>Z_STREAM_END</tt>, the
	453	loop continues to read more input.
	454	<pre><b>
	455	        /* done when inflate() says it's done */
	456	    } while (ret != Z_STREAM_END);
	457	</b></pre><!-- -->
	458	At this point, decompression successfully completed, or we broke out of the loop due to no
	459	more data being available from the input file. If the last <tt>inflate()</tt> return value
	460	is not <tt>Z_STREAM_END</tt>, then the <em>zlib</em> stream was incomplete and a data error
	461	is returned. Otherwise, we return with a happy return value. Of course, <tt>inflateEnd()</tt>
	462	is called first to avoid a memory leak.
	463	<pre><b>
	464	    /* clean up and return */
	465	    (void)inflateEnd(&strm);
	466	    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
	467	}
	468	</b></pre><!-- -->
	469	That ends the routines that directly use <em>zlib</em>. The following routines make this
	470	a command-line program by running data through the above routines from <tt>stdin</tt> to
	471	<tt>stdout</tt>, and handling any errors reported by <tt>def()</tt> or <tt>inf()</tt>.
	472	<p>
	473	<tt>zerr()</tt> is used to interpret the possible error codes from <tt>def()</tt>
	474	and <tt>inf()</tt>, as detailed in their comments above, and print out an error message.
	475	Note that these are only a subset of the possible return values from <tt>deflate()</tt>
	476	and <tt>inflate()</tt>.
	477	<pre><b>
	478	/* report a zlib or i/o error */
	479	void zerr(int ret)
	480	{
	481	    fputs("zpipe: ", stderr);
	482	    switch (ret) {
	483	    case Z_ERRNO:
	484	        if (ferror(stdin))
	485	            fputs("error reading stdin\n", stderr);
	486	        if (ferror(stdout))
	487	            fputs("error writing stdout\n", stderr);
	488	        break;
	489	    case Z_STREAM_ERROR:
	490	        fputs("invalid compression level\n", stderr);
	491	        break;
	492	    case Z_DATA_ERROR:
	493	        fputs("invalid or incomplete deflate data\n", stderr);
	494	        break;
	495	    case Z_MEM_ERROR:
	496	        fputs("out of memory\n", stderr);
	497	        break;
	498	    case Z_VERSION_ERROR:
	499	        fputs("zlib version mismatch!\n", stderr);
	500	    }
	501	}
	502	</b></pre><!-- -->
	503	Here is the <tt>main()</tt> routine used to test <tt>def()</tt> and <tt>inf()</tt>. The
	504	<tt>zpipe</tt> command is simply a compression pipe from <tt>stdin</tt> to <tt>stdout</tt>, if
	505	no arguments are given, or it is a decompression pipe if <tt>zpipe -d</tt> is used. If any other
	506	arguments are provided, no compression or decompression is performed. Instead a usage
	507	message is displayed. Examples are <tt>zpipe < foo.txt > foo.txt.z</tt> to compress, and
	508	<tt>zpipe -d < foo.txt.z > foo.txt</tt> to decompress.
	509	<pre><b>
	510	/* compress or decompress from stdin to stdout */
	511	int main(int argc, char **argv)
	512	{
	513	    int ret;
	514
[f1c8de9]	515	    /* avoid end-of-line conversions */
	516	    SET_BINARY_MODE(stdin);
	517	    SET_BINARY_MODE(stdout);
	518
[959f7df2]	519	    /* do compression if no arguments */
	520	    if (argc == 1) {
	521	        ret = def(stdin, stdout, Z_DEFAULT_COMPRESSION);
	522	        if (ret != Z_OK)
	523	            zerr(ret);
	524	        return ret;
	525	    }
	526
	527	    /* do decompression if -d specified */
	528	    else if (argc == 2 && strcmp(argv[1], "-d") == 0) {
	529	        ret = inf(stdin, stdout);
	530	        if (ret != Z_OK)
	531	            zerr(ret);
	532	        return ret;
	533	    }
	534
	535	    /* otherwise, report usage */
	536	    else {
	537	        fputs("zpipe usage: zpipe [-d] < source > dest\n", stderr);
	538	        return 1;
	539	    }
	540	}
	541	</b></pre>
	542	<hr>
[f1c8de9]	543	<i>Copyright (c) 2004, 2005 by Mark Adler<br>Last modified 11 December 2005</i>
[959f7df2]	544	</body>
	545	</html>
