Encoding error in GIF LZW compressed stream

I'm writing a C library to export SDL_Surfaces to various image formats as an exercise. Now I'm working on the GIF format and I feel I'm very close to getting it working, but I've been at this for a while with no luck. I've already re-read all the specs, wiki entries, various other internet articles on the subject and tried to debug it in all sorts of ways. My implementation is a modified version of this one.
The specs I've been using are here. Appendix F contains all the details of how the LZW compressed byte stream is created.
My current problem is writing the GIF LZW compressed image data sub-blocks. I've tried different files to see if that had anything to do with it. Everything goes smoothly until some position (usually in the first sub-block) where a byte differs from the original file. Sometimes the stream 'syncs' up again (but then becomes a mess later on) and sometimes it doesn't.
I changed the way I pack bits into the compressed LZW byte stream (in LZW_PackBits) to a more efficient one, but since that didn't change the output file, I think my error(s) lie somewhere in the dictionary reading/creation (or the error is still present in the bit-packing).
I realize this question strongly resembles a "fix my code for me" question, but I have tried so many things with no luck, and I really need some help or suggestions. I'll be happy to explain anything if necessary, since I assume few people have actually worked with the GIF version of LZW.
The code has been stripped of debugging stuff and all the code that writes the headers, color tables etc.
#include "SDL_iw.h"
#include <stdlib.h> /* malloc, free */
#include <string.h> /* memset */
/* third header name missing in the original post */

#define LZW_START_BITS 9
#define LZW_MAX_BITS 12
#define LZW_ALPHABET_SIZE 256
#define LZW_CLEAR_CODE 256
#define LZW_END_CODE 257
#define LZW_FLUSH 1 << LZW_MAX_BITS // Invalid code signals to flush the LZW byte buffer
#define BUFFER_SIZE 255

typedef struct {
    Uint16 next[LZW_ALPHABET_SIZE];
} lzw_entry;

static Uint8 *buffer = NULL;

/* Packs an LZW codepoint of 'bits' length into a buffer as an LZW byte stream */
int LZW_PackBits(SDL_RWops *dst, Uint16 codepoint, Uint8 bits) {
    static Uint32 bit_count = 0;
    static Uint32 byte_count = 0;
    static Uint32 bit_reservoir = 0;

    if (codepoint == LZW_FLUSH && byte_count > 0) {
        SDL_RWwrite(dst, &byte_count, 1, 1);
        SDL_RWwrite(dst, buffer, 1, byte_count);
        memset(buffer, byte_count, byte_count);
        byte_count = 0;
    } else {
        if (bits < LZW_START_BITS || bits > LZW_MAX_BITS) {
            IW_SetError("Bit length %d out of bounds in LZW_PackBits", bits);
            return 0;
        }
        bit_reservoir |= codepoint << bit_count;
        bit_count += bits;
        while (bit_count >= 8) {
            buffer[byte_count] = bit_reservoir;
            ++byte_count;
            bit_count -= 8;
            bit_reservoir >>= 8;
            if (byte_count == 255) {
                SDL_RWwrite(dst, &byte_count, 1, 1);
                SDL_RWwrite(dst, buffer, 1, byte_count);
                memset(buffer, 0, byte_count);
                byte_count = 0;
            }
        }
    }
    return 1;
}

int IW_WriteGIF_RW(SDL_Surface *surface, SDL_RWops *dst, int freedst) {
    // Header, color table etc. are written here...

    const Uint8 bpp = 8;
    const Uint16 clear_code = LZW_CLEAR_CODE;
    const Uint16 end_code = LZW_END_CODE;
    const Uint8 zero_byte = 0;

    SDL_RWwrite(dst, &bpp, 1, 1);

    int table_size = LZW_FLUSH;
    lzw_entry *lzw_table = (lzw_entry*)malloc(sizeof(lzw_entry) * table_size);
    if (!lzw_table) {
        IW_SetError("Out of memory: Failed to allocate LZW table\n");
        goto done;
    }
    for (i = 0; i < table_size; ++i) // i declared earlier...
        memset(&lzw_table[i].next[0], 0, sizeof(Uint16) * LZW_ALPHABET_SIZE);

    buffer = (Uint8*)malloc(BUFFER_SIZE);
    if (!buffer) {
        IW_SetError("Out of memory: Failed to allocate byte buffer\n");
        goto done;
    }
    memset(buffer, 0, BUFFER_SIZE);

    Uint16 next_code = LZW_END_CODE + 1;
    Uint8 out_len = LZW_START_BITS;
    Uint8 next_byte = 0;
    Uint16 input = 0;
    Uint16 nc = 0;

    // Output a clear code
    LZW_PackBits(dst, clear_code, out_len);

    if (SDL_MUSTLOCK(surface))
        SDL_LockSurface(surface);

    Uint8 *pos = (Uint8*)surface->pixels;
    Uint8 *end = (Uint8*)(surface->pixels + surface->pitch * surface->h);

    input = *pos++;
    while (pos < end) {
        next_byte = *pos++;
        nc = lzw_table[input].next[next_byte];
        if (nc > 0) {
            input = nc;
        } else {
            LZW_PackBits(dst, input, out_len);
            lzw_table[input].next[next_byte] = next_code++;
            input = next_byte;
            if (next_code == (1 << out_len)) {
                // Next code requires more bits
                ++out_len;
                if (out_len > LZW_MAX_BITS) {
                    LZW_PackBits(dst, clear_code, out_len - 1);
                    out_len = LZW_START_BITS;
                    next_code = LZW_END_CODE + 1;
                    for (i = 0; i < table_size; ++i)
                        memset(&lzw_table[i].next[0], 0, sizeof(Uint16) * LZW_ALPHABET_SIZE);
                }
            }
        }
    }

    if (SDL_MUSTLOCK(surface))
        SDL_UnlockSurface(surface);

    // Pack remaining data and end-of-data marker
    LZW_PackBits(dst, input, out_len);
    LZW_PackBits(dst, end_code, out_len);
    LZW_PackBits(dst, LZW_FLUSH, 0);
    SDL_RWwrite(dst, &zero_byte, 1, 1);

    // Write trailer
    const Uint8 trailer = 0x3b; // ';'
    SDL_RWwrite(dst, &trailer, 1, 1);

done:
    free(buffer);
    free(lzw_table);
    if (freedst)
        SDL_RWclose(dst);
    return 1;
}
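
For comparison with LZW_PackBits above, here is a minimal, self-contained sketch of LSB-first code packing into length-prefixed sub-blocks of at most 255 bytes, as Appendix F describes. It writes to a fixed in-memory buffer rather than an SDL_RWops, and the names bit_packer, packer_init, packer_put_code and packer_finish are hypothetical, not part of SDL or of the library in the question. It is a sketch under those assumptions, not a drop-in fix. One detail it makes explicit is that any bits still held in the accumulator are written out, zero-padded, before the last sub-block and the zero-length terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory sink; not part of SDL or the library in the question. */
typedef struct {
    uint8_t  out[4096];   /* finished sub-blocks: <length byte><data>... */
    size_t   out_len;
    uint8_t  block[255];  /* sub-block currently being filled */
    size_t   block_len;
    uint32_t acc;         /* bit accumulator, filled from the low end */
    unsigned nbits;       /* number of valid bits in acc (always < 8 between calls) */
} bit_packer;

static void packer_init(bit_packer *p) { memset(p, 0, sizeof *p); }

/* Emit the current sub-block as <length byte><data>, if it is non-empty. */
static void packer_emit_block(bit_packer *p)
{
    if (p->block_len == 0)
        return;
    assert(p->out_len + 1 + p->block_len <= sizeof p->out);
    p->out[p->out_len++] = (uint8_t)p->block_len; /* exactly one length byte */
    memcpy(p->out + p->out_len, p->block, p->block_len);
    p->out_len += p->block_len;
    p->block_len = 0;
}

/* Append one data byte; a full sub-block (255 bytes) is emitted immediately. */
static void packer_put_byte(bit_packer *p, uint8_t b)
{
    p->block[p->block_len++] = b;
    if (p->block_len == 255)
        packer_emit_block(p);
}

/* Append the low 'width' bits of 'code', least significant bit first. */
static void packer_put_code(bit_packer *p, uint16_t code, unsigned width)
{
    assert(width >= 1 && width <= 12);
    assert(code < (1u << width));
    p->acc |= (uint32_t)code << p->nbits;
    p->nbits += width;
    while (p->nbits >= 8) {
        packer_put_byte(p, (uint8_t)(p->acc & 0xFF));
        p->acc >>= 8;
        p->nbits -= 8;
    }
}

/* Write leftover bits (zero-padded), the last sub-block, and the terminator. */
static void packer_finish(bit_packer *p)
{
    if (p->nbits > 0) {
        packer_put_byte(p, (uint8_t)(p->acc & 0xFF));
        p->acc = 0;
        p->nbits = 0;
    }
    packer_emit_block(p);
    assert(p->out_len < sizeof p->out);
    p->out[p->out_len++] = 0; /* zero-length block terminator */
}
```

Packing the 9-bit codes 256, 40 and 257 and then calling packer_finish yields the bytes 04 00 51 04 04 00: one sub-block of four data bytes followed by the terminator.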

Posted On: Friday 7th of December 2012 08:41:30 PM Total Views:  3681