In the past, I’ve always hacked my own XML output functions. The result wasn’t always good XML, and it took a lot of fprintf()-massaging.
Then I needed a DOM parser for C (not C++ or C#), and the only one I really liked was libxml. It’s got the proper license for me to use it, it’s simple to use, and it has both DOM and SAX parsers.
Here’s a libxml example of how to make your own XML output, taken from the Eressea II sources (you’ll find examples for making your own parser everywhere):
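#include <libxml/tree.h>

xmlChar* xml_s(const char * str); /* UTF-8 conversion helper, defined in the iconv example below */

int main(int argc, char** argv) {
    if (argc < 2) return 1; /* need an output filename */
    xmlDocPtr doc = xmlNewDoc(BAD_CAST "1.0");
    xmlNodePtr node = xmlNewNode(NULL, BAD_CAST "eressea");
    xmlNewProp(node, BAD_CAST "game", xml_s("Ümläutß"));
    /* "child" is a placeholder: the element name is cut off in the original listing */
    xmlAddChild(node, xmlNewNode(NULL, BAD_CAST "child"));
    xmlDocSetRootElement(doc, node);
    xmlKeepBlanksDefault(0);
    xmlSaveFormatFile(argv[1], doc, 1); /* 1 = indent the output */
    xmlFreeDoc(doc);
    return 0;
}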
That BAD_CAST is just a macro that casts char* to xmlChar* without converting anything, and you write it whenever you think that your input is already good UTF-8 and are too lazy to convert. Please see Joel’s article on Unicode first. For places where I don’t have that guarantee, my code uses iconv, a character conversion library, to convert the internal char* to UTF-8. Here’s an iconv example for the xml_s function used above:
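#include <iconv.h>
#include <stdio.h>
#include <string.h>
#include <libxml/xmlstring.h>

iconv_t utf8;

xmlChar* xml_s(const char * str)
{
    static char buffer[1024]; /* it's enough */
    char * inbuf = (char *)str; /* POSIX iconv() takes char**, but only reads the input */
    char * outbuf = buffer;
    size_t inbytes = strlen(str)+1, outbytes = sizeof(buffer);
    if (iconv(utf8, &inbuf, &inbytes, &outbuf, &outbytes) == (size_t)-1) {
        return NULL; /* invalid input, or the result did not fit into buffer */
    }
    return (xmlChar*)buffer; /* static storage: overwritten by the next call */
}

int main(int argc, char** argv) {
    const xmlChar * s;
    utf8 = iconv_open("UTF-8", ""); /* "" selects the locale's encoding on glibc and GNU libiconv */
    if (utf8 == (iconv_t)-1) return 1;
    s = xml_s("ä߀");
    if (s) puts((const char*)s);
    iconv_close(utf8);
    return 0;
}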
That’s so much more fun than fprintf-wrangling.