Compression involves some heavy theory, particularly lossy compression, where it can be very difficult to come up with a fast heuristic for deciding what information is acceptable to lose. This page seems to have a good run-down to start with: http://www.data-compression.com/theory.html
Converting between two file formats is much more straightforward. Basically you just need to be able to create arbitrary documents of both types. Represent the same information in both documents manually and look at the saved files to see if you can draw parallels. Make very small changes to each document and use a utility like 'diff' to see how each change affects the output. Eventually you should establish a rough mapping between the two formats, after which it's quite simple to automate the conversion. It'll probably be completely broken for non-trivial cases at first, but unit testing can get you the rest of the way once you've got the basics working.
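To make the "small change, then diff" step concrete, here is a rough sketch in Python using the standard difflib module instead of the command-line 'diff'. The two formats shown (an INI-like one and an XML-like one) and the helper name changed_lines are made up for illustration; in practice you would read the files your real applications saved.

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the lines that were removed ('- ') or added ('+ ')."""
    result = []
    for line in difflib.ndiff(before.splitlines(), after.splitlines()):
        # ndiff also emits unchanged lines ('  ') and hint lines ('? ');
        # keep just the actual removals and additions.
        if line.startswith(("- ", "+ ")):
            result.append(line)
    return result


# Hypothetical format A: the same document saved before and after
# changing only the title.
a_before = "[doc]\ntitle = Hello\nsize = 12\n"
a_after = "[doc]\ntitle = World\nsize = 12\n"

# Hypothetical format B: the same edit made in the other application.
b_before = "<doc>\n  <title>Hello</title>\n  <size>12</size>\n</doc>\n"
b_after = "<doc>\n  <title>World</title>\n  <size>12</size>\n</doc>\n"

for name, before, after in (("A", a_before, a_after), ("B", b_before, b_after)):
    print(f"format {name}:")
    for line in changed_lines(before, after):
        print("   ", line)
```

Putting the two sets of changed lines side by side suggests which field in one format corresponds to which field in the other. This only works directly for text formats; for binary formats you would probably want to compare hex dumps instead.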