The Dictionary Problem


codegoboom
03-22-2005, 05:21 PM
What's a good data structure for looking up definitions by category as well as alphabetically?

Tabular, it would seem, but a giant table would just be huge! :D

gsnedders
03-22-2005, 05:53 PM
How about something like:
<h2>Web Stuff</h2>
<dl>
<dt>CSS</dt>
<dd>Cascading Style Sheets. Style sheets attached to a document describe how it is displayed or printed; e.g. a CSS sheet attached to an HTML document influences its layout when viewed in a browser. CSS supports cascading, i.e. a single document may use two or more style sheets, which are then applied according to specified priorities.</dd>
<dt>XHTML</dt>
<dd>Extensible Hypertext Markup Language - A reformulation of HTML 4.0 in XML 1.0. XHTML is a new language for building web pages that has recently been proposed as a W3C Recommendation. This proposed Recommendation caused lots of debate on account of XHTML's usage of XML namespaces.</dd>
</dl>
<h2>Programming Languages</h2>
<dl>
<dt>C++</dt>
<dd>An industry standard object-oriented compiled language, formally standardized in 1998, but tracing its history to the early 1980s, with a heritage in C and Simula. C++ is a general-purpose programming language with a bias towards systems programming. C++ runs on most computers, from the most powerful supercomputers to the ubiquitous personal computers.</dd>
<dt>Java</dt>
<dd>Developed by Sun Microsystems, Java is a programming language that is specifically designed for writing programs that can be safely downloaded to your computer through the Internet and immediately run without fear of viruses or other harm to your computer or files. Using small Java programs (called "Applets"), Web pages can include functions such as animations, calculators, and other fancy tricks. Java is a simple, robust, object-oriented, platform-independent, multi-threaded, dynamic general-purpose programming environment. It is best for creating applets and applications for the Internet, intranets and any other complex, distributed network.</dd>
</dl>


I think you get the idea :D

codegoboom
03-22-2005, 06:18 PM
Thanks. :)

What about "looking up" in terms of searching, or updating (as there may be thousands of listings)... anyone been there, done that?
...or heard of it? :D

gsnedders
03-22-2005, 08:13 PM
Database?

codegoboom
03-23-2005, 12:56 PM
Just wondering how to avoid loading all data, without splitting up its storage, for the most part (sorting aside); whatever structure is good for that, I'm not sure.

not a quiz... ;)

ronaldb66
03-23-2005, 01:21 PM
I think a table (as in data structure, not <table>) would offer the best possibilities; lay an index both on description (alphabetic) and on category, and offer pages to display the content using either entry.
Come to think of it: if you want to assign multiple categories to a single definition (and you probably do), you're probably looking at three tables: one for the definitions, one for definition-category key pairs, and one for the categories themselves. Unless you're willing to put up with some redundancy, of course.
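
For what it's worth, here is a rough sketch of that three-table layout using Python's built-in sqlite3 module. The table, column, and function names are all made up for illustration, and it assumes each term appears only once; treat it as one possible arrangement rather than the way to do it.

```python
import sqlite3

def build_glossary(conn):
    """Create the three tables: definitions, categories, and the key pairs linking them."""
    conn.executescript("""
        CREATE TABLE definitions (
            id          INTEGER PRIMARY KEY,
            -- UNIQUE also gives an index on term, which serves alphabetical lookups
            term        TEXT NOT NULL UNIQUE COLLATE NOCASE,
            description TEXT NOT NULL
        );
        CREATE TABLE categories (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE definition_category (
            definition_id INTEGER NOT NULL REFERENCES definitions(id),
            category_id   INTEGER NOT NULL REFERENCES categories(id),
            PRIMARY KEY (definition_id, category_id)
        );
        -- Second index so "everything in category X" does not scan all pairs
        CREATE INDEX idx_pairs_by_category ON definition_category(category_id);
    """)

def add_definition(conn, term, description, category_names):
    """Insert one definition and link it to any number of categories."""
    cur = conn.execute(
        "INSERT INTO definitions (term, description) VALUES (?, ?)",
        (term, description),
    )
    definition_id = cur.lastrowid
    for name in category_names:
        conn.execute("INSERT OR IGNORE INTO categories (name) VALUES (?)", (name,))
        category_id = conn.execute(
            "SELECT id FROM categories WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO definition_category VALUES (?, ?)",
            (definition_id, category_id),
        )

def terms_in_category(conn, name):
    """Return the terms filed under one category, in alphabetical order."""
    rows = conn.execute(
        """
        SELECT d.term
        FROM definitions d
        JOIN definition_category dc ON dc.definition_id = d.id
        JOIN categories c ON c.id = dc.category_id
        WHERE c.name = ?
        ORDER BY d.term
        """,
        (name,),
    )
    return [row[0] for row in rows]

def terms_alphabetical(conn):
    """Return every term in alphabetical order, regardless of category."""
    return [row[0] for row in conn.execute("SELECT term FROM definitions ORDER BY term")]
```

Only the rows a query asks for are read, so thousands of listings should not need to be loaded at once; a definition in two categories costs one extra pair row rather than a duplicated description.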

codegoboom
03-23-2005, 02:49 PM
That sounds good, but what's getting me stuck is that both the 'terms' and 'descriptions' are within a single text file (with non-standard mixtures of markup & delimiting), and apparently there's no way to get "substreams" out of the source, so tabulating all (tons) of the data would be necessary, unless each definition had its own file. But then updates/corrections to the original text (which is maintained externally) would be difficult to transfer among multiple texts, so keeping the data together while exposing it selectively would be ideal. So if a different structure/format would work better for this, maybe I can transform the source rather than splitting it, and apply updates more reliably that way. :D

ronaldb66
03-23-2005, 02:57 PM
How about storing only links in said tables that lead to the desired snippet of the main text file via anchors? That would require the text file to be marked up, but if you want to make it available on the web that sounds like a requirement anyway (although you didn't say you wanted so initially... ;) )
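
If the "links" were byte offsets instead of HTML anchors, the idea might look something like the sketch below. It assumes, purely for illustration, that the main file holds one entry per line as term, a tab, then the description; the real file's delimiting is different, so the parsing would need adapting. The function names are hypothetical too.

```python
def build_offset_index(path):
    """Scan the file once and map each term to the (offset, length) of its line in bytes."""
    index = {}
    offset = 0
    with open(path, "rb") as f:
        for line in f:
            # Assumed layout: b"term\tdescription\n"
            term = line.split(b"\t", 1)[0].decode("utf-8")
            index[term] = (offset, len(line))
            offset += len(line)
    return index

def read_entry(path, index, term):
    """Seek straight to one entry and read only its bytes, not the whole file."""
    offset, length = index[term]
    with open(path, "rb") as f:
        f.seek(offset)
        raw = f.read(length)
    _term, description = raw.decode("utf-8").rstrip("\r\n").split("\t", 1)
    return description
```

The index is small compared to the text and can be rebuilt with one pass whenever the externally maintained file changes, so the source stays in one piece.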

codegoboom
03-23-2005, 03:37 PM
It's not that kind of markup, and anyway, that type of thing probably wouldn't do--unless I could figure out how not to load the whole main text (offline, as it were), by extracting partial data from a mini-stream of some sort... ah well it looks like an XML interface might allow for that (even though it hints otherwise)! ;)
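
In case it helps anyone landing here later: one way an XML interface can avoid holding the whole text in memory is incremental parsing. The sketch below uses iterparse from Python's standard xml.etree.ElementTree and assumes the source has been transformed into entry elements with a term attribute, which is a made-up layout and not the actual file format.

```python
import xml.etree.ElementTree as ET

def find_definition(source, wanted_term):
    """Stream through the XML and return the text of the first matching entry, or None.

    `source` may be a filename or a binary file object.
    Assumed (hypothetical) layout: <glossary><entry term="...">text</entry>...</glossary>
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "entry":
            if elem.get("term") == wanted_term:
                return elem.text  # stop early; the rest of the file is never parsed
            # Drop the text and attributes of entries we do not need
            # (the emptied element itself stays attached to the root)
            elem.clear()
    return None
```

This still reads the file sequentially up to the match, so it saves memory rather than time; combining it with an index would be needed for fast random access.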