Database

Jørgen Andreas Bangor jorgenb at ifi.uio.no
Fri Oct 21 02:12:49 CET 1994


You can count me in too on the discussion of the Disney
comics Database. I might not contribute much to the 
discussion, but I will certainly be interested in seeing
what you are doing. 

Since I am working on this WWW index, I have been thinking
about what format I should use. I have thought up something, 
but after having taken a look at the index, I think I like
yours better than mine.

What I was thinking of was to give each issue a unique
code, which I see you have done too, and then put the information
about stories, titles, writers/artists etc. into separate files,
one for each kind of information.

Ex: Donald Duck & Co      (dummy examples)

2000  D88215   H8000  S2600 I1500               
2001  H88515   D2100  ... ... 
etc.


Writers

D88215   Day of the Triplets  W:John Anderson  A:?
D88216   ....
etc.


Published

D88215     2000                    (published in no. 2000)
D88216 ...
etc.


Issues

2000     no. 15  12. april 1994
2001     no. 16  19. april 1994
etc.


Then it would be easy to make a script that gets the information
the user wants from these files. The files alone won't be very
human-readable though... and with this many files, filled with
a lot of information, it could be argued that it would be a rather
big job for the computer (but with SPARC-20's who cares...).
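
A minimal sketch of such a script, in Python, using only the dummy
lines from the examples above. The field layout is an assumption:
fields are taken to be separated by two or more spaces, and the
function names (parse_contents, parse_stories, parse_issues,
describe_issue) are hypothetical, not part of any existing index.

```python
import re

# Dummy data copied from the examples above; real files would be read from disk.
CONTENTS = """\
2000  D88215   H8000  S2600 I1500
"""
STORIES = """\
D88215   Day of the Triplets  W:John Anderson  A:?
"""
ISSUES = """\
2000     no. 15  12. april 1994
2001     no. 16  19. april 1994
"""


def parse_contents(text):
    """Map each issue code to the list of story codes it contains."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if fields:
            table[fields[0]] = fields[1:]
    return table


def parse_stories(text):
    """Map each story code to its title and credits (W = writer, A = artist)."""
    table = {}
    for line in text.splitlines():
        # Assumption: fields are separated by at least two spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 2:
            continue
        code, title, *credits = fields
        info = {"title": title}
        for credit in credits:
            key, _, value = credit.partition(":")
            info[key] = value
        table[code] = info
    return table


def parse_issues(text):
    """Map each issue code to its (number, date) pair."""
    table = {}
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) == 3:
            table[fields[0]] = (fields[1], fields[2])
    return table


def describe_issue(issue_code):
    """Return human-readable lines describing one issue and its stories."""
    number, date = parse_issues(ISSUES)[issue_code]
    stories = parse_stories(STORIES)
    lines = [f"{number}, {date}"]
    for code in parse_contents(CONTENTS).get(issue_code, []):
        info = stories.get(code)
        if info is None:
            lines.append(f"  {code}: (no details on file)")
        else:
            lines.append(
                f"  {code}: {info['title']} "
                f"(writer: {info.get('W', '?')}, artist: {info.get('A', '?')})"
            )
    return lines


print("\n".join(describe_issue("2000")))
```

The "Published" file is simply the contents table turned around
(story code to issue code), so a script like this could build it
on the fly instead of keeping it as a separate file.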

Well, I might just translate the Swedish files into Norwegian.

About transferring files over the net: there is no use in trying
to send Latin-1 (8-bit) text by mail. SMTP (the mail protocol) is
built for 7-bit transfer. A lot of smart people (and less smart
ones) have done a lot of different things to solve this problem,
and that's the real problem: there is no single agreed standard.
If you look closely at the headers of this mail (if they are still
there when it reaches you) you might see something like 'J=F8rgen'.
'=F8' is MIME's way of sending 'ø' (o with slash) over the net as
7-bit. When the mail reaches a place that uses MIME, it will be
restored to 'ø'.
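
To make the '=F8' business concrete, here is a small Python sketch
using the standard quopri and email.header modules. MIME calls this
encoding quoted-printable; in a header the encoded text is wrapped
as an "encoded-word" that also names the character set. The header
string below is an illustrative assumption, not copied from a real
message.

```python
import quopri
from email.header import decode_header, make_header

name = "Jørgen"

# In Latin-1 the letter 'ø' is the single byte 0xF8, which is not 7-bit safe.
latin1_bytes = name.encode("latin-1")

# Quoted-printable replaces each 8-bit byte with '=' plus two hex digits.
encoded = quopri.encodestring(latin1_bytes)

# A MIME-aware receiver reverses the process.
decoded = quopri.decodestring(b"J=F8rgen").decode("latin-1")

# In headers the same encoding appears inside an encoded-word that
# carries the charset name (hypothetical example header value).
header_value = str(make_header(decode_header("=?iso-8859-1?Q?J=F8rgen?=")))

print(encoded, decoded, header_value)
```

A receiver that does not know MIME just shows the raw 'J=F8rgen',
which is ugly but at least survives the 7-bit transport intact.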

The simplest solution to this is to use uuencode, as Fredrik
suggested, or to put the files on an FTP site and set FTP to
binary (8-bit) mode.
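
As a rough illustration of why uuencode helps, the Python sketch
below encodes some Latin-1 text with the standard binascii module
and checks that every output byte fits in 7 bits. It produces only
the body lines; the real uuencode program also adds a "begin" line
with a file mode and name, and an "end" line. The helper names
uu_lines and uu_decode and the sample payload are made up for this
example.

```python
import binascii


def uu_lines(data: bytes):
    """Encode data as uuencoded body lines (at most 45 input bytes per line)."""
    return [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]


def uu_decode(lines):
    """Decode uuencoded body lines back into the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)


# Sample 8-bit payload: Latin-1 text containing 'ø' (byte 0xF8).
payload = ("Jørgen: Donald Duck & Co\n" * 4).encode("latin-1")

lines = uu_lines(payload)

# Every byte of the encoded form is plain 7-bit ASCII...
seven_bit_safe = all(byte < 128 for line in lines for byte in line)

# ...and decoding gives back exactly the original 8-bit data.
restored = uu_decode(lines)

print(seven_bit_safe, restored == payload)
```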

In spite of the problems with 8-bit characters, I would still suggest
that they should be used. If this is really going to be a "universal"
database, we had better use them from the beginning. Most of the
languages that would probably appear in the database need them.

Just my opinions :-)



   Jorgen A. Bangor (jorgenb at ifi.uio.no)


