Database
Jørgen Andreas Bangor
jorgenb at ifi.uio.no
Fri Oct 21 02:12:49 CET 1994
You can count me in too on the discussion of the Disney
comics Database. I might not contribute much to the
discussion, but I will certainly be interested in seeing
what you are doing.
Since I am working on this WWW index, I have been thinking
about what format I should use. I have thought up something,
but after having taken a look at the index, I think I like
yours better than mine.
What I was thinking about was to give each issue a unique
code, which I see you have done too, then put information
about stories, titles, writers/artists etc. into files for
each sort of information.
Ex: Donald Duck & Co (dummy examples)
2000 D88215 H8000 S2600 I1500
2001 H88515 D2100 ... ...
etc.
Writers
D88215 Day of the Triplets W:John Anderson A:?
D88216 ....
etc.
Published
D88215 2000 (published in no. 2000)
D88216 ...
etc.
Issues
2000 no. 15 12. april 1994
2001 no. 16 19. april 1994
etc.
Then it would be easy to make a script that gets the information
the user wants from these files. The files alone won't be very
human-readable though... and with this many files, filled with
a lot of information, it could be argued that it would be a rather
big job for the computer (but with SPARC-20's who cares...).
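A minimal sketch of such a script, in Python, assuming the files look
like the dummy examples above. The file contents are inlined as strings
here, and the names parse_keyed, parse_pairs, parse_story and
stories_in_issue are made up for illustration; this is not an existing
tool, just one way the lookup could go.

```python
# Hypothetical file contents, copied from the dummy examples above.
ISSUES = """\
2000 no. 15 12. april 1994
2001 no. 16 19. april 1994
"""
WRITERS = """\
D88215 Day of the Triplets W:John Anderson A:?
"""
PUBLISHED = """\
D88215 2000
"""


def parse_keyed(text):
    """Map the first field of each line (the code) to the rest of the line."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if line:
            key, _, rest = line.partition(" ")
            table[key] = rest.strip()
    return table


def parse_pairs(text):
    """Read 'story-code issue-code' lines; a story may appear in several issues."""
    return [tuple(line.split()[:2]) for line in text.splitlines() if line.strip()]


def parse_story(rest):
    """Split 'Title W:writer A:artist' into its three parts."""
    title, _, credits = rest.partition(" W:")
    writer, _, artist = credits.partition(" A:")
    return {"title": title, "writer": writer, "artist": artist}


def stories_in_issue(issue_code, published, writers):
    """Return (story-code, story-info) for every story published in one issue."""
    return [
        (story, parse_story(writers[story]))
        for story, issue in published
        if issue == issue_code and story in writers
    ]


issues = parse_keyed(ISSUES)
writers = parse_keyed(WRITERS)
published = parse_pairs(PUBLISHED)

print(issues["2000"])
for story, info in stories_in_issue("2000", published, writers):
    print(story, info["title"], "by", info["writer"])
```

Keeping "Published" as a list of pairs rather than a table keyed on the
story code lets the same story show up in more than one issue.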
Well, I might just translate the Swedish files into Norwegian.
About transferring files over the net: there is no use in trying to
transfer Latin-1 (8-bit) by mail. SMTP (the mail protocol) is
built for 7-bit transfer. A lot of smart people (and less smart ones)
have done a lot of (different) things to solve this problem, and
that's the real problem. There are no standards.
If you look closely at the headers in this mail (if they are still
there when it reaches you) you might see something like 'J=F8rgen'.
'=F8' is MIME's way of sending 'oe' (o with slash, F8 hex in Latin-1)
over the net as 7-bit. When it reaches a place that uses MIME, it
will be restored to 'oe'.
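To see the '=F8' trick in isolation, here is a small Python sketch using
the standard quopri module, which implements MIME's quoted-printable
encoding. It only shows the byte-level encoding of the name; the extra
wrapping that real mail headers put around it is left out, and the
variable names are just for illustration.

```python
import quopri

name = "J\u00f8rgen"                   # "o with slash" is U+00F8
latin1 = name.encode("latin-1")        # one 8-bit byte, 0xF8, for that letter

encoded = quopri.encodestring(latin1)  # 7-bit safe form of the same bytes
print(encoded)                         # b'J=F8rgen'

# A MIME-aware receiver reverses the step and gets the 8-bit text back.
decoded = quopri.decodestring(encoded).decode("latin-1")
print(decoded == name)                 # True
```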
The simplest solution to this is to use uuencode, as Fredrik
suggested, or to put it on an FTP-site, and set FTP to 8-bit mode.
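For the uuencode route, a rough Python sketch of what happens to the
data, using binascii.b2a_uu and binascii.a2b_uu from the standard
library. It leaves out the "begin"/"end" lines that the real uuencode
program adds, and the helper names uuencode_lines and uudecode_lines
are invented here.

```python
import binascii


def uuencode_lines(data):
    """Encode bytes as uuencoded lines; each line carries at most 45 bytes."""
    return [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]


def uudecode_lines(lines):
    """Turn uuencoded lines back into the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)


# Some Latin-1 text containing an 8-bit character, long enough for two lines.
original = ("J\u00f8rgen " * 10).encode("latin-1")

lines = uuencode_lines(original)
restored = uudecode_lines(lines)

print(all(byte < 128 for line in lines for byte in line))  # True: 7-bit only
print(restored == original)                                # True
```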
In spite of the problems with 8-bit characters, I would still suggest
that they should be used. If this is really going to be a "universal"
database, we had better use them from the beginning. Most of the languages
that would probably appear in the database need them.
Just my opinions :-)
Jorgen A. Bangor (jorgenb at ifi.uio.no)