This document presents the structure of the original 4D database of the CNRS Ethnomusicology laboratory (LEM).

4D database structure

Diagram showing the different tables and their relations:

[Figure: 4D database diagram]

Data tables

The Ethnomusicology database contains tables with obsolete or obscure names, which also need to be translated from French to English. Additionally, some tables are themselves obsolete, and some contain non-standardized information that we might replace with ISO standards and the like.

old table name | new table name | comment
Alias_Ethnie | ? | These are aliases for the ethnic groups. Not sure what to do with it.
Compt_Ethnie | ? | What's this?
Etat | ? | This is a non-standardized list of countries. We might drop this table. Dublin Core recommends using TGN for geographic coverage. We might also use ISO 3166.
Alias_Etat | ? | This is a list of state aliases. We might drop it.
Compt_Etat | ? | What's this?
Scientifique | scientific_instrument | This is a list of scientific instrument names. Don't know if there's a standard for this.
Vernaculaire | vernacular_instrument | A list of instrument names in native languages.
Form | ? | What's this?

Collection fields

These are fields for the collection table (old table name: Support).

old field name | new field name | Type | Dublin Core mapping | comment
Réf | publisher_reference | String | | This field contains the reference assigned by a publishing company, when the collection has been published. If it hasn't been published, it contains a reference assigned by the LEM, which somewhat duplicates the information contained in the field "Cote". In this latter case, it has the form BM.XXX.XXX, where each X is a digit (note: some published references also start with "BM", but have a different form).
Format | physical_format | Enumeration | | Format of the original physical support; historical data. Doesn't really appear suitable for Dublin Core's format.
Cote | id | String | identifier | This is a unique identifier for the collection. It is currently of the form "BM.YYY.NNN", where YYY is the year and NNN a serial number. It will evolve to CNRSMH_I_YYYY_NNN.
Transcript_trad | native_title | String | title | This is either the translation of the native title, or its phonetic transcription. We might simply treat it as an alternative title; this is Dublin Core compliant, since a resource can have several titles.
Nb_de_pieces | physical_items_num | Integer | | If the collection is on magnetic tape, this is the number of tapes. If it is on a disk, it is the number of tracks. This is historical data. We can keep it, but will also display a more reliable computed number of related items.
Réédition | publishing_status | Enumeration | | Whether the collection was fully or partially published, re-edited, or not.
Original | is_original | Boolean | | Whether this is a genuine/original collection, as opposed to a copied one.
Copie_TotPartie | is_full_copy | Boolean | | If this is not an original collection (see the is_original field), whether the collection was partially or fully copied. Note: this field is used for only 2 records out of more than 4000.
Copié_de | copied_from | Foreign key (recursive) | relation | Indicates which collection this collection has been (fully or partially) copied from.
Auteur_compil | creator | String | creator | The person who created this collection, by choosing which items to group together.
Auteur_notice | booklet_writer | String | contributor | The person who wrote the disk booklet (only for disk booklets?). Empty if the collection is not published.
Notice | booklet_description | Text | | Description of the technical documentation (= disk booklet?): its number of pages, languages, etc.
Collecteur | collector | String | contributor | The person who actually recorded the audio items. If this is the same person as the creator (which is often the case), the field contains "=".
Editeur | publisher | Enumeration | publisher | This is the external publishing company. If the collection is not published, the field might contain mixed data: "Original*", "Copie*", etc. This extra data is redundant with the is_original and is_full_copy fields, and should/might be removed.
Année parution | date_published | Integer | date | This field contains -1 if the year is unknown, and -2 if the collection hasn't been published. We might map it to a proper SQL DATE field.
Collect_série | publisher_collection | Enumeration | | The name of the publisher-specific "collection".
Num_dans_collec | publisher_serial_id | Integer | | Publisher-specific serial id within its "collection".
mode_acquisition | acquisition_mode | Enumeration | | The way the collection was acquired by the LEM: gift, purchase, etc.
commentaires | comment | Text | | Many things in there...
Redacteur_fiche | record_author | String | | The person who authored this collection's metadata.
Saisie_fiche | record_writer | String | | The person who actually entered this collection's metadata. In Telemeta this field might be automatically filled with the currently logged-in user.
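
The special values mentioned in the comments for "Année parution" (-1 and -2) and "Collecteur" ("=") could be normalized during migration along the following lines. This is only a sketch: the names `PublicationInfo`, `normalize_publication_year` and `resolve_collector` are hypothetical, and so are the choices to represent a missing year as `None` and to treat -1 as "published, year unknown". None of this is part of the 4D schema or of Telemeta.

```python
from dataclasses import dataclass
from typing import Optional

# Sentinel values documented for the old "Année parution" field.
YEAR_UNKNOWN = -1
NOT_PUBLISHED = -2


@dataclass(frozen=True)
class PublicationInfo:
    """Hypothetical normalized form of the old "Année parution" value."""
    published: bool
    year: Optional[int]  # None when the year is unknown or not applicable


def normalize_publication_year(raw: int) -> PublicationInfo:
    """Turn the raw integer (a year, -1 or -2) into explicit fields."""
    if raw == NOT_PUBLISHED:
        return PublicationInfo(published=False, year=None)
    if raw == YEAR_UNKNOWN:
        # Assumption: -1 means the collection was published, year unknown.
        return PublicationInfo(published=True, year=None)
    if raw < 0:
        raise ValueError(f"unexpected sentinel value: {raw}")
    return PublicationInfo(published=True, year=raw)


def resolve_collector(collector: str, creator: str) -> str:
    """Expand the "=" shorthand, which means "same person as the creator"."""
    return creator if collector.strip() == "=" else collector
```

Keeping "published" and "year" as separate values avoids carrying the -1/-2 convention into the new schema, whatever SQL type is eventually chosen for date_published.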

Item fields

These are fields for the item table (old table name: Phono).

old field name | new field name | Type | Dublin Core mapping | comment
Cote_Phono | id | String | identifier | This is a unique identifier for the item. It currently contains the collection id ("BM.YYY.NNN") suffixed with a rather messy serial number followed by track number(s), etc. It will evolve to CNRSMH_I_YYYY_NNN_MMM_TT_PP, where MMM is the number of the original tape, TT the track number and PP the part number. Remark: if some ids do not contain the PP suffix (this is expected), it should be assumed that PP = 01.
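
The target item id format, including the rule that a missing PP suffix means PP = 01, might be parsed as sketched below. The names `ItemId`, `ITEM_ID_RE` and `parse_item_id` are hypothetical. The sketch also assumes that each component has exactly as many digits as there are letters in its placeholder (YYYY = 4, NNN = 3, MMM = 3, TT = 2, PP = 2), which the description above does not state explicitly.

```python
import re
from typing import NamedTuple

# Assumption: every component is purely numeric and fixed-width.
ITEM_ID_RE = re.compile(
    r"^CNRSMH_I_(?P<year>\d{4})_(?P<serial>\d{3})"
    r"_(?P<tape>\d{3})_(?P<track>\d{2})(?:_(?P<part>\d{2}))?$"
)


class ItemId(NamedTuple):
    collection_id: str  # CNRSMH_I_YYYY_NNN
    tape: str           # MMM, number of the original tape
    track: str          # TT, track number
    part: str           # PP, part number ("01" when absent)


def parse_item_id(item_id: str) -> ItemId:
    """Split a new-style item id, defaulting the part number to "01"."""
    match = ITEM_ID_RE.match(item_id)
    if match is None:
        raise ValueError(f"not a valid item id: {item_id!r}")
    collection_id = f"CNRSMH_I_{match['year']}_{match['serial']}"
    return ItemId(
        collection_id=collection_id,
        tape=match["tape"],
        track=match["track"],
        part=match["part"] or "01",
    )
```

For instance, with the made-up id "CNRSMH_I_1970_001_002_03", `parse_item_id` returns the collection id "CNRSMH_I_1970_001" and the part number "01".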


