From a conceptual modeling point of view, it is important to understand that:
(1) There can be several levels of entity identifiers, that is, the same entity exists at several levels and has a different identifier at each. Example: a computer has a DNS name, an IP address, and a MAC address; a person may have several identifiers. These (initially independent) identifier spaces can be structured differently:
(1.1) Layered structure like DNS and IP
(1.2) Independent, e.g., a person has two passports from different countries
(2) Links (representing relationships) are attributes. Like all attributes, they are functions that map input entity identifiers to identifiers in an output (address) space.
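The two points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every identifier value and every name here (dns_to_ip, ip_to_mac, resolve, owner_of, owner) is hypothetical and chosen for the example, not taken from any real system.

```python
# Layered identifier spaces for one computer (all values are made up).
dns_to_ip = {"host.example.com": "192.0.2.10"}       # DNS name -> IP address
ip_to_mac = {"192.0.2.10": "00:00:5e:00:53:01"}      # IP address -> MAC address

# Independent identifier spaces: two passports identify the same person,
# and neither passport number can be derived from the other.
passport_a_to_person = {"A-1234": "person-1"}
passport_b_to_person = {"B-9876": "person-1"}


def resolve(name, *layers):
    """Follow a chain of layer mappings from the highest level downwards."""
    current = name
    for layer in layers:
        current = layer[current]
    return current


# A link is an attribute: a function from an identifier in one space
# (computers, by DNS name) to an identifier in another space (people).
owner_of = {"host.example.com": "person-1"}


def owner(computer_name):
    """Return the identifier of the person who owns the given computer."""
    return owner_of[computer_name]
```

Here resolve("host.example.com", dns_to_ip, ip_to_mac) walks the layered structure down to the MAC address, while the two passport tables show independent spaces that merely happen to point at the same entity.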
Taking this into account, the problem is that URLs frequently play two roles. One address convention is used at two layers simultaneously, which can lead to numerous problems and controversies:
o URLs are used at the access (protocol) level, where the goal is to identify computers and access paths within some service (and not an entity). For example, this layer has no idea what http://service.com/passports/1234 means; it uses the URL for HTTP access only.
o URLs are used to identify real entities. Here we re-use the previous naming convention for higher-level purposes, identifying people with URLs like http://service.com/passports/1234, although we do not care about protocols or computer names.
To avoid serious problems, we need to follow these steps:
* Recognize and document that there are two independent address spaces.
* Define a mapping between these two spaces. Initially, it can be an identity relationship, that is, every entity name (URL) is mapped to the same URL used for access.
* If necessary (and this is frequently highly desirable), use relative names, assuming that the higher-level segments of the name can be restored from the context or configured separately.
These rules are especially important for extensibility and for complex, evolving systems where standards and conventions change over time.
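The three steps can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the constants ENTITY_BASE and ACCESS_BASE, the functions entity_name and to_access_url, and the alternative host https://api.service.example/v2/ are all assumptions introduced for the example. Only urljoin comes from the Python standard library.

```python
from urllib.parse import urljoin

# Step 1: two independent address spaces, each with its own configured base.
# Both bases are assumptions; initially they happen to be the same string.
ENTITY_BASE = "http://service.com/"   # space of entity names
ACCESS_BASE = "http://service.com/"   # space of HTTP access URLs


def entity_name(relative_name, entity_base=ENTITY_BASE):
    """Step 3: restore a full entity name from a relative name and its context."""
    return urljoin(entity_base, relative_name)


def to_access_url(name, entity_base=ENTITY_BASE, access_base=ACCESS_BASE):
    """Step 2: map an entity name to the URL used to access it.

    With equal bases this is the identity mapping; changing only
    access_base relocates the service without renaming any entity.
    """
    if not name.startswith(entity_base):
        raise ValueError(f"{name!r} is not in the entity space {entity_base!r}")
    relative = name[len(entity_base):]
    return urljoin(access_base, relative)
```

With the default configuration, to_access_url("http://service.com/passports/1234") returns the same URL. If the service later moves, passing access_base="https://api.service.example/v2/" changes only the access URL, and the entity name stays stable.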