Supplement on EDI, Groupware, DBMS, and TPS



Object Databases

Ordinary databases have records whose fields are numerical or short text items (e.g., age, birth date, address, job title, salary). An object database has records whose fields are more general. For example, a paperless admissions office might be operated by scanning in the submitted application forms and storing those scanned images as graphic image data. The database would have some conventional fields (e.g., name, address, high school, GPA), but would also have some fields whose contents were objects. Behind the scenes, we might find that the graphic image data would be stored in a file of its own, and that the database management system software would really just record the filename of that graphic image data file, but when viewed as a database, the field is the graphic object.

Object databases can be implemented with a wide variety of items for the fields: graphic images, as we have already mentioned; software update installers; complete movies; etc.
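
The "filename behind the scenes" arrangement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table name, the column names, the applicant, and the stand-in image bytes are all hypothetical, and SQLite is used simply because it ships with Python, not because it is an object database.

```python
import os
import sqlite3
import tempfile

# Hypothetical scanned form: a few bytes stand in for real image data.
scan_dir = tempfile.mkdtemp()
scan_path = os.path.join(scan_dir, "applicant_0001.img")
with open(scan_path, "wb") as f:
    f.write(b"\x89FAKE-IMAGE-BYTES")

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE applicants ("
    " name TEXT, high_school TEXT, gpa REAL,"
    " form_image_file TEXT)"  # behind the scenes: only a filename is stored
)
db.execute(
    "INSERT INTO applicants VALUES (?, ?, ?, ?)",
    ("Pat Example", "Central High", 3.7, scan_path),
)

def load_applicant(name):
    """Return the record with the image field presented as the object itself."""
    row = db.execute(
        "SELECT name, high_school, gpa, form_image_file"
        " FROM applicants WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        return None
    found_name, school, gpa, image_file = row
    with open(image_file, "rb") as f:
        image_bytes = f.read()  # the caller sees the object, not the filename
    return {"name": found_name, "high_school": school, "gpa": gpa,
            "form_image": image_bytes}

record = load_applicant("Pat Example")
print(record["name"], record["gpa"], len(record["form_image"]), "bytes")
```

The caller of load_applicant never sees the filename, which is the sense in which the field "is" the graphic object.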



Year 2000 Issues

In dealing with existing systems, not only must the software be changed, but the existing data files will need to be converted, too. This conversion will need to be done with an awareness of any special codes (e.g., all spaces, all zeros, all nines) that may have been used for "unknown" or "last record in file" or other such purposes. The special codes will need to be re-defined, and the values translated.
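
A minimal sketch of such a conversion, in Python, might look like the following. Everything specific here is an assumption made for illustration: the legacy meanings of the special codes, the new codes chosen to replace them, and the pivot year used to pick the century. A real conversion would have to establish each of these from the actual system's documentation and data.

```python
# Assumed legacy conventions (hypothetical; every real file must be checked).
LEGACY_SPECIAL = {
    "  ": "UNKNOWN",      # all spaces meant "unknown"
    "00": "UNKNOWN",      # all zeros also meant "unknown"
    "99": "END_OF_FILE",  # all nines marked the last record in the file
}

# Re-defined special codes for the new four-character year field.
NEW_SPECIAL = {"UNKNOWN": "    ", "END_OF_FILE": "9999"}

PIVOT = 30  # assumed window: 30 and up is 19xx, below 30 is 20xx

def convert_year(two_char):
    """Translate a two-character legacy year field to four characters."""
    if two_char in LEGACY_SPECIAL:
        # Special codes are translated first, never treated as real years.
        return NEW_SPECIAL[LEGACY_SPECIAL[two_char]]
    if len(two_char) != 2 or not two_char.isdigit():
        raise ValueError(f"unexpected year field: {two_char!r}")
    yy = int(two_char)
    century = 1900 if yy >= PIVOT else 2000
    return str(century + yy)

for old in ("72", "05", "00", "99", "  "):
    print(repr(old), "->", repr(convert_year(old)))
```

Note that under these assumed conventions a genuine 1999 date could never have been stored in the old file, which is exactly the kind of discovery the conversion effort has to make.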

Many computer systems, from the personal to the datacenter, provide backup software that will store copies of files that have been changed since the last backup. This obviously requires comparison of date data, and therefore will be sensitive to Y2K problems or other "time bombs."
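
To see why the comparison is sensitive, here is a small Python sketch. The YYMMDD and YYYYMMDD text formats are assumptions chosen for illustration; actual backup software stores dates in many different ways.

```python
def changed_since_backup(file_date, backup_date):
    """Compare two dates stored as digit strings of equal length."""
    return file_date > backup_date

# Two-digit years (YYMMDD): a file changed on January 3, 2000,
# compared with a backup taken on December 31, 1999.
print(changed_since_backup("000103", "991231"))      # False: the file is skipped

# Four-digit years (YYYYMMDD): the same two dates compare correctly.
print(changed_since_backup("20000103", "19991231"))  # True: the file is backed up
```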

The original X12 EDI standard used a two-digit year format! An old EDI implementation is therefore highly suspect.



Telecommuting

Despite their similar sounds, "telecommuting" is a very different concept from "telecommunications." The latter is one of the tools used to accomplish the former.

Although a telecommuter will not have to deal with the distractions of the water cooler and other social interactions at a regular workplace, there are other distractions: letting the dog out, answering the doorbell, etc.

The savings of telecommuting are partially offset by the costs. For example, instead of being able to share a fax machine with others working at the same location, a telecommuter is especially likely to need his or her own fax machine. Also possibly needed, and perhaps costing more when provided at a remote location, are

Face-to-face meetings between telecommuters and regular in-office staff on a regular basis can be a key to sustained success with telecommuting. Depending on the working environment and travel distance, this may be best done with one day every week or month, or with one week every quarter.



Transaction Processing

Audit Trails

In addition to updating the database to reflect the changes constituting and caused by the transaction, it is often necessary to record information showing who did what and when. Such an audit trail is obviously a good idea for any process involving transfers of money, and may be required by law or government regulation. This may happen as a separate step, or it may be part of the mechanisms that are used to permit full recovery of the database in the event of a hardware failure or system crash during an update.
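
As a sketch of the case where the audit record is part of the same unit of work as the update, consider the following Python example. The table layout, account names, and user ID are hypothetical, and SQLite's in-memory database stands in for a real TPS.

```python
import sqlite3
from datetime import datetime, timezone

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
db.execute("CREATE TABLE audit_trail (who TEXT, what TEXT, when_utc TEXT)")
db.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
db.commit()

def balance(account):
    return db.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account,)
    ).fetchone()[0]

def transfer(user, src, dst, amount):
    """Move money and record who did what and when, all or nothing."""
    with db:  # commits on success, rolls back if an exception is raised
        db.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                   (amount, src))
        db.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                   (amount, dst))
        if balance(src) < 0:
            raise ValueError("insufficient funds")
        db.execute(
            "INSERT INTO audit_trail VALUES (?, ?, ?)",
            (user, f"transfer {amount} from {src} to {dst}",
             datetime.now(timezone.utc).isoformat()),
        )

transfer("clerk7", "A", "B", 30)
print(balance("A"), balance("B"))
print(db.execute("SELECT who, what FROM audit_trail").fetchall())
```

Because the audit row is written inside the same transaction, a transfer that fails leaves neither a changed balance nor a misleading audit entry.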


Atomicity

What is "atomicity" all about? An "atomic instruction" on a computer system is one that will be completed in its entirety, or not at all, regardless of intervention by other parts of the system. For example, the instruction to add two integer registers is atomic on almost any CPU, but CPUs that have single machine language instructions to copy a large block of data from one memory address to another will usually permit that process to be interrupted for other activities, such as I/O completion, so that instruction is not atomic.

Why do we care about atomicity? Suppose two customers are looking to reserve a seat on an airline flight that has only one seat still available. By chance, it is possible that both travel agents will press the key signalling that the customer does want the ticket at virtually the same time. The system must be designed so that this "collision" is detected and dealt with, so that one customer gets the reservation and knows it, and the other does not, and is told that the seat was sold to someone else while they were talking.

How can atomicity be accomplished? There must be some mechanism in hardware that supports the software concept of a "lock," and the TPS or its underlying database must make use of those software locks to coordinate access to data structures. For example, some hardware has a single machine language instruction to read the value in a particular RAM address, test it, modify it, and write it back to that address. Using such a read-test-modify-write instruction, it is easy to build operating system support for locks. A particular memory address is set aside for each record in the database. When a travel agent wants to make a reservation, the software checks the value of the lock address. If no one else is using it, the new value written shows all others that it is now in use. Because this is a single, atomic, machine language instruction, no other user of the system can interfere with it, and when the other travel agent tries to make the reservation, the software will detect that the record is locked, and either wait for the other transaction to be completed, or signal the collision right away. As soon as all steps of modifying that data record are completed, the lock is released.
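
The airline example can be sketched in Python as follows. This is an illustration, not a description of any real reservation system: Python's threading.Lock stands in for the per-record lock address and the hardware instruction behind it, and the class and customer names are hypothetical. The sketch takes the "wait for the other transaction" option rather than signalling the collision right away.

```python
import threading

class SeatRecord:
    """One database record, with its own lock (hypothetical layout)."""
    def __init__(self, seats_left):
        self.seats_left = seats_left
        self.lock = threading.Lock()  # stands in for the per-record lock address

def reserve(record, customer, results):
    # Entering the "with" block waits until no one else holds the lock.
    with record.lock:
        if record.seats_left > 0:
            record.seats_left -= 1
            results[customer] = "reserved"
        else:
            results[customer] = "sold to someone else"
    # Leaving the "with" block releases the lock.

flight = SeatRecord(seats_left=1)
results = {}
agents = [
    threading.Thread(target=reserve, args=(flight, name, results))
    for name in ("customer 1", "customer 2")
]
for agent in agents:
    agent.start()
for agent in agents:
    agent.join()

print(results, "seats left:", flight.seats_left)
```

Which customer wins depends on timing, but with the lock in place exactly one of them gets the seat and the other is told it was sold.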

The term "race condition" is used in computer science to refer to situations in which the order of execution can change the outcome - it is a race between competing processes. Usually a race condition is a software bug! If locks are used consistently, then race conditions can be prevented. Multiprocessor computer systems, in which there are multiple CPUs sharing RAM and I/O devices, are becoming more and more common. Operating systems for multiprocessors must provide even more elaborate mechanisms to prevent race conditions between the multiple CPUs. Atomicity of a machine language instruction by itself is adequate only for uni-processors!



Fault-Tolerant Computing

Fault-tolerant computing refers to hardware and software systems that are designed to provide the greatest degree of reliability possible. Typically this involves:



Large-scale Systems

Systems designed for time-sharing between multiple "simultaneous" users typically display a number of differences from personal workstations besides just overall capacity.


User Authentication

This is an issue we have seen in network settings, for example. The usual solution is a public user ID and a private password. This may be done by the operating system or by the application. If done by the application, it may be done on startup or at each sensitive task.


Access Control

Access control determines who is allowed to do what. The first level may be called "protections" or "permissions" and encompasses a handful of activities that are either permitted or forbidden (not all operating systems provide all of these categories):

Read
Permitted to see and to make a copy.

Write
Permitted to create and modify.

Execute
Permitted to copy an executable program into RAM memory and execute it, but not to make a copy on disk.

Delete
Permitted to delete the item.

Each of these activities is specified for each of a handful of user categories (not all operating systems provide all of these categories):

System
Specific components of the operating system. This does not restrict access by the system manager's privileged account.

Owner
The person who owns the item.

Group
In situations where the internal user ID number is split into group and user parts, this would apply to anyone having a group number matching that of the owner.

World
Anyone on the system at all.
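
A protection check of this kind can be sketched in Python. The details are assumptions made for illustration: users are identified by a (group number, user number) pair, the System category is omitted, only the most specific matching category is consulted, and the default protection shown is arbitrary. Real operating systems differ on all of these points.

```python
from dataclasses import dataclass, field

def default_protection():
    # One set of permitted activities per user category (arbitrary defaults).
    return {
        "owner": {"read", "write", "execute", "delete"},
        "group": {"read", "execute"},
        "world": set(),
    }

@dataclass
class Item:
    owner: tuple  # (group number, user number) of the owner
    protection: dict = field(default_factory=default_protection)

def user_category(item, user):
    """Pick the most specific category that applies to this user."""
    group_number, _ = user
    if user == item.owner:
        return "owner"
    if group_number == item.owner[0]:
        return "group"
    return "world"

def is_permitted(item, user, activity):
    return activity in item.protection[user_category(item, user)]

report = Item(owner=(20, 7))
print(is_permitted(report, (20, 7), "delete"))  # the owner
print(is_permitted(report, (20, 9), "write"))   # same group, different user
print(is_permitted(report, (31, 7), "read"))    # different group: world
```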

A more sophisticated method goes under the name of Access Control Lists (ACLs). An ACL controls access (hence its name) to a specific resource: a file, a folder of files, a disk drive, etc. Some operating systems provide for ACLs only for some types of resources. Each ACL consists of a sequence of Access Control Entries (ACEs), in which the order of the entries may be important (Windows NT V4 does not pay attention to the order of the entries). Each ACE consists of two parts:

  1. Identifier that may match one person, a group of people, or everyone on the system.

  2. Permissions that specify which activities are permitted and which are forbidden.

When ACLs are in use, the ACL is scanned to find the first ACE whose identifier part matches the user attempting access. Permission for the requested activity is then granted or denied, based on the second part of that ACE. If an ACL has multiple ACEs that would match a given user, it is only the first one that matters. For example, you can give everyone but a single individual access to an item by having the first ACE deny it to that one person, and the next ACE grant it to everyone. (As mentioned above, the Windows NT V4 implementation of ACLs denies access to a user if there is any matching ACE that denies access, regardless of order.)
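
Both evaluation rules can be sketched in Python. The data layout is hypothetical: each ACE is a pair of an identifier and a dictionary mapping an activity to True (permitted) or False (forbidden), and "*" is an assumed spelling for "everyone on the system." The sketch also assumes that a user matched by no ACE, or by an ACE that does not mention the activity, is denied.

```python
EVERYONE = "*"  # assumed identifier matching everyone on the system

def matches(identifier, user, groups):
    return identifier == EVERYONE or identifier == user or identifier in groups

def first_match(acl, user, groups, activity):
    """Order-sensitive rule: only the first matching ACE matters."""
    for identifier, permissions in acl:
        if matches(identifier, user, groups):
            return permissions.get(activity, False)
    return False  # assumption: no matching ACE means no access

def deny_overrides(acl, user, groups, activity):
    """Order-insensitive rule: any matching ACE that denies wins."""
    verdicts = [
        permissions.get(activity)
        for identifier, permissions in acl
        if matches(identifier, user, groups)
    ]
    return False not in verdicts and True in verdicts

# Everyone but one individual may read the item.
acl = [("mallory", {"read": False}), (EVERYONE, {"read": True})]
reversed_acl = list(reversed(acl))

print(first_match(acl, "alice", set(), "read"))
print(first_match(acl, "mallory", set(), "read"))
print(first_match(reversed_acl, "mallory", set(), "read"))
print(deny_overrides(reversed_acl, "mallory", set(), "read"))
```

With the entries reversed, the first-match rule lets the excluded individual in, because the "everyone" entry is found first; the deny-overrides rule gives the same answer in either order.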

Some systems that have ACLs enhance their utility by providing the possibility to grant identifiers to a process above and beyond the login ID. These may include identifiers granted to the person (effectively creating groups) or to the process (based, for example on interactive, batch, or network status). This can be used to achieve fine-grained controls on access to system resources.


Parity Memory vs. ECC Memory

This section has now been moved to a separate file! Select the title, above, to read it.


Interrupts

Interrupt-driven I/O permits the running program to continue to process other aspects of the data or service other users while waiting for a particular transfer to be completed. This can be a very real advantage, since the external device may have a natural speed much slower than the computer's processing ability. Even if no such "parallel processing" is to be done, there is still the advantage that no activity occurs on the computer's bus while waiting for the interrupt that signals completion of the transfer, and hence it can respond more rapidly to all sources of interrupts, such as terminal or disk system I/O. The disadvantage of interrupt-driven I/O is that interrupt handling does require several hundred CPU cycles.

How do interrupts work? There are several steps:

  1. The interface circuitry signals the CPU, requesting an interrupt.

  2. The CPU completes or aborts its present instruction and "saves" the results, noting its internal state and the location of the next instruction of the ongoing program. These first two steps typically take a dozen CPU cycles.

  3. The CPU determines which device requested the interrupt, and where in memory the subroutine is stored that will respond to the interrupt (the "interrupt servicing routine" for that device).

  4. The CPU jumps to that routine and executes it.

  5. The final instructions in the interrupt service routine restore the CPU internal state exactly to that saved in step two, and then return to the previously executing program, whose results will be the same as if there had been no interrupt, except for the delay.

The precise techniques by which these steps are accomplished are designed into the CPU and interface circuits, and depend on the bus structure that connects them.
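
The five steps above can be mimicked by a toy simulation in Python. This is only a model of the bookkeeping: the ToyCPU class, its two registers, and the "disk" device are hypothetical, and real hardware does this work in circuitry and machine code rather than in method calls.

```python
class ToyCPU:
    def __init__(self):
        self.pc = 0        # location of the next instruction
        self.acc = 0       # the rest of the internal state, for this toy
        self.vectors = {}  # device name -> interrupt service routine
        self.log = []

    def interrupt(self, device):
        # Step 1 is the caller: the device interface requests the interrupt.
        saved = (self.pc, self.acc)  # step 2: save the internal state
        routine = self.vectors[device]  # step 3: find the service routine
        routine(self)                   # step 4: jump to it and execute it
        self.pc, self.acc = saved       # step 5: restore state and return

def disk_service_routine(cpu):
    cpu.acc = 999  # the routine is free to use the registers
    cpu.log.append("disk transfer complete")

cpu = ToyCPU()
cpu.vectors["disk"] = disk_service_routine

# The "ongoing program": add 1 to the accumulator five times.
for step in range(5):
    cpu.acc += 1
    cpu.pc += 1
    if step == 2:
        cpu.interrupt("disk")  # an interrupt arrives partway through

print(cpu.acc, cpu.pc, cpu.log)
```

The ongoing program finishes with the same result it would have had without the interrupt, even though the service routine overwrote the accumulator while it ran.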

The problem with interrupts is the "overhead" of performing all those steps. Just getting ready to deal with the interrupt and returning from it may well take a hundred CPU cycles, in addition to the work done by the service routine itself.

The alternative to interrupts is called "polling," in which, whenever there is a pending I/O operation, the CPU checks each I/O interface in turn to see whether it is ready for the next step, and keeps checking so long as there is pending I/O. This permits quicker response to the I/O activity at the expense of accomplishing nothing else while the I/O is pending.


Direct Memory Access Controllers

Most personal computers are equipped with device interface circuitry (NIC, serial ports, parallel ports, disk controllers, etc.) that requires the operating system code executed by the CPU to move each byte of data between main memory (RAM) and the controller's "data register." This requires one interrupt, with all its overhead, for each byte.

DMA controllers are designed to be able to transfer data from the peripheral device to memory, without interrupting the CPU. They may block CPU access to memory during the transfer, but that causes much less delay than having the CPU handle the data itself: perhaps ten CPU cycles instead of the several hundred CPU cycles typical of interrupt context-switching overhead and the interrupt service routine.
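
A rough worked example in Python shows the scale of the difference. The numbers are assumptions chosen to match the wording above, not measurements: a 512-byte block, 300 cycles for "several hundred," and ten cycles of delay per byte for DMA (the text does not say whether the ten cycles apply per byte or per transfer; per byte is the less favorable reading for DMA).

```python
BLOCK_BYTES = 512       # assumed size of one block being transferred
INTERRUPT_CYCLES = 300  # assumed value for "several hundred" cycles per byte
DMA_STALL_CYCLES = 10   # assumed: "perhaps ten" cycles of delay per byte

def cpu_cycles_lost(n_bytes, cycles_per_byte):
    """CPU cycles unavailable to other work while n_bytes are transferred."""
    return n_bytes * cycles_per_byte

per_byte_interrupts = cpu_cycles_lost(BLOCK_BYTES, INTERRUPT_CYCLES)
dma = cpu_cycles_lost(BLOCK_BYTES, DMA_STALL_CYCLES)

print("interrupt per byte:", per_byte_interrupts, "cycles")
print("DMA:", dma, "cycles")
print("ratio:", per_byte_interrupts / dma)
```

Under these assumptions the block costs 153,600 cycles with one interrupt per byte and 5,120 cycles with DMA, a factor of thirty.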


Virtual Machines

The VM environment on IBM mainframes presents each user's software with the appearance of a complete mainframe with small disks and a slower CPU. This is achieved at the cost of complexity, because there is a "Control Program" layer to the operating environment that sits between the user's software and the hardware. The benefit is a very thorough insulation of each user from uncontrolled impact by other users.


Process Scheduling

Multi-user systems operate with the CPU working for each user for a short while, and then going on to work for another user. Performance is usually acceptable if each user gets adequate attention once a second, or at least some attention several times a second, so that visible progress occurs on-screen (such as catching up with typed characters). Several strategies are used for scheduling the CPU. They differ in their "fairness," in the CPU cycles spent computing who to attend to next, and in their robustness under a variety of system loads.

Priority and class-based scheduling are particularly robust for mixed workload, general purpose systems, serving a combination of interactive and batch jobs.
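
A minimal sketch of priority scheduling with time slices, in Python, is shown below. It is one simple strategy among many, not the method of any particular operating system: the job names, priority numbers, and quantum are hypothetical, a lower number means a higher priority, and jobs of equal priority take turns.

```python
import heapq
from itertools import count

def run_schedule(jobs, quantum=2):
    """jobs: list of (name, priority, work_units); lower number runs first.

    Returns the order in which time slices were granted.
    """
    arrival = count()  # tie-breaker so equal priorities take turns
    ready = [(priority, next(arrival), name, work)
             for name, priority, work in jobs]
    heapq.heapify(ready)
    timeline = []
    while ready:
        priority, _, name, work = heapq.heappop(ready)
        timeline.append(name)  # this job gets the CPU for one quantum
        work -= quantum
        if work > 0:
            # Not finished: go to the back of the line for its priority.
            heapq.heappush(ready, (priority, next(arrival), name, work))
    return timeline

jobs = [
    ("batch-report", 5, 4),  # batch job: low priority
    ("editor-A", 1, 3),      # interactive jobs: high priority
    ("editor-B", 1, 2),
]
print(run_schedule(jobs))
```

The interactive jobs share the CPU until both are finished, and only then does the batch job run. A strict rule like this one can starve low-priority work on a busy system, which is one reason practical schedulers adjust priorities over time.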



Erasing Magnetic Media

Some data stored on magnetic media, both disks and tapes, is vital for the company to keep in order to continue doing business successfully, and equally vital to conceal from competitors. Simply deleting files may not provide sufficient security, since deleting a file typically removes only its directory entry and leaves the data itself on the medium.

Erasing magnetic media for security is a business in its own right, both providing the service and providing equipment for customers who do not want to permit the magnetic media to leave their premises with any possibility that the data might be recovered.


Dick Piccard revised this file (http://oak.cats.ohiou.edu/~piccard/mis300/ediextra.htm) on October 27, 1998.

Please E-Mail comments or suggestions to "piccard@ohio.edu".