The AVSA of stage two relied heavily on the state of the reactive room subsystem: if the reactive room malfunctioned, the AVSA did not work. In addition, the communication between the PC and UNIX environments made the AVSA unstable in some cases, placing heavy requirements on an environment not meant for such tasks. Finally, since the PC did not support multiprocessing, it was not possible to set up a monitoring process (i.e. a daemon) for the AVSA to accept requests from the media space. This chapter describes the third stage in the development of the AVSA: developing an integrated, self-contained system on the UNIX platform.
5.1 Ingredients
The basic parts required for this stage are the same as those required in stage two and are described in Section 4.1. However, since we wanted a more stable and self-sufficient system, the actual ingredients acquired would differ in that they would all operate in the UNIX environment.
5.2 Recipe
The recipe was slightly different from stage two's. Fortunately, even though the ingredients would be different, we would still be able to port some components, like the translator, since their function would remain constant. The recipe for this stage is as follows:
We also needed video overlay technology for the Indy that would allow the AVSA to perform the video overlaying which we required (described in Section 3.2.1). For this we acquired an Indy Video card and its associated video library.
For the speech recognition we acquired an ASR developed by SGI. Indys come with a standard CD-quality sound processing card with an external input, so we did not have to purchase any extra hardware or perform any installations to process the audio.
Indys also come with their own standard Ethernet card. Hence, no special hardware or software installation was required for network connectivity.
To build the interface we decided to use the tcl/tk development environment, and the standard C compiler to combine all of the different technologies. By using these standard software development environments we ensured that if, in the future, the system were ported to another platform, the interface would not have to be redeveloped from scratch.
5.3.2 The development team
The development team was downsized at this point. We no longer required as much input into the interface during development. Instead we needed an efficient team that could develop the system quickly. The team consisted of:
By using the established communication utilities we could separate the systems so that the AVSA would not be dependent on the reactive room subsystem (Figure 20). The former situation was acceptable for the prototyping stage, but the new setup is more logical. In this new configuration, the AVSA communicates control information to the IIIF system directly instead of through the reactive room subsystem as in the last stage.
Figure 20: The AVSA now uses the Ethernet to communicate with the IIIF directly instead of through the reactive room.
5.3.5 Two-way communication
The Indy has much more processing power than the PC, and the UNIX platform has far more multiprocessing capabilities than Windows. This should simplify the communication and make it much more stable.
In the last stage, the AVSA could send requests to the media space and receive responses based on these requests. To enable this on the Indy, the translator of the second stage had to be ported to the Indy platform.
In addition, there were some cases in which we wanted the media space to send requests to the AVSA without being asked to in the first place. For this reason we decided to convert the AVSA communication layer into a daemon. In this way, after some simple modifications, other media space systems would be able to communicate with the AVSA at will.
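The daemon idea can be sketched as follows. This is an illustrative sketch in Python; the original communication layer was written in C, and the message verbs ("PING", "QUERY") and function names here are hypothetical, not the actual AVSA protocol.

```python
import socket
import threading

def handle_request(message: str) -> str:
    """Dispatch one line of text received from a media space system.
    The verbs below are hypothetical stand-ins for the real protocol."""
    verb, _, args = message.strip().partition(" ")
    if verb == "PING":
        return "PONG"
    if verb == "QUERY":
        return f"RESULT {args}"  # hypothetical: look up state and reply
    return "ERROR unknown request"

def serve(host: str = "localhost", port: int = 5050) -> None:
    """Listen continuously so other systems can contact the AVSA at will,
    handling each connecting media space client on its own thread."""
    with socket.socket() as srv:
        srv.bind((host, port))
        srv.listen()
        while True:
            conn, _ = srv.accept()
            threading.Thread(target=_client, args=(conn,), daemon=True).start()

def _client(conn: socket.socket) -> None:
    # One line in, one line out, until the client disconnects.
    with conn, conn.makefile("rw") as f:
        for line in f:
            f.write(handle_request(line) + "\n")
            f.flush()
```

The key design point is that the accept loop runs independently of any one client, which is exactly what the single-tasking PC platform of stage two could not provide.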
5.3.6 The ASR
The process of integrating the ASR would be the same as the process in stage two because there was no API. However, since we were using a Silicon Graphics Incorporated (SGI) ASR package, and we had close ties with SGI, we were planning on working closely with the developers of the product to develop an API and tune the ASR to our needs. Unfortunately, the project under which the ASR was being developed at SGI ran into problems and was consequently terminated.
We did search for other ASR options. Vendors were contacted and the options were evaluated. The reason that we decided not to purchase a commercial ASR is that when we took into consideration:
The video overlay technology we were using gave us control over the opaqueness of images. We used this to our advantage and solved the problem by presenting the list of options in a translucent box (Figure 21). This way there was always a visible hint as to what was part of the interface and what was part of the media space view.
Figure 21: The translucent background helps identify the option list.
The area occupied by the list was variable. The height depended on the number of options available, while the width depended on the string length of the longest caption. It was this variable nature that caused problems for users. Users wanted to be able to scan the list quickly, choose an option and find the associated word to utter. The constant horizontal and vertical shifting of the list, and thus the search for the first caption and for the beginning of each caption, seemed to slow users down and irritate them when they were going through several levels of the menu hierarchy.
From this analysis we concluded that the top left corner of the screen might be the best location for the list of options. This way the users would always know where to start scanning the list. This however was also found to be unacceptable to users. They found it too intrusive because they also scanned the view of the media space from left to right, top to bottom.
It was quite obvious from users' comments that the best solution was to place the list in the bottom left corner of the screen. This eliminated the horizontal shifting. Vertical shifting still occurred but, based on our experience of using the interface and the comments of the people we asked to evaluate all four placements, it was found not to be as troublesome.
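The sizing and anchoring rules above can be sketched as follows. This is an illustrative Python sketch; the original interface was built with tcl/tk and C, and the character and line metrics here are assumed values, not those of the actual system.

```python
def list_geometry(captions, char_w=9, line_h=18, pad=4):
    """Width tracks the longest caption; height tracks the option count.
    char_w/line_h/pad are hypothetical pixel metrics."""
    width = max(len(c) for c in captions) * char_w + 2 * pad
    height = len(captions) * line_h + 2 * pad
    return width, height

def list_origin(screen_w, screen_h, width, height):
    """Anchor the box at the bottom-left corner: the left edge is fixed,
    so only the top edge moves as the option count changes."""
    return 0, screen_h - height
```

Anchoring at the bottom left keeps the start of every caption at a fixed horizontal position, which is why only the (less troublesome) vertical shifting remains.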
This feedback is more direct. It also reduces the number of types of feedback that are presented through the banner, thus reducing the cognitive load on the user. In short, the feedback was now presented in a much more effective way.
This caching made the system run much more smoothly and thus proved to be extremely valuable. There were other situations in which the system seemed to get stuck, but nothing could be done about them: the system was getting `stuck' because the media space was slow in responding to requests. The only way to truly fix the problem is to speed up communication within the media space, which is beyond the scope of this thesis. Unfortunately, there was no way to work around the problem either, because the answer to each request was crucial.
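The caching can be sketched as follows. This is an illustrative Python sketch; the thesis does not specify the actual caching policy, so a simple remember-every-answer table is assumed, and the function names are hypothetical.

```python
_cache: dict = {}

def query_media_space(request, fetch):
    """Return a cached answer when available; otherwise ask the (slow)
    media space via fetch() and remember the response for next time."""
    if request not in _cache:
        _cache[request] = fetch(request)
    return _cache[request]
```

Repeated requests then return immediately instead of waiting on the media space, which is the smoothing effect described above; only the first occurrence of each request pays the communication cost.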
To resolve this problem we decided to implement a map as a navigational aid (Figure 23). The map summarizes the menu hierarchy through generalized captions so that a visitor can quickly see where they can go and how to get there. It also indicates the current position through a more specific caption and a different color. To help the visitor further, the current location was displayed as the first line of the option list (Figure 23). In order to ask the AVSA to display the map the visitor must say the word "map". This word acts as a toggle switch. Consequently, when "map" is uttered again the map will disappear.
Figure 23: The map helps users navigate through the menu hierarchy. The first line of the option list helps the visitor pinpoint their location on the map.
There are three possible solutions. All involve providing a signal by which the media space knows to mute the audio going to the meeting room, but not the AVSA. The first was to provide a mute button of some sort at the orphan CODEC site. The problem with this solution is that it does not conform to our LCD quality. The second solution was to require the user to use some hand gesture. Unfortunately, this solution is not self-explanatory or easy to use. The visitor would have to learn a special gesture and make sure that they did not use it during the course of the meeting. It might also be distracting to the people in the meeting to see someone performing what seem to be irrelevant gestures. The third solution was to provide the visitor with the command "mute". When this command is issued the audio to the meeting room, but not to the AVSA, is disconnected. When "mute" is said a second time the audio is reconnected. The problem with this solution is that the members of a meeting will still hear "mute" at least once. However, we decided to go with this final option and evaluate it further before considering one of the others.
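The toggle behaviour of commands such as "map" and "mute" can be sketched as follows (illustrative Python; the class and attribute names are hypothetical, not taken from the AVSA source):

```python
class ToggleCommands:
    """Each utterance of 'map' or 'mute' flips the corresponding state."""

    def __init__(self):
        self.map_visible = False   # map overlay starts hidden
        self.audio_muted = False   # meeting room audio starts live

    def utter(self, word):
        if word == "map":
            self.map_visible = not self.map_visible
        elif word == "mute":
            self.audio_muted = not self.audio_muted
        return self.map_visible, self.audio_muted
```

A second utterance of the same word restores the previous state, which is why the meeting room hears "mute" at most once per mute/unmute pair.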
Figure 24: The help screen explains how to use the AVSA.
The next addition to the interface was the introduction screen (Figure 25). Originally when the visitor connected to the media space they saw a list of options overlaid on top of an appropriate sign. This was not very inviting. Instead, we redirected the video seen by the visitor to be that of a live view of the University of Toronto. We then overlaid a greeting on top of this image that explained to the visitor where they had connected to, how to get help and the options that were available.
Figure 25: The introduction screen.
In summary, this section explained the many new interface features we added and changed, based on user feedback, to make the interface more effective. The next section describes the features added to the system to make it more useful.
In this situation the CODEC visitor would be able to complete a connection to the room but they would not be able to access the room services. This stage addressed this problem so that as long as there was a node available in the room, the visitor would be able to access any of its services.
For this reason we decided to implement a video-on-demand (VOD) service. With this service visitors can access pre-recorded material at any time. VOD also enables members of the media space to allow visitors to access whatever material they wish them to access.
Not only can visitors access the information, but they can pick what they want to see from a list of demos (Figure 26).
The AVSA also gives them such controls as play, stop, pause, fast forward, fast forward speed, etc. (Figure 27).
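These transport controls amount to a small state machine, sketched here in Python for illustration; the speed multipliers and method names are assumptions, not the actual AVSA commands.

```python
class VodPlayer:
    """Hypothetical sketch of the VOD transport controls."""

    SPEEDS = (2, 4, 8)  # assumed fast-forward multipliers

    def __init__(self):
        self.state, self.speed = "stopped", 1

    def play(self):
        self.state, self.speed = "playing", 1

    def stop(self):
        self.state, self.speed = "stopped", 1

    def pause(self):
        if self.state == "playing":
            self.state = "paused"

    def fast_forward(self):
        """Repeated 'fast forward' commands cycle through the speeds."""
        self.state = "playing"
        i = self.SPEEDS.index(self.speed) if self.speed in self.SPEEDS else -1
        self.speed = self.SPEEDS[(i + 1) % len(self.SPEEDS)]
```

Modelling the controls this way keeps each voice command a single word, matching the menu-driven interaction style of the rest of the interface.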
We used comments provided by users at the end of the second stage to improve the overall look, feel and effectiveness of the interface by improving and speeding up navigation, improving feedback, making the interface more visible, introducing the system, etc.
New functionality in the form of video-on-demand, access to other rooms with services, and access to multiple visitors within a room were added to improve the quality of the visit to the media space.
5.4.2 Adding system features
During this stage we improved the communication component of the system. As a result we were also able to add some new features to the system that make it more useful. The following sections discuss these features.
Rooms with services
The reactive room subsystem was being developed in parallel with the AVSA. One of the developments was the deployment of another room with services. It was desirable for the AVSA to recognize this fact and act accordingly. As a result, we gave the AVSA the ability to recognize any room that has services by simply registering the new room and all the possible node names in a database. When a visitor chooses to call this node the AVSA checks the database and, if the room is registered, it queries the room's local daemons for services.
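The registration-and-lookup step can be sketched as follows (illustrative Python; the registry contents, node names and the daemon query are hypothetical, and the real database was a separate media space component):

```python
# Hypothetical registry: room name -> node names registered for that room.
ROOM_REGISTRY = {
    "reactive-room": ["camera-1", "camera-2", "vcr"],
}

def services_for(node, query_room_daemons):
    """If the called node belongs to a registered room, query that room's
    local daemons for its services; otherwise offer no room services."""
    for room, nodes in ROOM_REGISTRY.items():
        if node in nodes:
            return query_room_daemons(room)  # stand-in for the real query
    return []
```

Because new rooms are added only by registering their names and nodes, the AVSA itself needs no code changes when another room with services is deployed.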
Multiple visitor access
Within a room it was possible to have multiple visitors. The first prototype system did not address this functionality. It only gave CODEC visitors access to one of the possible visitors to a room. Problems arose when a member of the media space was visiting the room through the TP application and occupying the visitor position that would normally be used by a CODEC visitor.

Video on demand
So far, the only advantage we have taken of the television technology in our effort to converge technologies is that we are exploiting the richness of the medium through which it is being transmitted. However, as mentioned in chapter 1, the concept of the television as a method of distributing information at any time is also very valuable.
Figure 26: The user has access to various demos provided by members of the media space.
Figure 27: The user has access to playback controls such as play, stop, pause and fast forward.
5.5 Summary
During this stage we moved the AVSA from the PC/Windows platform to the SGI/UNIX platform to make the system more stable and expandable.