A living room is defined as a central gathering place in a home (Wikipedia 2007) and the entertainment hub of the house (Kapoor 2003). It is used by household members of a wide age range (from children to the elderly) with varying technical capabilities, for interests ranging from quiet activities (e.g. reading), to informational activities (e.g. watching the news), to noisier entertainment activities (e.g. watching a movie) (Wikipedia 2007). Convergence is the key element of this proposal for Deft Homes to transform the present living room into the digital living room of the future.
The convergence of devices such as the television and the computer brings new combinations of possibilities for the digital living room (Kapoor 2003). The vision of a digital living room has long been mooted: a room where audio/visual works are provided to consumers on demand, integrated with applications and devices in a wireless environment and connected to the internet ([email protected] 2005) to provide a variety of services. Deft Homes now has an opportunity to make this a reality.
A Typical Use Case Scenario
Alan enters the living room and says, “Lights on”. To avoid the system confusing voice commands with ordinary speech, Alan can set his preferences to accompany each voice command with a gesture such as a clap or a snap of his fingers. The voice recognition device recognizes Alan’s command and turns on the light. Alan then adjusts the lighting conditions by saying “dimmer” or “brighter”. Alan relaxes on a recliner and, using voice recognition, adjusts his seat reclination with “recline” or “raise” and his seat rigidity with “softer” or “harder”.
The rigidity of his seat is adjusted by injecting more air for a firmer feel or less air for a softer feel, while the seat mechanism adjusts the reclination. Alan uses his voice, via voice recognition, or a remote control to turn on the television. He is brought to a personalized menu where he can choose from options such as internet surfing, weather updates, music, movies, TV or email. He can also receive messages left by fellow users, in the form of audio/video recordings. All user information is stored and preferences are remembered by the system. Alan is identified by his voice, and the system loads his preferences. Alan decides to watch a repeat telecast of the 6pm news.
As the system remembers that Alan watches the news every day, it records Alan’s preferred news program. Therefore, although Alan cannot make it in time to watch the news in real time, he can watch a repeat telecast whenever he wishes. As the system knows that Alan is in the room, alerts can be sent directly to the room he is in, appearing on the television screen in a small box in the bottom right-hand corner without interrupting the current activity, i.e. watching the news. Alan can choose to respond to the alert by saying “pause”, which pauses the activity. For alerts such as phone calls, the system records Alan’s various voice responses to a call.
For example, “Busy” plays a voicemail message informing the caller that Alan is currently busy and asking them to leave a message, while “Answer” means that Alan chooses to answer the call. He can then select “Public” or “Private”: public means the call can be heard throughout the room, while private means the system will identify the Bluetooth headset nearest to Alan and direct the call there, so he can speak without people close by overhearing the conversation. “End Call” ends the call promptly. If Alan decides to record a reminder or notes, saying “Take Note” followed by “I have a meeting with John at 2pm at Wilson Tower” records the reminder into the system, which can then transmit it to his preferred device, such as his PDA or laptop. Alan then resumes his activity, i.e. the news, by saying “Play”.
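The call-handling commands above can be sketched as a simple dispatch table. This is an illustrative sketch only: the command names come from the scenario, but the handler structure and state fields are assumptions, not a specification of the actual system.

```python
# Hypothetical sketch of the call-handling voice commands described above.
# Command names follow the scenario; handler behaviour is illustrative only.

def handle_call_command(command, call_state):
    """Map a recognized voice command to an action on the current call."""
    command = command.lower()
    if command == "busy":
        call_state["status"] = "voicemail"   # play the "Alan is busy" message
    elif command == "answer":
        call_state["status"] = "active"
        call_state["output"] = "public"      # default until the user chooses
    elif command in ("public", "private"):
        call_state["output"] = command       # "private" routes to nearest headset
    elif command == "end call":
        call_state["status"] = "ended"
    return call_state

state = handle_call_command("Answer", {})
state = handle_call_command("Private", state)
# state is now {"status": "active", "output": "private"}
```

Keeping the command vocabulary in one place like this also makes it easy to add the per-user command variations described later in the proposal.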
The aim of this Digital.Living.Room.Ver2.0 proposal is to demonstrate that pervasive computing centered on human commands and activities can support a huge range of user activities. Voice recognition allows users to perform actions by voice command, without the manual steps usually required. Furniture and electronic devices are programmed to respond to voice commands and carry out the requested activities. The system holds information on all household members and their preferences and can adjust accordingly to each member’s preferences. It thus allows the user to perform the activity he wants, with his preferences applied, effortlessly and quickly.
To make this possible, the system requires an infrastructure centered on accepting and recognizing user input. The digital living room is made up of many components, from the recliner to the lighting to the TV. It is therefore essential that these components can communicate and liaise with each other to perform tasks. A network is required to identify these components and allocate them roles and tasks according to prior user settings. New components can be added and programmed to respond to new user commands or linked to existing components. The network has to identify the right services for each user command and allocate the right device to perform the task. This will be managed by software, i.e. the system’s operating system, which would have to respond flexibly to changes in the room’s surroundings, e.g. a new member entering the room or a requested service becoming unavailable.
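The registration-and-routing idea above can be illustrated with a minimal component registry: devices join the network, advertise the commands they serve, and the registry routes each user command to the responsible device or reports that the service is unavailable. All device and command names here are assumptions for illustration.

```python
# Illustrative sketch: the network identifies components and routes a
# user command to whichever device advertises the matching service.
# Device names and commands are assumptions, not a real API.

class ComponentRegistry:
    def __init__(self):
        self._services = {}   # command -> device name

    def register(self, device, commands):
        """A new component joins the network and advertises its commands."""
        for cmd in commands:
            self._services[cmd] = device

    def dispatch(self, command):
        """Return the device responsible for a command, or None if unavailable."""
        return self._services.get(command)

registry = ComponentRegistry()
registry.register("lighting", ["lights on", "dimmer", "brighter"])
registry.register("recliner", ["recline", "raise", "softer", "harder"])

registry.dispatch("dimmer")     # -> "lighting"
registry.dispatch("volume up")  # -> None: requested service unavailable
```

Returning `None` for an unknown command is the hook where the operating system's flexible handling of unavailable services, mentioned above, would take over.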
The operating system will manage the flow of activities on behalf of the user, requiring minimal commands and actions from the user to perform tasks. It will also record information about users and their preferences. This allows it to respond more accurately to each user’s commands and to perform tasks on the user’s behalf based on his usual preferences, e.g. recording the news daily for Alan or changing the living room’s settings to suit the user. It can also interact with other devices, e.g. Alan’s PDA, to share and extract information as required. Voice recognition devices and motion sensors must be installed around the room to detect the user’s location and to record and respond to voice commands.
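The preference-learning behaviour described above, such as automatically recording Alan's news, can be sketched as a counter over observed activities. The threshold of three repetitions before automation kicks in is an assumption chosen for illustration, not part of the proposal.

```python
# Hedged sketch of preference-driven automation: the operating system
# notes a recurring (user, activity) pair and schedules it on the user's
# behalf. The three-repetition threshold is an illustrative assumption.

from collections import Counter

class PreferenceTracker:
    def __init__(self, auto_threshold=3):
        self.history = Counter()       # (user, activity) -> times observed
        self.auto_threshold = auto_threshold
        self.scheduled = set()         # activities now run automatically

    def observe(self, user, activity):
        """Record one occurrence; schedule the activity once it recurs."""
        self.history[(user, activity)] += 1
        if self.history[(user, activity)] >= self.auto_threshold:
            self.scheduled.add((user, activity))

tracker = PreferenceTracker()
for _ in range(3):
    tracker.observe("alan", "record 6pm news")
# ("alan", "record 6pm news") is now in tracker.scheduled
```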
Natural interfaces are computer interfaces that can accept and interpret natural forms of human communication such as speech and gestures (Abowd 2000). The system can recognize voice commands and gestures such as a clap, which the operating system assigns to various commands. This reduces the effort and time the user needs to perform an activity through the system.
Error-prone interaction occurs when natural interfaces fail to recognize input accurately (Abowd 2000). The system minimizes this problem by requiring the user to record three variations of each voice command and three variations of its supporting gesture. To differentiate between ordinary speech and commands, the user gives commands as a voice command accompanied by a gesture such as a clap. By recording three variations of each voice and gesture command, the system can accept a wider range of input without being inhibited by accuracy factors such as voice volume or diction.
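Matching an utterance against several recorded variations can be sketched with simple string similarity. Real systems would compare acoustic features; `difflib` and the 0.8 threshold here are stand-in assumptions to show the idea of accepting any sufficiently close variant.

```python
# Sketch of variation-based matching: each command stores three recorded
# variants, and a spoken phrase matches if it is close enough to any one.
# difflib string similarity is a stand-in for real acoustic matching.

from difflib import SequenceMatcher

COMMAND_VARIANTS = {
    "lights on": ["lights on", "light on", "turn lights on"],
    "dimmer": ["dimmer", "dim lights", "make it dimmer"],
}

def recognize(utterance, threshold=0.8):
    """Return the canonical command whose variant best matches, or None."""
    for command, variants in COMMAND_VARIANTS.items():
        for variant in variants:
            if SequenceMatcher(None, utterance.lower(), variant).ratio() >= threshold:
                return command
    return None

recognize("light on")     # close to a stored variant -> "lights on"
recognize("open window")  # matches nothing -> None
```

Because each variant is checked independently, adding a fourth variant for a user with unusual diction requires no change to the matching logic.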
Context awareness deals with the recognition of human activity and associates the detected activity with the correct service, thereby expanding the range of activities the system can support (Abowd 2000). This requires the system to identify the user and/or his location, determine the devices and services available to him, and allocate them accordingly. The system identifies the user by voice recognition: when the user first enters the room and gives his first command, the system matches the voice against its list of commands and attempts to identify the speaker. It then addresses the user: “Welcome, Alan”. If the wrong user is recognized, the user re-identifies himself with “I am John”, and the system loads John’s profile, which contains his preferences and information. The system seeks further verification if the wrong profile is chosen, and adds the voice commands used to the user’s record to minimize future mismatches. Using voice recognition and motion sensors, the system can detect the user’s location and deliver the required services to the user via the components.
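The identification flow with a spoken correction can be sketched as follows. The profile data, the similarity scores, and the "I am ..." parsing are all hypothetical; in practice the scores would come from a voiceprint model.

```python
# Hypothetical identification flow: pick the best-matching stored profile,
# and accept an explicit spoken correction ("I am John") if the wrong
# profile was loaded. Scores and profile data are illustrative only.

PROFILES = {"alan": {"news": "6pm"}, "john": {"news": "10pm"}}

def identify(voiceprint_scores):
    """Pick the profile with the highest voiceprint similarity score."""
    return max(voiceprint_scores, key=voiceprint_scores.get)

def correct_identity(utterance):
    """Handle an explicit correction such as 'I am John'."""
    if utterance.lower().startswith("i am "):
        name = utterance[5:].strip().lower()
        return name if name in PROFILES else None   # unknown name: keep asking
    return None

user = identify({"alan": 0.91, "john": 0.40})    # best match -> "alan"
user = correct_identity("I am John") or user     # correction -> "john"
```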
Automated capture is the capture of live experiences that can later be accessed and used by the system and the users to review the activities performed (Abowd 2000). The system records the user’s voice commands, gestures and the activities performed, while cameras in the room capture video and still images of activity in the room, all sorted by date and time. This allows users to access recordings of public occasions such as birthdays. To protect users’ privacy, they can opt out of having their actions recorded by requesting “Privacy”; the system still records the activities performed so that it can remember the user’s preferences. Users can also make notes or reminders, which are captured by the system and then transmitted to the user’s chosen destination, e.g. a PDA.
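The capture log with a per-user privacy mode can be sketched like this: when privacy is requested, media recording stops, but the activity name is still logged so preference learning continues. The field names and structure are assumptions for illustration.

```python
# Sketch of the capture log with "Privacy" mode: media is suppressed for
# users who requested privacy, but the activity itself is still logged so
# the system can keep learning preferences. Field names are illustrative.

import datetime

class CaptureLog:
    def __init__(self):
        self.entries = []
        self.privacy = set()   # users who said "Privacy"

    def set_privacy(self, user, enabled=True):
        (self.privacy.add if enabled else self.privacy.discard)(user)

    def capture(self, user, activity, media=None):
        self.entries.append({
            "time": datetime.datetime.now(),              # sorted by date/time
            "user": user,
            "activity": activity,                         # always kept
            "media": None if user in self.privacy else media,
        })

log = CaptureLog()
log.set_privacy("alan")
log.capture("alan", "watching news", media="frame.jpg")
# activity is logged, but the media field is None
```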
Continuous interaction is the presentation of computing as a constant presence supporting users’ simple, informal daily activities (Abowd 2000). When the system is not being used directly, it continues to run activities, e.g. recording preferred shows such as Alan’s news. The system can run tasks in parallel, e.g. while Alan is watching a movie, the system records the news. Activities can be paused, interrupted or swapped, and resumed at a later time, giving the user greater flexibility in his choice of activities.
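The pause/interrupt/resume behaviour can be sketched by giving each activity its own state, so pausing one (e.g. the movie) never disturbs another running in parallel (e.g. the news recording). The `Activity` class and its fields are assumptions for illustration.

```python
# Minimal sketch of parallel activities with pause/resume: each activity
# keeps its own state and position, so one can be paused for a phone call
# while another keeps running. Structure is an illustrative assumption.

class Activity:
    def __init__(self, name):
        self.name = name
        self.state = "running"
        self.position = 0      # e.g. seconds into playback

    def pause(self, position):
        self.state, self.position = "paused", position

    def resume(self):
        self.state = "running"
        return self.position   # continue from where the user left off

movie = Activity("movie")
news_recording = Activity("record news")   # runs in parallel, unaffected

movie.pause(position=1200)                 # a phone call interrupts the movie
# news_recording is still "running"
movie.resume()                             # playback continues from 1200
```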
Digital.Living.Room.Ver2.0 presents Deft Homes with an excellent opportunity to invest in a concept living room of the future, designed to use pervasive computing to accept user commands and allow users to remain in control while supporting a wide range of activities.