FAQ

On this page you will find answers to some frequently asked questions regarding SiAM-dp.

Runtime

Topics related to the execution of dialogue applications.

There is already a running instance of the dialogue application that blocks the port. Kill it!

There is a temporary lock between Jetty services and consumers.

In the OSGi start configuration, set the start level of the plug-in com.eclipsesource.jaxrs.publisher to at least 5.

This error occurs if you try to start a project from the wrong location.

To solve this problem, open the run configurations. In the “Plug-ins” tab, select the project you want to start and add the required plug-ins. Then open the “Arguments” tab. In the “VM Arguments” section you will find the following entries:

-Dmmds.configuration.location=${project_loc:/[Wrong path]}/configurations
-Dmmds.resources.location=${project_loc:/[Wrong path]}/resources

Change the wrong path to the path of your project, e.g. “de.dfki.iui.example_application”. Then restart your application.
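
For example, if your project is named de.dfki.iui.example_application, the corrected entries would read:

-Dmmds.configuration.location=${project_loc:/de.dfki.iui.example_application}/configurations
-Dmmds.resources.location=${project_loc:/de.dfki.iui.example_application}/resources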

java.lang.ArithmeticException: Double coercion: de.dfki.iui.mmds.core.emf.datatypes.BString:(…)

A bug in Jexl sometimes causes it to try to parse a double from a BString inside a Jexl expression. Append a toString() call to the BString in order to get the real String datatype behind the BString.
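
A minimal sketch of the workaround (slotValue is a hypothetical variable assumed to hold a BString): instead of $expr(slotValue == "yes"), write $expr(slotValue.toString() == "yes").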

This is not possible with the current SiAM-dp version. The dialogue component is a singleton.

The first matching transition is triggered; the others are ignored.

Dialogue Design

Topics related to the concepts of dialogue design and how to properly use the editors.

Yes, you can! Just use the Assign executable in the OnEntry or OnExit slot of a state, or in the OnTransition slot of a transition. The expression in the Expression property is also executed if no To property is given.

If you want to react to an event, such as user input or a sensor notification, you have multiple options, some of which are used in combination. Essentially, you need three things:

  • An event that is triggered
  • An action that is performed
  • A connection between the event and the action

Triggering the event:

Events are usually triggered by an input device. For example, a microphone input device can trigger a speech input act, or a GUI device can trigger a button press event. If you are using a device that is not yet covered, you may need to add a new device. For example, if you would like to react when a robot reaches a certain position, you can create a custom device for your robot, and have the robot notify SiAM-dp of the position change via the SiAM-dp i/o interface. (You have to decide whether to extend the i/o model or use the ad-hoc custom i/o message format.) Learn more.

In a situation where you have an existing type of input event, but your dialogue application needs the event in a different (semantic) format, consider writing an input interpreter that transforms the input event from its syntactic to a semantic form.

Implementing the action:

Actions are usually performed by an output device. For example, a speaker output device can play synthesized speech (TTS), or a GUI device can display a message. As with input devices, you can create your own device if none of the included devices covers your requirements. For example, you could create a robot device that can react to Move actions. Learn more.

In a situation where you have a semantic type of output event, but your device needs the event in a syntactic, device-specific format, consider writing an output generator that generates a device-specific syntactic output event from the semantic one.

Using output devices yields high reusability of actions, i.e. you can use the external device implementing the action in multiple dialogue applications. However, there is some overhead involved in creating a device, in particular if it is an external device. A faster alternative for defining actions is to write a plug-in. This is simply some Java code integrated into the dialogue application. A method of your plug-in class can be executed as a reaction to an event. Learn more.

Connecting event to action:

You normally define the reaction (the relation between events and actions) as part of the dialogue model. Using the declarative syntax ensures that the dialogue application remains easy to maintain. Since the dialogue is modeled as a state chart, the reaction to an event is a transition between states in SiAM-dp. So you first need to pick the state in which you want to react to the event, and then create a transition to the state the event leads to (this can be the same state). The transition is annotated with the event itself (or rather the type of event to be matched). The action to be performed is described either in the same transition, in the OnEntry slot of the target state, or in the OnExit slot of the source state.

In some rare situations, you may need to filter and react to events purely programmatically, for example when you want to implement your own dialogue management that is not based on state charts. In this case, you can be notified of all incoming events by writing a plugin for a new component that subscribes to a specific set of messages, filtered by PPatterns. (A wizard that supports the creation of a new component plugin will come soon.)


Yes, it is possible to create your own EMF model and use it in your dialogue application. One important point is that your new model should inherit from some of the SiAM-dp base concepts, e.g. from base:Entity for new concepts describing entities that can be used as semantic content. There are some good tutorials on the internet about EMF model generation. For use at runtime you need to generate model code (here); for use in the SDK and in editors you also need edit code (here). Editor code is not necessary.

The domain-specific model you created in a new project is not directly known in your Eclipse workbench environment. For this, your newly created bundle must be added to the active workbench. There are two ways to achieve this:

  • Install the new plugin in your active workbench and restart Eclipse. This has the disadvantage that you have to reinstall the plugin whenever you make changes to your model.
  • Install the InPlace Bundle Activator, which allows dynamic bundle management and can activate a bundle from the workspace at runtime.

Plugins are Java classes that are registered with the dialogue application under a given namespace. Registered plugins can be accessed from Jexl expressions within SiAM models, e.g. dialogue specifications, semantic mapping rules, GUIs, or grammar rules. Plugins can be used for computations that are easier to realize directly in Java code than in Jexl expressions or with Executables in a dialogue model. Furthermore, the application developer has direct access to other Java APIs.

Creating a new plugin is quite easy. The only requirement for a Java class that is registered as a plugin is that the class is available to the class loader of your dialogue application's OSGi bundle (the bundle where all model resources are defined). Furthermore, the class and all methods you want to invoke must be public.

A plugin is registered in the project specification resource of your application, which normally has the file extension .project. Use the following steps:

1. Create a new Java Plugin child for the Project concept

2. Set the class name property of the new Java Plugin to the name of your Java class (use the fully qualified name including the package), e.g. de.dfki.iui.example_application.MyPlugin

3. Set the namespace property, which is used for accessing the plugin, e.g., to myPlugin

When the dialogue application is started, an instance of the indicated class is created. The class can then be accessed in a Jexl expression under the given namespace. Method return values and call arguments are fully supported. If your Java class defines, for example, the method String getStringForValue(int value), you can call it inside a Jexl expression as follows: myPlugin.getStringForValue(5).
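
A minimal sketch of such a plugin class, using the names from the example above:

package de.dfki.iui.example_application;

// Registered in the .project file under the namespace "myPlugin" (see the steps above).
public class MyPlugin {

  // Callable from a Jexl expression as myPlugin.getStringForValue(5)
  public String getStringForValue(int value) {
    return "The value is " + value;
  }
}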

Inside plugins you can comment your code as in any other Java class.

In the model you cannot currently leave comments.

To change the value of a variable you can use the Assign executable in OnEntry, OnExit, or OnTransition. In the property field “Expression” insert the new value of your variable (this can also be a Jexl expression). In the property field “To” insert the name of the variable. You can also use Java plugins in the expression that defines the value of your variable (e.g. $expr(JavaPlugin.someFunction())).
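
A minimal sketch (counter is a hypothetical variable): set To to counter and Expression to $expr(counter + 1) to increment the variable whenever the state is entered or the transition fires.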

A second option is to set the default value of the variable when it is declared. The content can again be a literal value or a Jexl expression.

Yes, the property field supports Jexl expressions, so constructions like the following are possible:

$expr("The current value of variable X is " + variableX + ". You are welcome.")

Yes, each Java extension is specified in the project model as a Java Plugin. You can add arbitrary entries to the plugins slot. The Java plugin is identified by its namespace, which must be unique. See also here.

To change the language you have to do the following steps:

1. Start the AudioManager with the desired language. (Changes will have to be made in the XML config file to get the right ASR and TTS languages.) The newest version of the AudioManager automatically uses the language that is specified in the grammar or in the speech output request.
2. Open the configurations folder in the project folder and open the file de.dfki.mmds.speech_recognition.grammar_manager.properties. Change the value of LANGUAGE to the desired language code (e.g. de-DE for German or en-US for English), as shown in the sketch after this list.
3. Add your grammar rule sets to the project space; the language is specified in the rule set. Then, in the .project file, select the Project item and add the grammar to the Grammar Rules property in the project specification.
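
For step 2, the relevant entry in the properties file might look like this (a sketch; the rest of the file stays unchanged):

LANGUAGE=de-DE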

  • You can also add grammar rule sets for different languages to a project. SiAM-dp automatically selects the rule set that matches the language specified in the speech recognition properties in step 2.
  • You can also specify the language in the program arguments when starting the executable. Use the option -language <LANGUAGE-CODE>, e.g. -language de-DE for German. This setting overrides the setting in the properties file.
  • You can change the language at runtime by changing the properties file.

To create a multi-language application you have to create a separate grammar rule set for each language. The tutorial shows how to set up a new grammar rule.

If you have a GUI that should display a button label in the current language, or you want the TTS to speak a text in the current language, you can create “localization rules”. To create such rules, add a new SiAM-dp model in the folder “localization” inside the resources folder. The localization has a property that defines the language; here you have to enter the language code (for example de-DE for German). The localization rules are mappings from a key to a specific sentence. In multi-language applications you have to create a localization mapping for each language. It is important that the key for a specific sentence is the same in all languages.

For example:

In the German localization: lampIsOn -> Die Lampe ist eingeschaltet.

In the English localization: lampIsOn -> The lamp is already on.

“lampIsOn” is the key that determines which sentence should be used. The platform then chooses the value according to the language that is set in the platform.

To use those rules in the GUI, set the label property of the GUI element to $expr(ResourceManager.getString("lampIsOn")).

Note: The argument of getString has to be set to the desired key. In our example this is “lampIsOn”.

In order to use localization for TTS, create a speech output in the dialogue and set the utterance property to $expr(ResourceManager.getString("lampIsOn")).

The changes are made analogously to the steps above.

You can only use one language while your application is running. To set the language, open the configurations folder in the project folder and open the file de.dfki.mmds.core.resourcemanager.properties. Change the value of LANGUAGE to the desired language code (e.g. de-DE for German or en-US for English).

  • You can also specify the language in the program arguments when starting the executable. Use the option -language <LANGUAGE-CODE>, e.g. -language de-DE for German. This setting overrides the setting in the properties file.
  • You can change the language at runtime by changing the properties file.

Speech / Audio

Topics related to speech recognition, speech synthesis, and general audio output. Many questions concern the use of the external Audio Manager component.

Windows may block plug-ins from being loaded if your browser / extraction program flags them as having been downloaded from the Internet.

The solution is to “unblock” the downloaded archive file prior to extracting it. It is not sufficient to unblock the AudioManager.exe file. You can unblock a file by right-clicking it in Windows Explorer, selecting “Properties”, and then clicking the “Unblock” button at the bottom or (in Windows 10+) checking the “Unblock” checkbox. (If there is no “Unblock” button/checkbox, the file has already been unblocked.)

In the .project file, choose the item “TCP Device microphone”, which represents the connection to your microphone. Right-click it, choose “add child” > “mode”, and then “Speech Recognizer Mode”. Change the Mode property of the new item to SpeakToActivate.

You can change the recognition mode dynamically at runtime. In this case, you have to add a new transition. Add an “(OnTrigger)Send” and after that an “Update Device Mode”. Change the Device property to the device you want to change (e.g. the microphone). Now add a “Speech Recognizer Mode” and change the Mode property to the mode you want. After this transition is fired, the mode of the device is changed.

Note: The ptt parameter in the XML config for Audio Manager should be set to 1 in either case.

The language of your Windows version must match the speech grammar language. If this is the case, try to rename the entry “SpeechServerAsrHandler” to “DesktopAsrHandler” in the file default.xml. Alternatively, you can install MS Speech Server 11 (32-bit version only, Runtime + Speech Recognition + Speech Synthesis) and keep the SpeechServerAsrHandler configuration. For this, please follow these links:
http://www.microsoft.com/en-us/download/details.aspx?id=27225
http://www.microsoft.com/en-us/download/details.aspx?id=27224
Find more information in the AudioManager documentation.

Make sure that the required version of the .NET Framework is installed.

Make sure that the Audio Manager is not already running. Check the tray area of the taskbar and the process list. If it is not running, check whether another application is using the same port.

When starting for the first time, it is recommended to run explicitly with administrator privileges. Right-click the executable in Windows Explorer and choose “Run as Administrator”.

  • Some additional details can be found in the system event log. To open it, run “eventvwr” from the command line. Then, under “Application and Service logs”, select “SiAM-dp”. You can also quickly access the event log from the Audio Manager console’s menu.
  • If no details are written to the event log, the log or event source may not be installed. Run Audio Manager explicitly as an administrator to do that.

Try to set the bit depth to 16 using the configuration parameter bitdepth.

In version 1.0.8, some processing behavior was changed (actually fixed), which may have this side effect when using old (“wrong”) config files. Check your config file for a line such as

<PhysicalDevice channel="0">…</PhysicalDevice>

and change it into

<PhysicalDevice channel="-1">…</PhysicalDevice>

Speech Server is probably not working correctly.

  • Make sure the runtime and the language pack(s) are installed and that you have selected the 32-bit version of Speech Server and (if applicable) of the voice / recognition engine (even if you are running a 64-bit OS).
  • Alternatively, change the Audio Manager configuration from SpeechServerAsrHandler to DesktopAsrHandler or from ServerTtsHandler to DesktopTtsHandler. Whether this works depends on your OS version, OS language, installed language packs, and grammar language.
  • See the section on SAPI ASR and TTS plug-ins in the Audio Manager documentation for more details.

You are trying to use a recognizer language with the desktop ASR for which you don’t have the corresponding language pack installed/functioning on your system. If you use Windows 8.x or Windows 7 Ultimate, you can install additional language packs via the control panel (Language settings in 8.x or Windows Update in Windows 7). In Windows 10, you have to use the Settings app to install the speech input module for the desired language (click Time and Language, then Region and Language, select a language (you may need to add it first), click Options, then click the Download button next to Speech Recognition).

If you don’t have this option, consider changing the DesktopAsrHandler to SpeechServerAsrHandler by replacing the corresponding strings in your configuration file. For the Speech Server ASR, you can download any required language packs from the web (see “Installing the Speech Server ASR”) independently of the OS language.

In setups with many connectors, devices, channels, or high sampling rates, performance issues can lead to stuttering. First, check whether other processes with a high CPU load may be the cause. Next, try the following:

  • Reduce resampling by configuring connectors to use the device sampling rate OR change the physical device sampling rate to match the connector’s rate.
  • A 1:1 channel mapping reduces the required channel multiplexing.
  • Enable buffering on the logical device level (TBD) at the cost of response time.
  • If you have implemented your own connectors, try to “pre-compute” audio data if possible.

This can actually have numerous reasons and requires a systematic analysis of all factors, including signal, configuration, grammar, etc. You can use the following as a troubleshooting guideline.

  • Test in a quiet environment. Background noise (especially speech) can severely influence recognition quality. Special types of microphones (e.g. directional ones) can reduce the effects of background noise. Also make sure no TTS output interferes with the recording. (If you need TTS at the same time as recording, you will need to configure a special barge-in setup – see the corresponding question.)
  • Always speak into the microphone from a close distance (5-10cm). A headset may help.
  • Depending on the type of microphone, it should point towards the speaker.
  • In push-to-talk mode, make sure you do not start speaking too early.
  • Use an external microphone. Even a very cheap external microphone usually works much better than a microphone built into a laptop or other device, which picks up fan noise and vibration and is further away from the source.
  • Attach a windshield to the microphone to reduce the effect of air movement in the environment or of plosives such as P and T. This is especially important for (high-end) condenser microphones.
  • Make sure that the correct microphone is selected as input. Sometimes, even though an external microphone is connected, the internal one is still used for ASR because it is set as default recording device or manually configured (this can be hard to detect).
  • Make sure the amplification settings and post-processing effects (should be none) for the microphone set in the system audio control panel or external sound board are correct. For microphones requiring 24/48V phantom power, ensure this power is provided.
  • If the audio is too silent, you may try to apply Automatic Gain Control. Audio Manager provides such a mechanism as part of the Voice Capture DSP.
  • Try with a different microphone and placement.
  • Test the audio signal for noise and artifacts. You can use the configuration dialog of the ASR to play back the last recognizer input. This is preferred over other (external) recording applications, as it represents the final post-processed signal used for the actual recognition. (If a third-party ASR does not support this feature, you can have Audio Manager write the recording to a disk file using the corresponding connector.) You should be able to hear a clear, undistorted speech signal without missing segments. Sudden signal breaks and noise may indicate a cable break or short-circuit.
  • If there is background noise you cannot eliminate by changing the physical setup, you can try to use a noise suppression filter. Audio Manager provides such a mechanism as part of the Voice Capture DSP.
  • If you are using a barge-in setup, you may need to adjust the echo cancellation parameters.
  • Be sure that the language (culture) is set correctly in the configuration and in the grammar files.
  • If there are names or badly recognized words in your grammar, especially if they are from a different language, specifying them with a phonetic transcription instead of the text notation is likely to increase recognition accuracy.
  • If you use a grammar, always try to minimize the number of rules / utterances possible at any time. The more utterances can be recognized, the higher the possible ambiguity. Utterances which you don’t need to recognize should be disabled by dynamically updating the grammar.
  • If utterances (or even nonverbal audio input) that are not part of your grammar are wrongfully recognized as application vocabulary, try to add a “garbage dictation grammar” (see the corresponding option in the ASR parameters in the AM documentation).
  • Dictation (free text input) is a very difficult topic with untrained speaker profiles (the default for the ASRs supported by Audio Manager). Short utterances are expected to be error-prone in any case because there are so many alternatives. Complete sentences should work better. Consider enabling training via configuration parameters if possible if you know that only a single speaker will use the system.
  • Using very short words in your grammar is generally more challenging for ASR. You can try to change your input accordingly.
  • You can sometimes fine-tune the behavior of certain ASRs by looking at individual recognition alternatives returned and their confidences. You may also be able to change the alternatives behavior through configuration parameters – check the ASR reference for details.
  • Try with a different speaker. Some voices are more difficult to recognize than others. Accents and dialect do not make things easier either.

Generally, barge-in means that the user can already speak before a TTS output has finished, thereby interrupting the TTS. The method depends on the ASR mode.

  • In push-to-talk mode, your application needs to listen for the ASR event that signals that the PTT button was pressed. There is both a connector message and a SiAM-dp DeviceStateChanged client notification available for that purpose. When this signal is received, the application should cancel all TTS output.
  • In speak-to-activate mode, your application similarly needs to listen for the ASR event that signals speech, and cancel all TTS output in that case. An additional challenge in this mode is to prevent the TTS from triggering ASR events if loudspeakers are used (echo). You will need to add acoustic echo cancellation (AEC) for that. Audio Manager provides the Voice Capture DSP for that purpose, which you can add to your configuration (see “Voice Capture DSP” in the section “Common Plug-ins” of the Audio Manager documentation for details).

You need to run Audio Manager as an administrator to host a streaming server.

  • Recovery from such disconnection situations is a new feature of Audio Manager that may not yet work under all circumstances.
  • In any case, if the device is not re-detected, invoke a manual device scan by selecting Devices > List Devices in the console menu, and then clicking the Refresh button.

You can use the field named ssml in the SpeechSynthesis output representation to create more detailed utterances.

Please check out http://www.w3.org/TR/speech-synthesis/ for further information about SSML.

  • Check whether you have sufficient bandwidth for streaming. Streaming uncompressed data at a constant bitrate requires a certain bandwidth, which you can calculate (see the example after this list). Other network traffic going on at the same time reduces this bandwidth, and in WLANs other network nodes may also reduce it. You can conserve bandwidth by reducing the audio format quality, e.g. by choosing a sample rate of 16 kHz instead of 48 kHz, or by using mono instead of stereo.
  • Repeated data loss can also cause delays and drops. If you are using a wireless connection with bad transmission conditions (distance, obstacles, interference…), you are likely to experience drops. This is particularly bad with TCP-based streaming (including HTTP streaming), since each drop will cause streaming to block until a retransmission occurs.
  • Make sure that your client device has enough processing power. We have observed that interruptions in streaming mistakenly attributed to the network can also be caused by an endpoint device with low processing power (Google Glass), which was unable to handle the streaming at high bitrate in parallel with other tasks.
  • Generally, the streaming monitors (configuration pages for streaming virtual devices / servers) in Audio Manager will help you analyze throughput and identify the bottleneck of your streaming.
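
As a rough example of the bandwidth calculation (assuming uncompressed 16-bit PCM): 16,000 samples/s × 16 bit × 1 channel = 256 kbit/s (32 kB/s), whereas 48 kHz stereo requires 48,000 × 16 × 2 = 1,536 kbit/s (192 kB/s), plus protocol overhead.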

Try to reduce the buffer size for your StreamingPlaybackDevice.

On the dialogue side, you need to send an output act that includes a representation of type AudioTrack. In this representation, you can either use the trackId or the uri attribute to specify the sound to be played. The first is the ID of a track in your library and is usually a bit faster in execution. The second specifies the URL of a local (file://…) or remote (http://…) audio file. You need to address the output act to a device that can play audio. This can be a speaker device provided by Audio Manager, e.g. the same device you use for the TTS.

If you want to use Audio Manager to play the sound, you need to ensure that its configuration includes a SoundLibraryConnector. To that end, the XML config file needs to include a Connector section like the following:

<Devices xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="AudioDevices.xsd">
  <Device id="spk1">
    <PhysicalDevice>DEFAULT_PLAYBACK</PhysicalDevice>
    <Connector id="tts" class="DFKI.Automotive.Audio.TTS.DesktopTtsHandler">
      […]
    </Connector>
    <Connector id="snd" class="DFKI.Automotive.Audio.SoundLibraryConnector">
      <Parameters>
        <dir>%HOME%\AudioManager\soundlib</dir>
      </Parameters>
    </Connector>
    […]
  </Device>
</Devices>

You may adjust the sound library directory (<dir>…</dir>) as necessary. You can then place uncompressed .WAV files into that directory or a subdirectory. You can use the file name (without extension) as the trackId in the AudioTrack representation (see above) to refer to a file.
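
For example (hypothetical file name): a file soundlib\chime.wav could then be referenced from the AudioTrack representation with trackId “chime”.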


General Input / Output

Topics related to connecting input/output devices in general, specific modalities other than speech, and extending SiAM-dp with custom devices. (A tutorial for creating custom devices is available from the Tutorials page.)

Yes, this is possible. In many simple and static scenarios, it is possible to add all i/o devices to the device configuration in advance, as outlined in the tutorial. However, in some scenarios, you may not know all devices beforehand, and additional devices are discovered while the dialogue is running (e.g. a user entering the room with her smart watch), or devices are removed.

For a massively multimodal system it is important that the system is aware of existing and available devices. In SiAM-dp, the device manager is responsible for this: a service that keeps track of the registered, discovered, and connected devices. A device is represented in the project specification with information about the modality, the communication direction, channel information, an application-internal identifier, and a physical identifier. Furthermore, the project specification provides information about which user has access to the device and in which dialogue session the device is involved.

Device Assignment

The device manager maintains three different lists for the device management:

  1. Registered Devices: These are devices that are registered in the project definition and thus are supported by the dialogue application. Additionally, new devices can be registered and existing devices can be unregistered during runtime.
  2. Discovered Devices: This is a list of discovered devices that exist in the environment and are available to the dialogue application. Devices can be registered either by the device discovery in device plugins that run inside the OSGi framework or by specifying a TCPDevice with the corresponding IP address and port number. These can be provided directly in the project specification (or be discovered by the knowledge server: planned for a later version). Connections to TCP devices are established automatically. Devices can be discovered and lost during runtime.
  3. Connected Devices: Devices that are actually connected and involved in the dialogue application.

During runtime not every registered device must be present and not every discovered device must be connected to the dialogue application. In fact, the device manager permanently compares the list of discovered devices with the list of registered devices. If a device is discovered that meets the requirements of a registered device, the dialogue application connects to this device and makes it accessible to the dialogue. Furthermore, the device manager is responsible for creating a subscription pattern that subscribes to those messages from the event manager that are addressed to the device.

There are two stages in the matching process between registered and discovered devices. First, if the physical id of the device is given in the device registration, the assignment is explicit and only a discovered device with an equal physical id matches. If no physical id is given, the matching is done by unification over the attributes given in the device descriptions, such as user, modality, or communication direction. Thus it is possible to register devices with a dialogue application independently of the concretely connected devices, which makes devices easily exchangeable.

Device Manager Interface:

The device manager is accessible via the OSGi service framework. It provides an interface of type IDeviceManager:

public interface IDeviceManager {
  // a device is discovered/lost by the knowledge manager or 
  // another bundle that provides a device implementation
  // these methods manipulate the list of discovered devices
 
  void deviceDiscovered(Device device);
  void deviceLost(Device device);
 
 // adds a device to the list of registered devices. 
 // The device is not necessarily available and active.
 
  void registerDevice(Device device);
  void unregisterDevice(String deviceId);
 
  // confirms the connection/disconnection to a device
 
  void deviceConnected(Device device);
  void deviceDisconnected(Device device);
 
  void reset();
}
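
A minimal sketch of how a bundle might obtain the device manager via the OSGi service registry (this uses the standard OSGi API; the import for IDeviceManager comes from the SiAM-dp device manager bundle and is omitted here):

import org.osgi.framework.BundleContext;
import org.osgi.framework.ServiceReference;

public class DeviceManagerLookup {

  // Looks up the IDeviceManager service; returns null if it is not (yet) registered.
  public IDeviceManager lookup(BundleContext context) {
    ServiceReference<IDeviceManager> ref = context.getServiceReference(IDeviceManager.class);
    return (ref != null) ? context.getService(ref) : null;
  }
}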

Note that when using dynamic devices, it may no longer be appropriate to refer to devices by their ID when generating output. Instead, you should generate output that selects the output device, for example, by user and modality (e.g. you output a SpeechSynthesis act that has addressee = user1 and modality = SPEECH). SiAM-dp will then find a matching output device if one is available.

Even more generally, you could create only modality-independent semantic output and have a generator produce the syntactic representation for those modalities that are available.

An OutputAct is used to send output to devices for rendering. It consists of a syntactic representation (the presentationAlternatives field) and a semantic content (the communicativeFunction field). It is possible to specify only one of the two: an output act without semantic content works fine for many simple dialogue applications. An output act without syntactic content is likely not ready for immediate rendering by an output device, but may be post-processed by an output generator component that generates a presentation matching the semantic content.

Most output acts will contain only a single PresentationAlternative. However, if you would like to generate multiple alternative presentations from which the output device or the presentation planning engine can choose the most appropriate one, you can also do that. (Normally, it is better to sort out the right alternative before sending them to the output device, since a single multimodal task may be sent to multiple devices which could choose different alternatives.) For example, a visual and an auditive alternative can be generated, from which one will be selected based on the context. The notion is that the alternatives are exclusive (XOR).

The PresentationAlternative in turn consists of one or more OutputRepresentation instances in the presentation field. Again, most output tasks will only need a single representation, but in some cases it may be useful to create a combined output act for presentations consisting of multiple aspects. This is often used to realize multimodal fission. For example, an output may involve giving a visual effect and playing some audio. Hence, the different presentations are ANDed together. It is possible that the same presentation is processed by multiple output devices, where each device picks the part(s) that it supports (e.g. the speaker picks the sound part and the light bulb picks the light effect part).

The actual representations then are derived from OutputRepresentation. For example, for speech synthesis output, you can use the concept SpeechSynthesis. You can always extend the model with custom concepts if needed, or fall back to the general-purpose CustomFormat.


The GUI model contains a UIElement named HTMLView. It can either show valid HTML given in the content attribute, or display an HTML page referenced by a URL in the source attribute. The renderer of an HTMLView element should provide a window element that is able to display HTML; in the built-in HTML5 renderer, for example, this is realized with an IFrame.

For GUIs that are not too complex and don’t require a lot of special effects / animation, the SiAM GUI model may be the best option. Here, you can declare a GUI using a model that extends the SiAM-dp i/o model with graphical input/output elements. These controls can be backed with communicative functions and primitive tasks, thereby supporting semantics directly linked to the UI. The GUI requires a renderer to render the model. SiAM-dp includes an HTML5 input/output device that can render the interface into a browser window, making it very portable. If you need another output mechanism, you could write a custom renderer, though this is quite a bit of work to support all GUI elements and GUI update mechanisms.

Another alternative for implementing a GUI is to use an existing GUI or implement a new one (e.g. in Adobe Flash or XAML), and connect it to SiAM-dp via custom messages. The GUI would in that case simply be a new input/output device – see the corresponding FAQs and tutorials for implementing one. Since the UI is tailored to a special application, it may be most appropriate to use the CustomFormat input/output representation type to exchange messages between SiAM-dp and the GUI.

You can extend SiAM-dp with custom devices. There are multiple ways how this can be done. See the tutorial for more details.

If you create an external TCP device, you will be exchanging XML fragments with SiAM-dp over a text-based channel. The XML format is directly derived from the EMF model of the data being sent: elements in EMF correspond to XML tags, and attributes in EMF are represented by XML attributes. Be sure to define the namespaces correctly as in the ecore model. An XML example is included in the tutorial.

Hint: If you are implementing your device in .NET (C#), there is a library available here that contains a reusable socket server that exposes I/O messages as .NET classes.

Make sure that each message ends with a closing \0 (ASCII 00) character; otherwise it will not be accepted by the event manager.
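
For illustration, a minimal sketch of sending one message from an external Java client (host, port, and the XML payload are placeholders; the XML itself must match your EMF model as described above):

import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class TcpDeviceClient {

  // Sends one XML message and terminates it with \0 as required by the event manager.
  public static void send(String host, int port, String xmlMessage) throws Exception {
    try (Socket socket = new Socket(host, port);
         OutputStream out = socket.getOutputStream()) {
      out.write(xmlMessage.getBytes(StandardCharsets.UTF_8));
      out.write(0); // closing \0 (ASCII 00)
      out.flush();
    }
  }
}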