Design - Multiple vs. Single Tasks within one skill?
The scenario is having multiple possible non-trivial tasks within a single skill - e.g. 'Alexa - tell to do "A"' (where there is a choice of several unrelated or semi-related tasks, "A", "B", "C", etc.). I want to use FSMs, with each task represented by a single FSM. However, having multiple tasks within a single skill complicates this design.

Would you combine/refactor all tasks/FSMs into a single (huge) FSM? It's always possible, but it seems like a testing/maintenance/extensibility/diagramming nightmare. Or do you split the single skill into multiple skills? That's easy, but not as appealing. The analogy is having one app in the App Store that does multiple things vs. multiple apps that each do one thing, and I prefer the first approach, assuming size isn't a factor.

To summarize, I prefer using a separate FSM for each task for ease of maintenance, diagramming, implementation, testing, and extensibility, but I am not quite sure how to design this when there are multiple tasks. The initial part is easy: fetching the appropriate FSM based on the Intent. But what about task-jumping (intentional or not), serial dependency (i.e. do "A" before "B"), partial dependency (do some of "A" before "B"), testing, diagramming, maintenance, etc.?

And while I am a huge proponent of FSMs, their shortcomings (both diagrammatically and voice-design-wise) are becoming apparent when it comes to extending functionality. For example, if I want to add "Start Over" functionality, or the ability to "go back" and change an answer that was given 3 answers ago, the changes in diagramming, code, voice interaction, and test cases become non-trivial. Yes, there are workarounds and hacks (and my favorite, just disallowing things) for all of this, but I'm looking for an elegant solution.

Any recommended advice, best practices, or design patterns? This seems analogous to the ubiquitous phone tree with multiple options (e.g. calling a Dr's office: are you a patient, do you want billing, to make an appointment, lab results, to schedule a procedure, refills, etc.), so it's probably nothing new, although phone trees are hardly a model of a good user experience.

Thanks in advance.
Mathematically, you can always represent any collection of FSMs with a single FSM. Practically, as you point out, there are maintenance issues once you hit a certain degree of complexity.

Personally, my design ethic is end-user-centric. If the best user experience is hard to implement, well, there's one of me that has to go through the headache, and (hopefully) lots of users. It will end up being a better skill if you work out how to deal with the complexity. But, of course, there are Alexa limits.

The key decision point for me would be whether these various sub-tasks all require divergent utterances. If the sub-tasks have a lot of overlap in the utterances necessary to drive them, then I would combine them. If there is only a small overlap, then I would make them different skills.

So, with that in mind, my suggestions for the two different paths:

1) If you are going to separate them out, push what common elements you can into libraries of shared code. That will help with maintenance and support. I'm a big fan of templated applications. TsaTsaTzu has a billion apps for Android, mostly thanks to templating (https://play.google.com/store/search?q=jaquinta&c=apps).

2) If you are going to combine them all together, embrace the FSM mechanism. Create a generic FSM, driven by a table that you keep in data. The table should index input symbols (intents) against snippets of functionality. Don't hard-code the FSM. This way your complexity is isolated into maintaining that table. Put all your code in snippets, and reference them from the table. You might even be able to write some tools to check the table's integrity, or to generate automated tests (which you could run with EchoSim).
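
To make suggestion 2 a bit more concrete, here is a minimal Python sketch of a table-driven FSM. Everything in it is an assumption for illustration: the state names, the intent names (OrderIntent, SizeIntent, StartOverIntent), the snippet functions, and the "*" wildcard row are hypothetical and not part of any Alexa SDK. The wildcard is one possible way to handle something like "Start Over" without adding a row for every state.

```python
from typing import Callable, Dict, List, Tuple

# A snippet receives the session attributes and returns the speech text.
Snippet = Callable[[dict], str]

def ask_size(session: dict) -> str:
    return "What size would you like?"

def confirm_order(session: dict) -> str:
    session["confirmed"] = True
    return "Your order is placed."

def start_over(session: dict) -> str:
    session.clear()
    return "OK, starting over. What would you like to do?"

# Hypothetical snippets, referenced from the table by name.
SNIPPETS: Dict[str, Snippet] = {
    "ask_size": ask_size,
    "confirm_order": confirm_order,
    "start_over": start_over,
}

ANY = "*"  # assumed wildcard: the row applies in every state

# (current state, intent) -> (snippet name, next state)
TABLE: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("START", "OrderIntent"): ("ask_size", "AWAIT_SIZE"),
    ("AWAIT_SIZE", "SizeIntent"): ("confirm_order", "DONE"),
    (ANY, "StartOverIntent"): ("start_over", "START"),
}

def step(state: str, intent: str, session: dict) -> Tuple[str, str]:
    """Look up the row for (state, intent), run its snippet, return (next state, speech)."""
    entry = TABLE.get((state, intent)) or TABLE.get((ANY, intent))
    if entry is None:
        # No row matches: stay in the same state and reprompt.
        return state, "Sorry, I can't do that right now."
    snippet_name, next_state = entry
    return next_state, SNIPPETS[snippet_name](session)

def check_table(table: Dict[Tuple[str, str], Tuple[str, str]],
                snippets: Dict[str, Snippet],
                terminal: frozenset = frozenset({"DONE"})) -> List[str]:
    """Integrity check: every row names a known snippet and leads somewhere useful."""
    problems = []
    sources = {state for state, _ in table}
    for (state, intent), (snippet_name, next_state) in table.items():
        if snippet_name not in snippets:
            problems.append(f"{state}/{intent}: unknown snippet {snippet_name!r}")
        if next_state not in sources and next_state not in terminal:
            problems.append(f"{state}/{intent}: dead-end state {next_state!r}")
    return problems
```

With this shape, adding a task or a global command means adding rows to TABLE (and maybe a snippet) rather than touching step, and check_table is the sort of small tool that can run before each deploy. The same table could also be walked to generate test conversations.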