Alexa Skill development for Amazon Echo and other Alexa-enabled devices has become a topic that more companies are starting to think about. Because Amazon Alexa use has risen so rapidly, demand for Alexa Skills has outpaced the supply of knowledgeable developers. Amazon is attempting to fill this void by highlighting agencies that develop Alexa Skills and development frameworks that make Skills easier to build. This post digs into several Alexa development framework options that are worth exploring for anyone looking to build an Alexa Skill.
The main tool that Amazon is trying to get developers to use is the Alexa Skills Kit SDK (Software Development Kit) for Node.js. With this kit you use JavaScript to make your Alexa Skill work within AWS Lambda. This kit will help you build responses based on the requests that come from Alexa.
Amazon created the Alexa Skills Kit SDK on the assumption that developers will build their Skills entirely within AWS. You can use other AWS services in conjunction with your Skill, and this kit makes that quite a bit easier than programming it from scratch. This appears to be Amazon’s recommended way of starting, or at least the toolkit that Amazon itself is promoting the most.
The Alexa Skills Kit SDK is maintained on GitHub.
On its own, the Alexa Skills Kit SDK can make it a little difficult to build and test the Alexa Skills that you want to create. It does not come with a good testing tool, and you have to create a Lambda function for your Skill and connect a development Alexa Skill before you can begin any testing. Each time you change the code you have to update the Lambda function and retry the Skill using your voice and the Amazon Echo. This might not seem that bad considering you are developing a Skill that will use a real voice, but the time it takes to deploy even a tiny code change can be quite frustrating.
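One way to shorten that loop, whatever framework you choose, is to call the Lambda handler locally with a hand-written request before deploying anything. The sketch below uses Python rather than the Node.js SDK discussed above. The handler, the intent name `HelloIntent`, and the sample event are all hypothetical, and the event only mimics the general shape of an Alexa request.

```python
import json


def handler(event, context=None):
    """Hypothetical skill handler that answers one made-up intent."""
    request = event["request"]
    if request["type"] == "IntentRequest" and request["intent"]["name"] == "HelloIntent":
        text = "Hello from the skill."
    else:
        text = "Sorry, I did not understand that."
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }


def make_intent_event(intent_name):
    """Build a minimal, hand-written event shaped like an Alexa IntentRequest."""
    return {
        "version": "1.0",
        "session": {"new": True, "sessionId": "local-test-session"},
        "request": {
            "type": "IntentRequest",
            "requestId": "local-test-request",
            "intent": {"name": intent_name, "slots": {}},
        },
    }


if __name__ == "__main__":
    # Run the handler exactly as Lambda would, but on the local machine.
    reply = handler(make_intent_event("HelloIntent"))
    print(json.dumps(reply, indent=2))
```

A check like this does not replace testing with a real voice on a device, but it catches most logic errors without a deploy.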
Another option is the Alexa-App framework created by Matt Kruse. This framework is not affiliated with Amazon, but it is gaining a good following and now has multiple people working on it. Matt Kruse has also created a testing tool called Alexa-app-server that lets you test your Alexa Skills in a web browser with immediate feedback. This can greatly increase the speed of development for your Alexa Skill. Other people have started parallel projects based on Matt’s original work and have created tools that you might want to use, depending on what you want your Skill to do.
This project is maintained on GitHub.
The Alexa Skills Kit SDK and Alexa-App are the two Alexa development framework options that are most used and, in my opinion, should be the starting point for developers new to Amazon Alexa development. The next few are options that Amazon lists on their Agencies and Frameworks page, but would take some initial research to determine whether they will work for you.
Bespoken Tools has recently started promoting its Alexa Skill development tools. The tools are command line/terminal based and let developers work in multiple programming languages (Java, Python or JavaScript), whereas most of the other frameworks allow only one programming language (typically JavaScript). Bespoken also provides a development environment and deployment tools. This looks like it could be a promising framework for developing Alexa Skills, but since it is fairly new it is probably worth testing out before fully committing your project to its tool set.
Another Alexa development framework that has a more inclusive tool set is PullString. This looks like it would be great for developers who want a conversational app on multiple platforms. PullString can help to develop apps for Messenger, Skype, Slack, Alexa and possibly others. If you plan on making your Skill or conversational application work across multiple platforms, it is definitely worth checking out PullString.
Amazon lists Conversable as a framework for developing Alexa Skills on its Agencies and Frameworks page, but on closer inspection it doesn’t seem to be quite as open as some of the other Alexa development frameworks. Conversable has a good-looking website, but without open documentation it is difficult to tell how useful the tools are. This is another one that might be worth considering if you are building a cross-platform application, but you will probably have to talk to the sales team before you can investigate the framework and tools further.
With Alexa Skills becoming popular and developer demand growing, it is worth taking some time to decide on a tool set before committing to a new project. Depending on your needs and development experience, there are multiple options that might work for you.
With Google Home and other conversational app platforms also on the rise, you might be tempted to use one of the cross-platform tools. This might work for you, but you should also decide whether you want to be locked into a certain framework, especially given how new and rapidly advancing Alexa development is in general.
My advice for people totally new to Alexa Skill development is to start with either the plain Alexa Skills Kit SDK or the Alexa-App framework. If your needs expand beyond what those tools have to offer, change your tool set at that point. If you are already a seasoned Alexa Skill developer and are considering more complex applications that would benefit from being cross-platform, then it might be worth spending some time investigating the other frameworks.
Now that you understand frameworks, how do you go about building your app? Years ago we developed the PingRing.com Alexa Phone Finder, which consists of both an iOS app and an Alexa Skill. This was not the first iOS application built at boberdoo.com, but it was our first time building an Alexa Skill, and our first time building one that also works with a mobile application.
As the Amazon Echo has gained popularity and more households are using smart devices, it is worth considering whether new software and products should incorporate voice functionality. Up until the last year or so, voice functionality was by and large a Star Trek fantasy or only used in very specific industries. Even before the holiday season was over, there were over five million Amazon Echo devices sold and over three thousand Alexa Skills developed for them. These numbers are undoubtedly even higher now, and Amazon was unable to keep up with demand for the devices over the last few months.
Amazon has improved its Alexa Skill development process and has also drastically improved the ability of users to find Skills in the Alexa App. So, for a lot of businesses, now is a good time to start looking into a voice service that could tie into your application or business.
We decided to build a Skill that works with an iOS application so that people can use Alexa to find their phone, find friends or family members, or send messages to people in their group. We had an advantage that some businesses might not have: we were starting completely from scratch, so we could use the tools and the platform that were most closely tied to Amazon Alexa. We based the entire backend of our application in AWS, and tried to leverage AWS to make our development as easy as possible.
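We started our development planning by thinking about what information we wanted to be able to get from Amazon Alexa, and what actions we wanted Alexa to take on our behalf. The core features we decided to build for Alexa follow from the uses described above: finding your phone, finding friends or family members, and sending messages to people in your group.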
In order to do this we had to figure out how we would connect our phones to the Amazon Echo.
Amazon has created a pretty good system for developers to use for creating Alexa Skills. The Alexa Skills Kit that Amazon made for developers does all of the difficult language processing. All developers have to handle is text and some user information in their application, then send text back to the Alexa Skills Kit for it to turn into sound and a response to the Amazon Echo user.
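To make the "text in, text out" idea concrete, here is a minimal sketch of a handler that reads the already-parsed request and returns text for Alexa to speak. It is written in Python for illustration and is not the code behind our Skill. The intent name `FindPhoneIntent`, the slot name `Person`, and the spoken phrases are hypothetical, and the JSON fields follow the general shape of the Alexa request and response format.

```python
def speech_response(text, end_session=True):
    """Wrap plain text in the JSON envelope that Alexa turns into speech."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def slot_value(request, name):
    """Return the text Alexa recognized for a slot, or None if it is absent."""
    slots = request.get("intent", {}).get("slots") or {}
    return (slots.get(name) or {}).get("value")


def lambda_handler(event, context=None):
    request = event["request"]
    if request["type"] == "LaunchRequest":
        # The user opened the skill without asking for anything specific.
        return speech_response("Whose phone should I ring?", end_session=False)
    if request["type"] == "SessionEndedRequest":
        # No speech can be returned once the session has ended.
        return {"version": "1.0", "response": {}}
    if request["type"] == "IntentRequest" and request["intent"]["name"] == "FindPhoneIntent":
        person = slot_value(request, "Person")
        if person:
            return speech_response(f"Okay, ringing {person}'s phone.")
        return speech_response("Okay, ringing your phone.")
    return speech_response("Sorry, I can't help with that yet.")
```

The handler never touches audio: by the time it runs, the spoken words have already been reduced to an intent name and slot values.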
There are two main ways to process the information to and from the Alexa Skills Kit.
1. Send the data to a URL somewhere on the internet (a URL that does something in your application, running on your server)
With this option, you will have to deal with a number of things that you otherwise wouldn’t have to worry about if you use AWS Lambda to process everything. When you send the data to a URL for an application and server that you run, you will have to pay for all of the costs to maintain that server, and also make sure that you can scale that server and application with the possible demand coming from all the possible Alexa requests.
2. Process the Alexa Skills Kit information using one or more AWS Lambda functions
With this option, you only pay for the processing that you use, and AWS scales the function automatically with demand, so for most Skills capacity is not something you have to plan for. AWS Lambda also has an extremely generous free tier, which drastically reduces your up-front investment in a new application. For our application, we decided to go the route of using AWS Lambda, and for most use cases I would also recommend using AWS Lambda.
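A rough way to see how far the free tier stretches is to estimate monthly usage. The sketch below assumes the commonly published free-tier allowances of one million requests and 400,000 GB-seconds per month, which should be checked against current AWS pricing. The traffic figures in the example are made up.

```python
# Assumed free-tier allowances; verify against the current AWS pricing page.
FREE_REQUESTS_PER_MONTH = 1_000_000
FREE_GB_SECONDS_PER_MONTH = 400_000


def monthly_usage(invocations, avg_duration_ms, memory_mb):
    """Estimate request count and compute time (GB-seconds) for one month."""
    # GB-seconds = invocations * seconds per invocation * memory in GB
    gb_seconds = invocations * avg_duration_ms * memory_mb / (1000 * 1024)
    return {
        "requests": invocations,
        "gb_seconds": gb_seconds,
        "within_free_tier": (
            invocations <= FREE_REQUESTS_PER_MONTH
            and gb_seconds <= FREE_GB_SECONDS_PER_MONTH
        ),
    }


if __name__ == "__main__":
    # Hypothetical skill: 200,000 requests a month, 300 ms each, 128 MB of memory.
    print(monthly_usage(200_000, 300, 128))
```

Under these assumptions, a Skill with 200,000 short requests a month uses 7,500 GB-seconds, a small fraction of the compute allowance.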
Pretty much the only reason I could see people wanting to use a URL that points to one of their servers is if they have a complex application already built that would not translate well to using within AWS.
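If you do host the endpoint yourself, one of the extra chores is verifying that each request really comes from Alexa and is recent. The sketch below covers only the freshness check and leaves out signature verification entirely. It assumes the request carries an ISO 8601 UTC timestamp such as `2017-01-01T12:00:00Z` and uses a 150-second tolerance, a value that should be confirmed against Amazon's current requirements.

```python
from datetime import datetime, timedelta, timezone

# Assumed tolerance for request age; confirm against Amazon's documentation.
MAX_SKEW = timedelta(seconds=150)


def is_timestamp_fresh(timestamp, now=None):
    """Return True if the request timestamp is within MAX_SKEW of the current time."""
    sent = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return abs(now - sent) <= MAX_SKEW
```

With AWS Lambda this kind of check is not something you write yourself, which is part of why the Lambda route involves less work.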
We used a few AWS services that greatly decreased the time and work required to bring this application to life. The main services within AWS that we used were:
AWS Mobile Hub
We used AWS Mobile Hub to help get us started with the backend required for a mobile application that does all of its networking and communication through the AWS cloud. Mobile Hub is a great starting point for creating an AWS-backed mobile application, although we did run into quite a few documentation issues. In fact, the most difficult part of pretty much the entire project was sifting through the AWS documentation and discovering just how outdated or clearly incorrect some very crucial parts of it were. Amazon changes newer parts of AWS very frequently, and sometimes the documentation can lag behind for a year or more.
After we figured out how to work with AWS Mobile Hub and how to integrate an iOS application into AWS, we were able to focus on building the parts that were specific to our application. Mobile Hub and the demo apps it provides for Android and iOS go a long way in helping to build your specific application. Mobile Hub also provided the backbone for all the other AWS services we used and made it very easy to track and send push notifications to mobile devices.
DynamoDB
We used DynamoDB to store almost all of the data for our application, and I would highly recommend it to other developers creating a new application that isn’t heavily dependent on a relational data structure. DynamoDB is very easy to scale as long as you follow the guidelines provided by AWS, and it also has a very generous free tier. Pricing for DynamoDB is based on your read/write capacity, and you only pay for storage volume over 25 GB.
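The capacity-based pricing mentioned above can be estimated with a little arithmetic. This is a minimal sketch, assuming provisioned capacity and the commonly documented unit sizes: one read unit covers one strongly consistent read per second of an item up to 4 KB, and one write unit covers one write per second of an item up to 1 KB. The item sizes and request rates are made up, and the unit sizes should be checked against current AWS documentation.

```python
import math


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate read capacity units; eventually consistent reads cost half."""
    units_per_read = math.ceil(item_size_bytes / 4096)  # reads are billed in 4 KB steps
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2
    return math.ceil(total)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate write capacity units; writes are billed in 1 KB steps."""
    return math.ceil(math.ceil(item_size_bytes / 1024) * writes_per_second)


if __name__ == "__main__":
    # Hypothetical table: 3,000-byte items, 10 reads and 5 writes per second.
    print(read_capacity_units(3000, 10))   # strongly consistent reads
    print(write_capacity_units(3000, 5))
```

Keeping items small matters: an item just over 4 KB doubles the read cost compared with one just under it.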
Amazon Cognito
We used Amazon Cognito to manage all of our user authentication, and also manage permission to access other AWS services. This is also a fairly new product offered by AWS, but something I would highly recommend if you are starting a new application in AWS from scratch.
API Gateway
We used API Gateway for part of our application to manage network requests that were later processed in AWS Lambda. We didn’t use this as fully as we could have, but I would recommend that others look into this further if they are developing an application based in AWS that requires a lot of API usage.
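As an illustration of a request passing through API Gateway into Lambda, here is a minimal sketch of a handler. It assumes the Lambda proxy integration, where the function receives the HTTP method and raw body and returns a status code and body. The source does not say which integration we used, and the `deviceId` field and "queued" status are hypothetical.

```python
import json


def _response(status_code, payload):
    """Build the response shape expected from a Lambda proxy integration."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def lambda_handler(event, context=None):
    if event.get("httpMethod") != "POST":
        return _response(405, {"error": "method not allowed"})
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "body must be valid JSON"})
    device_id = payload.get("deviceId")  # hypothetical field name
    if not device_id:
        return _response(400, {"error": "deviceId is required"})
    # Real application logic (for example, a database write) would go here.
    return _response(200, {"deviceId": device_id, "status": "queued"})
```

Keeping validation in the function means the same checks run no matter which client sends the request.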
AWS Lambda
Finally, we used AWS Lambda in many places as the “glue” that communicates between other AWS services, and also does our application logic that needs processing in the cloud. Before this project I had not used AWS Lambda much at all, but once I became used to using it, I started seeing a lot more potential applications. I think this is one of the most important products that AWS has created, but might take a while for developers to fully understand and figure out how to integrate it into their application architecture.
Aside from building our actual application, it is also worth mentioning how we released our application to end users and the process of doing so. There is a big difference in how apps are released in the iOS app store vs how “Skills” are released through Amazon’s Alexa service. Both have pretty strict requirements and do their best to not allow bad or unusable apps into their service. That is generally a good thing.
The biggest difference in terms of difficulty between the two platforms is how transparent they are about what needs to be done in order to release an app or Skill. Apple has tons of documentation about guidelines for app submission, but once you officially submit the app to Apple, you are pretty much at their mercy. Their app reviewers can be anywhere from extremely helpful to downright infuriating. In our case, unfortunately they were downright infuriating. I won’t go into too much detail, but for other apps I have released to the iOS app store I have encountered similar situations and I have heard a number of similar stories from other developers. Apple should remedy this by putting a lot more effort into communicating with developers.
Amazon, on the other hand, was extremely helpful and extremely responsive throughout the release process. Our Skill required multiple submissions to get right, but Amazon gave very clear descriptions, not only of what was wrong, but also of what needed to be done to pass their requirements. I was very impressed with Amazon’s testing and help with getting an Alexa Skill approved.
From what I have seen so far in the Alexa Skill market and the iOS app store, there aren’t very many mobile applications that also integrate with the Amazon Echo. I am sure there will be more in the future, and I can only imagine that more people will run into issues trying to release their application to both simultaneously. The part that we found most difficult in that respect was timing the release on both platforms. Apple does allow developers to schedule a release for their application after the app has passed Apple’s approval. This can be very helpful for developers. In our situation, we had to release the Alexa Skill first so that Apple could test our application using an Amazon Echo. This would have been okay if Apple had been quick to release our application, but in our case Apple delayed the release of our iOS application by more than a month for reasons they never explained to us, which led to multiple negative reviews from Amazon Echo users. Amazon Echo users who initially tried our Skill were unable to use the iOS application, and that might have affected the initial opinions of potential users. This could be greatly improved if Amazon had a better beta program or limited release program for their Alexa Skills. Hopefully Amazon will release something like this in the future.
Amazon Alexa Skill development has not been around for a very long time, and has a lot of room to grow. I think Amazon might have been caught off guard with the demand from consumers for the Amazon Echo devices and I can only imagine that Amazon will continue to improve their platform. Apple’s iOS app store has been around for a while, and while I am sure some improvements will be made, I don’t think Apple’s platform will change very significantly in the next few years.
Hopefully this post has helped illuminate the process of creating Alexa Skills that work with mobile applications and might inspire other developers that are considering creating an Alexa Skill for their business or application. We definitely learned a lot from the process of creating this application, and will have a much better idea of what to expect if we go down a similar path in the future.