question

hshahaws asked

Getting an error while trying to make an HTTP request to speechrecognizer

I am trying to create an app that interacts with AVS via an iOS app built in Swift. I have handled Login with Amazon, getting the access token via "AIMobileLib.getAccessTokenForScopes(["alexa:all"], withOverrideParams: nil, delegate: self)", and I am using that token to make a POST request to https://access-alexa-na.amazon.com/v1/avs/speechrecognizer/recognize with the following code.

Creating audioData:

    audioData = NSData(contentsOfURL: fileURL)
    print("the sound file is at \(fileURL.path!)")
    let soundFileURL = NSURL(fileURLWithPath: fileURL.path!)
    let recordSettings = [AVFormatIDKey: NSNumber(unsignedInt:kAudioFormatLinearPCM),
        AVEncoderAudioQualityKey: AVAudioQuality.Min.rawValue,
        AVEncoderBitRateKey: 16,
        AVNumberOfChannelsKey: 1,
        AVSampleRateKey: 16000.0,
        AVLinearPCMIsBigEndianKey: "YES",
        AVLinearPCMIsFloatKey: "YES"]

HTTP POST request:

    func postRecording() {
        //urlString += "\(self.accessToken!)"
        let finalURL = urlString.stringByAddingPercentEscapesUsingEncoding(NSUTF8StringEncoding)!
        let request: NSMutableURLRequest = NSMutableURLRequest(URL: NSURL(string: finalURL)!)
        //request.cachePolicy = NSURLRequestReloadIgnoringLocalCacheData
        request.HTTPShouldHandleCookies = false
        request.timeoutInterval = 60
        request.HTTPMethod = "POST"
        print("\nAccessToken:\(self.accessToken!)")
        request.addValue(" access-alexa-na.amazon.com", forHTTPHeaderField: "Host")
        request.addValue("Bearer \(self.accessToken!)", forHTTPHeaderField: "Authorization")
        let boundry: String = "BOUNDARY1234"
        let contentType: String = "multipart/form-data; boundary=\(boundry)"
        request.addValue(contentType, forHTTPHeaderField: "Content-Type")
        request.addValue(" access-alexa-na.amazon.com", forHTTPHeaderField: "Host")
        let httpBody: NSMutableData = NSMutableData.init()

        //PG: JSON multipart header
        httpBody.appendData("--\(boundry)\r\n".dataUsingEncoding(NSUTF8StringEncoding)!)
        httpBody.appendData("Content-Disposition: form-data; name=\"request\"".dataUsingEncoding(NSUTF8StringEncoding)!)
        httpBody.appendData("\r\n".dataUsingEncoding(NSUTF8StringEncoding)!)
        httpBody.appendData("Content-Type: application/json; charset=UTF-8".dataUsingEncoding(NSUTF8StringEncoding)!)
        httpBody.appendData("\r\n".dataUsingEncoding(NSUTF8StringEncoding)!)

        //PG: JSON multipart body
        httpBody.appendData(self.createMetadata()!.dataUsingEncoding(NSUTF8StringEncoding)!)
        httpBody.appendData("\r\n".dataUsingEncoding(NSUTF8StringEncoding)!)

        //PG: JSON Audio multipart header
        httpBody.appendData("--\(boundry)\r\n".dataUsingEncoding(NSUTF8StringEncoding)!)
        httpBody.appendData("Content-Disposition: form-data; name=\"audio\"".dataUsingEncoding(NSUTF8StringEncoding)!)
        httpBody.appendData("\r\n".dataUsingEncoding(NSUTF8StringEncoding)!)
        httpBody.appendData("Content-Type: audio/L16; rate=16000; channels=1".dataUsingEncoding(NSUTF8StringEncoding)!)

        //PG: Audio multipart body
        // httpBody.appendData(self.audioData!)
        httpBody.appendData("\r\n".dataUsingEncoding(NSUTF8StringEncoding)!)

        //PG: Terminating boundary term
        httpBody.appendData("--\(boundry)--".dataUsingEncoding(NSUTF8StringEncoding)!)

        //PG: POST request to AMZN URL
        var config: NSURLSessionConfiguration = NSURLSessionConfiguration.defaultSessionConfiguration()
        var session: NSURLSession = NSURLSession(configuration: config)
        request.HTTPBody = httpBody
        do {
            let string1 = NSString(data: httpBody, encoding: NSUTF8StringEncoding)
            print(string1!)
        } catch {
            print("could not write httpbody")
        }
        let task = NSURLSession.sharedSession().dataTaskWithRequest(request) { data, response, error in
            if error != nil {
                print("error has occurred in http request : \(error)")
                return
            }
            print("response : \(response?.description)")
            var parseError: NSError?
            do {
                let responseObject: AnyObject? = try NSJSONSerialization.JSONObjectWithData(data!, options: NSJSONReadingOptions.MutableContainers)
                if let responseDictionary = responseObject as? NSDictionary {
                    print(responseDictionary)
                } else {
                    print("response is not a dictionary")
                }
            } catch {
                print("error has occurred in http request")
            }
        }
        task.resume()
    }

I am getting the following response:

    response : Optional(" { URL: https://access-alexa-na.amazon.com/v1/avs/speechrecognizer/recognize } { status code: 400, headers {\n Connection = \"keep-alive\";\n \"Content-Encoding\" = gzip;\n \"Content-Length\" = 185;\n \"Content-Type\" = \"application/json\";\n Date = \"Mon, 07 Mar 2016 16:34:37 GMT\";\n Server = Server;\n Vary = \"Accept-Encoding,User-Agent\";\n \"x-amzn-ErrorType\" = \"BadRequestException: https://developer.amazon.com/edw/home.html\";\n \"x-amzn-RequestId\" = \"7baced67-e482-11e5-ad81-0dd2128b4b85\";\n} }")

    {
        error = {
            code = "com.amazon.alexahttpproxy.exceptions.BadRequestException";
            message = "No content to map due to end-of-input\n at [Source: [LineReaderInputStreamAdaptor: [pos: 0][limit: 0][]]; line: 1, column: 1]";
        };
    }
alexa voice service

hshahaws answered
Following is the httpBody that is formed (in case it helps):

    --BOUNDARY1234
    Content-Disposition: form-data; name="request"
    Content-Type: application/json; charset=UTF-8
    {
      "messageBody" : {
        "locale" : "en-us",
        "format" : "audio/L16; rate=16000; channels=1",
        "profile" : "alexa-close-talk"
      },
      "messageHeader" : {
        "deviceContext" : [
          {
            "namespace" : "AudioPlayer",
            "payload" : {
              "playerActivity" : "IDLE",
              "streamId" : "",
              "offsetInMilliseconds" : "0"
            },
            "name" : "playbackState"
          }
        ]
      }
    }
    --BOUNDARY1234
    Content-Disposition: form-data; name="audio"
    Content-Type: audio/L16; rate=16000; channels=1
    --BOUNDARY1234--
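Two things stand out in the body above, although the thread does not confirm that either caused the 400: the part headers run straight into the JSON with no blank line between them, and the line that appends the audio bytes is commented out. The multipart format expects an empty line (CRLF CRLF) between a part's headers and its content. Below is a minimal sketch of a body builder that follows that layout. It is written in current Swift syntax rather than the Swift 2 syntax used in the thread, and `makeRecognizeBody` and its parameter names are hypothetical, not part of any AVS library.

```swift
import Foundation

/// Builds a multipart/form-data body with a JSON "request" part followed by
/// a binary "audio" part. The function name and parameters are illustrative.
func makeRecognizeBody(metadataJSON: String, audio: Data, boundary: String) -> Data {
    var body = Data()
    // Small helper so the text pieces read like the wire format.
    func append(_ text: String) { body.append(Data(text.utf8)) }

    // JSON part: headers, then a blank line, then the content.
    append("--\(boundary)\r\n")
    append("Content-Disposition: form-data; name=\"request\"\r\n")
    append("Content-Type: application/json; charset=UTF-8\r\n")
    append("\r\n")
    append(metadataJSON)
    append("\r\n")

    // Audio part: same layout, but the content is raw bytes.
    append("--\(boundary)\r\n")
    append("Content-Disposition: form-data; name=\"audio\"\r\n")
    append("Content-Type: audio/L16; rate=16000; channels=1\r\n")
    append("\r\n")
    body.append(audio)
    append("\r\n")

    // Closing delimiter.
    append("--\(boundary)--\r\n")
    return body
}
```

The result would be assigned to the request body in place of the hand-assembled `httpBody`, with the same boundary string used in the `Content-Type` request header.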

Eric@Amazon answered
A 400 response is often due to incorrect audio. The audio format should be 16 kHz, LITTLE-endian, signed 16-bit PCM. Is that what you're sending? One thing you might want to try is using Audacity to import the audio data in the above format and verifying that it's clear and understandable.
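For comparison, the recordSettings in the question request big-endian, floating-point samples (and pass the strings "YES" rather than Booleans). A sketch of recorder settings that would match the format described above might look like the following. The key names are the standard AVFoundation settings keys; the constant name `recorderSettings` is only illustrative, and the block is guarded so it compiles only where AVFoundation is available.

```swift
#if canImport(AVFoundation)
import AVFoundation

// Settings for 16 kHz, mono, little-endian, signed 16-bit integer PCM.
let recorderSettings: [String: Any] = [
    AVFormatIDKey: Int(kAudioFormatLinearPCM),
    AVSampleRateKey: 16_000.0,
    AVNumberOfChannelsKey: 1,
    AVLinearPCMBitDepthKey: 16,
    AVLinearPCMIsBigEndianKey: false,  // little-endian
    AVLinearPCMIsFloatKey: false       // integer samples, not float
]
#endif
```

These settings would be passed to `AVAudioRecorder(url:settings:)` when creating the recorder.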

hshahaws answered
Hey Eric, I tried with clear audio in the given format, and now I am getting the following response:

    { URL: https://access-alexa-na.amazon.com/v1/avs/speechrecognizer/recognize } { status code: 200, headers {
        Connection = "keep-alive";
        "Content-Encoding" = gzip;
        "Content-Type" = "multipart/related; boundary=859b03c8-0779-447e-ae27-88bcc36b7b88; start=metadata.1457387122280; type=\\"application/json\\"";
        Date = "Mon, 07 Mar 2016 21:45:22 GMT";
        Server = Server;
        "Transfer-Encoding" = Identity;
        Vary = "Accept-Encoding,User-Agent";
        "x-amzn-RequestId" = "e3414a8d-e4ad-11e5-bc44-e16b8ecee342";
    } }

How do I actually read the audio and play it in my app? Here is the module that does the HTTP call:

    let task = NSURLSession.sharedSession().dataTaskWithRequest(request) { data, response, error in
        if error != nil {
            print("error has occurred in http request : \(error)")
            return
        }
        print("response : \(response?.description)")
        do {
            let responseObject: AnyObject? = try NSJSONSerialization.JSONObjectWithData(data!, options: NSJSONReadingOptions.MutableContainers)
            if let responseDictionary = responseObject as? NSDictionary {
                print(responseDictionary)
            } else {
                print("response is not a dictionary")
            }
        } catch {
            print("error has occurred in serialization")
        }
    }
    task.resume()

Eric@Amazon answered
If you look at the example HTTP response in the REST API, you can see what format the response is: https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/rest/speechrecognizer-recognize-response

In your response, the boundary term is 859b03c8-0779-447e-ae27-88bcc36b7b88, so your response should look like this:

    --859b03c8-0779-447e-ae27-88bcc36b7b88
    Content-Disposition: form-data; name="audio"
    Content-Type: audio/mpeg
    Content-ID: ABCXYZ
    --859b03c8-0779-447e-ae27-88bcc36b7b88--

To just get something working, I would recommend putting everything between Content-Id: and --859b03c8-0779-447e-ae27-88bcc36b7b88-- into a file, and then playing that file as an MP3. I'd also suggest investigating a multipart HTTP parsing library - it should make your life a little easier. Good luck!
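The manual extraction described above can be sketched with Foundation alone. This is a rough illustration under a few assumptions, not a general multipart parser: it assumes the whole response body is already in memory, that each part's headers end with an empty line (CRLF CRLF), and that the delimiter bytes never occur inside the audio. The function names `multipartBoundary` and `firstPartBody` are hypothetical, and the code uses current Swift syntax rather than the Swift 2 syntax of the thread. It works on raw bytes throughout, because converting MP3 data to a string would corrupt it.

```swift
import Foundation

/// Reads the boundary parameter out of a Content-Type header value.
func multipartBoundary(fromContentType contentType: String) -> String? {
    for parameter in contentType.components(separatedBy: ";") {
        let trimmed = parameter.trimmingCharacters(in: .whitespaces)
        if trimmed.hasPrefix("boundary=") {
            let value = String(trimmed.dropFirst("boundary=".count))
            // The value may or may not be quoted.
            return value.trimmingCharacters(in: CharacterSet(charactersIn: "\""))
        }
    }
    return nil
}

/// Returns the content of the first part whose headers mention `contentType`.
func firstPartBody(in body: Data, boundary: String, contentType: String) -> Data? {
    let delimiter = Data("--\(boundary)".utf8)
    let blankLine = Data("\r\n\r\n".utf8)
    guard var open = body.range(of: delimiter) else { return nil }

    // Each part sits between one delimiter and the next.
    while let close = body.range(of: delimiter, in: open.upperBound..<body.endIndex) {
        let partRange = open.upperBound..<close.lowerBound
        if let split = body.range(of: blankLine, in: partRange) {
            let headerBytes = body.subdata(in: partRange.lowerBound..<split.lowerBound)
            let headers = String(decoding: headerBytes, as: UTF8.self)
            if headers.lowercased().contains(contentType.lowercased()) {
                // The CRLF right before the next delimiter is not part of the content.
                let contentEnd = max(split.upperBound, close.lowerBound - 2)
                return body.subdata(in: split.upperBound..<contentEnd)
            }
        }
        open = close
    }
    return nil
}
```

With the response from the thread, the boundary would come from the `Content-Type` response header and the audio from `firstPartBody(in: data, boundary: boundary, contentType: "audio/mpeg")`; the returned bytes could then be written to a file, or handed to `AVAudioPlayer(data:)`.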

hshahaws answered
Thanks Eric. Could you suggest a multipart library that would help me extract the audio bytes from the response data? I tried using subdataWithRange(NSRange) to extract the audio bytes and play them via AVPlayer, but it does not work. I also tried copying everything in data from the Content-ID to the last boundary into an MP3 file and playing that. It did not work either. It would be great if you could help me out with this.

hshahaws answered
Toggling the question to not answered.
Message was edited by: hshahaws

Eric@Amazon answered
Unfortunately, I can't recommend a particular Swift parsing library. We don't use Swift for any reference material, so we have not investigated any HTTP parsing libraries.