How exactly do you train Livy to learn a new function like help call detection using artificial intelligence? And what do mushrooms, an elf and Chile have in common? Today, we're sharing more transparent insights into the development of our new Smart Living Station feature.
What do mushrooms, an elf and Chile have in common?
They serve as acoustic negatives for the German word for help, "Hilfe": the German words for mushrooms ("Pilze") and elf ("Elfe"), as well as "Chile", all sound similar to "Hilfe" and help train the AI model behind Livy's call-for-help recognition. Livy thus learns to tell different words apart and trains with a wide variety of vocabulary.
Thanks to artificial intelligence, Livy will soon be able to respond independently to calls for help. Unfortunately, an AI-based function cannot be implemented within a few weeks. In our previous article, we explained what AI-based training looks like and what requirements must be met for Livy to learn to recognize a call for help.
So far so complicated.
Learning to recognize a call for help requires not only a suitable training model, but also a large amount of data from which the AI learns. For Livy, this means processing as large a volume of audio recordings as possible, from which the Smart Living Station learns to recognize calls for help. Unfortunately, breathing "help" into the microphone a few times and handing the data to Livy is not enough. Quite the opposite: it takes many voices, pitches and accents, sometimes even a mumbled "help" or background noise.
In short: it takes tons of data!
Why does Livy use negative examples to learn how to recognize calls for help?
Livy must not only be fed the word "help", but also plenty of negative examples, so that the station learns to distinguish the word "help" from all kinds of other, especially similar-sounding, words on its own.
But why are the negative examples the basic requirement for Livy to reliably recognize the word help?
Of course, we don't want Livy to trigger at random for our users, but to act with the highest possible hit rate, ideally reacting only to real calls for help. So it needs many similar words that it learns to identify as "no call for help".
Moreover, in everyday life we often use the word "help" in contexts that have nothing to do with a real call for help: we say "help", "helps", "helped", "provide help" and much more, always in different contexts and conversations.
Livy is not a human who listens to a conversation and filters for context; in the end, it is "just" a device that must trigger on a specific word in an emergency. To manage that, Livy has to learn many different words so it can distinguish "right" from "wrong".
Of course, we want to perfect the call for help recognition so that Livy recognizes every real call for help and acts immediately in an alarm situation.
At the beginning of the AI training, this is pure guesswork for the Smart Living Station. But once it has its first successes, it will look for ways to score a hit more often. In practice, this would mean that Livy responds not only to "help" but to anything that sounds similar, thereby increasing the likelihood of triggering an alarm at the right time. Of course, this would also lead to many false alarms, which we obviously want to avoid.
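This trade-off can be made concrete with a toy example (the clip labels, predictions and helper function below are purely illustrative, not Livy's actual metrics): a trigger-happy model catches every real call for help but floods the user with false alarms, while a stricter model produces no false alarms but misses a call.

```python
def precision_recall(predictions, labels):
    """Precision and recall for binary trigger decisions (1 = alarm)."""
    tp = sum(1 for p, l in zip(predictions, labels) if p == 1 and l == 1)
    fp = sum(1 for p, l in zip(predictions, labels) if p == 1 and l == 0)
    fn = sum(1 for p, l in zip(predictions, labels) if p == 0 and l == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Ten 4-second audio clips; 1 marks a real call for help.
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]

# A trigger-happy model fires on anything that sounds remotely like "help":
eager = [1, 1, 0, 1, 1, 0, 1, 0, 1, 0]
# A strict model only fires when it is very sure:
strict = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

print(precision_recall(eager, labels))   # catches both real calls, 4 false alarms
print(precision_recall(strict, labels))  # no false alarms, but misses one call
```

The eager model reaches 100% recall at the cost of a precision of only one third; the strict model is the mirror image. The training goal is a model that does well on both at once.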
To illustrate this, let's borrow an example from another field: with a mutual fund, investors (here representing Livy) do not bet on a single stock or property, but spread their capital across the fund to increase the chances of a return. The assumption is not that one bet will maximize profits; instead, the opportunities are spread across many different ones.
Livy acts similarly, at least at the beginning of the training phase. It can already identify very clear calls for help. With the help of the Levenshtein distance, we have identified similar-sounding words for Livy to train with, teaching it a broad vocabulary, so to speak.
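As a sketch of how such near-miss words can be found, here is a minimal Levenshtein distance in Python. The word list and the idea of a fixed edit-distance cutoff are illustrative assumptions; we don't know the exact threshold or vocabulary the team uses:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits (insertions, deletions,
    substitutions) needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # delete ca
                           cur[j - 1] + 1,               # insert cb
                           prev[j - 1] + (ca != cb)))    # substitute ca -> cb
        prev = cur
    return prev[-1]

# German "Hilfe" vs. the acoustic negatives from the intro:
for word in ["pilze", "elfe", "chile", "banane"]:
    print(word, levenshtein("hilfe", word))
# "pilze", "elfe" and "chile" are each only 2 edits away from "hilfe",
# which is what makes them good hard negatives; "banane" is much further.
```

Words with a small edit distance to "Hilfe" also tend to sound similar, which is exactly what the model needs to learn to reject.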
Call for help detection: What influence do background noise and silence have?
Anything Livy does not clearly recognize as a negative example is perceived as a call for help. And that actually includes background noises such as birds chirping, radio music, the television or a doorbell. For our Livy training model, this means it must be fed not only with negative words and phrases, but also with all kinds of background noise. And how do you get that much training data for background noise? Do you sit down in the park with a microphone and record bird songs?
In a very creative moment, our internal AI specialist came up with the idea of breaking a soundscape that is not so far-fetched for our users down into sequences: the German TV show "Musikantenstadl".
The functions of the Smart Living Station are particularly suitable for the (remote) care of people in need of care and for seniors. The Musikantenstadl is popular among seniors, so it was an obvious choice for our employee to take the show as a data set, break it down into 4-second sequences and feed it into the Livy training model.

In addition, there are of course many other data sets from different sources. Sounds in particular are available in various databases and can be purchased or used freely; we draw on these as well and let Livy learn from them. All training data is first cut into uniform 4-second sequences and then labeled so that the training model can distinguish between right and wrong.

To end up with a function that acts with the highest possible hit rate, we are currently training three different models: two that run directly on the station and one that runs on one of our internal servers, as it requires significantly more computing power. All are based on the same approach and are trained with both positive and negative examples. However, they show different success rates.
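The preprocessing step described above, cutting recordings into uniform 4-second pieces and labeling each one, can be sketched roughly like this. The 16 kHz sample rate and the label names are assumptions made for illustration; we don't know the team's actual pipeline:

```python
SAMPLE_RATE = 16_000   # assumed sample rate in Hz
CLIP_SECONDS = 4       # sequence length used for training, per the article

def split_into_clips(samples, rate=SAMPLE_RATE, seconds=CLIP_SECONDS):
    """Cut a raw audio signal into equal-length clips, dropping the remainder."""
    n = rate * seconds
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]

def label_clips(clips, contains_help):
    """Attach a binary label to each clip: the right/wrong marking
    the training model learns from."""
    label = "help" if contains_help else "no_help"
    return [(clip, label) for clip in clips]

# A 9-second dummy recording (e.g. one show excerpt) yields two full
# 4-second clips; the trailing second is discarded.
recording = [0.0] * (SAMPLE_RATE * 9)
dataset = label_clips(split_into_clips(recording), contains_help=False)
print(len(dataset))  # 2
```

Background-noise sources like the TV show excerpts would all be labeled as negatives, while the collected "help" recordings form the positives.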
We can therefore achieve different training results and use the output of one model to feed another, improving the function in a targeted way. In the end, this lets us deploy a reliable model on the Smart Living Station that our users can depend on. As before, you can support us and contribute decisively to the development of the function.
Leave us a comment confirming that you would like to participate. We will then contact you with an individual link that you can use to take part.
Via the link, you can easily enter various words directly via your smartphone, tablet or laptop without any special tools.
NOTE: All data is processed anonymously and forms the basis of a reliable call for help detection with Livy Alive for us. Every single recording helps us to further develop the function. It is not necessary to complete all levels. Thank you for your participation!
Comment now and participate.