Continue Thinking Small: Next level machine learning with TinyML — Maria Jose Molina Contreras

00:00

Thank you very much.

00:05

I'm super happy to be here with all of you, talking about TinyML.

00:11

Before I start, I would like to let you know what kind of talk this will be.

00:18

Just to be sure that everyone is on the same page.

00:21

This is an introductory talk where we are going to cover a lot of topics.

00:26

I'm going to try not to overwhelm you, but we're going to see a lot of things.

00:33

And you're going to discover why the talk is structured this way as we go along.

00:40

Well, I'm going to keep this part very short, because I already got a great introduction.

00:46

Then I'm just going to go into a couple of things.

00:51

In recent years, I have been working mainly with sensor data.

00:56

And these are the kinds of projects that I also love to build in my free time.

01:03

For instance, growing lettuces on my balcony.

01:07

It's something I enjoy a lot, and I truly recommend that you try it.

01:13

Awesome.

01:14

Implementing some machine learning pipelines to increase your plants' happiness.

01:19

And well, as you can see, my plants needed it.

01:23

And it's a clear example that sometimes we don't need machine learning to understand that our plants are not happy.

01:31

But I really enjoy doing these kinds of things.

01:35

And the last project that I developed at home was for keeping my air quality under control.

01:44

And I had a lot of fun building that, because it was a monitoring system but also a predictive system.

01:50

I could know when I should activate my ventilation system, which in other words was just opening a window.

01:59

And I enjoyed it a lot.

02:02

I thought, wait.

02:04

What if I could have this kind of system,

02:07

but with me, when I go out on the street, to see what the pollution level is,

02:13

and to know whether I need to wear a mask or not?

02:16

To do these kinds of things.

02:18

And that was the moment when I started with TinyML.

02:24

Currently, I am developing this project.

02:27

I cannot show you yet because it's not done.

02:31

But maybe it will be ready for the next conference.

02:35

But during this time developing this project, I learned a lot of things.

02:42

And I realized that this knowledge could be shared with the community.

02:47

And maybe it could help someone do it more easily.

02:53

So let's start at the beginning.

02:56

What is TinyML?

02:58

TinyML.

03:00

It's an intersection of two great worlds.

03:03

It's an intersection between data science and electronics.

03:07

And specifically, between machine learning and embedded systems.

03:12

And wow, it's awesome, because you have two great worlds working together.

03:19

Yeah, that's true.

03:21

However, it could be kind of challenging sometimes because you need some knowledge of data science

03:29

and also some knowledge of electronics.

03:32

Or you need two teams that can work together and have good communication between them.

03:39

But we are going to cover all these topics so that you can build your own projects as well.

03:47

Why TinyML?

03:48

Well, one of the key points from my perspective is data privacy, because you're going to be able to

03:58

manage your data and your predictions on the device, without needing access to a cloud or anything else.

04:06

Low energy consumption.

04:08

These devices don't require a lot of energy or power.

04:13

Low latency: since inference runs on the device, it's going to be faster.

04:17

And also connectivity.

04:19

If you're working with IoT systems and this kind of sensor data,

04:26

maybe you are going to face internet connection problems from time to time and lose data,

04:32

or you're going to have issues, or maybe you need to have this system in a place where the internet doesn't reach.

04:40

This is one of the big ones.

04:43

And maybe you're thinking, yeah, but does this have applications in real life?

04:47

Yes, there are applications.

04:50

We can find examples today, and also for the future.

04:54

One of the topics that I also think is very, very important is healthcare.

05:00

And that's because of the data privacy that we were talking about.

05:06

If you have a device that tracks your personal information and can keep that data private,

05:15

maybe that's a good way to do things.

05:20

Of course, this is only one example, one publication, but there are many of them that we can discuss later on.

05:27

Also, agriculture.

05:29

Maybe you're thinking, why agriculture?

05:32

Well, the internet doesn't reach everywhere.

05:34

Not everyone has access to the internet there, or has the ability to have a device, or even a mobile phone.

05:43

Some models for predicting pests, predicting what is happening to your plants, can help farmers around the world.

05:53

Also, smart spaces: it's important to understand whether there is a person in the room or not, for security reasons,

06:01

sometimes in some spaces.

06:04

Then we have more.

06:06

We have predictive maintenance.

06:08

This is one of my favorite topics, and I have to mention it because I have been working on it for a long time.

06:15

And that's because you can also predict whether something is going to fail, in places where maybe the internet connection doesn't reach.

06:22

Also, we have wildlife conservation: there are projects where we can currently track elephants, in this case, but there are also others with whales.

06:31

In this case, it's to stop poachers and prevent the elephants from being killed.

06:39

And also, we have voice recognition.

06:41

Maybe this is familiar to you: "Hey, blah blah blah..."

06:48

And something happens.

06:50

Maybe some song starts to play.

06:56

Yeah, maybe you're familiar with some of those.

06:59

This is also an application of this technology.

07:02

Then, as we were saying before, there are some challenges, or things that you need to take into consideration, before starting this kind of project.

07:13

And we are going to start with the embedded systems first, the electronics part.

07:20

The first thing is the development boards: which one are you going to use, what are the needs of your project, and so on.

07:29

Of course, this is my bias, because these are the boards that I am currently using for my experiments.

07:36

And specifically, I am playing around with these two, but of course you can choose whatever board you need or want.

07:49

What is important?

07:51

Understanding what the needs of your project are.

07:54

And this means that you always need to have in mind what your data science question or problem is.

08:03

And we are going to see this later in the machine learning part, because even if it's very cool to say, "I'm going to take this board because it's a trending topic,"

08:14

maybe it's not the best one for your business case, or your experiments, or whatever you want to implement, right?

08:22

We are going to see later on.

08:24

But just so you know, I'm a super fan of these, also because they're super small.

08:29

I'm not sure if you can see it, because it's very small.

08:34

But just to give you a sense of the size.

08:40

What are some of the challenges in this area?

08:44

It's very hard.

08:47

It's very hard to have standards and tell you "you need to do A, B, C" for all the boards.

08:53

And of course, if you are in the electronics world, this will be very easy for you.

08:58

But if you come from the data side, it will be more challenging.

09:03

And of course, there are the constraints that we are going to find.

09:09

And this is something very important because when we're developing things in data science, of course,

09:15

we try to be efficient.

09:17

We try to optimize.

09:19

However, here we have other kinds of challenges, on the computing side, for instance.

09:25

Because usually we develop our projects on a computer or in the cloud, and we're kind of happy.

09:33

We train the models.

09:34

We have these things.

09:36

But in this case, it will be much more challenging.

09:42

This means that, for instance, the training we are going to keep doing as we do it right now, on a computer or in the cloud.

09:52

Of course, these numbers are only indicative.

09:54

They're going to change depending on the case.

09:56

Also, this is the information that came from this website.

10:00

Of course, it could be different.

10:02

I just wanted to share with you the constraints that we need to face.

10:06

And one of the key things is the memory.

10:10

We are going to need to fit machine learning models into this storage.

10:19

And have them running in this memory.

10:23

But no worries, we are going to go into that.

10:27

But it's a real challenge.

10:30

Of course, the development environment.

10:32

If you already work with microcontrollers, this is not a problem for you at all.

10:37

But if you come from a different world, it's like, okay, how do I do that, right?

10:41

This is new.

10:42

It's like, do I just need to connect it and it happens, or do I have to do something else?

10:47

Yeah.

10:49

Well, there are different platforms, different options, that you can develop with.

10:53

Of course, you can also use the command line.

10:56

Very simplified.

10:58

Or just use a platform that is open source and also language agnostic.

11:06

And you can do a lot with it.

11:08

And of course, you're going to need to know a bit of electronics, or ask someone for support and more

11:16

information.

11:17

Because you're going to need to do some flashing and move files around.

11:22

And understand how it works and the specifics of the board, and these kinds of things

11:28

that, again, when you are in the field, are super obvious.

11:31

But when you come from outside of that, it's like, I'm not sure what I'm doing.

11:37

And of course, always check the documentation.

11:40

So check.

11:41

My recommendation is to make sure that you're going to be able to follow it.

11:49

But it's true that once you do start, you realize that it's going to be straightforward.

11:56

It's just following some steps, and everything is going to work sooner or later.

12:04

So, on to the machine learning challenges.

12:08

And here we're going to find some of them.

12:14

Well, I'm here.

12:15

We have a typical data science machine learning pipeline with the different steps.

12:22

If you are in the data science world, you know that this linearity is not realistic.

12:30

We are going to need to collect the data, process the data, wait.

12:35

The model is not performing very well.

12:37

You're going to need to.

12:39

And then what about now?

12:41

Oh, wait, now I need more data.

12:43

The data is not very good.

12:45

And you're going to need to have this iteration continuously.

12:49

And the same is going to happen here.

12:51

This will be exactly the same.

12:53

You need to do exactly the same things that you do normally.

12:58

But with some differences that we are going to see later on.

13:02

The only thing that I wanted to add at that point is,

13:08

just be careful, because as I was mentioning, the whole pipeline

13:12

is going to need to be on the computer or cloud that you use until the deployment,

13:19

which, of course, will be on our microcontroller.

13:24

If some of you are thinking, but wait, how am I going to implement a neural network now,

13:30

because I came from a different field.

13:33

How can I do that from zero?

13:35

There are options that will simplify this process.

13:39

And you can use a no-code option.

13:43

In this case, on this platform, Edge Impulse, you can have the whole pipeline.

13:49

Just by clicking through options.

13:51

You can have the dashboard.

13:54

There, you can have the collection of data, which you can do,

13:59

I think, with sound.

14:03

Then you can define a model, train the model, get the metrics, and so on.

14:09

One of the things that is very cool is that you can also know

14:14

how big your model is going to be when you train it, and see all your metrics.

14:19

and decide if this is a good option for you, or if you need to go back

14:26

and do all this pipeline that we were talking about.

14:29

Maybe you need more data, maybe, maybe not.

14:32

One of the things about this platform that I found was not matching

14:37

my needs is that it was specific to some particular boards.

14:42

And I wanted to work with some specific boards or microcontrollers for my development.

14:46

Obviously, also the availability of the models, because I want to solve

14:51

my data science problem in a specific way.

14:55

And in this case, that offering is still in development, I think.

15:01

I think at some point they're going to have it, but it's still like that.

15:06

But if you want to try it, it's a very nice option for sure.

15:11

However, today we are going to try to do it from scratch,

15:17

because I wanted to share all the points that you're gonna need to have in mind

15:22

once you want to develop the project, to make decisions.

15:30

The first thing is going to be: what type of data science problem do I have?

15:35

Is it supervised, or is it unsupervised?

15:39

Will it be some classification, some regression?

15:43

Or will it be anomaly detection?

15:47

All these kinds of questions you are also going to need to answer.

15:52

And of course, depending on the type of data, you're going to have different challenges.

15:58

For instance, for anomaly detection, in this case they

16:03

used audio, and in other cases, vibration.

16:07

This means that the way that you're gonna need to process this data is completely different.

16:13

And working with audio, for instance, is very challenging.

16:17

And you're going to need to face some challenges that you would not face with vibration data, for instance.

16:25

Or if you are working with object recognition, with images, it will also be different.
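
To make the "audio is processed differently" point concrete, here is a minimal sketch of turning a short audio clip into features before training, on the Python side of the pipeline. The librosa library, the file name, and the choice of MFCC features are my own assumptions for illustration; the talk does not prescribe a specific feature-extraction approach.

    # Minimal sketch: turning a short audio clip into MFCC features before training.
    # Assumptions: librosa is installed and "machine_noise.wav" is a hypothetical file.
    import librosa
    import numpy as np

    audio, sample_rate = librosa.load("machine_noise.wav", sr=16000)  # resample to 16 kHz
    mfccs = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=13)  # 13 coefficients per frame
    features = np.mean(mfccs, axis=1)  # one fixed-length vector per clip (a simple choice)
    print(features.shape)  # (13,)

Vibration data, by contrast, is often used much closer to its raw form, which is part of why the two cases pose different challenges.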

16:32

Of course, maybe you're thinking, okay, but then let's collect data from zero.

16:37

Yes, this is an option.

16:40

Then you have more options inside this option.

16:43

You can have sensors.

16:46

In this case, I'm sharing with you the CO2 sensor from the project that I was mentioning before.

16:52

But also, you can have some microcontrollers with sensors integrated, like the one that I was showing before.

17:01

But maybe you're thinking, wait, I have a mobile phone.

17:04

I can collect data from there.

17:06

Yes, of course you can, but you're also going to need to handle

17:10

the storage, or how you are going to connect it, or other challenges.

17:14

But of course, there are a lot of options.
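
As a minimal sketch of the "collect your own sensor data" option: many boards simply print readings over USB serial, and a small Python script can log them to a CSV file for the rest of the pipeline. The port name, baud rate, and the comma-separated line format the firmware prints are assumptions here, not details given in the talk.

    # Minimal sketch: logging sensor readings that a board prints over USB serial.
    # Assumptions: pyserial is installed, "/dev/ttyUSB0" and 115200 match your board,
    # and the firmware prints one "co2,temperature,humidity" line per reading.
    import csv
    import serial

    with serial.Serial("/dev/ttyUSB0", 115200, timeout=1) as port, \
            open("co2_log.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["co2_ppm", "temperature_c", "humidity_pct"])
        for _ in range(1000):  # collect 1000 samples, then stop
            line = port.readline().decode("utf-8", errors="ignore").strip()
            if line:
                writer.writerow(line.split(","))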

17:20

What more?

17:21

We have public data sets.

17:23

We can also use them.

17:24

We can also use pre-trained models.

17:27

There are pre-trained models available.

17:29

Google, for instance, has models available that we can use, and we can apply transfer learning and do some kind of adjustment for our case.

17:40

Yeah, this is a possibility.

17:42

However, you're also going to need to understand which kind of data they used, and whether this is something that works for you.

17:50

But yeah, there are a lot of options that we can have.
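
Here is a minimal sketch of what the transfer-learning option can look like in Keras. The choice of MobileNetV2, the 96x96 input size, and the two-class head are illustrative assumptions on my part; the talk only says that pre-trained models plus some adjustment are one possibility.

    # Minimal sketch: transfer learning from a pre-trained image model.
    # Assumptions: the base model, input size, and two-class setup are illustrative.
    import tensorflow as tf

    base = tf.keras.applications.MobileNetV2(
        input_shape=(96, 96, 3), include_top=False, weights="imagenet")
    base.trainable = False  # keep the pre-trained features, train only the new head

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(2, activation="softmax"),  # e.g. person vs. cat
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_ds, validation_data=val_ds, epochs=5)  # your own dataset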

17:54

Let's go to designing the model and training the model.

17:58

Because, of course, you need to make decisions again.

18:02

Don't be overwhelmed by this slide, because it's full of information,

18:07

but I just want to show you a couple of things that are super, super important to take into consideration before you start.

18:18

Let's imagine, for instance, that you have a case, a business case, where you say: yeah, this is the solution,

18:26

it's a neural network, because I need to implement a neural network.

18:32

Okay, then you can go to the supported models, and then you need to decide which library you are going to use for training on your computer or in the cloud.

18:41

And then you just say, okay, let's go for TensorFlow.

18:45

But then you're also going to need to check if the microcontroller that we were talking about at the beginning is compatible with these libraries.

18:56

And the framework that we are going to need to use, in this case for TensorFlow,

19:02

is going to be TensorFlow Lite Micro, or, for instance, it is also compatible with uTensor, depending on the platform that you're using.

19:12

So this is quite a bit more complex than what we were discussing at the beginning.

19:16

Yeah, choose a microcontroller that works for you.

19:20

Yes, one that works for you, but you need to have the final data science project very clear, and how you are going to solve the problem.

19:28

And as we were mentioning before, since we also have constraints of memory and storage, keep it simple.

19:35

The simpler you can solve your problem, the better.

19:39

But sometimes, you know, you cannot simplify any further, and there are problems that need

19:44

more complexity, of course. But this is always my recommendation.

19:49

But imagine, for instance, that you are using scikit-learn, because it's a library that is awesome for data science.

19:56

And you want to continue doing your project with scikit-learn.

20:00

Can I do that, the inference on a microcontroller? Yes, you can.

20:05

You can use scikit-learn for inference.
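
The talk does not name a specific tool for this step. One community option, as an assumption on my part, is the m2cgen package, which translates a fitted scikit-learn estimator into dependency-free C code that can be compiled for a microcontroller. A minimal sketch:

    # Minimal sketch: exporting a fitted scikit-learn model as plain C code.
    # Assumption: m2cgen is one possible tool; the talk does not name one.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier
    import m2cgen

    X, y = load_iris(return_X_y=True)
    clf = DecisionTreeClassifier(max_depth=3).fit(X, y)

    c_source = m2cgen.export_to_c(clf)  # a standalone scoring function in C
    with open("model.c", "w") as f:
        f.write(c_source)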

20:09

Or, as we were mentioning, for TensorFlow there are a lot of options.

20:13

As mentioned, in the case of TensorFlow, we have TensorFlow Lite.

20:18

And then we have TensorFlow Lite Micro; TensorFlow Lite is more for mobile.

20:25

And for microcontrollers, it's the Micro one.

20:28

Just to give you this small detail.

20:31

And then imagine you train your neural network because, imagine, you were analyzing this case of detecting

20:42

whether there is a person or a cat in the room.

20:46

Imagine that you are in a building, you are the security person, and from time to time the alarm sounds,

20:53

and you need to know whether it's just a cat or a person who is in the building.

20:58

And you implement the system just to have that.

21:01

And then you implement your neural network.

21:03

Everything is going very well on your computer.
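
As a minimal sketch of the kind of network this person-vs-cat example could use, deliberately kept small with the memory constraints in mind. The 96x96 grayscale input and the layer sizes are illustrative assumptions, not the speaker's actual model.

    # Minimal sketch: a deliberately small CNN for the person-vs-cat example.
    # Assumptions: 96x96 grayscale input and these layer sizes are illustrative choices.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(96, 96, 1)),
        tf.keras.layers.Conv2D(8, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(2, activation="softmax"),  # person vs. cat
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()  # check the parameter count against the board's memory budget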

21:07

And you know, sometimes things only work on our computer, and this is a problem.

21:13

And then we have another point.

21:18

And we have the optimization, because we need to fit this whole neural network, with all these layers, all these parameters, everything that was super good, our accuracy, all our metrics that were very good,

21:32

into this microcontroller, which is super small and has the constraints that we were mentioning before.

21:40

One of the most popular options is to use quantization; I also added weight pruning.

21:48

Basically, with quantization, what we are doing is reducing the precision of the weights from 32 bits to 8 bits.
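
A minimal sketch of this step with the TensorFlow Lite converter, assuming the trained Keras model from earlier is available as `model`. With Optimize.DEFAULT the weights are stored in 8 bits instead of 32; fully quantizing the activations as well would additionally need a representative dataset, which I leave out here.

    # Minimal sketch: post-training quantization with the TensorFlow Lite converter.
    # Assumption: `model` is the trained Keras model from the previous step.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]  # 8-bit weights
    tflite_model = converter.convert()

    with open("model.tflite", "wb") as f:
        f.write(tflite_model)
    print(f"Quantized model size: {len(tflite_model)} bytes")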

22:01

And maybe someone has concerns right now, or maybe not, but you should have some concerns.

22:09

Because if we have this size of information, imagine that this is information spanning from one side of my hand to the other,

22:18

and we reduce it to this: with this quantization, we are losing some information.

22:23

And maybe it's not the most important, but maybe some of this information that we are taking out

22:30

is important for our business, for our model.

22:34

So what is needed is to check before the deployment.

22:38

We need to check whether the performance of our model, when we were training it on our computer, with everyone very happy with the success of this model, is the same after the quantization.

22:56

And we are also going to need to think about whether we are willing to sacrifice a bit of accuracy in order to have a smaller model.

23:08

Because maybe if we are going to do this for healthcare, we need to be very careful.

23:15

Because maybe we have, I don't know, some accuracy that you accept as good enough.

23:21

Also, once you have done this quantization process, you need to have the same feeling that this is good.
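
One way to do that check, as a minimal sketch: run the quantized .tflite model over a held-out test set with the TensorFlow Lite interpreter and compare its accuracy against the original float model. The `x_test` / `y_test` names, and the assumption that the model kept float inputs and outputs, are mine, not from the talk.

    # Minimal sketch: checking the quantized model against the held-out test set.
    # Assumptions: x_test / y_test exist and the model keeps float inputs and outputs.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="model.tflite")
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    correct = 0
    for x, y in zip(x_test, y_test):
        interpreter.set_tensor(inp["index"], x[np.newaxis, ...].astype(np.float32))
        interpreter.invoke()
        pred = np.argmax(interpreter.get_tensor(out["index"]))
        correct += int(pred == y)
    print("Quantized accuracy:", correct / len(y_test))  # compare with the float model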

23:31

Well, the next challenge is the deployment of the model itself.

23:36

And of course, the inference.

23:39

But the deployment, the physical deployment that we were mentioning before, should not be a problem.

23:44

Once you know how to move a file from one place to our microcontroller.

23:50

Then where is the challenge?

23:53

In the case of TensorFlow Lite, let's think about this case; what they recommend, and this is from the official documentation,

24:01

is to convert our model to a C array.

24:09

And then run the inference in C++.
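
The official documentation suggests producing that C array with a tool like `xxd -i model.tflite`. As a minimal sketch, and only as one possible way to do the same thing from Python (the file and array names are illustrative), you can write the header yourself:

    # Minimal sketch: writing the .tflite file as a C array, similar to `xxd -i`.
    # The generated header can then be compiled into the C++ inference code on the board.
    with open("model.tflite", "rb") as f:
        data = f.read()

    with open("model_data.h", "w") as f:
        f.write("const unsigned char g_model[] = {\n")
        for i in range(0, len(data), 12):
            chunk = ", ".join(f"0x{b:02x}" for b in data[i:i + 12])
            f.write(f"  {chunk},\n")
        f.write("};\n")
        f.write(f"const unsigned int g_model_len = {len(data)};\n")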

24:14

And then it's like, okay, well, what now?

24:19

Because maybe for all of you, writing C++ code is something that you do every day.

24:25

But for other people, maybe it's not.

24:28

And now I have a new challenge, right?

24:31

Okay, now I need to learn C++ to write the code.

24:35

What is very nice is that you get the chance to learn a new programming language.

24:39

This was a very nice motivation for me.

24:42

However, for some other people, it could be something that is a challenge.

24:46

And I was wondering: is it possible to run the full pipeline with Python?

24:53

Well, there is the option to extend MicroPython with C.

24:57

This is from the official documentation.

25:00

And then what I did was try to find some projects from people who did that before.

25:05

And I found this person who provides some examples of running TensorFlow with MicroPython.

25:13

There are only three examples, but it's something that apparently works, also for some specific boards.

25:22

So these are the workflow options that we have currently.

25:25

We have our training and optimization step with Python.

25:29

But then the inference needs to be in low-level programming languages.

25:34

Or go for the inference with MicroPython.

25:40

Though we are still early on that.

25:46

I say that because it's the only case where I found some information.

25:52

Then the question would be: why aren't there more Python options for inference?

25:58

And this could be a debate.

26:01

Does anyone have any idea or comment, someone who would like to say

26:10

why we don't have more pipelines where the inference is in Python?

26:20

Okay. Any other idea?

26:27

Yeah. I completely agree with that point of view.

26:31

I think that it's a matter of optimization, and it could be other things, like libraries.

26:37

And I think there could be more things.

26:41

Sorry. Sorry, sorry.

26:44

The thing is that this is the perfect opportunity for two things as I see it.

26:52

First, we see that you can implement a workflow and a project that can work very well between Python and other programming languages.

27:04

And communities can interact and we can learn from each other.

27:09

And on the other side, we can kind of open the door to see if we can find collaboration among the Python community members,

27:20

and see if we can contribute to seeing more Python in this step.

27:26

And of course, always trying to be open-minded and accept the limitations that we could have.

27:36

With that reflection, I would like to finish my presentation.

27:42

Of course, here is my information in case you want to discuss this reflection,

27:48

or you need something from this whole pipeline.

27:51

I will be very happy to share knowledge or discuss with you.

27:55

Thanks a lot.

27:56

Thank you very much.

27:57

Thank you.

27:58

Thank you.


Description

The video discusses TinyML, an approach that brings machine learning to small, resource-constrained embedded devices. The speaker explains why TinyML matters (data privacy, low energy consumption, low latency, and independence from connectivity) and where it is applied, from healthcare, agriculture, and predictive maintenance to wildlife conservation and voice recognition. She shares her own projects, such as growing lettuces on her balcony and building a monitoring and predictive system for home air quality, and then walks through the full pipeline: choosing a development board, collecting data, designing and training a model, quantizing it to fit the device, and deploying it. She closes with a reflection on the current lack of Python options for on-device inference.