tagging.tech

Audio, Image and video keywording. By people and machines.



Tagging.tech interview with Nicolas Loeillot

Tagging.tech presents an audio interview with Nicolas Loeillot about image recognition

 

Listen and subscribe to Tagging.tech on Apple Podcasts, AudioBoom, CastBox, Google Play, RadioPublic or TuneIn.

Keywording Now: Practical Advice on using Image Recognition and Keywording Services

Now available

keywordingnow.com

 

Transcript:

 

Henrik de Gyor:  This is Tagging.Tech. I’m Henrik de Gyor. Today, I’m speaking with Nicolas Loeillot. Nicolas, how are you?

Nicolas Loeillot:  Hi, Henrik. Very well, and you?

Henrik:  Great. Nicolas, who are you, and what do you do?

Nicolas:  I’m the founder of a company which is called LM3Labs. This is a company that is entering into its 14th year of existence. It was created in 2003, and we are based in Tokyo, in Singapore, and in Sophia Antipolis in South France.

We develop computer vision algorithm software, and sometimes hardware. Instead of focusing on some traditional markets for this kind of technology, like military or security and these kind of things, we decided to focus on some more fun markets, like education, museums, entertainment, marketing.

What we do is to develop unique technologies based on computer vision systems. Initially, we are born from the CNRS, which is the largest laboratory in France. We had some first patents for triangulations of finger in the 3D space, so we could very accurately find fingers a few meters away from the camera, and to use these fingers for interacting with large screens.

We thought that it would be a good match with large projections or large screens, so we decided to go to Japan and to meet video projector makers like Epson, Mitsubishi, and others. We presented the patent, just the paper, [laughs] explaining the opportunity for them, but nobody understood what would be the future of gesture interaction.

Everybody was saying, “OK, what is it for? There is no market for this kind of technology, and the customers are not asking for this.” That’s a very Japanese way to approach the market.

The very last week of our stay in Japan, we met with NTT DoCoMo, and they said, “Oh, yeah. That’s very interesting. It looks like Minority Report, and we could use this technology in our new showroom. If you can make a product from your beautiful patent, then we can be your first customer, and you can stay in Japan and everything.”

We went back to France. We made the electronics for supporting their technology. Of course, some pilots were already written, so we went back to NTT DoCoMo, and we installed them in February 2004.

From that, NTT DoCoMo introduced us to many big companies, like NEC, DMP, and some others in Japan, and they all came with different type of request. “OK. You track the fingers, but can you track the body motion? Can you track the gestures? Can you track the eyes, the face, the motions and everything?”

We made a strong evolution of the portfolio with something like 12 products today, which are all computer vision‑related, which are usually pretty unique in their domain, even if we have seen some big competitors like Microsoft [laughs] on our market.

In 2011, we were the first to see the first deployment of 4G networks in Japan, and we said, “OK. What do we do with the 4G? That’s very interesting, very large broadband, excellent response times and everything. What can we do?”

It was very interesting. We could do what we couldn’t do before, which is to put the algorithm on the cloud and to use it on the smartphone, because smartphones were becoming very smart. It was just the beginning of the smartphones at the time, with the iPhone 4S, which was the first one which was really capable of something.

We started to develop Xloudia, which is today one of our lead products. Xloudia is mass recognition of images, products, colors, faces and everything from the cloud, and in 200 milliseconds. It goes super fast, and we search in very large databases. We can have millions of items in the base, and we can find the object or the specific item in 200 milliseconds.

Typically, applying the technology to augmented reality, which was done far before us, we said, “OK. The image recognition can be applied to something which is maybe less fun than the augmented reality, but much more useful, which is the recognition of everything.”

You just point your smartphone to any type of object, or people, or colors, or clothes, or anything, and we recognize it. This can be done with the algorithm, with the image recognition and the video recognition. That’s a key point, but not only with these kind of algorithms.

We need to develop some deep learning recognition algorithm for finding some proximities, some similarities, and to offer the users more capabilities than saying, “Yes, this is it,” or, “No, this is not it.” [laughs]

We focus on this angle, which is: OK, computer vision is under control. We know our job, but we need to push the R&D into something which is more about the distribution of the search on the network, in order to go very fast. That’s the key point. The key point was going super fast, because for the user experience, it’s absolutely mandatory.
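The idea of distributing a search so that it returns quickly can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LM3Labs' actual architecture: the catalog is split into hypothetical "shards", each shard is searched in parallel with threads, and the partial results are merged. The function names (`search_shard`, `distributed_search`), the two-dimensional toy vectors, and the item names are all invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_shard(shard, query, k):
    """Return the k closest (distance, item_id) pairs within one shard."""
    scored = sorted((distance(vec, query), item_id) for item_id, vec in shard.items())
    return scored[:k]


def distributed_search(shards, query, k=3):
    """Query every shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda shard: search_shard(shard, query, k), shards)
        merged = sorted(hit for partial in partials for hit in partial)
    return merged[:k]


# Toy catalog split across three hypothetical shards (assumed data).
shards = [
    {"pepsi_bottle": [0.9, 0.1], "cola_can": [0.8, 0.3]},
    {"red_dress": [0.1, 0.9], "blue_shirt": [0.2, 0.7]},
    {"magazine_page": [0.5, 0.5]},
]
results = distributed_search(shards, query=[0.88, 0.12], k=2)
best_id = results[0][1]
```

In a real deployment each shard would presumably live on a separate machine, and the merge step would run on whichever node answers the phone; threads stand in for that here.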

On the other hand is, “If we don’t find exactly what is searched by the user, how can we find something which is similar or close to what they are looking for?” There is an understanding of the search, which is just far beyond the database that we have in catalog, and just to make some links between the search and the environment of the users.

The other thing that we focus on was actually the user experience. For us, it was absolutely critical that the people don’t press any button for finding something. They just have to use their smartphone, to point it to the object or to the page, or to the clothes, or anything that they want to search, and the search is instantaneous, so there is no other action.

There is no picture to take. There is no capture. There is no sending anything. It’s just capturing in real time from the video flow of the smartphone, directly understanding what is passing in front of the smartphone. That was our focus.

On this end, it implies a lot of processes, I would say, for the synchronization between the smartphone and the cloud. Because you can’t send all the information permanently to the cloud, so there is some protocol to follow in terms of communication. That was our job.

Of course, we don’t send pictures to the cloud because it’s too heavy, too data‑consuming. What we do is making a big chunk of the extractions or of the work on the smartphone, and sending only the necessary data for the search to the cloud.

The data, they can be feature points for the image. They can be a color reference extracted from the image. They could be vectors, or they could be a series of images from a video, for instance, just to make something which is coherent from frame to frame.
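The split described above, where the phone does the extraction and sends only compact data to the cloud, can be illustrated with a small Python sketch. It is not Xloudia's actual protocol. The descriptor used here (a coarse color histogram), the function names `color_histogram` and `build_search_payload`, and the JSON payload layout are assumptions chosen for the example.

```python
import json


def color_histogram(pixels, bins_per_channel=4):
    """Reduce a list of (r, g, b) pixels to a small normalized color histogram."""
    step = 256 // bins_per_channel
    counts = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        counts[index] += 1
    total = len(pixels) or 1
    return [round(count / total, 4) for count in counts]


def build_search_payload(pixels, frame_id):
    """Hypothetical payload: only the descriptor travels to the cloud, not the picture."""
    return json.dumps({"frame": frame_id, "histogram": color_histogram(pixels)})


# A synthetic 64x64 frame: left half red, right half blue.
frame = [(255, 0, 0) if x < 32 else (0, 0, 255) for y in range(64) for x in range(64)]
histogram = color_histogram(frame)
payload = build_search_payload(frame, frame_id=1)

raw_size = len(frame) * 3                      # bytes for uncompressed RGB
payload_size = len(payload.encode("utf-8"))    # bytes actually sent in this sketch
```

Even with this naive descriptor, the payload is a small fraction of the raw frame, which is the point being made about not sending pictures.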

That’s Xloudia, super fast image recognition with the smartphone, but cloud‑based, I would say, and the purpose is really to focus on the user experience, to go super fast, and to always find something back [laughs] as a reference.

The target market may be narrower than what we had before with augmented reality, and what we target is to help the e‑commerce, or more specifically, the mobile commerce players to be able to implement the visual search directly into their application.

The problem today that we have even in 2016, the problem is that when you want to buy something on your smartphone, it’s very unpleasant. Even if you go to bigger e‑commerce companies like Amazon and the others, what you have on your smartphone is just a replication of what you can see on the Web, but it’s not optimized to your device. Nobody’s using the camera, or very few are using the camera for search.

The smartphone is not a limited version of the Web, typically. It’s coming with much more power. There are cameras. There are sensors, and many things that you’d find on a smartphone which are not on a traditional PC.

The way we do mobile commerce must be completely different from the traditional e‑commerce. It’s not a downgraded version of the e‑commerce. It must be something different.

Today, we see that 50 percent of the Internet traffic to big brand websites is coming from the smartphone. 50 percent, and 30 percent of the e‑commerce is done from mobile.

It means that there is a huge gap between these 50 percent and these 30 percent. There are 20 percent of the visitors who don’t buy on the smartphone because of this lack of confidence or ergonomics or something.

There is something wrong on the road to [laughs] the final basket. They don’t buy with the smartphone, and this smartphone traffic is definitely increasing with time, as well. It’s 50 percent today for some big brands, but it’s increasing globally for everybody.

There are some countries, very critical countries like Indonesia or India, who have a huge population, more than 300 million in Indonesia, one billion people in India. These guys, they go straight from nothing to the latest Samsung S6 or 7.

They don’t go through the PC stage, so they directly buy things from the smartphone, and there’s a huge generation of people who will just buy everything on their smartphone without knowing the PC experience, because there are no ADSL lines, because there are so many problems with the PC. It’s too expensive, no space, or whatever.

We target definitely these kind of markets, and we want to serve the e‑commerce or the mobile commerce pioneers, people who really consider that there is something to be done in the mobile industry for improving the user experience.

Henrik:  What are the biggest challenges and successes you’ve seen with image and video recognition?

Nicolas:  If you want to find something which is precise, everything is fine today. In 2016, there are so many technologies and algorithms where you can compare, “OK. Yes, this is a Pepsi bottle, and this is not a Coca‑Cola bottle,” so that’s pretty under control today. There is no big issue with this.

The challenge ‑‑ I would prefer to say war ‑‑ is really understanding the context, so bringing more context than just recognizing a product is, “What is the history? What is the story of the user, the location of the user? If we can’t find, or if we don’t want to find a Pepsi bottle, can we suggest something else, and if yes, what do we suggest?”

It’s more than just tagging things which are similar. It’s just bringing together a lot of sources of information and providing the best answer. It’s far beyond pure computer vision, I would say.

The challenge for the computer vision industry today, I would say, is to merge with other technologies, and the other technologies are machine learning, deep learning, sensor aggregations, and just to be able to merge all these technologies together to offer something which is smarter than previous technologies.

On the pure computer vision technologies, of course, the challenge is to create a database or knowledge where we can actually identify that some objects are close to what we know, but they are not completely what we know, and little by little, to learn or to build some knowledge based on what is seen or recognized by the computer vision.

One of the still‑existing challenges… I have been in this industry for a few decades, but [laughs] there is still a challenge which is remaining, which is actually what I would call the background abstraction or the noise abstraction: “How can you extract what is very important in the image from what is less important?”

That’s still something which is a challenge for everyone, I guess, is just, “What is the focus? What do you really want? Within a picture, what is important, and what is not important?” That is a key thing, and algorithms are evolving in this domain, but it’s still challenging for many actors, many players in this domain.

Henrik:  As of March of 2016, how do you see image and video recognition changing?

Nicolas:  The directions are speed. Speed is very important for the user experience. It must be fast. It must be seamless for the users.

This is the only way for service adoption. If the service is not smooth, is not swift ‑‑ there is many adjectives for this in English [laughs] ‑‑ but if the experience is not pleasant, it will not be adopted, and then it can die by itself.

The smoothness of the service is absolutely necessary, and the smoothness for the computer vision is coming from the speed of the answer, or the speed of the recognition. It’s even more important to be fast and swift than to be accurate, I think. That’s the key thing.

The other challenge, the other direction for our company is definitely deep learning. Deep learning is something which is taking time, because we must run algorithms on samples on big databases for building an experience, and building something which is growing by itself.

We can’t say that the deep learning for LM3Labs, or for another company, is ready and finished. It’s absolutely not. It’s something which is permanently ongoing.

Every minute, every hour, every day, it’s getting there, because the training is running on, and we learn more to recognize. We improve the recognitions, and we use the deep learning for two purposes at LM3Labs.

One of them is for the speed of recognitions, so it’s the distribution of the search on the cloud. We use deep learning technologies for smartly distributing the search and going fast.

The other one is more computer vision‑focused, which is to, if we don’t find exactly something that the user is trying to recognize, we find something which is close and we can make recommendations.
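The "find something close and recommend it" behavior can be sketched as a nearest-neighbor lookup with a match threshold. This is a minimal illustration, not LM3Labs' implementation: it uses plain cosine similarity over hand-written vectors rather than features learned by a deep network, and the function name `recognize`, the 0.99 threshold, and the catalog entries are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(query, catalog, match_threshold=0.99, max_suggestions=2):
    """Return an exact match if one is close enough, otherwise the nearest items."""
    ranked = sorted(
        ((cosine_similarity(query, vec), name) for name, vec in catalog.items()),
        reverse=True,
    )
    best_score, best_name = ranked[0]
    if best_score >= match_threshold:
        return {"match": best_name, "suggestions": []}
    # No confident match: fall back to recommendations instead of a blank answer.
    return {"match": None, "suggestions": [name for _, name in ranked[:max_suggestions]]}


# Hypothetical feature vectors for three catalog items.
catalog = {
    "pepsi_bottle": [1.0, 0.0, 0.2],
    "coca_cola_bottle": [0.9, 0.1, 0.3],
    "running_shoe": [0.0, 1.0, 0.1],
}
exact = recognize([1.0, 0.0, 0.2], catalog)
fuzzy = recognize([0.7, 0.5, 0.2], catalog)
```

Here `exact` resolves to a catalog item, while `fuzzy` has no confident match and returns the two closest items as suggestions.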

These recommendations are used for the final users so they can have something at the end, and it’s not just a blank answer. There is something to propose, or it can be used between the customers.

We can assess some trends in the search, and we can provide our customers, or B2B customers, we can provide them with recommendations saying, “OK. This month, we understand that, coming from all our customers, the brand Pepsi‑Cola is going up, for instance, instead of Coca‑Cola.” This is just an example. [laughs] That’s typically the type of application that we use with the deep learning.

Henrik:  What advice would you like to share with people looking at image and video recognition?

Nicolas:  Trust the vision. The vision is very important. There are a lot of players in the computer vision community today.

Some have been acquired recently, like Metaio by Apple, or Vuforia by PTC are two recent examples, and some people are focused on the augmented reality, so really making the visual aspect of things. Some others are more into cloud for the visual search, and just improving the search for law enforcements and these kind of things.

The scope, the spectrum of the market is pretty wide, and there is probably someone who has exactly the same vision as you [laughs] on the market.

On our side, LM3Labs, we are less interested in augmented reality clients, I would say. We are less interested in machine‑to‑machine search because this is not exactly our focus, either.

We are very excited by the future of mobile commerce, and this is where we focus, and our vision is really on this specific market segment. I would say the recommendation is find a partner who is going with you in terms of vision. If your vision is that augmented reality will invade the world, go for a pure player in this domain.

If you have a smart vision for the future of mobile commerce, join us. [laughs] We are here.

Henrik:  Thanks, Nicolas. For more on this, visit Tagging.tech.

Thanks again.


 

For a book about this, visit keywordingnow.com



Tagging.tech interview with Georgi Kadrev

Tagging.tech presents an audio interview with Georgi Kadrev about image recognition

 

 

Transcript:

 

Henrik de Gyor:  This is Tagging.tech. I’m Henrik de Gyor. Today, I’m speaking with Georgi Kadrev. Georgi, how are you?

Georgi Kadrev:  Hi, Henrik. All good. I am quite enthusiastic to participate in the podcast.

Henrik:  Georgi, who are you and what do you do?

Georgi:  I’m Co‑Founder and CEO of Imagga, which is one of the pretty good platforms for image recognition as a service. We have auto‑tagging and auto‑categorization services that you can use for practical use cases.

Henrik:  Georgi, what are the biggest challenges and successes you’ve seen with image recognition?

Georgi:  In terms of challenges, I think, one of the biggest ones is that we, as human beings, as people, we are used to perceive a lot of our world through our eyes. Basically, when people think in general about image recognition, they have a very diverse and a very complete picture of what it should do.

Let’s say from optical character recognition or recognizing texts, to facial recognition of a particular person, to conceptual tagging, to categorization, all these different kinds of aspects of visual perception.

People typically have expectations that it’s the same technology or the same solution, but actually, quite a lot of different approaches need to be engaged in the actual process of recognizing and understanding the semantics of the image.

In terms of successes, like addressing this, I can say that not surprisingly the deep learning thing that has been quite a big hype in the last few years has been a huge success in the more conceptual or class‑level object recognition. This is what it is as a type of object.

Is it a bottle? Is it a dog? Is it a cat? Is it a computer? Is it mobile phone? and so on. This has become pretty practical, and right now we can say that we are close to human level in recognition of a lot of different classes of objects.

At the same time, in some other spaces, like logo recognition, like facial recognition, we also see quite high precision rates that allow for practical applications.

I can say one of the good things is that we are getting closer and closer to automating at least part of the tasks that need to be performed by a computer, replacing the need for manual annotation of photos for different use cases.

In terms of challenges, maybe I would also add that you still need a lot of data, properly annotated data. Machine learning, and deep learning in general, is very data‑greedy, so we need an enormous amount of samples to really make something robust enough and practical enough.

We still see that gathering a high‑quality dataset is one of the challenges. This is something that we also try to internally address. It helps us be more competitive in terms of quality and the technology.

Henrik:  As of March 2016, how do you see image recognition changing?

Georgi:  What we definitely see is that there are more and more services. Some of them are pretty good quality, and they try to automate different aspects of image recognition that I briefly tackled.

We see even big players like Google starting to offer services for some of those things like what they call label recognition or what we call tagging, what they call optical character recognition or most of the vendors call it that way.

We also have seen logo and facial recognition being quite popular and being utilized more and more in different kinds of brand monitoring services.

At the same time, from the perspective of a bit of downside of visual recognition, something that we see when we talk about highly artistic images or some more specific art or other types of specific content, still the technology needs to be custom‑trained for that.

If possible at all to train a classification‑based image recognition to recognize different kinds of artistic images or different kinds of very specialized image content.

It’s related with what I had mentioned in the beginning, that if you have a specific task, sometimes you need a specific approach. Deep learning to a certain extent has addressed this, but still it’s not like one-size-fits-all solution. We see that in a lot of cases the customers need to define a specific problem so that they can have a very good and precise specific solution.

Henrik:  As of March 2016, how much of image recognition is completed by humans versus machines?

Georgi:  I would say, [laughs] honestly depends on the task. We’ve seen some cases that machines can be better than humans and not just in theory, in practice.

For example, if we train a custom classifier with a human‑curated data set, and then we do some kind of testing or validation, we see that the errors, the things that are reported as errors in the learning process, can actually be errors made by the people.

The person was mistaken when annotating the photo, so the prediction is then falsely reported as an error, although it’s correct. In a way, this is promising because it shows the automation and consistency that machines can do is pretty good in terms of precision.
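One common way to surface this kind of annotation mistake is to review the validation samples where the classifier is confident yet disagrees with the human label. The sketch below illustrates that general idea only; it is not Imagga's procedure, and the function name `suspected_label_errors`, the 0.9 confidence threshold, and the sample records are assumptions.

```python
def suspected_label_errors(samples, confidence_threshold=0.9):
    """Flag samples where a confident prediction disagrees with the human label."""
    flagged = []
    for sample in samples:
        disagrees = sample["predicted"] != sample["human_label"]
        if disagrees and sample["confidence"] >= confidence_threshold:
            flagged.append(sample["image_id"])
    return flagged


# Hypothetical validation records: human label, model prediction, model confidence.
validation = [
    {"image_id": "img_001", "human_label": "dog", "predicted": "dog", "confidence": 0.98},
    {"image_id": "img_002", "human_label": "cat", "predicted": "dog", "confidence": 0.97},
    {"image_id": "img_003", "human_label": "cat", "predicted": "dog", "confidence": 0.55},
]
to_review = suspected_label_errors(validation)
```

Only `img_002` is flagged: the model is confident and disagrees with the annotator, so a person should look at it again. The low-confidence disagreement on `img_003` is more likely an ordinary model error.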

At the same time, there are tasks where if you have a lot of explicit or implicit knowledge that you need to get in order to resolve an automation task. A lot of background knowledge that people have is not available for the machine and then you need to figure out a way how to either automate this or use a combination between a computer and a human, or you can decide this as a fully humanic task.

Still, it’s not approachable by technical approach. I cannot give an exact number. Something interesting that I can share is a statistic, we did a pretty interesting experiment called Clash of Tags, where we ask people. We have a data set of stock photography. This stock photography has existing set of tags provided by various people like the stock photographers themselves.

Then we also have the same set of images of stock photos that are annotated using current technology, completely blindly from the original tags that people have put for the image. Then, we do this thing, we ask people, “Type a keyword and then you get search results.”

One of the set of results on the left‑hand side or the right‑hand side is not known in advance, but one of the set of results is based on the tags that people have put, and the other set of results is based on the tags that our API has generated and has been assigned to the images.

The user needs to pick which is the winning set. In a lot of cases, I can say in 45 percent roughly of the cases, people have chosen that result set based on automatically generated tag is better than the set of results based on human‑provided tags. It’s not more than 50, but still means in a lot of cases the machine has been superior to the human performance.
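The blind comparison described above can be sketched as follows. This is an illustrative reconstruction from the description, not the actual Clash of Tags code: the function names `run_blind_trial` and `machine_win_rate` and the `choose` callback (standing in for the person picking a side) are hypothetical.

```python
import random


def run_blind_trial(human_results, machine_results, choose, rng):
    """Show two result sets in random left/right order; return which source won."""
    sides = [("human", human_results), ("machine", machine_results)]
    rng.shuffle(sides)  # the participant cannot tell which side is which
    picked = choose(sides[0][1], sides[1][1])  # 0 for left, 1 for right
    return sides[picked][0]


def machine_win_rate(outcomes):
    """Fraction of trials in which the machine-tagged result set was preferred."""
    return outcomes.count("machine") / len(outcomes) if outcomes else 0.0


def prefers_longer(left, right):
    """Stand-in judge for the demo: picks whichever side returned more results."""
    return 0 if len(left) >= len(right) else 1


rng = random.Random(0)
demo_outcomes = [
    run_blind_trial(["a.jpg"], ["a.jpg", "b.jpg"], prefers_longer, rng)
    for _ in range(20)
]

# The figure quoted in the interview corresponds to 45 machine wins per 100 trials.
quoted_rate = machine_win_rate(["machine"] * 45 + ["human"] * 55)
```

Randomizing the side on every trial is what keeps the comparison blind: a participant who always favors the left column contributes equally to both sources.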

I believe this number will grow in the future. I can say it’s still a way to go to something like complete automation, but we are getting closer and closer and we’re enthusiastic about it.

Henrik:  Georgi, what advice would you like to share with people looking into image recognition?

Georgi:  I would say, have a very clear idea of what kind of value you want to derive out of that and try to optimize for that, either working on it yourself or with a vendor. Make it really clear what are your objectives, what are your objections about image recognition. Just think from the practical aspect.

This is something that I personally, and our whole team, have always been stressing. Let’s see what it does and what it can do and what it can’t, and address whether there’s really a pain that can be solved right now or not. Also from the vendor side, I would suggest don’t over‑promise, because it’s quite easy to get people a bit confused.

They have an expectation like, “It’s AI, so it can do anything,” but you need to be realistic, so you save your time and you save your potential customer’s time. If the use case is very clear and if you as a professional can commit that this is going to work out, then go for it. Other than that, don’t waste time, yours and your potential customer’s.

This is something that we saw a lot, because a lot of people ask about features that currently technically are not practical enough or they ask about features that we don’t have. We learned the hard way, to a certain extent, to say, “This is possible, this is not possible currently from our perspective.”

Henrik:  Where can we find more information about image recognition?

Georgi:  Depending on what you need. Do you need more data for training, or do you need more basic knowledge, or do you need different kind of inspirations about business applications? There are different sources.

Obviously, ImageNet and all the accompanying information and the algorithms that we have around this pretty nice dataset is quite useful for researchers. We also have, for beginners in image recognition, a whole set of Coursera courses.

One of the most notable ones is from Stanford University, with a few more pretty good ones from most of the top European or American universities. We have different kinds of newsletters and digests. AI Weekly is pretty good inspiration‑wise. There is some mixture of research topics, business cases, cool hacks and ideas about what you can do with image recognition.

Henrik:  Well, thanks, Georgi.

Georgi:  Thanks a lot, Henrik. I hope your audience will enjoy the podcast, including our participation in it.

Henrik:  For more on this, visit Tagging.tech.

Thanks again.


 

Tagging.tech interview with Brad Folkens

Tagging.tech presents an audio interview with Brad Folkens about image recognition

 

 

Transcript:

 

Henrik de Gyor:  This is Tagging.tech. I’m Henrik de Gyor. Today I’m speaking with Brad Folkens. Brad, how are you?

Brad Folkens:  Good. How are you doing today?

Henrik:  Great. Brad, who are you and what do you do?

Brad:  My name’s Brad Folkens. I’m the CTO and co‑founder of CamFind Inc. We make an app that allows you to take a picture of anything and find out what it is, and an image recognition platform that powers everything and you can use as an API.

Henrik:  Brad, what are the biggest challenges and successes you’ve seen with image recognition?

Brad:  I think the biggest challenge with image recognition that we have today is truly understanding images. It’s something that computers have really been struggling with for decades in fact.

We saw that with voice before this. Voice was always kind of the promised frontier of the next computer‑human interface. It took many decades until we could actually reach a level of voice understanding. We saw that for the first time with Siri, with Cortana.

Now we’re kind of seeing the same sort of transition with image recognition as well. Image recognition is this technology that we’ve had promised to us for a long time. But it hasn’t quite crossed that threshold into true usefulness. Now we’re starting to see the emergence of true image understanding. I think that’s really where it changes from image recognition being a big challenge to starting to become a success when computers can finally understand the images that we’re sending them.

Henrik:  Brad, as of March 2016, how much of image recognition is done by humans versus machines?

Brad:  That’s a good question. Even in-house, quite a bit of it actually is done by machine now. When we first started out, we had a lot of human-assisted I would say image recognition. More and more of it now is done by computers. Essentially 100 percent of our image recognition is done by computers now, but we do have some human assistance as well. It really kind of depends on the case.

Internally, what we’re going for is what we call six-star answer. If you imagine a five-star answer is something where you take a picture of say a cat or a dog. We know generally what kind of breed it is. A six-star answer is the kind of answer where you take a picture of the same cat, and we know exactly what kind of breed it is. If you take a picture of a spider, we know exactly what kind of species that spider is every time. That’s what we’re going for.

Unsupervised computer learning is something that is definitely exciting, but I think we’re about 20 to 30 years beyond when we’re going to actually see unsupervised computer vision, unsupervised deep learning neural networks as something that actually finally achieves the promise that we expect from it. Until then, supervised deep learning neural networks are something that is going to be around for a long time.

What we’re really excited about is that we’ve really found a way to make that work in a way that’s a cloud site that customers are actually happy. The users of CamFind are happy with the kind of results that they’re getting out of it.

Henrik:  As of March 2016, how do you see image recognition changing?

Brad:  We talk a little bit about image understanding. I think where this is really going is to video next. Now that we’ve got some technology out there that understands images, really the next phase of this is moving into video. How can we truly automate and machine the understanding of video? I think that’s really the next big wave of what we’re going to see evolve in terms of image recognition.

Henrik:  What advice would you like to share with people looking into image recognition?

Brad:  I think what we need to focus on specifically is this new state‑of‑the‑art technology (it’s not quite new) of deep learning neural networks. Really we’ve played around…As computer scientists, we’ve screwed around a lot, for decades, with a lot of different machine learning types.

What really is fascinating about deep learning is it mimics the human brain. It really mimics how we as humans learn about the world around us. I think that we need to really inspire different ways of playing around with and modeling these neural networks, training them on larger and larger amounts of real-world data. This is what we’ve really experimented with: training these neural networks on real-world data.

What we’ve found is that this is what truly brought about the paradigm shift that we were looking to achieve with deep learning neural networks. It’s really all about how we train them. For a long time, we’ve been experimenting with image recognition, computer vision, these sorts of things. If you look at an apples-to-apples analogy, we’re trying to train computers very similarly to if we were to shut off all of our senses.

We have all these different senses. We have sight. We have sound. We have smell. We have our emotions. We learn about the world around us through all of these senses combined. That’s what forms these very strong relationships in our memory that really teach us about things.

When you hold a ball in your hand, you see it in three dimensions because you’ve got stereoscopic vision, but you also feel the texture of it. You feel the weight of it. You feel the size. Maybe you smell the rubber or you have an emotional connection to playing with a ball as a child. All of these senses combined create your experience of what you know as a ball plus language and everything else.

Computers on the other hand, we feed them lots of two-dimensional images. It’s like if you were to close one of your eyes and look at the ball, but without any other senses at all, not a sense of touch, no sense of smell, no sense of sound, no emotional connection, none of those extra senses. It’s almost like if you’re flashing your eye for 30 milliseconds to that ball, tons of different pictures of the ball, and expecting to learn about it.

Of course, this isn’t how we learn about the world around us. We learn about the world around us through all these different senses and experiences and everything else. This is what we would like to inspire other computer scientists and those that are working with image recognition to really take into account, because this is really where we’ve seen as a company the biggest paradigm shift in image understanding and image cognition. We really want to try to push the envelope as far as the state of the art as a whole. This is kind of where we see it going.

Henrik:  Where can we find more information about image recognition?

Brad:  It’s actually a great question. This is such a buzzword these days, especially in the past couple of years. Really, it sounds almost cheesy but just typing in a search into Google about image recognition brings up so much now.

If you’re a programmer, there’s a lot of different frameworks that you can get started with for image recognition. One of them’s called OpenCV. This is a little bit more of a toolbox for image recognition. It requires a little bit of an understanding of programming and a little bit of understanding of the math and the sciences behind it. This gives you a lot of tools for basic image recognition.

Then to play around with some of these other things I was talking about, deep learning, neural networks, there’s a couple of different frameworks out there. There’s actually this really cool JavaScript website where you can play around with a neural network in real time and see how it learns. This was really a fantastic resource that I like to send people to, kind of help them, give them an introduction to how neural networks work.

It’s pretty cool. You play with it, parameters. It comes up with…It paints a picture of a cat. It’s all in JavaScript, too, so it’s pretty simple and everything.

There’s two frameworks that we particularly like to play around with. One of them is called Caffe, and the other one is called Torch. Both of these are publicly available, open source projects and frameworks for deep learning neural networks. They’re a great place to play around with and learn, see how these things work.

When people ask about image recognition and deep learning neural networks, that’s the sort of thing I like to point them to, because it’s a great introduction and playground to get your feet wet and dirty with this type of technology.

Henrik:  Thanks, Brad.

Brad:  Absolutely. Thanks again.

Henrik:  For more on this, visit Tagging.tech.

Thanks again.


 
