
Tech Guides - Artificial Intelligence

170 Articles

Do you need artificial intelligence and machine learning expertise in house?

Guest Contributor
22 Jan 2019
7 min read
Developing artificial intelligence expertise is a challenge. There's a huge global demand for practitioners with the right skills and knowledge, and a lack of people who can actually deliver what's needed. It's made harder by the fact that many of the most talented engineers are being hired by the planet's leading tech companies on salaries that simply aren't realistic for many organizations. Ultimately, you have two options: form an in-house artificial intelligence development team, or choose an external software development team or consultant with proven artificial intelligence expertise. Let's take a closer look at each strategy.

Building an in-house AI development team

If you want to develop your own AI capabilities, you will need to bring in strong technical skills in machine learning. Since recruiting experts in this area isn't an easy task, upskilling your current in-house development team may be an option. However, you will need to be confident that your team has the knowledge and attitude to develop those skills. It's also important to remember that a team building artificial intelligence draws on a range of skills and areas of expertise. If you can see how your team could evolve in that way, you're halfway to solving your problem.

AI experts you need for building a project

Big Data engineers: Before analyzing data, you need to collect, organize, and process it. AI is usually built on big data, so you need engineers who have experience working with structured and unstructured data and can build a secure data platform. They should have sound knowledge of Hadoop, Spark, R, Hive, Pig, and other Big Data technologies.

Data scientists: Data scientists are a vital part of your AI team. They work their magic with data: building models and investigating, analyzing, and interpreting it. They leverage data mining and other techniques to surface hidden insights and solve business problems.
NLP specialists: A lot of AI projects involve Natural Language Processing, so you will probably need NLP specialists. NLP allows computers to understand and translate human language, serving as a bridge between human communication and machine interpretation.

Machine learning engineers: These specialists work with machine learning libraries and deploy ML solutions into production. They take care of the maintainability and scalability of data science code.

Computer vision engineers: They specialize in image recognition, correlating an image to a particular metric rather than correlating metrics to metrics. For example, computer vision is used for modeling objects or environments (medical image analysis), identification tasks (a species identification system), and process control (industrial robots).

Speech recognition engineers: You will need these experts if you want to build a speech recognition system. Speech recognition can be very useful in telecommunication services, in-car systems, medical documentation, and education. For instance, it is used in language learning for practicing pronunciation.

Partnering with an AI solution provider

If you realize that recruiting and building your own in-house AI team is too difficult and expensive, you can engage an external AI provider. This approach helps companies keep the focus on their core expertise and avoid the headache of recruiting engineers and setting up a team. It also allows them to kick off the project much faster and thus gain a competitive advantage.

Factors to consider when choosing an artificial intelligence solution provider

AI engineering experience

Due to the huge popularity of AI these days, many companies claim to be professional AI development providers without practical experience, so it's extremely important to do extensive research. First, study the portfolio and case studies of the company.
Find out which AI, machine learning, or data science projects your potential vendor has worked on and what kind of artificial intelligence solutions the company has delivered. For instance, you may check out these European AI development companies and the products they developed. Also, make sure a provider has experience in the types of machine learning algorithms (supervised, unsupervised, and reinforcement), data structures and algorithms, computer vision, NLP, etc., that are relevant to your project needs.

Expertise in AI technologies

Artificial intelligence covers a multitude of different technologies, frameworks, and tools. Make sure your external engineering team consists of professional data scientists and data engineers who can solve your business problems. Building the AI team and selecting the necessary skill set can be challenging for businesses that have no internal AI expertise. Therefore, ask a vendor to provide tech experts or delivery managers who can advise you on team composition and help you hire the right people.

Capacity to scale the team

When choosing a team, you should consider not only your immediate needs but also the potential growth of your business. If you expect your company to scale up, you'll need more engineering capacity, so take into account your partner's ability to ramp up the team in the future. Also, consider factors such as the vendor's employer brand and retention rate, since your ability to attract top AI talent and keep them on your project will largely depend on them.

Suitable cooperation model

It is essential to choose an AI company with a cooperation model that fits your business requirements. The most popular cooperation models are Fixed Price, Time and Material, and Dedicated Development Team.
Under the fixed-price model, all the requirements and the scope of work are set from the start, and you as a customer need to have them described down to the smallest detail, as it will be extremely difficult to make change requests during the project. It is not the best option for AI projects, since they involve a lot of R&D and it is difficult to define everything at the initial stage.

The time-and-material model is best for small projects where you don't need the specialists to be fully dedicated to your project. This is not the best choice for AI development either, as the hourly rates of AI engineers are extremely high and the whole project would cost you a fortune under this type of contract.

To add more flexibility yet keep control over the project budget, it is better to choose a dedicated development team model or staff augmentation. It will allow you to change the requirements when needed and keep control over your team. With this type of engagement, you will be able to keep the knowledge within your team and develop your AI expertise, as the developers will work exclusively for you.

Conclusion

If you have to deal with the challenge of building AI expertise in your company, there are two possible ways to go. First, you can attract local AI talent and build the expertise in-house. You will then have to assemble a team of data scientists, data engineers, and other specialists depending on your needs. However, developing AI expertise in-house is always time-consuming and costly, given the shortage of well-qualified machine learning specialists and their high salary expectations.

The other option is to partner with an AI development vendor and hire an extended team of engineers. In this case, you have to consider a number of factors, such as the company's experience in delivering AI solutions, its ability to allocate the necessary resources, its technological expertise, and its capability to satisfy your business requirements.
Author Bio

Romana Gnatyk is a Content Marketing Manager at N-IX who is passionate about software development. She writes insightful content on various IT topics, including software product development, mobile app development, artificial intelligence, blockchain, and other technologies.


How can Artificial Intelligence support your Big Data architecture?

Natasha Mathur
26 Sep 2018
6 min read
Getting a big data project in place is a tough challenge. But making it deliver results is even harder. That's where artificial intelligence comes in. By integrating artificial intelligence into your big data architecture, you'll be able to better manage and analyze data in a way that provides a substantial impact on your organization. With big data getting even bigger over the next couple of years, AI won't simply be an optional extra, it will be essential.

According to IDC, the accumulated volume of big data will increase from 4.4 zettabytes to roughly 44 zettabytes (44 trillion GB) by 2020. Only by using artificial intelligence will you really be able to properly leverage such huge quantities of data. The International Data Corporation (IDC) also predicted a need for 181,000 people with deep analytical, data management, and interpretation skills this year. AI comes to the rescue again: it can ultimately compensate for today's lack of analytical resources with the power of machine learning, which enables automation.

Now that we know why big data needs AI, let's have a look at how AI helps big data. But for that, you first need to understand the big data architecture. While it's clear that artificial intelligence is an important development in the context of big data, what are the specific ways it can support and augment your big data architecture? It can, in fact, help you across every component in the architecture. That's good news for anyone working with big data, and good for organizations that depend on it for growth as well.

Artificial Intelligence in big data architecture

In a big data architecture, data is collected from different data sources and then moves forward to other layers.

Artificial Intelligence in data sources

Using machine learning, the process of structuring data becomes easier, thereby making it easier for organizations to store and analyze their data.
Now, keep in mind that large amounts of data from various sources can sometimes make data analysis even harder. This is because we now have access to heterogeneous sources of data that add different dimensions and attributes to the data, which slows down the entire process of collecting data. To make things quicker and more accurate, it's important to consider only the most important dimensions. This process is called data dimensionality reduction (DDR). With DDR, it is important to keep in mind that the model should always convey the same information without any loss of insight or intelligence.

Principal Component Analysis (PCA) is a useful machine learning method for dimensionality reduction. PCA performs feature extraction, meaning it combines all the input variables from the data, then drops the "least important" variables while making sure to retain the most valuable parts of all of the variables. Each of the "new" variables after PCA is independent of the others.

Artificial Intelligence in data storage

Once data is collected from the data source, it needs to be stored. AI can allow you to automate storage with machine learning, which also makes structuring the data easier. Machine learning models automatically learn to recognize patterns, regularities, and interdependencies from unstructured data and then adapt, dynamically and independently, to new situations.

K-means clustering is one of the most popular unsupervised algorithms for data clustering, used when there's large-scale data without any defined categories or groups. The K-means clustering algorithm performs pre-clustering or classification of data into larger categories. Unstructured data gets stored as binary objects, annotations are stored in NoSQL databases, and raw data is ingested into data lakes. All this data acts as input to machine learning models. This approach is great as it automates the refining of large-scale data.
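As a concrete sketch of this pre-clustering step, here is a minimal example using scikit-learn's KMeans on synthetic two-feature records (the data and the two-cluster setup are invented for illustration, not taken from the article):

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic, unlabeled records with two features (illustrative data only)
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),  # one natural grouping
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),  # another natural grouping
])

# Pre-cluster the data into two broad categories with no predefined labels
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
labels = kmeans.labels_

# Each incoming record can now be routed to a storage category
new_record = np.array([[4.8, 5.1]])
category = kmeans.predict(new_record)[0]
print(category)
```

Once fitted, `predict` routes each new record to its nearest cluster centroid, which is the sense in which incoming data can be filed into a category on arrival.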
So, as the data keeps coming, the machine learning model will keep storing it depending on what category it fits.

Artificial Intelligence in data analysis

After the data storage layer comes the data analysis part. There are numerous machine learning algorithms that help with effective and quick data analysis in a big data architecture.

One algorithm that can really step up the game when it comes to data analysis is Bayes' theorem. Bayes' theorem determines the probability of an event based on prior knowledge of conditions that might be related to the event, using stored data to 'predict' the future. This makes it a wonderful fit for big data: the more data you feed to a Bayesian algorithm, the more accurate its predictive results become.

Decision trees are another machine learning technique that is great for performing data analysis. Decision trees help you reach a particular decision by presenting all possible options and their probability of occurrence. They're extremely easy to understand and interpret.

LASSO (least absolute shrinkage and selection operator) is another algorithm that helps with data analysis. LASSO is a regression analysis method capable of performing both variable selection and regularization, which enhances the prediction accuracy and interpretability of the resulting model. LASSO regression analysis can be used to determine which of your predictors are most important.

Once the analysis is done, the results are presented to other users or stakeholders. This is where the data utilization part comes into play. Data helps to inform decision making at various levels and in different departments within an organization.

Artificial intelligence takes big data to the next level

Heaps of data get generated every day by organizations all across the globe. Given such a huge amount of data, it can sometimes go beyond the reach of current technologies to get the right insights and results out of it.
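The LASSO variable selection described above can be sketched in a few lines with scikit-learn; the synthetic data below (five predictors, only two of which matter) is invented purely for illustration:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic regression data: only the first two predictors drive the target
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# L1 regularization shrinks the coefficients of irrelevant predictors to zero,
# performing variable selection and regularization in one step
model = Lasso(alpha=0.1).fit(X, y)
print(np.round(model.coef_, 2))
```

The nonzero coefficients identify which predictors matter most, which is exactly the use the article describes.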
Artificial intelligence takes the big data process to another level, making it easier to manage and analyze a complex array of data sources. This doesn't mean that humans will instantly lose their jobs; it simply means we can put machines to work on things that even the smartest and most hardworking humans would be incapable of. There's a saying that goes "big data is for machines; small data is for people", and it couldn't be truer.


Facebook plans to use Bloomsbury AI to fight fake news

Pravin Dhandre
30 Jul 2018
3 min read
“Our investments in AI mean we can now remove more bad content quickly because we don't have to wait until after it's reported. It frees our reviewers to work on cases where human expertise is needed to understand the context or nuance of a situation. In Q1, for example, almost 90% of graphic violence content that we removed or added a warning label to was identified using AI. This shift from reactive to proactive detection is a big change -- and it will make Facebook safer for everyone.” - Mark Zuckerberg, on Facebook's earnings call on Wednesday this week

To understand the significance of the above statement, we must first look at the past. Last year, social media giant Facebook suffered multiple lawsuits across the UK, Germany, and the US for defamation due to fake news articles and for spreading misleading information. To make amends, Facebook came up with fake news identification tools; however, it failed to completely tame the effects of bogus news. In fact, the company took a bad hit in advertising revenue, and its social reputation nosedived.

Early this month, Facebook confirmed the acquisition of Bloomsbury AI, a London-based artificial intelligence start-up with over 60 patents acquired to date. Bloomsbury AI focuses on natural language processing, developing machine reading methods that can understand written text across a broad range of domains. The artificial intelligence team at Facebook will be onboarding the complete Bloomsbury AI team and will build highly robust methods to fight the plague of fake news throughout the Facebook platform. The rich expertise carried over by the Bloomsbury AI team will strengthen Facebook's natural language processing research and deepen its understanding of natural language and its applications.
It appears that the acquisition will help Facebook develop advanced machine reading, reasoning, and question answering methods, boosting Facebook's NLP engine so it can assess the legitimacy of content across a broad range of topics and make intelligent choices, thereby tackling the challenges of fake news and automated bots. No doubt, Facebook is going to leverage Bloomsbury's Cape service to answer a majority of questions on unstructured text. The combined team will also play a significant role in parsing content, largely to tackle fake photos and videos too. In addition, it has been said that the new team members will actively contribute to ongoing artificial intelligence projects such as AI hardware chips, AI technology mimicking humans, and many more.


6 use cases of Machine Learning in Healthcare

Sugandha Lahoti
10 Nov 2017
7 min read
While hospitals have sophisticated processes and highly skilled administrators, management can still be an administrative nightmare for already time-starved healthcare professionals. A sprinkle of automation can do wonders here. It could free up a practitioner's invaluable time and thereby allow them to focus on tending to critically ill patients and complex medical procedures.

At the most basic level, machine learning can mechanize routine tasks such as documentation, billing, and regulatory processes. It can also provide ways and tools to diagnose and treat patients more efficiently. However, these tasks only scratch the surface. Machine learning is here to revolutionize healthcare and allied industries such as pharma and medicine. Below are some ways it is being put to use in these domains.

Helping with disease identification and drug discovery

Healthcare systems generate copious amounts of data and use them for disease prediction. However, the software necessary to generate meaningful insights from this unstructured data is often not in place, so drug and disease discovery end up taking time. Machine learning algorithms can discover signatures of diseases at rapid rates by allowing systems to learn and make predictions based on previously processed data. They can also be used to determine which chemical compounds could work together to aid drug discovery, eliminating the time-consuming process of experimenting on and testing millions of compounds. With faster discovery of diseases, the chances of detecting symptoms earlier and the probability of survival increase, as do the available treatment options. IBM has collaborated with Teva Pharmaceutical to discover new treatment options for respiratory and central nervous system diseases using machine learning techniques such as predictive and visual analytics that run on the IBM Watson Health Cloud.
To gain more insights on how IBM Watson is changing the face of healthcare, check this article.

Enabling precision medicine

Precision medicine revolves around healthcare practices specific to a particular patient. This includes analyzing a person's genetic information, health history, environmental exposure, and needs and preferences to guide diagnosis and subsequent treatment. Here, machine learning algorithms sift through vast databases of patient data to identify factors, such as genetic history and predisposition to diseases, that could strongly determine treatment success or failure. ML techniques in precision medicine exploit molecular and genomic data to assist doctors in directing therapies to patients and to shed light on disease mechanisms and heterogeneity. They can also predict which diseases are likely to occur in the future and suggest methods to avoid them. Cellworks, a life sciences technology company, offers a SaaS-based platform for generating precision medicine products. Their platform analyzes the genomic profile of the patient and then provides patient-specific reports for improved diagnosis and treatment.

Assisting radiology and radiotherapy

CT and MRI scans for radiological diagnosis and interpretation are burdensome, laborious, and time-consuming. They involve segmentation (differentiating between healthy and infectious tissue) which, when done manually, has a good probability of resulting in errors and misdiagnosis. Machine learning algorithms can speed up the segmentation process while also increasing accuracy in radiotherapy planning. ML can provide physicians with information for better diagnostics, helping to locate tumors accurately. It can also predict radiotherapy response to help create a personalized treatment plan. Apart from these, ML algorithms find use in medical image analysis, as they learn from examples.
This involves classification techniques that analyze images and available clinical information to generate the most likely diagnosis. Deep learning can also be used for detecting lung cancer nodules in early screening CT scans and displaying the results in ways useful for clinical practice. Google's machine learning division, DeepMind, is automating radiotherapy treatment for head and neck cancers using scans from almost 700 diagnosed patients. An ML algorithm checks the scans of symptomatic patients against these previous scans to help physicians develop a suitable treatment process. Arterys, a cloud-based platform, automates cardiac analysis using deep learning.

Providing neurocritical care

A large number of neurological diseases develop gradually or in stages, so the decay of the brain happens over time. Traditional approaches to neurological care, such as peak activation, EEG epileptic spikes, and pronator drift, are not accurate enough to diagnose and classify neurological and psychiatric disorders. This is because they are typically used for end-result assessment rather than for progressive analysis of how the brain disease develops. Moreover, timely personalized neurological treatment and diagnosis rely heavily on the constant availability of an expert. Machine learning algorithms can advance detection and prediction by learning how the brain progressively develops into these conditions. Deep learning techniques are applied in neuroimaging to detect abstract and complex patterns from single-subject data in order to detect and diagnose brain disorders. Machine learning techniques such as SVM, RBFN, and RF are combined with pronator drift tests (PDT) to detect stroke symptoms, based on quantifying proximal arm weakness using inertial sensors and signal processing. Machine learning algorithms can also be used for detecting signs of dementia before its onset.
The Douglas Mental Health University Institute uses PET scans to train ML algorithms to spot signs of dementia by analyzing them against scans of patients who have mild cognitive impairment. They then run the scans of symptomatic patients through the trained algorithm to predict the likelihood of dementia.

Predicting epidemic outbreaks

Epidemic prediction traditionally relies on manual accounting. This includes self-reports or aggregation of information from healthcare services, such as reports by health protection agencies like the CDC, the NHIS, the National Immunization Survey, etc. However, these are time-consuming and error-prone, so predicting and prioritizing outbreaks becomes challenging. ML algorithms can automatically perform analysis, improve calculations, and verify information with minimal human intervention. Machine learning techniques like support vector machines and artificial neural networks can predict the epidemic potential of a disease and provide alerts for disease outbreaks. They do this using data collected from satellites, real-time social media updates, historical information on the web, and other sources. They also use geospatial data, such as temperature, weather conditions, and wind speed, to predict the magnitude of impact an epidemic could cause in a particular area and to recommend measures for preventing and containing it early on. AIME, a medical startup, has come up with an algorithm to predict the outcome and even the epicenter of epidemics such as dengue fever before they occur.

Better hospital management

Machine learning can bring about a change in traditional hospital management systems by envisioning hospitals as digital patient-centric care centers. This includes automating routine tasks such as billing, admission and clearance, and monitoring patients' vitals. With administrative tasks out of the way, hospital authorities could fully focus on the care and treatment of patients.
ML techniques such as computer vision can be used to feed all of a patient's vital signs directly into the EHR from monitoring devices. Smart tracking devices are also used on patients to provide their real-time whereabouts. Predictive analysis techniques work on a continuous stream of real-time images and data; this analysis can sense risk and prioritize activities for the benefit of all patients. ML can also automate non-clinical functions, including pharmacy, laundry, and food delivery. The Johns Hopkins Hospital has its own command center that uses predictive analytics for efficient operational flow.

Conclusion

The digital health era focuses on health and wellness rather than disease. The incorporation of machine learning in healthcare provides an improved patient experience and better public health management, and reduces costs by automating manual labour. The next step in this amalgamation is a successful collaboration of clinicians and doctors with machines. This would bring about a futuristic health revolution with improved, precise, and more efficient care and treatment.


Behind the scenes: Deep learning evolution and core concepts

Shoaib Dabir
19 Dec 2017
6 min read
Note: This article is an excerpt from the book Learning Generative Adversarial Networks by Kuntal Ganguly. The book will help you build and analyze various deep learning models and apply them to real-world problems.

This article will take you through the history of deep learning and how it has grown over time. It will walk you through some of the core concepts of deep learning, like sigmoid activation and the rectified linear unit (ReLU).

Evolution of deep learning

A lot of the important work on neural networks happened in the '80s and '90s, but back then computers were slow and datasets were tiny, so the research didn't find many applications in the real world. As a result, in the first decade of the 21st century, neural networks almost completely disappeared from the world of machine learning. It's only in the last few years, first with speech recognition around 2009 and then with computer vision around 2012, that neural networks made a big comeback (LeNet, AlexNet). What changed? Lots of data (big data) and cheap, fast GPUs. Today, neural networks are everywhere. So, if you're doing anything with data, analytics, or prediction, deep learning is definitely something that you want to get familiar with.

Deep learning is an exciting branch of machine learning that uses data, lots of data, to teach computers how to do things only humans were capable of before, such as recognizing what's in an image, understanding what people are saying when they talk on their phone, translating a document into another language, and helping robots explore the world and interact with it. Deep learning has emerged as a central tool for solving perception problems, and it's the state of the art in computer vision and speech recognition.
Today many companies have made deep learning a central part of their machine learning toolkit. Facebook, Baidu, Amazon, Microsoft, and Google are all using deep learning in their products, because deep learning shines wherever there is lots of data and complex problems to solve.

Deep learning is the name we often use for "deep neural networks" composed of several layers. Each layer is made of nodes. The computation happens in the nodes, where input data is combined with a set of parameters, or weights, that either amplify or dampen that input. These input-weight products are then summed, and the sum is passed through an activation function to determine to what extent the value should progress through the network to affect the final prediction, such as an act of classification. A layer consists of a row of nodes that turn on or off as input is fed through the network. The output of the first layer becomes the input of the second layer, and so on.

Let's get familiar with some deep neural network concepts and terminology.

Sigmoid activation

The sigmoid activation function used in neural networks has an output boundary of (0, 1), and α is the offset parameter that sets the input value at which the sigmoid evaluates to 0.5. The sigmoid function often works fine for gradient descent as long as the input x is kept within a limit. For large values of x, y is constant, so the derivative dy/dx (the gradient) equates to 0; this is often termed the vanishing gradient problem. It is a problem because when the gradient is 0, multiplying it with the loss (actual value - predicted value) also gives us 0, and ultimately the network stops learning.

Rectified Linear Unit (ReLU)

A neural network can be built by combining linear classifiers with non-linear functions. The Rectified Linear Unit (ReLU) has become very popular in the last few years. It computes the function f(x) = max(0, x).
In other words, the activation is simply thresholded at zero. Unfortunately, ReLU units can be fragile during training and can "die": a large update can change a ReLU neuron's weights in such a way that the neuron never activates on any datapoint again, so the gradient flowing through the unit is forever zero from that point on. To overcome this problem, a leaky ReLU function has a small negative slope (of 0.01, or so) instead of zero when x < 0:

f(x) = αx for x < 0, and f(x) = x for x >= 0

where α is a small constant.

Exponential Linear Unit (ELU)

The mean of the ReLU activation is not zero, which sometimes makes learning difficult for the network. The Exponential Linear Unit (ELU) is similar to the ReLU activation function when the input x is positive, but for negative values it is bounded by a fixed value of -1 for α = 1 (the hyperparameter α controls the value to which an ELU saturates for negative inputs). This behavior helps push the mean activation of neurons closer to zero, which helps the network learn representations that are more robust to noise.

Stochastic Gradient Descent (SGD)

Scaling batch gradient descent is cumbersome because it has to compute a lot if the dataset is big. As a rule of thumb, if computing your loss takes n floating-point operations, computing its gradient takes about three times that. In practice we want to train on lots of data, because on real problems we will always get more gains the more data we use. And because gradient descent is iterative, it has to run for many steps: to update the parameters in a single step, it has to go through all the data samples, and then this pass over the data has to be repeated tens or hundreds of times. Instead of computing the loss over the entire dataset for every step, we can compute the average loss over a very small random fraction of the training data: think between 1 and 1,000 training samples each time.
This technique is called Stochastic Gradient Descent (SGD) and is at the core of deep learning, because SGD scales well with both data and model size. SGD has a reputation for being black magic because it has lots of hyperparameters to play with and tune, such as initialization parameters, learning rate, decay, and momentum, and you have to get them right. Deep learning has emerged over time through the evolution of neural networks within machine learning. It is an intriguing segment of machine learning that uses huge amounts of data to teach computers how to do things that only humans were capable of. Key players such as Facebook, Baidu, Amazon, Microsoft, and Google adopted it at a very early stage, and it is executed through the layered concepts described above. If deep learning has got you hooked, wait till you learn what GANs are from the book Learning Generative Adversarial Networks.
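The activation functions above are short enough to sketch directly. Here is a minimal, illustrative pure-Python sketch (not code from the book) of sigmoid, ReLU, leaky ReLU, and ELU, along with the vanishing-gradient behavior that motivates the ReLU family:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # derivative of the sigmoid: s(x) * (1 - s(x))
    s = sigmoid(x)
    return s * (1.0 - s)

def relu(x):
    # activation thresholded at zero
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # small negative slope instead of zero for x < 0
    return alpha * x if x < 0 else x

def elu(x, alpha=1.0):
    # saturates to -alpha for very negative inputs
    return alpha * (math.exp(x) - 1.0) if x < 0 else x

# the sigmoid's gradient collapses for large |x|, while leaky ReLU
# and ELU keep a nonzero signal for negative inputs
for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(x, round(sigmoid_grad(x), 6), relu(x), leaky_relu(x), round(elu(x), 4))
```

Note that sigmoid_grad(10.0) is roughly 4.5e-05: multiplying such a tiny gradient into the backpropagated error is what effectively stops learning.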
Amey Varangaonkar
28 Mar 2018
4 min read
Deep Learning in games - Neural Networks set to design virtual worlds

Games these days are closer to reality than ever. Life-like graphics, smart gameplay, and realistic machine-human interactions have led major game studios to up the ante when it comes to adopting the latest tech for developing games. In fact, not so long ago, we shared with you a few interesting ways in which artificial intelligence is transforming the gaming industry. The inclusion of deep learning in games has emerged as one popular solution to make games smarter. Deep learning can be used to enhance the realism and excitement in games by teaching the game agents how to behave more accurately, and in a more life-like manner. We recently came across this interesting application of deep learning to play the game of FIFA 18, and we were quite impressed! Using just two neural networks and a limited amount of training, the bot that was developed managed to learn the basic rules of football (soccer). Not just that, it was also able to perform the basic movements and tasks in the game correctly. To achieve this, two neural networks were developed: a Convolutional Neural Network to detect objects within the game, and a second LSTM (Long Short Term Memory) network to specify the movements accordingly. The same user also managed to leverage deep learning to improve the in-game graphics of FIFA 18. Using the deepfakes algorithm, he swapped the in-game face of one of the players with the player's real-life face. The reason? The in-game faces, although quite realistic, could be better and more realistic. The experiment ended up nearly perfect: the resulting face was strikingly close to the real one. How did he do it? After gathering some training data, basically images of players scraped off Google, the user trained two autoencoders which learnt the distinction between the in-game face and the real-world face.
Then, using the deepfakes algorithm, the inputs were reversed, recreating the real-world face in the game itself. The difference is quite astonishing. Apart from improving the gameplay and the in-game character graphics, deep learning can also be used to enhance the way opponents and adversaries interact with the player. Taking the example of FIFA again, deep learning could enhance the behaviour and appearance of the in-game crowd, which could react or cheer more convincingly according to their team's performance.

How can deep learning benefit video games? The following are some of the clear advantages of implementing deep learning techniques in games:

- Highly accurate results can be achieved with more and more training data
- Manual intervention is minimal
- Game developers can focus on effective storytelling rather than on in-game graphics

Another obvious question comes to mind at this stage, however: what are the drawbacks of implementing deep learning for games? A few come to mind immediately:

- The complexity of the training models can be quite high
- Images in games need to be generated in real time, which is quite a challenge
- The computation time can be quite significant
- The training dataset needed for accurate results can be humongous

With advancements in technology and better, faster hardware, many of the current limitations in developing smarter games can be overcome. Fast generative models can address the real-time generation of images, while faster graphics cards can take care of the model computation issue. All in all, dabbling with deep learning in games seems a punt worth taking for game studios. What do you think? Is incorporating deep learning techniques in games a scalable idea?
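The face-swap trick described above rests on a simple architecture: two autoencoders that share one encoder, with a separate decoder per face domain. The sketch below is a toy, hypothetical reconstruction of that idea using plain linear layers and random vectors standing in for face images; it is not the actual deepfakes code, and all names and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, n = 16, 4, 200                      # "image" size, code size, samples per domain
basis = rng.normal(size=(d, k))           # shared structure across both domains
faces_game = basis @ rng.normal(size=(k, n)) + 1.0   # stand-in for in-game faces
faces_real = basis @ rng.normal(size=(k, n)) - 1.0   # stand-in for real faces

E      = rng.normal(scale=0.1, size=(k, d))   # shared encoder
D_game = rng.normal(scale=0.1, size=(d, k))   # decoder for in-game faces
D_real = rng.normal(scale=0.1, size=(d, k))   # decoder for real faces

def train_step(X, E, D, lr=0.01):
    """One gradient step on the reconstruction loss ||D E X - X||^2."""
    Z = E @ X
    err = D @ Z - X
    loss = np.mean(err ** 2)
    gD = 2 * err @ Z.T / X.size
    gE = 2 * D.T @ err @ X.T / X.size
    D -= lr * gD
    E -= lr * gE
    return loss

first = train_step(faces_game, E, D_game)
for _ in range(1500):
    last = train_step(faces_game, E, D_game)
    train_step(faces_real, E, D_real)     # same encoder, different decoder

# the "swap": encode an in-game face, decode it with the real-face decoder
swapped = D_real @ (E @ faces_game[:, :1])
print(first, last, swapped.shape)
```

Because the code space is shared, the encoder learns features common to both domains, so decoding a game-face code with the real-face decoder produces a "real" rendition of the same face, which is the essence of the swap.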
Neil Aitken
02 Aug 2018
7 min read
How 5G Mobile Data will propel Artificial Intelligence (AI) progress

Like its predecessors, 3G and 4G, 5G refers to the latest 'G', or generation, of mobile technology. 5G will give us very fast, effectively infinite, mobile data download bandwidth. Downloading a TV show to your phone over 5G, in its entirety, in HD, will take less than a second, for example. A podcast will be downloaded within a fraction of a second of you requesting it. Scratch the surface of 5G, however, and there is a great deal more to see than just fast mobile data speeds. 5G is the backbone on which many emerging technologies, such as AI, blockchains, and IoT, will reach mainstream adoption. Today, we look at how 5G will accelerate AI growth and adoption.

5G will create the data AI needs to thrive

One feature of 5G with ramifications beyond data speed is latency. 5G offers virtually zero latency as a service. Latency is the time needed to transmit a packet of data from one device to another: the period between when a request is made and when the response is completed.

Figure: 5G will be superfast, but will also benefit from near zero 'latency'. (Source: Economist)

At the moment, we keep files (music, pictures, or films) in our phones' memory permanently, and we have plenty of processing power on our devices. In fact, the main upgrade between phone generations these days is a faster processor. In a 5G world, we will be able to use cheap parts, processors and memory, in our new phones. Data downloads will be so fast that we can fetch files the moment we need them; we won't need to store information on the phone unless we want to. Even though the files are downloaded from the cloud, because the network has near zero latency, the user feels as though the files are on the phone. In other words, you are guaranteed a seamless user experience in a 5G world.
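The bandwidth claims above are easy to sanity-check with back-of-the-envelope arithmetic. The throughput and file-size figures below are illustrative assumptions, not official 4G/5G specifications:

```python
def download_seconds(file_megabytes, link_megabits_per_second):
    # megabytes -> megabits: multiply by 8
    return file_megabytes * 8 / link_megabits_per_second

hd_episode_mb = 1_000   # assume a ~1 GB HD TV episode

lte_speed = 50          # an assumed typical real-world 4G/LTE rate, in Mbps
nr_speed = 10_000       # an assumed 10 Gbps 5G peak rate, in Mbps

print(download_seconds(hd_episode_mb, lte_speed))  # 160.0 seconds on 4G
print(download_seconds(hd_episode_mb, nr_speed))   # 0.8 seconds on 5G
```

Latency is a separate quantity: even at infinite bandwidth, every request still pays the network round-trip time, which is why 5G's near zero latency matters as much as its raw throughput.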
The upshot of all this is that the majority of new data generated by mobile products will move to the cloud for storage. At their most fundamental level, AI algorithms are pattern matching tools: the bigger the data trove, the faster and better performing the results of AI analysis are. These new structured data sets, created by 5G, will be available in the place where it is easiest to extract and manipulate (analyze) them: the cloud. There will be 100 billion 5G devices connected to cellular networks by 2025, according to Huawei. 5G is going to collect data from those devices, and from all the smartphones in the world, and send it all back to the cloud. That data is the source of the incredible power AI gives businesses.

5G driving AI in autonomous vehicles

5G's features, and this cloud-plus-connected-device future, will manifest themselves in many ways. One very visible example is how 5G will supercharge the contribution that AI can make to self-driving cars, especially to reliability and safety. A great deal of the AI processing required to keep a self-driving car operating safely will be done by computers on board the vehicle. However, 5G's capacity to communicate large amounts of data quickly means that any unusual inputs (for example, the car is entering or in a crash situation) can be sent to bigger computing equipment in the cloud, which can perform more serious processing. Zero latency is important in these situations for commands that might come from a centralized accident computer designed to increase safety, for example issuing the command 'brake'. In fact, according to manufacturers, it is likely that, ultimately, groups of cars will be coordinated by AI using 5G to control the vehicles, in a model known as swarm computing.
5G will make AI much more useful with 'context' - Intel

5G will power AI by providing location information, which can be considered in establishing the context of questions asked of the tool, according to Intel's Data Center Group. For example, asking your digital assistant where the tablets are means something different depending on whether you're in a pharmacy or an electronics store. The nature of 5G is that it's a mobile service: location information is both key to context and an inherent element of information sent over a 5G connection. By communicating where they are, 5G sensors will help AI-based digital assistants solve our everyday problems.

5G phones will enable AI calculations on 'edge' network devices - Arm

5G will push some processing to the 'edge' of the network, for manipulation by a growing range of AI chips on the processors of phones. In this regard, smartphones, like any Internet of Things processor 'in the field', are simply an AI platform. Handset manufacturers are including new software features in their phones that customers love to use, including AI-based search interfaces which allow them to search for images containing 'heads' and see an accurate list.

Figure: Arm is designing new types of chips targeted at AI calculations on 'edge' network devices. (Source: Arm's Project Trillium)

Arm, one of the world's largest CPU producers, is creating specific, dedicated AI chipsets, often derived from the technology behind its Graphics Processing Units. These chips already process AI-based calculations up to 50 times faster than standard microprocessors, and their performance is set to improve 50x over the next 3 years, according to the company.

AI is part of 5G networks - Huawei

Huawei describes itself as an AI company (as well as a number of other things, including handset manufacturer).
Huawei is one of the biggest electronics manufacturers in China and is currently selling networking products to the world's telecommunications companies as they prepare to roll out their 5G networks. Based on the insight that 70% of network system downtime comes from human error, Huawei is now eliminating humans from the network management component of its work, to the degree that it can. Instead, it is implementing automated, AI-based predictive maintenance systems to increase data throughput across the network and reduce downtime. The way we use cellular networks is changing. Different applications require different backend traffic to be routed across the network, depending on the customer need. Someone watching video, for example, has a far lower tolerance for a disruption to the data throughput (the 'stuttering Netflix' effect) than a connected IoT sensor which is trying to communicate the temperature reading of a thermometer. Huawei's network maintenance AI software optimizes for these different packet needs, maintaining the near zero latency the standard demands at a lower cost. AI-based network maintenance completes a virtuous loop: 5G devices on new cellular networks give AI the raw data it needs, including valuable context information, and AI helps the data flow across the 5G network better.

Bringing it all together

5G and artificial intelligence (AI) are revolutionary technologies that will evolve alongside each other. 5G isn't just fast data; it's one of the most important technologies ever devised. Just as the smartphone did, it will fundamentally change how we relate to information, partly because it will link us to thousands of newly connected devices on the Internet of Things.
Ultimately, it could be the secondary effects of 5G, the network's almost zero latency, that provide the largest benefit: creating structured data sets from billions of connected devices in an easily accessible place, the cloud, which can be used to fuel the AI algorithms that run on them. Networking equipment makers, chip manufacturers, and governments have all connected the importance of AI with the potential of 5G. Commercial sales of 5G start in the US, UK, and Australia in 2019.
Sugandha Lahoti
29 Nov 2017
6 min read
Quantum A.I. : An intelligent mix of Quantum+A.I.

"Mixed reality, artificial intelligence and quantum computing are the three path-breaking technologies that will shape the world in the coming years." - Satya Nadella, CEO, Microsoft.

Artificial intelligence, the biggest scientific and technological revolution of the decade, has the potential to advance human civilization like never before. At the surface level, it seems to be all about automated functioning and intelligent coding. But at the core, its algorithms require huge data, quality training, and complex models, and processing these algorithmic computations needs hardware. Presently, digital computers operate on classical Boolean logic. Quantum computing is the next-generation hardware and software technology, based on the laws of quantum mechanics: quantum machines use qubits instead of Boolean logic in order to speed up calculations. The combination of these two path-breaking technologies, AI and quantum computing, is said to be the future of technology. Quantum A.I. is all about applying the fast computation capabilities of quantum computers to artificial intelligence based applications.

Understanding Quantum Computing

Before we jump into Quantum A.I., let us first understand quantum computing in detail. In physics terminology, quantum mechanics is the study of nature at the atomic and subatomic level, in contrast to classical physics, which describes nature at the macroscopic level. At the quantum level, particles may take the form of more than one state at the same time. Quantum computing utilizes this fundamental quantum phenomenon of nature to process information. A quantum computer stores information in the form of quantum bits, known as qubits, analogous to the binary bits used by digital computers. However, the state of a qubit is not fixed: it can encode information as both 1s and 0s with the help of the quantum mechanical principles of superposition, entanglement, and tunneling.
The use of quantum logic enables a quantum computer to solve certain problems at an exponentially faster rate than present-day computers, and physicists and researchers consider quantum computers powerful enough to outperform present processors.

Quantum Computing for Artificial Intelligence

Regardless of how smart AI algorithms are, high-performance hardware is essential for them to function. Current GPUs allow algorithms to run at an operable speed, but this is a fraction of what quantum computing promises. A quantum computing approach would let AI algorithms undergo exponential speedups over existing digital computers, easing problems related to machine learning, clustering, classification, and finding constructive patterns in large quantities of data. Quantum computing amalgamated with AI could speed up ML and AI algorithms in order to develop systems which can better interpret, improve, and understand large data sets of information. Specific use cases in the area of Quantum AI:

Random Number Generation

Classical, digital computers are only able to generate pseudo-random numbers, and they rely on computational difficulty for encryption, which makes such schemes more easily crackable using quantum computers. Certain machine learning algorithms require pure random numbers to generate ideal results, specifically for financial applications. Quantum systems have a mechanism to generate the pure random numbers such machine learning applications require. QRNG (Quantum Random Number Generator) is a quantum device by Certes Networks, used for generating high-quality random numbers for secure encryption key generation.

Quantum-enhanced Reinforcement Learning

Reinforcement learning is an area of artificial intelligence in which agents learn about an environment and take actions to achieve rewards. It is usually time consuming in the initial training process and in choosing an optimal path. With the help of a quantum agent, the training time reduces dramatically.
Additionally, a quantum agent ends each learning process with a thorough description of the environment. This is an advancement over the classical approach, where reinforcement learning schemes are model-free.

Quantum-Inspired Neural Nets

Quantum neural networks leverage ideas from quantum theory for a fuzzy-logic-based neural network implementation. Current neural networks for big data applications are generally difficult to train, as they use a feedback loop to update parameters in the training phase. In quantum computers, quantum effects such as interference and entanglement could be used to quickly update parameters in the training phase, easing the entire training process.

Big Data Analytics

Quantum computers have the ability to handle the huge amount of data being generated, which will continue to grow at an exponential rate. Using quantum computing techniques for big data analytics, useful insights would be within every individual's reach. This could lead to better portfolio management, optimal routing for navigation, the best possible treatments, personalized medications, and so on. Empowering big data analytics with quantum computing will ease sampling, optimizing, and analyzing large quantities of data, giving businesses and consumers better decision-making ability. These are a few examples of Quantum AI's capabilities. Quantum computers powered by artificial intelligence are set to have a tremendous impact in the fields of science and engineering.

Ongoing Research and Implementation

Google plans to build a 49-qubit quantum chip by the end of 2017. Microsoft's CEO, during his keynote session at Microsoft Ignite, announced a new programming language designed to work on a quantum simulator as well as a quantum computer. In this rat race, IBM successfully built and measured a 50-qubit quantum computer. Additionally, Google is collaborating with NASA to release a number of research papers pertaining to the Quantum A.I. domain. Rigetti Computing plans to devise a computer that will leverage quantum physics for applications pertaining to artificial intelligence and chemistry simulations; it will offer a cloud-based service, along the lines of Google and Microsoft, for remote usage. Volkswagen, the German automaker, plans to collaborate with Google's quantum AI team to develop new-age digital features for cars and an intelligent traffic-management system. It is also contemplating building AI systems for autonomous cars.

Future Scope

In the near future, high-level quantum computers will help in the development of complex AI models with ease. Such quantum-enhanced AI algorithms will influence application development in the fields of finance, security, healthcare, molecular science, automobiles, manufacturing, and more. Artificial intelligence married to quantum computing is said to be the key to a brighter, more tech-oriented future: a future that will take intelligent information processing to a whole new altitude.
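The superposition idea behind qubits can be illustrated in a few lines of standard-library Python. This toy simulation (an illustration, not a real quantum computation) applies a Hadamard gate to a qubit starting in state |0⟩ and samples measurements via the Born rule, which is how an ideal quantum random number generator produces unbiased bits:

```python
import math
import random

# |0> written as amplitudes (amp_0, amp_1)
amp_0, amp_1 = 1.0, 0.0

# a Hadamard gate maps |0> to an equal superposition of |0> and |1>
h = 1 / math.sqrt(2)
amp_0, amp_1 = h * (amp_0 + amp_1), h * (amp_0 - amp_1)

p_one = amp_1 ** 2   # Born rule: probability of measuring 1

# simulate repeated measurements (a classical PRNG stands in for true
# quantum randomness here, hence the reproducible seed)
random.seed(7)
bits = [1 if random.random() < p_one else 0 for _ in range(10_000)]
print(p_one, sum(bits) / len(bits))   # both close to 0.5
```

The seed makes this classical simulation reproducible; measuring a real superposed qubit is irreducibly random, which is precisely the property QRNG hardware exploits.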
Sugandha Lahoti
13 Nov 2017
7 min read
13 reasons why Exit Polls get it wrong sometimes

An exit poll, as the name suggests, is a poll taken immediately after voters exit the polling booth. Private companies working for popular newspapers or media organizations conduct these exit polls and are popularly known as pollsters. Once the data is collected, data analysis and estimation are used to predict the winning party and the number of seats captured. Turnout models, built using logistic regression or random forest techniques, are used to predict turnout in the exit poll results. Exit polls depend on sampling, hence a margin of error does exist; the margin of error describes how close pollsters expect an election result to be to the true population value. Normally, a margin of error of plus or minus 3 percentage points is acceptable. However, in recent times, there have been instances where the poll average was off by a larger percentage. Let us analyze some of the reasons why exit polls can get their predictions wrong.

1. Sampling inaccuracy/quality

Exit polls depend on the sample size, i.e. the number of respondents or the number of precincts chosen. Incorrect estimation of this may lead to error margins. The quality of the sample data also matters. This includes factors such as whether the selected precincts are representative of the state, whether the polled audience in each precinct represents the whole, etc.

2. Model did not consider multiple turnout scenarios

Voter turnout refers to the percentage of eligible voters who cast a vote during an election. Pollsters may often misestimate the number of people who actually vote out of the total population eligible to vote. Also, they often base their turnout prediction on past trends. However, voter turnout depends on many factors. For example, some voters might not turn up due to indifference or a perception that their vote might not count (which is not true).
In such cases, pollsters adjust the weighting to reflect high or low turnout conditions, keeping the total turnout count in mind. Observations taken during a low turnout are also considered and the weights adjusted accordingly. In short, pollsters try their best to stay faithful to the original data.

3. Model did not consider past patterns

Pollsters may make a mistake by not delving into the past. They can gauge current turnout rates by taking into account turnout in presidential elections or previous midterm elections. Although one may assume that the turnout percentage over the years has been stable, a check on past voter turnout is a must.

4. Model was not recalibrated for the year and time of the election, such as odd-year midterms

Timing is a crucial factor in getting the right traction for people to vote. At times, some social issues are much more hyped and talked about than the elections themselves. For instance, the news of the Ebola virus outbreak in Texas was more prominent than news about the candidates standing in the 2014 midterm elections. Another example would be an election day set on a Friday versus any other weekday.

5. Number of contestants

Everyone has a personal favorite. In cases where there are just two contestants, it is straightforward to arrive at a clear winner. For pollsters, it is easier to predict votes when the whole world is talking about the race and they know which candidate is most talked about. With an increase in the number of candidates, carrying out an accurate survey becomes more challenging: pollsters have to reach out to more respondents for the survey to be effective.

6. Swing voters/undecided respondents

Another possible explanation for discrepancies between poll predictions and the outcome is a large proportion of undecided voters in the poll samples.
Possible solutions could be:

- Asking relative questions instead of absolute ones
- Allotting undecided voters in proportion to party support levels while making estimates

7. Number of down-ballot races

Sometimes a popular party leader helps attract votes to another, less popular candidate of the same party. This is the down-ballot effect. At times, down-ballot candidates may receive more votes than party-leader candidates, even when third-party candidates are included. Also, down-ballot outcomes tend to be influenced by the turnout for the polls at the top of the ballot, so the number of down-ballot races needs to be taken into account.

8. The cost incurred to commission a quality poll

A huge capital investment is required to commission a quality poll. The cost incurred depends on the sample size, i.e. the number of people interviewed; the length of the questionnaire (the longer the interview, the more expensive it becomes); and the time within which interviews must be conducted. Also, hiring a polling firm or including cell phones in a survey adds to the expense.

9. Over-relying on historical precedence

Historical precedence is an estimate of the type of people who have shown up previously in a similar type of election. This precedent should be taken into consideration for better estimation of election results. However, care should be taken not to over-rely on it.

10. Effect of statewide ballot measures

Poll estimates also depend on state and local governments. Certain issues are pushed by local ballot measures. However, some voters feel that power over specific issues should belong exclusively to state governments, which causes opposition to local ballot measures in some states. These issues should be taken into account for better result prediction.
11. Oversampling due to factors such as faulty survey design or respondents' willingness/unwillingness to participate

Exit polls may also oversample voters for many reasons. One example relates to people in the US with cultural ties to Latin America. Although more than one-fourth of Latino voters prefer speaking Spanish to English, exit polls are almost never offered in Spanish. This might oversample English-speaking Latinos.

12. Social desirability bias in respondents

People may not always tell the truth about who they voted for. In other words, when asked by pollsters, they are likely to place themselves on the safer side, as exit polling touches on sensitive topics. Voters may tell pollsters that they voted for a minority candidate when they actually voted against the minority candidate. Social desirability bias is not limited to issues of race or gender; people like to be liked, and like to be seen as doing what everyone else is doing or what the "right" thing to do is. In other words, they play safe. Brexit polling, for instance, showed strong signs of social desirability bias.

13. The spiral of silence theory

People may not reveal their true thoughts to news reporters, as they may believe the media has an inherent bias. Voters may not declare their stand publicly for fear of reprisal or isolation; they choose to remain silent. This too can hinder pollsters' estimates. The above is just a shortlist of a long list of reasons why exit poll results must be taken with a pinch of salt. However, even with all its shortcomings, the striking feature of an exit poll is that rather than predicting a future action, it records an action that has just happened, so you rely on present indicators rather than ambiguous historical data. Exit polls are also cost-effective at obtaining very large samples.
If exit polls are conducted properly, keeping in mind the points described above, they can predict election results with greater reliability.
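The ±3-percentage-point figure mentioned at the start follows directly from the standard margin-of-error formula for a sampled proportion. A quick sketch, assuming a simple random sample and a 95% confidence level:

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for an observed proportion p from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# the worst case is p = 0.5; about 1,000 respondents gives roughly +/-3 points
print(round(margin_of_error(0.5, 1000), 3))   # 0.031
# quadrupling the sample only halves the margin
print(round(margin_of_error(0.5, 4000), 3))   # 0.015
```

This is also why large polling misses usually point to the systematic issues listed above (sampling quality, turnout models, response bias) rather than to random sampling error alone.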
Savia Lobo
27 Apr 2018
4 min read
Active Learning : An approach to training machine learning models efficiently

Training a machine learning model to give accurate results requires crunching huge amounts of labelled data. Data is naturally unlabelled and needs 'experts' who can scan through it and tag it with correct labels. Topic-specific data labelling, for example classifying diseases based on their type, would require a doctor or someone with a medical background. Getting such topic-specific experts to label data can be difficult and quite expensive, and doing this for many machine learning projects is impractical. Active learning can help here.

What is Active Learning

Active learning is a type of semi-supervised machine learning which reduces the amount of labeled data required to train a model. In active learning, the model focuses only on the data it is confused about and asks the experts to label those examples. The model then trains a bit more on the newly labeled data, and repeats the process for further confusing data. Active learning, in short, prioritizes the confusing samples that need labeling. This enables models to learn faster, allows experts to skip labeling data that is not a priority, and provides the model with the most useful information about the confusing samples. In turn, this can produce great machine learning models, as active learning reduces the number of labels that need to be collected from experts.

Types of Active Learning

An active learning environment includes a learner (the model being trained), a huge amount of raw, unlabelled data, and an expert (the person or system labelling the data). The role of the learner is to choose which instances or examples should be labelled; its goal is to reduce the number of labeled examples needed for an ML model to learn. The expert, on receiving the data to be labelled, analyzes it to determine the appropriate labels. There are three types of active learning scenarios.
- Query synthesis: the learner constructs examples, which are sent to the expert for labeling.
- Stream-based active learning: from a stream of unlabelled data, the learner decides which instances to have labelled and which to discard.
- Pool-based active learning: the most common scenario. The learner chooses only the most informative instances from a pool of unlabelled data and forwards them to the expert for labelling.

Some Real-life Applications of Active Learning

Natural Language Processing (NLP): Most NLP applications require a lot of labelled data, for tasks such as POS (part-of-speech) tagging, NER (named entity recognition), and so on, and labelling this data carries a huge cost. Active learning can reduce the amount of data that must be labelled.

Scene understanding in self-driving cars: Active learning can also be used in detecting objects, such as pedestrians, from a video camera mounted on a moving car, a key area for ensuring safety in autonomous vehicles. This can result in high levels of detection accuracy against complex and variable backgrounds.

Drug design: Drugs are biological or chemical compounds that interact with specific 'targets' in the body (usually proteins, RNA, or DNA) with the aim of modifying their activity. The goal of drug design is to find which compounds bind to a particular target. The data comes from large collections of compounds: vendor catalogs, corporate collections, and combinatorial chemistry. With active learning, the learner can find out which compounds are active (bind to the target) and which are inactive.

Active learning is still being researched with different deep learning algorithms, such as CNNs and LSTMs, acting as learners in order to improve their efficiency. GANs (generative adversarial networks) are also being implemented in the active learning framework.
There are also some research papers that try to learn active learning strategies using meta-learning.
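The pool-based loop described above can be sketched in a few lines. This is a minimal illustration, not from any of the papers mentioned: the dataset, the logistic-regression learner, and the batch of ten queries per round are arbitrary choices, and the "expert" is simulated by simply revealing the true labels.

```python
# Pool-based active learning sketch: uncertainty sampling with a
# logistic-regression learner on a synthetic dataset.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

labeled = list(range(10))                 # start with a few labeled seeds
pool = [i for i in range(500) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(5):                        # five querying rounds
    model.fit(X[labeled], y[labeled])
    probs = model.predict_proba(X[pool])
    # Least-confidence score: the learner is most "confused" where the
    # top class probability is lowest.
    uncertainty = 1.0 - probs.max(axis=1)
    query = [pool[i] for i in np.argsort(uncertainty)[-10:]]
    labeled.extend(query)                 # the "expert" reveals these labels
    pool = [i for i in pool if i not in query]

print(f"labeled examples used: {len(labeled)}")  # 10 seeds + 5 rounds x 10 queries = 60
```

Least-confidence sampling is only one way to score "confusing" samples; margin-based and entropy-based scores are common alternatives in the same loop.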
AI chip wars: Is Brainwave Microsoft's Answer to Google's TPU?

Amarabha Banerjee
18 Oct 2017
5 min read
When Google decided to design its own chip, the TPU, it generated a lot of buzz about faster and smarter computation with its ASIC-based architecture. Google claimed the move would significantly enable intelligent apps to take over, and industry experts somehow believed a reply from Microsoft was always coming (remember Bing?). Well, Microsoft has announced its arrival into the game, with its own real-time AI-enabled chip called Brainwave. Interestingly, as the two tech giants compete in chip manufacturing, developers are going to have more options when facing the complex computational demands of modern-day systems.

What is Brainwave?

Until recently, Nvidia was the dominant player in the microchip segment, creating GPUs (Graphics Processing Units) for faster processing and computation. After Google disrupted the trend with its TPU (Tensor Processing Unit), the surprise package has come from Microsoft, more so because its 'real-time data processing' Brainwave chip claims to be faster than the Google chip (the TPU 2.0, or Cloud TPU). What the Google and Microsoft chips have in common is that both can train and run deep neural networks much faster than existing chips. Microsoft's claim that Brainwave supports real-time AI systems with minimal lag by itself raises an interesting question: are we looking at a new revolution in the microchip industry? The answer perhaps lies in the architecture and methodology of the two chips, the way they function, and the practical challenges of implementing them in real-world applications.

The Brainwave architecture: move over GPU, the DPU is here

In case you are wondering what the hype around Microsoft's Brainwave chip is about, the answer lies directly in its architecture and design.
Present-day computational standards are defined by high-end games, for which GPUs (Graphical Processing Units) were originally designed. Brainwave differs completely from the GPU architecture: the core components of a Brainwave chip are Field Programmable Gate Arrays, or FPGAs. Microsoft has deployed a huge number of FPGA modules on top of which DNN (Deep Neural Network) layers are synthesized. Together, this setup resembles hardware microservices, where software assigns each task to different FPGA and DNN modules. These software-controlled modules are called DNN Processing Units, or DPUs. This eliminates CPU latency and the need to transfer data to and from the backend.

There are two seemingly different approaches here: the hard DPU and the soft DPU. Microsoft has taken the soft DPU approach, where the allocation of memory modules is determined by software and by the volume of data at processing time; a hard DPU has a predefined memory allocation, which does not allow the flexibility so vital to real-time processing. The software-controlled feature is exclusive to Microsoft, and unlike other AI processing chips, Microsoft has developed its own easy-to-process data types that are faster to work with. This enables the Brainwave chip to perform near real-time AI computations easily. Thus, Brainwave holds an edge over the Google TPU when it comes to real-time decision making and computation.

Brainwave's edge over TPU 2: is it real time?

The reason Google ventured into designing its own chips was the need to add data centers as user queries grew. Google had realized that instead of routing every query through data centers, it would be far more plausible to perform the computation on the native system.
That is where Google needed more computational capability than the modern market leaders, such as the Intel x86 Xeon processors and the Nvidia Tesla K80 GPUs, could offer. Google opted for Application-Specific Integrated Circuits (ASICs) instead of FPGAs because they were completely customizable: not specific to one particular neural network, but applicable to many. The trade-off for the ability to run multiple neural networks was, of course, the real-time computation that Brainwave achieves through its DPU architecture. Initial data released by Microsoft shows that Brainwave has a data transfer bandwidth of 20 TB/sec, 20 times faster than the latest Nvidia GPU chip, and its energy efficiency is claimed to be 4.5 times better than that of current chips. Whether Google will up the ante and improve the existing TPU architecture to make it suitable for real-time computation is something only time can tell.

(Source: Brainwave HOTCHIPS 2017 presentation, Microsoft Research Blog)

Future outlook and challenges

Microsoft is yet to declare benchmarking results for the Brainwave chip, but Microsoft Azure customers can certainly look forward to its availability for faster and better computation. Even more promising, Brainwave works seamlessly with Google's TensorFlow and Microsoft's own CNTK framework. Tech startups like Rigetti, Mythic, and Waves are trying to create mainstream applications that employ AI and quantum computation techniques, bringing AI to the masses through practical, AI-driven consumer applications, and these companies have shown keen interest in both the Microsoft and the Google AI chips.
In fact, Brainwave will be best suited to companies such as these, which want to apply AI to everyday tasks and are currently few in number because of the limited computational capabilities of today's chips. The challenges for all AI chips, Brainwave included, will still revolve around data handling capabilities, reliability of performance, and improving the memory capabilities of current hardware systems.
Data science folks have 12 reasons to be thankful for this Thanksgiving

Savia Lobo
21 Nov 2017
8 min read
We are nearing the end of 2017, but every closing chapter leaves remarkable achievements to be thankful for. For the data science community, this year was filled with new technologies, tools, and version updates, including blockbuster releases such as PyTorch, TensorFlow 1.0, and Caffe2. We invite data scientists, machine learning experts, and other data science professionals to come together this Thanksgiving Day and thank the organizations that made our interactions with AI easier, faster, better, and generally more fun. Let us recall our blessings in 2017, one month at a time...

January: Thank you, Facebook and friends, for handing us PyTorch

Hola 2017! While the world was still in the New Year mood, a brand new deep learning framework was released. Facebook, along with a few other partners, launched PyTorch as an improvement on the popular Torch framework: it supported Python over the less popular Lua. Because PyTorch works just like Python, it is easier to debug and to extend. Another notable change was the adoption of a dynamic computational graph, used to create graphs on the fly with high speed and flexibility.

February: Thanks, Google, for TensorFlow 1.0

February brought data scientists a Valentine's gift with the release of TensorFlow 1.0. Announced at the first annual TensorFlow Developer Summit, TensorFlow 1.0 was faster, more flexible, and production-ready. Here is what the TensorFlow box of chocolates contained:

Full compatibility with Keras
Experimental APIs for Java and Go
New Android demos for object and image detection, localization, and stylization
A brand new TensorFlow debugger
An introductory glance at XLA, a domain-specific compiler for TensorFlow graphs

March: We thank Francois Chollet for making Keras 2 a production-ready API

Congratulations! Keras 2 is here.
This was great news for data science developers, as Keras 2, a high-level neural network API, allowed faster prototyping. It supported both CNNs (Convolutional Neural Networks) and RNNs (Recurrent Neural Networks). Keras has an API designed for humans, hence a user-friendly one, and it allows easy creation of modules, making it well suited to advanced research. Developers can code in Python, a compact, easy-to-debug language.

April: We like Facebook for brewing us Caffe2

Data scientists were greeted by a fresh aroma of coffee this April as Facebook released the second version of its popular deep learning framework, Caffe. Caffe2 arrived as an easy-to-use deep learning framework for building DL applications and leveraging community contributions of new models and algorithms. It shipped with first-class support for large-scale distributed training, new hardware support, mobile deployment, and the flexibility for future high-level computational approaches. It also provided easy methods to convert DL models built in the original Caffe to the new version, and came with over 400 different operators, the basic units of computation in Caffe2.

May: Thank you, Amazon, for supporting Apache MXNet on AWS, and Google, for your TPU

May brought exciting launches from two tech giants: Amazon Web Services brought Apache MXNet on board, and Google announced its second-generation TPU chips. Apache MXNet on AWS lets developers build machine learning applications that train quickly and run anywhere, a scalable approach for developers. Google's second-generation TPU (Tensor Processing Unit) chips, designed to speed up machine learning tasks, were supposed to be (and are) more capable than CPUs and even GPUs.
June: We thank Microsoft for CNTK v2

Mid-month, Microsoft announced version 2 of its Cognitive Toolkit. The new Cognitive Toolkit was enterprise-ready, offered production-grade AI, and allowed users to create, train, and evaluate their own neural networks scalably across multiple GPUs. It also included Keras API support, faster model compression, Java bindings, and Spark support, along with a number of new tools for running trained models on low-powered devices such as smartphones.

July: Thank you, Elastic.co, for bringing ML to the Elastic Stack

July made machine learning generally available to Elastic Stack users with version 5.5. ML made anomaly detection possible on Elasticsearch time series data, letting users analyze the root cause of problems in their workflows and reduce false positives.

August: Thank you, Google, for Deeplearn.js

August announced the arrival of Google's Deeplearn.js, an initiative that allowed machine learning models to run entirely in the browser. Deeplearn.js is an open source, WebGL-accelerated JS library offering an interactive client-side platform for rapid prototyping and visualization. Developers could now use a hardware accelerator such as the GPU, via WebGL, to perform faster computations with 2D and 3D graphics, and TensorFlow models could be imported to run in the browser. Surely something to be thankful for!

September: Thanks, Splunk and MySQL, for your upgrades

September's surprises came with the release of Splunk 7.0, which brings machine learning to the masses with a Machine Learning Toolkit that is scalable, extensible, and accessible. It also added native support for metrics, which speeds up query processing performance by up to 200x.
Other features include seamless event annotations, improved visualization, faster data model acceleration, and a cloud-based self-service application. September also brought the release of MySQL 8.0, which included first-class support for Unicode 9.0. Other features included:

Extended support for native JSON data
Window functions and recursive SQL syntax for queries that were previously impossible or difficult to write
Added document-store functionality

So, big thanks for the Splunk and MySQL upgrades.

October: Thank you, Oracle, for the Autonomous Database Cloud, and Microsoft, for SQL Server 2017

As fall arrived, Oracle unveiled the world's first Autonomous Database Cloud. It fully automates tuning, patching, updating, and maintaining the database. It is self-scaling, instantly resizing compute and storage without downtime and with low manual administration costs, and self-repairing, guaranteeing 99.995 percent reliability and availability. That is a lot of reduction in workload!

Next, developers were greeted with the release of SQL Server 2017, a major step toward making SQL Server a platform. It included multiple enhancements to the Database Engine, such as adaptive query processing, automatic database tuning, graph database capabilities, new availability groups, and the Database Tuning Advisor (DTA). It also added a new Scale Out feature in SQL Server 2017 Integration Services (SSIS) and SQL Server Machine Learning Services, reflecting support for the Python language.

November: A humble thank you to Google for TensorFlow Lite and Elastic.co for Elasticsearch 6.0

Just a month more until the year ends! The data science community had a busy November, with too many releases to keep an eye on and Microsoft Connect(); spilling the beans. So, November, thank you for TensorFlow Lite and Elasticsearch 6.0.
TensorFlow Lite, a lightweight product for mobile and embedded devices, is designed to be:

Lightweight: it allows inference with on-device machine learning models with a small binary size, for faster initialization and startup.
Fast: model loading time is dramatically improved, with accelerated hardware support.
Cross-platform: it includes a runtime tailor-made to run on various platforms, starting with Android and iOS.

Elasticsearch 6.0 is now generally available, with features such as easy upgrades, index sorting, better shard recovery, and support for sparse doc values. Other new features are spread across the Elastic Stack, comprising Kibana, Beats, and Logstash, Elastic's solutions for visualization and dashboards, data ingestion, and log storage.

December: Thanks in advance, Apache, for Hadoop 3.0

Christmas gifts may arrive for data scientists in the form of the general availability of Hadoop 3.0. The new version is expected to include support for erasure coding in HDFS, version 2 of the YARN Timeline Service, shaded client jars, support for more than two NameNodes, MapReduce task-level native optimization, and support for opportunistic containers and distributed scheduling, to name a few. It will also include a rewritten version of the Hadoop shell scripts, with bug fixes, improved compatibility, and changes to some existing installation procedures.

Phew! That was a long list of tools for data scientists and developers to be thankful for this year. Whether new frameworks, libraries, or software, each one is unique and helpful for creating data-driven applications. Hopefully you have used some of them in your projects. If not, be sure to give them a try, because 2018 is all set to overload you with new and even more amazing tools, frameworks, libraries, and releases.
One Shot Learning: Solution to your low data problem

Savia Lobo
04 Dec 2017
5 min read
The fact that machines can replicate human intelligence is mind-boggling, but this is only possible when machines are fed the correct mix of algorithms, huge collections of data and, most importantly, training, which in turn leads to faster prediction or recognition of objects within images. When you train humans to recognize a car, on the other hand, you simply show them a live car or an image, and the next time they see any vehicle it is easy for them to distinguish a car amongst other vehicles. Can machines similarly learn from a single training example, as humans do?

Computers lack a key faculty that distinguishes them from humans: memory. Machines cannot remember, so they require millions of data points to understand object detection from every angle. One shot learning comes to their assistance, reducing this appetite for training data and enabling machines to learn with less data at hand.

What is one shot learning, and how is it different?

Deep neural network models excel at tasks such as image recognition and speech recognition, but only thanks to extensive, incremental training on large datasets. When there is a smaller dataset or fewer training examples, a traditional model is trained on the data that is available; during this process it relearns its parameters to incorporate the new information and completely forgets what it previously learned, leading to poor training, or catastrophic interference. One shot learning proves to be a solution here, as it can learn from one, or a minimal number of, training samples without forgetting. The reason is that such models possess meta-learning, a capability often seen in neural networks that have memory.

How one shot learning works
One shot learning strengthens deep learning models without the need for a huge training dataset. One implementation can be seen in the Memory-Augmented Neural Network (MANN) model. A MANN has two parts: a controller and an external memory module. The controller, either a feed-forward neural network or an LSTM (Long Short-Term Memory) network, interacts with the external memory module through a number of read/write heads, which fetch representations from the memory and place them back. LSTMs are proficient in long-term storage, through slow weight updates, and in short-term storage, via the external memory module. They are trained to meta-learn, i.e., to rapidly learn unseen functions from few data samples; thus MANNs are said to be capable of meta-learning.

The MANN model is then trained on datasets that include many classes with very few samples each, for instance the Omniglot dataset, a collection of handwritten characters from different alphabets with very few samples of each. After training the model over thousands of iterations with few samples, it was able to recognize never-before-seen image samples drawn from a disjoint split of the Omniglot dataset. This shows that MANN models can perform object categorization tasks with minimal data samples. One shot learning can also be achieved using a Neural Turing Machine and active one shot learning.

Learning from a single attempt therefore involves meta-learning: the model gradually learns useful representations from the raw data using an algorithm such as gradient descent, and with these learnings as base knowledge, it can rapidly absorb never-before-seen information in a single shot via the external memory module.
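As a rough illustration of the metric-based idea behind one shot learning, the sketch below classifies a query by comparing it against a single labeled "support" example per class and picking the nearest. The random embeddings are toy stand-ins for the representations a trained siamese or MANN model would produce; nothing here is the MANN architecture itself.

```python
# Metric-based one-shot classification sketch (NumPy only).
import numpy as np

rng = np.random.RandomState(42)

# One support embedding per class (5 classes, 16-dimensional features).
support = {label: rng.randn(16) for label in range(5)}

def one_shot_predict(query, support):
    """Return the class whose single support embedding is closest."""
    distances = {label: np.linalg.norm(query - emb)
                 for label, emb in support.items()}
    return min(distances, key=distances.get)

# A query that is a slightly perturbed copy of class 3's embedding
# should land nearest to class 3.
query = support[3] + 0.05 * rng.randn(16)
print(one_shot_predict(query, support))  # → 3
```

In a real system the distance would be computed between learned embeddings (for example, the outputs of a siamese network's twin branches), not raw random vectors, but the nearest-support decision rule is the same.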
Use cases of one shot learning

Image recognition: image representations are learnt using a supervised, metric-based approach. For instance, a siamese neural network, a pair of identical sister networks, discriminates between the class identities of an image pair. The features of this network are then reused for one-shot learning without retraining.

Object recognition within images: one shot learning allows neural network models to recognize known objects and their categories within an image. The model learns to recognize the object from a small set of training samples, then estimates the probability of the object being present in a given image. A model trained this way can recognize objects in an image despite clutter, viewpoint, and lighting changes.

Predicting effective drugs: datasets for drug discovery are limited or expensive. A molecule found during a biological study often does not end up as a drug, for reasons such as toxicity or low solubility, so little data is available about a candidate molecule. Using one shot learning, an iterative LSTM combined with a graph convolutional neural network is used to optimize the candidate molecule by finding similar molecules with increased pharmaceutical activity and lower risk to patients. A detailed explanation of how accurate drugs can be predicted from low data is given in a research paper published by the American Chemical Society (ACS).

One shot learning is in its infancy, so its use cases appear mostly in familiar applications such as image and object recognition. As the technique advances and adoption grows, other applications will come into the picture.

Conclusion

One shot learning is applied where machine learning or deep learning models have little data available for their training.
A plus point for the future: organizations will not have to collect huge amounts of data to train their ML models; a few training samples will do the job! A large number of organizations are looking to adopt one shot learning within their deep learning models, and it will be exciting to see how it fares as a foundation for neural network implementations.
4 ways Artificial Intelligence is leading disruption in Fintech

Pravin Dhandre
23 Nov 2017
6 min read
In the era of digital disruption, artificial intelligence in fintech is viewed as an emerging technology forming the premise for revolution in the sector. Tech giants on the Fortune 500 list, such as Apple, Microsoft, and Facebook, are putting resources into product innovation and technology automation. Businesses are investing heavily to gain agility, better quality, and high-end functionality, driving revenue growth by multiple digits. Widely used AI-powered applications, such as virtual assistants, chatbots, algorithmic trading, and purchase recommendation systems, are fueling businesses with lower marginal costs, growing revenues, and a better customer experience. According to a survey by the National Business Research Institute, more than 62% of companies will deploy AI-powered fintech solutions in their applications to identify new opportunities and areas to scale the business.

What has led the disruption?

The financial sector is undergoing rapid technological evolution, from providing personalized financial services to executing smart operations that simplify complex and repetitive processes. Machine learning and predictive analytics have enabled financial companies to provide smart suggestions on buying and selling stocks, bonds, and commodities. Insurance companies are automating their loan applications, saving countless hours. The leading investment bank Goldman Sachs automated its stock trading business, replacing trading professionals with computer engineers. BlackRock, one of the world's largest asset management companies, offers high-net-worth investors an automated advice platform superseding highly paid Wall Street professionals. Applications such as algorithmic trading, personal chatbots, fraud prevention and detection, stock recommendations, and credit risk assessment are finding their merit in banking and financial services companies.
Let us look at the changing scenarios with these next-gen technologies.

Fraud prevention and detection

Firms tackle fraud with anomaly detection APIs, designed using machine learning and deep learning, which identify and report suspicious or fraudulent activity among billions of daily transactions. Fintech companies are infusing huge capital into handling cybercrime, resulting in global spending of more than 400 billion dollars annually. Multinational giants such as MasterCard, Sun Financial, Goldman Sachs, and the Bank of England use AI-powered systems to prevent money laundering, banking fraud, and illegal transactions. Danske Bank, a renowned Nordic financial service provider, deployed AI engines that help it investigate millions of online banking transactions in less than a second, drastically reducing the cost of fraud investigation while delivering actionable insights faster.

AI-powered chatbots

Chatbots are automated customer support applications powered by Natural Language Processing (NLP). They deliver quick, engaging, personalized, and effective conversation to the end user. With a surge in the number of investors and investment options, customers seek financial guidance, profitable investment options, and query resolution, faster and in real time. Banks such as Barclays, Bank of America, and JPMorgan Chase widely use AI-supported digital chatbots to automate their client support, delivering an effective customer experience with smarter financial decisions. Bank of America, the largest bank in the US, launched Erica, a chatbot that guides customers with investment notifications, easy bill payments, and weekly updates on their mortgage score.
MasterCard offers a chatbot that not only lets customers review their bank balance and transaction history but also facilitates seamless payments worldwide.

Credit risk management

For money lenders, the most common business risk is credit risk, and it piles up largely due to inaccurate credit risk assessment of borrowers. If you are unfamiliar with the term, credit risk is simply the risk of a borrower defaulting on loan repayment. AI-backed credit risk evaluation tools, built with predictive analytics and advanced machine learning techniques, let bankers and financial service providers simplify borrower credit evaluation, transforming the labor-intensive scorecard assessment method. Wells Fargo, an American international banking company, adopted AI for mortgage verification and loan processing, lowering the market exposure risk of its lending assets. With this, the team established smarter and faster credit risk management, analyzing millions of structured and unstructured data points and proving AI an extremely valuable asset for credit security and assessment.

Algorithmic trading

More than half of US citizens own individual stocks, mutual funds, or exchange-traded funds, and a good number trade daily, making it imperative for major broking and financial trading companies to offer AI-powered algorithmic trading platforms. These platforms enable customers to execute trades strategically, with significant returns. The algorithms analyze hundreds of millions of data points and derive decisive trading patterns, enabling traders to book higher profits in every microsecond of the trading hour.
The France-based international bank BNP Paribas deployed algorithmic trading that helps its customers execute trades strategically and provides a graphical representation of stock market liquidity, letting them determine the most appropriate way to execute a trade under various market conditions. Advances in automated trading assist users with suggestions and rich insights, helping humans make better decisions.

How do we see the future of AI in the financial sector?

The influence of AI in fintech has disrupted almost every financial institution, from investment banks to retail banking to small credit unions. Data science and machine learning practitioners are endeavoring to make AI an essential part of the banking ecosystem, and financial companies are working with data analytics and fintech professionals to make AI the primary interface for interaction with their customers.

However, the sector faces challenges in adopting emerging technologies, and AI is no exception. The foremost challenge is the availability of massive data that is clean and rich enough to train machine learning algorithms. Next in line is the reliability and accuracy of the insights an AI-mechanized solution provides: in a dynamic market, businesses can see the efficacy of their models decline, causing serious harm, so they need to be smart and cannot solely trust AI technology to achieve the business mission. The absence of emotional intelligence in chatbots is another area of concern, resulting in unsatisfactory customer service experiences. While there may be other roadblocks, rising investment in AI will help financial companies overcome such challenges and develop competitive intelligence in their product offerings.
Predicting the near future, the adoption of cutting-edge technologies such as machine learning and predictive analytics will bring higher customer engagement, an exceptional banking experience, fewer frauds, and higher operating margins for banks, financial institutions, and insurance companies.
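As a hedged sketch of the anomaly-detection idea behind the fraud prevention systems described above, the example below flags outlying transactions with scikit-learn's IsolationForest. The two-feature synthetic "transactions" (amount, hour of day) are invented for illustration; production systems use far richer features and labeled feedback.

```python
# Anomaly-detection sketch for transaction screening with IsolationForest.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(7)

# 300 normal transactions clustered around typical amounts and hours,
# plus 5 injected outliers with unusually large amounts at odd hours.
normal = np.column_stack([rng.normal(50, 10, 300), rng.normal(14, 3, 300)])
fraud = np.column_stack([rng.normal(5000, 500, 5), rng.normal(3, 1, 5)])
X = np.vstack([normal, fraud])

detector = IsolationForest(contamination=0.02, random_state=0).fit(X)
flags = detector.predict(X)  # -1 marks suspected anomalies

# The injected frauds (rows 300-304) appear among the flagged rows.
print(np.where(flags == -1)[0])
```

The `contamination` parameter sets the expected fraction of anomalies; in practice it is tuned against investigator feedback rather than fixed up front.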
Using Meta-Learning in Nonstationary and Competitive Environments with Pieter Abbeel et al

Sugandha Lahoti
15 Feb 2018
5 min read
This ICLR 2018 accepted paper, Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments, addresses the use of meta-learning to operate in nonstationary environments, represented as a Markov chain of distinct tasks. The paper is authored by Pieter Abbeel, Maruan Al-Shedivat, Trapit Bansal, Yura Burda, Ilya Sutskever, and Igor Mordatch. Pieter Abbeel has been a professor at UC Berkeley since 2008 and was a research scientist at OpenAI (2016-2017); his current research focuses on robotics and machine learning, with particular attention to meta-learning and deep reinforcement learning. Another of the paper's authors, Ilya Sutskever, is the co-founder and Research Director of OpenAI, and was previously a research scientist on the Google Brain team for three years.

Meta-learning, or learning to learn, typically uses metadata to make automatic learning flexible in solving learning problems, i.e., to learn the learning algorithm itself. Continuous adaptation in real-world environments is essential for any learning agent, and a meta-learning approach is an appropriate choice for this task. This article discusses one of the top accepted papers on meta-learning at the 6th annual ICLR conference, scheduled for April 30 - May 3, 2018.

A gradient-based meta-learning algorithm for nonstationary environments

What problem is the paper attempting to solve?

Reinforcement learning algorithms, despite achieving impressive results ranging from playing games to dialogue systems to robotics, are limited to solving tasks in stationary environments. The real world, on the other hand, is often nonstationary, whether due to complexity, changes in the dynamics of the environment over a system's lifetime, or the presence of multiple learning actors.
Nonstationarity breaks the standard assumptions and requires agents to continuously adapt, both at training and execution time, in order to succeed. The classical approaches to dealing with nonstationarity are usually based on context detection and tracking, i.e., reacting to changes that have already happened in the environment by continuously fine-tuning the policy. However, nonstationarity allows only limited interaction before the properties of the environment change. This immediately puts learning into the few-shot regime and often renders simple fine-tuning methods impractical. In order to continuously learn and adapt from limited experience in nonstationary environments, the authors propose the learning-to-learn (or meta-learning) approach.

Paper summary

This paper proposes a gradient-based meta-learning algorithm suitable for continuous adaptation of RL agents in nonstationary environments. The agents meta-learn to anticipate changes in the environment and update their policies accordingly. The method builds upon previous work on gradient-based model-agnostic meta-learning (MAML), which has been shown to be successful in few-shot settings. The authors re-derive MAML for multi-task reinforcement learning from a probabilistic perspective, and then extend it to dynamically changing tasks. The paper also considers the problem of continuous adaptation to a learning opponent in a competitive multi-agent setting, for which the authors designed RoboSumo, a 3D environment with simulated physics that allows pairs of agents to compete against each other.

The paper answers the following questions:

What is the behavior of different adaptation methods (in nonstationary locomotion and competitive multi-agent environments) when the interaction with the environment is strictly limited to one or very few episodes before it changes?
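A minimal, first-order sketch of a gradient-based meta-update in the spirit of MAML is shown below, using a toy linear-regression task family with a slowly drifting target in place of the paper's RL setup. The step sizes, task distribution, and model are assumptions for illustration, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def loss_grad(w, X, y):
    # Gradient of the mean squared error for a linear model y_hat = X @ w.
    return 2 * X.T @ (X @ w - y) / len(y)

def sample_task(shift):
    # Hypothetical nonstationary task family: the target slope drifts
    # over time, standing in for a sequence of related stationary tasks.
    slope = 1.0 + shift
    X = rng.normal(size=(16, 1))
    y = slope * X[:, 0]
    return X, y

w = np.zeros(1)           # meta-learned initial parameters
alpha, beta = 0.1, 0.01   # inner (adaptation) and outer (meta) step sizes

for step in range(200):
    shift = 0.5 * np.sin(step / 20)      # slowly changing environment
    X_tr, y_tr = sample_task(shift)      # data seen before adapting
    X_te, y_te = sample_task(shift)      # data used to score the adapted model

    # Inner step: adapt to the current task from few samples.
    w_adapted = w - alpha * loss_grad(w, X_tr, y_tr)
    # First-order outer step: move w so that one inner step works well.
    w = w - beta * loss_grad(w_adapted, X_te, y_te)
```

The outer loop never optimizes for any single task; it optimizes the starting point so that a single cheap adaptation step suffices, which is what makes the approach usable in the few-shot regime the paper targets.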
What is the sample complexity of different methods, i.e., how many episodes are required for a method to successfully adapt to the changes?

Additionally, it answers the following questions specific to the competitive multi-agent setting:

Given a diverse population of agents that have been trained under the same curriculum, how do different adaptation methods rank in a competition against each other?

When the population of agents is evolved for several generations, what happens to the proportions of different agents in the population?

Key Takeaways

This work proposes a simple gradient-based meta-learning approach suitable for continuous adaptation in nonstationary environments. The method was applied to nonstationary locomotion and to a competitive multi-agent setting, the RoboSumo environment. The key idea is to regard nonstationarity as a sequence of stationary tasks and to train agents to exploit the dependencies between consecutive tasks so that they can handle similar nonstationarities at execution time. In both settings, nonstationary locomotion and the competitive multi-agent environment, the meta-learned adaptation rules were more efficient than the baselines in the few-shot regime. Additionally, agents that meta-learned to adapt demonstrated the highest level of skill when competing in iterated games against each other.

Reviewer feedback summary

Overall Score: 24/30
Average Score: 8

The reviewers termed the paper a great contribution to ICLR. According to them, it addresses a very important problem for general AI and is well written. They also appreciated the careful experiment design and thorough comparisons, which make the results convincing. They noted that editorial rigor and image quality could be better, but suggested no content-related improvements. The paper was praised for being dense and rich on rapid meta-learning.