Data | 0 articles | Tech News, Tutorials & Expert Insights

article-image-ubers-kepler-gl-an-open-source-toolbox-for-geospatial-analysis

28 Jun 2018

4 min read

Uber's kepler.gl, an open source toolbox for GeoSpatial Analysis

28 Jun 2018

Geography Visualization, also called as Geovisualization plays a pivotal role in areas like cartography, geographic information systems, remote sensing and global positioning systems. Uber, a peer-to-peer transportation network company headquartered at California believes in data-driven decision making and hence keeps developing smart frameworks like deck.gl for exploring and visualizing advanced geospatial data at scale. Uber strives to make the data web-based and shareable in real-time across their teams and customers. Early this month, Uber surprised the geospatial market with its newly open-source toolbox, kepler.gl, a geoanalytics tool to gain quick insights from geospatial data with amazing and intuitive visualizations. What’s exactly Kepler.gl is? kepler.gl is a visualization-rich web platform, developed on top of deck.gl, a WebGL-powered data visualization library providing real-time visual analytics of millions of geolocation points. The platform provides visual exploration of geographical data sets along with spatial aggregation of all data points collected. The platform is said to be data-agnostic with a single interface to convert your data into insightful visualizations. https://www.youtube.com/watch?v=i2fRN4e2s0A The platform is very user-friendly where one can just drag the CSV or the GeoJSON files and drop them into the browser to visualize the dataset more intuitively. The platform is supported with different map layers, filtering option, aggregation feature through which you can get the final visualization in an animated format or like a video. The usability of features is so high that you can apply all the metrics available to your data points without much of a hassle. The web platform exhibits high performance where you can get insights from your spatial data in less than 10 minutes and that too in a single window. Another advantage of this framework is it does not involve any sort of coding and hence non-technical users can also reap the benefits by churn valuable insights from the data points. The platform is also equipped with some advanced, complex features such as 2D cartographic plane,a separate dimension for altitude, visibility of height of hexagon and grids. The users seem happy with the new height feature which helps them detect abnormalities and illicit traits in an aggregated map. With the filtering menu, the analysts and engineers can compare their data and have a granular look at their data points. This option also helps in reading the histogram well and one can easily detect outliers and make their dataset more reliable. It has a feature to add playback to time series data points which makes getting useful information of real time location systems easy. The team at Uber looks at this toolbox with a long-term vision where they are planning to keep adding new features and enhancements to make it highly functional and a single-click visualization dashboard. The team has already announced that they would be powering it up with two major enhancements to the current functionality in next couple of months. They would add support on, More robust exploration: There will be interlinkage between charts and maps, and support for custom charts, maps and widgets like the renowned BI tool Tableau through which it will facilitate analytics teams to unveil deeper insights. Addition of newer geo-analytical capabilities: To support massive datasets, there will be added features on data operations such as polygon aggregation, union of data points, operations like joining and buffering. Companies across different verticals such as Airbnb, Atkins Global, Cityswifter, Mapbox have found great value in kepler.gl offerings and are looking towards engineering their products to leverage this framework. The visualization specialists at these companies have already praised Uber for building such a simple yet fast platform with remarkable capabilities. To get started with kepler.gl, read the documentation available at Github and start creating visualizations and enhance your geospatial data analysis. Top 7 libraries for geospatial analysis Using R to implement Kriging – A Spatial Interpolation technique for Geostatistics data Data Visualization with ggplot2

0
0
18680

article-image-devops-for-big-data-success

Ashwin Nair

11 Oct 2017

5 min read

DevOps might be the key to your Big Data project success

Ashwin Nair

11 Oct 2017

5 min read

So, you probably believe in the power of Big Data and the potential it has to change the world. Your company might have already invested in or is planning to invest in a big data project. That’s great! But what if I were to tell you that only 15% of the business were successfully able to deploy their Big Data projects to production. That can’t be a good sign surely! Now, don’t just go freeing up your Big Data budget. Not yet. Big Data’s Big Challenges For all the hype around Big Data, research suggests that many organizations are failing to leverage its opportunities properly. A recent survey by NewVantage partners, for example, explored the challenges facing organizations currently running their own Big Data projects or trying to adopt them. Here’s what they had to say: “In spite of the successes, executives still see lingering cultural impediments as a barrier to realizing the full value and full business adoption of Big Data in the corporate world. 52.5% of executives report that organizational impediments prevent realization of broad business adoption of Big Data initiatives. Impediments include lack or organizational alignment, business and/or technology resistance, and lack of middle management adoption as the most common factors. 18% cite lack of a coherent data strategy.” Clearly, even some of the most successful organizations are struggling to get a handle on Big Data. Interestingly, it’s not so much about gaps in technology or even skills, but rather lack of culture and organizational alignment that’s making life difficult. This isn’t actually that surprising. The problem of managing the effects of technological change is one that goes far beyond Big Data - it’s impacting the modern workplace in just about every department, from how people work together to how you communicate and sell to customers. DevOps Distilled It’s out of this scenario that we’ve seen the irresistible rise of DevOps. DevOps, for the uninitiated, is an agile methodology that aims to improve the relationship between development and operations. It aims to ensure a fluid collaboration between teams; with a focus on automating and streamlining monotonous and repetitive tasks within a given development lifecycle, thus reducing friction and saving time. We can perhaps begin to see, then, that this approach - usually used in typical software development scenarios - might actually offer a solution to some of the problems faced when it comes to big data. A typical Big Data project Like a software development project, a Big Data project will have multiple different teams working on it in isolation. For example, a big data architect will look into the project requirements and design a strategy and roadmap for implementation, while the data storage and admin team will be dedicated to setting up a data cluster and provisioning infrastructure. Finally, you’ll probably then find data analysts who process, analyse and visualize data to gain insights. Depending on the scope and complexity of your project it is possible that more teams are brought in - say, data scientists are roped in to trains and build custom machine learning models. DevOps for Big Data: A match made in heaven Clearly, there are a lot of moving parts in a typical Big Data project - each role performing considerably complex tasks. By adopting DevOps, you’ll reduce any silos that exist between these roles, breaking down internal barriers and embedding Big Data within a cross-functional team. It’s also worth noting that this move doesn’t just give you a purely operational efficiency advantage - it also gives you much more control and oversight over strategy. By building a cross-functional team, rather than asking teams to collaborate across functions (sounds good in theory, but it always proves challenging), there is a much more acute sense of a shared vision or goal. Problems can be solved together, discussions can take place constantly and effectively. With the operational problems minimized, everyone can focus on the interesting stuff. By bringing DevOps thinking into big data, you also set the foundation for what’s called continuous analytics. Taking the principle of continuous integration, fundamental to effective DevOps practice, whereby code is integrated into a shared repository after every task or change to ensure complete alignment, continuous analytics streamlines the data science lifecycle by ensuring a fully integrated approach to analytics, where as much as possible is automated through algorithms. This takes away the boring stuff - once again ensuring that everyone within the project team can focus on what’s important. We’ve come a long way from Big Data being a buzzword - today, it’s the new normal. If you’ve got a lot of data to work with, to analyze and to understand, you better make sure you’ve the right environment setup to make the most from it. That means there’s no longer an excuse for Big Data projects to fail, and certainly no excuse not to get one up and running. If it takes DevOps to make Big Data work for businesses then it’s a MINDSET worth cultivating and running with.

0
0
18351

article-image-is-initiative-q-a-pyramid-scheme-or-just-a-really-bad-idea

Richard Gall

25 Oct 2018

5 min read

Is Initiative Q a pyramid scheme or just a really bad idea?

Richard Gall

25 Oct 2018

5 min read

If things seem too good to be true, they probably are. That's a pretty good motto to live by, and one that's particularly pertinent in the days of fake news and crypto-bubbles. However, it seems like advice many people haven't heeded with Initiative Q, a new 'payment system' developed by the brains behind PayPal technology. That's not to say that Initiative Q certainly is too good to be true. But when an organisation appears to be offering almost hundreds of thousands of dollars to users who simply offer an email and then encourage others to offer theirs, caution is essential. If it looks like a pyramid scheme, then do you really want to risk the chance that it might just be a pyramid scheme? What is Initiative Q? Initiative Q, is, according to its founders, "tomorrow's payment network." On its website it says that current methods of payment, such as credit cards, are outdated. They open up the potential for fraud and other bad business practices, as well as not being particularly efficient. Initiative Q claims that is it going to develop an alternative to these systems "which aggregate the best ideas, innovations, and technologies developed in recent years." It isn't specific about which ideas and technological innovations its referring to, but if you read through the payment model it wants to develop, there are elements that sound a lot like blockchain. For example, it talks about using more accurate methods of authentication to minimize fraud, and improving customer protection by "creating a network where buyers don’t need to constantly worry about whether they are being scammed" (the extent to which this turns out to be deliciously ironic remains to be seen). To put it simply, it's a proposed new payment system that borrows lots of good ideas that still haven't been shaped into a coherent whole. Compelling, yes, but alarm bells are probably sounding. Who's behind Initiative Q? There are very few details on who is actually involved in Initiative Q. The only names attached to the project are Saar Wilf, an entrepreneur who founded Fraud Sciences, a payment technology that was bought by PayPal in 2008, and Lawrence White, Professor of Monetary Theory and Policy and George Mason University. The team should grow, however. Once the number of members has grown to a significant level, the Initiative Q team say "we will continue recruiting the world’s top professionals in payment systems, macroeconomics, and Internet technologies." How is Initiative Q supposed to work? Initiative Q explains that for the world to adopt a new payment network is a huge challenge - a fair comment, because after all, for it to work at all, you need actors within that network who believe in it and trust it. This is why the initial model - which looks and feels a hell of a lot like a pyramid or Ponzi scheme - is, according to Initiative Q, so important. To make this work, you need a critical mass of users. Initiative Q actually defends itself from accusations that it is a Pyramid scheme by pointing out that there's no money involved at this stage. All that happens is that when you sign up you receive a specific number of 'Qs' (the name of the currency Initiative Q is proposing). These Qs obviously aren't worth anything at the moment. The idea is that when the project actually does reach critical mass, it will take on actual value. Isn't Initiative Q just another cryptocurrency? Initiative Q is keen to stress that it isn't a cryptocurrency. That said, on its website the project urges you to "think of it as getting free bitcoin seven years ago." But the website does go into a little more detail elsewhere, explaining that "cryptocurrencies have failed as currencies" because they "focus on ensuring scarcity" while neglecting to consider how people might actually use them in the real world." The implication, then, is that Initiative Q is putting adoption first. Presumably, it's one of the reasons that it has decided to go with such an odd acquisition strategy. Ultimately though, it's too early to say whether Initiative Q is or isn't a cryptocurrency in the strictest (ie. fully de-centralized etc.) sense. There simply isn't enough detail about how it will work. Of course, there are reasons why Initiative Q doesn't want to be seen as a cryptocurrency. From a marketing perspective, it needs to look distinctly different from the crypto-pretenders of the last decade. Initiative Q: pyramid scheme or harmless vaporware? Because no money is exchanged at any point, it's difficult to call Initiative Q a ponzi or pyramid scheme. In fact it's actually quite hard to know what to call it. As David Gerard wrote in a widely shared post from June, published when Initiative Q had a first viral wave, "the Initiative Q payment network concept is hard to critique — because not only does it not exist, they don’t have anything as yet, except the notion of “build a payment network and it’ll be awesome.” But while it's hard to critique, it's also pretty hard to say that it's actually fraudulent. In truth, at the moment it's relatively harmless. However, as David Gerard points out in the same post, if the data of those who signed up is hacked - or even sold (although the organization says it won't do that) - that's a pretty neat database of people who'll offer their details up in return for some empty promises of future riches.

0
0
18282

article-image-emoji-scavenger-hunt-showcases-tensorflow-js

Richard Gall

03 Apr 2018

3 min read

Emoji Scavenger Hunt showcases TensorFlow.js

Richard Gall

03 Apr 2018

3 min read

What is Emoji Scavenger Hunt? Emoji Scavenger Hunt is a game built using neural networks. Developed by Google using TensorFlow.js, a version of the machine learning library designed to run on browsers, the game showcases how machine learning can be brought to web applications. But more importantly, TensorFlow.js, which was announced at the end of March at the TensorFlow Developer Summit looks like it could be a tool to define the next few years of web development, making machine learning more accessible to JavaScript developers than ever before. Start playing now. At the moment Emoji Scavenger Hunt is pretty basic, but the central idea is pretty cool. When you open up the web page in your browser and click 'Let's Play', the app asks for access to your camera. The game then starts: you'll see a countdown, before your camera opens and the web application asks you to find an example of an emoji in the real world. If you find yourself easily irritated you're probably not going to get addicted, as Google seem to have done their best to cultivate an emoji-esque mise en scene. But the game nevertheless highlights not only how neural networks work, but also, in the context of TensorFlow.js, how they might operate in a browser. Of course, one of the reasons Emoji Scavenger Hunt is so basic is because a core part of the game is training the neural network. Presumably, as more people play it, the neural network will improve at 'guessing' what objects in the real world relate to which emoji on your keyboard. TensorFlow.js will bring machine learning to the browser What's exciting is how TensorFlow.js might help shape the future of web development. It's going to make it much easier for JavaScript developers to get started with machine learning - on Reddit a number of users were thankful that they could now use TensorFlow without touching a line of Python code. On the other hand - perhaps a little less likely - TensorFlow.js might lead to more machine learning developers using JavaScript. If games like Emoji Scavenger Hunt become the norm, engineers and data scientists will have a new way to train algorithms - getting users to do it for them. TensorFlow.js and deeplearn.js Eagle-eyed readers who have been watching TensorFlow closely might be thinking here - what about deeplearn.js? Fortunately, the TensorFlow team have an answer: TensorFlow.js... is the successor to deeplearn.js, which is now called TensorFlow.js Core. TensorFlow.js and the future of machine learning The announcement of TensorFlow.js highlights that Google and the core development team behind TensorFlow have a clear focus on the future. They're already the definitive library for machine learning and deep learning. What this will do is spread its dominance into new domains. Emoji Scavenger Hunt is pointing the way - we're sure to see plenty of machine learning imitators and innovators over the next few years.

0
0
18207

article-image-product-development-need-developers-and-product-managers-collaborate

Packt Editorial Staff

04 Aug 2018

16 min read

Effective Product Development needs developers and product managers collaborating on success metrics

Packt Editorial Staff

04 Aug 2018

16 min read

Modern product development is witnessing a drastic shift. Disruptive ideas and ambiguous business conditions have changed the way products are developed. Product development is no longer guided by existing processes or predefined frameworks. Delivering on time is a baseline metric, as is software quality. Today, businesses are competing to innovate. They are willing to invest in groundbreaking products with cutting-edge technology. Cost is no longer the constraint—execution is. Can product managers then continue to rely upon processes and practices aimed at traditional ways of product building? How do we ensure that software product builders look at the bigger picture and do not tie themselves to engineering practices and technology viability alone? Understanding the business and customer context is essential for creating valuable products. In this article, we are going to identify what success means to us in terms of product development. This article is an excerpt from the book Lean Product Management written by Mangalam Nandakumar. For the kind of impact that we predict our feature idea to have on the Key Business Outcomes, how do we ensure that every aspect of our business is aligned to enable that success? We may also need to make technical trade-offs to ensure that all effort on building the product is geared toward creating a satisfying end-to-end product experience. When individual business functions take trade-off decisions in silo, we could end up creating a broken product experience or improvising the product experience where no improvement is required. For a business to be able to align on trade-offs that may need to be made on technology, it is important to communicate what is possible within business constraints and also what is not achievable. It is not necessary for the business to know or understand the specific best practices, coding practices, design patterns, and so on, that product engineering may apply. However, the business needs to know the value or the lack of value realization, of any investment that is made in terms of costs, effort, resources, and so on. The section addresses the following topics: The need to have a shared view of what success means for a feature idea Defining the right kind of success criteria Creating a shared understanding of technical success criteria "If you want to go quickly, go alone. If you want to go far, go together. We have to go far — quickly." Al Gore Planning for success doesn't come naturally to many of us. Come to think of it, our heroes are always the people who averted failure or pulled us out of a crisis. We perceive success as 'not failing,' but when we set clear goals, failures don't seem that important. We can learn a thing or two about planning for success by observing how babies learn to walk. The trigger for walking starts with babies getting attracted to, say, some object or person that catches their fancy. They decide to act on the trigger, focusing their full attention on the goal of reaching what caught their fancy. They stumble, fall, and hurt themselves, but they will keep going after the goal. Their goal is not about walking. Walking is a means to reaching the shiny object or the person calling to them. So, they don't really see walking without falling as a measure of success. Of course, the really smart babies know to wail their way to getting the said shiny thing without lifting a toe. Somewhere along the way, software development seems to have forgotten about shiny objects, and instead focused on how to walk without falling. In a way, this has led to an obsession with following processes without applying them to the context and writing perfect code, while disdaining and undervaluing supporting business practices. Although technology is a great enabler, it is not the end in itself. When applied in the context of running a business or creating social impact, technology cannot afford to operate as an isolated function. This is not to say that technologists don't care about impact. Of course, we do. Technologists show a real passion for solving customer problems. They want their code to change lives, create impact, and add value. However, many technologists underestimate the importance of supporting business functions in delivering value. I have come across many developers who don't appreciate the value of marketing, sales, or support. In many cases, like the developer who spent a year perfecting his code without acquiring a single customer, they believe that beautiful code that solves the right problem is enough to make a business succeed. Nothing can be further from the truth Most of this type of thinking is the result of treating technology as an isolated function. There is a significant gap that exists between nontechnical folks and software engineers. On the one hand, nontechnical folks don't understand the possibilities, costs, and limitations of software technology. On the other hand, technologists don't value the need for supporting functions and communicate very little about the possibilities and limitations of technology. This expectation mismatch often leads to unrealistic goals and a widening gap between technology teams and the supporting functions. The result of this widening gap is often cracks opening in the end-to-end product experience for the customer, thereby resulting in a loss of business. Bridging this gap of expectation mismatch requires that technical teams and business functions communicate in the same language, but first they must communicate. Setting SMART goals for team In order to set the right expectations for outcomes, we need the collective wisdom of the entire team. We need to define and agree upon what success means for each feature and to each business function. This will enable teams to set up the entire product experience for success. Setting specific, measurable, achievable, realistic, and time-bound (SMART) metrics can resolve this. We cannot decouple our success criteria from the impact scores we arrived at earlier. So, let's refer to the following table for the ArtGalore digital art gallery: The estimated impact rating was an indication of how much impact the business expected a feature idea to have on the Key Business Outcomes. If you recall, we rated this on a scale of 0 to 10. When the estimated impact of a Key Business Outcomes is less than five, then the success criteria for that feature is likely to be less ambitious. For example, the estimated impact of "existing buyers can enter a lucky draw to meet an artist of the month" toward generating revenue is zero. What this means is that we don't expect this feature idea to bring in any revenue for us or put in another way, revenue is not the measure of success for this feature idea. If any success criteria for generating revenue does come up for this feature idea, then there is a clear mismatch in terms of how we have prioritized the feature itself. For any feature idea with an estimated impact of five or above, we need to get very specific about how to define and measure success. For instance, the feature idea "existing buyers can enter a lucky draw to meet an artist of the month" has an estimated impact rating of six towards engagement. This means that we expect an increase in engagement as a measure of success for this feature idea. Then, we need to define what "increase in engagement" means. My idea of "increase in engagement" can be very different from your idea of "increase in engagement." This is where being S.M.A.R.T. about our definition of success can be useful. Success metrics are akin to user story acceptance criteria. Acceptance criteria define what conditions must be fulfilled by the software in order for us to sign off on the success of the user story. Acceptance criteria usually revolve around use cases and acceptable functional flows. Similarly, success criteria for feature ideas must define what indicators can tell us that the feature is delivering the expected impact on the KBO. Acceptance criteria also sometimes deal with NFRs (nonfunctional requirements). NFRs include performance, security, and reliability. In many instances, nonfunctional requirements are treated as independent user stories. I also have seen many teams struggle with expressing the need for nonfunctional requirements from a customer's perspective. In the early days of writing user stories, the tendency for myself and most of my colleagues was to write NFRs from a system/application point of view. We would say, "this report must load in 20 seconds," or "in the event of a network failure, partial data must not be saved." These functional specifications didn't tell us how/why they were important for an end user. Writing user stories forces us to think about the user's perspective. For example, in my team we used to have interesting conversations about why a report needed to load within 20 seconds. This compelled us to think about how the user interacted with our software. It is not uncommon for visionary founders to throw out very ambitious goals for success. Having ambitious goals can have a positive impact in motivating teams to outperform. However, throwing lofty targets around, without having a plan for success, can be counter-productive. For instance, it's rather ambitious to say, "Our newsletter must be the first to publish artworks by all the popular artists in the country," or that "Our newsletter must become the benchmark for art curation." These are really inspiring words, but can mean nothing if we don't have a plan to get there. The general rule of thumb for this part of product experience planning is that when we aim for an ambitious goal, we also sign up to making it happen. Defining success must be a collaborative exercise carried out by all stakeholders. This is the playing field for deciding where we can stretch our goals, and for everyone to agree on what we're signing up to, in order to set the product experience up for success. Defining key success metrics For every feature idea we came up with, we can create feature cards that look like the following sample. This card indicates three aspects about what success means for this feature. We are asking these questions: what are we validating? When do we validate this? What Key Business Outcomes does it help us to validate? The criteria for success demonstrates what the business anticipates as being a tangible outcome from a feature. It also demonstrates which business functions will support, own, and drive the execution of the feature. That's it! We've nailed it, right? Wrong. Success metrics must be SMART, but how specific is the specific? The preceding success metric indicates that 80% of those who sign up for the monthly art catalog will enquire about at least one artwork. Now, 80% could mean 80 people, 800 people, or 8000 people, depending on whether we get 100 sign-ups, 1000, or 10,000, respectively! We have defined what external (customer/market) metrics to look for, but we have not defined whether we can realistically achieve this goal, given our resources and capabilities. The question we need to ask is: are we (as a business) equipped to handle 8000 enquiries? Do we have the expertise, resources, and people to manage this? If we don't plan in advance and assign ownership, our goals can lead to a gap in the product experience. When we clarify this explicitly, each business function could make assumptions. When we say 80% of folks will enquire about one artwork, the sales team is thinking that around 50 people will enquire. This is what the sales team at ArtGalore is probably equipped to handle. However, marketing is aiming for 750 people and the developers are planning for 1000 people. So, even if we can attract 1000 enquiries, sales can handle only 50 enquiries a month! If this is what we're equipped for today, then building anything more could be wasteful. We need to think about how we can ramp up the sales team to handle more requests. The idea of drilling into success metrics is to gauge whether we're equipped to handle our success. So, maybe our success metric should be that we expect to get about 100 sign-ups in the first three months and between 40-70 folks enquiring about artworks after they sign up. Alternatively, we can find a smart way to enable sales to handle higher sales volumes. Before we write up success metrics, we should be asking a whole truckload of questions that determine the before-and-after of the feature. We need to ask the following questions: What will the monthly catalog showcase? How many curated art items will be showcased each month? What is the nature of the content that we should showcase? Just good high-quality images and text, or is there something more? Who will put together the catalog? How long must this person/team(s) spend to create this catalog? Where will we source the art for curation? Is there a specific date each month when the newsletter needs to go out? Why do we think 80% of those who sign up will enquire? Is it because of the exclusive nature of art? Is it because of the quality of presentation? Is it because of the timing? What's so special about our catalog? Who handles the incoming enquiries? Is there a number to call or is it via email? How long would we take to respond to enquiries? If we get 10,000 sign-ups and receive 8000 enquiries, are we equipped to handle these? Are these numbers too high? Can we still meet our response time if we hit those numbers? Would we still be happy if we got only 50% of folks who sign up enquiring? What if it's 30%? When would we throw away the idea of the catalog? This is where the meat of feature success starts taking shape. We need a plan to uncover underlying assumptions and set ourselves up for success. It's very easy for folks to put out ambitious metrics without understanding the before-and-after of the work involved in meeting that metric. The intent of a strategy should be to set teams up for success, not for failure. Often, ambitious goals are set without considering whether they are realistic and achievable or not. This is so detrimental that teams eventually resort to manipulating the metrics or misrepresenting them, playing the blame game, or hiding information. Sometimes teams try to meet these metrics by deprioritizing other stuff. Eventually, team morale, productivity, and delivery take a hit. Ambitious goals, without the required capacity, capability, and resources to deliver, are useless. Technology to be in line with business outcomes Every business function needs to align toward the Key Business Outcomes and conform to the constraints under which the business operates. In our example here, the deadline is for the business to launch this feature idea before the Big Art show. So, meeting timelines is already a necessary measure of success. The other indicators of product technology measures could be quality, usability, response times, latency, reliability, data privacy, security, and so on. These are traditionally clubbed under NFRs (nonfunctional requirements). They are indicators of how the system has been designed or how the system operates, and are not really about user behavior. There is no aspect of a product that is nonfunctional or without a bearing on business outcomes. In that sense, nonfunctional requirements are a misnomer. NFRs are really technical success criteria. They are also a business stakeholder's decision, based on what outcomes the business wants to pursue. In many time and budget-bound software projects, technical success criteria trade-offs happen without understanding the business context or thinking about the end-to-end product experience. Let's take an example: our app's performance may be okay when handling 100 users, but it could take a hit when we get to 10,000 users. By then, the business has moved on to other priorities and the product isn't ready to make the leap. This depends on how each team can communicate the impact of doing or not doing something today in terms of a cost tomorrow. What that means is that engineering may be able to create software that can scale to 5000 users with minimal effort, but in order to scale to 500,000 users, there's a different level of magnitude required. There is a different approach needed when building solutions for meeting short-term benefits, compared to how we might build systems for long-term benefits. It is not possible to generalize and make a case that just because we build an application quickly, that it is likely to be full of defects or that it won't be secure. By contrast, just because we build a lot of robustness into an application, this does not mean that it will make the product sell better. There is a cost to building something, and there is also a cost to not building something and a cost to a rework. The cost will be justified based on the benefits we can reap, but it is important for product technology and business stakeholders to align on the loss or gain in terms of the end-to-end product experience because of the technical approach we are taking today. In order to arrive at these decisions, the business does not really need to understand design patterns, coding practices, or the nuanced technology details. They need to know the viability to meet business outcomes. This viability is based on technology possibilities, constraints, effort, skills needed, resources (hardware and software), time, and other prerequisites. What we can expect and what we cannot expect must both be agreed upon. In every scope-related discussion, I have seen that there are better insights and conversations when we highlight what the business/customer does not get from this product release. When we only highlight what value they will get, the discussions tend to go toward improvising on that value. When the business realizes what it doesn't get, the discussions lean toward improvising the end-to-end product experience. Should a business care that we wrote unit tests? Does the business care what design patterns we used or what language or software we used? We can have general guidelines for healthy and effective ways to follow best practices within our lines of work, but best practices don't define us, outcomes do. To summarize we learned before commencing on the development of any feature idea, there must be a consensus on what outcomes we are seeking to achieve. The success metrics should be our guideline for finding the smartest way to implement a feature. Developer’s guide to Software architecture patterns Hey hey, I wanna be a Rockstar (Developer) The developer-tester face-off needs to end. It’s putting our projects at risk

0
0
18161

Darrell Pratt

03 Nov 2016

5 min read

What is the API Economy?

Darrell Pratt

03 Nov 2016

5 min read

If you have pitched the idea of a set of APIs to your boss, you might have run across this question. “Why do we need an API, and what does it have to do with an economy?” The answer is the API economy - but it's more than likely that that is going to be met with more questions. So let's take some time to unpack the concept and get through some of the hyperbole surrounding the topic. An economy (From Greek οίκος – "household" and νέμoμαι – "manage") is an area of the production, distribution, or trade, and consumption of goods and services by different agents in a given geographical location. - Wikipedia If we take the definition of economy from Wikipedia and the definition of API as an Application Programming Interface, then what we should be striving to create is a platform (as the producer of the API) that will attract a set of agents that will use that platform to create, trade or distribute goods and services to other agents over the Internet (our geography has expanded). The central tenet of this economy is that the APIs themselves need to provide the right set of goods (data, transactions, and so on) to attract other agents (developers and business partners) that can grow their businesses alongside ours and further expand the economy. This piece from Gartner explains the API economy very well. This is a great way of summing it up: "The API economy is an enabler for turning a business or organization into a platform." Let’s explore a bit more about APIs and look at a few examples of companies that are doing a good job of running API platforms. The evolution of the API economy If you asked someone what an API actually was 10 or more years ago, you might have received puzzled looks. The Application Programming Interface at that time was something that the professional software developer was using to interface with more traditional enterprise software. That evolved into the popularity of the SDK (Software Development Kit) and a better mainstream understanding of what it meant to create integrations or applications on pre-existing platforms. Think of the iOS SDK or Android SDK and how those kits and the distribution channels that Apple and Google created have led to the explosion of the apps marketplace. Jeff Bezos’s mandate that all IT assets have an API at Amazon was a major event in the API economy timeline. Amazon continued to build APIs such as SNS, SQS, Dynamo and many others. Each of these API components provided a well-defined service that any application can use and has significantly reduced the barrier to entry for new software and service companies. With this foundation set, the list of companies providing deep API platforms has steadily increased. How exactly does one profit in the API economy? If we survey a small set of API frameworks, we can see that companies use their APIs in different ways to add value to their underlying set of goods or create a completely new revenue stream for the company. Amazon AWS Amazon AWS is the clearest example of an API as a product unto itself. Amazon makes available a large set of services that provide defined functionality and for which Amazon charges with rates based upon usage of CPU and storage (it gets complicated). Each new service they launch addresses a new area of need and work to provide integrations between the various services. Social APIs Facebook, Twitter and others in the social space, run API platforms to increase the usage of their properties. Some of the inherent value in Facebook comes from sites and applications far afield from facebook.com and their API platform enables this. Twitter has had a more complicated relationship with its API users over time, but the API does provide many methods that allow both apps and websites to tap into Twitter content and thus extend Twitter’s reach and audience size. Chat APIs Slack has created a large economy of applications focused around its chat services and built up a large number of partners and smaller applications that add value to the platform. Slack’s API approach is one that is centered on providing a platform for others to integrate with and add content into the Slack data system. This approach is more open than the one taken by Twitter and the fast adoption has added large sums to Slack’s current valuation. Along side the meteoric rise of Slack, the concept of the bot as an assistant has also taken off. Companies like api.ai are offering services to enable chat services with AI as a service. The service offerings that surround the bot space are growing rapidly and offer a good set of examples as to how a company can monetize their API. Stripe Stripe competes in the payments as a service space along with PayPal, Square and Braintree. Each of these companies offers API platforms that vastly simplify the integration of payments into web sites and applications. Anyone who has built an e-commerce site before 2000 can and will appreciate the simplicity and power that the API economy brings to the payment industry. The pricing strategy in this space is generally on a per use case and is relatively straightforward. It Takes a Community to make the API economy work There are very few companies that will succeed by building an API platform without growing an active community of developers and partners around it. While it is technically easy to create and API given the tooling available, without an active support mechanism and detailed and easily consumable documentation your developer community may never materialize. Facebook and AWS are great examples to follow here. They both actively engage with their developer communities and deliver rich sets of documentation and use-cases for their APIs.

0
0
18099

Sugandha Lahoti

22 Nov 2017

8 min read

7 promising real-world applications of AI-powered Mixed Reality

Sugandha Lahoti

22 Nov 2017

8 min read

Mixed Reality has become a disruptive force that is bridging the gap between reality and imagination. With AI, it is now poised to change the world as we see it! The global mixed reality market is expected to reach USD 6.86 billion by 2024. Mixed Reality has found application in not just the obvious gaming and entertainment industries but also has great potential in business and other industries ranging from manufacturing, travel, and medicine to advertising. Maybe that is why the biggest names in tech are battling it out to capture the MR market with their devices - Microsoft HoloLens, GoogleGlass 2.0, Meta2 handsets to name a few. Incorporating Artificial Intelligence is their next step towards MR market domination. So what’s all the hype about MR and how can AI take it to the next level? Through the looking glass: Understanding Mixed Reality Mixed reality is essentially a fantastic concoction of virtual reality (a virtual world with virtual objects) and augmented reality (the real world with digital information). This means virtual objects are overlaid in the real world and mixed reality enables the person who is experiencing the MR environment to perceive virtual objects as ”real ones”. While in augmented reality it is easy to break the illusion and recognize that the objects are not real (Hello Pokemon Go!), in Mixed Reality, it is harder to break the illusion as virtual objects behave like real-world objects. So when you lean in close or interact with a virtual object in MR, it gets closer in a way a real object would. The MR experience is made possible with mixed reality devices that are typically lightweight and wearable. They are generally equipped with front-mounted cameras to recognize the distinctive features of the real world (such as objects and walls) and blend them with the virtual reality as seen through the headset. They also include a processor for processing the information relayed by an array of sensors embedded in the headset. These processors run the algorithms used for pattern recognition on a cloud-based server. These devices then use a projector for displaying virtual images in real environments which are finally reflected to the eye with the help of beam-splitting technology. All this sounds magical already, what can AI do for MR to top this? Curiouser and curiouser: The AI-powered Mixed Reality Mixed Reality and Artificial Intelligence are two powerful technology tools. The convergence of the two means a seamless immersive experience for users that blends the virtual and physical reality. Mixed Reality devices already enable interaction of virtual holograms in the physical environment and thereby combine virtual worlds with reality. But most MR devices require a large number of calculations and adjustments to accurately determine the position of a virtual object in a real-world scenario. They then apply rules and logic to those objects to make them behave like real-world objects. As these computations happen on the cloud, the results have perceivable time lag which comes in the way of giving the user a truly immersive experience. Also, user mobility is restricted due to current device limitations. Recently there has been a rise of the AI coprocessor in Mixed Reality devices. The announcement of Microsoft’s HoloLens 2 project, an upgrade to the existing MR device which now includes an AI coprocessor is a case in point. By using AI chips for computing, the above calculations, for example, MR devices will deliver high precision results faster. It means algorithms and calculations can run instantaneously without the need for data to be sent to/from a cloud. Having the data locally on your headset will eliminate time lag, thereby creating more real-time immersive experiences. In other words, as the visual data is analyzed directly on the device and computationally-exhaustive tasks are performed close to the data source, the enhanced processing speed results in quicker performance. Since the data remains on your headset always, fewer computations are needed to be performed on the cloud, hence the data is more secure. Using an AI chip also allows flexible implementation of deep neural networks. They help in automating complex calculations such as depth perception estimations and generally provide a better understanding of the environment to the MR devices. Generative models from Deep Learning can be used to generate believable virtual characters (avatars) in the real world. Images can also be more intelligently compressed with AI techniques. This would enable faster transmission over wireless networks. Motion capture techniques are now employing AI functionalities such as phase-functioned neural networks and self-teaching AI. They use machine learning techniques to combine a vast library of stored movements and combine and fit them into new characters. By using AI-powered Mixed Reality devices, the plan is to provide a more realistic experience which is fast and provides more mobility. The ultimate goal is to build AI-powered mixed reality devices that are intelligent and self-learning. Follow the white rabbit: Applications of AI-power Mixed Reality Let us look at various sectors where Artificially Intelligent Mixed Reality has started finding traction. Gaming and Entertainment In the field of gaming, procedural content generation techniques allow automatic generation of Mixed Reality games (as opposed to manual creation by game designers) by encoding elements such as individual structures, enemies etc with their relationships. Artificial Intelligence enhances PCG algorithms in object identification and in recognizing other relationships between the real and virtual objects. Deep learning techniques can be used for tasks like super resolution, photo to texture mapping, and texture multiplication. Healthcare and Surgical Procedures AI-powered mixed reality tech has also found its use in the field of Healthcare and surgical operations. Scopis has announced a mixed reality surgical navigation system that uses the Microsoft HoloLens for spinal surgery applications. It employs image recognition and manipulation techniques which allow the surgeon to see both the patient and a superimposed image of the pedicle screws (used for vertebrae fixation surgeries) for surgical procedures. Retail Retail is another sector which is under the spell of this AI infused MR tech. DigitalBridge, a mixed-reality company uses mixed reality, artificial intelligence, and deep learning to create a platform that allows consumers to virtually try on home decor products before buying them. Image and Video Manipulation AI algorithms and MR techniques can also enrich video and image manipulation. As we speak, Microsoft is readying the release of Microsoft Remix 3D service. This software adds “mixed reality” digital images and animations to videos. It keeps the digital content in the same position in relation to the real objects using image recognition, computer vision, and AI algorithms. Military and Defence AI-powered Mixed Reality is also finding use in the defense sector, where MR training simulations controlled by artificially intelligent software combine real people and physical environments with a virtual setup. Construction and Homebuilding Builders can visualize their options in life-sized models with MR devices. With AI, they can leave virtual messages or videos at key locations to keep other technicians and architects up to date when they’re away. Using MR and AI techniques, an architect can call a remote expert into the virtual environment if need be, and virtual assistants can be utilized for further assistance. COINS is a construction and homebuilding organization which uses AI-powered Mixed Reality devices. They are collaborating with Sketchup for virtual messaging and 3D modeling and Microsoft for HoloLens and Skype Chatbot assistance. Industrial Farming Machine learning algorithms can be used to study a sensor-enabled field of crop and record the growth, requirements and anticipate future needs. An MR device can then provide a means to interact with the plants and analyze present conditions all the while adjusting future needs. Infosys’ Plant.IO is one such digital farm which when combined with the power of an MR device can overlay virtual objects over a real-world scenario. Conclusion Through these examples, we can see the rapid adoption of AI-powered Mixed Reality recipes across diverse fields enabled by the rise of AI chips and the employment of more exhaustive computations with complex machine learning and deep learning algorithms. The next milestone is to see the rise of a Mixed Reality environment which is completely immersive and untethered. This would be made possible by the adoption of more complex AI techniques and advances made in the field of AI both in hardware and software. Instead of voice or search based commands in the MR environments, AI techniques will be used to harness eye and body gestures. As MR devices become smaller and mobile, AI-powered mixed reality will also give rise to intelligent application development incorporating mixed reality through the phone’s camera lens. AI technologies would thus help expand the scope of MR not only as an interesting tool, with applications in gaming and entertainment, but also as a practical and useful approach for how we see, interact and learn from our environment.

0
0
17790

article-image-why-retailers-need-to-prioritize-ecommerce-automation-in-2019

Guest Contributor

14 Jan 2019

6 min read

Why Retailers need to prioritize eCommerce Automation in 2019

Guest Contributor

14 Jan 2019

6 min read

The retail giant Amazon plans to reinvent in-store shopping in much the same way it revolutionized online shopping. Amazon’s cashierless stores (3000 of which you can expect by 2021) give a glimpse into what the future of eCommerce could look like with automation. eCommerce automation involves combining the right eCommerce software with the right processes and the right people to streamline and automate the order lifecycle. This reduces the complexity and redundancy of many tasks that an eCommerce retailer typically faces from the time a customer places her order until the time it is delivered to them. Let’s look at why eCommerce retailers should prioritize automation in 2019. 1. Augmented Customer Experience + Personalization A PwC study titled “Experience is Everything” suggests that 42% of consumers would pay more for a friendly, welcoming experience. Way back in 2015, Gartner predicted that nearly 50% of companies would be implementing changes in their business model in order to augment customer experience. This is especially true for eCommerce, and automation in eCommerce certainly represents one of these changes. Customization and personalization of services are a huge boost for customer experience. A BCG report revealed that retailers who implemented personalization strategies saw a 6-10% boost in their sales. How can automation help? To start with, you can automate your email marketing campaigns and make them more personalized by adding recommendations, discount codes, and more. 2. Fraud Prevention The scope for fraud on the Internet is huge. According to the October 2017 Global Fraud Index, Account Takeover fraud cost online retailers a whopping $3.3 billion dollars in Q2 2017 alone. Additionally, the average transaction rate for eCommerce fraud has also been on this rise. eCommerce retailers have been using a number of fraud prevention tools such as address verification service, CVN (card verification number), credit history check, and more to verify the buyer’s identity. An eCommerce tool equipped with Machine Learning capabilities such as Shuup, can detect fraudulent activity and effectively run through thousands of checks in the blink of an eye. Automating order handling can ensure that these preliminary checks are carried out without fail, in addition to carrying out specific checks to assess the riskiness of a particular order. Depending on the nature and scale of the enterprise, different retailers would want to set different thresholds for fraud detection, and eCommerce automation makes that possible. If a medium-risk transaction is detected, the system can be automated to notify the finance department for immediate review, whereas high-risk transactions can be canceled immediately. Automating your eCommerce software processes allows you to break the mold of one-size-fits-all coding and make the solution specific to the needs of your organization. 3. Better Customer Service Customer service and support is an essential part of the buying process. Automated customer support does not necessarily mean your customers will get canned responses to all their queries. However, it means that common queries can be dealt with in a more efficient manner. Live chats and chatbots have become incredibly popular with eCommerce retailers because these features offer convenience to both the customer and the retailer. The retailer’s support staff will not be held up with routine inquiries and the customer can get their queries resolved quickly. Timely responses are a huge part of what constitutes positive customer experience. Live chat can even be used for shopping carts in order to decrease cart abandonment rates. Automating priority tickets as well as the follow-up on resolved and unresolved tickets is another way to automate customer service. With new generation automated CRM systems/ help desk software you can automate the tags that are used to filter tickets. It saves the customer service rep’s time and ensures that priority tickets are resolved quickly. Customer support/service can be made as a strategic asset in your overall strategy. The same Oracle study mentioned above suggests that back in 2011, 50% of customers would give a brand at most one week to respond to a query before they stopped doing business with them. You can imagine what those same numbers for 2018 would be. 4. Order Fulfillment Physical fulfillment of orders is prone to errors as it requires humans to oversee the warehouse selection. With an automated solution, you can set up an order fulfillment to match the warehouse requirements as closely as possible. This ensures that the closest warehouse which has the required item in stock is selected. This guarantees timely delivery of the order to the customer. It could also be set up to integrate with your billing software so as to calculate accurate billing/shipping charges, taxes (for out of state/overseas shipments), etc. Automation can also help manage your inventory effectively. Product availability is updated automatically with each transaction, be it a return or the addition of a new product. If the stock of a particular in-demand item was nearing danger levels, your eCommerce software would send an automated email to the supplier asking to replenish its stock ASAP. Automation ensures that your prerequisites to order fulfillment are fulfilled for successful processing and delivery of an order. Challenges with eCommerce automation The aforementioned benefits are not without some risks, just as any evolving concept tends to be. One of the most important challenges to making automation work for eCommerce smoothly is data accuracy. As eCommerce platforms gear up for greater multichannel/omnichannel retailing strategies, automation is certainly going to help them bolster and enhance their workflows, but only if the right checks are in place. Automation still has a long way to go, so for now, it might be best to focus on automating tasks that take up a lot of time such as updating inventory, reserving products for certain customers, updating customers on their orders, etc. Advanced applications such as fraud detection might still take a few years to be truly ‘automated’ and free from any need for human review. For now, eCommerce retailers still have a whole lot to look forward to. All things considered, automating the key tasks/functions of your eCommerce platform will impart flexibility, agility, and a scope for improved customer experience. Invest rightly in automation solutions for your eCommerce software to stay competitive in the dynamic, unpredictable eCommerce retail scenario. Author Bio Fretty Francis works as a Content Marketing Specialist at SoftwareSuggest, an online platform that recommends software solutions to businesses. Her areas of expertise include eCommerce platforms, hotel software, and project management software. In her spare time, she likes to travel and catch up on the latest technologies. Software developer tops the 100 Best Jobs of 2019 list by U.S. News and World Report IBM Q System One, IBM’s standalone quantum computer unveiled at CES 2019 Announcing ‘TypeScript Roadmap’ for January 2019- June 2019

0
0
17765

article-image-how-sports-analytics-is-changing-industry

Amey Varangaonkar

14 Nov 2017

7 min read

Of perfect strikes, tackles and touchdowns: how analytics is changing sports

Amey Varangaonkar

14 Nov 2017

7 min read

The rise of Big Data and Analytics is drastically changing the landscape of many businesses - and the sports industry is one of them. In today’s age of cut-throat competition, data-based strategies are slowly taking the front seat when it comes to crucial decision making - helping teams gain that decisive edge over their competition.Sports Analytics is slowly becoming the next big thing! In the past, many believed that the key to conquering the opponent in any professional sport is to make the player or the team better - be it making them stronger, faster, or more intelligent. ‘Analysis’ then was limited to mere ‘clipboard statistics’ and the intuition built by coaches on the basis of raw video footage of games. This is not the case anymore. From handling media contracts and merchandising to evaluating individual or team performance on matchday, analytics is slowly changing the landscape of sports. The explosion of data in sports The amount and quality of information available to decision-makers within the sports organization have increased exponentially over the last two decades. There are several factors contributing to this: Innovation in sports science over the last decade, which has been incredible, to say the least. In-depth records maintained by trainers, coaches, medical staff, nutritionists and even the sales and marketing departments Improved processing power and lower cost of storage allowing for maintaining large amounts of historical data. Of late, the adoption of motion capture technology and wearable devices has proved to be a real game-changer in sports, where every movement on the field can be tracked and recorded. Today, many teams in a variety of sports such as Boston Red Sox and Houston Astros in Major League Baseball (MLB), San Antonio Spurs in NBA and teams like Arsenal, Manchester City and Liverpool FC in football (soccer) are adopting analytics in different capacities. Turning sports data into insights Needless to say, all the crucial sports data being generated today need equally good analytics techniques to extract the most value out of it. This is where Sports Analytics comes into the picture. Sports analytics is defined as the use of analytics on current as well as historical sport-related data to identify useful patterns, which can be used to gain a competitive advantage on the field of play. There are several techniques and algorithms which fall under the umbrella of Sports Analytics. Machine learning, among them, is a widely used set of techniques that sports analysts use to derive insights. It is a popular form of Artificial Intelligence where systems are trained using large datasets to give reliable predictions on random data. With the help of a variety of classification and recommendation algorithms, analysts are now able to identify patterns within the existing attributes of a player, and how they can be best optimized to improve his performance. Using cross-validation techniques, the machine learning models then ensure there is no degree of bias involved, and the predictions can be generalized even in cases of unknown datasets. Analytics is being put to use by a lot of sports teams today, in many different ways. Here are some key use-cases of sports analytics: Pushing the limit: Optimizing player performance Right from tracking an athlete’s heartbeats per minute to finding injury patterns, analytics can play a crucial role in understanding how an individual performs on the field. With the help of video, wearables and sensor data, it is possible to identify exactly when an athlete’s performance drops and corrective steps can be taken accordingly. It is now possible to assess a player’s physiological and technical attributes and work on specific drills in training to push them to an optimal level. Developing search-powered data intelligence platforms seems to be the way forward. The best example for this is Tellius, a search-based data intelligence tool which allows you to determine a player’s efficiency in terms of fitness and performance through search-powered analytics. Smells like team spirit: Better team and athlete management Analytics also helps the coaches manage their team better. For example, Adidas has developed a system called miCoach which works by having the players use wearables during the games and training sessions. The data obtained from the devices highlights the top performers and the ones who need rest. It is also possible to identify and improve patterns in a team’s playing styles, and developing a ‘system’ to improve the efficiency in gameplay. For individual athletes, real-time stats such as speed, heart rate, and acceleration could help the trainers plan the training and conditioning sessions accordingly. Getting intelligent responses regarding player and team performances and real-time in-game tactics is something that will make the coaches’ and management’s life a lot easier, going forward. All in the game: Improving game-day strategy By analyzing the real-time training data, it is possible to identify the fitter, in-form players to be picked for the game. Not just that, analyzing opposition and picking the right strategy to beat them becomes easier once you have the relevant data insights with you. Different data visualization techniques can be used not just with historical data but also with real-time data, when the game is in progress. Splashing the cash: Boosting merchandising What are fans buying once they’re inside the stadium? Is it the home team’s shirt, or is it their scarfs and posters? What food are they eating in the stadium eateries? By analyzing all this data, retailers and club merchandise stores can store the fan-favorite merchandise and other items in adequate quantities, so that they never run out of stock. Analyzing sales via online portals and e-stores also help the teams identify the countries or areas where the buyers live. This is a good indicator for them to concentrate sales and marketing efforts in those regions. Analytics also plays a key role in product endorsements and sponsorships. Determining which brands to endorse, identifying the best possible sponsor, the ideal duration of sponsorship and the sponsorship fee - these are some key decisions that can be taken by analyzing current trends along with the historical data. Challenges in sports analytics Although the advantages offered by analytics are there for all to see, many sports teams have still not incorporated analytics into their day-to-day operations. Lack of awareness seems to be the biggest factor here. Many teams underestimate or still don’t understand, the power of analytics. Choosing the right Big Data and analytics tool is another challenge. When it comes to the humongous amounts of data, especially, the time investment needed to clean and format the data for effective analysis is problematic and is something many teams aren’t interested in. Another challenge is the rising demand for analytics and a sharp deficit when it comes to supply, driving higher salaries. Add to that the need to have a thorough understanding of the sport to find effective insights from data - and it becomes even more difficult to get the right data experts. What next for sports analytics? Understanding data and how it can be used in sports - to improve performance and maximize profits - is now deemed by many teams to be the key differentiator between success and failure. And it’s not just success that teams are after - it’s sustained success, and analytics goes a long way in helping teams achieve that. Gone are the days when traditional ways of finding insights were enough. Sports have evolved, and teams are now digging harder into data to get that slightest edge over the competition, which can prove to be massive in the long run. If you found the article to be insightful, make sure you check out our interview on sports analytics with ESPN Senior Stats Analyst Gaurav Sundararaman.

0
2
17622

article-image-a-quick-look-at-ml-in-algorithmic-trading-strategies

Natasha Mathur

14 Feb 2019

6 min read

A Quick look at ML in algorithmic trading strategies

Natasha Mathur

14 Feb 2019

6 min read

Algorithmic trading relies on computer programs that execute algorithms to automate some, or all, elements of a trading strategy. Algorithms are a sequence of steps or rules to achieve a goal and can take many forms. In the case of machine learning (ML), algorithms pursue the objective of learning other algorithms, namely rules, to achieve a target based on data, such as minimizing a prediction error. In this article, we have a look at use cases of ML and how it is used in algorithmic trading strategies. These algorithms encode various activities of a portfolio manager who observes market transactions and analyzes relevant data to decide on placing buy or sell orders. The sequence of orders defines the portfolio holdings that, over time, aim to produce returns that are attractive to the providers of capital, taking into account their appetite for risk. This article is an excerpt taken from the book 'Hands-On Machine Learning for Algorithmic Trading' written by Stefan Jansen. The book explores effective trading strategies in real-world markets using NumPy, spaCy, pandas, scikit-learn, and Keras. Ultimately, the goal of active investment management consists in achieving alpha, that is, returns in excess of the benchmark used for evaluation. The fundamental law of active management applies the information ratio (IR) to express the value of active management as the ratio of portfolio returns above the returns of a benchmark, usually an index, to the volatility of those returns. It approximates the information ratio as the product of the information coefficient (IC), which measures the quality of forecast as their correlation with outcomes, and the breadth of a strategy expressed as the square root of the number of bets. The use of ML for algorithmic trading, in particular, aims for more efficient use of conventional and alternative data, with the goal of producing both better and more actionable forecasts, hence improving the value of active management. Quantitative strategies have evolved and become more sophisticated in three waves: In the 1980s and 1990s, signals often emerged from academic research and used a single or very few inputs derived from the market and fundamental data. These signals are now largely commoditized and available as ETF, such as basic mean-reversion strategies. In the 2000s, factor-based investing proliferated. Funds used algorithms to identify assets exposed to risk factors like value or momentum to seek arbitrage opportunities. Redemptions during the early days of the financial crisis triggered the quant quake of August 2007 that cascaded through the factor-based fund industry. These strategies are now also available as long-only smart-beta funds that tilt portfolios according to a given set of risk factors. The third era is driven by investments in ML capabilities and alternative data to generate profitable signals for repeatable trading strategies. Factor decay is a major challenge: the excess returns from new anomalies have been shown to drop by a quarter from discovery to publication, and by over 50% after publication due to competition and crowding. There are several categories of trading strategies that use algorithms to execute trading rules: Short-term trades that aim to profit from small price movements, for example, due to arbitrage Behavioral strategies that aim to capitalize on anticipating the behavior of other market participants Programs that aim to optimize trade execution, and A large group of trading based on predicted pricing The HFT funds discussed above most prominently rely on short holding periods to benefit from minor price movements based on bid-ask arbitrage or statistical arbitrage. Behavioral algorithms usually operate in lower liquidity environments and aim to anticipate moves by a larger player likely to significantly impact the price. The expectation of the price impact is based on sniffing algorithms that generate insights into other market participants' strategies, or market patterns such as forced trades by ETFs. Trade-execution programs aim to limit the market impact of trades and range from the simple slicing of trades to match time-weighted average pricing (TWAP) or volume-weighted average pricing (VWAP). Simple algorithms leverage historical patterns, whereas more sophisticated algorithms take into account transaction costs, implementation shortfall or predicted price movements. These algorithms can operate at the security or portfolio level, for example, to implement multileg derivative or cross-asset trades. Let's now have a look at different applications in Trading where ML is of key importance. Use Cases of ML for Trading ML extracts signals from a wide range of market, fundamental, and alternative data, and can be applied at all steps of the algorithmic trading-strategy process. Key applications include: Data mining to identify patterns and extract features Supervised learning to generate risk factors or alphas and create trade ideas Aggregation of individual signals into a strategy Allocation of assets according to risk profiles learned by an algorithm The testing and evaluation of strategies, including through the use of synthetic data The interactive, automated refinement of a strategy using reinforcement learning Supervised learning for alpha factor creation and aggregation The main rationale for applying ML to trading is to obtain predictions of asset fundamentals, price movements or market conditions. A strategy can leverage multiple ML algorithms that build on each other. Downstream models can generate signals at the portfolio level by integrating predictions about the prospects of individual assets, capital market expectations, and the correlation among securities. Alternatively, ML predictions can inform discretionary trades as in the quantamental approach outlined above. ML predictions can also target specific risk factors, such as value or volatility, or implement technical approaches, such as trend following or mean reversion. Asset allocation ML has been used to allocate portfolios based on decision-tree models that compute a hierarchical form of risk parity. As a result, risk characteristics are driven by patterns in asset prices rather than by asset classes and achieve superior risk-return characteristics. Testing trade ideas Backtesting is a critical step to select successful algorithmic trading strategies. Cross-validation using synthetic data is a key ML technique to generate reliable out-of-sample results when combined with appropriate methods to correct for multiple testing. The time series nature of financial data requires modifications to the standard approach to avoid look-ahead bias or otherwise contaminate the data used for training, validation, and testing. In addition, the limited availability of historical data has given rise to alternative approaches that use synthetic data. Reinforcement learning Trading takes place in a competitive, interactive marketplace. Reinforcement learning aims to train agents to learn a policy function based on rewards. In this article, we briefly discussed how ML has become a key ingredient for different stages of algorithmic trading strategies. If you want to learn more about trading strategies that use ML, be sure to check out the book 'Hands-On Machine Learning for Algorithmic Trading'. Using machine learning for phishing domain detection [Tutorial] Anatomy of an automated machine learning algorithm (AutoML) 10 machine learning algorithms every engineer needs to know

0
0
17585

article-image-deep-learning-set-revolutionize-music-industry

Sugandha Lahoti

11 Dec 2017

6 min read

Deep Learning is all set to revolutionize the music industry

Sugandha Lahoti

11 Dec 2017

6 min read

Isn’t it spooky how Facebook can identify faces of your friends before you manually tag them? Have you been startled by Cortana, Siri or Google Assistant when they instantly recognize and act on your voice as you speak to these virtual assistants? Deep Learning is the driving force behind these uncanny yet innovative applications. The next thing that is all set to dive into deep learning is the music industry. Neural networks not only ease production and generation of songs, but also assist in music recommendation, transcription and classification. Here are some ways that deep learning will elevate music and the listening experience itself: Generating melodies with Neural Nets At the most basic level, a deep learning algorithm follows 3 simple steps for music generation: First, the neural net is trained with a sample data set of songs which are labelled. The labelling is done based on the emotions you want your song to convey (happy, sad, funny, etc). For training, the program converts the speech of the data set in text format and then creates vector for each word. The training data can also be in the form of MIDI format which is a standard protocol for encoding musical notes. After completing the training, the program is fed with a set of emotions as input. It identifies the associated input vectors and compares them to training vectors. The output is a melody or chords that represent the desired emotions. Long short-term memory (LSTM) architectures are also used for music generation. They take structured input of a music notation. These inputs are then encoded as vectors and fed into an LSTM at each timestep. LSTM then predicts the encoding of the next timestep. Fully connected convolutional layers are utilized to increase the music quality and to represent rich features in the frequency domain. Magenta, the popular art and music project of Google has launched Performance RNN, which is an LSTM-based recurrent neural network. It is designed to produce multiple sounds with expressive timing and dynamics. In other words, Performance RNN determines which notes to play, when to play them, and how hard to strike each note. IBM’s Watson Beat uses a neural network to produce complete tracks by understanding music theory, structure, and emotional intent. According to Richard Daskas, a music composer working on the Watson Beat project, “Watson only needs about 20 seconds of musical inspiration to create a song.” Transcripting music with deep learning Deep learning methods can also be used for arranging a piece of music for a different instrument. LSTM networks are a popular choice for music transcription and modelling. These networks are trained using a large dataset of pre-labelled music transcriptions (expressed with ABC notation). These transcriptions are then used to generate new music transcriptions. In fact, transformed audio data can be used to predict the group of notes currently being played. This can be achieved by treating the transcription model as an image classification problem. For this, an image of an audio is used, called as Spectrogram. A spectrogram displays how the spectrum or frequency content changes over time. A Short Time Fourier Transform (STFT) or a constant Q transform is used to create this spectrogram. The spectrogram is then feeded to a Convolutional Neural network(CNN). The CNN estimates current notes from audio data and determines what specific notes are present by analysing 88 output nodes for each of the piano keys. This network is generally trained using large number of examples from MIDI files spanning several different genres of music. Magenta has developed The NSynth dataset, which is a high-quality multi-note dataset for music transcription. It is inspired by image recognition datasets and has a huge collection of annotated musical notes. Make better music recommendations Neural Nets are also used to make intelligent music recommendations and are a step ahead of the traditional Collaborative filtering networks. Using neural networks, the system can analyse the songs saved by the users, and then utilize those songs to make new recommendations. Neural nets can also be used to analyze songs based on musical qualities such as pitch, chord progression, bass, etc. Using the similarities between songs having the same traits as each other, neural networks can detect and predict new songs. Thus providing recommendation based on similar lyrical and musical styles. Convolutional neural networks (CNNs) are utilized for making music recommendations. A time-frequency representation of the audio signal is fed into the network as the input. 3 second audio clips are randomly chosen from the audio samples to train the neural network. The CNNs are then used to predict latent factors from music audio by taking the average of the predictions for consecutive clips. The feature extraction layers and pooling layers permits operation on several timescales. Spotify is working on a music recommendation system with a CNN. This recommendation system, when trained on short clips of songs, can create playlists based on the audio content only. Classifying music according to genre Classifying music according to a genre is another achievement of neural nets. At the heart of this application lies the LSTM network. At the very first stage, convolutional layers are used for feature extraction from the spectrograms of the audio file. The sequence of features so obtained is given as input to the LSTM layer. LSTM evaluates dependencies of the song across both short time period as well as long term structure. After the LSTM, the input is fed into a fully connected, time-distributed layer which essentially gives us a sequence of vectors. These vectors are then used to output the network's evaluation of the genre of the song at the particular point of time. Deepsound uses GTZAN dataset and an LSTM network to create a model for music genre recognition. On comparing the mean output distribution with the correct genre, the model gives almost 67% of accuracy. For musical pattern extraction, MFCC feature dataset is used for audio analysis. First, the audio signal is extracted in the MFCC format. Next, the input song is modified into an MFCC map. This Map is then split to feed it as the input of the CNN. Supervised learning is used for automatically obtaining musical pattern extractors, considering the song label is provided. The extractors so acquired, are used for restoring high-order pattern-related features. After high-order classification, the result is combined and undergoes a voting process to produce the song-level label. Scientists from Queen Mary University of London trained a neural net with over 6000 songs in a ballad, hip-hop, and dance to develop a neural network that achieves almost 75% accuracy in song classification. The road ahead Neural networks have advanced the state of music to whole new level where one would no longer require physical instruments or vocals to compose music. The future would see more complex models and data representations to understand the underlying melodic structure. This would help models create compelling artistic content on their own. Combination of music with technology would also foster a collaborative community consisting of artists, coders and deep learning researchers, leading to a tech-driven, yet artistic future.

0
0
17527

article-image-15-ways-to-make-blockchains-scalable-secure-and-safe

Aaron Lazar

27 Dec 2017

14 min read

15 ways to make Blockchains scalable, secure and safe!

Aaron Lazar

27 Dec 2017

14 min read

[box type="note" align="" class="" width=""]This article is a book extract from Mastering Blockchain, written by Imran Bashir. In the book he has discussed distributed ledgers, decentralization and smart contracts for implementation in the real world.[/box] Today we will look at various challenges that need to be addressed before blockchain becomes a mainstream technology. Even though various use cases and proof of concept systems have been developed and the technology works well for most scenarios, some fundamental limitations continue to act as barriers to mainstream adoption of blockchains in key industries. We’ll start by listing down the main challenges that we face while working on Blockchain and propose ways to over each of those challenges: Scalability This problem has been a focus of intense debate, rigorous research, and media attention for the last few years. This is the single most important problem that could mean the difference between wider adaptability of blockchains or limited private use only by consortiums. As a result of substantial research in this area, many solutions have been proposed, which are discussed here: Block size increase: This is the most debated proposal for increasing blockchain performance (transaction processing throughput). Currently, bitcoin can process only about three to seven transactions per second, which is a major inhibiting factor in adapting the bitcoin blockchain for processing micro-transactions. Block size in bitcoin is hardcoded to be 1 MB, but if block size is increased, it can hold more transactions and can result in faster confirmation time. There are several Bitcoin Improvement Proposals (BIPs) made in favor of block size increase. These include BIP 100, BIP 101, BIP 102, BIP 103, and BIP 109. In Ethereum, the block size is not limited by hardcoding; instead, it is controlled by gas limit. In theory, there is no limit on the size of a block in Ethereum because it's dependent on the amount of gas, which can increase over time. This is possible because miners are allowed to increase the gas limit for subsequent blocks if the limit has been reached in the previous block. Block interval reduction: Another proposal is to reduce the time between each block generation. The time between blocks can be decreased to achieve faster finalization of blocks but may result in less security due to the increased number of forks. Ethereum has achieved a block time of approximately 14 seconds and, at times, it can increase. This is a significant improvement from the bitcoin blockchain, which takes 10 minutes to generate a new block. In Ethereum, the issue of high orphaned blocks resulting from smaller times between blocks is mitigated by using the Greedy Heaviest Observed Subtree (GHOST) protocol whereby orphaned blocks (uncles) are also included in determining the valid chain. Once Ethereum moves to Proof of Stake, this will become irrelevant as no mining will be required and almost immediate finality of transactions can be achieved. Invertible Bloom lookup tables: This is another approach that has been proposed to reduce the amount of data required to be transferred between bitcoin nodes. Invertible Bloom lookup tables (IBLTs) were originally proposed by Gavin Andresen, and the key attraction in this approach is that it does not result in a hard fork of bitcoin if implemented. The key idea is based on the fact that there is no need to transfer all transactions between nodes; instead, only those that are not already available in the transaction pool of the synching node are transferred. This allows quicker transaction pool synchronization between nodes, thus increasing the overall scalability and speed of the bitcoin network. Sharding: Sharding is not a new technique and has been used in distributed databases for scalability such as MongoDB and MySQL. The key idea behind sharding is to split up the tasks into multiple chunks that are then processed by multiple nodes. This results in improved throughput and reduced storage requirements. In blockchains, a similar scheme is employed whereby the state of the network is partitioned into multiple shards. The state usually includes balances, code, nonce, and storage. Shards are loosely coupled partitions of a blockchain that run on the same network. There are a few challenges related to inter-shard communication and consensus on the history of each shard. This is an open area for research. State channels: This is another approach proposed for speeding up the transaction on a blockchain network. The basic idea is to use side channels for state updating and processing transactions off the main chain; once the state is finalized, it is written back to the main chain, thus offloading the time-consuming operations from the main blockchain. State channels work by performing the following three steps: First, a part of the blockchain state is locked under a smart contract, ensuring the agreement and business logic between participants. Now off-chain transaction processing and interaction is started between the participants that update the state only between themselves for now. In this step, almost any number of transactions can be performed without requiring the blockchain and this is what makes the process fast and a best candidate for solving blockchain scalability issues. However, it could be argued that this is not a real on-blockchain solution such as, for example, sharding, but the end result is a faster, lighter, and robust network which can prove very useful in micropayment networks, IoT networks, and many other applications. Once the final state is achieved, the state channel is closed and the final state is written back to the main blockchain. At this stage, the locked part of the blockchain is also unlocked. This technique has been used in the bitcoin lightning network and Ethereum's Raiden. 6. Subchains: This is a relatively new technique recently proposed by Peter R. Rizun which is based on the idea of weak blocks that are created in layers until a strong block is found. Weak blocks can be defined as those blocks that have not been able to be mined by meeting the standard network difficulty criteria but have done enough work to meet another weaker difficulty target. Miners can build sub-chains by layering weak blocks on top of each other, unless a block is found that meets the standard difficulty target. At this point, the subchain is closed and becomes the strong block. Advantages of this approach include reduced waiting time for the first verification of a transaction. This technique also results in a reduced chance of orphaning blocks and speeds up transaction processing. This is also an indirect way of addressing the scalability issue. Subchains do not require any soft fork or hard fork to implement but need acceptance by the community. 7. Tree chains: There are also other proposals to increase bitcoin scalability, such as tree chains that change the blockchain layout from a linearly sequential model to a tree. This tree is basically a binary tree which descends from the main bitcoin chain. This approach is similar to sidechain implementation, eliminating the need for major protocol change or block size increase. It allows improved transaction throughput. In this scheme, the blockchains themselves are fragmented and distributed across the network in order to achieve scalability. Moreover, mining is not required to validate the blocks on the tree chains; instead, users can independently verify the block header. However, this idea is not ready for production yet and further research is required in order to make it practical. Privacy Privacy of transactions is a much desired property of blockchains. However, due to its very nature, especially in public blockchains, everything is transparent, thus inhibiting its usage in various industries where privacy is of paramount importance, such as finance, health, and many others. There are different proposals made to address the privacy issue and some progress has already been made. Several techniques, such as indistinguishability obfuscation, usage of homomorphic encryption, zero knowledge proofs, and ring signatures. All these techniques have their merits and demerits and are discussed in the following sections. Indistinguishability obfuscation: This cryptographic technique may serve as a silver bullet to all privacy and confidentiality issues in blockchains but the technology is not yet ready for production deployments. Indistinguishability obfuscation (IO) allows for code obfuscation, which is a very ripe research topic in cryptography and, if applied to blockchains, can serve as an unbreakable obfuscation mechanism that will turn smart contracts into a black box. The key idea behind IO is what's called by researchers a multilinear jigsaw puzzle, which basically obfuscates program code by mixing it with random elements, and if the program is run as intended, it will produce expected output but any other way of executing would render the program look random and garbage. This idea was first proposed by Sahai and others in their research paper Candidate Indistinguishability Obfuscation and Functional Encryption for All Circuits. Homomorphic encryption: This type of encryption allows operations to be performed on encrypted data. Imagine a scenario where the data is sent to a cloud server for processing. The server processes it and returns the output without knowing anything about the data that it has processed. This is also an area ripe for research and fully homomorphic encryption that allows all operations on encrypted data is still not fully deployable in production; however, major progress in this field has already been made. Once implemented on blockchains, it can allow processing on cipher text which will allow privacy and confidentiality of transactions inherently. For example, the data stored on the blockchain can be encrypted using homomorphic encryption and computations can be performed on that data without the need for decryption, thus providing privacy service on blockchains. This concept has also been implemented in a project named Enigma by MIT's Media Lab. Enigma is a peer-to-peer network which allows multiple parties to perform computations on encrypted data without revealing anything about the data. Zero knowledge proofs: Zero knowledge proofs have recently been implemented in Zcash successfully, as seen in previous chapters. More specifically, SNARKs have been implemented in order to ensure privacy on the blockchain. The same idea can be implemented in Ethereum and other blockchains also. Integrating Zcash on Ethereum is already a very active research project being run by the Ethereum R&D team and the Zcash Company. State channels: Privacy using state channels is also possible, simply due to the fact that all transactions are run off-chain and the main blockchain does not see the transaction at all except the final state output, thus ensuring privacy and confidentiality. Secure multiparty computation: The concept of secure multiparty computation is not new and is based on the notion that data is split into multiple partitions between participating parties under a secret sharing mechanism which then does the actual processing on the data without the need of the reconstructing data on single machine. The output produced after processing is also shared between the parties. MimbleWimble: The MimbleWimble scheme was proposed somewhat mysteriously on the bitcoin IRC channel and since then has gained a lot of popularity. MimbleWimble extends the idea of confidential transactions and Coinjoin, which allows aggregation of transactions without requiring any interactivity. However, it does not support the use of bitcoin scripting language along with various other features of standard Bitcoin protocol. This makes it incompatible with existing Bitcoin protocol. Therefore, it can either be implemented as a sidechain to bitcoin or on its own as an alternative cryptocurrency. This scheme can address privacy and scalability issues both at once. The blocks created using the MimbleWimble technique do not contain transactions as in traditional bitcoin blockchains; instead, these blocks are composed of three lists: an input list, output list, and something called excesses which are lists of signatures and differences between outputs and inputs. The input list is basically references to the old outputs, and the output list contains confidential transactions outputs. These blocks are verifiable by nodes by using signatures, inputs, and outputs to ensure the legitimacy of the block. In contrast to bitcoin, MimbleWimble transaction outputs only contain pubkeys, and the difference between old and new outputs is signed by all participants involved in the transactions. Coinjoin: Coinjoin is a technique which is used to anonymize the bitcoin transactions by mixing them interactively. The idea is based on forming a single transaction from multiple entities without causing any change in inputs and outputs. It removes the direct link between senders and receivers, which means that a single address can no longer be associated with transactions, which could lead to identification of the users. Coinjoin needs cooperation between multiple parties that are willing to create a single transaction by mixing payments. Therefore, it should be noted that, if any single participant in the Coinjoin scheme does not keep up with the commitment made to cooperate for creating a single transaction by not signing the transactions as required, then it can result in a denial of service attack. In this protocol, there is no need for a single trusted third party. This concept is different from mixing a service which acts as a trusted third party or intermediary between the bitcoin users and allows shuffling of transactions. This shuffling of transactions results in the prevention of tracing and the linking of payments to a particular user. Security Even though blockchains are generally secure and make use of asymmetric and symmetric cryptography as required throughout the blockchain network, there still are few caveats that can result in compromising the security of the blockchain. There are two types of Contract Verification that can solve the issue of Security. Why3 formal verification: Formal verification of solidity code is now available as a feature in the solidity browser. First the code is converted into Why3 language that the verifier can understand. In the example below, a simple solidity code that defines the variable z as maximum limit of uint is shown. When this code runs, it will result in returning 0, because uint z will overrun and start again from 0. This can also be verified using Why3, which is shown below: Once the solidity is compiled and available in the formal verification tab, it can be copied into the Why3 online IDE available at h t t p ://w h y 3. l r i . f r /t r y /. The example below shows that it successfully checks and reports integer overflow errors. This tool is under heavy development but is still quite useful. Also, this tool or any other similar tool is not a silver bullet. Even formal verification generally should not be considered a panacea because specifications in the first place should be defined appropriately: Oyente tool: Currently, Oyente is available as a Docker image for easy testing and installation. It is available at https://github.com/ethereum/oyente, and can be quickly downloaded and tested. In the example below, a simple contract taken from solidity documentation that contains a re-entrancy bug has been tested and it is shown that Oyente successfully analyzes the code and finds the bug: This sample code contains a re-entrancy bug which basically means that if a contract is interacting with another contract or transferring ether, it is effectively handing over the control to that other contract. This allows the called contract to call back into the function of the contract from which it has been called without waiting for completion. For example, this bug can allow calling back into the withdraw function shown in the preceding example again and again, resulting in getting Ethers multiple times. This is possible because the share value is not set to 0 until the end of the function, which means that any later invocations will be successful, resulting in withdrawing again and again. An example is shown of Oyente running to analyze the contract shown below and as can be seen in the following output, the analysis has successfully found the re-entrancy bug. The bug is proposed to be handled by a combination of the Checks-Effects-Interactions pattern described in the solidity documentation: We discussed the scalability, security, confidentiality, and privacy aspects of blockchain technology in this article. If you found it useful, go ahead and buy the book, Mastering Blockchain by Imran Bashir to become a professional Blockchain developer.

0
0
17453

Savia Lobo

28 Sep 2018

5 min read

What is Core ML?

Savia Lobo

28 Sep 2018

5 min read

Introduced by Apple, CoreML is a machine learning framework that powers the iOS app developers to integrate machine learning technology into their apps. It supports natural language processing (NLP), image analysis, and various other conventional models to provide a top-notch on-device performance with minimal memory footprint and power consumption. This article is an extract taken from the book Machine Learning with Core ML written by Joshua Newnham. In this article, you will get to know the basics of what CoreML is and its typical workflow. With the release of iOS 11 and Core ML, performing inference is just a matter of a few lines of code. Prior to iOS 11, inference was possible, but it required some work to take a pre-trained model and port it across using an existing framework such as Accelerate or metal performance shaders (MPSes). Accelerate and MPSes are still used under the hood by Core ML, but Core ML takes care of deciding which underlying framework your model should use (Accelerate using the CPU for memory-heavy tasks and MPSes using the GPU for compute-heavy tasks). It also takes care of abstracting a lot of the details away; this layer of abstraction is shown in the following diagram: There are additional layers too; iOS 11 has introduced and extended domain-specific layers that further abstract a lot of the common tasks you may use when working with image and text data, such as face detection, object tracking, language translation, and named entity recognition (NER). These domain-specific layers are encapsulated in the Vision and natural language processing (NLP) frameworks; we won't be going into any details of these frameworks here, but you will get a chance to use them in later chapters: It's worth noting that these layers are not mutually exclusive and it is common to find yourself using them together, especially the domain-specific frameworks that provide useful preprocessing methods we can use to prepare our data before sending to a Core ML model. So what exactly is Core ML? You can think of Core ML as a suite of tools used to facilitate the process of bringing ML models to iOS and wrapping them in a standard interface so that you can easily access and make use of them in your code. Let's now take a closer look at the typical workflow when working with Core ML. CoreML Workflow As described previously, the two main tasks of a ML workflow consist of training and inference. Training involves obtaining and preparing the data, defining the model, and then the real training. Once your model has achieved satisfactory results during training and is able to perform adequate predictions (including on data it hasn't seen before), your model can then be deployed and used for inference using data outside of the training set. Core ML provides a suite of tools to facilitate getting a trained model into iOS, one being the Python packaged released called Core ML Tools; it is used to take a model (consisting of the architecture and weights) from one of the many popular packages and exporting a .mlmodel file, which can then be imported into your Xcode project. Once imported, Xcode will generate an interface for the model, making it easily accessible via code you are familiar with. Finally, when you build your app, the model is further optimized and packaged up within your application. A summary of the process of generating the model is shown in the following diagram: The previous diagram illustrates the process of creating the .mlmodel;, either using an existing model from one of the supported frameworks, or by training it from scratch. Core ML Tools supports most of the frameworks, either internal or as third party plug-ins, including Keras, Turi, Caffe, scikit-learn, LibSVN, and XGBoost frameworks. Apple has also made this package open source and modular for easy adaption for other frameworks or by yourself. The process of importing the model is illustrated in this diagram: In addition; there are frameworks with tighter integration with Core ML that handle generating the Core ML model such as Turi Create, IBM Watson Services for Core ML, and Create ML. We will be introducing Create ML in chapter 10; for those interesting in learning more about Turi Create and IBM Watson Services for Core ML then please refer to the official webpages via the following links: Turi Create; https://github.com/apple/turicreate IBM Watson Services for Core ML; https://developer.apple.com/ibm/ Once the model is imported, as mentioned previously, Xcode generates an interface that wraps the model, model inputs, and outputs. Thus, in this post, we learned about the workflow of training and how to import a model. If you've enjoyed this post, head over to the book Machine Learning with Core ML to delve into the details of what this model is and what Core ML currently supports. Emotional AI: Detecting facial expressions and emotions using CoreML [Tutorial] Build intelligent interfaces with CoreML using a CNN [Tutorial] Watson-CoreML: IBM and Apple’s new machine learning collaboration project

0
0
17342

Janu Verma

01 Jun 2017

5 min read

What is Keras?

Janu Verma

01 Jun 2017

5 min read

What is keras? Keras is a high-level library for deep learning, which is built on top of Theano and Tensorflow. It is written in Python, and provides a scikit-learn type API for building neural networks. It enables developers to quickly build neural networks without worrying about the mathematical details of tensor algebra, optimization methods, and numerical techniques. The key idea behind Keras is to facilitate fast prototyping and experimentation. In the words of Francois Chollet, the creator of Keras: Being able to go from idea to result with the least possible delay is key to doing good research. This is a huge advantage, especially for scientists and beginner developers because one can jump right into deep learning without getting their hands dirty. Because deep learning is currently on the rise, the demand for people trained in deep learning is ever increasing. Organizations are trying to incorporate deep learning into their workflows, and Keras offers an easy API to test and build deep learning applications without considerable effort. Deep learning research is such a hot topic right now, scientists need a tool to quickly try out their ideas, and they would rather spend time on coming up with ideas than putting together a neural network model. I use Keras in my own research, and I know a lot of other researchers relying on Keras for its easy and flexible API. What are the key features of Keras? Keras is a high-level interface to Theano or Tensorflow and any of these can be used in the backend. It is extremely easy to switch from one backend to another. Training deep neural networks is a memory and time intensive task. Modern deep learning frameworks like Tensorflow, Caffe, Torch, etc. can also run on GPU, though there might be some overhead in setting up and running the GPU. Keras runs seamlessly on both CPU and GPU. Keras supports most of the neural layer types e.g. fully connected, convolution, pooling, recurrent, embedding, dropout, etc., which can be combined in any which way to build complex models. Keras is modular in nature in the sense that each component of a neural network model is a separate, standalone, fully-configurable module, and these modules can be combined to create new models. Essentially, layers, activation, optimizers, dropout, loss, etc. are all different modules that can be assembled to build models. A key advantage of modularity is that new features are easy to add as separate modules. This makes Keras fully expressive, extremely flexible, and well-suited for innovative research. Coding in Keras is extremely easy. The API is very user-friendly with the least amount of cognitive load. Keras is a full Python framework, and all coding is done in Python, which makes it easy to debug and explore. The coding style is very minimalistic, and operations are added in very intuitive python statements. How is Keras built? The core component of Keras architecture is a model. Essentially, a model is a neural network model with layers, activations, optimization, and loss. The simplest Keras model is Sequential, which is just a linear stack of layers; other layer arrangements can be formed using the Functional model. We first initialize a model, add layers to it one by one, each layer followed by its activation function (and regularization, if desired), and then the cost function is added to the model. The model is then compiled. A compiled model can be trained, using the simple API (model.fit()), and once trained the model can be used to make predictions (model.predict()). The similarity to scikit-learn API can be noted here. Two models can be combined sequentially or parallel. A model trained on some data can be saved as an HDF5 file, which can be loaded at a later time. This eliminates the need to train a model again and again; train once and make predictions whenever desired. Keras provides an API for most common types of layers. You can also merge or concatenate layers for a parallel model. It is also possible to write your own layers. Other ingredients of a neural network model like loss function, metric, optimization method, activation function, and regularization are all available with most common choices. Another very useful component of Keras is the preprocessing module with support for manipulating and processing image, text, and sequence data. A number of deep learning models and their weights obtained by training on a big dataset are made available. For example, we have VGG16, VGG19, InceptionV3, Xception, ResNet50 image recognition models with their weights after training on ImageNet data. These models can be used for direct prediction, feature building, and/or transfer learning. One of the greatest advantage of Keras is a huge list of example code available on the Keras GitHub repository (with discussions on accompanying blog) and on the wider Internet. You can learn how to use Keras for text classification using a LSTM model, generate inceptionistic art using deep dream, using pre-trained word embeddings, building variational autoencoder, or train a Siamese network, etc. There is also a visualization module, which provides functionality to draw a Keras model. This uses the graphviz library to plot and save the model graph to a file. All in all, Keras is a library worth exploring, if you haven't already. Janu Verma is a Researcher in the IBM T.J. Watson Research Center, New York. His research interests are in mathematics, machine learning, information visualization, computational biology and healthcare analytics. He has held research positions at Cornell University, Kansas State University, Tata Institute of Fundamental Research, Indian Institute of Science, and Indian Statistical Institute. He has written papers for IEEE Vis, KDD, International Conference on HealthCare Informatics, Computer Graphics and Applications, Nature Genetics, IEEE Sensors Journals etc. His current focus is on the development of visual analytics systems for prediction and understanding. He advises startups and companies on data science and machine learning in the Delhi-NCR area, email to schedule a meeting. Check out his personal website here.

0
0
17262

article-image-tackle-trolls-machine-learning-filtering-inappropriate-content

Amarabha Banerjee

15 Aug 2018

4 min read

Tackle trolls with Machine Learning bots: Filtering out inappropriate content just got easy

Amarabha Banerjee

15 Aug 2018

4 min read

The most feared online entities in the present day are trolls. Trolls, a fearsome bunch of fake or pseudo online profiles, tend to attack online users, mostly celebrities, sports person or political profiles using a wide range of methods. One of these methods is to post obscene or NSFW (Not Safe For Work) content on your profile or website where User Generated Content (USG) is allowed. This can create unnecessary attention and cause legal troubles for you too. The traditional way out is to get a moderator (or a team of them). Let all the USGs pass through this moderation system. This is a sustainable solution for a small platform. But if you are running a large scale app, say a publishing app where you publish one hundred stories a day, and the success of these stories depend on the user interaction with them, then this model of manual moderation becomes unsustainable. More the number of USGs, more is the turn-around time, larger the moderation team size. This results in escalating costs, for a purpose that’s not contributing to your business growth in any manner. That’s where Machine Learning could help. Machine Learning algorithms that can scan images and content for possible abusive or adult content is a better solution that manual moderation. Tech giants like Microsoft, Google, Amazon have a ready solution for this. These companies have created APIs which are commercially available for developers. You can incorporate these APIs in your application to weed out the filth served by the trolls. The different APIs available for this purpose are Microsoft moderation, Google Vision, AWS Rekognition & Clarifai. Dataturks have made a comparative study on using these APIs on one particular dataset to measure their efficiency. They used a YACVID dataset with 180 images, manually labelled 90 of these images as nude and the rest as non-nude. The dataset was then fed to the 4 APIs mentioned above, their efficiency was tested based on the following parameters. True Positive (TP): Given a safe photo, the API correctly says so False Positive (FP): Given an explicit photo but the API incorrectly classifies it as safe. False negative (FN): Given a safe photo but the API is not able to detect so and True negative(TN): Given an explicit photo and the API correctly says so. TP and TN are two cases which meant the system behaved correctly. An FP meant that the app was vulnerable to attacks from trolls, FN meant the efficiency of the systems were low and hence not practically viable. 10% of the cases would be such that the API can’t decide whether its explicit or not. Those would be sent for manual moderation. This would bring down the maintenance cost of the moderation team. The results that they received are shown below: Source: Dataturks As it is evident from the above table, the best standalone API is Google vision with a 99% accuracy and 94% recall value. Recall value implies that if the same images are repeated, it can recognize them with 94% precision. The best results however were received with the combination of Microsoft and Google. The comparison of the response times are mentioned below: Source: dataturks The response time might have been affected with the fact that all the images accessed by the APIs were stored in Amazon S3. Hence AWS API might have had an unfair advantage on the response time. The timings were noted for 180 image calls per API. The cost is the lowest for AWS Rekognition - $1 for 1000 calls to the API. It’s $1.2 for Clarifai, $1.5 for both Microsoft and Google. The one notable drawback of the Amazon API was that the images had to be stored as S3 objects, or converted into that. All the other APIs accepted any web links as possible source of images. What this study says is that the power of filtering out negative and explicit content in your app is much easier now. You might still have to have a small team of moderators, but their jobs will be made a lot easier with the ML models implemented in these APIs. Machine Learning is paving the way for us to be safe from the increasing menace of Trolls, a threat to free speech and open sharing of ideas which were the founding stones of internet and the world wide web as a whole. Will this discourage Trolls from continuing their slandering or will it create a counter system to bypass the APIs and checks? We can only know in time. Facebook launches a 6-part Machine Learning video series Google’s new facial recognition patent uses your social network to identify you! Microsoft’s Brad Smith calls for facial recognition technology to be regulated

0
0
17236

Tech Guides - Data

Uber's kepler.gl, an open source toolbox for GeoSpatial Analysis

DevOps might be the key to your Big Data project success

Is Initiative Q a pyramid scheme or just a really bad idea?

Emoji Scavenger Hunt showcases TensorFlow.js

Effective Product Development needs developers and product managers collaborating on success metrics

What is the API Economy?

7 promising real-world applications of AI-powered Mixed Reality

Why Retailers need to prioritize eCommerce Automation in 2019

Of perfect strikes, tackles and touchdowns: how analytics is changing sports

A Quick look at ML in algorithmic trading strategies

Trending Topics

Deep Learning is all set to revolutionize the music industry

15 ways to make Blockchains scalable, secure and safe!

What is Core ML?

What is Keras?

Tackle trolls with Machine Learning bots: Filtering out inappropriate content just got easy

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access