Tech Guides - Data Analysis

34 Articles
Richard Gall
12 Dec 2017
3 min read

5 data science tools that will matter in 2018

We know your time is valuable, so it pays to focus on what matters. We've written about the trends and issues that are going to matter in data science, but here you can find 5 data science tools that you need to pay attention to in 2018. Read our 5 things that matter in data science in 2018 here.

1. TensorFlow

Google's TensorFlow has been one of the biggest hits of 2017 when it comes to libraries. It's arguably done a lot to make machine learning more accessible than ever before. That means more people are actually building machine learning and deep learning algorithms, and the technology is moving beyond the domain of data professionals and into other fields. So, if TensorFlow has passed you by, we recommend you spend some time exploring it. It might just give your skill set the boost you're looking for. Explore TensorFlow content here.

2. Jupyter

Jupyter isn't a new tool, sure. But it's so crucial to the way data science is done that its importance can't be overstated. That importance will only grow as pressure is placed on data scientists and analysts to communicate and share data in ways that empower stakeholders in a diverse range of roles and departments. It's also worth mentioning its relationship with Python - we've seen Python go from strength to strength throughout 2017, and it shows no signs of letting up; the close relationship between the two will only serve to make Jupyter more popular across the data science world. Discover Jupyter eBooks and videos here.

3. Keras

In a year when deep learning has captured the imagination, it makes sense to include both of the libraries helping to power it. It's a close call between Keras and TensorFlow as to which deep learning framework is 'better' - ultimately, like everything, it's about what you're trying to do. This post explores the difference between Keras and TensorFlow very well - the conclusion is ultimately that while TensorFlow offers more 'control', Keras is the library you want if you simply need to get up and running. Both libraries have had a huge impact in 2017, and we're only going to be seeing more of them in 2018. Learn Keras. Read Deep Learning with Keras.

4. Auto-sklearn

Automated machine learning is going to become incredibly important in 2018. As pressure mounts on engineers and analysts to do more with less, tools like auto-sklearn will be vital in reducing some of the 'manual labour' of algorithm selection and tuning.

5. Dask

This one might be a little unexpected. We know just how popular Apache Spark is when it comes to distributed and parallel computing, but Dask represents an interesting competitor that's worth watching throughout 2018. Its high-level API integrates exceptionally well with Python libraries like NumPy and pandas; it's also much more lightweight than Spark, so it could be a good option if you want to avoid building out a weighty big data tech stack. Explore Dask in the latest edition of Python High Performance.
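To make the idea of lightweight parallel computing a little more concrete, here is a minimal sketch of the chunk-and-combine pattern that tools in this space build on. It deliberately uses only Python's standard library and is not Dask's API; the function names `chunked` and `parallel_mean` and the chunk size are assumptions made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, chunk_size):
    """Yield consecutive slices of `values`, each at most `chunk_size` long."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def parallel_mean(values, chunk_size=1000, workers=4):
    """Compute a mean by reducing each chunk independently, then combining."""
    if not values:
        raise ValueError("values must not be empty")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is reduced to a (sum, count) pair on its own,
        # so no worker ever needs to see the whole dataset.
        partials = list(
            pool.map(lambda chunk: (sum(chunk), len(chunk)),
                     chunked(values, chunk_size))
        )
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count


data = list(range(10_000))
print(parallel_mean(data))
```

The point of the sketch is the shape of the computation: split the data, reduce the pieces separately, then merge the small partial results. A real library would add scheduling, lazy evaluation, and data that does not fit in memory on top of that.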

Pravin Dhandre
31 May 2018
2 min read

Why Enterprises love the Elastic Stack

Business insights have always been a priority for companies, and with data that keeps flowing and growing by the day, analytics need to be quick, real-time and reliable. Analytics that can't keep up with today's data provide insights that are almost lifeless in the face of market dynamics. The question, then, is whether there is an analytics solution that can tackle the data hydra. The Elastic Stack is your answer. It is packed with tools like Elasticsearch, Kibana, Logstash, X-Pack and Beats that take data from any source, in any format, and provide instant search, analysis, and visualization in real time. With over 225 million downloads, it is a clear crowd favorite. Enterprises get the added benefit of using it as a single analytics suite or integrating it with other products, delivering real-time actionable insights and decisions every time.

Why Enterprises love the Elastic Stack?

One of the things enterprises love most about the Elastic Stack is that it is an open source platform. The next thing IT companies enjoy is its fast distributed search mechanism, which makes queries run faster and more efficiently. Apart from this, its bundling with Kibana and Logstash makes it a great fit for IT infrastructure and DevOps teams, who can aggregate and analyze billions of logs with ease. Its simple and robust analysis platform provides a distinct advantage over Splunk, Solr, Sphinx, Ambar and many other alternative product suites. Also, its SaaS option allows customers to perform log analytics, full text search and application monitoring in the cloud with ease and at reasonable pricing.

Companies like Amazon, Bloomberg, eBay, SAP, Citibank, Sony, Mozilla, WordPress and Salesforce are already using the Elastic Stack to power their search and analytics and to tackle their daily business challenges. Whether it is an educational institution, a travel agency, an e-commerce business, or a financial institution, the Elastic Stack is empowering millions of companies with real-time metrics, strong analytics, a better search experience and high customer satisfaction.

How to install Elasticsearch in Ubuntu and Windows
How to perform Numeric Metric Aggregations with Elasticsearch
CRUD (Create Read, Update and Delete) Operations with Elasticsearch
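Since numeric metric aggregations come up in the related reading above, here is a rough sketch of what such a request body can look like. It only builds the JSON and does not contact a cluster. The helper `metric_aggregation`, the aggregation name `average_price`, and the field `price` are hypothetical; the `size`, `aggs`, and metric keys follow the Elasticsearch query DSL.

```python
import json

# Metric aggregation types this sketch accepts (all exist in Elasticsearch).
SUPPORTED_METRICS = {"avg", "sum", "min", "max", "stats"}


def metric_aggregation(name, metric, field):
    """Build a search body that computes one numeric metric over `field`."""
    if metric not in SUPPORTED_METRICS:
        raise ValueError(f"unsupported metric: {metric}")
    return {
        "size": 0,  # ask for aggregation results only, no matching documents
        "aggs": {name: {metric: {"field": field}}},
    }


# Hypothetical example: the average of a numeric "price" field.
body = metric_aggregation("average_price", "avg", "price")
print(json.dumps(body, indent=2))
```

A body like this would typically be sent to an index's `_search` endpoint; which index and field you aggregate over depends entirely on your own data.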

Erik Kappelman
10 Oct 2017
5 min read

What's the difference between a data scientist and a data analyst

It sounds like a fairly pedantic question to ask what the difference between a data scientist and a data analyst is. But it isn't - in fact, it's a great question that illustrates the way data-related roles have evolved in businesses today. It's pretty easy to confuse the two job roles - there's certainly a lot of misunderstanding about the difference between a data scientist and a data analyst, even within a managerial environment.

Comparing data analysts and data scientists

Data analysts are going to be dealing with data that you might remember from your statistics classes. This data might come from survey results, lab experiments of various sorts, longitudinal studies, or another form of social observation. Data may also come from observation of natural or created phenomena, but the data's form would still be similar. Data scientists, on the other hand, are going to be looking at things like metadata from billions of phone calls, data used to forecast Bitcoin prices that has been scraped from various places around the Internet, or maybe data related to Internet searches before and after some important event.

So their data is often different, but is that all? The tools and skill sets required for each are actually quite different as well. Data science is much more entwined with the field of computer science than data analysis is. A good data analyst should have a working knowledge of how computers, networks, and the Internet function, but they don't need to be an expert in any of these things. Data analysts really just need to know a good scripting language that is used to handle data, like Python or R, and maybe a more mathematically advanced tool like MATLAB or Mathematica for more advanced modeling procedures. A data analyst could have a fruitful career knowing only about that much in the realm of technology.

Data scientists, however, need to know a lot about how networks and the Internet work. Most data scientists will need to have mastered HTTP, HTML, XML and SQL, as well as scripting languages like Ruby or Python, and also general-purpose languages like Java or C. This is because data scientists spend a lot more time capturing, manipulating, storing and moving around data than a data analyst would. These tasks require a different skill set.

Data analysts and data scientists have different forms of conceptual understanding

There will also likely be a difference in the conceptual understanding of a data analyst versus a data scientist. If you were to ask both a data scientist and a data analyst to derive and twice differentiate the log likelihood function of the binomial logistic regression model, it is more likely the data analyst would be able to do it. I would expect data analysts to have a better theoretical understanding of statistics than a data scientist. This is because data scientists don't really need much theoretical understanding in order to be effective. A data scientist would be better served by learning more about capturing data and analyzing streams of data than about theoretical statistics.

Differences are not limited to knowledge or skill set; how data scientists and data analysts approach their work is also different. Data analysts generally know what they are looking for as they begin their analysis. By this I mean that a data analyst may be given the results of a study of a new drug, and the researcher may ask the analyst to explore and hopefully quantify the impact of that drug. A data analyst would have no problem performing this task. A data scientist, on the other hand, could be given the task of analyzing locations of phone calls and finding any patterns that might exist. For the data scientist, the goal is often less defined than it is for a data analyst. In fact, I think this is the crux of the entire difference. Data scientists perform far more exploratory data analysis than their data analyst cousins.

This difference in approach really explains the difference in skill sets. Data scientists have skill sets that are primarily geared toward extracting, storing and finding uses for data. Data analysts primarily analyze data, and their skill set reflects this. Just to add one more little wrinkle, while calling a data scientist a data analyst is basically correct, calling a data analyst a data scientist is probably not correct. This is because the data scientist is going to have a handle on more of the skills required of a data analyst than a data analyst would of a data scientist. This is another reason there is so much confusion around this subject.

Clearing up the difference between a data scientist and data analyst

So now, hopefully, you can tell the difference between a data scientist and a data analyst. I don't believe either field is superior to the other. If you are choosing which field you would like to pursue, what's important is that you choose the field that best complements your skill set. Luckily, it's hard to go wrong, because both data scientists and analysts usually have interesting and rewarding careers.
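For readers curious about the logistic regression exercise mentioned above, here is one way the derivation can be written out. The notation is an assumption chosen for this sketch: observations $y_i \in \{0, 1\}$, feature vectors $x_i$, coefficients $\beta$, and fitted probabilities $p_i$.

```latex
% Model: success probability for observation i
p_i = \frac{1}{1 + e^{-x_i^{\top}\beta}}

% Log likelihood of n independent Bernoulli observations
\ell(\beta) = \sum_{i=1}^{n} \bigl[ y_i \log p_i + (1 - y_i)\log(1 - p_i) \bigr]
            = \sum_{i=1}^{n} \bigl[ y_i\, x_i^{\top}\beta - \log\bigl(1 + e^{x_i^{\top}\beta}\bigr) \bigr]

% First derivative (score)
\frac{\partial \ell}{\partial \beta} = \sum_{i=1}^{n} (y_i - p_i)\, x_i

% Second derivative (Hessian)
\frac{\partial^2 \ell}{\partial \beta\, \partial \beta^{\top}} = -\sum_{i=1}^{n} p_i (1 - p_i)\, x_i x_i^{\top}
```

The second derivative is negative semidefinite because each $p_i(1 - p_i)$ is non-negative, which is why the log likelihood is concave and iterative fitting methods behave well on it.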

Akram Hussain
30 Jun 2014
4 min read

Aspiring Data Analyst, Meet Your New Best Friend: Excel

In general, people want to associate themselves with cool job titles, especially ones that indirectly say both that you're clever and that you get paid well, so what's better than telling someone you're a data analyst? Personally, as a graduate in Economics, I always thought my natural career progression would be to go into an analyst role at a banking organization, a private hedge fund, or an investment firm. I'm guessing at some point all people with a background in maths or some form of statistics have envisaged becoming a hotshot investment banker, right?

However, the story was very different for me; I was somehow fortunate enough to fall into the tech world and develop a real interest in programming. What I found really interesting was that programming languages and data sets go hand in hand surprisingly well, which uncovered a field that was relatively new to me, known as data science. Here's how the story goes: I combined my academic skills with programming, which opened up a world of opportunity, allowing me to appreciate and explore data analysis on a whole new level. Nowadays, I'm using languages like Python and R to mix background knowledge of statistical data with my new-found passion. Yet that's not how it started. It started with Excel.

Now, if you want to eventually move into the field of data science, you have to become competent in data analysis. I personally recommend Excel as a starting point. There are many reasons for this, one being that you don't have to be a technical wizard to get started. More importantly, Excel's functionality for data analysis is more powerful than you would expect, and a lot quicker and more efficient at resolving queries and letting you visualize the results too.

Excel has an inbuilt Data tab to get you started. It provides the basic analytical features within Excel, separate from any functions and sum calculations that could be used. However, one really handy add-in called Data Analysis is missing from that tab by default. If you click on File | Options | Add-ins, choose Analysis ToolPak and Analysis ToolPak - VBA from the list, and select Go, you will be prompted with the Add-ins dialog. Once you select the add-ins, you will find a new option in your Data tab called Data Analysis. This allows you to run different methods of analysis on your data, anything from histograms, regressions, and correlations to t-tests. Personally, I found this saved me tons of time.

Excel also offers features such as pivot tables and functions like VLOOKUP, both extremely useful for data analysis, especially when you need to work across multiple tables of information for large sets of data. A VLOOKUP function is very useful when trying to identify products in a database that share the same set of IDs but are difficult to find. A feature I found even more useful for analysis was the pivot table. One of the best things about a pivot table is that it saves so much time and effort when you have a large set of data that you need to categorize and analyze quickly. Additionally, there's a visual option called a pivot chart, which allows you to visualize all the data in the pivot table. There are many useful free tutorials and training resources on pivot tables available online.

Overall, Excel provides a solid foundation for most analysts starting out. A general search on the job market for "Excel data" returns over 120,000 jobs, all specific to an analyst role. To conclude, I wouldn't underestimate Excel for learning the basics and getting valuable experience with large sets of data. From there, you can progress to learning a language like Python or R (and then head towards the exciting and supercool field of data science). With R's steep learning curve, Python is often recommended as the best place to start, especially for people with little or no background in programming. But don't dismiss Excel as a powerful first step, as it can easily become your best friend when entering the world of data analysis.
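If you do make the jump from Excel to Python, it can help to see what a lookup and a pivot table boil down to. The following is a small sketch using only the standard library; the sales rows, product names, and the helpers `lookup` and `pivot_sum` are all made up for illustration and are not Excel or pandas functions.

```python
from collections import defaultdict

# Hypothetical sales rows, like a flat table kept in a worksheet.
sales = [
    {"product_id": "P1", "region": "North", "units": 10},
    {"product_id": "P2", "region": "North", "units": 5},
    {"product_id": "P1", "region": "South", "units": 7},
    {"product_id": "P3", "region": "South", "units": 3},
    {"product_id": "P2", "region": "South", "units": 4},
]

# Hypothetical lookup table mapping product IDs to names.
product_names = {"P1": "Keyboard", "P2": "Mouse", "P3": "Monitor"}


def lookup(key, table, default="#N/A"):
    """Exact-match lookup, similar in spirit to VLOOKUP on a key column."""
    return table.get(key, default)


def pivot_sum(rows, row_field, col_field, value_field):
    """Sum `value_field` for every (row_field, col_field) combination."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_field]][row[col_field]] += row[value_field]
    # Convert nested defaultdicts to plain dicts for easier printing.
    return {r: dict(cols) for r, cols in table.items()}


pivot = pivot_sum(sales, "region", "product_id", "units")
for region, totals in pivot.items():
    labelled = {lookup(pid, product_names): units for pid, units in totals.items()}
    print(region, labelled)
```

The grouping loop is essentially what a pivot table does behind the scenes: pick a row field, a column field, and a value to aggregate. Libraries such as pandas offer the same operations with far less code once you are comfortable with the idea.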