I’ve always seen spring cleaning as too little, too late. For me, January is when you order, prioritize, and start answering the coming year’s questions. That’s why I’ve picked January to answer a question you may have had a while: what is machine learning?
I’ll answer that, and also define some other terms you’ll need to know to stay on top of 2017. If you’re interested in what business intelligence software can do for you, you’ll need to know these basic terms first.
I’ve put machine learning first, as it’s one of Gartner’s Top Ten Strategic Technology Trends for 2017, but the rest of the entries are arranged alphabetically.
Before machine learning, computers had to be told (programmed) how to think. With machine learning, computers can think (sort of) for themselves.
I recently spoke with Michael Finley, Head of Machine Learning at BI software company AnswerRocket, who helped elaborate. Before machine learning, most software “ran the way it was programmed: people turned instructions into computer code, and the computer did what that code told it to do.” A very simple example would be a calculator: you fed the calculator numbers, told it what to do (add, subtract), and the calculator gave you results. With machine learning, however, the software can adapt. Finley continues: “Software with machine learning doesn’t do the same thing the day you install it as it does the tenth or hundredth day you run it.” If the values being fed into the computer change, the software will adapt to those values. A computer with machine learning learns how to incorporate them.
Finley characterizes machine learning as software that knows how to deal with the concept of “like,” as in, “I want to hear a song like the last one I just heard.” The concept is easy for people, but it’s tough for computers. Finley explained that computers are good at understanding which numbers are bigger or smaller, and at matching numbers and names, but they struggle with the idea of similarity. Machine learning helps computers understand why one thing is “like” another. Machine learning’s grasp of similarity is especially helpful in predicting customer desires.
Machine learning is behind the next song you hear on Pandora, or the movie Netflix suggests. Pandora and Netflix’s machine learning algorithms are “fed” your choices (and actual “likes,” in Pandora’s case), and use that to predict what similar songs or shows you might enjoy.
Feed those machine learning algorithms different data, and they’ll react differently. If your usual diet of horror movies suddenly and inexplicably includes a romantic comedy, Netflix’s ML algorithms will react to that data, and start suggesting other romantic comedies, or horror romance.
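To make the idea of "like" a little more concrete, here is a minimal sketch of how software might score similarity. It is not how Pandora or Netflix actually work; the song names, the three feature scores, and the choice of cosine similarity are all assumptions for illustration. It also only covers the scoring step, not how a real system learns those features from your choices.

```python
import math

# Hypothetical feature scores per song: (tempo, acoustic-ness, vocals), each 0..1.
SONGS = {
    "Song A": (0.9, 0.1, 0.8),
    "Song B": (0.8, 0.2, 0.9),
    "Song C": (0.1, 0.9, 0.2),
}

def cosine_similarity(a, b):
    """Score how closely two feature vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(liked, catalog):
    """Return the catalog song, other than `liked`, that is most like it."""
    candidates = [name for name in catalog if name != liked]
    return max(
        candidates,
        key=lambda name: cosine_similarity(catalog[liked], catalog[name]),
    )

print(most_similar("Song A", SONGS))  # Song B
```

Change the numbers in `SONGS` and the answer changes with them, which is the "feed it different data, get different behavior" point in miniature.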
Declining taste in movies aside, why does machine learning matter to SMBs? It can help them compete with bigger competitors. BI software with machine learning takes in new numbers every time you refresh. You’re not basing strategy off a yearly report’s numbers; you’re basing it off nearly real-time information, and algorithms that know how to adapt to that shifting data. Finley explains that the traditional, homogeneous way a business scales whatever it does can be revolutionized with ML:
“I might have laid out best practices and want to repeat processes. But what if you could lay out best practices every day, if you had info to change them and reformulate your strategy every day? You’ve got data thanks to ML that can rewrite the strategy every day, and that’s how the SMB’s are really eating the lunch of the bigger guys.”
For the SMB interested in agile business strategy, machine learning may be more than a way to stay alive. It could be a way to start taking parts of the established players’ market shares.
- Ad hoc analytics
- Ad hoc query
- Advanced Analytics
- Artificial Intelligence
- Big Data
- Contextual Data
- Data Point
- Data Quality
- Data Visualization
- Data Warehouse
- Modern BI
- Traditional BI
- SaaS/Cloud Software
- Terms you want to know…
Ad hoc analytics
Ad hoc analytics is analysis when you need it, at a level that the non-IT, non-specialist can understand.
If accessible business intelligence seems like an obvious thing to want, it wasn’t always achievable. For a long time, BI professionals had to be able to “speak computer” (i.e., write in a coding language) to query business intelligence programs. Didn’t know how to code in SQL, R or Python? Ask someone in IT who does. Then wait. And then wait for the business intelligence programs to work, and then wait some more for the analysis.
Thankfully, BI has finally matured to ad hoc analytics. With this system, you don’t need to wait on IT, or the slower pace of producing traditional reports, to get you the necessary data. It makes your job, and theirs, easier and less stressful.
If you don’t have an IT staff, ad hoc analytics solves that problem. Ad hoc analytics also creates a quicker time-to-insight (this is another buzzword you may see; it means it takes a shorter time to get the info you need).
Ad hoc query
“Queries” are questions you might ask your business intelligence software to answer. For example, you might ask your BI software for an alphabetical list of all brown-eyed customers born since 1970. You could just as easily call a query a question, but how often do you get to say “query” in conversation?
An ad hoc query is one you can ask when you need it. As with older business analytics, older queries needed someone in IT to ask them. Queries also tended to take place as part of regular reports you’d get on a monthly or yearly basis. With older BI software, you’d have to ask that query in a programming language. SQL was one longtime standard in business intelligence; these days, R and Python are popular ones.
You can look at computer programs, BI included, as branches of a bureaucracy, from the DOJ to HHS. They’re technically there to accomplish things, but each one has its own language, and works its own particular way. A programmer’s like a bureaucrat who speaks the language and knows how to navigate each program/department.
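For the curious, here is roughly what the brown-eyed-customers query from earlier could look like when written out by hand. This is a sketch only: the `customers` table, its column names, and the sample rows are made up, and it uses Python's built-in `sqlite3` module so it can run without a real BI system behind it.

```python
import sqlite3

# An in-memory database with a hypothetical customers table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (name TEXT, eye_color TEXT, birth_year INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Zoe", "brown", 1985),
        ("Adam", "brown", 1972),
        ("Mia", "blue", 1990),
        ("Ed", "brown", 1965),
    ],
)

# The query itself: brown-eyed customers born since 1970, in alphabetical order.
rows = conn.execute(
    "SELECT name FROM customers "
    "WHERE eye_color = 'brown' AND birth_year >= 1970 "
    "ORDER BY name"
).fetchall()

names = [row[0] for row in rows]
print(names)  # ['Adam', 'Zoe']
```

The part in quotes after `conn.execute` is the SQL. An ad hoc query tool aims to let you ask the same thing without writing that line yourself.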
Advanced analytics actually goes beyond business intelligence. “Business intelligence” traditionally has dealt with analyzing what’s happened. Advanced analytics goes further, whether that’s forecasting what will happen in the future, or analyzing details and factors not commonly associated with business intelligence. Some examples of advanced analytics are data and text mining, predictive analytics, forecasting, location analytics, sentiment analysis, and machine learning.
Machine learning is one part of AI, but AI’s a much bigger concept. AI includes anything you could call “intelligence exhibited by machines.” “Intelligence,” in the AI sense, means the ability to get something done. So, the common understanding of “intelligence” as just knowing a lot isn’t the sort of intelligence found in AI.
The “somethings” AI can get done are already varied. For instance, Daisy Intelligence uses AI to examine retailers’ data, then make recommendations that they claim can “grow sales by 5% or more.” If, like me, you enjoy scheduling as much as waiting at the DMV, a virtual assistant like Amy, that can schedule meetings based on attendees’ preferences, could be your best new imaginary friend.
Big data is extremely large data sets. Though I normally agree with Stephen King that “the road to hell is paved with adverbs,” that “extremely” is warranted. A small amount of data would be, say, a short book. A PDF of the first Harry Potter book is about one megabyte (MB).
Big data would be something like a petabyte of data. To continue the book example, by one commonly cited estimate, everything written since the start of recorded history comes to about 50 petabytes. Mega corporations, like Google, are the sort that deal with petabytes. Google’s Mesa system, which monitors Google’s ad traffic, tracks petabytes of data.
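To get a feel for the gap between a megabyte and a petabyte, here is the arithmetic as a quick sketch. It assumes decimal units (1 MB = 10^6 bytes, 1 PB = 10^15 bytes) and rounds the book to exactly one megabyte, so treat the result as a ballpark figure.

```python
MEGABYTE = 10**6    # bytes, decimal (SI) definition
PETABYTE = 10**15   # bytes, decimal (SI) definition

book_size = 1 * MEGABYTE  # assumed size of one book-length PDF

# How many book-sized files fit in a single petabyte?
books_per_petabyte = PETABYTE // book_size
print(f"{books_per_petabyte:,}")  # 1,000,000,000
```

In other words, one petabyte holds about a billion books of that size.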
Contextual data is additional data about a person, place, or event (which are called “entities” in dataspeak). Contextual data helps round out what a business knows about a potential customer, and even predict what they might want.
Though it’s not a business, the University of Manchester in England uses contextual data in its admissions process “to build up a full and rounded view of your achievement and potential.” Along with the student’s admission form, UM considers factors like the candidate’s zip code, the quality of the school where they took their exams, and “whether you have been looked after or in care for more than three months.”
For a business, contextual data might help sales. For a very broad example, contextual data about a past customer, based on their location’s weather, could drive revenue. A customer in Tucson, Arizona is more likely to buy popsicles in October than one in International Falls, Minnesota.
A data point is a single scrap of data. A data point is any self-contained unit, or datum, among the data you track. A single data point could be anything from “the size of an investment” to a single click on an ad you bought on Google. In the case of Uber, location is an important data point—one so important they actually track it after your ride is done.
If you’re familiar with key performance indicators, you’re familiar with data points. KPIs measure certain types of data points, like revenue or the time it takes to complete a project.
Data quality is the measure of your data’s usefulness. High quality data is clean, organized and available. If a library’s data is its books, a library with high quality data would have books the population wants and needs, in good condition, shelved in the right places.
There are six dimensions of data quality:
A data visualization is any image, visual or graphic that displays your data. Pie charts and bar graphs would be the most common kinds. There’s a much wider range of visualizations out there, though. Gartner’s Evaluation Criteria for Business Intelligence and Analytics Platforms for 2016 (paywall protected; worth it) rates more advanced chart types as “preferred” items to look for in your BI solution. Some of those higher quality, preferred chart types to look for are:
A data warehouse is the computer system where the data from various databases and transactional systems is kept and organized. You’ll often see the term with an “enterprise” on the front, as you’ll need a large, enterprise-sized amount of data to need a data warehouse.
A database is data, organized so you can easily get what you need. Ever been to IMDB? Of course you have. That’s a database: movies, actors, directors, producers, all organized for easy searching, like when you need to cheat in a game of six degrees of Kevin Bacon.
That pic is before this explanation because it’s easier to show what a dashboard looks like.
For a formal definition: a dashboard is a visual representation of data you’re tracking. Your BI program absolutely needs to have a dashboard. You wouldn’t buy a car without a dashboard. The same goes for BI software.
When you’re shopping for BI software, make sure that your program’s dashboards have these two baseline criteria recommended by Gartner (paywall protected; worth it):
- “The ability to design dashboards with, at a minimum, basic chart types including tables, bar charts, line charts, area charts and pie charts without requiring third-party options, coding or scripting.”
- “What you see is what you get (WYSIWYG) design,” the ability to design a dashboard and analyze data without knowing how to code.
Drill down refers to the ability to take a general piece of information, like yearly sales figures, and drill down into it by month, week, or even day. “Drill down” means you can narrow from the general to the particulars that often make the difference between information and insight. Drill down is sort of like the business intelligence version of that old “powers of ten” film.
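Here is a small sketch of drill down in action, going from a yearly total to monthly and then daily totals. The sales figures and the `total_by` helper are invented for illustration; a real BI tool does this behind a click rather than in code.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales records: (date of sale, amount in dollars).
SALES = [
    (date(2016, 1, 4), 100.0),
    (date(2016, 1, 4), 50.0),
    (date(2016, 1, 20), 25.0),
    (date(2016, 2, 1), 75.0),
]

def total_by(records, key):
    """Sum amounts, grouped by whatever `key` extracts from each date."""
    totals = defaultdict(float)
    for day, amount in records:
        totals[key(day)] += amount
    return dict(totals)

by_year = total_by(SALES, lambda d: d.year)              # the general view
by_month = total_by(SALES, lambda d: (d.year, d.month))  # one level down
by_day = total_by(SALES, lambda d: d)                    # the particulars

print(by_year)   # {2016: 250.0}
print(by_month)  # {(2016, 1): 175.0, (2016, 2): 75.0}
```

The same records feed every view; only the grouping gets finer at each step.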
ETL—or extract, transform, load—takes place between data collection and placing that data in the data warehouse.
The need to “extract” comes from the fact that data is collected in databases or ERP software before it gets to the data warehouse. The need to transform comes from the fact that those multiple data sources are often in different formats, and need to be transformed into the proper format to be stored and searched in the data warehouse. The need to load is self-explanatory; you’ve got to put it in the data warehouse before you can search and compare one data source against another.
Metadata is data about data. If that sounds meta, it is…it’s metadata!
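The three ETL steps can be sketched in a few lines. Everything here is an assumption made for the example: the two data sources, their field names, and the use of an in-memory SQLite database standing in for the data warehouse. Real ETL tools handle far messier data than this.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export with amounts in dollars.
CRM_CSV = "customer,amount_usd\nAcme,1200.50\nBolt,300\n"
# Hypothetical source 2: ERP records with different names and amounts in cents.
ERP_ROWS = [{"client": "Cogs", "cents": 45000}]

def extract():
    """Pull raw rows out of each source, still in their own formats."""
    crm_rows = list(csv.DictReader(io.StringIO(CRM_CSV)))
    return crm_rows, ERP_ROWS

def transform(crm_rows, erp_rows):
    """Convert both sources to one shape: (customer, amount in dollars)."""
    unified = [(r["customer"], float(r["amount_usd"])) for r in crm_rows]
    unified += [(r["client"], r["cents"] / 100) for r in erp_rows]
    return unified

def load(rows, conn):
    """Store the unified rows in the (stand-in) warehouse."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount_usd REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(transform(*extract()), warehouse)

# Once loaded, both sources can be searched and compared together.
total = warehouse.execute("SELECT SUM(amount_usd) FROM sales").fetchone()[0]
print(total)  # 1950.5
```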
Metadata is information about your data. There are three categories:
- Technical: the technical details about your data, like its models, format, and measures.
- Business: descriptions of the data in user-friendly terms (i.e., plain English)
- Process: data that tells you what was done with which pieces of data, and when.
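As a rough illustration, the three categories might look like this for a single column of sales data. The table, column, and field names below are hypothetical; actual BI tools each store metadata in their own way.

```python
# Hypothetical metadata for one column in a sales table.
sales_metadata = {
    # Technical: how the data is stored.
    "technical": {"table": "sales", "column": "amount_usd", "format": "REAL"},
    # Business: what the data means, in plain English.
    "business": {"description": "Sale amount in US dollars, before tax"},
    # Process: what was done with the data, and when.
    "process": {"last_loaded": "2017-01-03", "loaded_by": "nightly ETL job"},
}

for category, details in sales_metadata.items():
    print(category, details)
```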
“Metric” is just a fancy word for whatever things you’re measuring.
Are you tracking your net profits? That’s a metric. Keeping an eye on how many people are using the BI software at your company? That’s a metric, too. Keeping an eye on that conversion rate? That’s a metric, as well. The trick with metrics is to pick the ones that are best for your company. Every company has different needs, and it’s a good idea to consider your needs and priorities when picking metrics.
A modern BI platform supports IT-enabled analytic content development. It is defined by a self-contained architecture that enables nontechnical users to autonomously execute full-spectrum analytic workflows from data access, ingestion and preparation to interactive analysis and the collaborative sharing of insights.
Simply put, modern BI puts the business user first. You won’t need to depend on someone from IT, or you’ll need to depend on them far less, to use a modern BI program. Where traditional, older BI programs were set up to only allow the IT folks to author content, for example, modern BI programs make it easy for business users to create content on their own.
Traditional business intelligence programs lean heavily on IT personnel. They usually require that users know SQL (a programming language, see below), and it takes far longer to get answers, as you’ve got to manually enter multiple queries in that language. As such, they’re far less agile, and experts like those at Gartner suggest buyers instead look for the sort of features found in modern BI programs.
Software as a service is a model where buyers purchase licenses to use software, rather than buying and installing it. Most SaaS software is delivered over the internet (i.e., in the cloud), which reduces the upfront costs of buying and installing. It also does away with the need to monitor the servers where the software’s kept; the SaaS company keeps track of any potential outages.
Slicing and dicing means cutting large datasets up to either look at the data from different perspectives, or look at certain parts in more detail. Slice and dice capabilities are what, for instance, allow you to check data by week, then month, then individual day. Instead of waiting for a report, slice and dice lets you take the initiative and check out the specific data when you need it.
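A small sketch may help here. The records and the `slice_by` helper are invented for the example, and I'm using "slice" for pinning down one dimension and "dice" for pinning down several, which is one common way the terms are used.

```python
# Hypothetical sales records with three dimensions: region, product, month.
SALES = [
    {"region": "West", "product": "Popsicles", "month": "Oct", "amount": 120},
    {"region": "West", "product": "Mittens", "month": "Oct", "amount": 10},
    {"region": "North", "product": "Popsicles", "month": "Oct", "amount": 5},
    {"region": "North", "product": "Mittens", "month": "Nov", "amount": 90},
]

def slice_by(records, **fixed):
    """Keep only the records matching every dimension=value pair given."""
    return [r for r in records if all(r[k] == v for k, v in fixed.items())]

def total(records):
    """Add up the amounts in a set of records."""
    return sum(r["amount"] for r in records)

west = slice_by(SALES, region="West")                    # slice: one dimension
west_popsicles = slice_by(SALES, region="West", product="Popsicles")  # dice
popsicles = slice_by(SALES, product="Popsicles")         # a different angle

print(total(west))            # 130
print(total(west_popsicles))  # 120
print(total(popsicles))       # 125
```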
Pronounced “sequel,” SQL is a common programming language used to get information from databases. If you speak English, the database speaks SQL, and it will only know how to answer questions phrased that way. Unless, of course, your business intelligence software has natural language query (NLQ), which lets you ask questions the same way you would a search engine.
Terms you want to know…
Or that you think would benefit readers of this list? Let me know them in the comments below. Ideally, the comments section could become another place for people to request definitions, and me to provide them.
If you want to know how these terms can help you better, check out one of the options in Capterra’s business intelligence software directory, and reach out to a vendor.