As technology evolves faster than most organizations can adapt, it’s clear that data-centric companies will devour their competitors.
For most organizations, however, being data-centric requires increasingly specialized skills to be able to connect disparate and unwieldy sets of data and derive value from it. Most organizations (or institutions) are collecting data like crazy—from multiple legacy systems and numerous modern tools and apps:
Despite the growing number of tools, only 4% of companies—across industries—are using that data to its fullest potential (Bain and Company).
This digital Darwinism doesn’t just threaten businesses. Universities also face their own set of challenges. For example, only 56% of students graduate within six years. Why? Well, not because they’re failing. More than 40 percent of dropouts left their studies with at least a B average. So universities, like businesses, are using tools to collect data to get to the bottom of this question and many others. And in the midst of growing skepticism over the affordability and necessity of college, those who can figure out what causes students to drop out have the opportunity to attract—and retain—more students.
Of course, having data is not the same as having insights. Even the best analytics tools are of little use unless their data can be consolidated with every other data source and analyzed based on a single source of truth. The first step to making data valuable is centralizing it. Which is why every college needs its own very secure data warehouse.
Data Warehousing 101
Data warehousing is a technology that combines structured data from numerous sources, campus-wide and beyond. It not only houses the data securely for analysis, but it puts data into a uniform format and removes duplicate data points. Called normalization, there are a few factors to consider when deciding how you want data to “live” in your data warehouse: speed, storage and ease of use. But once you choose the form best for you, your data warehouse will be a reliable source of clean, organized, quality data.
It may be worth noting here that data warehouses differ from databases in that they use a different design. While databases are optimized to maintain strict accuracy of data in the moment, data warehouses are designed to give a long-range view of data over time (including all the data from the database). They are a single, actionable, comprehensive source of truth for business intelligence.
How to Begin
Like with any big data tool, when it comes to setting up your data warehouse, a myriad of options exist. Deciding which platform to choose is an entire post (and maybe a series) in and of itself. If you’re not sure how to begin, here’s a great article about the criteria to use when selecting a platform for data warehousing, as well as some of the most popular options for performance, scalability, reliability, security, backup and recovery and more (full disclosure: it’s written by a smart warehouse product called Panoply and overtly biased, but a lot can be learned as they compare and contrast their product to others based on individual criterion).
Here are some of the top data warehousing platforms and why people choose them:
- Amazon Redshift – Fast, managed, petabyte-scale data warehouse
- Google BigQuery – Fully managed, NoOps, low cost data analytics
- IBM DashDB – Fully managed cloud data warehousing service
- SAP Hana – In-memory, columnar, relational database system
- Snowflake – Fast, fully managed, built for the cloud
Once you choose which warehouse to use, you’re ready to fill it with data. Worth noting: more isn’t always better. Consider Steven Covey’s adage and “begin with the end in mind.” What are the questions you want to answer? Maybe you want to know:
- Percentage of freshman students who return the following year
- Percentage of graduating students relative to the total number of students
- Total operating expenditures per full-time student
- Total number of enrolled students and the number of hours enrolled per concentration
- Level of student participation in intramural athletics and sports clubs
To answer those five questions, identify the data points that must be present and add external data sets, if necessary. Say you’re determining operating expenditures. Are all sources of operating revenue included, from student income to useable gifts? Would it be helpful to have historical operating expenditures for analysis? What about future projections with embedded business logic?
When you determine what you want to know and what data you need to discover it, add it to your data warehouse. Then the answer is at your fingertips.
Visualizing Your Data with Dashboards
Once set up, any data warehouse can be easily queried, but connecting your data warehouse to a dashboard enables clear visualization of data. If you have somebody with the technical skills to craft your own dashboard, there are some great options built on open source software (free to use and tailor to your needs). Or, there are other great visualization options, for varying prices and with built-in analytics capabilities, like:
- Tableau – Fast, embedded analytics and interactive visualizations
- Yellowfin – Relational databases and multi-dimensional cubes
- GoodData – SaaS business intelligence and analytics
- Sisense – Analyze and visualize complex datasets
- Periscope – Unify business data from multiple sources
- Chartio – Analyze critical data through an intuitive interface
- Looker – Make data accessible to the organization
- Mode – Provide online services for analyzing data
Take Tableau, for instance. Once it’s connected to your data warehouse, you can create multiple dashboards for use across your campus.
To understand use of space…
Identify donation patterns…
Analyze enrollment trends …
…or whatever else you need to know. While many of these tools do fantastic analytics on their own, hooking them up to a data warehouse with all data sets safely stored makes for more reliable, more robust queries.
Benefits of Data Warehousing
The obvious benefit to having the data to answer your questions and make essential decisions—not to mention visualize results!—is really just the tip of the iceberg. With data accessible, you can also begin to experiment with new technologies, like predictive analytics, real-time analytics, artificial intelligence and more.
Others are already treading these waters, like Georgia State University (GSU), who partnered with AdmitHub, to launch a university chatbot that supplies on-demand answers to applicant and student questions. Students can ask anything, from “When is my financial aid application due?” to “Can my dog live in the dorm with me?” and get answers on demand.
GSU also uses real-time student data to create trigger systems. If a student gets an unsatisfactory grade in his or major, doesn’t take a required course at the right time or signs up for a class outside his or major, an academic advisor gets an alert. Trigger systems can be based on past student data too, like at Marist College and Purdue University. Within two weeks of the start of a course, they can use historical data to predict whether a student will complete a course successfully. Then a trigger alerts a faculty member who can intervene.
As more and more universities begin to use progressive technologies, the use cases will become infinitely creative. But what are the real benefits to being prepared to use progressive technologies? According to GSU, Marist and Purdue, enhanced recruiting, better enrollment, higher grades, increased retention and graduation rates and closing the graduation gap for low-income, first-generation and minority students, to name a few.
Make Your Data Warehouse a Home
Getting to these benefits first requires getting started with data warehousing, which is a little like buying a physical house. Shop around, buy the house, clean it up, move in and decorate it just the way you like it. Then keep it safe with a new fence and maybe install a state-of-the-art security system. After that, it’s yours to show off or keep private and come and go freely, as you please.
Same with data warehousing. Choose the warehouse that best fits you, what data you want inside and how to visualize it. Then add your own security requirements to protect social security numbers, financial and health information or other sensitive details. This includes classifying data to restrict more delicate information or limiting some users to read-only access to certain data. Safety is of the utmost importance, so set the parameters that are right for you.
And at that point, the data warehouse and everything inside should be completely yours. Some companies provide secure data services, but charge a huge fee to connect and analyze data—and an even bigger fee to access your own data sets, if they make them accessible to you at all. With all the great storage and analytics options available, there’s no reason your information shouldn’t stay in your hands, for a reasonable cost.
Of course, as with any relocation, setting up your data warehouse can be pretty time-consuming. The data engineering necessary to fetch the data sets you need, then continually extract, transform and load (ETL) them into their new home costs most analysts “the bulk” of their time. If you want to get started quickly, without the ongoing headache of data preparation, Astronomer is a data engineering platform that handles the heavy lifting. You have full access to connect any data source you need. And while data flows through Astronomer’s platform, full control and ownership stays with you.
Ready to get started? (TL;DR)
To keep up with big data and today’s technology, it’s important to have your data stored securely, visualized meaningfully and owned and managed by those who need it to make decisions. If you don’t warehouse data yet, here’s a checklist to get you started right away:
- Determine your KPIs
- Figure out what data you need to answer them
- Choose your data warehouse
- Pick a dashboard and/or method of querying works for your team
- Connect and route data with a data engineering platform
- Reap the benefits!