Vantage enables data analytics at scale.

What Working “at Scale” Really Means

Rob Armstrong
June 25, 2019 | 3 min read

Solving the ten hundred thousand problem

Things seem to come in waves, and in the past few weeks I have had the same conversation at least seven times. I call it solving the ten hundred thousand problem (and that is not the same as a million). What this means is that if you have 10 users running 10 queries involving 10 rows or tables, it is not that hard to manage. Increase the scale to hundreds instead of tens and you may get by with a squad of good DBAs and sysadmins. Keep going to thousands (or even millions!) and managing the environment goes beyond what people alone can do.
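To put rough numbers on that progression, here is a minimal sketch. It assumes, purely for illustration, that the moving parts multiply: every user can run every query against every table. The function name `interaction_count` is made up for this example and is not part of any product.

```python
def interaction_count(scale: int) -> int:
    """Users x queries x tables, with the same count for each dimension.

    Assumption for illustration only: the three dimensions multiply.
    """
    return scale ** 3


# Walk through the tens, hundreds, and thousands tiers from the text.
for scale in (10, 100, 1000):
    print(f"{scale:>5} of each -> {interaction_count(scale):,} combinations")
```

Under that assumption, each tenfold step in users, queries, and tables gives a thousandfold jump in combinations to manage, which is one way to see why people alone stop being able to keep up.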

When you need to move from the departmental solution or POC to “operational and production systems working at scale” you encounter three main challenges: optimizing the individual queries, managing the diverse workload, and overall system monitoring. Let’s take a look at each of these in a bit of detail.

Optimizer

The first challenge is that each individual query has to be effectively optimized every time it runs. That means you cannot rely on people to provide hints, expect users to write the best code, or assume that a business tool generating generic SQL is well suited to every platform. The optimizer must be fully aware of data relationships, fully leverage parallelism, and be able to execute parts of a query whose results can in turn be used to better optimize the next steps in that query.

Teradata Vantage is built upon the Teradata database, which has the richest optimizer for analytic workloads. Vantage takes that foundation and expands it to include optimization with machine learning and graph functions, as well as incorporating data from other systems.

Workload Management

Once each query has been optimized, the workload as a whole needs to be managed. Not every query carries the same importance. Web access queries have stringent service levels that must be met, or customer experience suffers. At the other end of the spectrum are long-running queries of a “what if” nature, where run time is less of a concern as long as insight can be gained from the completed query.

Of course, the mix along this spectrum is constantly changing. Some hours are heavily skewed to the tactical while others are skewed to the strategic. As concurrency grows, overlapping priorities and conflicting requirements become harder and harder to manage.

Teradata Vantage has world-class workload management: rules and service levels are broadly defined, and the system takes over from there to ensure all workloads receive the resources they need to satisfy user expectations.
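As a rough illustration of the idea of defining rules broadly and letting the system decide dispatch order, here is a toy priority queue. It is a sketch only and does not represent how Teradata Vantage implements workload management. The workload classes, the `WorkloadQueue` class, and the query names are all hypothetical.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

# Hypothetical workload classes; a lower number is dispatched first.
PRIORITY = {"tactical": 0, "strategic": 1}


@dataclass(order=True)
class QueuedQuery:
    priority: int
    seq: int  # arrival order, used to break ties within a class
    name: str = field(compare=False)


class WorkloadQueue:
    """Toy dispatcher: tactical queries always run before strategic ones."""

    def __init__(self) -> None:
        self._heap: list[QueuedQuery] = []
        self._seq = count()

    def submit(self, name: str, workload: str) -> None:
        entry = QueuedQuery(PRIORITY[workload], next(self._seq), name)
        heapq.heappush(self._heap, entry)

    def next_query(self) -> str:
        return heapq.heappop(self._heap).name

    def __len__(self) -> int:
        return len(self._heap)


def drain(queue: WorkloadQueue) -> list[str]:
    """Pop every queued query and return the names in dispatch order."""
    order = []
    while len(queue):
        order.append(queue.next_query())
    return order


q = WorkloadQueue()
q.submit("what_if_forecast", "strategic")
q.submit("web_account_lookup", "tactical")
q.submit("churn_model_scan", "strategic")
q.submit("web_order_status", "tactical")
print(drain(q))
```

The two web lookups are dispatched ahead of the long-running “what if” queries even though a strategic query arrived first. A real workload manager also has to allocate CPU and I/O among queries that are already running and adapt as the mix shifts through the day, which this sketch leaves out.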

System Monitoring

While it is good to have queries optimized and workloads managed, there is the broader challenge of keeping the whole environment running smoothly as a single system. This includes message traffic, monitoring, error recovery, space management, and continuity in systems where loads and queries are happening all day. Adding to the challenge are concepts like failover, where multiple systems are kept in sync for highly operational environments.

No small task indeed, and this is where Teradata Vantage again leverages the rich experience of the past, excelling not only in single-system management but also bringing a full suite of tools within IntelliSphere to simplify and automate multi-system ecosystems.

Conclusion

There is a big difference between running a POC and running “at scale.” There is also a difference between a departmental or point solution and running across your enterprise with consistent and integrated data. The worst time to understand the limitations of a solution is once value is shown and you need to grow by adding more users, more data, and more queries.

Teradata Vantage brings the power of all the above in addition to unparalleled scalability. And as new advanced analytics engines are added into the mix, Teradata will be working to bring the same rigor to those operations as well.

All of this combines to solve the “ten hundred thousand” problem, allowing your business to go from insight to production without worry and to drive millions to your bottom line.

About Rob Armstrong

Since 1987, Rob has contributed to virtually every aspect of the data warehouse and analytical arenas. Rob’s work has been dedicated to helping companies become data-driven and turn business insights into action. Currently, Rob works to help companies not only create the foundation but also incorporate the principles of a modern data architecture into their overall analytical processes.
