Learn Data Analysis for Big Data. Master using SQL for data analysis on distributed big data systems
Instructors: Glynn Durham
28,915 already enrolled
(1,176 reviews)
This Specialization teaches the essential skills for working with large-scale data using SQL.
Maybe you are new to SQL and you want to learn the basics. Or maybe you already have some experience using SQL to query smaller-scale data with relational databases. Either way, if you are interested in gaining the skills necessary to query big data with modern distributed SQL engines, this Specialization is for you.
Most courses that teach SQL focus on traditional relational databases. Today, however, more and more of the data being generated is too big to be stored in those databases, and it is growing too quickly to be stored efficiently in commercial data warehouses. Instead, it is increasingly stored in distributed clusters and cloud storage. These data stores are cost-efficient and highly scalable.
To query these huge datasets in clusters and cloud storage, you need a newer breed of SQL engine: distributed query engines, like Hive, Impala, Presto, and Drill. These are open source SQL engines capable of querying enormous datasets. This Specialization focuses on Hive and Impala, the most widely deployed of these query engines.
This Specialization is designed to provide excellent preparation for the Cloudera Certified Associate (CCA) Data Analyst certification exam. You can earn this certification credential by taking a hands-on practical exam using the same SQL engines that this Specialization teaches—Hive and Impala.
Applied Learning Project
Each course in this Specialization includes a hands-on, peer-graded assignment. To earn the Specialization Certificate, you must successfully complete the hands-on, peer-graded assignment in each course. For this Specialization, there is not a separate Capstone Project like there is in some other Coursera Specializations.
Distinguish operational from analytic databases, and understand how these are applied in big data
Understand how database and table design provides structures for working with data
Appreciate how differences in the volume and variety of data affect your choice of an appropriate database system
Recognize the features and benefits of SQL dialects designed to work with big data systems for storage and analysis
Understand the basics of SELECT statements
Understand how and why to filter results
Explore grouping and aggregation to answer analytic questions
Work with sorting and limiting results
Use different tools to browse existing databases and tables in big data systems
Use different tools to explore files in distributed big data filesystems and cloud storage
Create and manage big data databases and tables using Apache Hive and Apache Impala
Describe and choose among different data types and file formats for big data systems
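The query-related objectives above (SELECT, filtering, grouping and aggregation, sorting, and limiting) can be illustrated with one small query. The sketch below is an assumption-laden illustration, not course material: it uses Python's built-in sqlite3 module as a local stand-in rather than Hive or Impala, and the `sales` table, its columns, and the sample rows are hypothetical. The same clauses exist in Hive and Impala SQL, though dialect details can differ.

```python
import sqlite3

# Hypothetical sample data; the table and column names are illustrative only.
ROWS = [
    ("north", "widget", 120.0),
    ("north", "gadget", 80.0),
    ("south", "widget", 250.0),
    ("south", "gadget", 15.0),
    ("east", "widget", 50.0),
]


def top_regions(min_amount, limit):
    """Return (region, total) pairs, counting only sales >= min_amount."""
    # SQLite in memory is a stand-in here, not a distributed query engine.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sales (region TEXT, product TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", ROWS)
        query = """
            SELECT region, SUM(amount) AS total  -- aggregation
            FROM sales
            WHERE amount >= ?                    -- filtering
            GROUP BY region                      -- grouping
            ORDER BY total DESC                  -- sorting
            LIMIT ?                              -- limiting
        """
        return conn.execute(query, (min_amount, limit)).fetchall()
    finally:
        conn.close()


result = top_regions(50.0, 2)
print(result)
```

The WHERE clause drops the 15.0 sale before grouping, so the totals are computed only over the remaining rows, and LIMIT keeps the two largest totals after sorting.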
At Cloudera, we believe that data can make what is impossible today, possible tomorrow. We empower people to transform complex data into clear and actionable insights. Cloudera delivers an enterprise data cloud for any data, anywhere, from the Edge to AI. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.
Yes, the courses in this Specialization are intended to be taken in order.
A fourth course entitled Advanced SQL for Big Data Analysis is currently under development. When it is completed, it will be added to this Specialization.
To use the hands-on environment for the courses in this Specialization, you need to download and install a virtual machine and the software on which to run it. Before continuing, be sure that you have access to a computer that meets the following hardware and software requirements:
• Windows, macOS, or Linux operating system (iPads and Android tablets will not work)
• 64-bit operating system (32-bit operating systems will not work)
• 8 GB RAM or more
• 25 GB free disk space or more
• Intel VT-x or AMD-V virtualization support enabled (on Mac computers with Intel processors, this is always enabled; on Windows and Linux computers, you might need to enable it in the BIOS)
• For Windows XP computers only: You must have an unzip utility such as 7-Zip or WinZip installed (Windows XP’s built-in unzip utility will not work)
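A few of the requirements listed above can be checked from a script before you download the virtual machine. The following is a rough sketch using only the Python standard library. The function name `check_requirements` and its parameters are hypothetical. It does not check RAM or virtualization support, because the standard library has no portable call for either. The 64-bit test inspects the Python interpreter, so a 32-bit Python on a 64-bit operating system would report a false negative.

```python
import platform
import shutil
import sys


def check_requirements(path=".", min_free_gb=25):
    """Report the requirements that Python can inspect portably.

    Assumption: `path` is on the disk where the virtual machine
    would be stored.
    """
    system = platform.system()  # "Windows", "Darwin" (macOS), or "Linux"
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "os": system,
        "supported_os": system in {"Windows", "Darwin", "Linux"},
        # True for a 64-bit interpreter, which implies a 64-bit OS.
        "python_is_64_bit": sys.maxsize > 2**32,
        "free_gb": round(free_gb, 1),
        "enough_disk": free_gb >= min_free_gb,
    }


report = check_requirements()
for name, value in report.items():
    print(f"{name}: {value}")
```

Treat a passing report as a first filter only: memory and the BIOS virtualization setting still need to be confirmed by hand.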
Successfully completing this Specialization confers a Coursera Specialization Certificate. This is different from the Cloudera Certified Associate (CCA) Data Analyst credential. You can earn the CCA Data Analyst credential by passing a 120-minute performance-based exam. For pricing and other details, see CCA Data Analyst. If you complete this Specialization, including the honors lessons, then you should be well prepared to take the certification exam, but we cannot guarantee that you will pass it and earn the certification credential.
Each course in this Specialization includes a hands-on, peer-graded assignment. To earn the Specialization Certificate, you must earn the Course Certificate for each course in this Specialization. This requires that you successfully complete the hands-on, peer-graded assignment in each course. For this Specialization, there is not a separate Capstone Project like there is in some other Coursera Specializations.
Please go to https://www.coursera.org/enterprise for more information, to contact Coursera, and to pick a plan. For each plan, you decide the number of courses each person can take and hand-pick the collection of courses they can choose from.
This course is completely online, so there’s no need to show up to a classroom in person. You can access your lectures, readings and assignments anytime and anywhere via the web or your mobile device.
If you subscribed, you get a 7-day free trial during which you can cancel without penalty. After that, we don’t give refunds, but you can cancel your subscription at any time. See our full refund policy.
Yes! To get started, click the course card that interests you and enroll. You can enroll and complete the course to earn a shareable certificate, or you can audit it to view the course materials for free. When you subscribe to a course that is part of a Specialization, you’re automatically subscribed to the full Specialization. Visit your learner dashboard to track your progress.
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. If you only want to read and view the course content, you can audit the course for free. If you cannot afford the fee, you can apply for financial aid.
This Specialization doesn't carry university credit, but some universities may choose to accept Specialization Certificates for credit. Check with your institution to learn more.