University of London
Foundations of Data Science: K-Means Clustering in Python

Instructors

Instructor ratings: 4.7 (308 ratings)

We asked all learners to give feedback on our instructors based on the quality of their teaching style.

  • Dr Matthew Yee-King, University of London•21 Courses•422,493 learners
  • Dr Betty Fyn-Sydney, University of London•1 Course•76,030 learners
  • Dr Jamie A Ward, University of London•2 Courses•77,714 learners
  • Dr Larisa Soldatova, University of London•1 Course•76,030 learners

76,030 already enrolled

Included with Coursera Plus

5 modules
Gain insight into a topic and learn the fundamentals.

4.6 (724 reviews)

Beginner level
Recommended experience: You will need mathematical and statistical knowledge and skills at least at high-school level.

Flexible schedule
Approx. 29 hours
Learn at your own pace

95%
Most learners liked this course


What you'll learn

  • Define and explain the key concepts of data clustering.

  • Demonstrate understanding of the key constructs and features of the Python language.

  • Implement in Python the principal steps of the K-means algorithm.

  • Design and execute a whole data clustering workflow and interpret the outputs.
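
As a rough illustration of what "the principal steps of the K-means algorithm" can look like in Python, here is a minimal sketch. It is not taken from the course materials: the function name `k_means`, its parameters and the sample points are all assumptions made for this example, and NumPy is used because it appears in the skills list.

```python
import numpy as np

def k_means(points, k, n_iterations=10, seed=0):
    """Cluster `points` (an n x d array) into k groups.

    Hypothetical helper for illustration; returns (centroids, labels).
    """
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as the starting centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iterations):
        # Step 2: assign each point to its nearest centroid (Euclidean distance).
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 3: move each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Made-up data: two visibly separate groups of 2D points.
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])
centroids, labels = k_means(points, k=2)
print(labels)     # cluster index for each point
print(centroids)  # one row per cluster centre
```

The sketch runs a fixed number of iterations for simplicity; a fuller version would usually stop once the assignments no longer change.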

Skills you'll gain

  • NumPy
  • Machine Learning
  • Machine Learning Algorithms
  • Matplotlib
  • Data Analysis
  • Probability & Statistics
  • Data Manipulation
  • Descriptive Statistics
  • Statistics
  • Pandas (Python Package)
  • Unsupervised Learning
  • Data Visualization
  • Python Programming
  • Data Science

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

39 assignments

Taught in English

There are 5 modules in this course

Organisations all around the world are using data to predict behaviours and extract valuable real-world insights to inform decisions. Managing and analysing big data has become an essential part of modern finance, retail, marketing, social science, development and research, medicine and government.

This MOOC, designed by an academic team from Goldsmiths, University of London, will quickly introduce you to the core concepts of Data Science to prepare you for intermediate and advanced Data Science courses. It focuses on the basic mathematics, statistics and programming skills that are necessary for typical data analysis tasks. You will explore these fundamental concepts through an example data clustering task, and you will use this example to learn the basic programming skills needed to master Data Science techniques. During the course, you will be asked to complete a series of mathematical and programming exercises and a small data clustering project on a given dataset.

This week we will introduce you to the course and to the team who will be guiding you through it over the next 5 weeks. The aim of this week's material is to gently introduce you to Data Science through some real-world examples of where Data Science is used, and also by highlighting some of the main concepts involved.

What's included

9 videos•4 assignments•3 discussion prompts

9 videos•Total 21 minutes
  • Welcome and Introduction•2 minutes
  • Introduction to Data Science•2 minutes
  • What is Data?•1 minute
  • Types of Data•1 minute
  • Machine Learning•3 minutes
  • Supervised vs Unsupervised Learning•2 minutes
  • K-Means Clustering•4 minutes
  • Preparing your Data•1 minute
  • A Real World Dataset•0 minutes
4 assignments•Total 100 minutes
  • Types of Data – Review Information•15 minutes
  • Supervised vs Unsupervised – Review Information•15 minutes
  • K-Means Clustering – Review Information•30 minutes
  • Week 1 Summative Assessment•40 minutes
3 discussion prompts•Total 270 minutes
  • Welcome!•30 minutes
  • Examples of Data•120 minutes
  • Machine Learning in the News•120 minutes

What's included

11 videos•4 readings•10 assignments•1 peer review•1 ungraded lab

11 videos•Total 36 minutes
  • 2.0: Week 2 Introduction•0 minutes
  • 2.1 – Introduction to Mathematical Concepts of Data Clustering•1 minute
  • 2.2 – Mean of One Dimensional Lists•2 minutes
  • 2.3 – Variance and Standard Deviation•3 minutes
  • 2.4 Jupyter Notebooks•6 minutes
  • 2.5 Variables•4 minutes
  • 2.6 Lists•4 minutes
  • 2.7 Computing the Mean•3 minutes
  • 2.8 Better Lists: NumPy•3 minutes
  • 2.9 Computing the Standard Deviation•6 minutes
  • Week 2 Conclusion•0 minutes
4 readings•Total 50 minutes
  • Population vs Sample, Bias•10 minutes
  • Variability, Standard Deviation and Bias•10 minutes
  • Python Style Guide•10 minutes
  • Numpy and Array Creation•20 minutes
10 assignments•Total 122 minutes
  • Population vs Sample – Review Information•5 minutes
  • Mean of One Dimensional Lists – Review Information•3 minutes
  • Variance and Standard Deviation – Review Information•4 minutes
  • Jupyter Notebooks – Review Information•20 minutes
  • Variables – Review Information•10 minutes
  • Lists – Review Information•10 minutes
  • Computing the Mean – Review Information•10 minutes
  • Better Lists – Review Information•10 minutes
  • Computing the Standard Deviation – Review Information•10 minutes
  • Week 2 Summative Assessment•40 minutes
1 peer review•Total 30 minutes
  • Use Jupyter Notebooks•30 minutes
1 ungraded lab•Total 15 minutes
  • Jupyter Notebook Environment•15 minutes
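
For readers who want a feel for the Week 2 topics (mean, variance and standard deviation of a one-dimensional list), here is a minimal sketch in plain Python. The data values are made up for illustration, and the population formula (dividing by n) is an assumption; the course may also cover the sample formula.

```python
import math

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up example data

# Mean: the sum of the values divided by how many there are.
mean = sum(values) / len(values)

# Population variance: the average squared deviation from the mean.
variance = sum((v - mean) ** 2 for v in values) / len(values)

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)  # prints 5.0 4.0 2.0
```

With NumPy, `np.mean(values)` and `np.std(values)` give the same numbers; `np.std` uses the population formula by default (`ddof=0`).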

What's included

16 videos•10 readings•15 assignments

16 videos•Total 52 minutes
  • Week 3 Introduction•0 minutes
  • 3.1 Multidimensional Data Points and Features•2 minutes
  • 3.2 Multidimensional Mean•2 minutes
  • 3.3 Dispersion: Multidimensional Variables•3 minutes
  • 3.4 Distance Metrics•5 minutes
  • 3.5 Normalisation•1 minute
  • 3.6 Outliers•1 minute
  • 3.7 Basic Plotting•2 minutes
  • 3.7a Storing 2D Coordinates in a Single Data Structure•6 minutes
  • 3.8 Multidimensional Mean•4 minutes
  • 3.9 Adding Graphical Overlays•5 minutes
  • 3.10 Calculating the Distance to the Mean•3 minutes
  • 3.11 List Comprehension•3 minutes
  • 3.12 Normalisation in Python•5 minutes
  • 3.13 Outliers and Plotting Normalised Data•2 minutes
  • Week 3 Conclusion•0 minutes
10 readings•Total 120 minutes
  • Multidimensional Data Points and Features Recap•10 minutes
  • Multidimensional Mean Recap•10 minutes
  • Multidimensional Variables Recap•10 minutes
  • Distance Metrics Recap•10 minutes
  • Normalisation Recap•10 minutes
  • Note on Matplotlib•10 minutes
  • Matplotlib Scatter Plot Documentation•20 minutes
  • Matplotlib Patches Documentation•10 minutes
  • List Comprehension Documentation•20 minutes
  • 3.12 Errata•10 minutes
15 assignments•Total 290 minutes
  • Multidimensional Data Points and Features – Review Information•3 minutes
  • Multidimensional Mean – Review Information•3 minutes
  • Dispersion: Multidimensional Variables – Review Information•5 minutes
  • Distance Metrics – Review Information•6 minutes
  • Normalisation – Review Information•3 minutes
  • Outliers – Review Information•30 minutes
  • Basic Plotting – Review Information•5 minutes
  • Storing 2D Coordinates – Review Information•30 minutes
  • Multidimensional Mean – Review Information•30 minutes
  • Adding Graphical Overlays – Review Information•30 minutes
  • Calculating Distance – Review Information•30 minutes
  • List Comprehension – Review Information•30 minutes
  • Normalisation in Python – Review Information•30 minutes
  • Outliers – Review Information•30 minutes
  • Week 3 Summative Assessment•25 minutes
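
The Week 3 topics (multidimensional mean, distance to the mean, list comprehension and normalisation) can be sketched together in a few lines of plain Python. The data points and the helper names `euclidean` and `min_max` are hypothetical, and min-max scaling is only one possible normalisation; the listing above does not say which method the course uses.

```python
import math

points = [(1.0, 10.0), (2.0, 20.0), (3.0, 60.0)]  # made-up 2D data

# Multidimensional mean: average each feature (column) separately.
mean = tuple(sum(p[i] for p in points) / len(points) for i in range(2))

def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# List comprehension: distance from every point to the mean.
distances = [euclidean(p, mean) for p in points]

def min_max(column):
    """Rescale a list of numbers to the range 0..1 (assumes max != min)."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

# Normalise each feature so that both contribute on a comparable scale.
xs = min_max([p[0] for p in points])
ys = min_max([p[1] for p in points])
normalised = list(zip(xs, ys))

print(mean)
print(distances)
print(normalised)
```

Before normalisation the second feature dominates the distances because its values are much larger; after rescaling, both features lie between 0 and 1.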

What's included

8 videos•6 readings•7 assignments•1 peer review

8 videos•Total 36 minutes
  • Week 4 Introduction•0 minutes
  • 4.1: Using the Pandas Library to Read csv Files•5 minutes
  • 4.1a: Sorting and Filtering Data Using Pandas•8 minutes
  • 4.1b: Labelling Points on a Graph•4 minutes
  • 4.1c: Labelling all the Points on a Graph•3 minutes
  • 4.2: Eyeballing the Data•5 minutes
  • 4.3: Using K-Means to Interpret the Data•8 minutes
  • Week 4: Conclusion•0 minutes
6 readings•Total 60 minutes
  • Week 4 Code Resources•5 minutes
  • Pandas Read_CSV Function•15 minutes
  • More Pandas Library Documentation•10 minutes
  • The Pyplot Text Function•10 minutes
  • For Loops in Python•10 minutes
  • Documentation for sklearn.cluster.KMeans•10 minutes
7 assignments•Total 75 minutes
  • Using the Pandas Library to Read csv Files – Review Information•5 minutes
  • Sorting and Filtering Data Using Pandas – Review Information•10 minutes
  • Labelling Points on a Graph – Review Information•5 minutes
  • Labelling all the Points on a Graph – Review Information•5 minutes
  • Eyeballing the Data – Review Information•5 minutes
  • Using K-Means to Interpret the Data – Review Information•5 minutes
  • Week 4 Summative Assessment•40 minutes
1 peer review•Total 60 minutes
  • Create a Labelled Plot of the Happiness Data•60 minutes
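
The Week 4 readings point to `sklearn.cluster.KMeans`. As a hedged sketch of how that class is typically called, the example below clusters a small made-up array; in the course workflow the data would presumably come from a CSV file loaded with pandas, which is left out here so that the example stays self-contained.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2D data standing in for two numeric columns of a real dataset.
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])

# n_clusters is the "K" in K-means; random_state makes the run repeatable.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(model.labels_)           # cluster index assigned to each row of X
print(model.cluster_centers_)  # coordinates of the two cluster centres
```

Which group receives label 0 and which receives label 1 is arbitrary, so it is safer to interpret the clusters through their centres than through the label numbers.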

What's included

9 videos•3 readings•3 assignments•3 peer reviews•5 discussion prompts

9 videos•Total 29 minutes
  • Introduction to Week 5•1 minute
  • 5.1 Can a Machine Detect Fake Notes?•1 minute
  • 5.2 Working for a Client•4 minutes
  • 5.3 How to Organize Work on Your Project•3 minutes
  • 5.4 Dealing With Difficulties•3 minutes
  • 5.5 No Data no Data Science: Introduction of the Dataset•4 minutes
  • 5.6 Modelling•4 minutes
  • 5.7 Presenting the Project Results•3 minutes
  • 5.8 Concluding Remarks•1 minute
3 readings•Total 25 minutes
  • Week 5 Code Resource – the Dataset for our Project•10 minutes
  • Saving plt.scatter Outputs as Figures•10 minutes
  • Additional Recommended Reading for Week 5•5 minutes
3 assignments•Total 44 minutes
  • How Would You Help? – Review Information•10 minutes
  • Python – Review Information•4 minutes
  • Week 5 Summative Assessment•30 minutes
3 peer reviews•Total 180 minutes
  • Exploratory Data Analysis•60 minutes
  • Clustering•60 minutes
  • Your Report•60 minutes
5 discussion prompts•Total 130 minutes
  • What Is Required to Train a Machine to Detect Fake Notes?•40 minutes
  • Your Project Plan•60 minutes
  • Self-reflection•10 minutes
  • Tips for Other Learners•10 minutes
  • Do You have Data Science Plans?•10 minutes

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Offered by

University of London

The University of London is a federal university that includes 17 world-leading colleges. With extensive experience in distance learning since 1858, the University of London has enriched the lives of thousands of students, delivering high-quality degrees across the globe. Today, the University of London is a global leader in flexible study, offering degree programmes to over 45,000 students in over 190 countries and delivering world-leading research across the world. To find out more about the University of London, visit www.london.ac.uk.

Goldsmiths, University of London

Championing research-rich degrees that provoke thought, stretch the imagination and tap into tomorrow’s world, at Goldsmiths we’re asking the questions that matter now in subjects as diverse as the arts and humanities, social sciences, cultural studies, computing, and entrepreneurial business and management. We are a community defined by its people: innovative in spirit, analytical in approach and open to all.

Learner reviews

4.6 (724 reviews)

  • 5 stars: 73.06%
  • 4 stars: 19.61%
  • 3 stars: 4.41%
  • 2 stars: 1.10%
  • 1 star: 1.79%

Showing 3 of 724

TF•5 stars•Reviewed on Jul 11, 2022

I learnt a lot, a very good foundation course. It made me have more interest in learning more in Data Science, particularly using the Python language.

AB•5 stars•Reviewed on Jun 3, 2019

This course is at the right level for a beginner (Python and analytics) while going into details around K-means clustering.

MM•5 stars•Reviewed on Jun 28, 2020

Very interesting course! The lecturers explain concepts thoroughly, which makes the concepts easy to understand even for people without much knowledge in Data Science.

Frequently asked questions

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

  • The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.

  • The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

You will be eligible for a full refund until two weeks after your payment date, or (for courses that have just launched) until two weeks after the first session of the course begins, whichever is later. You cannot receive a refund once you’ve earned a Course Certificate, even if you complete the course within the two-week refund period. See our full refund policy.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.
