1. Prerequisites
  2. Students are expected to have the following background:
    • Knowledge of basic computer science principles and skills, at a level sufficient to write a reasonably non-trivial computer program in Python/numpy. (CS106A, CS106B, or CS106X.)
    • Familiarity with probability theory. (CS 109, MATH151, or STATS 116)
    • Familiarity with multivariable calculus and linear algebra (relevant classes include, but are not limited to, MATH 51, MATH 104, MATH 113, CS 205, CME 100). The Stanford Math 51 course text can be found here.
  3. Friday TA Lectures
  4. To review material from the prerequisites or to supplement the lecture material, additional lectures led by TAs will be held 1:00 - 2:30 every Friday on Zoom. Links to the lectures will be on Canvas. Attending these lectures is optional, but encouraged.
  5. Honor Code
  6. We strongly encourage students to form study groups. Students may discuss and work on homework problems in groups. However, each student (or pair of students, if submitting as a pair) must write down the solution independently and without referring to written notes from the joint session. Each student must understand the solution well enough to reconstruct it on their own. It is an honor code violation to copy, refer to, or look at written or code solutions from a previous year, including but not limited to: official solutions from a previous year, solutions posted online, and solutions you or someone else may have written up in a previous year. Furthermore, it is an honor code violation to post your assignment solutions online, such as on a public git repo. We run plagiarism-detection software on your code against past solutions as well as student submissions from previous years. Please take the time to familiarize yourself with the Stanford Honor Code and the Stanford Honor Code as it pertains to CS courses.
  7. Course Materials
  8. There is no required text for this course. Notes will be posted periodically on the class syllabus.
  9. Ed and Gradescope
  10. We use Ed for Q&A and Gradescope for assignment submission. Access to Ed and Gradescope will be granted after you enroll in the class, as we periodically synchronize with the official course roster.
  11. Grading
  12. There will be four assignments, one midterm, and a final project. The assignments will contain written questions and questions that require some Python programming. The grading breakdown is as follows: assignments are collectively worth 45%, the midterm is worth 15%, and the final project is worth 40%. The assignments are weighted by their respective point values - for example, if {p1, p2, p3, p4} denotes the point values of the four assignments, then HW1 is worth p1 / (p1+p2+p3+p4) * 45% of the total grade. This quarter's grading basis is letter grade or CR/NC. Please make sure on Axess that you are enrolled with your desired grading basis. We highly encourage students to answer each other's questions on Ed. To incentivize this, we will be giving bonuses (applied after grade cutoffs have been determined) for sustained and helpful contributions; see Ed for the specific details.
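    As a rough illustration of the weighting rule above, the sketch below computes each assignment's share of the total grade. The point values are hypothetical placeholders, not the actual point totals of this quarter's assignments, and the function name is made up for this example.

```python
def assignment_weights(points, assignments_share=0.45):
    """Return each assignment's share of the total grade.

    points: point values [p1, p2, p3, p4] (hypothetical numbers in this example).
    assignments_share: fraction of the grade carried by all assignments (45%).
    """
    total = sum(points)
    return [p / total * assignments_share for p in points]


# Hypothetical point values, for illustration only.
weights = assignment_weights([100, 120, 80, 100])
for i, w in enumerate(weights, start=1):
    print(f"HW{i}: {w:.2%} of the total grade")
```

    Whatever the point values are, the four shares always add up to 45%.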
  13. Submitting Assignments
  14. To limit access to the assignments to only enrolled students and to avoid having the solutions show up online publicly, the assignments will only be posted on Ed (not on the course website or Canvas). Assignments will be submitted through Gradescope. You will receive an invite to Gradescope for CS229 Machine Learning Fall 2021. If you have not received an invite email after the first few days of class, first log in to Gradescope with your @stanford.edu email and see whether you find the course listed; if not, please post a private message on Ed for us to add you. All assignments must be submitted individually.

    This quarter, as in some past quarters of 229, we are allowing pair submissions for homeworks. You may submit solo or with a partner (in which case please submit only once on Gradescope, and add their name). This is unrestricted; you may do some or all assignments solo, or with the same (or different) partners.

    If you work with a partner, you must be sure you both understand all written material / code submitted. Don't just divide up the work; this will potentially put you at a disadvantage on the midterm and elsewhere in your AI/ML career! We strongly recommend working through each problem together.

    Regardless of whether you work solo or with a partner, you may discuss the homeworks at a high level with other students in the class. Talking through approaches is OK, but you may not e.g. directly trade answers; in particular, you must not look at any written work or code from anyone but your partner (if any).

  15. Late Assignments
  16. Each student will have a total of three free late (calendar) days to use for homeworks and the project proposal and milestone. Students cannot use late days on the project final report or poster. Once these late days are exhausted, any assignments turned in late will be penalized 20% per late day. However, no assignment will be accepted more than three days after its due date. Each 24 hours or part thereof that a homework is late uses up one full late day. Please note that late days are applied individually.
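    The rule that each 24 hours or part thereof counts as one full late day amounts to rounding up. A minimal sketch, assuming timestamps are compared as plain datetimes; the dates and function names below are hypothetical, and the sketch does not model how the 20% penalty is applied:

```python
import math
from datetime import datetime

SECONDS_PER_DAY = 24 * 60 * 60


def late_days_used(due, submitted):
    """Number of late days consumed: every started 24-hour period counts."""
    seconds_late = (submitted - due).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / SECONDS_PER_DAY)


def is_accepted(due, submitted, max_days=3):
    """No assignment is accepted more than three days after its due date."""
    return late_days_used(due, submitted) <= max_days


# Hypothetical due date, for illustration only.
due = datetime(2021, 10, 6, 23, 59)
ten_minutes_late = late_days_used(due, datetime(2021, 10, 7, 0, 9))
just_over_one_day = late_days_used(due, datetime(2021, 10, 8, 0, 0))
print(ten_minutes_late, just_over_one_day)
```

    Under this reading, a submission ten minutes past the deadline already uses one late day, and one minute past the 24-hour mark uses two.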
  17. Lecture Video Policy
  18. Lectures will be livestreamed on Zoom webinar. Please find the Zoom webinar link on Ed or the course Canvas page. You will need to sign in with your Stanford credentials to join the lecture. All lectures this quarter are recorded and will be posted on Canvas soon after the lecture is given. For your convenience, you can access these recordings by logging into the course Canvas site. These recordings might be reused in other Stanford courses, viewed by other Stanford students, faculty, or staff, or used for other education and research purposes. Note that while the cameras are positioned with the intention of recording only the instructor, occasionally a part of your image or voice might be incidentally captured. If you have questions, please contact a member of the teaching team.
  19. Midterm Policy
  20. The teaching staff will provide more details on the exam once it is finalized.
  21. Online Office hours
  22. The office hour schedule will be posted on the course Canvas page. We will be using Nooks to hold remote office hours this quarter.
  23. Incomplete Requests from Previous Quarter
  24. If you have an Incomplete from a previous quarter and you wish to complete the course this quarter, please contact Christopher Wolff (cw0@stanford.edu) to notify us that you would like to complete CS229 this quarter.


  1. Difference between 3 and 4 units
  2. The class can be taken for 3 or 4 units by undergraduate and graduate students. There is no difference in workload between them. We set it up this way mainly to give people more flexibility, and you're welcome to pick either. We generally encourage students to register for 4, but if you'd rather do 3 for any reason (such as if you have a cap on your number of units), registering for 3 is fine too (you do not need to ask for approval). Also note that SCPD students may be required to take the course for 4 units; please check with SCPD.
  3. Is this the same class as the free machine learning class?
  4. No, that is a different class, which does not confer Stanford academic credit. You can learn more about it at www.ml-class.org.
  5. When will solutions for problem sets be released?
  6. Solutions will be released after problem sets have been graded and around the same time as grades are published. For HW0, solutions will be released soon after the submission deadline.
  7. Can I take courses that overlap with CS229?
  8. Yes, lectures will be recorded. If you require an instructor’s signature, please reach out to course staff on Ed and we’ll be able to help get you the signatures.
  9. Why am I seeing an outdated webpage with information from previous quarters?
  10. We try our best to keep the website up-to-date starting from a few days before the quarter starts. You might want to force reload the page and override local cache:
      On Mac, use Command + Shift + R.
      On Windows/Linux, use Ctrl + Shift + R.
  11. How should I ask TAs to help me debug my code?
  12. Please note that the teaching staff will not debug code longer than 2-3 lines via Ed. Learning to debug is a critical skill for software programmers, and remote requests for help with code usually end up with the teaching staff giving you the answer rather than you learning how to find the answer.
    Moreover, since programming at the level of CS106A/B is a prerequisite for this course and the course’s focus is on machine learning techniques rather than coding, the TAs are discouraged from helping you look at and debug large blocks of your code during the office hours. The TAs are also generally discouraged from helping debug compilation errors.
    The best way to use office hours and ask TAs coding questions is as follows:
    • You should come to office hours having done your own legwork and ruled out basic logical errors. Identify the place where the error is suspected to come from by doing ablation studies. (Please see below for some common debugging tips.)
    • During office hours, you should articulate what your goals are, what you have tried and observed in your experiments, what you think might be the problem, and what advice you need to move forward.
    • The TAs will mostly help you by looking at and analyzing the outputs of your code instead of looking at the original code. Typical advice that the TAs might offer would be to ask you to do more analytical or ablation studies about your code. For example, when you observe that your test error doesn’t decrease when training for longer, the TAs might ask you to check if your training error decreases. If your training error does not decrease, then the TAs might ask you to check if the gradient of your algorithm is implemented correctly.
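    One common way to check whether a gradient is implemented correctly is to compare it against a finite-difference approximation. The sketch below is only an illustration under stated assumptions: the least-squares loss, the random data, and all function names are hypothetical and are not taken from any course assignment.

```python
import numpy as np


def loss(theta, X, y):
    """Half mean squared error of a linear model (a hypothetical example loss)."""
    residual = X @ theta - y
    return 0.5 * np.mean(residual ** 2)


def grad(theta, X, y):
    """Analytic gradient of `loss` with respect to theta."""
    return X.T @ (X @ theta - y) / len(y)


def numerical_grad(f, theta, eps=1e-6):
    """Central finite-difference approximation of the gradient of f at theta."""
    approx = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        approx[i] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return approx


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
theta = rng.normal(size=3)

analytic = grad(theta, X, y)
numeric = numerical_grad(lambda t: loss(t, X, y), theta)
max_abs_diff = np.max(np.abs(analytic - numeric))
print(f"max abs difference: {max_abs_diff:.2e}")
```

    If the two gradients disagree by much more than rounding error, the analytic gradient (or the loss itself) is the first place to look.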
    Here are some common debugging strategies that might be useful (courtesy of CS221):
    • Construct small test cases that you have worked through by hand and see if your code matches the manual solution.
    • Spend some time understanding exactly what the test cases are doing and what outputs they are expecting from your code.
    • If possible, write your code in small chunks and test that each part is doing exactly what you expect.
    • pdb is Python's built-in debugger. It is very helpful and allows you to set breakpoints. You can set a breakpoint with the following line: import pdb; pdb.set_trace()
    • Printing the state of your computation frequently can help you make sure that things are working as expected and can help you narrow down which portion of your code is causing the bug you are seeing, e.g. print("var1 has current value: {}".format(var1))
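    The strategies above can be combined in a few lines. This is a minimal sketch; the sigmoid function is just a hypothetical stand-in for whatever small piece of code you are testing.

```python
import math


def sigmoid(z):
    """Logistic function, used here only as a small piece of code to test."""
    return 1.0 / (1.0 + math.exp(-z))


# Hand-worked case: sigmoid(0) = 1 / (1 + 1) = 0.5.
assert math.isclose(sigmoid(0.0), 0.5)
# Property worked out by hand: sigmoid(-z) = 1 - sigmoid(z).
assert math.isclose(sigmoid(-2.0), 1.0 - sigmoid(2.0))

# Print intermediate state to confirm values look as expected.
var1 = sigmoid(1.0)
print("var1 has current value: {}".format(var1))

# To pause here and inspect variables interactively, uncomment:
# import pdb; pdb.set_trace()
```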
    Debugging tips for timeouts:
    • Set operations in general are pretty slow, so if you have any, see if you can do them some other way.
    • Check if all loops / linear operations are necessary. For example, when searching through a list for a specific item, sometimes you can make that constant time by giving each item an ID (say 0, 1, 2, 3) and then using a dictionary as a cache (although sometimes you just have to live with the cost).
    • If you have a specific helper function you’re calling a lot, see if there’s anything in there you can optimize.
    • Vectorize your code! Nested for loops are too slow to solve some of the problems in this course. Convert your code to matrix operations so that numpy can run as efficiently as possible.
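    To make the dictionary and vectorization tips concrete, here is a small sketch. The item list, the arrays, and the pairwise squared-distance task are hypothetical examples chosen for illustration, not problems from the course.

```python
import numpy as np

# Dictionary as a cache: build the item -> ID map once, then look up in O(1)
# on average instead of scanning the list with items.index(...) each time.
items = ["a", "b", "c", "d"]  # hypothetical items
index_of = {item: i for i, item in enumerate(items)}
assert index_of["c"] == items.index("c")


def pairwise_sq_dists_loop(A, B):
    """Squared Euclidean distances between rows of A and B, with nested loops."""
    out = np.zeros((A.shape[0], B.shape[0]))
    for i in range(A.shape[0]):
        for j in range(B.shape[0]):
            out[i, j] = np.sum((A[i] - B[j]) ** 2)
    return out


def pairwise_sq_dists_vectorized(A, B):
    """Same result via matrix operations: ||a||^2 - 2 a.b + ||b||^2."""
    sq_a = np.sum(A ** 2, axis=1)[:, None]  # shape (n, 1)
    sq_b = np.sum(B ** 2, axis=1)[None, :]  # shape (1, m)
    return sq_a - 2 * (A @ B.T) + sq_b


rng = np.random.default_rng(0)
A = rng.normal(size=(50, 4))
B = rng.normal(size=(30, 4))
same = np.allclose(pairwise_sq_dists_loop(A, B), pairwise_sq_dists_vectorized(A, B))
print("loop and vectorized results agree:", same)
```

    Checking the vectorized version against the slow loop on a small input is a cheap way to confirm that the rewrite did not change the result.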
    Other debugging tips:
    • If you don’t know what type a variable is, use type(.)
    • If you are running into issues where “None” pops up, a function may not be returning what you are expecting.
    • For indexing into lists: example_list[a:b] is INCLUSIVE for a but EXCLUSIVE for b
    • If a function has optional arguments, make sure you are feeding in the proper arguments in the proper places (very easy to mess up)
    • Since Python 3.6, you can use f-strings for printing debug messages, rather than str.format.
    • Because of broadcasting and other implicit operations, it's useful to assert shapes of np arrays (and tensors for deep learning) after each operation that can change the shape.
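    A short sketch of the slicing, f-string, and shape-assertion tips above. The list and array contents are hypothetical and chosen only to keep the example small.

```python
import numpy as np

# Slicing is inclusive of the start index and exclusive of the end index.
example_list = [10, 20, 30, 40, 50]
a, b = 1, 3
print(f"example_list[{a}:{b}] = {example_list[a:b]}")  # prints example_list[1:3] = [20, 30]

X = np.ones((5, 3))
w = np.ones(3)
y = np.ones((5, 1))

pred = X @ w
assert pred.shape == (5,), f"unexpected shape {pred.shape}"

# Pitfall: (5,) minus (5, 1) silently broadcasts to (5, 5) instead of raising.
wrong = pred - y
assert wrong.shape == (5, 5)

# Making the shapes match explicitly gives the intended elementwise result.
residual = pred - y.reshape(-1)
assert residual.shape == (5,), f"unexpected shape {residual.shape}"
```

    The assertion on the broadcast result shows why a shape check right after the subtraction would catch this kind of bug early.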