W Randolph Franklin home page

Quick links


  1. Lectures
  2. Homeworks: Homework 1, Homework 2, Homework 3, Homework 4, Homework 5, Homework 6.

1.  Course content

1.1  Catalog description

A computer engineering course. Engineering techniques for parallel processing. Providing the knowledge and hands-on experience in developing applications software for processors on inexpensive widely-available computers with massively parallel computing resources. Multithread shared memory programming with OpenMP. NVIDIA GPU multicore programming with CUDA and Thrust. Using NVIDIA gaming and graphics cards on current laptops and desktops for general purpose parallel computing using linux.

Mon and Thurs 4-5:20. 3 credits.

Instructor: W. Randolph Franklin.


This is a new experimental course to provide students with knowledge and hands-on experience in developing applications software for processors with massively parallel computing resources. Specifically, this course will target NVIDIA GPUs because of their low cost (useful gaming cards cost only a few hundred dollars), and ubiquity (a majority of modern desktops and laptops have NVIDIA GPUs). The techniques learned here will also be applicable to larger parallel machines -- number 2 on the top 500 list has 18,688 NVIDIA GPUs.

Effectively programming these processors will require in-depth knowledge about parallel programming principles, as well as the parallelism models, communication models, and resource limitations of these processors. The target audiences of the course are students who want to develop exciting applications for these processors, as well as those who want to develop programming tools and future implementations for these processors.

Students will learn tools such as OpenMP, CUDA, and Thrust via extensive programming work.

The target audiences are ECSE seniors and others with comparable background who wish to develop parallel software.

Prereq: ECSE-2660 CANOS or equivalent, knowledge of C++, access to a computer with NVIDIA GPU running CUDA 2.1 or newer.

This course will draw on Programming Massively Parallel Processors with CUDA.

1.2  Why (Not) To Take This Course

Since you're spending a lot of money to take this course, you need to know some keys to success (or the alternative). Here are some indications that other courses might be a better fit.

  1. You don't like programming.
  2. You don't like documenting your programs.
  3. You don't like math.
  4. You don't like reading.

OTOH, here are some reasons that you might prefer to take a course from me.

  1. I acknowledge that you are simultaneously taking several courses, and so try to make the workload fair. E.g., if you're taking 6 3-credit courses, then you should not be required to spend more than $\frac{168}{6}$ hours per week per course :-).
  2. I acknowledge that you're paying a lot of money for this course, and try to provide value. One such way is by digesting a lot of badly organized online material to give you the best.
  3. I keep the course up-to-date and relevant.

2.  Instructor

2.1  Professor

W. Randolph Franklin. BSc (Toronto), AM, PhD (Harvard)

I've been programming since the 1960s, and parallel programming since the 1980s. I graduated two PhD students in parallel computational geometry around 1990. I've been at RPI since 1978, apart from several absences, including a year at Berkeley, 3 months at Genoa (Italy), and shorter times at Laval University in Quebec City (Canada), the Commonwealth Scientific and Industrial Research Organization in Canberra (Australia), and the National University of Singapore. I also spent 2 years 7 months as Director of the Numeric, Symbolic, and Geometric Computation Program at the National Science Foundation, recommending how to spend about $30M of your tax dollars (thanks!).

A recent funded research project is (together with Cutler and Zimmie) modeling how levees fail when overtopped during a flood.

Another recently completed research project, on representing terrain elevation, compressing it, and siting observers and planning paths on it, was largely supported by the Defense Advanced Research Projects Agency. DARPA people are crazy. My main worry was that I wasn't crazy enough for them.

This last summer, I was sponsored by FAPEMIG, the science funding agency of the state of Minas Gerais in Brazil, to spend a month in Brazil working with researchers at various universities.

I also like to examine terrain on foot; in summer 2008 I walked 164km, including 11km up, from Chamonix to Zermatt, in 12 days. I spent July 2009 visiting universities in Brazil, with a few days kayaking down a tributary of the Amazon, sleeping in a hammock tied to trees, and hiking for hours through the jungle.

Office Jonsson Engineering Center (JEC) 6026
Phone +1 (518) 276-6077 (forwards)
Email mail@wrfranklin.org

Writing from an account showing your name, at least in the comment field, and prefixing the subject with a hashtag of #par are helpful. GPG is welcomed.

Web http://wrfranklin.org/
Office hours After each lecture, usually as long as anyone wants to talk. Also by appointment.
Informal meetings If you would like to lunch with me, either individually or in a group, just mention it. We can then talk about most anything.
Preferred communication medium Email.

3.  Course material

3.1  Text

Sanders and Kandrot, CUDA by Example. It gets excellent reviews. I didn't list it with the RPI bookstore because Amazon has so many options, including Kindle and renting hardcopies.

3.2  Web

There is a lot of free material on the web, which I'll reference. My local cache is here.

4.  Computer systems used

This course will use your personal computer that runs 64-bit linux and has a recent Nvidia GPU.

You may also have remote access to my lab computer, which has dual 8-core Xeons with 128GB of memory and K20x and K5000 Nvidia processors.

5.  Relation to other RPI courses

I try not to duplicate existing RPI courses. Parallel computing is such a large topic that there is room for many courses. You may usefully take all the parallel courses at RPI.

The unique features of this course are as follows:

  1. Use of only the Nvidia GPU. (As of now) this is the most widely used and least expensive parallel platform (although Intel looks promising).
  2. Emphasis on learning several programming packages, at the expense of theory. However, you will learn a lot about parallel architecture.

6.  LMS

RPI LMS (formerly WebCT) will be used only for you to submit homeworks and for me to distribute grades.

Announcements and the homeworks themselves will be available on this website.

7.  Times & places

Mon & Thurs, 4-6pm, in JEC4107 (lectures).

8.  Lecture summaries and announcements

will be posted here.

9.  Assessment measures, i.e., grades

  1. There will be no exams.
  2. The grade will be based on homeworks, a term project, and class presentations.

9.1  Homeworks

There will be frequent homeworks. You are encouraged to do the homework in teams of 2, and submit one solution per team, on RPILMS, in any reasonable format. The other team member should submit only a note listing the team and saying who submitted the solution.

"Reasonable" means a format that I can read. A scan of neat handwriting is acceptable. I would type material with a wiki like pmwiki or blogging tool, sketch figures with xournal or draw them with inkscape, and do the math with mathjax. Your preferences are probably different.

9.2  Term project

  1. For the latter part of the course, most of your homework time will be spent on a term project.
  2. You are encouraged to do it in teams of up to 3 people. A team of 3 people would be expected to do twice as much work as 1 person.
  3. You may combine this with work for another course, provided that both courses know about this and agree. I always agree.
  4. You may build on existing work, either your own or others'. You have to say what's new, and have the right to use the other work. E.g., using any GPLed code or any code on my website is automatically allowable (because of my Creative Commons licence).
  5. You will implement, demonstrate, and document something vaguely related to parallel computing.
  6. You will give a 10 minute fast forward Powerpoint talk in class. A fast forward talk is a timed Powerpoint presentation, where the slides advance automatically.
  7. You may demo it to me if you wish.

Size of term project

It's impossible to specify how many lines of code make a good term project. E.g., I take pride in writing code that can be simultaneously shorter, more robust, and faster than some others. See my 8-line program for testing whether a point is in a polygon here.
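To illustrate how short a useful routine can be, here is a minimal C++ sketch of the standard crossing-number (ray-casting) point-in-polygon test. It is an illustration only, not a copy of the program linked above; the names Point and point_in_polygon are made up for this example, and points lying exactly on the boundary may be classified either way.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical types and names, chosen only for this sketch.
struct Point { double x, y; };

// Crossing-number test: cast a ray from p in the +x direction and
// toggle "inside" each time the ray crosses a polygon edge.
// An odd number of crossings means p is inside.
bool point_in_polygon(const std::vector<Point>& poly, Point p) {
    bool inside = false;
    const std::size_t n = poly.size();
    // j trails i by one vertex, so (poly[i], poly[j]) walks every edge,
    // including the closing edge from the last vertex to the first.
    for (std::size_t i = 0, j = n - 1; i < n; j = i++) {
        const Point& a = poly[i];
        const Point& b = poly[j];
        // First condition: the edge straddles the horizontal line y = p.y
        // (this also guarantees a.y != b.y, so the division is safe).
        // Second condition: the crossing lies strictly to the right of p.
        if ((a.y > p.y) != (b.y > p.y) &&
            p.x < (b.x - a.x) * (p.y - a.y) / (b.y - a.y) + a.x) {
            inside = !inside;
        }
    }
    return inside;
}
```

For the unit square with vertices (0,0), (1,0), (1,1), (0,1), the function reports (0.5, 0.5) as inside and (2, 0.5) as outside. Each query point is independent of the others, so a batch of queries is the kind of loop that might be parallelized with the tools taught in this course.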

According to Big Blues, when Bill Gates was collaborating with IBM around 1980, he once rewrote a code fragment to be shorter. However, according to the IBM metric, number of lines of code produced, he had just caused that unit to officially do negative work.


Deliverables

  1. An implementation showing parallel computing.
  2. An extended abstract or paper on your project, written up like a paper.
  3. A more detailed manual, showing how to use it.
  4. A talk in class.

9.3  Correcting the Prof's errors

Occasionally I make mistakes, either in class or on the web site. The first person to correct each nontrivial error will receive an extra point on his/her grade. One person may accumulate several such bonus points.

9.4  Grade distribution & verification

  1. I'll post homework grading comments on LMS. We'll return graded midterm exams in class. Please report any errors, disagreements, or appeals by email within one week.
  2. From time to time, I'll post your grades to LMS. Please report any missing grades within one week to the TA, with a copy to the prof.
  3. It is not allowed to wait until the end of the semester, and then go back 4 months to try to find extra points. It is especially not allowed to wait until the end of the following semester, and then to ask what you may do to raise your grade.
  4. I maintain standards (and the value of your diploma) by giving the grades that are earned, not the grades that are desired. Nevertheless, this course's average grade is competitive with other courses.
  5. Appeal first to me (WRF), then to any other prof in ECSE acting as a mediator (such as Prof Wozny, the curriculum chair), and then to the ECSE Head. It is preferable to state your objection in writing.

9.5  Mid-semester assessment

Before the drop date, I will email you your performance to date.

10.  Academic integrity

  1. See the Student Handbook for the general policy. Specifics for this course are as follows.
  2. You may collaborate on homeworks, but each team of 1 or 2 people must write up the solution separately (one writeup per team) using their own words. We willingly give hints to anyone who asks.
  3. The penalty for two teams handing in identical work is a zero for both.
  4. You may get help from anyone for the term project. You may build on a previous project, either your own or someone else's. However you must describe and acknowledge any other work you use, and have the other person's permission, which may be implicit. E.g., my web site gives a blanket permission to use it for nonprofit research or teaching. You must add something creative to the previous work. You must write up the project on your own.
  5. However, writing assistance from the Writing Center and similar sources is allowed, if you acknowledge it.
  6. The penalty for plagiarism is a zero grade. The most common form of plagiarism is copying passages from other documents without acknowledging them.

11.  Student feedback

Since it's my desire to give you the best possible course in a topic I enjoy teaching, I welcome feedback during (and after) the semester. You may tell me, or contact a third party, such as Prof Wozny, if you wish anonymity. If you feel that I am treating you unjustly, you may complain to Prof Wozny, or to Prof Boyer, the Head of ECSE, and then to people up the hierarchy.

12.  The formatting of this website

Please report any problems viewing my online material. The goal is to make my pages legible on everything from a hi-res monitor to a smart phone, on every major browser. However, this may not be completely possible. (E.g., an early version of Chrome crashed on my website. Early versions of IE rendered very slowly.) I may not know about a failure unless someone reports it. Thanks.