PAR Syllabus

Titles:

1. ECSE-4740-01 Applied Parallel Computing for Engineers, CRN 39207
2. ECSE-6967-01 Parallel Computing for Engineers, CRN 39429

Spring term annually. 3 credit hours. Mon and Thurs 4-5:20, JEC 4104.

2   History

The 4000-level version of this course has been run twice before as the experimental course ECSE-4965-01. The experiment was deemed a success, so this is now a permanent course listed in the catalog as ECSE-4740-01.

There is also now an affiliated 6000-level topics course. The lectures are in common, but the 6000-level course requires more work.

3   Description

1. This is intended to be a computer engineering course to provide students with knowledge and hands-on experience in developing applications software for affordable parallel processors. This course will cover hardware that any lab can afford to purchase. It will cover the software that, in the prof's opinion, is the most useful. There will also be some theory.
2. The target audiences are ECSE seniors and grads and others with comparable background who wish to develop parallel software.
3. This course will have minimal overlap with parallel courses in Computer Science. We will not teach the IBM BlueGene, because it is so expensive, nor cloud computing and MPI, because most big data problems are in fact small enough to fit on our hardware.
4. You may usefully take all the parallel courses at RPI.
5. The unique features of this course are as follows:
1. Use of only affordable hardware that any lab might purchase, such as Nvidia GPUs. This is currently the most widely used and least expensive parallel platform.
2. Emphasis on learning several programming packages, at the expense of theory. However, you will learn a lot about parallel architecture.
6. Hardware taught, with reasons:
Multicore Intel Xeon: universally available and inexpensive, comparatively easy to program, powerful.
Intel Xeon Phi: affordable, potentially powerful, somewhat harder to program.
Nvidia GPU accelerator: widely available (Nvidia external graphics processors are on 1/3 of all PCs), very inexpensive, powerful, but harder to program. Good cards cost only a few hundred dollars.
7. Software that might be taught, with reasons:
OpenMP C++ extension: widely used, easy to use if your algorithm is parallelizable, backend is multicore Xeon.
Thrust C++ functional programming library: FP is nice, hides low level details, backend can be any major parallel platform.
MATLAB: easy to use parallelism for operations that Mathworks has implemented in parallel, etc.
Mathematica: interesting, powerful front end.
CUDA C++ extension and library for Nvidia:
8. The techniques learned here will also be applicable to larger parallel machines -- number 3 on the top 500 list uses NVIDIA GPUs, while number 2 uses Intel Xeon Phis. (Number 4 is a BlueGene.)
9. Effectively programming these processors will require in-depth knowledge about parallel programming principles, as well as the parallelism models, communication models, and resource limitations of these processors.
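
To give a flavor of the OpenMP style listed among the software above, here is a minimal sketch of a parallel dot product. The function name `dot` and the choice of example are illustrative assumptions, not course material. The `reduction` clause gives each thread a private partial sum and combines them at the end of the loop.

```cpp
#include <cassert>
#include <vector>

// Dot product of two equal-length vectors (hypothetical example function).
// When compiled with OpenMP enabled (e.g., g++ -fopenmp), the loop
// iterations are divided among threads. Without OpenMP the pragma is
// ignored and the loop simply runs serially.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());  // precondition: same length
    const long n = static_cast<long>(a.size());
    double sum = 0.0;
    // reduction(+:sum): each thread accumulates a private copy of sum,
    // and the copies are added together after the loop.
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Calling `dot(a, b)` from an ordinary `main` is all that is needed; the same source builds with or without the OpenMP flag, which is part of why OpenMP is comparatively easy to adopt when the loop iterations are independent.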

4   Prerequisite

ECSE-2660 CANOS or equivalent, knowledge of C++.

5   Why (Not) To Take This Course

Other courses might be a better fit if:

1. You don't like programming.
2. You don't like documenting your programs.
3. You don't like math.
5. You don't like writing exams at the official scheduled times. The final exam may be as late as Dec 20.

OTOH, here are some reasons that you might prefer to take a course from me.

1. I teach stuff that's fun and useful.
2. I acknowledge that you are simultaneously taking several courses, and so try to make the workload fair. E.g., if you're taking six 3-credit courses, then you should not be required to spend more than $\frac{168}{6}$ hours per week per course :-).
3. I try to base exam questions more on important topics that occupied a lot of class time, and which are described in writing, often on this wiki.
4. I keep the course up-to-date and relevant.

6   Instructors

6.1   Professor

W. Randolph Franklin. BSc (Toronto), AM, PhD (Harvard)

6.2   Teaching assistant

1. Yin Li, liyN@rpi.edu, replacing N with 2*17 (5 hrs/wk)
2. Office hours:
1. ECSE Flip Flop lounge in JEC 6037.
2. Fri 4pm
3. Come near the start of the time; if there is no one there he may leave.
4. He will try to stay as long as there are students asking questions, but will leave after 15 minutes if no one has arrived.
5. If you need more time, or a different time, then write, and he will try to accommodate you.
6. He also attends most lectures.

7   Course websites

The homepage has lecture summaries, syllabus, homeworks, etc.

There is a separate page for important dates.

8.1   Text

There is no required text, but the following inexpensive books may be used.

1. Sanders and Kandrot, CUDA by Example. It gets excellent reviews, although it is several years old. Amazon has many options, including Kindle and renting hardcopies.
2. Kirk and Hwu, Programming Massively Parallel Processors, 2nd edition. It concentrates on CUDA.

One problem is that even recent books may be obsolete. For instance, they may ignore the recent CUDA unified memory model, which simplifies CUDA programming at a performance cost. Even if the current edition of a book was published after unified memory was released, the author might not have updated the examples.

8.2   Web

There is a lot of free material on the web, which I'll reference, and may cache locally.

9   Computer systems used

This course will use (remotely via ssh) parallel.ecse.rpi.edu. It has:

1. a dual 14-core Intel Xeon E5-2660 2.0GHz
2. 256GB of DDR4-2133 ECC Reg memory
3. Nvidia GeForce GTX 1080 processor with 8GB
4. Intel Xeon Phi 7120A
5. Samsung Pro 850 1TB SSD
6. WD Red 6TB 6Gb/s hard drive
7. CentOS
8. CUDA
9. OpenMP 4.0
10. Thrust

You may also use other compatible computers.

10   LMS

RPI LMS (formerly WebCT) will be used only for you to submit homeworks and for me to distribute grades.

Announcements and the homeworks themselves will be available on this website. You do not have to log in to see them.

11   Class times & places

Lectures are Mon & Thurs, 4-6pm, in JEC4107.

12   Lecture summaries and announcements

will be posted here.

1. There will be no exams.

2. The grade will be based on homeworks, a term project, class presentations, and possible iclicker questions TBD.

3. Deliverables for the term project:

1. A 2-minute project proposal given to the class around the middle of the semester.
2. A 5-minute project presentation given to the class in the last week.
3. Some progress reports.
4. A write-up uploaded on the last class day. This will contain an academic paper, code and perhaps video or user manual.

13.1   Homeworks

There will be frequent homeworks. You are encouraged to do the homework in teams of 2, and submit one solution per team, on RPI LMS, in any reasonable format. The other team member should submit only a note listing the team and saying who submitted the solution.

"Reasonable" means a format that I can read. A scan of neat handwriting is acceptable. I would type material with a wiki like pmwiki or blogging tool, sketch figures with xournal or draw them with inkscape, and do the math with mathjax. Your preferences are probably different.

13.2   Term project

1. For the latter part of the course, most of your homework time will be spent on a term project.
2. You are encouraged to do it in teams of up to 3 people. A team of 3 people would be expected to do twice as much work as 1 person.
3. You may combine this with work for another course, provided that both courses know about this and agree. I always agree.
4. If you are a grad student, you may combine this with your research, if your prof agrees, and you tell me.
5. You may build on existing work, either your own or others'. You have to say what's new, and have the right to use the other work. E.g., using any GPLed code or any code on my website is automatically allowable (because of my Creative Commons licence).
6. You will implement, demonstrate, and document something vaguely related to parallel computing.
7. You will give a 5-minute fast-forward PowerPoint talk in class. A fast-forward talk is a timed PowerPoint presentation, where the slides advance automatically.
8. You may demo it to the TA if you wish.

13.2.1   Size of term project

It's impossible to specify how many lines of code make a good term project. E.g., I take pride in writing code that can be simultaneously shorter, more robust, and faster than some others. See my 8-line program for testing whether a point is in a polygon: Pnpoly.
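
As an illustration of how short such a test can be, here is a minimal sketch of the ray-crossing (even-odd) approach to the point-in-polygon problem. It is not a copy of the Pnpoly program itself; the function name, signature, and use of `std::vector` are assumptions made for this example. The idea is to shoot a horizontal ray from the test point and toggle an inside/outside flag each time the ray crosses a polygon edge.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if (px, py) is inside the polygon whose vertices are
// (vx[i], vy[i]) listed in order. Points lying exactly on an edge may be
// classified either way. (Hypothetical helper written for illustration.)
bool point_in_polygon(const std::vector<double>& vx,
                      const std::vector<double>& vy,
                      double px, double py) {
    assert(vx.size() == vy.size());  // precondition: one y per x
    bool inside = false;
    const std::size_t n = vx.size();
    // Edge (j, i) runs from vertex j to vertex i; j trails i by one,
    // starting with the closing edge from the last vertex to the first.
    for (std::size_t i = 0, j = n - 1; i < n; j = i++) {
        // First test: the edge straddles the horizontal line y = py
        // (this also guarantees vy[j] != vy[i], so no division by zero).
        // Second test: the crossing point lies to the right of px.
        if (((vy[i] > py) != (vy[j] > py)) &&
            (px < (vx[j] - vx[i]) * (py - vy[i]) / (vy[j] - vy[i]) + vx[i])) {
            inside = !inside;  // each crossing flips the parity
        }
    }
    return inside;
}
```

For the unit square with vertices (0,0), (1,0), (1,1), (0,1), the point (0.5, 0.5) is reported inside and (1.5, 0.5) outside. The loop body is a single conditional, which is the sense in which a short program can still be robust.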

According to Big Blues, when Bill Gates was collaborating with IBM around 1980, he once rewrote a code fragment to be shorter. However, according to the IBM metric, number of lines of code produced, he had just caused that unit to officially do negative work.

13.2.2   Deliverables

1. An implementation showing parallel computing.
2. An extended abstract or paper on your project, written up like a paper. You should follow the style guide for some major conference (I don't care which, but can point you to one).
3. A more detailed manual, showing how to use it.
4. A talk in class.

A 10-minute demonstration to the TA is optional. If you do one, the TA will give me a modifier of up to 10 points either way. I.e., a good demo will help, a bad one will hurt.

13.3   Correcting the Prof's errors

Occasionally I make mistakes, either in class or on the web site. The first person to correct each nontrivial error will receive an extra point on his/her grade. One person may accumulate several such know-it-all points.

13.4   Iclickers

Iclicker questions will be posed in some classes. The questions are intended to be easy. Please bring your iclickers.

13.5   Missing or late work

1. We will drop the lowest homework grade. That will handle excused absences, unexcused absences, dying relatives, illnesses, team trips, and other problems.
2. If your term project is late, and you have an acceptable excuse for an incomplete, you will be given that, and the project will be graded next fall. Note that RPI has tightened the rules for incompletes; they are not automatic.

1. I'll post homework grading comments on LMS. Please report any errors, disagreements, or appeals by email within one week.
2. From time to time, I'll post your grades to LMS. Please report any missing grades within one week to the TA, with a copy to the prof.
3. It is not allowed to wait until the end of the semester, and then go back 4 months to try to find extra points. It is especially not allowed to wait until the end of the following semester, and then to ask what you may do to raise your grade.
4. I maintain standards (and the value of your diploma) by giving the grades that are earned, not the grades that are desired. Nevertheless, this course's average grade is competitive with other courses, and last year's students seemed to like the course.
5. Appeal first to the TA, then to the prof, then to any other prof in ECSE acting as a mediator (such as Prof Wozny, the curriculum chair), and then to the ECSE Head. It is preferable to state your objection in writing.

13.7   Mid-semester assessment

Before the drop date, I will email you your performance to date.

13.8   Early warning system (EWS)

As required by the Provost, we may post notes about you to EWS, for example, if you're having trouble doing homeworks on time, or miss an exam. E.g., if you tell me that you had to miss a class because of family problems, then I may forward that information to the Dean of Students office.

See the Student Handbook for the general policy. The summary is that students and faculty have to trust each other. After you graduate, your most important possession will be your reputation.

Specifics for this course are as follows.

1. You may collaborate on homeworks, but each team of 1 or 2 people must write up the solution separately (one writeup per team) using their own words. We willingly give hints to anyone who asks.
2. The penalty for two teams handing in identical work is a zero for both.
3. You may collaborate in teams of up to 3 people for the term project.
4. You may get help from anyone for the term project. You may build on a previous project, either your own or someone else's. However you must describe and acknowledge any other work you use, and have the other person's permission, which may be implicit. E.g., my web site gives a blanket permission to use it for nonprofit research or teaching. You must add something creative to the previous work. You must write up the project on your own.
5. However, writing assistance from the Writing Center and similar sources is allowed, if you acknowledge it.
6. The penalty for plagiarism is a zero grade.
7. Cheating will be reported to the Dean of Students Office.

15   Student feedback

Since it's my desire to give you the best possible course in a topic I enjoy teaching, I welcome feedback during (and after) the semester. You may tell me or write me or the TA, or contact a third party, such as Prof Gary Saulnier, the ECSE undergrad head, or Prof Mike Wozny, the ECSE Dept head.