
November 14, 2011

Speaking this week at the SC11 conference in Seattle

I’m privileged to once again be speaking at the SC conference. For those who don’t know it, “SC is the International Conference for High Performance Computing, Networking, Storage and Analysis.” If you are attending, I’ll be on a panel entitled Parallelism, the Cloud, and the Tools of the Future for the next generation of practitioners. I’ll be joining some of my compatriots in the Educational Alliance for a Parallel Future to once again discuss the skill sets that collegiate computer science programs should be (and mostly aren’t) imparting to their students in the areas of parallel programming.

The abstract for the panel is as follows:

Industry, academia and research communities face increasing workforce preparedness challenges in parallel (and distributed) computing, due to the onslaught of multi-/many-core and cloud computing platforms. What initiatives have begun to address those challenges? What changes to hardware platforms, languages and tools will be necessary? How will we train the next generation of engineers for ubiquitous parallel and distributed computing? Following on from the successful model used at SC10, the session will be highly interactive, combining aspects of BOF, workshop, and Panel discussions. An initial panel will lay out some of the core issues in this topic with experts from multiple areas in education and industry. Following this will be moderated breakouts, much like collective mini-BOFS, for further discussion and to gather ideas from participants about industry and research needs.

If this sounds similar to the session from the Intel Developer Forum in September, there is good reason: that session was the second most popular of the conference. The IDF panel and breakout sessions covered some really interesting ground, and I liked the format. I felt that the discussions I had with the people in my subgroup at IDF were deeper, more specific and more productive than a traditional panel format would have allowed.

While the speakers on this panel are different from those in September, I think we’ll still end up splitting on the axis of using abstractions to teach fundamentals vs. teaching from first principles up. Which camp you are in seems at least somewhat determined by whether you produce abstractions over the low-level elements as part of your work, as a number of panelists do. I am very much in the fundamentals camp, as I think that understanding what the abstractions are built on is essential to choosing the right abstraction, much as artists tend to start with representative figure drawing. What will make an interesting difference from IDF is the number of audience members who come from outside of computer science (HPC is used more by scientists for whom the computation is only a means to the end of solving a problem in a non-computational discipline). Those audience members are less likely to understand the fundamentals, or to care about them. For them parallelism is just a tool to get their answer faster. This should really make for a lively debate!

My statement for the panel is as follows (yes, I did crib the last paragraph from my earlier position):
The team I manage is building a single, modern, software product. A few years ago, that would have meant a desktop application written primarily in C++, most likely single-threaded. Today, it means software that runs on the desktop, but also on mobile devices and in the cloud. Working in my organization are developers who write shaders for the GPU, developers who write SIMD code (SSE on x86 and NEON on ARM), developers using distributed computing techniques on EC2, and threads everywhere throughout the clients and server code. We write code in C, C++, ObjC, assembly, Lua, Java, C#, Perl, Python, Ruby and GLSL. We leverage Grand Central Dispatch, pThreads, TBB and boost threads. How many of the technologies that we use today in professional software development existed when we went to school? Nearly none. How many will still be used a few years from now? Who knows. The reason we can continue to work in the field is that our education was grounded not just in programming techniques for the technology of the time, but also in computer architecture, operating systems, and programming languages (high level, low level and domain-specific).

Learning GPGPU was much easier for me because I could understand the architecture of graphics processors. I was able to understand Java’s garbage collection because I understood how memory management worked in C. I chose TBB over Grand Central Dispatch to solve a specific threading problem because I could evaluate both technologies given my experience with pThreads.

We’re doing students a disservice if we teach them the concepts using only high-level abstractions, or teach them only a single programming language. Having an understanding of computer architecture is also critical to a computer science education.

These fundamentals of computer science do not necessarily need to be broken out into their own classes. They can and should be integrated throughout the curriculum. Threading should be part of every course. It is a critical part of modern software development. Different courses should use different programming languages to give students exposure to different programming models.

If I were a Dean of Computer Science somewhere, I’d look to create a curriculum where parallel programming using higher-level abstractions was part of the introductory courses, using something like C++11, OpenMP or TBB. Mid-level requirements would include some computer architecture instruction, specifically how computer architecture maps to the software that runs on top of it. This may also include some lower-level instruction in things like pThreads, race conditions, lock-free programming or even GPU or heterogeneous programming techniques using OpenCL. In later courses focused more on software engineering, specific areas like graphics, or larger projects, I’d encourage the students to use whichever tools they found most appropriate to the tasks at hand. This might even include very high-level proprietary abstractions like DirectCompute or C++ AMP, as long as the students could make the tradeoffs intelligently because of their understanding of the area from previous courses.

You can read the position statements from the rest of the panel here.
