A moment of reflection

December 22, 2009

Data driven programming assignments

Filed under: Uncategorized — Melvin Zhang @ 5:48 pm

Originally appeared here on 09/01/09.

I attended a talk by Randal Bryant on Data Intensive Scalable Computing. His focus was on computer systems for processing large amounts of data.

I realized the importance of data-driven computation earlier, through my experience setting programming assignments. I think it is important to have problems that are realistic, where you want to write the program in order to see the results. Unfortunately, most of the time we are technique driven: we try to form a task around a specific method. However, most such problems have a rather trivial solution, so we need to impose unreasonable constraints or inflate the problem to an unbelievably large size in order to force the use of the intended technique.

I think the correct approach is to start from the data. There is a large amount of interesting data available on the web, from movies to tags. In the UNIX workshop I conducted for freshmen orientation in 2008, I used the SMS corpus from the WING research group to motivate the use of UNIX pipes.
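To give a feel for the pipe idea, here is a minimal sketch in Python of a pipeline built from small stages, each consuming the output of the previous one. It is not the actual workshop exercise: the function names and the three sample messages are made up for illustration, and the shell commands mentioned in the comments are only rough analogues.

```python
import re
from collections import Counter


def lowercase(messages):
    # Stage 1: normalise case, roughly what `tr A-Z a-z` does in a shell pipe.
    for message in messages:
        yield message.lower()


def tokenize(lines):
    # Stage 2: emit one word per item; digits and punctuation are dropped.
    for line in lines:
        yield from re.findall(r"[a-z']+", line)


def top_words(words, k):
    # Stage 3: count and rank, roughly `sort | uniq -c | sort -rn | head`.
    return Counter(words).most_common(k)


# Hypothetical sample messages, NOT taken from the SMS corpus.
sample = ["See you at 5", "see you later", "You are late"]

# Stages are chained like commands joined by `|`; generators keep it lazy.
result = top_words(tokenize(lowercase(sample)), 2)
print(result)
```

Each stage does one small job and knows nothing about the others, which is the property that makes pipes attractive once the input is a real corpus rather than three lines.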

Dealing with large amounts of publicly available real-world data gives rise to realistic computational problems where the effect of efficient algorithms becomes apparent. Computations that take hours to run using a naive method can be completed in seconds using the correct approach.
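As a small illustration of that gap, consider checking whether a collection of messages contains a duplicate. The sketch below is my own example rather than an actual assignment; the function names are hypothetical. The naive version compares every pair, so its work grows quadratically with the input, while the second version makes a single pass using a hash set.

```python
def has_duplicate_naive(items):
    # Compare every pair of items: about n * (n - 1) / 2 comparisons.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicate_fast(items):
    # Remember what has been seen in a hash set: one pass, O(n) expected time.
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


# Hypothetical inputs for illustration only.
with_repeat = ["ok", "see you", "lol", "ok"]
all_unique = ["ok", "see you", "lol"]

print(has_duplicate_naive(with_repeat), has_duplicate_fast(with_repeat))
print(has_duplicate_naive(all_unique), has_duplicate_fast(all_unique))
```

On a handful of messages the two are indistinguishable. The difference only shows up when the input is large, which is why real data makes the case for the better algorithm without any artificial constraints.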
