Welcome to the October 14, 2009 edition of ACM TechNews, providing timely information for IT professionals three times a week.
Training to Climb an Everest of Digital Data
New York Times (10/12/09) P. B1; Vance, Ashlee
A deluge of digital information exists thanks to rapid technological improvements, and the next generation of computer scientists has to think in terms of what could be described as Internet scale. IBM and Google are helping by making their vast computing resources accessible to university students. The computers have been equipped with software that Internet firms employ to execute their most difficult data analysis tasks, and IBM and Google established a system that allows students and researchers to tap into behemoth computers online. Furthermore, this year the U.S. National Science Foundation (NSF) has divided $5 million among 14 universities that want to educate their students on tackling major data challenges. By making their big data sets, simpler software, and computing products available for research and experiments, IBM and Google are aiding both the universities and the government. "Historically, it has been tough to get the type of data these researchers need out of industry," says NSF research director James C. French. "But we're at this point where a biologist needs to see these types of volumes of information to begin to think about what is possible in terms of commercial applications."
Computer Program Proves Shakespeare Didn't Work Alone, Researchers Claim
Times Online (United Kingdom) (10/12/09) Malvern, Jack
An expert on William Shakespeare at the University of London has used plagiarism-detection software to determine that Shakespeare co-wrote the unattributed play The Reign of King Edward III. Pl@giarism, developed by researchers at the University of Maastricht to detect whether or not students are cheating, enabled Sir Brian Vickers to compare language used in the play, published anonymously in 1596 when Shakespeare was 32, with other plays of the period. Plays with different authors tend to have up to 20 common phrases of three or more words. "The computer is picking out three-word sequences that could just be chunks of grammar," Vickers says. "But when you get metaphors or unusual parts of speech, it is different." Pl@giarism found 200 matches of phrases in the play with Shakespeare's works published before 1596, with the matches coming from four scenes, or about 40 percent of the play. The software also found about 200 phrases that matched the language of the works of Thomas Kyd, another popular playwright during the period, in the remaining scenes, which indicates he wrote the other 60 percent of the play.
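The phrase comparison described above can be illustrated with a minimal sketch that collects every three-word sequence in two texts and reports the ones they share. This is not the software Vickers used; the function names and the toy sentences are assumptions made for illustration, and a real tool would also have to filter out the "chunks of grammar" Vickers mentions.

```python
import re

def word_ngrams(text, n=3):
    """Return the set of n-word sequences in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_phrases(text_a, text_b, n=3):
    """Return the n-word sequences that occur in both texts, sorted."""
    return sorted(word_ngrams(text_a, n) & word_ngrams(text_b, n))

# Toy inputs invented for illustration; these are not lines from any play.
anonymous_scene = "the quick brown fox jumps over the lazy dog"
known_work = "a quick brown fox jumps high"
matches = shared_phrases(anonymous_scene, known_work)
print(len(matches), matches)
```

Counting such matches per scene against each candidate author's earlier works would give the kind of tallies quoted in the abstract.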
Inside Higher Ed (10/09/09) Stripling, Jack
A panel of researchers made a case for funding "high risk, high reward" research during a recent hearing of the U.S. House Science and Technology Subcommittee on Research and Science Education. The panelists told the subcommittee that grant proposals for funding agencies do not give researchers the freedom to pursue such projects, and they often lack the funding to investigate a potential return on the investment. The scientists said lawmakers should consider requiring federal agencies to set aside some grant funding for high risk research. "Breakthroughs often occur as total surprises," said Rice University professor Neal Lane. "The research that was being done may strike us as somewhat routine, somewhat dull, and suddenly there's a surprise that comes out of the research." The panelists also said the agencies could provide money to promising investigators with no strings attached, which is the model used by the MacArthur "genius" grants and the Howard Hughes Medical Institute. "We're not going to tell you what to do with the money," said the Hughes Institute's Gerald Rubin of the model. "We're betting on you as an individual, and we're going to win or lose that bet." Rep. Daniel Lipinski (D-Ill.) said the pending re-authorization of the National Science Foundation and competitiveness legislation would offer opportunities for strengthening funding for riskier projects.
A View From the 2009 European Computer Science Summit
Computing Community Consortium (10/13/09) Bernat, Andrew
A recurring theme at Informatics Europe's recent European Computer Science Summit 2009, which took place Oct. 8-9 in Paris, was the concern that the European scientific community does not appreciate computing as a research discipline by itself, writes Computing Research Association (CRA) executive director Andrew Bernat. Participants said that researchers often view it merely as a tool for studying other subjects. At the meeting they proposed methods of making computer science more accessible to researchers of other disciplines. Also at the meeting, Informatics Europe announced that it is conducting a survey similar to the CRA's Taulbee Survey, monitoring departmental evaluation methods. Computing experts worry that they will be judged by a panel with little knowledge of the discipline that will hold them to arbitrary standards. Informatics Europe will contribute its own suggestions for a panel and evaluation method. Participants generally hoped that university computing departments would not be subject to a European ranking system.
The Web's Inventor Regrets One Small Thing
New York Times (10/12/09) Lohr, Steve
Governments around the world have put more of their data on the Web this year than in previous years, and the United States and Britain have led the way, said Sir Tim Berners-Lee in an interview at a recent symposium on the future of technology in Washington, D.C. Berners-Lee, who is currently a professor at the Massachusetts Institute of Technology and director of the World Wide Web Consortium, is enthusiastic about having traffic, local weather, public safety, health, and other data in raw form online. People will create exciting applications once the data and online tools are available, he said. For example, a simple mash-up that combines roadway maps with bicycle accident reports could help bikers determine the most dangerous roads. "Innovation is serendipity, so you don't know what people will make," he said. "But the openness, transparency, and new uses of the data will make government run better, and that will make business run better as well." With regard to any regrets about the Web, Berners-Lee said that using the double slash "//" after the "http:" in Web addresses turned out to be unnecessary.
Computing Project Combats Card Counting
University of Dundee (10/09/09) Hill, Grant
A cost-effective computer system for identifying card counters and detecting dealer errors has been developed by a recent graduate of the University of Dundee. The Blackjack tracking system makes use of algorithms that employ methods such as contour analysis, template matching, and feature matching to recognize each card as it is dealt. "Computer vision was one of the options when it came to choosing subjects for our final year, and when it came to our final project, I started to think about combining what I was learning with something I was interested in," says Dundee graduate Kris Zutis. A live feed of a game is captured by stereo cameras, which track the game as it progresses, monitor the cards along with the player, and track the betting patterns. The algorithms analyze the correlation between the player's betting patterns and the game card count to determine whether a player is card counting and, if so, alert the casino staff. Zutis is scheduled to present a research paper on the computer vision system at the International Conference on Computer Vision Systems in Liege, Belgium. "My system needs work to be commercially viable, but the potential has been demonstrated, and hopefully appearing at the event will help generate some interest in helping me to develop it further," he says.
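The correlation step described above might look something like the following sketch, which assumes the cards have already been recognized. The abstract does not say which counting scheme or statistic Zutis uses; the Hi-Lo values, the Pearson correlation, the 0.8 threshold, and all names here are assumptions chosen for illustration.

```python
from math import sqrt

# Hi-Lo counting values (an assumed scheme): low cards +1, middle 0, tens and aces -1.
HI_LO = {**{r: 1 for r in "23456"}, **{r: 0 for r in "789"}, **{r: -1 for r in "TJQKA"}}

def running_counts(rounds):
    """Running count in effect at the start of each round, before its cards are seen."""
    counts, total = [], 0
    for cards in rounds:
        counts.append(total)
        total += sum(HI_LO[card] for card in cards)
    return counts

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y) if spread_x and spread_y else 0.0

def looks_like_counting(rounds, bets, threshold=0.8):
    """Flag a player whose bets rise and fall with the count (hypothetical threshold)."""
    return pearson(running_counts(rounds), bets) >= threshold

# Invented example: ranks dealt in each round, and two players' bets per round.
rounds = ["2345", "36K", "TJQ", "AK9", "2468"]
counter_bets = [10, 50, 60, 30, 10]   # bets track the running count
flat_bets = [10, 10, 10, 10, 10]      # bets ignore the count
print(looks_like_counting(rounds, counter_bets), looks_like_counting(rounds, flat_bets))
```

A flat bettor produces no correlation and is not flagged, while a player whose stakes follow the count is.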
Household Robots Do Not Protect Users' Security and Privacy, Researchers Say
UW News (10/08/09) Hickey, Hannah
A new University of Washington (UW) study has found that domestic robots present security and privacy risks for their owners. The researchers examined three household robots on the market as of October 2008, two of which can be controlled online. The researchers discovered that all three robots could be located using their wireless networks, their audio and video data could be interrupted or even stolen online, they did not always warn people that someone was accessing them, and they did not always alert nearby people to their presence. Moreover, the researchers found that in some cases a robot could be manipulated to hurt its owner or its owner's property. "In the future people may have multiple robots in the home that are much more capable and sophisticated," says UW doctoral student Tamara Denning. "Security and privacy risks with future household robots will likely be more severe, which is why we need to start addressing robot security and privacy today." The researchers say the solution could be as simple as encrypting wireless networks or removing the robots' Internet access. "People know to look for small parts in children's toys, or look for lead paint," says study co-author Cynthia Matuszek. "For products that combine more advanced technology and wireless capabilities, people should look at whether it protects privacy and security."
Securing the Web
MIT News (10/08/09) Hardesty, Larry
Massachusetts Institute of Technology (MIT) researchers will present a new security system called Resin at the 2009 ACM Symposium on Operating Systems Principles, which takes place Oct. 11-14 in Big Sky, Mont. Resin, rather than checking every piece of code that a Web site runs, monitors the information that Web sites use. Although types of code are numerous and can vary widely, "the same data is being handled in all these hundreds of places," says MIT computer scientist Nickolai Zeldovich. So the MIT team has created a system that checks for security breaches every time that Web sites try to obtain that information. The team adapted 12 security applications written in Python and PHP for use in Resin. Resin managed to deflect both established attacks and new ones developed by the team. The developers say Resin will allow Web programmers to write security code only once, as opposed to copying it to hundreds of locations. However, because Resin needs to follow information down whatever path it takes, it requires extra software. The information-monitoring software would have to be adaptable to many different types of runtimes, which may cut down on its performance, says UCLA computer scientist Eddie Kohler. Still, he acknowledges that performance may not be a user's top priority. "A place like, maybe Facebook, say, that runs other people's code on their servers, already has an environment where they're much more worried about people stealing data out of their servers than they are necessarily about getting the last two percent of performance," he says.
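The idea of attaching the security check to the data rather than to each piece of code can be sketched roughly as follows. This is not Resin's actual interface; the class, function, and user names are hypothetical, and the sketch only shows a policy that travels with a value and is enforced once, at the point where data leaves the application.

```python
class PolicyViolation(Exception):
    """Raised when data is about to reach someone its policy does not allow."""

class Guarded:
    """A value bundled with the set of users allowed to receive it (hypothetical)."""

    def __init__(self, value, allowed_readers):
        self.value = value
        self.allowed_readers = frozenset(allowed_readers)

    def combine(self, other_text):
        """Data derived from a guarded value keeps that value's policy."""
        return Guarded(f"{self.value}{other_text}", self.allowed_readers)

def send_to_client(data, recipient):
    """The single output boundary: the only place the check has to be written."""
    if isinstance(data, Guarded):
        if recipient not in data.allowed_readers:
            raise PolicyViolation(f"{recipient} may not receive this data")
        return data.value
    return data

# Invented example: a password that only "alice" may see.
secret = Guarded("hunter2", {"alice"})
page = secret.combine(" is your password")
print(send_to_client(page, "alice"))
try:
    send_to_client(page, "mallory")
except PolicyViolation as err:
    print("blocked:", err)
```

Because the policy follows the data through `combine`, code that builds pages from it does not need its own copy of the check, which is the trade-off the abstract describes: one check, at the cost of tracking data wherever it flows.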
An Interview With Brian Kernighan, Co-Developer of AWK and AMPL
Computerworld Australia (10/06/09) Clarke, Trevor
Brian Kernighan—a contributor to the development of the AWK and AMPL programming languages—says that he remains "very interested" in domain-specific languages as well as tools that ease the writing of code. "Programming today depends more and more on combining large building blocks and less on detailed logic of little things, though there's certainly enough of that as well," he notes. "A typical programmer today spends a lot of time just trying to figure out what methods to call from some giant package and probably needs some kind of IDE like Eclipse or XCode to fill in the gaps. There are more languages in regular use and programs are often distributed combinations of multiple languages. All of these facts complicate life, though it's possible to build quite amazing systems quickly when everything goes right." Kernighan points to an increase in scalable systems, and businesses that he thinks are making significant societal contributions include Google, through its wide scale access to a vast corpus of information. Kernighan observes that "for better or worse, the driving influence today [behind contemporary computing] seems to be to get something up and running and used via the Internet, as quickly as possible." However, he says that approach "only works because there is infrastructure: Open source software like Unix/Linux and GNU tools and Web libraries, dirt-cheap hardware, and essentially free communications."
Cheap Naked Chips Snap a Perfect Picture
New Scientist (10/07/09) Marks, Paul
Swiss Federal Polytechnic Institute (EPFL) engineer Edoardo Charbon and his team are developing a gigavision sensor, based on an ordinary memory chip, which they say can operate well with both bright and dim light. In the past, light has been the bane of the memory chip--it "simply destroys the information," says EPFL researcher Martin Vetterli. But Charbon and colleagues learned to aim the light hitting a memory chip so that each cell that is corrupted by the light changes depending on how much light is hitting it. The light effectively creates an image that the chip preserves. The researchers say the new sensor could enable cell phones and other devices to take richer, better pictures. Moreover, the process is far more efficient than sensors based on charge-coupled devices (CCDs) or complementary metal oxide semiconductors (CMOS). Both technologies use similar methods to store images. Each pixel holds a charge whose strength corresponds with the amount of light that hits it. Charges in a CCD are passed from one pixel to another, so that the image forms in a wave of light starting at one edge of the chip and finally reaching the other. An analog-to-digital converter (ADC) labels the pixels according to an 8-bit grayscale from zero to 255. CMOS uses the same scale, although it transforms charges into voltages before doing so. However, Vetterli says that because a memory chip creates an image immediately, its cells will always be 100 times smaller than those belonging to CMOS sensors. This means that it can pack 100 pixels in the space of just one digital camera sensor--a gigapixel camera. Memory chip sensors can only store zeros or ones--light or dark--and cannot yet record shades of gray. EPFL's Feng Yang is working on an algorithm that can assign shades of gray to 100 pixels of information. Dubbed spatial oversampling, the technique is more accurate than ADC. Vetterli hopes to have a functional gigavision memory chip by early 2011.
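One simple way to picture turning 100 one-bit cells into a single gray value is to count how many cells in a 10-by-10 tile were switched on and scale that count to the 0-255 range. The sketch below does only that; it is not Feng Yang's algorithm, whose details the abstract does not give, and the function name and tile size are assumptions.

```python
def gray_from_binary(bits, block=10):
    """Collapse each block x block tile of 0/1 cells into one 0-255 gray value.

    Assumes the grid dimensions are exact multiples of `block`.
    """
    rows, cols = len(bits), len(bits[0])
    image = []
    for top in range(0, rows, block):
        image_row = []
        for left in range(0, cols, block):
            ones = sum(
                bits[i][j]
                for i in range(top, top + block)
                for j in range(left, left + block)
            )
            image_row.append(round(255 * ones / (block * block)))
        image.append(image_row)
    return image

# Invented 10 x 20 grid: the left tile is fully lit, the right tile is 40 percent lit.
bits = [[1] * 10 + [1 if row < 4 else 0] * 10 for row in range(10)]
print(gray_from_binary(bits))
```

With 100 cells per tile, this naive count distinguishes 101 brightness levels, which hints at why many tiny binary cells can stand in for one multi-level pixel.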
Machine Learning by Watching and Listening
Using new algorithms that blend video, sound, and text streams, a team led by University of Pennsylvania professor Ben Taskar has demonstrated that computers can be trained to associate the content of video clips with existing descriptions of characters and actions, and then to deduce information about new material and categorize it based on what they have already learned. The research team is using popular TV programs such as Lost and Alias to teach computers to learn through visual, audio, and textual observation. For example, Taskar is feeding a vast corpus of fan-generated online content about the show Lost--YouTube video clips, episode scripts, and so on--into computers. Through the use of novel algorithms that enable the computer to integrate the information with the video, the system can learn who characters are, what actions they are performing, and with whom they are engaged in such activity. Researchers can then ask the computer to show all scenes related to specific characters and actions, and can study the sequences for errors that suggest the algorithms and models require tweaking. This machine-learning technique is likely to yield advantages for general image and audio search and further propel the discipline toward unsupervised methods to make computers acquire knowledge about the world.
Why Desktop Multiprocessing Has Speed Limits
Computerworld (10/05/09) Vol. 43, No. 30, P. 24; Wood, Lamont
Despite the mainstreaming of multicore processors for desktops, not every desktop application can be rewritten for multicore frameworks, which means some bottlenecks will persist. "If you have a task that cannot be parallelized and you are currently on a plateau of performance in a single-processor environment, you will not see that task getting significantly faster in the future," says analyst Tom Halfhill. Adobe Systems' Russell Williams points out that performance does not scale linearly even with parallelization on account of memory bandwidth issues and delays dictated by interprocessor communications. Analyst Jim Turley says that, overall, consumer operating systems "don't do anything smart" with multicore architecture. "We have to reinvent computing, and get away from the fundamental premises we inherited from von Neumann," says Microsoft technical fellow Burton Smith. "He assumed one instruction would be executed at a time, and we are no longer even maintaining the appearance of one instruction at a time." Analyst Rob Enderle notes that most applications will operate on only a single core, which means that the benefits of a multicore architecture only come when multiple applications are run. "What we'd all like is a magic compiler that takes yesterday's source code and spreads it across multiple cores, and that is just not happening," says Turley. Despite the performance issues, vendors prefer multicore processors because they can facilitate a higher level of power efficiency. "Using multiple cores will let us get more performance while staying within the power envelope," says Acer's Glenn Jystad.
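The plateau Halfhill describes is commonly modeled with Amdahl's law, which the article does not name but which captures the same point: if only a fraction p of a task can run in parallel, the speedup on n cores is at most 1 / ((1 - p) + p / n). The sketch below evaluates that formula; the function name and sample fractions are illustrative, and the model is idealized in that it ignores the memory bandwidth and interprocessor communication costs Williams mentions, so real gains would be lower.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup when `parallel_fraction` of the work is spread over `cores`."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Illustrative fractions: fully serial, half parallel, mostly parallel.
for fraction in (0.0, 0.5, 0.9):
    speedups = [round(amdahl_speedup(fraction, n), 2) for n in (2, 4, 8)]
    print(f"parallel fraction {fraction}: speedup on 2/4/8 cores = {speedups}")
```

A task with no parallel portion stays at a speedup of 1 no matter how many cores are added, and even a task that is 90 percent parallel can never run more than 10 times faster.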
Abstract News © Copyright 2009 INFORMATION, INC.