25 Years of Conventional Evaluation of Data Analysis
Proves Worthless in Practice
Uppsala University (09/03/08)
For the past 25 years, two methods have been used to evaluate
computer-based techniques for classifying patient samples, but Swedish
researchers at Uppsala University have found that this evaluation
methodology is worthless when applied to practical problems. Such
classification techniques are the basis for many technical applications,
including the recognition of human speech, images, and fingerprints, and
are now being used in new fields such as health care.
However, evaluating the performance of a classification model requires
test examples that were never used in the design of the model.
Unfortunately, tens of thousands of test samples are seldom available for
this type of evaluation, often because the samples are too rare or too
expensive to collect. Numerous methods have been proposed to solve this
problem, and since the 1980s two have dominated the field:
cross-validation and resampling/bootstrapping.
The Uppsala researchers used both theory and computer simulations to show
that those methods are worthless in practice when the total number of
examples is small in relation to the natural variation that exists among
different observations. What constitutes a small number depends on the
problem being studied. The researchers say it is essentially impossible to
determine whether the number of examples used is sufficient. "Our main
conclusion is that this methodology cannot be depended on at all, and that
it therefore needs to be immediately replaced by Bayesian methods, for
example, which can deliver reliable measures of the uncertainty that
exists," says Uppsala University professor Mats Gustafsson, who co-directed
the study with professor Anders Isaksson. "Only then will multivariate
analyses be in any position to be adopted in such critical applications as
health care."
Multi-Core Chip Research to Lead to Performance Gains,
Power Reduction for High- and Low-End Computing
National Science Foundation (09/02/08)
The National Science Foundation (NSF) and the Semiconductor Research
Corporation (SRC) have announced a three-year joint initiative for
multicore chip design and architecture. The program will focus on several
components of multicore system architecture design that could significantly
enhance and accelerate solutions for advancing semiconductor performance.
About $6 million in funding will be made available to U.S. universities,
which have been invited to submit research proposals. The program is
intended to lead to significant advances in state-of-the-art multicore chip
design and architecture, bring about system-level performance improvements,
and establish new and innovative research areas critical to the future of
computing. The program will focus on computer-aided design for multicore
systems; low-power innovations; and interconnect, packaging, and circuit
techniques for multicore chips. NSF's Sankar Basu says that as Moore's
Law scaling becomes more difficult, researchers must explore new means of
ensuring continued technological advances in computing. SRC's Steven
Hillenius says the partnership between government, industry, and academia
will help expose universities to critical computing challenges. He says
that cooperative programs with the NSF help the SRC deliver value to its
industrial members while enabling universities to improve their
understanding of the semiconductor industry's needs.
Students Help Humanity With Open Source Software
Wesleyan University (09/04/08)
The Humanitarian Free and Open Source Software (HFOSS) project is a joint
venture between the computer science departments at Wesleyan University,
Trinity College, and Connecticut College and was established to develop
open source software that benefits humanity. HFOSS developed software used
to coordinate volunteers for relief efforts. The software is part of
Sahana, an open source system that was developed to aid in the recovery
effort following the 2004 Asian tsunami. "Most of the computer programs
that students write while in college are just exercises that have been
solved many times before by many people," says HFOSS steering committee
member Danny Krizanc, a computer science professor at Wesleyan. "These are
necessary for training the mind but I think students get a real
satisfaction out of working on something that potentially will have
thousands of users that they will never meet." Students work on Sahana
during the HFOSS Summer Institute, a 10-week internship program. Part of
this year's effort was spent on developing a credentialing module for
verifying relief workers. For example, the program confirms that someone
who says they are a doctor is actually a doctor before giving them access
to patients and drugs. Another project is InSTEDD, which aims to help
develop artificial intelligence algorithms for identifying disease
outbreaks by processing news reports from around the world.
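Sahana's actual credentialing code is not described in the article; the
hypothetical Python sketch below, with all names, roles, and registry
entries invented, only illustrates the kind of check such a module
performs before granting access.

    # Hypothetical credential check (invented data; not Sahana code).
    VERIFIED_REGISTRY = {                      # e.g., from a licensing board
        ("Jane Doe", "medical-license-4711"): "doctor",
    }

    def may_access_patients(name, license_id, claimed_role):
        """Grant access only if the claimed role matches a verified credential."""
        return VERIFIED_REGISTRY.get((name, license_id)) == claimed_role == "doctor"

    print(may_access_patients("Jane Doe", "medical-license-4711", "doctor"))  # True
    print(may_access_patients("John Roe", "unknown-id", "doctor"))            # False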
Letter Lottery Defines Spam Load
BBC News (09/01/08)
Obscure email addresses may be less likely to receive spam messages, says
Richard Clayton, a University of Cambridge computer scientist, who scanned
for patterns in more than 500 million junk messages. Clayton examined
email messages sent over a two-month period and found that more than 40
percent of all emails sent to addresses beginning with a, m, s, r, and p
were spam. By contrast, addresses beginning with q, z, and y had a spam
quotient of 20 percent or less. Clayton also found that more than 50
percent of the messages sent to addresses beginning with u were junk. One
way to explain these results is a recurring pattern of dictionary-style
attacks, in which a spammer takes a known email address and reuses the
same name before the @ sign with a different domain. Because there are
likely to be fewer email addresses beginning with q, z, and y, those
addresses receive fewer junk messages, Clayton says.
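Clayton's corpus and tools are not included in the article, but the
statistic he reports is simple to compute. The Python sketch below, run on
invented sample data, groups messages by the first letter of the recipient
address and prints the spam fraction for each letter.

    # Spam fraction per first letter of the recipient address
    # (sample messages are invented for illustration).
    from collections import defaultdict

    messages = [                               # (recipient address, is_spam)
        ("alice@example.org", True),
        ("adam@example.org", False),
        ("quentin@example.org", False),
        ("ursula@example.org", True),
    ]

    counts = defaultdict(lambda: [0, 0])       # letter -> [spam, total]
    for address, is_spam in messages:
        letter = address[0].lower()
        counts[letter][0] += int(is_spam)
        counts[letter][1] += 1

    for letter in sorted(counts):
        spam, total = counts[letter]
        print("%s: %3.0f%% spam of %d messages" % (letter, 100.0 * spam / total, total))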
Indian Researcher's Improved Anti-Hacking System for
Wireless Networks
Asian News International (09/04/08)
Florida Atlantic University researchers Avinash Srinivasan, Feng Li, and
Jie Wu have developed the Probabilistic Voting-based Filtering Scheme
(PVFS), which they say can protect and help improve the viability of
wireless sensor networks (WSNs). WSNs are vulnerable to two types of
cybersabotage, according to the International Journal of Security and
Networks. The first is the fabricated-report-with-false-votes attack,
which sends phony data to the base station with forged validation votes.
The second type of attack adds false validation votes to genuine incoming
data, causing genuine reports to be labeled as false. Most WSN systems
have built-in
software to prevent false data from being given valid credentials, but the
second type of attack is more difficult to detect. The researchers say the
PVFS can counter both of these attacks simultaneously. To protect the WSN
while maintaining normal filtering, the researchers use a general en-route
filtering scheme that breaks WSNs into clusters and locks each cluster to a
particular data encryption key. As data reaches the headquarters from the
clusters, the main cluster-heads check the report together with the votes,
acting as the verification nodes in PVFS. Should a saboteur compromise one
or more of the sensors on a WSN, the PVFS will apply probability rules to
determine the likelihood that the network was compromised, using data from
other sensors in different clusters before reporting incoming data as
false.
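The published PVFS algorithm is more detailed than the article conveys;
the Python sketch below is only a rough illustration of en-route vote
checking, and its thresholds, keys, and verification rule are assumptions,
not the researchers' scheme. A report is dropped once enough of its
attached votes are found to be false by the cluster heads along its path.

    # Rough illustration of vote-based en-route filtering (assumed rules).
    import random
    random.seed(1)

    def en_route_filter(report, cluster_head_keys, verify_prob=0.5, false_threshold=2):
        """Drop a report once enough of its votes are found to be false."""
        false_votes = 0
        for head_key in cluster_head_keys:     # report travels hop by hop
            if random.random() > verify_prob:  # this head is not a verifier
                continue
            for vote in report["votes"]:
                if vote["key"] == head_key and not vote["valid"]:
                    false_votes += 1
            if false_votes >= false_threshold:
                return "dropped en route"
        return "delivered to base station"

    genuine = {"data": "temp=21C", "votes": [{"key": k, "valid": True} for k in "abc"]}
    forged = {"data": "intruder!", "votes": [{"key": k, "valid": False} for k in "abc"]}
    print("genuine report:", en_route_filter(genuine, list("abc")))
    print("forged report: ", en_route_filter(forged, list("abc")))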
MIT Lincoln Laboratory Software Aims to Thwart Cyber
Hackers
MIT News (08/27/08)
Researchers at the Massachusetts Institute of Technology's Lincoln
Laboratory are developing the Network Security Planning Architecture
(NetSPA), software that will identify the most vulnerable points in a
computer network. NetSPA uses information on networks, individual
machines, and any programs running to create a graph that displays how
hackers could infiltrate the network. System administrators can examine
the graph and determine the best course of action. NetSPA relies on
vulnerability scanners to identify known weaknesses in network-accessible
programs that could allow an unauthorized person to access a machine.
NetSPA also analyzes complex firewall and router rules to determine which
vulnerabilities can be reached and exploited by attackers, and how attacks
can spread within a network by moving from one vulnerable host to another.
Richard Lippmann, leader of the development effort, says NetSPA enables
network administrators to see which vulnerabilities pose the greatest
threat to the network, allowing them to fix those problems first instead of
patching or fixing vulnerabilities that are not accessible to attackers.
NetSPA also can account for unforeseen avenues of attack, such as a
permission created years ago to let a network share data with an outside
vendor, which an attacker can now exploit by forging that vendor's IP
address.
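NetSPA itself is not described at the code level; the short Python sketch
below, with invented hosts, firewall rules, and vulnerabilities, only
illustrates the underlying attack-graph idea: starting from the Internet,
follow reachable, exploitable hosts to see how far an attacker could
pivot.

    # Toy attack-graph reachability (hosts and rules invented; not NetSPA).
    from collections import deque

    reachable = {                  # host -> hosts its firewall rules let it reach
        "internet": ["web"],
        "web": ["app"],
        "app": ["db"],
        "db": [],
    }
    vulnerable = {"web", "db"}     # hosts running an exploitable service

    def compromised_hosts(start="internet"):
        """Return every vulnerable host an attacker could eventually reach."""
        owned, queue = set(), deque([start])
        while queue:
            host = queue.popleft()
            for target in reachable.get(host, []):
                if target in vulnerable and target not in owned:
                    owned.add(target)          # foothold gained, pivot onward
                    queue.append(target)
        return owned

    # 'db' stays safe here because it is only reachable through the
    # non-vulnerable 'app' host, which the attacker cannot compromise.
    print(compromised_hosts())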
Small Communities Can Plan for Emergencies Too
ICT Results (09/03/08)
European researchers are working on ERMA, a European Union-funded project
to create technology for helping small communities respond better to
emergency situations. ERMA researchers developed an online risk
management system with a fixed-line and mobile telecoms alarm system,
examining commercial business software to see how it could be adapted for
rescue work. Process management software has many parallels to emergency
response, with one step being followed by another. "It's
not the automation aspects of the software, but the pre-planning aspect
which dictates exactly what to do at each step of a process that can be
translated into procedures for, say, a fire brigade to follow," says ERMA
technical coordinator Gertraud Peinel. There also are similarities in
customer relationship management software, which provides companies with
contact and other details of their customers. The researchers developed a
core system and tested the software with simulated floods in Romania and a
simulated toxic cloud in a Spanish port. The challenge now is to persuade
communities to use the software.
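ERMA's software is not detailed in the article; purely as an illustration
of the pre-planned, step-by-step procedures it describes, the Python
sketch below walks through an invented flood-response checklist one step
at a time.

    # Invented procedure steps, for illustration only (not ERMA's plans).
    flood_procedure = [
        "alert the fire brigade and town hall",
        "send the telecom alarm to residents in the risk zone",
        "open the evacuation shelter",
        "report status to the regional authority",
    ]

    def run_procedure(steps):
        for number, step in enumerate(steps, start=1):
            print("step %d: %s" % (number, step))   # confirmed by staff in practice

    run_procedure(flood_procedure)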
Media Convergence--Television Is the Media Hub Device for
Web, TV and Communication
Fraunhofer Institute (08/08)
A new project from Fraunhofer FOKUS and the RTL Deutschland media group
brings TV, the Internet, and communication capabilities together to deliver
interactive and personalized TV. Fraunhofer FOKUS is using IFA 2008 to
show how Internet-based content from RTL Deutschland can be adapted for TV,
integrated with the group's TV programs, and managed by remote control.
"Highly graphic scenarios for tomorrow's media world can be built using the
TV content and online products of RTL, VOX, n-tv, SUPER RTL, and Clipfish,"
says Stefan Arbanowski, head of the Media Interoperability division at
Fraunhofer Institute FOKUS. "Join this up with the work done by FOKUS in
convergence of technologies and content, and the shape of tomorrow's media
world becomes visible to us all." As part of the "Media Convergence of the
Future" project, FOKUS has developed an IPTV system that integrates set-top
boxes, PCs, or other end devices in a shared media environment. New
advertising concepts show the potential of integrating linear TV programs
with nonlinear Internet content in such a manner.
Self-Help Software to Soothe Stressed Astronauts
New Scientist (08/25/08) Powell, David
Researchers at Harvard Medical School are working on Virtual Space
Station, software designed to help astronauts deal with the pressures of
space travel. The multimedia program asks crew members to respond to
multiple-choice questions about how to handle various problems that may
arise in space, and to make lists of their concerns and how to solve them.
The program focuses on the mental health challenges that were highlighted
by a panel of 11 astronauts. The program features an interpersonal
conflict widget that includes interactive videos to help negotiate and
resolve conflicts. The videos are meant to help astronauts develop
self-awareness of their own behaviors, and have already been used to train
groups of new astronauts. Additional group studies will test the
software's effectiveness with other stressful professions that rely on
teamwork, including firefighters and emergency first responders. Another
module of the Virtual Space Station focuses on depression, using an
approach called problem-solving therapy, which is both clinically effective
and relatively simple to encode into a software program. Instead of asking
astronauts to talk about their feelings, the program has them create lists
of concrete things that are bothering them and brainstorm practical ways of
solving these problems.
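The article notes that problem-solving therapy is relatively simple to
encode in software; the Python sketch below, with invented concerns and
options, only illustrates that structure of concrete problems paired with
brainstormed practical steps.

    # Invented illustration of the concerns-and-solutions structure
    # (not the Virtual Space Station's actual data model).
    concerns = {
        "crewmate leaves equipment unstowed": [
            "raise it at the weekly meeting",
            "agree on a shared stowage checklist",
        ],
        "missing family events at home": [
            "schedule a regular video call window",
            "record short messages to send each day",
        ],
    }

    for concern, options in concerns.items():
        print(concern)
        for option in options:
            print("  possible step:", option)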
A Network That Builds Itself
Technology Review (09/03/08) Fitzgerald, Michael
The National Institute of Standards and Technology (NIST) has developed
two experimental ad hoc wireless networks that instruct emergency workers
how to deploy transmitters to ensure a good signal. The NIST prototypes,
which have been under development for more than three years, use algorithms
to monitor the signal-to-noise ratio of transmissions and automatically
warn when a new node should be deployed. NIST's Nader Moayeri says the
prototypes aim to avoid fixed rules because situations change depending on
the area. The methods also need to be adaptable because deploying too many
nodes can lead to excessive costs and communication delays. Initially,
NIST considered sending short messages between nodes to see how many data
packets were lost in transit, but with that approach the people deploying
the network would not detect a weak connection immediately. Using an
algorithm to
measure the signal-to-noise ratio avoids this problem and provides a
clearer picture of connection strength. NIST built two prototypes using
off-the-shelf hardware, one that operates at 900 megahertz and uses motes to
transmit radio signals, and one based on a Wi-Fi network operating at 2.4
gigahertz. The mote-based system has LED lights that automatically change
from green to red when a new node needs to be deployed, and the Wi-Fi
system issues alerts through a handheld or tablet computer connected to the
network.
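NIST has not published the prototypes' algorithms in the article; the
Python sketch below, whose threshold and link readings are assumptions,
shows only the general idea of watching each link's signal-to-noise ratio
and warning when it drops low enough that a new node should be deployed.

    # Assumed threshold and readings, for illustration only.
    SNR_THRESHOLD_DB = 10.0        # below this, the link is considered too weak

    def check_links(link_snrs_db):
        """Warn for every link whose signal-to-noise ratio is too low."""
        for link, snr in link_snrs_db.items():
            if snr < SNR_THRESHOLD_DB:
                print("deploy a new node: link %s at %.1f dB" % (link, snr))
            else:
                print("link %s OK at %.1f dB" % (link, snr))

    check_links({"node1-node2": 18.2, "node2-node3": 7.4})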
Fellowship Applications for Doctoral Students Pursuing
HPC Research Due September 8
SC Conference (09/01/08)
Sept. 8, 2008, is the last day to turn in applications for the High
Performance Computing Ph.D. Fellowship Program. Applicants will have an
opportunity to receive a stipend of $5,000 or more for one academic year,
plus travel support to attend the SC conference. Up to three fellowships
have been awarded each year. ACM partners with the IEEE Computer Society
and the SC Conference Series to honor outstanding Ph.D. students around the
world pursuing research in high performance computing (HPC), networking,
storage, and analysis. Candidates must be enrolled in a full-time Ph.D.
program at an accredited college or university, should have completed one
year or more of study in their doctoral program at the time of their
nomination, and must meet minimum scholarship requirements at the
institutions. Full-time faculty members at Ph.D.-granting accredited
institutions must nominate them. Selections will be based on applicants'
research potential, academic progress, how their technical interests match
those of the industry, and how they plan to use HPC resources.
Lines and Bubbles and Bars, Oh My! New Ways to Sift
Data
New York Times (08/31/08) P. BU4; Eisenberg, Anne
The Many Eyes experimental Web site organized by scientists at IBM's
Watson Research Center offers more than a dozen ways to visually and
collaboratively represent data for analysis. One method, known as an
interleaved tag cloud, allows users to make side-by-side comparisons of the
relative frequencies of the words in two passages. Users can incorporate
images and links to visualizations in their blogs or Web sites. "When you
have a group look at data, you protect against bias," says Stanford
professor Pat Hanrahan. "You get more perspectives, and this can lead to
more reliable decisions." One presentation on the Many Eyes site charted
the deaths resulting from human violence in the 20th century, with one user
visualizing casualties associated with specific events on a bubble graph.
Later the originator supplied a line graph and a stack graph to plot
casualty numbers against population growth, which revealed a decline in
violent death in the century's latter decades, says Many Eyes co-creator
and IBM researcher Martin Wattenberg. University of Maryland, College Park
professor Ben Shneiderman says that sites such as Many Eyes are aiding the
democratization of visualization tools. "The gift of the Internet is that
everyone can participate, and the tools can be brought to a much wider
audience," Schneiderman says. "The great fun of information visualization
is that it gives you answers to questions you didn't know you had."
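Many Eyes is a hosted web tool, so the Python sketch below is not its
code; using two invented passages, it computes the side-by-side relative
word frequencies that an interleaved tag cloud makes visible.

    # Relative word frequencies of two passages (invented text).
    import re
    from collections import Counter

    def relative_frequencies(text):
        words = re.findall(r"[a-z']+", text.lower())
        counts = Counter(words)
        total = sum(counts.values())
        return {word: count / total for word, count in counts.items()}

    passage_a = "data data visualization for everyone"
    passage_b = "visualization tools bring data to a wider audience"

    freq_a = relative_frequencies(passage_a)
    freq_b = relative_frequencies(passage_b)
    for word in sorted(set(freq_a) | set(freq_b)):
        print("%-14s %5.2f  %5.2f" % (word, freq_a.get(word, 0), freq_b.get(word, 0)))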
Q&A: Futurist Ray Kurzweil
InformationWeek (08/26/08) Greene, Michael
Inventor Ray Kurzweil discusses speech technology innovations that can
help disabled users, noting that this summer saw the introduction of a cell
phone that also acts as a reading machine for the vision-impaired. The
handheld can take pictures of signs and read them out, and Kurzweil says
the product additionally functions as a GPS navigation system, email
reader, MP3 player, phone, Web browser, and camera. The device's reading
machine function can work in seven languages, while the user is guided via
voice-directed output and voice prompts. "Speech recognition can also be
used to provide intuitive interfaces into devices," notes Kurzweil, adding
that these products will be especially helpful to senior citizens if ease
of use is incorporated into their design. He says a good design principle
for such interfaces is ensuring that people who have not read the manuals
can easily figure out how to employ the devices. Kurzweil observes that
the intelligence of technology is advancing, with speech recognition and
character recognition becoming more useful. He also predicts that the next
decade will see the emergence of fully immersive technologies that can
replicate the experience of being in environments and interacting with
people that are far removed from the user, as well as augmented reality
systems that can enhance the user's real-world perspective with virtual
overlays.
Carolina Attracts World-Renowned Large-Scale Data
Research Team, DICE
University of North Carolina, Chapel Hill (08/26/08)
The University of North Carolina (UNC) at Chapel Hill is the new home of
the Data Intensive Cyber Environments (DICE) group, formerly known as the
Data Intensive Computing Environments group of the University of
California, San Diego's Supercomputer Center. The research group is
experienced in the development of digital data technologies, including
open source software for sharing data during collaborative research,
publishing digital libraries, and preserving data for future generations.
"The opportunity to recruit an entire group of active
researchers with an international reputation for vision, innovation, and
accomplishment is rare," says UNC Chapel Hill chancellor Holden Thorp.
Thorp says DICE's work is closely aligned with UNC's efforts in digital
libraries and archives, databases, institutional repositories, information
retrieval, and information management. Hosting the DICE group will give
UNC students an opportunity to learn from and collaborate with a
world-class research team, Thorp says. Group members will interact with
colleagues in the school and other campus units on academic digital library
and preservation research efforts, initially focusing on current
collaborations, such as the National Archives and Records Administration
Transcontinental Persistent Archive Prototype, the National Science
Foundation Software Development for Cyberinfrastructure project, and the
Library of Congress Video Archiving project. DICE also may work with UNC's
computer science department and with the Renaissance Computing Institute on
the visualization of large datasets.
IT School to Watch: Carnegie Mellon University
Computerworld (08/18/08) Brandel, Mary
The master's program in human-computer interaction (HCI) at Carnegie
Mellon University (CMU) features six faculty members who belong to ACM's
CHI Academy, more than any other institution. The HCI program trains students
for careers in user interface and usability engineering, systems
development, and interaction design. CMU's School of Computer Science also
offers master's degrees in entertainment technology, e-business technology,
software engineering, software engineering management, IT with a
specialization in very large information systems, robotics, IT service
management, IT-embedded software engineering, and an MBA track in
technology leadership. All of the programs are intended to lead to
professional positions instead of to additional research or academic
appointments, although CMU also offers several academic master's degrees.
CMU HCI master's graduate Madhu Prabaker says the program teaches in a way
that is actionable, and he was able to learn what it is like to operate in
a real-world scenario, helping him hit the ground running even in his first
week as a professional. "Everyone must program all night to find that last
bug, everyone runs tests with users who do things they never could have
predicted, and everyone designs and is subjected to critiques by faculty
and peers," says HCI program director Bonnie John. All of the skills the
students learn are applied in a final project that is longer and more
intense than those in similar programs at other schools, John says. The
projects require that students work in teams of four or five, simulating
a real-world experience.
Open Source: What You Should Learn From the French
IDG News Service (08/28/08) Kaneshige, Tom
About a decade ago, European countries, particularly France, took a
strong lead in the open source movement and left U.S. developers far behind.
France has used high-profile projects and policies to support the use of
open source in all levels of technology in government and education.
France is now continuing its support of open source through an economic
commission, established by French President Nicolas Sarkozy, that
recommends using tax benefits to stimulate even more open source
development. The success of open source in France could serve as an
example to U.S. developers that everyone can prosper when working under a
single, shared technology vision. In France, all computer science
students learn open source, while in the United States most universities
use traditional
tools. Consequently, open source talent is prevalent in France, and
development is faster, while sustaining high-quality software products.
France's most important open source benefit may be the ability to unite
various open source projects to create a single, unified platform. Miguel
Valdes, co-founder of the Bonita Project, which developed an open source
workflow system, believes that French open source developers have a better
understanding than U.S. developers about reusing code and integrating it
with other systems.
Semantic Provenance for eScience: Managing the Deluge of
Scientific Data
Internet Computing (08/08) Vol. 12, No. 4, P. 46; Sahoo, Satya S.; Sheth,
Amit; Henson, Cory
In eScience, the metadata essential to effectively managing the
exponentially growing volumes of scientific data produced by
industrial-scale experiment protocols is known as provenance information.
The semantic provenance architecture for eScience data incorporates
expressive provenance information and domain-specific provenance
ontologies and applies them to data management. The authors write that
provenance information has to be expressive and software-interpretable so
that it can be employed effectively for eScience data management, and to
accomplish this, the authors have combined the concept of provenance
information with domain knowledge and ontological underpinning. They call
for a new approach that separates the task of producing high-quality
semantic provenance from the core functionality of workflow engines. To
that end, they present a "two-degrees-of-separation" strategy: semantic
provenance creation is handled by specialized services that cite one or
more domain-specific provenance ontologies and can be embedded within
scientific workflows on demand, while a workflow engine is equipped with a
set of such services and a suite of domain-specific provenance ontologies
as resources that can be flexibly blended into a scientific workflow based
on user needs. The semantic provenance framework for
eScience is described by the authors as incorporating three basic
dimensions representing semantic provenance annotation, domain provenance
ontologies, and usage. The first dimension entails a set of specialized
tools interfacing with a scientific workflow on demand to generate
semantic-provenance information; the second dimension utilizes
domain-specific provenance ontologies to model scientific processes, data,
and agents as formally defined concepts connected via named relationships;
and the third dimension involves software agents using reasoning tools to
process the semantic-provenance information and answer sophisticated domain
queries.
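The authors' services and ontologies are formal artifacts that the
article does not reproduce; the toy Python sketch below, with invented
concept and relationship names, only conveys the flavor of provenance
recorded as domain-specific statements that can be traced backward from a
data product.

    # Toy provenance statements (invented names, not the authors' ontologies).
    provenance = [
        # (subject,           relationship,    object)
        ("mass_spectrum_42",  "generated_by",  "lcms_run_7"),
        ("lcms_run_7",        "used_sample",   "glycan_sample_3"),
        ("glycan_sample_3",   "prepared_with", "protocol_enzymatic_digest"),
    ]

    def trace(artifact, statements):
        """Follow provenance links backward from a data artifact."""
        for subject, relationship, obj in statements:
            if subject == artifact:
                print("%s --%s--> %s" % (subject, relationship, obj))
                trace(obj, statements)

    trace("mass_spectrum_42", provenance)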