25 Years of Conventional Evaluation of Data Analysis
Proves Worthless in Practice
Uppsala University (09/03/08)
For the past 25 years, two methods have been used to evaluate
computer-based techniques for classifying patient samples, but Swedish
researchers at Uppsala University have found that this evaluation
methodology is worthless when applied to practical problems. Such
classification techniques are the basis for many technical applications,
including the recognition of human speech, images, and fingerprints, and
are now being used in new fields such as health care.
However, evaluating the performance of a classification model requires
test examples that were never used in the design of the model.
Unfortunately, tens of thousands of test samples are seldom available for
this type of evaluation, often because the samples are too rare or too
expensive to collect. Numerous methods have been proposed to solve this
problem, and since the 1980s two have dominated the field:
cross-validation and resampling/bootstrapping.
The Uppsala researchers used both theory and computer simulations to show
that those methods are worthless in practice when the total number of
examples is small in relation to the natural variation that exists among
different observations. What constitutes a small number depends on the
problem being studied. The researchers say it is essentially impossible to
determine whether the number of examples used is sufficient. "Our main
conclusion is that this methodology cannot be depended on at all, and that
it therefore needs to be immediately replaced by Bayesian methods, for
example, which can deliver reliable measures of the uncertainty that
exists," says Uppsala University professor Mats Gustafsson, who co-directed
the study with professor Anders Isaksson. "Only then will multivariate
analyses be in any position to be adopted in such critical applications as
health care."
Multi-Core Chip Research to Lead to Performance Gains,
Power Reduction for High- and Low-End Computing
National Science Foundation (09/02/08)
The National Science Foundation (NSF) and the Semiconductor Research
Corporation (SRC) have announced a three-year joint initiative for
multicore chip design and architecture. The program will focus on several
components of multicore system architecture design that could significantly
enhance and accelerate solutions for advancing semiconductor performance.
About $6 million in funding will be made available to U.S. universities,
which have been invited to submit research proposals. The program is
intended to lead to significant advances in state-of-the-art multicore chip
design and architecture, bring about system-level performance improvements,
and establish new and innovative research areas critical to the future of
computing. The program will focus on computer-aided design for multicore
systems; low-power innovations; and interconnect, packaging, and circuit
techniques for multicore chips. NSF's Sankar Basu says that as Moore's
Law scaling becomes more difficult, researchers must explore new means of
ensuring continued technological advances in computing. SRC's Steven
Hillenius says the partnership between government, industry, and academia
will help expose universities to critical computing challenges. He says
that cooperative programs with the NSF help the SRC deliver value to its
industrial members while enabling universities to improve their
understanding of the semiconductor industry's needs.
Students Help Humanity With Open Source Software
Wesleyan University (09/04/08)
The Humanitarian Free and Open Source Software (HFOSS) project is a joint
venture between the computer science departments at Wesleyan University,
Trinity College, and Connecticut College and was established to develop
open source software that benefits humanity. HFOSS developed software used
to coordinate volunteers for relief efforts. The software is part of
Sahana, an open source system that was developed to aid in the recovery
effort following the 2004 Asian tsunami. "Most of the computer programs
that students write while in college are just exercises that have been
solved many times before by many people," says HFOSS steering committee
member Danny Krizanc, a computer science professor at Wesleyan. "These are
necessary for training the mind but I think students get a real
satisfaction out of working on something that potentially will have
thousands of users that they will never meet." Students work on Sahana
during the HFOSS Summer Institute, a 10-week internship program. Part of
this year's effort was spent on developing a credentialing module for
verifying relief workers. For example, the program confirms that someone
who says they are a doctor is actually a doctor before giving them access
to patients and drugs. Another project is InSTEDD, which aims to help
develop artificial intelligence algorithms for identifying disease
outbreaks by processing news reports from around the world.
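Sahana's actual credentialing code is not described in the article; the
hypothetical Python sketch below, with all names, roles, and registry
entries invented, only illustrates the kind of check such a module
performs before granting access.

    # Hypothetical credential check (invented data; not Sahana code).
    VERIFIED_REGISTRY = {                      # e.g., from a licensing board
        ("Jane Doe", "medical-license-4711"): "doctor",
    }

    def may_access_patients(name, license_id, claimed_role):
        """Grant access only if the claimed role matches a verified credential."""
        return VERIFIED_REGISTRY.get((name, license_id)) == claimed_role == "doctor"

    print(may_access_patients("Jane Doe", "medical-license-4711", "doctor"))  # True
    print(may_access_patients("John Roe", "unknown-id", "doctor"))            # False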
Letter Lottery Defines Spam Load
BBC News (09/01/08)
Obscure email addresses may be less likely to receive spam messages, says
Richard Clayton, a University of Cambridge computer scientist, who scanned
for patterns in more than 500 million junk messages. Clayton examined
email messages sent over a two-month period and found that more than 40
percent of all emails sent to addresses beginning with a, m, s, r, and p
were spam. By contrast, addresses beginning with q, z, and y had a spam
quotient of 20 percent or less. Clayton also found that more than 50
percent of the messages sent to addresses beginning with u were junk. One
way to explain these results is a recurring pattern of dictionary-style
attacks, in which a spammer takes a known email address and reuses the
same name before the @ sign with a different domain. Because there are
likely to be fewer email addresses beginning with q, z, and y, those
addresses receive fewer junk messages, Clayton says.
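Clayton's corpus and tools are not included in the article, but the
statistic he reports is simple to compute. The Python sketch below, run on
invented sample data, groups messages by the first letter of the recipient
address and prints the spam fraction for each letter.

    # Spam fraction per first letter of the recipient address
    # (sample messages are invented for illustration).
    from collections import defaultdict

    messages = [                               # (recipient address, is_spam)
        ("alice@example.org", True),
        ("adam@example.org", False),
        ("quentin@example.org", False),
        ("ursula@example.org", True),
    ]

    counts = defaultdict(lambda: [0, 0])       # letter -> [spam, total]
    for address, is_spam in messages:
        letter = address[0].lower()
        counts[letter][0] += int(is_spam)
        counts[letter][1] += 1

    for letter in sorted(counts):
        spam, total = counts[letter]
        print("%s: %3.0f%% spam of %d messages" % (letter, 100.0 * spam / total, total))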
Indian Researcher's Improved Anti-Hacking System for
Wireless Networks
Asian News International (09/04/08)
Florida Atlantic University researchers Avinash Srinivasan, Feng Li, and
Jie Wu have developed the Probabilistic Voting-based Filtering Scheme
(PVFS), which they say can protect and help improve the viability of
wireless sensor networks (WSNs). WSNs are vulnerable to two types of
cybersabotage, according to the International Journal of Security and
Networks. The first is the fabricated-report-with-false-votes attack,
which sends phony data to the base station with forged validation votes.
The second type of attack adds false validation votes to genuine incoming
data, causing genuine reports to be labeled as false. Most WSN systems
have built-in
software to prevent false data from being given valid credentials, but the
second type of attack is more difficult to detect. The researchers say the
PVFS can counter both of these attacks simultaneously. To protect the WSN
while maintaining normal filtering, the researchers use a general en-route
filtering scheme that breaks WSNs into clusters and locks each cluster to a
particular data encryption key. As data reaches the headquarters from the
clusters, the main cluster-heads check the report together with the votes,
acting as the verification nodes in PVFS. Should a saboteur compromise one
or more of the sensors on a WSN, the PVFS will apply probability rules to
determine the likelihood that the network was compromised, using data from
other sensors in different clusters before reporting incoming data as
false.
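The published PVFS algorithm is more detailed than the article conveys;
the Python sketch below is only a rough illustration of en-route vote
checking, and its thresholds, keys, and verification rule are assumptions,
not the researchers' scheme. A report is dropped once enough of its
attached votes are found to be false by the cluster heads along its path.

    # Rough illustration of vote-based en-route filtering (assumed rules).
    import random
    random.seed(1)

    def en_route_filter(report, cluster_head_keys, verify_prob=0.5, false_threshold=2):
        """Drop a report once enough of its votes are found to be false."""
        false_votes = 0
        for head_key in cluster_head_keys:     # report travels hop by hop
            if random.random() > verify_prob:  # this head is not a verifier
                continue
            for vote in report["votes"]:
                if vote["key"] == head_key and not vote["valid"]:
                    false_votes += 1
            if false_votes >= false_threshold:
                return "dropped en route"
        return "delivered to base station"

    genuine = {"data": "temp=21C", "votes": [{"key": k, "valid": True} for k in "abc"]}
    forged = {"data": "intruder!", "votes": [{"key": k, "valid": False} for k in "abc"]}
    print("genuine report:", en_route_filter(genuine, list("abc")))
    print("forged report: ", en_route_filter(forged, list("abc")))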
MIT Lincoln Laboratory Software Aims to Thwart Cyber
Hackers
MIT News (08/27/08)
Researchers at the Massachusetts Institute of Technology's Lincoln
Laboratory are developing the Network Security Planning Architecture
(NetSPA), software that will identify the most vulnerable points in a
computer network. NetSPA uses information on networks, individual
machines, and any programs running to create a graph that displays how
hackers could infiltrate the network. System administrators can examine
the graph and determine the best course of action. NetSPA relies on
vulnerability scanners to identify known weaknesses in network-accessible
programs that could allow an unauthorized person to access a machine.
NetSPA also analyzes complex firewall and router rules to determine which
vulnerabilities can be reached and exploited by attackers, and how attacks
can spread within a network by moving from one vulnerable host to another.
Richard Lippmann, leader of the development effort, says NetSPA enables
network administrators to see which vulnerabilities pose the greatest
threat to the network, allowing them to fix those problems first instead of
patching or fixing vulnerabilities that are not accessible to attackers.
NetSPA also can account for unforeseen avenues of attack, such as a
permission created years ago to let a network share data with an outside
vendor, which an attacker can now exploit by forging that vendor's IP
address.
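NetSPA itself is not described at the code level; the short Python sketch
below, with invented hosts, firewall rules, and vulnerabilities, only
illustrates the underlying attack-graph idea: starting from the Internet,
follow reachable, exploitable hosts to see how far an attacker could
pivot.

    # Toy attack-graph reachability (hosts and rules invented; not NetSPA).
    from collections import deque

    reachable = {                  # host -> hosts its firewall rules let it reach
        "internet": ["web"],
        "web": ["app"],
        "app": ["db"],
        "db": [],
    }
    vulnerable = {"web", "db"}     # hosts running an exploitable service

    def compromised_hosts(start="internet"):
        """Return every vulnerable host an attacker could eventually reach."""
        owned, queue = set(), deque([start])
        while queue:
            host = queue.popleft()
            for target in reachable.get(host, []):
                if target in vulnerable and target not in owned:
                    owned.add(target)          # foothold gained, pivot onward
                    queue.append(target)
        return owned

    # 'db' stays safe here because it is only reachable through the
    # non-vulnerable 'app' host, which the attacker cannot compromise.
    print(compromised_hosts())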
Small Communities Can Plan for Emergencies Too
ICT Results (09/03/08)
European researchers are working on ERMA, a European Union-funded project
to create technology for helping small communities respond better to
emergency situations. ERMA researchers developed an online risk
management system with a fixed-line and mobile telecoms alarm system,
examining commercial business software to see how it could be adapted for
rescue work. Process management software has many parallels to emergency
response, with one step being followed by another. "It's
not the automation aspects of the software, but the pre-planning aspect
which dictates exactly what to do at each step of a process that can be
translated into procedures for, say, a fire brigade to follow," says ERMA
technical coordinator Gertraud Peinel. There also are similarities in
customer relationship management software, which provides companies with
contact and other details of their customers. The researchers developed a
core system and tested the software with simulated floods in Romania and a
simulated toxic cloud in a Spanish port. The challenge now is to persuade
communities to use the software.
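ERMA's software is not detailed in the article; purely as an illustration
of the pre-planned, step-by-step procedures it describes, the Python
sketch below walks through an invented flood-response checklist one step
at a time.

    # Invented procedure steps, for illustration only (not ERMA's plans).
    flood_procedure = [
        "alert the fire brigade and town hall",
        "send the telecom alarm to residents in the risk zone",
        "open the evacuation shelter",
        "report status to the regional authority",
    ]

    def run_procedure(steps):
        for number, step in enumerate(steps, start=1):
            print("step %d: %s" % (number, step))   # confirmed by staff in practice

    run_procedure(flood_procedure)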
Media Convergence--Television Is the Media Hub Device for
Web, TV and Communication
Fraunhofer Institute (08/08)
A new project from Fraunhofer FOKUS and the RTL Deutschland media group
brings TV, the Internet, and communication capabilities together to deliver
interactive and personalized TV. Fraunhofer FOKUS is using IFA 2008 to
show how Internet-based content from RTL Deutschland can be adapted for TV,
integrated with the group's TV programs, and managed by remote control.
"Highly graphic scenarios for tomorrow's media world can be built using the
TV content and online products of RTL, VOX, n-tv, SUPER RTL, and Clipfish,"
says Stefan Arbanowski, head of the Media Interoperability division at
Fraunhofer Institute FOKUS. "Join this up with the work done by FOKUS in
convergence of technologies and content, and the shape of tomorrow's media
world becomes visible to us all." As part of the "Media Convergence of the
Future" project, FOKUS has developed an IPTV system that integrates set-top
boxes, PCs, or other end devices in a shared media environment. New
advertising concepts show the potential of integrating linear TV programs
with nonlinear Internet content in such a manner.
Self-Help Software to Soothe Stressed Astronauts
New Scientist (08/25/08) Powell, David
Researchers at Harvard Medical School are working on Virtual Space
Station, software designed to help astronauts deal with the pressures of
space travel. The multimedia program asks crew members to respond to
multiple-choice questions about how to handle various problems that may
arise in space, and to make lists of their concerns and how to solve them.
The program focuses on the mental health challenges that were highlighted
by a panel of 11 astronauts. The program features an interpersonal
conflict widget that includes interactive videos to help negotiate and
resolve conflicts. The videos are meant to help astronauts develop
self-awareness of their own behaviors, and have already been used to train
groups of new astronauts. Additional group studies will test the
software's effectiveness with other stressful professions that rely on
teamwork, including firefighters and emergency first responders. Another
module of the Virtual Space Station focuses on depression, using an
approach called problem-solving therapy, which is both clinically effective
and relatively simple to encode into a software program. Instead of asking
astronauts to talk about their feelings, the program has them create lists
of concrete things that are bothering them and brainstorm practical ways of
solving these problems.
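The article notes that problem-solving therapy is relatively simple to
encode in software; the Python sketch below, with invented concerns and
options, only illustrates that structure of concrete problems paired with
brainstormed practical steps.

    # Invented illustration of the concerns-and-solutions structure
    # (not the Virtual Space Station's actual data model).
    concerns = {
        "crewmate leaves equipment unstowed": [
            "raise it at the weekly meeting",
            "agree on a shared stowage checklist",
        ],
        "missing family events at home": [
            "schedule a regular video call window",
            "record short messages to send each day",
        ],
    }

    for concern, options in concerns.items():
        print(concern)
        for option in options:
            print("  possible step:", option)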
A Network That Builds Itself
Technology Review (09/03/08) Fitzgerald, Michael
The National Institute of Standards and Technology (NIST) has developed
two experimental ad hoc wireless networks that instruct emergency workers
how to deploy transmitters to ensure a good signal. The NIST prototypes,
which have been under development for more than three years, use algorithms
to monitor the signal-to-noise ratio of transmissions and automatically
warn when a new node should be deployed. NIST's Nader Moayeri says the
prototypes aim to avoid fixed rules because situations change depending on
the area. The methods also need to be adaptable because deploying too many
nodes can lead to excessive costs and communication delays. Initially,
NIST considered sending short messages between nodes to see how many data
packets were lost in transit, but with that approach the people deploying
the network would not detect a weak connection immediately. Using an
algorithm to
measure the signal-to-noise ratio avoids this problem and provides a
clearer picture of connection strength. NIST built two prototypes using
off-the-shelf hardware, one that operates at 900 megahertz and uses motes to
transmit radio signals, and one based on a Wi-Fi network operating at 2.4
gigahertz. The mote-based system has LED lights that automatically change
from green to red when a new node needs to be deployed, and the Wi-Fi
system issues alerts through a handheld or tablet computer connected to the
network.
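NIST has not published the prototypes' algorithms in the article; the
Python sketch below, whose threshold and link readings are assumptions,
shows only the general idea of watching each link's signal-to-noise ratio
and warning when it drops low enough that a new node should be deployed.

    # Assumed threshold and readings, for illustration only.
    SNR_THRESHOLD_DB = 10.0        # below this, the link is considered too weak

    def check_links(link_snrs_db):
        """Warn for every link whose signal-to-noise ratio is too low."""
        for link, snr in link_snrs_db.items():
            if snr < SNR_THRESHOLD_DB:
                print("deploy a new node: link %s at %.1f dB" % (link, snr))
            else:
                print("link %s OK at %.1f dB" % (link, snr))

    check_links({"node1-node2": 18.2, "node2-node3": 7.4})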
Fellowship Applications for Doctoral Students Pursuing
HPC Research Due September 8
SC Conference (09/01/08)
Sept. 8, 2008, is the last day to turn in applications for the High
Performance Computing Ph.D. Fellowship Program. Applicants will have an
opportunity to receive a stipend of $5,000 or more for one academic year,
plus travel support to attend the SC conference. Up to three fellowships
have been awarded each year. ACM partners with the IEEE Computer Society
and the SC Conference Series to honor outstanding Ph.D. students around the
world pursuing research in high performance computing (HPC), networking,
storage, and analysis. Candidates must be enrolled in a full-time Ph.D.
program at an accredited college or university, should have completed one
year or more of study in their doctoral program at the time of their
nomination, and must meet minimum scholarship requirements at the
institutions. Full-time faculty members at Ph.D.-granting accredited
institutions must nominate them. Selections will be based on applicants'
research potential, academic progress, how their technical interests match
those of the industry, and how they plan to use HPC resources.
Lines and Bubbles and Bars, Oh My! New Ways to Sift
Data
New York Times (08/31/08) P. BU4; Eisenberg, Anne
The Many Eyes experimental Web site organized by scientists at IBM's
Watson Research Center offers more than a dozen ways to visually and
collaboratively represent data for analysis. One method, known as an
interleaved tag cloud, allows users to make side-by-side comparisons of the
relative frequencies of the words in two passages. Users can incorporate
images and links to visualizations in their blogs or Web sites. "When you
have a group look at data, you protect against bias," says Stanford
professor Pat Hanrahan. "You get more perspectives, and this can lead to
more reliable decisions." One presentation on the Many Eyes site charted
the deaths resulting from human violence in the 20th century, with one user
visualizing casualties associated with specific events on a bubble graph.
Later the originator supplied a line graph and a stack graph to plot
casualty numbers against population growth, which revealed a decline in
violent death in the century's latter decades, says Many Eyes co-creator
and IBM researcher Martin Wattenberg. University of Maryland, College Park
professor Ben Shneiderman says that sites such as Many Eyes are aiding the
democratization of visualization tools. "The gift of the Internet is that
everyone can participate, and the tools can be brought to a much wider
audience," Schneiderman says. "The great fun of information visualization
is that it gives you answers to questions you didn't know you had."
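Many Eyes is a hosted web tool, so the Python sketch below is not its
code; using two invented passages, it computes the side-by-side relative
word frequencies that an interleaved tag cloud makes visible.

    # Relative word frequencies of two passages (invented text).
    import re
    from collections import Counter

    def relative_frequencies(text):
        words = re.findall(r"[a-z']+", text.lower())
        counts = Counter(words)
        total = sum(counts.values())
        return {word: count / total for word, count in counts.items()}

    passage_a = "data data visualization for everyone"
    passage_b = "visualization tools bring data to a wider audience"

    freq_a = relative_frequencies(passage_a)
    freq_b = relative_frequencies(passage_b)
    for word in sorted(set(freq_a) | set(freq_b)):
        print("%-14s %5.2f  %5.2f" % (word, freq_a.get(word, 0), freq_b.get(word, 0)))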
Q&A: Futurist Ray Kurzweil
InformationWeek (08/26/08) Greene, Michael
Inventor Ray Kurzweil discusses speech technology innovations that can
help disabled users, noting that this summer saw the introduction of a cell
phone that also acts as a reading machine for the vision-impaired. The
handheld can take pictures of signs and read them out, and Kurzweil says
the product additionally functions as a GPS navigation system, email
reader, MP3 player, phone, Web browser, and camera. The device's reading
machine function can work in seven languages, while the user is guided via
voice-directed output and voice prompts. "Speech recognition can also be
used to provide intuitive interfaces into devices," notes Kurzweil, adding
that these products will be especially helpful to senior citizens if ease
of use is incorporated into their design. He says a good design principle
for such interfaces is ensuring that people who have not read the manuals
can easily figure out how to employ the devices. Kurzweil observes that
the intelligence of technology is advancing, with speech recognition and
character recognition becoming more useful. He also predicts that the next
decade will see the emergence of fully immersive technologies that can
replicate the experience of being in environments and interacting with
people that are far removed from the user, as well as augmented reality
systems that can enhance the user's real-world perspective with virtual
overlays.
Carolina Attracts World-Renowned Large-Scale Data
Research Team, DICE
University of North Carolina, Chapel Hill (08/26/08)
The University of North Carolina (UNC) at Chapel Hill is the new home of
the Data Intensive Cyber Environments (DICE) group, formerly known as the
Data Intensive Computing Environments group of the University of
California, San Diego's Supercomputer Center. The research group is
experienced in the development of digital data technologies, including
open source software for sharing data during collaborative research,
publishing digital libraries, and preserving data for future generations.
"The opportunity to recruit an entire group of active
researchers with an international reputation for vision, innovation, and
accomplishment is rare," says UNC Chapel Hill chancellor Holden Thorp.
Thorp says DICE's work is closely aligned with UNC's efforts in digital
libraries and archives, databases, institutional repositories, information
retrieval, and information management. Hosting the DICE group will give
UNC students an opportunity to learn from and collaborate with a
world-class research team, Thorp says. Group members will interact with
colleagues in the school and other campus units on academic digital library
and preservation research efforts, initially focusing on current
collaborations, such as the National Archives and Records Administration
Transcontinental Persistent Archive Prototype, the National Science
Foundation Software Development for Cyberinfrastructure project, and the
Library of Congress Video Archiving project. DICE also may work with UNC's
computer science department and with the Renaissance Computing Institute on
the visualization of large datasets.
IT School to Watch: Carnegie Mellon University
Computerworld (08/18/08) Brandel, Mary
The master's program in human-computer interaction (HCI) at Carnegie
Mellon University (CMU) features six faculty members who belong to ACM's
CHI Academy, more than any other institution. The HCI program trains students
for careers in user interface and usability engineering, systems
development, and interaction design. CMU's School of Computer Science also
offers master's degrees in entertainment technology, e-business technology,
software engineering, software engineering management, IT with a
specialization in very large information systems, robotics, IT service
management, IT-embedded software engineering, and an MBA track in
technology leadership. All of the programs are intended to lead to
professional positions instead of to additional research or academic
appointments, although CMU also offers several academic master's degrees.
CMU HCI master's graduate Madhu Prabaker says the program teaches in a way
that is actionable, and he was able to learn what it is like to operate in
a real-world scenario, helping him hit the ground running even in his first
week as a professional. "Everyone must program all night to find that last
bug, everyone runs tests with users who do things they never could have
predicted, and everyone designs and is subjected to critiques by faculty
and peers," says HCI program director Bonnie John. All of the skills the
students learn are applied in a final project that is longer and more
intense than those in similar programs at other schools, John says. The
projects require that students work in teams of four or five, simulating
a real-world experience.
Open Source: What You Should Learn From the French
IDG News Service (08/28/08) Kaneshige, Tom
About a decade ago, European countries, particularly France, took a
strong lead in the open source movement and left U.S. developers far behind.
France has used high-profile projects and policies to support the use of
open source in all levels of technology in government and education.
France is now continuing its support of open source through an economic
commission, established by French President Nicolas Sarkozy, that
recommends using tax benefits to stimulate even more open source
development. The success of open source in France could serve as an
example to U.S. developers that everyone can prosper when working under a
single, shared technology vision. In France, all computer science
students learn open source, while in the United States most universities
use traditional
tools. Consequently, open source talent is prevalent in France, and
development is faster, while sustaining high-quality software products.
France's most important open source benefit may be the ability to unite
various open source projects to create a single, unified platform. Miguel
Valdes, co-founder of the Bonita Project, which developed an open source
workflow system, believes that French open source developers have a better
understanding than U.S. developers about reusing code and integrating it
with other systems.
Semantic Provenance for eScience: Managing the Deluge of
Scientific Data
Internet Computing (08/08) Vol. 12, No. 4, P. 46; Sahoo, Satya S.; Sheth,
Amit; Henson, Cory
In eScience, the metadata essential to effectively managing the
exponentially growing volumes of scientific data produced by
industrial-scale experiment protocols is known as provenance information.
The semantic provenance architecture for eScience data incorporates
expressive provenance information and domain-specific provenance
ontologies and applies them to data management. The authors write that
provenance information has to be expressive and software-interpretable so
that it can be employed effectively for eScience data management, and to
accomplish this, the authors have combined the concept of provenance
information with domain knowledge and ontological underpinning. They call
for a new approach that separates the task of producing high-quality
semantic provenance from the core functionality of workflow engines. To
that end, they present a "two-degrees-of-separation" strategy: semantic
provenance creation is handled by specialized services that cite one or
more domain-specific provenance ontologies and can be embedded within
scientific workflows on demand, while a workflow engine is equipped with a
set of such services and a suite of domain-specific provenance ontologies
as resources that can be flexibly blended into a scientific workflow based
on user needs. The semantic provenance framework for
eScience is described by the authors as incorporating three basic
dimensions representing semantic provenance annotation, domain provenance
ontologies, and usage. The first dimension entails a set of specialized
tools interfacing with a scientific workflow on demand to generate
semantic-provenance information; the second dimension utilizes
domain-specific provenance ontologies to model scientific processes, data,
and agents as formally defined concepts connected via named relationships;
and the third dimension involves software agents using reasoning tools to
process the semantic-provenance information and answer sophisticated domain
queries.
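The authors' services and ontologies are formal artifacts that the
article does not reproduce; the toy Python sketch below, with invented
concept and relationship names, only conveys the flavor of provenance
recorded as domain-specific statements that can be traced backward from a
data product.

    # Toy provenance statements (invented names, not the authors' ontologies).
    provenance = [
        # (subject,           relationship,    object)
        ("mass_spectrum_42",  "generated_by",  "lcms_run_7"),
        ("lcms_run_7",        "used_sample",   "glycan_sample_3"),
        ("glycan_sample_3",   "prepared_with", "protocol_enzymatic_digest"),
    ]

    def trace(artifact, statements):
        """Follow provenance links backward from a data artifact."""
        for subject, relationship, obj in statements:
            if subject == artifact:
                print("%s --%s--> %s" % (subject, relationship, obj))
                trace(obj, statements)

    trace("mass_spectrum_42", provenance)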