Association for Computing Machinery
Welcome to the July 15, 2015 edition of ACM TechNews, providing timely information for IT professionals three times a week.

Updated versions of the ACM TechNews mobile apps are available for Android phones and tablets, iPhones, and iPads.


Google Fights Spam With Artificial Intelligence
The Christian Science Monitor (07/13/15) Graham Starr

Google announced last week it has been using an artificial neural network to enhance email spam filtering, to the degree that it now blocks 99.9 percent of spam from inboxes while incorrectly classifying legitimate email as spam only 0.05 percent of the time. Google's system relies primarily on Gmail's "report spam" and "not spam" buttons, taking this user input and cross-referencing other users' actions to learn what is and is not spam. The system learns to identify, separate, and redirect malicious emails away from the inbox, but Google acknowledges spam can still bypass the filters, usually by using previously unseen domains or by mimicking legitimate emails. Google's neural network is an array of learning machines, similar to those used for image recognition, that applies natural language processing and information from other users to draw conclusions about the messages it is analyzing. Tufts University professor Anselm Blumer cautions decision-making via neural networks can lead to "overfitting." "A network like that is harder to train, and it's much easier for it to come to false conclusions," he notes. Blumer stresses it is difficult to build a system that never misclassifies legitimate email as spam without also letting some real spam through.
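The feedback loop the article describes, in which user reports supply training labels for a statistical classifier, can be sketched in miniature. The following is an illustrative naive Bayes filter, not Google's system; all class names and messages are invented for the example:

```python
# Minimal sketch of label-from-user-feedback spam filtering (illustrative only).
from collections import Counter
import math

class NaiveBayesFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.msg_counts = {"spam": 0, "ham": 0}

    def report(self, text, label):
        # A "report spam" / "not spam" click becomes a labeled training example.
        self.word_counts[label].update(text.lower().split())
        self.msg_counts[label] += 1

    def classify(self, text):
        scores = {}
        total = sum(self.msg_counts.values())
        vocab = len(self.word_counts["spam"] | self.word_counts["ham"])
        for label in ("spam", "ham"):
            # Log prior plus per-word log likelihood with add-one smoothing.
            logp = math.log(self.msg_counts[label] / total)
            n = sum(self.word_counts[label].values())
            for w in text.lower().split():
                logp += math.log((self.word_counts[label][w] + 1) / (n + vocab))
            scores[label] = logp
        return max(scores, key=scores.get)

f = NaiveBayesFilter()
f.report("win free money now", "spam")
f.report("free prize click now", "spam")
f.report("meeting notes attached", "ham")
f.report("lunch tomorrow with the team", "ham")
print(f.classify("free money prize"))  # -> spam
```

The same accumulation of user votes is what lets the real system keep adapting as spammers switch domains and wording.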

MIT Proves Flash Is as Fast as RAM, and Cheaper, for Big Data
Computerworld (07/13/15) Lucas Mearian

Massachusetts Institute of Technology (MIT) researchers built a flash-based server network that is just as fast as dynamic random access memory (DRAM) and much less expensive for running big data applications. In the new network, each server was connected to a field-programmable gate array (FPGA), with each FPGA then linked to two 500-gigabyte flash chips and to the two FPGAs nearest it in the server rack. The researchers found the network could process a 10-terabyte (TB) dataset in flash using only 10 computers, while processing the same 10-TB dataset in DRAM would have taken about 100 computers. In addition, the new network would cost less than $70,000, versus about $400,000 for the DRAM network. "This price may go down even further if we consider the fact we don't need as much DRAM on each server on a flash-based system," says MIT professor Arvind Mithal. Maintaining a flash-based system also is less costly because flash consumes much less power than DRAM and requires fewer servers. The researchers showed that if servers working on a distributed computation use disk drives to retrieve data even 5 percent of the time, performance falls to the same level as if all the data were kept in flash.
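The last point can be made concrete with a simple weighted-latency model. The latency figures below are generic order-of-magnitude assumptions, not measurements from the MIT work:

```python
# Why occasional disk access erases DRAM's speed advantage: average access
# latency is a weighted mix, and the disk term dominates even at 5 percent.
dram_ns = 100                  # assumed DRAM access, ~100 ns
flash_ns = 100 * 1_000         # assumed flash access, ~100 microseconds
disk_ns = 10 * 1_000_000       # assumed disk access, ~10 milliseconds

def avg_latency_ns(fast_ns, disk_fraction):
    return (1 - disk_fraction) * fast_ns + disk_fraction * disk_ns

dram_with_disk = avg_latency_ns(dram_ns, 0.05)  # DRAM cluster, 5% disk hits
all_flash = avg_latency_ns(flash_ns, 0.0)       # everything served from flash
print(f"DRAM + 5% disk: {dram_with_disk:,.0f} ns; all-flash: {all_flash:,.0f} ns")
```

Under these assumptions the DRAM cluster that touches disk 5 percent of the time averages roughly 500,000 ns per access, while the all-flash system averages about 100,000 ns, which is why the DRAM cluster's advantage disappears.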

Mellanox, IBM, ORNL Spearhead UCX Framework Initiative
HPC Wire (07/13/15) John Russell

At the ISC High Performance conference in Frankfurt, Germany, on Monday, Mellanox announced a collaboration with Oak Ridge National Laboratory, NVIDIA, the University of Tennessee, and IBM to develop the Unified Communication X (UCX) framework, an open source network communication framework for high-performance computing (HPC) and data-centric applications. Organizers say the project will help clear a path toward exascale computing by sparking collaboration among industry, labs, and academia, developing an open source, production-grade communication framework for data-centric and HPC applications, and facilitating the highest performance via co-design of software-hardware interfaces. UCX partners hope the initiative will yield not only a mechanism for production-quality software development, but also a low-level research infrastructure for more flexible and portable support of exascale-ready programming models. "UCX is clearly a strategic open source communication framework for future high-performance systems," notes IBM fellow Jim Sexton. He says IBM's contributions include innovations from its PAMI high-performance messaging software currently used in some Top 10 supercomputers. Meanwhile, the UCX organizers say the framework will function as a high-performance, low-latency communication layer and help provide application developers with productive extreme-scale programming languages, libraries, and application programming interfaces, such as Partitioned Global Address Space (PGAS) languages and OpenMP, across multiple memory domains and on heterogeneous nodes.

Where Do Most of the Internet Users Live?
University of Oxford (07/13/15)

Researchers at the Oxford Internet Institute have updated their 2011 visualization of Internet users around the world, using 2013 data from the World Bank. The data is visualized with a hexagonal cartogram, and each hexagon represents about half a million people online. The shading of each country in the cartogram indicates the share of the population with Internet access. The map shows Asia has 1.24 billion Internet users, which is nearly half (46 percent) of the world's users. At the national level, China has the largest Internet population at 600 million people, followed by the U.S. with 270 million, India with 190 million, and Japan with 110 million people online. China has more Internet users than the U.S., India, and Japan combined, even though most of its population has never used the Internet. Among countries with at least 10 million inhabitants, the highest Internet penetration is in the Netherlands, the U.K., Japan, Canada, South Korea, the U.S., Germany, Australia, Belgium, and France. Nonetheless, the map ultimately shows most of the world's people remain disconnected, says Oxford researcher Mark Graham. "Even today, only a bit more than a third of humanity has access to the Internet," he notes.

Computer Program Fixes Old Code Faster Than Expert Engineers
MIT News (07/09/15) Adam Conner-Simons

Researchers at the Massachusetts Institute of Technology's (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed Helium, a program designed to automatically fix existing code without requiring the original source. Helium can achieve in hours or minutes what would take human engineers months, according to MIT professor Saman Amarasinghe. "A system like this can help companies make sure that the next generation of code is faster, and save them the trouble of putting 100 people on these sorts of problems," he says. The CSAIL team started with binary code with debug symbols removed, and Helium enabled the researchers to lift stencil kernels from the stripped binary and reconfigure them as high-level representations that are readable in Halide, a programming language designed for image processing. "Because stencils do the same computation over and over again, we are able to accumulate enough data to recover the original algorithms," says MIT graduate student Charith Mendis. The researchers determined Helium can enhance the performance of certain Photoshop filters by 75 percent, and the performance of less optimized programs, such as the Windows image viewer IrfanView, by 400 to 500 percent. The Helium research was presented last month in a paper at the ACM SIGPLAN conference on Programming Language Design and Implementation (PLDI) in Portland, Ore.
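For readers unfamiliar with the term, a stencil kernel applies the same local formula at every pixel, and that repetition is what lets Helium accumulate enough observations to recover the algorithm. The following one-dimensional blur is purely illustrative; it is not code produced by Helium or written in Halide:

```python
# A stencil computation: the identical neighborhood formula runs at every
# position, which is the repetitive structure Helium exploits in binaries.
def blur_1d(pixels):
    out = list(pixels)  # edges are left unchanged in this toy version
    for i in range(1, len(pixels) - 1):
        # Same three-point stencil applied at every interior pixel.
        out[i] = (pixels[i - 1] + pixels[i] + pixels[i + 1]) // 3
    return out

print(blur_1d([0, 9, 0, 9, 0, 9]))  # -> [0, 3, 6, 3, 6, 9]
```

Because every output pixel is produced by the same formula, observing the computation at a handful of positions is enough to reconstruct it everywhere.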

CMU Leads Google Expedition to Create Technology for 'Internet of Things'
Carnegie Mellon News (PA) (07/09/15) Byron Spice

Researchers at Carnegie Mellon University (CMU) and Google are collaborating to turn the college campus into a living laboratory for a Google-funded, multi-university experiment to create a platform that will enable Internet-connected devices and buildings to communicate with each other. "The goal of our project will be nothing less than to radically enhance human-to-human and human-to-computer interaction through a large-scale deployment of the Internet of Things (IoT) that ensures privacy, accommodates new features over time, and enables people to readily design applications for their own use," says Anind K. Dey, the project's lead investigator and director of CMU's Human-Computer Interaction Institute. The researchers will work with colleagues at Cornell and Stanford universities, as well as the University of Illinois, to create GIoTTO, a new platform to support IoT applications. A separate CMU team will develop technology to further protect the privacy of IoT users. Overall, the project aims to create a complete system of interoperable IoT technology and find answers to key research questions, such as how to preserve privacy and ensure security in an increasingly sensor-filled environment. "An early milestone will include the development of our IoT appstore, where any campus member and the larger research community will be able to develop and share an IoT script, action, multiple-sensor feed, or application easily and widely," Dey says.

Engagement in STEM Through Robotics, Mechatronics, Cybersecurity, and More
National Science Foundation (07/09/15) Maria C. Zacharias

The New York University (NYU) Polytechnic School of Engineering last week launched its Summer of STEM, announcing several projects designed to provide students and educators with learning and development opportunities in science, technology, engineering, and mathematics (STEM) fields. Several of the projects, which are receiving funding through the U.S. National Science Foundation, focus on bringing teachers to NYU for workshops and research that will help them build up the STEM programs at their own schools. For example, the Discovery Research for Teachers program and the Science and Mechatronics Aided Research for Teachers with an Entrepreneurship expeRience (SMARTER) program both focus on robotics and engineering. The Cybersecurity for Teachers and Cybersecurity for College Instructors programs will follow a similar model, but will focus on cybersecurity issues. The Applying Mechatronics to Promote Science program will send NYU engineering students into Brooklyn schools to teach students math and science using robots. Previous incarnations of the program have helped more than 70 percent of participating students raise their STEM test scores by a half or full letter grade. Finally, the GenCyber program will help to teach 75 high school girls about cybersecurity and introduce them to role models in the field.

Can the Apple Watch Enhance Student Achievement?
Government Technology (07/07/15) Jessica Hughes

Pennsylvania State University (PSU) researchers will test how the Apple Watch can enhance student achievement via self-regulation and learning strategies. "I think that there's some versatility here that we haven't seen before in this type of application," says PSU professor Rayne Sperling. She currently is determining the proper learning prompts to help students regulate their learning, and she says "one way that prompts can support students' awareness of their own learning is through modeling the types of questions students should ask themselves. Further, our scaffolds can prompt awareness of whether [the student] understands content and will also provide strategy suggestions." In an ideal scenario, the prompts will be managed in one place but delivered across different formats, including the Apple Watch, smartphones, PSU's Web-based learning management system, and, in the future, other wearable devices, according to researcher Ben Brautigam. The final scaffolds will be provided to student volunteers in fall science, technology, engineering, and math courses so the researchers can see when and how different tech formats affect self-regulated learning and student achievement. Sperling says the culmination of the study will be the ability to predict which types of students are best able to use which types of technologies, and how these tools support their academic achievement.

Astronomers Teach Machine to 'See' Galaxies in Space
Engineering and Technology Magazine (07/08/15) Laura Onita

University of Hertfordshire researchers say they have taught a machine to "see" astronomical images, including the ability to distinguish between galaxies. The astronomers and computer scientists used unsupervised machine learning, which enables galaxies to be classified in real time at high speed, a task previously accomplished only in projects involving thousands of human volunteers. The researchers demonstrated their algorithm using data from the Hubble Space Telescope Frontier Fields. The algorithm is not told what to look for in the images, but instead learns how to see, notes Hertfordshire master's student Alex Hocking, who led the work. "Our aim is to deploy this tool on the next generation of giant imaging surveys where no human, or even group of humans, could closely inspect every piece of data," says James Geach, Hocking's supervisor. The researchers note the technology could be used in other areas, such as in medicine to spot tumors, or in security to scan for suspicious items.
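Unsupervised classification of this kind can be illustrated with a toy clustering example. This k-means sketch on made-up one-dimensional "brightness" features shows the general idea only; the Hertfordshire pipeline works on real image data and is far more sophisticated:

```python
# Toy unsupervised clustering: group objects by a feature with no human labels.
import random

def kmeans(points, k, iters=20, seed=0):
    random.seed(seed)
    centers = random.sample(points, k)          # initialize from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center.
            i = min(range(k), key=lambda c: abs(p - centers[c]))
            clusters[i].append(p)
        # Move each center to the mean of its assigned points.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Invented "brightness" features with two obvious groups, near 1.0 and 10.0.
features = [0.9, 1.1, 1.0, 9.8, 10.2, 10.0]
print(kmeans(features, k=2))
```

The algorithm discovers the two groups on its own; in the astronomical setting, the analogous clusters correspond to different kinds of sources in the imagery.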

Scientists Develop Free, Online Genetic Research Tool
MU News Bureau (MO) (07/07/15) Jeff Sossamon

Scientists at the University of Missouri (MU) have introduced an online, free service called RNAMiner designed to enable biological researchers to handle large data sets. MU professor Jianlin Cheng says a problem with RNA sequencing is "scientists must sift through incredibly large amounts of data to get to usable results. RNAMiner has cut that time drastically." Cheng and doctoral students Jilong Li and Jie Hou partnered with members of several MU departments and the Bond Life Sciences Center to analyze vast genomic data sets and formulate the design of RNAMiner. The website was created to be user-friendly and enables users to upload data and analyze it through as many as five steps against the complete genomes of five species: human, mouse, Drosophila melanogaster, TAIR10 arabidopsis, and Clostridium perfringens. Genomic data for any species is welcome for upload to expand the database, and Cheng says most researchers get results within a couple of hours. A paper accompanying the website's creation was recently published in PLoS One and was funded in part by the U.S. National Institutes of Health and Cheng's U.S. National Science Foundation CAREER Award.

Genomics Among the Biggest of Big Data, Experts Say
University of Illinois News Bureau (07/07/15) Liz Ahlberg

University of Illinois (UI) researchers have completed an assessment which found DNA-sequencing data will require massive computational and storage capabilities beyond anything previously anticipated. They compared the data needs of genomics with those of astronomy, Twitter, and YouTube, projecting the growth in each area through the year 2025. The projections found genomics is poised to become a leader in data acquisition, storage, distribution, and analysis. The only way to handle the data will be to improve the computing infrastructure for genomics, says UI professor Gene Robinson. Genomics data is highly distributed, coming from many different sources. So far, genomics has produced data only on the petabyte scale, but over the last decade genomic sequencing data has doubled about every seven months, and it will grow at an even faster rate as personal genome sequencing becomes more common. By 2025, genomics data will reach the exabyte scale, according to the UI researchers' estimates. "In the future, we may have to take the hard decision of storing only the processed form and not the original, and that, too, in heavily compressed forms, to drastically reduce the storage needs," says UI professor Saurabh Sinha.
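The growth claim can be turned into a rough back-of-envelope projection. The starting volume and the assumption of a sustained seven-month doubling are illustrative, not figures taken from the UI study:

```python
# Rough projection: petabyte-scale data in 2015, doubling every 7 months.
start_pb = 1.0                  # assumed ~1 PB of sequence data in 2015
months = (2025 - 2015) * 12     # 120 months
doublings = months / 7          # about 17 doublings by 2025
projected_pb = start_pb * 2 ** doublings
print(f"~{projected_pb:,.0f} PB, about {projected_pb / 1_000:.0f} exabytes")
```

Even from a single starting petabyte, seventeen doublings land well past the exabyte (1,000 PB) mark, consistent with the researchers' exabyte-scale estimate.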

Minecraft Shows Robots How to Stop Dithering
Technology Review (07/13/15) Will Knight

Brown University professor Stefanie Tellex and colleagues have developed an approach that enables robots to quickly determine the sequence of actions that will work in a particular environment. The approach is based on an algorithm that enables robots to prune away certain possible paths of action by understanding the direction in which a particular task points. The team used the popular computer game "Minecraft" to test the approach, and they report the algorithm controlled a character, learned certain behaviors, and worked through a much smaller set of potential scenarios. The researchers controlled an avatar tasked with putting a virtual gold block into a virtual furnace while avoiding a virtual pool of lava. After performing the task in a limited setting, the algorithm controlling the avatar learned that certain behaviors, such as placing gold blocks on the ground, could be excluded when trying to achieve the goal. The researchers also tested the approach on an actual robot, and they say it is more efficient and even more humanlike because it requires a deeper understanding of a task and its context. They note the strategy could be important as robots take on more complex, open-ended tasks in less structured settings.
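The payoff of pruning is combinatorial. This toy calculation (with invented action names, not the Brown team's algorithm) shows how dropping goal-irrelevant actions shrinks the space of action sequences a planner must consider:

```python
# Sequences of length d over an action set of size b number b**d, so removing
# irrelevant actions before planning shrinks the search space exponentially.
full = ["move", "turn", "place_block", "smelt"]   # invented action names
pruned = ["move", "turn"]                         # goal-irrelevant ones removed
depth = 10                                        # plan length considered

full_space = len(full) ** depth      # 4**10 candidate plans
pruned_space = len(pruned) ** depth  # 2**10 candidate plans
print(f"{full_space:,} vs {pruned_space:,} candidate plans")
```

Halving the action set at depth 10 cuts the candidate plans from over a million to about a thousand, which is why learned pruning lets the planner stop dithering.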

6 Questions for Carnegie Mellon Autonomous Car Prof Rajkumar
Associated Press (07/09/15) Tom Krisher

Carnegie Mellon University (CMU) has a long history with self-driving cars. In 1984, researchers at the university developed a self-driving vehicle called the "Terregator," which navigated using video cameras, lasers, and sonar. More recently, the school has played a major role in the current self-driving car research boom. The head of Google's self-driving car project was educated at CMU and, in addition to doing research for General Motors, the university recently signed a pact to develop a self-driving vehicle with the ride-sharing service Uber. CMU professor Raj Rajkumar has led the university's research into self-driving cars for several years and heads a spin-off company developing autonomous driving software. Asked how soon he expects autonomous cars to be available to the public, Rajkumar says by the mid- to late 2020s, adding he finds credible Google's claim that it will have a fully automated car ready within five years. However, Rajkumar recognizes there are several major obstacles, both technological and social, still to overcome. He says getting self-driving cars to account for bad weather, in particular rain and snow and their impact on road conditions, is going to take considerable time. There also is the looming issue of liability related to self-driving vehicles, a question that could take years to fully resolve.

Abstract News © Copyright 2015 INFORMATION, INC.
Powered by Information, Inc.

To submit feedback about ACM TechNews, contact: [email protected]
Current ACM Members: Unsubscribe/Change your email subscription by logging in at myACM.
Non-Members: Unsubscribe