Welcome to the August 11, 2014 edition of ACM TechNews, providing timely information for IT professionals three times a week.
Updated versions of the ACM TechNews mobile apps are available for Android phones and tablets (click here) and for iPhones (click here) and iPads (click here).
HEADLINES AT A GLANCE
Big Data's High-Priests of Algorithms
The Wall Street Journal (08/08/14) Elizabeth Dwoskin
Industry demand for data scientists has exploded in recent years, concurrent with a boom in the generation of data from advanced technologies such as smartphones. Merchants, banks, manufacturers, and many other businesses are looking for people who are especially skilled in sifting through massive volumes of data streaming in from different sources for patterns in customer habits, and in crafting predictive statistical models. The U.S. National Science Foundation says data scientists are usually hired from the statistics, biostatistics, particle physics, and computer science fields, and 2012 saw approximately 2,500 doctorates awarded in those disciplines. Some data scientists say they were drawn to industry positions because funding for purely scientific research was cut back during the recession. Employers increasingly are seeking assistance from the Insight Data Science Fellows Program, which helps doctoral candidates from fields such as astrophysics, neuroscience, and math transition into professional data-scientist roles. The program has a 100-percent placement rate, and alumni work on data-science teams at established as well as fledgling Silicon Valley companies. Some data scientists recruited into non-academic settings say their work can be surprisingly rewarding and meaningful, even without the greater intellectual challenges of basic research.
What Cars Did for Today's World, Data May Do for Tomorrow's
The New York Times (08/10/14) Quentin Hardy
The acquisition and processing of digital information has become the dominant industrial ecosystem, which calls for new and improved ways of collecting, shipping, and processing data. An example of this ecosystem is General Electric's (GE) announcement of a "data lake" technique for analyzing sensor information from industrial equipment in systems such as railroads, airlines, hospitals, and utilities. GE says this method enabled it to examine data from 3.4 million miles of flights by 24 airlines using GE jet engines over the last three months, and to identify possible defects 2,000 times as fast as it could previously. In a similar vein is a method for engineering fiber-optic cable that University of California, San Diego researchers are using to make digital networks operate 10 times faster. California Institute for Telecommunications and Information Technology director Larry Smarr says the goal of this effort is to accommodate the massive volumes of data generated by increasingly advanced technology, and he anticipates commercial networks eventually running 10,000 times faster than they do now. Data computation also is undergoing an evolution through initiatives such as Databricks, a startup that offers new types of software for rapid data analysis on a rental basis.
Microsoft Shows Off Video Stabilization Tools
BBC News (08/11/14)
New software from Microsoft researchers can stabilize shaky video taken while cycling, climbing, kayaking, or engaging in some other high-speed sport, and speed the footage up to make it more watchable. Hyperlapse is designed to analyze the video and then create new frames to smooth out camera jumps. The software analyzes a video for significant features in each scene and creates a very approximate reconstruction of the part of the world through which the camera traveled. Next, it works out the smoothest path the camera could take through this virtual reconstruction. Hyperlapse then renders footage in which the camera travels the smoother path. At this point in the process, the software generates and adds extra frames to remove jumps in the original video and to fill in around the smooth path of the camera. Developers Johannes Kopf, Michael Cohen, and Richard Szeliski are presenting their research at the ACM SIGGRAPH conference this week in Vancouver, Canada.
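The smoothing step can be illustrated with a toy sketch. This is only an illustration of the general idea, not Microsoft's algorithm, which reconstructs the scene in 3-D and re-renders new frames along the smoothed path; here, each camera position is simply replaced with a windowed average of its neighbors to remove high-frequency shake.

```python
# Toy camera-path smoothing: average each position over a small
# window of neighboring frames. Positions are 1-D stand-ins for
# the camera's pose in each frame.

def smooth_path(positions, window=5):
    half = window // 2
    smoothed = []
    for i in range(len(positions)):
        lo = max(0, i - half)                  # clamp window at the ends
        hi = min(len(positions), i + half + 1)
        neighbors = positions[lo:hi]
        smoothed.append(sum(neighbors) / len(neighbors))
    return smoothed

shaky = [0.0, 2.0, -1.0, 3.0, 0.5, 2.5, -0.5, 3.5]
print(smooth_path(shaky, window=3))
```

A larger window gives a smoother path at the cost of drifting further from where the camera actually was, which is why Hyperlapse also synthesizes new frames around the smoothed path.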
Origami Robot Inspired by Protein Folding
The Engineer (United Kingdom) (08/08/14)
Researchers at the Massachusetts Institute of Technology and Harvard University have developed a robot that assembles itself into a complex shape in just four minutes and moves without any human intervention. The researchers say their breakthrough demonstrates the potential to quickly and inexpensively build machines that interact with the environment, and to automate much of the design and assembly process. "Here we created a full electromechanical system that was embedded into one flat sheet," says Harvard researcher Sam Felton. The researchers used computer design tools to create the optimal design and fold pattern, enabling the robot to fold itself up and walk away. The new method relies on origami, which enabled the researchers to avoid the traditional approach to assembling complex machines. The researchers started with a composite sheet of paper and polystyrene with a single flexible circuit board in the middle, to which they added two motors, two batteries, and a microcontroller. The sheet also includes hinges programmed to fold at specific angles. Each hinge includes embedded circuits that produce heat on command from the microcontroller. The heat triggers the composite to self-fold in a series of steps. When the hinges cool, the polystyrene hardens, and the microcontroller signals the robot to walk away.
Google's Big-Data Tool, Mesa, Holds Petabytes of Data Across Multiple Servers
IDG News Service (08/08/14) Joab Jackson
Google says its big-data architecture, Mesa, can store petabytes of data, update millions of rows of data per second, and field trillions of queries daily across multiple servers, enabling continuous operation of the data warehouse even if a data center fails. "Mesa ingests data generated by upstream services, aggregates and persists the data internally, and serves the data via user queries," note Google researchers. They say Mesa was originally constructed to house and analyze critical measurement data for Google's Internet advertising business, but the technology could be applicable to other, similar data warehouse tasks. Mesa is dependent on other Google technologies, such as the Colossus distributed file system, the BigTable distributed data storage system, and the MapReduce data analysis framework. Google engineers implemented Paxos, a distributed synchronization protocol, to help address query consistency issues. Mesa also can operate on generic servers, making costly specialized hardware unnecessary and enabling Mesa to be run as a cloud service with the advantage of scalability. Consultant Curt Monash suggests Mesa may face limited future commercial prospects, and he recommends enterprises seek commercial or open source options to maintain data warehouse consistency across data centers before adopting Google's offerings. He says the majority of new data stores under development boast some form of multi-version concurrency control.
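The technique Monash mentions, multi-version concurrency control (MVCC), can be shown with a toy versioned store. This is a minimal sketch of the general idea, not Mesa's implementation: writers append new versions rather than overwriting in place, so readers querying at a given version always see a consistent snapshot.

```python
# Toy multi-version store: every write batch creates a new version,
# and reads are answered "as of" a chosen version.

class VersionedStore:
    def __init__(self):
        self.versions = {}           # key -> list of (version, value), append-only
        self.current_version = 0

    def write(self, updates):
        """Apply a batch of key->value updates atomically as one new version."""
        self.current_version += 1
        for key, value in updates.items():
            self.versions.setdefault(key, []).append((self.current_version, value))
        return self.current_version

    def read(self, key, at_version):
        """Return the value of key as of at_version, or None if absent then."""
        latest = None
        for version, value in self.versions.get(key, []):
            if version <= at_version:
                latest = value
        return latest

store = VersionedStore()
v1 = store.write({"clicks": 100})
v2 = store.write({"clicks": 250, "views": 7})
print(store.read("clicks", v1))  # snapshot at v1 -> 100
print(store.read("clicks", v2))  # snapshot at v2 -> 250
```

Because old versions are never mutated, a reader holding version v1 is unaffected by concurrent writes, which is what makes this style of design attractive for always-on warehouses.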
A New Chip Functions Like a Brain, IBM Says
The New York Times (08/07/14) John Markoff
IBM researchers have developed a new kind of computer chip that tries to mimic the way brains recognize patterns, relying on densely interconnected webs of transistors similar to the brain's neural network. Named TrueNorth, the processor has electronic "neurons" that are able to signal others when a type of data--such as light--passes a certain threshold. Working in parallel, the neurons begin to organize the data into patterns suggesting the light is growing brighter, or changing color or shape. The chip contains 5.4 billion transistors but draws just 70 milliwatts of power. The vast number of circuits working in parallel enables the chip to perform 46 billion operations a second per watt of energy consumed. TrueNorth has 1 million neurons, which is about as complex as the brain of a bee. "It is a remarkable achievement in terms of scalability and low power consumption," says the Lawrence Berkeley National Laboratory's Horst Simon. IBM's work in neuromorphic computing was funded by the U.S. Defense Advanced Research Projects Agency, but neural network expert Yann LeCun doubts IBM's TrueNorth approach will ever outpace the fastest current commercial processors. "This particular task won't impress anyone in computer vision or machine learning," he contends.
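The threshold-and-signal behavior described above can be sketched with a simple integrate-and-fire model. This is illustrative only and does not model TrueNorth's hardware; the threshold and leak values here are invented.

```python
# Minimal leaky integrate-and-fire neuron: accumulate input each
# step, leak a little, and emit a spike when the potential crosses
# the threshold, then reset.

def run_neuron(inputs, threshold=1.0, leak=0.1):
    potential = 0.0
    spikes = []
    for x in inputs:
        potential = max(0.0, potential - leak) + x  # leak, then integrate
        if potential >= threshold:
            spikes.append(1)      # fire: signal downstream neurons
            potential = 0.0       # reset after firing
        else:
            spikes.append(0)
    return spikes

print(run_neuron([0.4, 0.4, 0.4, 0.0, 1.2]))  # -> [0, 0, 1, 0, 1]
```

Networks of such neurons spend energy only when spikes occur, which is the intuition behind the chip's very low power draw.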
Carnegie Mellon Developing Programming Language That Accommodates Multiple Languages in Same Program
Carnegie Mellon News (PA) (08/07/14) Byron Spice
Carnegie Mellon University (CMU) researchers have developed Wyvern, a programming language that makes it possible to develop software using a variety of targeted, domain-specific languages rather than writing the entire program in a general-purpose language. The researchers say Wyvern enables programmers to use the language most appropriate for each function while guarding against code injection attacks. Wyvern determines which sublanguage is being used within the program based on the type of data the programmer is manipulating. "Wyvern is like a skilled international negotiator who can smoothly switch between languages to get a whole team of people to work together," says CMU professor Jonathan Aldrich. "Such a person can be extremely effective and, likewise, I think our new approach can have a big impact on building software systems." Aldrich notes other researchers have tried to create programming languages that could understand other languages, but they have faced tradeoffs between composability and expressiveness. "With Wyvern, we're allowing you to use these languages, and define new ones, without worrying about composition," says CMU Ph.D. student Cyrus Omar. Although Wyvern is not fully engineered, Omar says the open source project is ready for experimental use by early adopters.
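The type-directed idea can be sketched as follows. This is a hypothetical simplification, not Wyvern's actual mechanism or syntax: the expected type of a value selects which sublanguage parser handles an embedded literal, so SQL or HTML text is parsed structurally by its own parser rather than spliced in as a raw string, which is the route code-injection attacks exploit.

```python
# Toy type-directed dispatch: the expected type names the
# sublanguage parser that interprets the literal. The parsers
# here are placeholders that just tag and trim their input.

def parse_sql(text):
    return ("sql", text.strip())

def parse_html(text):
    return ("html", text.strip())

PARSERS = {"SQL": parse_sql, "HTML": parse_html}

def parse_literal(expected_type, text):
    """Hand the literal to the parser chosen by the expected type."""
    return PARSERS[expected_type](text)

print(parse_literal("SQL", " SELECT name FROM users "))
```

Because each sublanguage owns its parsing (and, in a real system, its escaping), user-supplied text never gets concatenated blindly into another language's code.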
Computational Linguistics of Twitter Reveals the Existence of Global Superdialects
Technology Review (08/07/14)
Messages posted on Twitter have provided researchers with a new way to study dialects on a global scale. Researchers at the University of Toulon and the Institute for Cross-Disciplinary Physics and Complex Systems studying Spanish dialects sampled all tweets in the language that contained geolocation information over two years, which resulted in a database of 50 million tweets. They searched the tweets for word variations that are indicative of specific dialects, and then plotted where in the world the different words were being used, producing a map of their distribution. When looking at the environments in which the words were used, the team discovered Spanish dialects fall into two groups. The first superdialect is used in major Spanish and U.S. cities, and the second corresponds to what is used in rural areas in Spain, the Caribbean, and South America, say the University of Toulon's Bruno Goncalves and the Institute for Cross-Disciplinary Physics and Complex Systems' David Sanchez. They found the subclusters and variations of the second superdialect using a machine-learning algorithm. Goncalves and Sanchez's project demonstrates the power of computational linguistics and how it can be applied to modern forms of communication such as Twitter to uncover patterns on an unprecedented scale.
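The word-variant mapping step can be shown with a toy example. This is an assumed, heavily simplified workflow (the cities and counts are invented): tally which variant of a word dominates each location's tweets, then group locations that share dominant variants.

```python
# Toy dialect mapping: count word variants per location and keep
# the dominant variant for each; locations sharing dominant
# variants cluster into the same dialect group.

from collections import Counter, defaultdict

tweets = [
    ("Madrid", "coche"), ("Madrid", "coche"), ("Madrid", "carro"),
    ("Bogota", "carro"), ("Bogota", "carro"),
    ("Miami", "carro"), ("Miami", "coche"), ("Miami", "carro"),
]

counts = defaultdict(Counter)
for city, variant in tweets:
    counts[city][variant] += 1

dominant = {city: c.most_common(1)[0][0] for city, c in counts.items()}
print(dominant)  # -> {'Madrid': 'coche', 'Bogota': 'carro', 'Miami': 'carro'}
```

Scaled to 50 million geotagged tweets and many variant pairs, the same tally-and-cluster approach yields the dialect maps the researchers describe.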
U-M Developing Wearable Tech for Disease Monitoring
University of Michigan (08/06/14) Catharine June
University of Michigan (U-M) researchers are developing a wearable vapor sensor that could offer continuous disease monitoring for patients with diabetes, high blood pressure, anemia, or lung disease. The researchers say the sensor would be the first wearable device to pick up a wide range of chemical, rather than physical, attributes. "Each of these diseases has its own biomarkers that the device would be able to sense," says U-M professor Sherman Fan. In addition to being able to detect a broader array of chemicals than conventional devices, the system is faster, smaller, and more reliable than its counterparts, according to the researchers. The sensor also would be able to identify the presence of hazardous chemical leaks in a laboratory, or provide data about air quality. "With our platform technology, we can measure a variety of chemicals at the same time, or modify the device to target specific chemicals," says U-M professor Zhaohui Zhong. The researchers relied on a technique known as heterodyne mixing, which involves examining the interaction between the dipoles associated with molecules and the nanosensor at high frequencies. They currently are working with the U.S. National Science Foundation's Innovation Corps program to commercialize the sensor.
Computer Writes Its Own Fables
UNSW Newsroom (08/01/14) Ry Crozier
University of New South Wales (UNSW) researchers have developed the Moral Storytelling System, an artificial intelligence program that writes its own fables based on particular combinations of emotions or desires felt by the characters in the story. The system uses 22 different emotions to create the fables. "The computer decides the events to elicit those emotional responses from the characters, and the characters do whatever the plot needs them to do," says UNSW researcher Margaret Sarlej. The program is based on a logical translation of the psychological model known as OCC, named after its creators Ortony, Clore, and Collins, for determining emotions. The researchers hope the system will make computers capable of authoring stories with more sophistication and complexity. Computers "will be making interesting and meaningful contributions to literature within the next decade," says UNSW researcher Malcolm Ryan. "They might be more experimental than mainstream, but the computer will definitely be doing some of the work of writing." Ryan says the idea is to find artists such as authors and computer game designers to participate in the project "and create things we'd never imagined."
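Sarlej's description of events chosen to elicit emotions can be sketched with a toy rule. This is a drastic simplification of the OCC model, and the goal names and rules here are invented for illustration: an event that satisfies a character's goal elicits joy, one that thwarts it elicits distress, and the planner picks events that produce a target emotion.

```python
# Toy OCC-style event selection: emotions follow from whether an
# event matches a character's goal, and the planner searches
# candidate events for one eliciting the desired emotion.

def emotion(character_goal, event_outcome):
    return "joy" if event_outcome == character_goal else "distress"

def pick_event(goal, target_emotion, candidate_events):
    """Return the first event that elicits target_emotion, else None."""
    for event in candidate_events:
        if emotion(goal, event) == target_emotion:
            return event
    return None

print(pick_event("find_food", "joy", ["lose_food", "find_food"]))  # -> find_food
```

The real system reasons over 22 emotions and chains such choices into a full plot, but the plot-serves-emotion direction is the same.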
Berkeley Team Explores Sound for Indoor Localization
Phys.Org (08/01/14) Nancy Owano
University of California, Berkeley researchers have developed SoundLoc, a simple, inexpensive mechanism that can identify rooms based on a relatively small dataset. SoundLoc is a room-level localization system that exploits the intrinsic acoustic properties of individual rooms, and the researchers say they can obtain room impulse responses by employing built-in speakers and microphones on laptops. The SoundLoc system emits a sound and then listens for the return, which will be distorted in a way that depends on the size and shape of the room, the materials on the walls and floors, as well as the furniture and people within it. The researchers tested their system in 10 rooms on the Berkeley campus. They collected 50 samples at each location, which included background noise such as footsteps, talking, and heating and ventilation sounds. The researchers then processed the data to find the echo fingerprint for each room. "The acoustic features we extracted are shown to be distinctive, robust, and efficient to compute," the researchers say. They report nearly 98-percent overall accuracy in identifying the 10 rooms. The researchers are especially interested in applying the SoundLoc method to lower energy consumption in buildings.
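The fingerprint-matching step might be sketched like this. It is an assumed simplification of the approach described above; the feature vectors and room names are invented, standing in for features extracted from each room's recorded impulse response.

```python
# Toy room identification: summarize each room by a feature vector
# and match an unknown recording to the nearest stored fingerprint.

import math

fingerprints = {
    "office":  [0.9, 0.2, 0.1],
    "lecture": [0.3, 0.8, 0.4],
    "lab":     [0.1, 0.3, 0.9],
}

def identify_room(features):
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # nearest-neighbor match against the stored room fingerprints
    return min(fingerprints, key=lambda room: distance(fingerprints[room], features))

print(identify_room([0.85, 0.25, 0.15]))  # -> office
```

Collecting many samples per room, as the researchers did, makes the stored fingerprints robust to background noise such as footsteps and ventilation.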
Climate Change Research Goes to the Extremes
Northeastern University News (07/30/14) Angela Herring
Northeastern University researchers have found that although global temperatures are increasing, so too is the variability of temperature extremes. The big data-based research indicates that as overall temperatures rise, there still could be extreme cold snaps, says researcher Evan Kodra. "Just because you have a year that's colder than the usual over the last decade isn't a rejection of the global warming hypothesis," he says. The researchers used computational tools from big data science to extract specific insights about climate extremes. They also found the natural processes that drive weather anomalies today could continue to do so in a warming future. The researchers combined simulations from the most recent climate models developed by groups around the world as part of the Intergovernmental Panel on Climate Change with reanalysis data sets, which are created by blending the best available weather observations with numerical weather models. They also combined a suite of methods to characterize climate extremes and explain how their variability is influenced by factors such as seasons, geographical region, and land-sea interface. The results could provide important scientific breakthroughs with societal implications, notes Northeastern University researcher Auroop Ganguly. For example, knowing that models forecast a wider range of extreme temperature behavior will enable agriculture, public health, and other sectors to better prepare for the future.
Abstract News © Copyright 2014 INFORMATION, INC.
To submit feedback about ACM TechNews, contact: [email protected]
Current ACM Members: Unsubscribe/Change your email subscription by logging in at myACM.