ACM TechNews
Association for Computing Machinery
Welcome to the August 27, 2014 edition of ACM TechNews, providing timely information for IT professionals three times a week.

Updated versions of the ACM TechNews mobile apps are available for Android phones and tablets, iPhones, and iPads.


A New Explanation for Tech's Pathetic Gender Diversity: The Personal Computer
Salon.com (08/25/14) Andrew Leonard

In an article on the Communications of the ACM website, Tom Geller suggests the precipitous fall of female participation in computer science over the last three decades--women earned 35 percent of computer science degrees in the early '80s but only about 18 percent today--is at least partially due to the ways the advent of the personal computer changed the field. Geller says the PC removed computer science from its "grander context." He says instead of being a field that gave students entry into several other fields, computer science became narrowly focused on hardware and software. Union College computer science professor Valerie Barr, chair of ACM's Council on Women in Computing, agrees, noting that before 1984, students interested in fields such as bioinformatics, computational economics, or quantitative anthropology had to be involved in computer science just as a matter of course. After the PC, female students interested in using computers to pursue a certain field did not need a computer science background, and computer science increasingly became dominated by a male culture focused more or less exclusively on the machines and their software. At the same time, the financial success of computer companies drew more people, mostly men, who were motivated to enter the field by money.


How an Algorithm Detected the Ebola Outbreak a Week Early, and What It Could Do Next
TechRepublic (08/26/14) Lyndsey Gilpin

Researchers at Boston Children's Hospital have developed HealthMap, an international mapping tool that detects and tracks diseases. HealthMap's algorithms recently identified the Ebola outbreak just over a week before it was officially announced. The tool shows the full timeline of events, including locations, case counts, and original source documentation. "[This instance with] Ebola is some of the most usage we've ever had at the site, and it's raising awareness for infectious disease," says HealthMap co-founder John Brownstein. In addition, HealthMap captures narratives of case situations, such as patients leaving their containment areas, civil unrest, and other impacts of the outbreak. HealthMap's Web crawler collects information from hundreds of thousands of sources across the Internet, based on keyword searches for disease-related terms. The HealthMap researchers are continually expanding the list of data sources they access, including data from Wikipedia, Yelp, Twitter, and mobile apps, all of which can provide early signals of disease activity. HealthMap co-founder Clark Freifeld says the algorithm is 90-percent accurate at filtering out unwanted data, and it is constantly improving through feedback from analysts. HealthMap has expanded into 15 languages and uses social media sources to find drivers of disease such as attitudes toward vaccines and animal health. "What's exciting about HealthMap is that we are capturing all this information and making it available and accessible," Freifeld says.


Helping Researchers Cope With the Medical Literature Knowledge Explosion
KurzweilAI.net (08/27/14)

Computational biologists at the Baylor College of Medicine and analytics experts at IBM Research have collaborated to create a new tool called the Knowledge Integration Toolkit (KnIT), which will help research scientists make better use of the massive volumes of scientific research available in public databases. KnIT relies in part on IBM's Watson supercomputer to help mine public databases for relevant studies and then quickly digest and synthesize them so they can be used to formulate new hypotheses. Olivier Lichtarge, director of Baylor's Center of Computational and Integrative Biomedical Research, says that especially in biology and medicine there is a tremendous amount of research that no single scientist could ever hope to completely absorb in a timely fashion, and the volume of research is always increasing. Lichtarge uses the example of a single tumor-suppressing protein, p53, which is discussed in some 70,000 research papers. Reading them all would take a researcher decades, but KnIT is able to do the job far faster. In a test using the literature for p53, KnIT was able to generate a hypothesis about the protein that was later proved to be accurate. Lichtarge will discuss the study and the KnIT project today at the ACM SIGKDD Conference on Knowledge Discovery and Data Mining.


The Surveillance Engine: How the NSA Built Its Own Secret Google
The Intercept (08/25/14) Ryan Gallagher

The U.S. National Security Agency (NSA) is secretly supplying data to nearly two dozen government agencies with ICREACH, a "Google-like" search engine designed to share more than 850 billion records about phone calls, emails, cellphone locations, and Internet chats, according to classified documents. The documents indicate ICREACH has for years allowed the NSA to make massive amounts of surveillance data directly accessible to domestic law enforcement agencies. "[The ICREACH team] began over two years ago with a basic concept compelled by the [Intelligence Community's] increasing need for communications metadata and NSA's ability to collect, process, and store vast amounts of communications metadata related to worldwide intelligence targets," says a secret 2007 memo. The search tool is able to handle 2 billion to 5 billion new surveillance records daily, and it facilitates access to a vast database that intelligence analysts can mine for "foreign intelligence," which is a less-specific term than counterterrorism. The system's simple search interface enables analysts to run searches against specific "selectors" affiliated with a person of interest, and return a results page that can be used to uncover the subject's social network. A U.S. official reports ICREACH is not a data repository, but rather enables analysts to execute one-stop searches for information from a broad array of separate databases. Legal experts are troubled about ICREACH's scope and its potential use for domestic, non-terrorism-related inquiries.


Classroom Contest Yields Publishable Results
MIT News (08/26/14) Larry Hardesty

Massachusetts Institute of Technology (MIT) professor Hari Balakrishnan last year taught a graduate-level networking course featuring a two-week research project in which 20 two-person student teams designed competing protocols for managing congestion in cellular networks. At the same time, Balakrishnan was scheduled to present such a protocol, called Sprout, at a major networking conference. Balakrishnan told his students that the members of any team producing better results than Sprout would be listed as co-authors of a paper describing the contest and its results. "You don't get this kind of access to really smart people working on a problem in a focused manner elsewhere," says MIT researcher Anirudh Sivaraman. Over the two-week research project, the students tested about 3,000 variations of several dozen protocol designs and found a clear trade-off between delay and throughput, which could be useful information for protocol designers. "We were able to actually trace out this design space and say that our own protocol, Sprout, was somewhere on the frontier--that hey, people can do better than us on throughput, or on delay, but probably not on both," Sivaraman says. Project-based educational initiatives like the protocol design contest are of particular interest at MIT. "The students definitely liked the aspect of working on a cutting-edge research problem," Sivaraman says.
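
The "frontier" Sivaraman describes can be read as a Pareto frontier: the set of designs that no other design beats on both throughput and delay at once. The sketch below is one minimal way to compute such a frontier. The `Result` type, the protocol labels, and all of the numbers are made up for illustration; they are not the contest's data or the researchers' code.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str          # hypothetical protocol label
    throughput: float  # higher is better (units are arbitrary here)
    delay: float       # lower is better (units are arbitrary here)


def pareto_frontier(results: List[Result]) -> List[Result]:
    """Return the results that no other result beats on both metrics."""
    frontier = []
    for r in results:
        # r is dominated if some other result is at least as good on both
        # metrics and strictly better on at least one of them.
        dominated = any(
            o.throughput >= r.throughput
            and o.delay <= r.delay
            and (o.throughput > r.throughput or o.delay < r.delay)
            for o in results
        )
        if not dominated:
            frontier.append(r)
    return sorted(frontier, key=lambda r: r.delay)


# Made-up numbers for illustration only; not the contest's measurements.
results = [
    Result("A", throughput=5.0, delay=300.0),
    Result("B", throughput=4.0, delay=150.0),
    Result("C", throughput=3.5, delay=200.0),  # worse than B on both metrics
    Result("D", throughput=2.0, delay=80.0),
]
frontier_names = [r.name for r in pareto_frontier(results)]
print(frontier_names)
```

In this toy data, design C drops out because B is better on both metrics, while A, B, and D each win on one metric and lose on the other, which is the trade-off the quote describes.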


'Robo Brain' Mines the Internet to Teach Robots
Cornell Chronicle (08/25/14) Bill Steele

Researchers at Cornell, Stanford, and Brown universities and the University of California, Berkeley are developing Robo Brain, a large-scale computational system that learns from publicly available Internet resources. The system currently is downloading and processing about 1 billion images, 120,000 YouTube videos, and 100 million tutorial documents and appliance manuals. The information is being translated and stored in a robot-friendly format that robots will be able to access when they need it. "If a robot encounters a situation it hasn't seen before, it can query Robo Brain in the cloud," says Cornell professor Ashutosh Saxena. Robo Brain will process images to identify the objects in them, and connect those images with text to learn to recognize objects and how they are used, as well as associated human language and behavior. The system relies on structured deep learning, a technology in which information is stored in many levels of abstraction. A robot's computer brain stores learned information in a form known as a Markov model, which can be represented graphically as a set of nodes connected by lines. "The Robo Brain will look like a gigantic, branching graph with abilities for multi-dimensional queries," says Cornell's Aditya Jami.


Smartphones Set Out to Decipher a Cryptographic System
Swiss Federal Institute of Technology in Lausanne (08/25/14)

Researchers in the LACAL laboratory at the Swiss Federal Institute of Technology in Lausanne have developed an Android app to crack a cryptographic system by enabling thousands of smartphones to work together on the task. Users launch the application to run an algorithm many times in an effort to eventually break the cryptographic code. The app lets users form teams, view their statistics, and measure their participation. "All of us do not necessarily have a computer for running the algorithm, making it difficult to gather a few dozen," says LACAL Lab student Ramasamy Gowthami. "On the other hand, everyone has a smartphone, and launching the application becomes a child's game." Gowthami says it is important to constantly assess cryptographic systems because they can be broken at some point, so people can know their limitations and adapt them if they are no longer safe. For example, she says this can be achieved by extending the length of the encryption key. "Since I was in charge of the interface between the program's components, I had to have a perfect knowledge of the elements of the algorithm," Gowthami notes.
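
To illustrate why a longer key helps, the sketch below assumes the simplest possible attack, an exhaustive search over every key, in which each added bit doubles the work. That assumption is a simplification: the abstract does not say which algorithm the LACAL app runs, and the function names here are hypothetical.

```python
def keyspace(bits: int) -> int:
    """Number of distinct keys of the given length, assuming every bit is free."""
    return 2 ** bits


def growth_factor(old_bits: int, new_bits: int) -> int:
    """How many times larger the exhaustive search becomes after lengthening the key."""
    return keyspace(new_bits) // keyspace(old_bits)


# Assumption: brute-force search only; real attacks are often much faster.
one_more_bit = growth_factor(112, 113)
sixteen_more_bits = growth_factor(112, 128)
print(one_more_bit, sixteen_more_bits)
```

Under this assumption, adding 16 bits multiplies the search effort by 65,536, so a modest increase in key length can outpace a large increase in the number of participating devices.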


The Ultimate Challenge for Recommendation Engines
Technology Review (08/25/14)

One of the next improvements for recommendation engines could be the ability to identify shared accounts, which could help ensure that people in a household receive appropriate recommendations. The Massachusetts Institute of Technology's Amy Zhang and colleagues are working to see if this is possible. The researchers studied two datasets of movie recommendations, and applied several standard mathematical approaches to a subspace clustering problem to separate out a joint set of ratings into its component parts. They applied this method to another group of about 55,000 households, with their algorithm labeling 37,000 as single-person accounts, 15,000 as two-person accounts, and 3,000 as used by three or more people. "A visual inspection of the accounts that were labeled as composite yield some interesting observations," giving the team confidence the algorithm was on the right track, Zhang says. For example, in many accounts sequels or seasons of the same TV shows were grouped together, and the researchers found one user would prefer movies labeled as Science Fiction and Fantasy while another might prefer movies labeled as Romantic. The researchers' approach to changing recommendations for shared accounts is to display the top recommendations for each user.


Can Computers Replace Historians?
BBC News (08/22/14) Rory Cellan-Jones

Computers have the potential to sift through the big data of history to help people spot patterns and determine what direction the world is heading in, according to Georgetown University researcher Kalev Leetaru. Leetaru used the Google BigQuery tool to crunch the GDELT database of media reports stretching back to 1979. "What we did here was use this tool to shove in a quarter of a billion records and use this massive piece of software to just in a few minutes sift out the patterns in this data," Leetaru says. He says he uncovered complex patterns of events repeating themselves over the years by examining recent events in Egypt, Ukraine, and Lebanon and attempting to draw common patterns. Leetaru cites the continued ups and downs in coverage of Ukraine as an example. "You don't see this traditional burst of interest and then tailing off, you see this complex up-and-down movement over the two months after the protests started and it turns out this predicts that entire complex up-and-down cycle," he points out. Leetaru says although the idea that computing power and artificial intelligence can now start replacing some intellectual disciplines may be frowned upon, historians should view this kind of computational tool as just another technique rather than a threat to their professional expertise.


Stealing Encryption Keys Through the Power of Touch
Ars Technica (08/21/14) Peter Bright

Tel Aviv University researchers have demonstrated a side-channel attack against the GnuPG encryption software that enables them to access decryption keys by touching exposed metal parts of laptop computers. The metal parts of a laptop, such as the shielding around a USB port, are notionally all at a common ground level, but this level undergoes tiny fluctuations due to the electric fields within the laptop. These variations can be measured, and the measurements can be used to leak information about encryption keys. Although this measurement has been demonstrated by directly attaching a digitizer to a metal part of the laptop, the researchers showed they could also retrieve information by measuring at the far end of shielded USB, VGA, and Ethernet cables. They also demonstrated that a person in contact with metal parts of the laptop can in turn be connected to a digitizer, and the voltage fluctuations can be measured, a technique that works better in hot weather because of the lower resistance of sweaty fingers. The researchers reported their findings to the GnuPG developers, and the software has been modified to reduce some of the information leaked this way. However, even with the alteration, the software is not immune to this side-channel attack, and different encryption keys can be distinguished from one another.


Researchers Made a Fake Social Network to Infiltrate China's Internet Censors
Motherboard (08/21/14) Jason Koebler

Researchers from Harvard University and the University of California, San Diego recently published a paper that provides one of the first Western glimpses inside the systems China uses to censor the online communications of its citizens. The researchers, led by Harvard professor Gary King, first made posts on Chinese microblogging service Weibo, using various keywords and phrases to see what posts were censored. They then invented their own social networking service and launched it in China. "From inside China, we created our own social media website, purchased a URL, rented server space, contracted with one of the most popular software platforms in China used to create these sites, submitted, automatically reviewed, posted, and censored our own submissions," King says. The researchers found that instead of a monolithic government censorship apparatus, censorship worked in a somewhat hands-off way designed to encourage competition among services to create superior censorship systems. Even more surprising, they found the Chinese government is not averse to criticism of the government or government officials, but rather to criticisms that seem to incite others to take action. King says criticisms of local government officials on social media are actually helpful to the central government, offering it an unofficial method of monitoring and policing those officials.


Hacking Gmail With 92 Percent Success
UCR Newsroom (08/20/14) Sean Nealon

University of California, Riverside (UCR) researchers have uncovered a weakness believed to exist in Android, Windows, and iOS mobile operating systems that could be used to obtain personal information from unsuspecting users. The attack works by getting a user to download a seemingly safe, but actually malicious, app such as one for background wallpaper on a phone. The attackers can then exploit a newly discovered public side channel, the shared memory statistics of a process, which can be accessed without any privileges. The researchers track changes in shared memory and are able to correlate changes to what they call an activity transition event. Enhanced with a few other side channels, the researchers found it is possible to fairly accurately monitor in real time which activity a victim app is in. The attack must take place at the exact moment the user is logging into the app or taking a picture, and also needs to be carried out in an inconspicuous way. "By design, Android allows apps to be preempted or hijacked," says UCR professor Zhiyun Qian. "But the thing is you have to do it at the right time so the user doesn't notice. We do that and that's what makes our attack unique."


Skype's Real-Time Translator Learns How to Speak From Social Media
IEEE Spectrum (08/22/14) Teresa Chong

Microsoft's upcoming Skype Translator app translates multilingual conversations in real time through a combination of speech recognition, machine translation, and speech synthesis technologies. A key element of the app is a software system Microsoft Research developed to translate social media musings. An essential component to enable translation is the syntactically informed phrasal statistical machine translation (syntactic SMT) system, which builds on the phrasal SMT platform but comprehends syntax as well. Syntactic SMT does not simply match common phrases, but deconstructs a phrase into individual words and then maps each word over to the other language. Microsoft researchers then started studying Facebook, Short Message Service, and Twitter communications to determine the best technique for managing conversational text. To account for the characteristic differences of each social media platform, the researchers developed software capable of automatically adapting to these distinctions to generate something that syntactic SMT can process. The addition of this normalization system to the translator's protocol improved the quality of translations by boosting their accuracy by 6 percent, says Microsoft's Vikram Dendi. "There's still a lot of work to do, but when we did this, it really did move the needle on understanding and translating that type of data better," he says.


Abstract News © Copyright 2014 INFORMATION, INC.


To submit feedback about ACM TechNews, contact: technews@hq.acm.org