News 2018

December 2018

Whittaker Named Among Top Pittsburghers of All Time

Byron Spice

Pittsburgh magazine has named William "Red" Whittaker, the Fredkin University Professor of Robotics, as one of "The 50 Greatest Pittsburghers of All Time," putting him in the company of Andrew Carnegie, Rachel Carson, Jonas Salk, Fred Rogers and Chuck Noll. For the magazine's January 2019 issue, editors selected the 50 men and women from more than 200 years of Pittsburgh history, based on their contributions in fields ranging from sports to technology and on how they put the national spotlight on Pittsburgh. The issue coincides with the magazine's 50th year of publishing. Considered the father of field robotics, Whittaker propelled robots from research curiosities mostly found bolted to factory floors or relegated to laboratories to mobile, autonomous units capable of working outdoors in harsh and challenging environments. Field robotics incorporates innovations in controls, sensing, propulsion, suspension, communications and navigation to enable devices that can operate autonomously in changing and uncertain environments. After the nuclear reactor meltdown at Three Mile Island near Harrisburg, Pa., in 1979, Whittaker and his team developed robots to inspect the damaged reactor's basement and perform repairs. This work led to the formation of Carnegie Mellon's Field Robotics Center, which he continues to direct. His other innovations include Dante II, a walking robot that explored an active volcano; Nomad, which searched for meteorites in Antarctica; and Tugbot, which surveyed an 1,800-acre area of Nevada for buried hazards. He led the Tartan Racing team and its Boss self-driving vehicle to victory in the $2 million Defense Advanced Research Projects Agency Urban Challenge in 2007, creating the template for today's burgeoning autonomous vehicle industry. He also founded Astrobotic, a Pittsburgh company that plans to deliver payloads to the moon's surface. Another Pittsburgh company he founded, RedZone Robotics, is using robots to revolutionize the inspection of sewer lines.

Thwarting Bias in AI Systems

Alexandra George

Artificial intelligence systems are at work in many areas where we might not realize — making decisions about credit, what ads to show us and which job applicants to hire. While these systems are really good at systematically combing through lots of data to detect patterns and optimize decisions, the biases held by humans can be transmitted to these systems through the training data.
A team of researchers from Carnegie Mellon University — including CyLab's Anupam Datta, professor of electrical and computer engineering at CMU Silicon Valley; Matt Fredrikson, assistant professor of computer science; and Ph.D. student Samuel Yeom — is detecting what factors directly or indirectly affect decision outcomes and correcting them when they are used inappropriately.
Bias often appears in AI systems through factors like race or gender that aren't directly inputted into the system, but still have a strong influence on its decisions. Discrimination can happen when one of these attributes is strongly correlated with information that is directly used by the system. For example, suppose a system that makes decisions about credit uses zip code as a factor to make its decisions. The direct information about race is not given to the system, but zip code is strongly correlated with race since many neighborhoods are still segregated. By using zip code, the system would be indirectly making decisions based on race. In this case, zip code is a proxy for race.
"If zip code is encoding race and is being used to make decisions about credit, then it's not a defensible proxy," said Datta. "That's what our method can uncover. It can look inside these machine learning models and discover proxies that are influential in the decisions of the model."
To detect bias and repair algorithms that may be making inappropriate decisions, the researchers have developed detection algorithms that identify the variables in a system that may be exhibiting proxy use in an unfair way. The algorithm combs through the model to detect the variables that are correlated with a protected feature (like race, age, or gender) and heavily influence the decision outcome.
The concept of proxy use in machine learning models was formally studied in an earlier paper by a Carnegie Mellon team including Datta, Fredrikson, Ko, Mardziel, and Sen. The first proxy detection algorithm they created was a slow, brute-force algorithm that works in the context of simple decision tree and random forest models, two classes of machine learning models. Most recently, Yeom, Datta, and Fredrikson developed an algorithm that works on linear regression models and scales to numerous applications where these kinds of models are used in the real world, detailed in a paper presented at NeurIPS 2018.
"Our recent results show that, in the case of linear regression, we can simply treat the input attributes as vectors in a low-dimensional space," said Yeom, "and this allows us to use an existing convex optimization technique to identify a proxy quickly."
Once the algorithm has detected the influential variables, it shares them with a human domain expert who decides if the proxy is used in a way that is unjustified. To demonstrate that the algorithm works in practice, they ran it on a model used by a large police department to predict who is likely to be involved in a shooting incident. This model did not have any strong proxies, but gang membership was found to be a weak proxy for race and gender.
A domain expert would then consult this information and decide whether it is a justified proxy. Their method is broadly applicable to machine learning models that are widely used in such high-stakes applications.
Not all instances of proxy use are negative, either. For example, debt-to-income ratio is also strongly associated with race. But if debt-to-income ratio can separately be justified as a strong predictor of creditworthiness, then it is legitimate to use. That is why it is important to have a human domain expert be able to decide once the algorithm has flagged the proxy use.
Detecting and correcting biases in AI systems is just the beginning. As AI systems continue to make important decisions for us, we need to make sure they are fair. Hiring tools, criminal justice systems, and insurance companies all use AI to make decisions, and many other domain areas are continuing to incorporate artificial intelligence in new ways.
"Being able to explain certain aspects of a model's predictions helps not only with identifying sources of bias, but also with recognizing decisions that may at first appear biased, but are ultimately justified," said Fredrikson. "We believe that this ability is essential when applying the approach to real applications, where the distinction between fair and unfair use of information is not always clear-cut."
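The proxy idea above can be made concrete with a small sketch. The Python fragment below is not the team's published algorithm; it simply flags features in a linear model that are both strongly associated with a protected attribute and influential on the model's output. The function name, thresholds and toy data are illustrative assumptions only.

# A minimal sketch of proxy flagging for a linear model: a feature is flagged
# when it is both strongly associated with a protected attribute and
# influential on the model's output. This is a simplification for illustration;
# the thresholds and toy data are invented.
import numpy as np

def flag_proxies(X, weights, protected, feature_names,
                 assoc_thresh=0.5, infl_thresh=0.1):
    """X: (n_samples, n_features) inputs; weights: linear model coefficients;
    protected: (n_samples,) protected attribute (e.g., an encoded group label)."""
    flagged = []
    total_var = np.var(X @ weights)  # variance of the model's output score
    for j, name in enumerate(feature_names):
        assoc = abs(np.corrcoef(X[:, j], protected)[0, 1])      # link to protected attribute
        infl = np.var(weights[j] * X[:, j]) / total_var          # share of output variance
        if assoc >= assoc_thresh and infl >= infl_thresh:
            flagged.append((name, round(assoc, 2), round(infl, 2)))
    return flagged  # a human domain expert reviews whether each use is justified

# Toy example: zip_code tracks the protected attribute and drives the score.
rng = np.random.default_rng(0)
race = rng.integers(0, 2, 1000)
zip_code = race + 0.3 * rng.standard_normal(1000)   # correlated with race
income = rng.standard_normal(1000)                  # independent of race
X = np.column_stack([zip_code, income])
weights = np.array([2.0, 1.0])                      # hypothetical credit-scoring model
print(flag_proxies(X, weights, race, ["zip_code", "income"]))

In the toy data, zip_code is flagged because it both tracks the protected attribute and drives the score, while income is not; as the article notes, a domain expert would then judge whether the flagged use is defensible.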

CDC Says Carnegie Mellon's Flu Forecasts Once Again Most Accurate

Byron Spice

The U.S. Centers for Disease Control and Prevention has announced that Carnegie Mellon University's forecasts of national and regional influenza activity during the 2017-2018 flu season were the most accurate of the 30 systems in its flu forecasting initiative. Carnegie Mellon's Delphi Research Group has proven the most accurate four years in a row and for four of the five years that the CDC has run the forecasting initiative. Last season, 21 research groups participated, testing 30 different forecasting systems. "The CDC seems very happy with this entire initiative," said Roni Rosenfeld, Delphi leader and head of the Machine Learning Department. "Flu forecasts were actually used in official CDC communications for the first time in the 2017-2018 season." Delphi fielded two systems. The first uses machine learning to make predictions based on both past patterns and input from the CDC's domestic flu surveillance system. The second system bases its predictions on the judgments of human volunteers who submit their own weekly predictions — the so-called wisdom-of-crowds approach. CMU is using updated versions of both systems to forecast the current season, which thus far is off to a mercifully slow start. The forecasts are available through the Delphi website, and the group welcomes input from anyone who wants to contribute to the weekly crowdsourced forecasts. The flu forecasts by CMU and other participants suggest flu activity will rise in the next few weeks, but Rosenfeld said it's too soon to say when flu activity will peak or how high it might get. Unlike flu surveillance, which tracks flu activity based on reports of flu-like illnesses from physicians, flu forecasting attempts to look into the future, much like a weather forecast, so health officials can plan ahead. Delphi and the other forecasting groups make weekly predictions throughout the flu season. The CDC combines those efforts in its FluSight Network, with forecasts available online. In addition to CMU's Delphi team, the CDC's forecasting initiative includes groups from entities such as Columbia University, Los Alamos National Laboratory and the University of Massachusetts-Amherst. Initially, the CDC asked participants to forecast flu-related visits to physician offices in each of 10 regions in the nation. But last year, the CDC added two additional forecasting challenges — flu-related visits to physician offices for each state and flu-related hospitalization nationwide for each of five age groups. Delphi co-leader Ryan Tibshirani, associate professor of statistics and machine learning, said forecasting by state is important because flu activity can be very local, so wide differences are possible even within a region. Forecasting hospitalization is important, he added, because some strains of flu can result in more severe illness than others, so flu activity levels in hospitals can differ markedly from activity levels in doctors' offices. CMU participated in all three challenges, submitting the most accurate forecasts for each one. The machine learning method proved most accurate for the state-by-state and hospitalization forecasts, while the wisdom-of-crowds method was most accurate for regional forecasts. "Virtually all members of the Delphi group contributed to our success, with theory, algorithmic development, implementation and weekly participation in the crowdsourcing system," Rosenfeld said. "Special kudos to the students who oversaw the competition weekly, if not daily, during the unusually long flu season." 
Those students included Logan Brooks, a Ph.D. student in the Computer Science Department; Aaron Rumack, a Ph.D. student in machine learning; and Jiaxian Sheng, a senior computer science major. The Delphi group unites students and faculty from the Machine Learning, Statistics and Data Science, Computer Science and Computational Biology departments. The forecasting efforts are supported by the Machine Learning for Good fund established at the School of Computer Science by Uptake, the Defense Threat Reduction Agency, and the National Institute of General Medical Sciences' Models of Infectious Disease Agent Study (MIDAS).
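As a rough illustration of the wisdom-of-crowds approach described above, the sketch below aggregates hypothetical weekly volunteer forecasts with a median and a simple percentile band. Delphi's actual aggregation and evaluation are more sophisticated; the volunteer names and numbers here are invented.

# A minimal wisdom-of-crowds sketch, assuming each volunteer submits a forecast
# of influenza-like-illness (ILI) activity for the next four weeks. The median
# is only an illustration of crowd aggregation; all figures are made up.
import numpy as np

def aggregate_crowd(forecasts):
    """forecasts: dict mapping volunteer name -> list of weekly ILI percentages."""
    matrix = np.array(list(forecasts.values()))        # shape: (n_volunteers, n_weeks)
    point = np.median(matrix, axis=0)                  # robust central estimate per week
    lo, hi = np.percentile(matrix, [10, 90], axis=0)   # simple uncertainty band
    return point, lo, hi

weekly_forecasts = {                                   # hypothetical submissions
    "volunteer_a": [2.1, 2.6, 3.4, 4.0],
    "volunteer_b": [2.3, 2.9, 3.8, 4.5],
    "volunteer_c": [1.9, 2.4, 3.1, 3.6],
}
point, lo, hi = aggregate_crowd(weekly_forecasts)
for week, (p, l, h) in enumerate(zip(point, lo, hi), start=1):
    print(f"week {week}: {p:.1f}% ILI (band {l:.1f}-{h:.1f})")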

SCS Professors Reimagine What It Takes To Code

Aisha Rashid (DC 2019)

David Kosbie and Mark Stehlik believe anyone can code. As course instructors for Principles of Computing — better known to Carnegie Mellon University students by its course number, 15-110 — that belief comes in handy. One of two introductory courses offered in the School of Computer Science, 15-110 covers programming constructs along with history and current events in computer science, tailored to students with little to no computer science background. This fall semester, Kosbie and Stehlik switched up elements of the course, with the goal of enhancing students' experiences in 15-110. While the course provided students with the necessary tools and resources they'd need as novice coders, the instructors wanted to take it to the next level and showcase it to a wider audience. Using creative approaches and techniques inside and outside of the classroom, they hope to transform what might at first seem like complicated principles of computer science into collaborative and interactive problem-solving tools applicable to any and all fields of study. Their mission wasn't without challenges — the first being that their course is mandatory for many CMU students. "Most of the students taking 15-110 aren't there because they want to be, they're there because someone chose for them to be there," said Kosbie, an associate teaching professor in SCS. "Maybe they didn't come in kicking and screaming, and maybe they're okay with it, but nonetheless, they may not see this as a future. They may see this as a check in whatever box they have to check." Kosbie and Stehlik also know they have to manage their students' expectations. Often, new students may not be the most receptive to the basics of computer science theory and simple coding exercises. "Earlier in the course, you don't have the computational tools to do grandiose stuff," said Stehlik, teaching professor of computer science and assistant dean for outreach. "When students think about graphics, their thoughts immediately go to things like video games, like 'Halo,' or 'Fortnite.' But you have to start small, and a lot of the times when you start small, you start simple both in the domain and in the coding techniques." The instructors also recognize that they must tailor the course to a primarily freshman audience, as 15-110 is often taken by CMU students during their first two semesters. "Courses designed for a predominantly freshman audience have responsibilities to move students from a high school mindset to a Carnegie Mellon mindset, and that's true regardless of the course," Stehlik said. "It gets students to understand what it means to be in this environment, and what it means four years from now, to be a graduate from this environment." So striking a balance between keeping the class engaging for students while ensuring legitimate CMU-quality outcomes is one of the chief focuses Kosbie and Stehlik have for the course. How, then, are the duo re-envisioning 15-110 to meet these goals? Team homeworks, which are collaborative weekly group homework sessions led by a teaching assistant, make the content more approachable. With this addition, students participate in an extra two hours outside of lecture and watch interactive videos, practice coding challenges to prepare for the upcoming week, and even have opportunities to conduct research on the role of computer science in current events. This enables them to more qualitatively discuss the course material in a collaborative learning environment. 
Kosbie and Stehlik have also incorporated more guest lectures from CMU faculty as a part of the curriculum. Guest lecturers this semester included Distinguished Career Professor of Computer Science Lenore Blum, Bruce Nelson Professor of Computer Science Manuel Blum, Department Head and L.L. Thurstone Professor of Philosophy and Psychology David Danks, Angel Jordan Professor of Computer Science Tuomas Sandholm, Computer Science Professor Roger Dannenberg and Art Professor Golan Levin. "The guest lecturers give students a sense of a much bigger picture," Stehlik said. "When I taught 15-110 in the spring of 2017, we had one guest lecturer the entire semester. So this, in a sense, is opening up that box a lot wider to illustrate these concepts less from our own knowledge and more from the experts at CMU who are immersed in that knowledge." Finally, to ensure students are engaged with the content of the course as much as possible, they've also incorporated a new term project. This final project allows students to fuse their technical and creative skills and create an interactive game or activity about computer science in order to both show off what they've learned throughout the semester and educate their audience. The project's end goal is to help students better understand how the core technical skills they learned all semester can fundamentally change how they approach problem-solving, regardless of their profession. With these additions to the 15-110 curriculum, Kosbie and Stehlik have set a clear goal: ensuring students get the most out of their exposure to computer science. "I'd like the biggest takeaway of the course to be that students not be afraid of coding and understand that it's very relevant to their future careers," Stehlik said. "We want to entice students who think that they may not be able to do this to believe that they can, and then to continue to do so."

RFID Tag Arrays Track Body Movements, Shape Changes

Byron Spice

Carnegie Mellon University researchers have found ways to track body movements and detect shape changes using arrays of radio-frequency identification (RFID) tags. RFID-embedded clothing thus could be used to control avatars in video games — much like in the movie "Ready Player One." Embedded clothing could also tell you when you should sit up straight — much like your mother.
RFID tags are nothing new, which is part of their appeal for these applications, said Haojian Jin, a Ph.D. student in CMU's Human-Computer Interaction Institute (HCII). They are cheap, battery-free and washable.
What's new is the method that Jin and his colleagues devised for tracking the tags, and monitoring movements and shapes. RFID tags reflect certain radio frequencies. It would be possible, but not practical, to use multiple antennae to track this backscatter and triangulate the locations of the tags. Rather, the CMU researchers showed they could use a single, mobile antenna to monitor an array of tags without any prior calibration.
Just how this works varies based on whether the tags are used to track the body's skeletal positions or to track changes in shape. For body-movement tracking, arrays of RFID tags are positioned on either side of the knee, elbow or other joints. By keeping track of the ever-so-slight differences in when the backscattered radio signals from each tag reach the antenna, it's possible to calculate the angle of bend in a joint.
"By attaching these paper-like RFID tags to clothing, we were able to demonstrate millimeter accuracy in skeletal tracking," Jin said.
The researchers call this embedded clothing RF-Wear and described it earlier this year at the UbiComp 2018 conference in Singapore. It could be an alternative to systems such as Kinect, which use a camera to track body movements and can only work when the person is in the camera's line of sight. It also could be an alternative to existing wearables, which generally depend on inertial sensors that are expensive, difficult to maintain and power hungry, Jin said.
RFID-embedded clothes might also be an alternative to wrist-worn devices, such as Fitbit, for activity tracking or sports training.
The technology for monitoring changes in curves or shapes, called WiSh (for Wireless Shape-aware world), also uses arrays of RFIDs and a single antenna, but relies on a more sophisticated algorithm for interpreting the backscattered signals to infer the shape of a surface.
WiSh was presented earlier this year at Mobisys, the International Conference on Mobile Systems, Applications and Services, in Munich, by Jin and Jingxian Wang, a Ph.D. student in CMU's Electrical and Computer Engineering (ECE) Department. It could be incorporated into smart fabrics and used to track a user's posture. It could also be embedded in a variety of objects.
"We can turn any soft surface in the environment into a touch screen," Wang said. Smart carpets, for instance, could detect the presence and locations of people, or be used to control games or devices. Soft toys could respond to or otherwise register squeezes and bends. Smart pillows might help track sleep quality.
WiSh also could be used to monitor the structural health of bridges or other infrastructure, Wang noted.
The researchers measured the curvature of Pittsburgh's 10th Street Bridge by using a robot to drag a string of 50 RFID tags along the bridge's sidewalk.
"We're really changing the way people are thinking about RF sensing," Jin added.
"Weaving these tags into clothing will only add a minimal cost, under $1," Jin said. The most expensive part of these measurement systems is the antenna. But smartphones already use 13.56 MHz antennas for services such as Apple Pay. Adding a 900 MHz antenna for RFID-related applications might be feasible in future smartphones, eliminating the need for a separate device, he suggested.
In addition to Jin and Wang, the research team included Jason Hong, associate professor in the HCII; Swarun Kumar, assistant professor of ECE; and Zhijian Yang of Tsinghua University in China. The National Science Foundation and Google provided support for this research.
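To make the joint-tracking idea concrete, here is a toy geometric sketch: once the separation between two tags straddling a joint has been estimated from their backscatter (for example, from relative timing or phase), the bend angle follows from the law of cosines. This is not the RF-Wear algorithm; the tag offsets and measurements below are illustrative assumptions.

# A toy geometric sketch of recovering a bend angle from an estimated
# separation between two tags placed on either side of a joint. Not the
# published RF-Wear method; geometry and numbers are invented for illustration.
import math

def joint_angle_deg(tag_separation_m, offset_m=0.10):
    """Each tag sits offset_m from the joint along its limb segment.
    Law of cosines: s^2 = 2*d^2*(1 - cos(theta)) => theta = acos(1 - s^2/(2*d^2))."""
    cos_theta = 1.0 - tag_separation_m**2 / (2.0 * offset_m**2)
    cos_theta = max(-1.0, min(1.0, cos_theta))   # guard against measurement noise
    return math.degrees(math.acos(cos_theta))

# Fully extended limb: tags are 2*offset apart, giving roughly 180 degrees.
print(joint_angle_deg(0.20))   # ~180.0
# Tags drawn closer together as the joint bends.
print(joint_angle_deg(0.14))   # ~89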

Autism Risk-Factors Identified in 'Dark Matter' of Human Genome

Abby Simmons

Using cutting-edge statistical models to analyze data from nearly 2,000 families with an autistic child, a multi-institute research team discovered tens of thousands of rare mutations in noncoding DNA sequences and assessed whether these contribute to autism spectrum disorder. Published Dec. 14 in the journal Science, the study is the largest to date for whole-genome sequencing in autism. It included 1,902 families, each comprising both biological parents, a child affected with autism and an unaffected sibling. Scientists representing Carnegie Mellon University; the University of California, San Francisco; the University of Pittsburgh School of Medicine; Massachusetts General Hospital; Harvard Medical School and the Broad Institute led the research team. The study is one of 13 being released Dec. 14 as part of the first round of results to emerge from the National Institute of Mental Health's PsychENCODE consortium — a nationwide research effort that seeks to decipher how noncoding DNA, often referred to as the 'dark matter' of the human genome, contributes to psychiatric diseases such as autism, bipolar disorder and schizophrenia. Over the past decade, scientists have identified dozens of genes associated with autism by studying so-called "de novo" mutations — newly arising changes to the genome found in children but not their parents. To date, most de novo mutations linked to autism have been found in protein-coding genes. It has proven far more difficult for scientists to identify autism-associated mutations in noncoding regions of the genome. "Protein-coding genes clearly play an important role in human disorders like autism, yet their expression is regulated by the 'noncoding' genome, which covers the remaining 98.5 percent of the genome and remains somewhat mysterious," said Carnegie Mellon's Kathryn Roeder, corresponding author and UPMC Professor of Statistics and Life Sciences in the Statistics and Data Science and Computational Biology departments. "Because the genome comprises 3 billion nucleotides, identifying which portions of the noncoding genome, when mutated, enhance the risk of autism is as challenging as looking for a needle in a haystack." Using a novel bioinformatics framework, the researchers were able to compress the search from billions of nucleotides to tens of thousands of functional categories that potentially contribute to autism. Working with these categories, they used machine learning tools to build statistical models to predict autism risk from a subset of the families in the study. They then applied this model to an independent set of families and successfully predicted patterns of risk in the noncoding genome. Though rare de novo mutations were found in many noncoding regions of the genome, the strongest signals arose from promoters — noncoding DNA sequences that control gene transcription. These risk-conferring promoters were most often located far from the genes under their control. They were also found to be largely conserved across species, suggesting that any rare mutations that might arise in these promoters are more likely to disrupt normal biology. "For years, scientists have used genome-wide studies to find common variants that confer disease risk.
Our group has now focused on creating a computational framework that's capable of finding rare, high-impact variants associated with a human disorder, looking across all the noncoding regions of the genome," said Stephan Sanders, corresponding author and professor of psychiatry at the UCSF Weill Institute for Neurosciences and Institute for Human Genetics. The team's findings have practical implications for future research on model organisms, like mice, as attempts are made to move toward genetically informed therapies for autism. But the value of studying the noncoding genome extends well beyond autism. "We were particularly interested in the elements of the genome that regulate when, where and to what degree genes are transcribed. Understanding this noncoding sequence could provide insights into a variety of human disorders," said Bernie Devlin, corresponding author and professor of psychiatry at the University of Pittsburgh School of Medicine. "We are just scratching the surface of what there is to learn about noncoding regulatory variation in human disease, and the new methods this team has developed will catalyze an important step forward into larger and more comprehensive studies," said Michael Talkowski of Massachusetts General Hospital, Harvard Medical School and the Broad Institute, who also served as corresponding author on the study. Lead authors on the paper are Joon-Yong An and Donna Werling of the UCSF Weill Institute for Neurosciences, and Kevin Lin and Lingxue Zhu of CMU's Department of Statistics and Data Science. The National Institutes of Health, the Simons Foundation Autism Research Initiative and the Broad Institute's Stanley Center for Psychiatric Research provided funding for this research.  
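To give a flavor of the category-level analysis described above, the sketch below counts rare de novo mutations per functional category in affected children versus their unaffected siblings and reports a simple rate ratio. The published statistical framework is far more sophisticated than this; the category names and counts here are invented.

# A highly simplified sketch of a category-level burden comparison: count rare
# de novo mutations falling into each functional category for affected children
# versus unaffected siblings, and report a smoothed rate ratio. Illustrative only;
# categories and counts are made up.
from collections import Counter

def category_burden(proband_hits, sibling_hits):
    """Each argument is a list of functional-category labels, one per de novo mutation."""
    p, s = Counter(proband_hits), Counter(sibling_hits)
    results = {}
    for cat in sorted(set(p) | set(s)):
        # add-one smoothing keeps categories absent in siblings finite
        rate_ratio = (p[cat] + 1) / (s[cat] + 1)
        results[cat] = (p[cat], s[cat], round(rate_ratio, 2))
    return results

proband_hits = ["distal_promoter"] * 130 + ["conserved_intron"] * 45 + ["utr3"] * 60
sibling_hits = ["distal_promoter"] * 95 + ["conserved_intron"] * 42 + ["utr3"] * 58
for cat, (n_pro, n_sib, rr) in category_burden(proband_hits, sibling_hits).items():
    print(f"{cat}: probands={n_pro}, siblings={n_sib}, rate ratio={rr}")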

Alumna Q&A: Alexandra Johnson

Susie Cribbs

Carnegie Mellon University doesn't always consider itself cool. But this year, Seventeen magazine begged to differ, naming CMU one of its 2018 "Cool Schools." Their reasons? Our gender parity in STEM fields and strong community of female coders. One person who has helped make CMU cool is Alexandra Johnson (CS 2014). The Washington state native played integral roles in strengthening programs like SCS Day and Women @ SCS, and worked hard to improve multiple areas of campus life. And she did it all while earning her bachelor's in computer science; completing internships at Duolingo, Facebook and Rent the Runway; being active in Greek life; and serving as an undergraduate teaching assistant. Alexandra works at a startup in San Francisco now, but was on campus earlier this semester for a Women @ SCS panel discussion. We caught up with her to learn more about her time at CMU, how organizations like Women @ SCS transformed her CMU experience, what she's doing now and what her plans are for the future.
Washington state's a long way from Pittsburgh. How did you end up at Carnegie Mellon?
I'd known for a long time that I wanted to major in computer science. I liked math, and my parents said I should work in startups in Silicon Valley. I always had it in my head that I wanted to major in computer science, so I only applied to schools that I knew were top programs in CS. Carnegie Mellon was far and away the one with the best resources.
Once you arrived on campus, how did you feel about SCS?
SCS gives you enough theory to really impress interviewers, and so you get great internships right off the bat. And you get practical experience. Spending four years understanding the theory behind why we do something and what it means for code to run a certain way and what it means for code to be modular — I think that's actually important. Because then you can drop into any situation and you can learn any language and any new paradigm. SCS teaches you how to learn about programming.
Did anything at CMU disappoint you?
I tried to help create the experience I felt was missing by getting involved in clubs and activities. I ran SCS Day for three years. I was also really involved in Women @ SCS, which is such a valuable group for SCS. Another student and I put a lot of effort into making sure there were always upperclassmen at the freshmen events, getting people to come to the meetings, letting people know how they could get involved. Both SCS Day and Women @ SCS have sustained a lot of momentum in the years after my peers and I left CMU, and I'm so proud of the work the students are doing today.
What are you doing now?
I work at SigOpt. We provide an API that can help you fine-tune the hyperparameters of your machine learning models. I do a lot of the full stack engineering, a lot of the process to make the building of the models repeatable. I just finished development on a big machine learning platform project called Orchestrate, which contained a lot of exciting technical and logistical challenges.
What did you do at CMU that best trained you for your job at SigOpt?
Certainly Operating Systems was a great crash course in "manage a project over four months and don't hit any deadlines." Doing that project and having a partner on it taught me a lot. When I was at work and I had a chance to manage a project as the tech lead, I gave myself generous deadlines, all of which I hit. I also kept some of the ethic of that class. OS taught me that sometimes, in a large project with many small, moving pieces, the most important thing is that the project gets done.
How do you feel about California?
There's a lot of energy in San Francisco. I organize a meetup now, Women in Machine Learning and Data Science. It's an extension of the work I did with Women @ SCS and SCS Day.
How do you find women in tech? How does it compare to being a woman in SCS?
I definitely think SCS is probably the best place to go as a female undergrad. I don't think I've found something that strong anywhere else. It was just so great. I really wanted to be involved in the community and there WAS a community to be involved in. Some of that extends out to the Bay Area, but it's different, because you're not all on campus together.
Let's talk about your time at CMU. What are some of your favorite memories?
Some of my best computer-science related memories were when I was taking Operating Systems and we would sneak into the conference rooms in the upper levels of the Gates-Hillman Centers that overlook Pittsburgh and get a really good sunrise view. It was in the fall semester, so everything was snowy. Those times when we were doing that, but we were all together — they were great. My favorite class at CMU was actually not in SCS. It was the History of Clothing in the School of Drama, taught by Barbara Anderson. It was a class at the apex of what it should be. I couldn't have gone to another school and majored in computer science and gotten that same experience. That's an experience unique to CMU — that I majored in computer science and literally walked across the Pausch Bridge every day to the School of Drama and took my class from their costume expert.
You've been out of school for a little more than four years. What's your career plan?
I thought I had a plan when I graduated and I have less of a plan now. I'm learning that life and career doesn't work in five- or 10-year plans. If you'd asked me this eight months ago, I would have said I want to make a really great technical contribution to my company, which I felt like I hadn't necessarily made at the time. Now, I feel like I've worked on something that is interesting and my plans are to just keep working on that. It reminds me of what I like about tech. My work allowed me to learn about a piece of technology that's relatively newish in the industry, and I want to keep working with and learning about it. I want to write a couple blog posts. I want to give a couple talks. And then I'll see where I am.
What advice would you give prospective students applying to SCS?
I would tell them to really go look at the curriculum. Look at the advantages. Look at things like if they want to start programming right off the bat. If they do, it's a great place. Is being around a lot of other really intelligent students who want to talk about computer science and think about computer science and breathe computer science — is having that really important to them? Because SCS is the place where you can have that.

Parrot Genome Analysis Reveals Insights Into Longevity, Cognition

Byron Spice

Parrots are famously talkative, and a blue-fronted Amazon parrot named Moises — or at least its genome — is telling scientists volumes about the longevity and highly developed cognitive abilities that give parrots so much in common with humans. Perhaps someday, it will also provide clues about how parrots learn to vocalize so well. Morgan Wirthlin, a BrainHub post-doctoral fellow in Carnegie Mellon University's Computational Biology Department and first author of a report to appear in the Dec. 17 issue of the journal Current Biology, said she and her colleagues sequenced the genome of the blue-fronted Amazon and used it to perform the first comparative study of parrot genomes. By comparing the blue-fronted Amazon with 30 other long- and short-lived birds — including four additional parrot species — she and colleagues at Oregon Health and Science University (OHSU), the Federal University of Rio de Janeiro and other entities identified a suite of genes previously not known to play a role in longevity that deserve further study. They also identified genes associated with longevity in fruit flies and worms. "In many cases, this is the first time we've connected those genes to longevity in vertebrates," she said. Wirthlin, who began the study while a Ph.D. student in behavioral neuroscience at OHSU, said parrots are known to live up to 90 years in captivity — a lifespan that would be equivalent to hundreds of years for humans. The genes associated with longevity include telomerase, responsible for DNA repair of telomeres (the ends of chromosomes), which are known to shorten with age. Changes in these DNA repair genes can potentially turn cells malignant. The researchers have found evidence that changes in the DNA repair genes of long-lived birds appear to be balanced with changes in genes that control cell proliferation and cancer. The researchers also discovered changes in gene-regulating regions of the genome — which seem to be parrot-specific — that were situated near genes associated with neural development. Those same genes are also linked with cognitive abilities in humans, suggesting that both humans and parrots evolved similar methods for developing higher cognitive abilities. "Unfortunately, we didn't find as many speech-related changes as I had hoped," said Wirthlin, whose research is focused on the evolution of vocal behaviors, including speech. Animals that learn songs or speech are relatively rare — parrots, hummingbirds, songbirds, whales, dolphins, seals and bats — which makes them particularly interesting to scientists, such as Wirthlin, who hope to gain a better understanding of how humans evolved this capacity. "If you're just analyzing genes, you hit the end of the road pretty quickly," she said. That's because learned speech behaviors are thought to be more of a function of gene regulation than of changes in genes themselves. Doing comparative studies of these "non-coding" regulatory regions, she added, is difficult, but she and Andreas Pfenning, assistant professor of computational biology, are working on the computational and experimental techniques that may someday reveal more of their secrets. This work was supported through the Brazilian Avian Genome Consortium and by the National Institutes of Health/National Institute on Deafness and Other Communication Disorders. See coverage of this research by New Scientist.

Bible Readings Help Create New Multilingual Dataset

Byron Spice

It's the Christmas season, which means that beloved Bible verses are being read and recited innumerable times — and in a vast number of languages. The Bible's global reach as evidenced this time of year has enabled a Carnegie Mellon University professor to create a language resource that could enhance communication in hundreds of languages. By tapping online text and audio recordings of the New Testament in more than 700 languages, Alan Black, a professor in CMU's Language Technologies Institute, has created a dataset that can be used to build text-to-speech computer systems and other modern speech technologies for so-called low-resource languages. These languages, such as Kaqchikel in central Guatemala, Lun Bawang of Malaysia and Indonesia, and Mamprusi in northern Ghana, often are spoken by relatively small groups of people and generally lack the kind of technological tools for recognizing or translating language that are routinely available for high-resource languages such as English, Spanish or Mandarin Chinese. Black said it generally isn't profitable to build such systems — or often even basic tools such as dictionaries or pronunciation guides — for low-resource languages. But that never mattered to Christian missionaries, he added. "They don't care about commercial aspects," Black explained. "They care about the Word." In many cases, what few resources exist for these languages are the work of missionaries. "I suspect that for some of these languages these are the only written texts that exist." Black was able to tap one of those evangelical resources — an online service called Bible.is that provides recordings of the New Testament in more than a thousand languages — to create what he calls the CMU Wilderness Multilingual Speech Dataset. This dataset, available for free download online via GitHub, includes audio, word pronunciations and other tools necessary to build text-to-speech systems. From Bible.is, Black downloaded recordings of more than 700 languages for which both audio and text were available. That represents about 10 percent of the world's languages, he noted. "They are languages that missionaries would care about," Black said, including those spoken in areas such as Central and South America, West and East Africa, and Southeast Asia. He then set about aligning the text with the audio, determining which words in the text corresponded with spoken words. By so doing, he was able to establish pronunciation rules that make it possible to vocalize any word in that language, not just those included in the Bible. To make those alignments, Black and his CMU students were aided by the similar spelling and pronunciations across languages of three Hebrew names — Jesus, David and Abraham — and the first verse of the Book of Matthew: "The book of the genealogy of Jesus Christ, the son of David, the son of Abraham." "I now probably know that first sentence in Matthew better than anyone else," Black added. A computer program that makes a best guess at pronunciation helps create an initial alignment of text and audio. This first attempt often is incomprehensible, Black noted, but a machine learning program then analyzes the alignment and fine-tunes it. Thus far, he and his students have completed alignments for 600 of the languages and hope to finish the remaining, more troublesome languages soon. In some cases, poor quality recordings, misidentified languages and unrecognizable writing systems have thwarted their efforts. 
Development of the dataset was an outgrowth of a Defense Advanced Research Projects Agency program called Lorelei, which sought ways to develop speech recognition tools for low-resource languages within a matter of hours or days. Such tools would be useful, for instance, in responding to epidemic outbreaks or other humanitarian crises. Rather than build such tools on demand — which requires intensive work — Black worked to identify existing resources, such as Bible.is, that could be tapped to create these tools inexpensively in advance. He and his students have demonstrated that tools such as a speech synthesizer can indeed be created using the Wilderness dataset. Tools for processing and translating speech are particularly important for low-resource languages because many of their speakers are illiterate, Black explained. The dataset also should be useful for linguists, he added, noting it makes it possible to do studies of how languages vary across the planet. For instance, the dataset includes about 100 languages from the Amazon basin, enabling studies of how words are formed and how they relate to words in other languages.
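The core of the alignment step, matching a rough synthetic rendering of the text against the recording, can be sketched with classic dynamic time warping. The fragment below is a minimal illustration rather than the pipeline described above; real systems align phoneme-level acoustic models iteratively, and the one-dimensional "features" here are invented stand-ins.

# A minimal dynamic-time-warping sketch of aligning a synthetic rendering of the
# text (from a first-guess pronunciation) against the actual recording.
# Illustrative only; the feature values are made up.
import numpy as np

def dtw_align(synth, audio):
    """Return the minimal alignment cost and the frame-to-frame path."""
    n, m = len(synth), len(audio)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(synth[i - 1] - audio[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # backtrack to recover which frames correspond
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return cost[n, m], path[::-1]

synth = np.array([0.0, 0.0, 1.0, 1.0, 2.0])             # synthetic rendering (5 frames)
audio = np.array([0.1, 0.0, 0.1, 1.1, 0.9, 1.0, 2.1])   # recording (7 frames, slower)
total_cost, path = dtw_align(synth, audio)
print(total_cost, path)

In a full pipeline, an alignment like this would be scored, the pronunciation model refined, and the process repeated, which matches the iterative fine-tuning the article describes.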

Hodgins Named 2018 ACM Fellow

Byron Spice

The Association for Computing Machinery (ACM) has named Jessica Hodgins, professor of robotics and computer science, one of 56 new ACM fellows honored for their significant contributions to computer science. Hodgins, who leads the Facebook AI Research lab in Pittsburgh in addition to her faculty duties, was cited by the ACM for her contributions to character animation, human simulation and humanoid robotics. Hodgins's research focuses on computer graphics, animation and robotics with an emphasis on generating and analyzing human motion. Formerly vice president for research at Disney Research, she last year was elected president of the ACM's Special Interest Group on Computer Graphics and Interactive Techniques (SIGGRAPH). She has received numerous awards, including SIGGRAPH's Steven Anson Coons Award for Outstanding Creative Contributions to Computer Graphics. She received her Ph.D. in computer science at CMU in 1989. Former CMU professor Bruce Maggs, now a professor of computer science at Duke University, also was named an ACM fellow "for contributions to the development of content distribution networks and the theory of computer networks." ACM will formally recognize its 2018 fellows at its annual awards banquet, to be held June 15 in San Francisco. Additional information about new and current ACM fellows is available on the ACM website.

Three SCS Faculty Members Named 2019 IEEE Fellows

Byron Spice

Three School of Computer Science faculty members — Venkatesan Guruswami, Mor Harchol-Balter and Eric Xing — have been elevated to fellows in the Institute of Electrical and Electronics Engineers (IEEE), the world's largest technical professional organization. Fellow status is a distinction reserved for select members who have demonstrated extraordinary accomplishments in an IEEE field of interest. Guruswami, a professor in the Computer Science Department (CSD), was cited "for contributions to list error-correction and algorithmic coding theory." His research spans a number of topics in theoretical computer science, including the theory of error-correcting codes, probabilistically checkable proofs, computational complexity theory and algebraic algorithms. He joined the CMU faculty in 2009. Harchol-Balter, a professor in CSD since 1999, was cited "for contributions to performance analysis and design of computer systems." Her work on designing new resource-allocation policies includes load-balancing policies, power-management policies and scheduling policies for distributed systems. She is heavily involved in the SIGMETRICS/PERFORMANCE research community and is the author of a popular textbook, "Performance Analysis and Design of Computer Systems." Xing, a professor in the Machine Learning Department since 2004, was cited "for contributions to machine learning algorithms and systems." His research interests lie in machine learning, computational biology and statistical methodology. He and his collaborators have developed a framework called Petuum for distributed machine learning with massive data, big models and a wide spectrum of algorithms. Petuum is now established as a company, of which Xing is founder, CEO and chief scientist. The total number of fellows selected in any one year cannot exceed one-tenth of one percent of the total voting IEEE membership. A complete list of the Class of 2019 is available on the IEEE site.

SCS Master's Student Named Schwarzman Scholar

Susie Cribbs

School of Computer Science master's student Hima Tammineedi has been named to the 2020 class of Schwarzman Scholars, a highly competitive graduate fellowship inspired by the Rhodes Scholarships that features one year of study at Tsinghua University in China. Launched in 2016, the Schwarzman Scholars program prepares future global leaders to meet the geopolitical challenges of the 21st century. During their year of study, the world's best young minds explore the economic, political and cultural factors that have contributed to China's growth as a global power. Tammineedi is the second CMU student to be named a Schwarzman Scholar. Chrystal Thomas, who graduated from the Mellon College of Science, earned the award in 2016. "The program brings the world's future leaders to China because it is and will continue to be one of the largest factors influencing the future of the world — politically, economically and technologically," said Tammineedi, who earned his bachelor's degree in computer science with a minor in machine learning from CMU this past May. "Having an understanding of the country will be essential in order to be a global leader. This directly aligned with my interests for my future, and given that I already had a big interest in China, it's the perfect opportunity for me." Tammineedi is one of 147 students worldwide selected for the program. More than 2,800 students applied for the fellowship, which begins in August. "Our newest class includes a diverse group of future leaders from around the world," said Stephen A. Schwarzman, co-founder and CEO of Blackstone and chair of Schwarzman Scholars. "They join a global network of scholars who have committed themselves to being a force for change, regardless of where their professional or personal passions take them. My hope is that a year in Beijing will inspire and challenge these students in ways they haven't even imagined. I look forward to seeing how this new class will leave its mark." Schwarzman Scholars spend a year in Beijing, where they earn a master's degree in global affairs from Tsinghua's Schwarzman College. In addition to taking classes in the core curriculum, scholars pursue an individually designed concentration in public policy, international studies, or economics and business. Outside the classroom, scholars gain exposure to China through internships, mentorship opportunities, special speakers and travel. Tammineedi, who will earn a master's degree in machine learning in May 2019, will study public policy as a Schwarzman Scholar. He first became interested in the scholarship when it caught his eye on the website for CMU's Fellowships and Scholarships Office. He then worked with Richelle Bernazzoli, the office's assistant director, to complete the grueling application process that included in-person interviews. "Hima is a budding expert in artificial intelligence whose interests span machine learning, transportation and urban issues. It was clear from our first meeting that he has been on an impressive trajectory since high school," Bernazzoli said. "Every step of the way, Hima has approached his work with humility, thoughtfulness and a sense of responsibility to society. These qualities will make him an excellent Schwarzman Scholar and leader in his field for years to come." While Tammineedi said he's excited to learn from global leaders and the strong group of peers that will surround him during his scholar experience, he's also looking forward to expanding his education beyond computer science. 
"I've studied computer science and machine learning at CMU, and that's where a lot of my interests lie. But my ultimate goals involve effecting change in the world at a large level," he said. "While I do believe that a deep understanding of tech could make this change possible, I don't think just knowing tech can lead to the best changes. I want to develop a better understanding of how the world and its countries and governments function in order to understand the best ways technology can help." For more about this year's class of Schwarzman Scholars, visit the organization's website.

November 2018

PopSci Recognizes Wheel-Track With "Best of What's New" Award

Byron Spice

A wheel that can transform into a triangular track, developed by Carnegie Mellon University's National Robotics Engineering Center with funding from the Defense Advanced Research Projects Agency, has won a Popular Science "Best of What's New" Award for 2018.
The reconfigurable wheel-track can transform from one mode to the other in less than two seconds while the vehicle is in motion, enabling a vehicle in wheel mode to operate at high speeds on roads and switch rapidly to track mode to negotiate challenging off-road terrain.
The device was recognized by Popular Science with a Best of What's New Award in the security category. The magazine presents the awards annually to 100 new products and technologies in 10 categories, including aerospace, entertainment and health.
"The Best of What's New Awards allow us the chance to examine and honor the best innovations of the year," said Joe Brown, editor-in-chief of Popular Science. "This collection shapes our future, helps us be more efficient, keeps us healthy and safe, and lets us have some fun along the way."
The innovative wheel-track was one of the technologies developed in DARPA's Ground X-Vehicle Technologies (GXV-T) program, which aimed to reduce the need for armor by making combat vehicles faster, more maneuverable and capable of operating in a wide variety of environments.
Dimi Apostolopoulos, a CMU Robotics Institute senior systems scientist and principal investigator for the wheel-track project, said the shape-shifting wheel-track has a number of potential civilian applications as well, including uses in agriculture, mining, construction, forestry and transportation. It can also be used in vehicles ranging in size from heavy equipment to recreational vehicles.
"Creating a reconfigurable wheel-track system that works on a moving vehicle and at high speeds was an exceptional challenge, but our NREC team came up with a design that works and has the potential to transform ground mobility," Apostolopoulos said. "We are appreciative of DARPA's GXV-T program and we thank the editors of Popular Science for this recognition."
The wheel-track has a rubberized tread that sits atop a frame that can change shape. The spinning wheel is transformed into a track by extending a Y-shaped support, which pushes the frame into a triangular shape. Simultaneously, application of a brake to stop the wheel from spinning causes the transmission to automatically shift from turning the wheel to turning a set of gears that drives the track.
Though other research groups have built devices similar to NREC's reconfigurable wheel-track, those previous designs have required halting the vehicle to transform from one mode to the other, Apostolopoulos noted. The ability to make these transformations on the fly, he added, is a critical requirement for vehicles that must handle changing terrain at high speed.
In testing to date, vehicles have been able to achieve 50 miles an hour in wheel mode and almost 30 mph in track mode.
The device has been able to transform from wheel mode to track mode at speeds as high as 25 mph and from track mode to wheel mode at speeds of around 12 mph.
Previous winners of Best of What's New Awards from Carnegie Mellon include Tartan Racing's Boss self-driving SUV, a self-landing helicopter, a snake-like robotic neck surgery tool, an automated method for editing video, a panoramic video camera and a photo editing tool that can manipulate objects in a photo as if they were three-dimensional.
NREC is a part of Carnegie Mellon's Robotics Institute that performs contract research and development for a variety of governmental and industrial clients.

Farber Elected 2018 AAAS Fellow

Byron Spice

David Farber of the Institute for Software Research is one of two Carnegie Mellon University faculty members named 2018 fellows of the American Association for the Advancement of Science (AAAS). The AAAS honor recognizes Farber, sometimes called the "Grandfather of the Internet," for distinguished contributions to programming languages and computer networking. Farber joined CMU in 2002. He served as Distinguished Career Professor of Computer Science and Public Policy, and is now an adjunct professor. Earlier this year, Farber became a Distinguished Professor at Keio University in Tokyo, where he is co-director of the Cyber Civilization Research Center. This year, 416 members have been named AAAS fellows because of their scientifically or socially distinguished efforts to advance science or its applications. In addition to Farber, they include Gregory V. Lowry, the Walter J. Blenko Sr. Professor of Civil and Environmental Engineering, who is cited for his contributions to safe and sustainable use of nanomaterials, remediation methods for contaminated sediments and brines, and mitigation of fossil fuel use impacts. Farber's distinguished career spans more than 50 years, including a stint as chief technologist for the Federal Communications Commission. He was the Alfred Fitler Moore Professor of Telecommunication Systems at the University of Pennsylvania's Wharton School before joining CMU. Farber has made foundational contributions to electronics, programming languages and distributed computing. He also is moderator of the long-running Interesting People email list, which focuses on internet governance, infrastructure and other topics he favors. He also is known for his way with words, including his well-known "Farberisms," such as "another day, a different dollar" and "don't look for a gift in the horse's mouth." His work has earned him numerous awards and honors, including election as an IEEE fellow and an ACM fellow, the 1995 SIGCOMM Award for lifelong contributions to computer communications, and a spot in the Pioneers Circle of the Internet Hall of Fame. The new AAAS fellows will be inducted on Saturday, Feb. 16, at the AAAS Fellows Forum during the AAAS Annual Meeting in Washington, D.C.

Bajpai, Wang Earn Stehlik Scholarships

Aisha Rashid (DC 2019)

The School of Computer Science has named current seniors Tanvi Bajpai and Serena Wang the recipients of its 2018 Mark Stehlik SCS Alumni Undergraduate Impact Scholarship. The award, now in its fourth year, recognizes undergraduate students for their commitment and dedication both in and beyond the classroom. Bajpai and Wang have made noteworthy contributions both to SCS and the computer science field in general. And they both plan to continue doing so after graduation. Bajpai, who hails from West Windsor, NJ, said that she felt out of place in high school, surrounded by students who were less passionate about learning and more preoccupied with padding their resumes. She cultivated her interest in computer science by participating in programming competitions at the University of Pennsylvania, and attended a summer program at Princeton called the Program in Algorithmic and Combinatorial Thinking (PACT). Her exposure to discrete math and algorithm design fueled a desire to pursue computer science at CMU, where she was pleased to finally be surrounded with peers, faculty and mentors who were all just as passionate about the field as she was. "I didn't want to get my hopes up about anything when I arrived at CMU," Bajpai said. "I just wanted to learn as much as I could." During her time at CMU, Bajpai has performed research with Ramamoorthi Ravi, the Andris A. Zoltners Professor of Business and Rohet Tolani Distinguished Professor in SCS and the Tepper School of Business. In the summer of 2017, she interned at Microsoft, and this past summer she traveled to the University of Maryland to work on research with Samir Khuller, the Distinguished Scholar Teacher and Professor of Computer Science. Despite her many accomplishments, Bajpai believes that her biggest achievement at CMU was being a teaching assistant for a series of computer science and discrete math classes including 15-451: Algorithms, 15-151: Mathematical Foundations of Computer Science, and 21-128: Mathematical Concepts and Proofs. "My outreach has been primarily toward encouraging diversity in the undergraduate computer science program, because although we have a 50/50 male to female ratio, we still need to push diversity at the teaching assistant and research level," Bajpai said. "I've been very passionate about addressing the imposter phenomenon that goes on at CMU, and I've planned events with Women @ SCS to address both of these topics." Wang, a Bay Area native, wasn't interested in computer science until her junior year of high school, even though she grew up in Silicon Valley. After a field trip to visit Google and Facebook's offices, and joining the National Center for Women and Information Technology (NCWIT) Facebook group, she was inspired by all of the initiatives proposed by young women. "It made me realize that just like any field, computer science had many diverse topics," Wang said. "There were many other young women just like me who were pursuing the field." Wang has been a teaching assistant every semester since fall of her sophomore year, because of the positive impact her own teaching assistants had on her education at CMU. Beyond that, she has been involved with ScottyLabs and Women @ SCS since her freshman year, holding executive positions in ScottyLabs including both director of finance and director. 
She has also performed research on provable security and privacy with SCS Assistant Professor Jean Yang, and developed a passion for entrepreneurship while participating in the Kleiner Perkins Engineering Fellows Program. Wang believes the most incredible opportunity she's had at CMU was organizing TartanHacks, a CMU-wide hackathon. "Organizing a large event like TartanHacks takes a lot of preparation and teamwork," Wang said. "But in the end, the rest of the ScottyLabs executive board members and I felt so accomplished and satisfied when we finished successfully hosting the event." With their senior years nearly half completed, both students are focusing on their post-graduation goals. Bajpai hopes to pursue a Ph.D. in theoretical computer science and Wang will join an enterprise data infrastructure startup called Akita. Both students are incredibly grateful for the resources and opportunities that were theirs for the taking in the School of Computer Science. "Receiving the Stehlik Scholarship has made me look back at what I've accomplished during my time at CMU, and as a freshman, I never would have expected to be able to do everything I've achieved," said Wang. Bajpai added, "I don't think I'd be where I am today had I not had the support from some of my professors and advisors here, and I will always be grateful for that."

Carnegie Mellon University, Microsoft Join Forces to Advance Edge Computing Research

Byron Spice

Carnegie Mellon University today announced it will collaborate with Microsoft on a joint effort to innovate in edge computing, an emerging field of research for computationally intensive applications that require rapid response times in remote and low-connectivity environments. By bringing artificial intelligence to the "edge," devices such as connected vehicles, drones or factory equipment can quickly learn and respond to their environments, which is critical to scenarios like search and rescue, disaster recovery, and safety. To enable discovery in these areas and more, Microsoft will contribute edge computing products to Carnegie Mellon for use in its Living Edge Laboratory, a testbed for exploring applications that generate large data volumes and require intense processing with near-instantaneous response times. Intel, which is already associated with the lab, is also contributing technology. Edge computing is a growing field that, in contrast to cloud computing, pushes computing resources closer to where data is generated — particularly mobile users — so that a host of new interactive and augmented reality applications are possible. It's the focus of intense commercial interest by network providers and tech companies, even as researchers continue to investigate its possibilities. Carnegie Mellon is at the forefront of this major shift in computing paradigms. Under a two-year agreement, Microsoft will provide edge computing products to the Living Edge Lab, including Azure Data Box Edge, Azure Stack (with hardware partner Intel) and Microsoft Azure credits, which provide access to cloud services including artificial intelligence, internet of things, storage and more. The new hardware is powered by Intel® Xeon® Scalable processors to support the most demanding applications and deliver actionable insights. The lab, run by edge computing pioneer and Carnegie Group Professor of Computer Science Mahadev Satyanarayanan, now operates on the CMU campus, as well as in shopping districts and parks in Pittsburgh's Oakland and Shadyside neighborhoods. "It's easy to talk about edge computing, but it's hard to get crucial hands-on experience," said Satyanarayanan. "That's why a number of major telecommunications and tech companies have joined our Open Edge Computing Initiative and helped us establish the lab. We validate ideas and provide unbiased, critical thinking about what works and what doesn't." With the addition of Microsoft products and Intel technology, faculty and students will be able to develop new applications and compare their performance with other components already in the lab. Microsoft partners also will be able to use the lab. "The intelligent edge, with the power of the intelligent cloud, can and is already driving real-world impact. By moving AI models and compute closer to the source, we can surface real-time insights in scenarios where milliseconds make a critical difference, and in remote areas where 'real time' has not been possible," said Tad Brockway, general manager of Azure Storage and Azure Stack. "Microsoft offers the most comprehensive spectrum of intelligent edge technologies across hardware, software and devices, bringing the power of the cloud to the edge. We are excited to see what Carnegie Mellon researchers create." Speed — both of computation and communication — is a driving force for edge computing.
By placing computer nodes, or "cloudlets," near where people are, edge computing makes it possible both to perform intensive computation and to communicate the results to users in near real time. This enables solutions better suited to latency-sensitive workloads where every millisecond matters. "Intel is at the heart of solutions needed to run the most demanding AI applications on the edge," said Renu Navale, senior director of Edge Services and Industry Enabling in the Network Communications Division. "We are excited to extend our existing networking edge collaboration with the Open Edge Computing Initiative to include Microsoft solutions like Azure Data Box Edge and Azure Stack, powered by Intel Xeon processors." One example class of applications is wearable cognitive assistance, based on the Gabriel platform, a National Science Foundation-sponsored project led by Satyanarayanan. A Gabriel application is intended as an angel on your shoulder, observing a user and providing advice on a task. This technology might provide expert guidance to a user who is assembling furniture, troubleshooting a complex piece of machinery, or using an AED device in an emergency. A second example of the value edge computing brings to applications is OpenRTiST, which allows a user to see the world around them in real time, through the eyes of an artist. The video feed from the camera of a mobile device is transmitted to a cloudlet, transformed there by a deep neural network trained offline to learn the artistic features of a famous painting, and returned to the user's device as a video feed. The entire round trip is fast enough to preserve the illusion that the artist is continuously repainting the user's world as displayed on the device. Another class of applications envisioned for the Living Edge Laboratory is real-time assistive tools for visually impaired people to help them detect objects or people nearby. The video feeds of a stereoscopic camera on a user are transmitted to a nearby cloudlet, and real-time video analytics is used to detect obstacles. This information is transmitted back to the user and communicated via vibro-tactile feedback. "The Living Edge Laboratory can help determine not only what types of applications are possible, but also what kind of equipment or software works best for a given application," Satyanarayanan said. The lab was established through the Open Edge Computing Initiative, a group of leading companies, including Intel, Deutsche Telekom, Vodafone and Crown Castle, that have provided equipment, software and expertise. "We welcome Microsoft as a new member of the Open Edge Computing Initiative and we very much look forward to exploring Microsoft technologies in our Living Edge Laboratory," said Rolf Schuster, director of the Open Edge Computing Initiative. "This is a great opportunity to drive attractive new business opportunities around edge computing for both the telecom and the cloud industries."
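To make the OpenRTiST-style workflow above concrete, here is a minimal sketch of a client loop that offloads camera frames to a nearby cloudlet and displays the stylized result. It is illustrative only: the cloudlet URL, the endpoint name and the JPEG-over-HTTP payload are assumptions made for this sketch, not the actual Gabriel or OpenRTiST protocol.

```python
import time
import numpy as np
import requests
import cv2  # OpenCV, for camera capture and JPEG encode/decode

CLOUDLET_URL = "http://cloudlet.example.edu:8080/stylize"  # hypothetical endpoint

def stream_to_cloudlet(max_frames=100):
    cam = cv2.VideoCapture(0)  # camera on the mobile/edge device
    try:
        for _ in range(max_frames):
            ok, frame = cam.read()
            if not ok:
                break
            ok, jpeg = cv2.imencode(".jpg", frame)
            if not ok:
                continue
            start = time.time()
            # Offload the heavy deep-network inference to the nearby cloudlet.
            resp = requests.post(
                CLOUDLET_URL,
                data=jpeg.tobytes(),
                headers={"Content-Type": "image/jpeg"},
                timeout=1.0,
            )
            rtt_ms = (time.time() - start) * 1000
            stylized = cv2.imdecode(
                np.frombuffer(resp.content, dtype=np.uint8), cv2.IMREAD_COLOR)
            cv2.imshow("stylized view", stylized)
            cv2.waitKey(1)
            # Edge deployments aim for round trips of tens of milliseconds.
            print(f"round trip: {rtt_ms:.1f} ms")
    finally:
        cam.release()
        cv2.destroyAllWindows()

if __name__ == "__main__":
    stream_to_cloudlet()
```

The design point the sketch illustrates is that only the inference step runs remotely; because the cloudlet is one wireless hop away rather than in a distant data center, the measured round-trip time can stay low enough to sustain an interactive video illusion.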

Neural Nets Supplant Marker Genes in Analyzing Single Cell RNA Sequencing

Byron Spice

Computer scientists at Carnegie Mellon University say neural networks and supervised machine learning techniques can efficiently characterize cells that have been studied using single cell RNA-sequencing (scRNA-seq). This finding could help researchers identify new cell subtypes and differentiate between healthy and diseased cells. Rather than rely on marker genes, which are not available for all cell types, this new automated method analyzes all of the scRNA-seq data to select just those parameters that can differentiate one cell from another. This enables the analysis of all cell types and provides a method for comparative analysis of those cells. Researchers from CMU's Computational Biology Department explain their method today in the online journal Nature Communications. They also describe a web server called scQuery that makes the method usable by all researchers. Over the past five years, single cell sequencing has become a major tool for cell researchers. In the past, researchers could only obtain DNA or RNA sequence information by processing batches of cells, providing results that only reflected average values of the cells. Analyzing cells one at a time, by contrast, enables researchers to identify subtypes of cells, or to see how a healthy cell differs from a diseased cell, or how a young cell differs from an aged cell. This type of sequencing will support the National Institutes of Health's new Human BioMolecular Atlas Program (HuBMAP), which is building a 3D map of the human body that shows how tissues differ on a cellular level. Ziv Bar-Joseph, professor of computational biology and machine learning and a co-author of today's paper, leads a CMU-based center contributing computational tools to that project. "With each experiment yielding hundreds of thousands of data points, this is becoming a Big Data problem," said Amir Alavi, a Ph.D. student in computational biology who was co-lead author of the paper with post-doctoral researcher Matthew Ruffalo. "Traditional analysis methods are insufficient for such large scales." Alavi, Ruffalo and their colleagues developed an automated pipeline that attempts to download all public scRNA-seq data available for mice — identifying the genes and proteins expressed in each cell — from the largest data repositories, including the NIH's Gene Expression Omnibus (GEO). The cells were then labeled by type and processed via a neural network, a computer system modeled on the human brain. By comparing all of the cells with each other, the neural net identified the parameters that make each cell distinct. The researchers tested this model using scRNA-seq data from a mouse study of a disease similar to Alzheimer's. As would be expected, the analysis showed similar levels of brain cells in both healthy and diseased samples, while the diseased samples included substantially more immune cells, such as macrophages, generated in response to the disease. The researchers used their pipeline and methods to create scQuery, a web server that can speed comparative analysis of new scRNA-seq data. Once a researcher submits a single cell experiment to the server, the group's neural networks and matching methods can quickly identify related cell subtypes and find earlier studies of similar cells. In addition to Ruffalo, Alavi and Bar-Joseph, authors of the research paper include Aiyappa Parvangada and Zhilin Huang, both graduate students in computational biology.
The National Institutes of Health, the National Science Foundation, the Pennsylvania Department of Health and the James S. McDonnell Foundation supported this work.
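For readers curious what replacing marker genes with supervised neural networks looks like in practice, the following is a minimal sketch of training a small feed-forward classifier on a cells-by-genes expression matrix. It is not the CMU team's pipeline or the scQuery architecture; the synthetic data, network size and preprocessing are stand-ins chosen only to illustrate the general idea of learning cell-type-discriminating parameters from all genes at once.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_cells, n_genes, n_types = 3000, 500, 5

# Synthetic stand-in for scRNA-seq counts: each cell type over-expresses a
# different block of genes on top of background noise.
labels = rng.integers(0, n_types, size=n_cells)
X = rng.poisson(1.0, size=(n_cells, n_genes)).astype(float)
for t in range(n_types):
    mask = labels == t
    X[mask, t * 100:(t + 1) * 100] += rng.poisson(3.0, size=(mask.sum(), 100))
X = np.log1p(X)  # standard log-transform of count data

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=0, stratify=labels)

# A single hidden layer learns a low-dimensional representation of each cell
# that separates the labeled types -- no hand-picked marker genes required.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
clf.fit(X_train, y_train)
print("held-out cell-type accuracy:", clf.score(X_test, y_test))
```

A new, unlabeled cell could then be assigned to its most similar type by running it through the trained network, which is the kind of query-against-known-types comparison a service like scQuery is meant to speed up.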

High Stakes

Josh Quicksall

Opioid drug overdoses kill thousands of Americans each year. A team of software engineering students has developed a wearable device that could help address these unprecedented rates of overdose deaths. As a capstone project for the Institute for Software Research's professional master's program in embedded software engineering (MSIT-ESE), four students worked with their sponsor, Pinney Associates, to build a prototype wristband that can detect overdose in the wearer. The challenge the client presented to the team was to produce a low-cost wearable device that could accurately detect an opioid overdose and send out an alert — helping rescuers respond in time to administer naloxone, a life-saving opioid antagonist that can reverse the overdose. The device delighted Pinney Associates, the pharmaceutical consulting firm that sponsored the work, and it beat out 97 percent of all submissions to the Robert Wood Johnson Foundation's Opioid Challenge competition, ultimately placing third in the competition finals at the Health 2.0 Conference held in September in Santa Clara, Calif. "The project was intimidating, not only because it was massive, but also because this wasn't a project where you could simply deliver the code," explained Puneetha Ramachandra, one member of the group that calls itself Team Hashtag. "There was a burden of real societal responsibility to the project. Lives were on the line. This had to be done properly." Using pulse oximetry, the device they developed monitors the amount of oxygen in the user's blood by measuring light reflected back from the skin to a sensor. When paired with a mobile phone via Bluetooth, the sensor takes continuous readings to establish a baseline. If the user's blood oxygen level drops for more than 30 seconds, the device switches an LED on the display from green to red. The device also cues the paired mobile phone — via an app the team also developed — to send out a message with the user's GPS coordinates to his or her emergency contacts. "Having naloxone on hand doesn't matter if you overdose and there is nobody nearby to administer it," said Michael Hufford, CEO of Harm Reduction Therapeutics, a nonprofit pharmaceutical company spun out of Pinney Associates with the goal of taking naloxone over-the-counter. "Having a cheap-but-reliable device that can detect overdose could be absolutely central in saving lives." One of the most significant challenges to developing the system was simply understanding what constitutes an overdose in terms of a specific drop in oxygen saturation in the blood. "Even if you asked a group of doctors what defines the overdose, they would struggle to give you a concrete answer," team member Rashmi Kalkunte Ramesh said. "They have to physically assess the person for a variety of signals. It was on us to cull those signals and select a method of reliable, accurate assessment. We eventually honed in on a wrist-mounted pulse oximetry device as the best approach." The team, which also included Yu-Sam Huang and Soham Donwalkar, is excited to see how the device might further evolve. "There are so many ways this product could be even better," Donwalkar said. "I can absolutely see additional sensors being incorporated to give a machine-learning backend a bigger dataset to work with, reducing the number of false positives, for example. 
Or, once clinical trials are open, assembling a much larger, more diverse corpus for ML training that encompasses a wide range of physical variables — like age, sex, race, etc. — that could affect what an overdose state looks like!" Their clients couldn't be happier with the progress to date. "I wasn't expecting something that was quite so turnkey," said Pinney senior data manager Steve Pype. "Initially, we were thinking this might be a proof of concept. But here we are: The project is almost finished and they're refining the prototype."    
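As a rough illustration of the detection logic described above, the sketch below keeps a rolling window of pulse-oximetry readings and triggers the LED and phone alert when blood oxygen stays low for more than 30 seconds. The 30-second window comes from the article; the SpO2 cutoff, the one-reading-per-second rate and the set_led/send_alert/read hooks are hypothetical placeholders, not details of Team Hashtag's device.

```python
from collections import deque
import time

SAMPLE_PERIOD_S = 1.0   # assumed: one SpO2 reading per second
WINDOW_S = 30           # sustained-drop window described in the article
SPO2_THRESHOLD = 90.0   # assumed cutoff, in percent; not from the article

def set_led(color):
    """Placeholder for the wristband's LED driver."""
    print(f"LED -> {color}")

def send_alert(gps_fix):
    """Placeholder for the paired phone app's alert to emergency contacts."""
    print(f"ALERT: possible overdose at {gps_fix}")

def monitor(read_spo2, read_gps):
    """read_spo2() and read_gps() are hardware hooks supplied by the device."""
    recent = deque(maxlen=int(WINDOW_S / SAMPLE_PERIOD_S))
    alerted = False
    while True:
        recent.append(read_spo2())
        # True only if the window is full and every reading is below threshold.
        low_for_full_window = (
            len(recent) == recent.maxlen
            and all(s < SPO2_THRESHOLD for s in recent))
        if low_for_full_window and not alerted:
            set_led("red")
            send_alert(read_gps())
            alerted = True
        elif not low_for_full_window:
            set_led("green")
            alerted = False
        time.sleep(SAMPLE_PERIOD_S)
```

Requiring the entire window to stay below the threshold, rather than reacting to a single low reading, is one simple way to trade a small delay for fewer false alarms; the team's interest in richer sensors and larger training datasets points toward replacing a fixed rule like this with a learned model.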

Sandholm, Brown To Receive Minsky Medal

Byron Spice

Computer Science Professor Tuomas Sandholm and Noam Brown, a Ph.D. student in the Computer Science Department, are the second-ever recipients of the prestigious Marvin Minsky Medal, which will be presented by the International Joint Conference on Artificial Intelligence (IJCAI) in recognition of their outstanding achievements in AI. Sandholm and Brown created Libratus, an AI that became the first computer program to beat top professional poker players at Heads-Up No-Limit Texas Hold'em. During the 20-day "Brains vs. Artificial Intelligence" competition in January 2017, Libratus played 120,000 hands against four poker pros, beating each player individually and collectively amassing more than $1.8 million in chips. The feat has yet to be duplicated. "Poker is an important challenge for AI because any poker player has to deal with incomplete information," said Michael Wooldridge, a professor of computer science at the University of Oxford and chair of the IJCAI Awards Committee. "Incomplete information makes the computational challenge orders of magnitude harder. Libratus used fundamentally new techniques for dealing with incomplete information which have exciting potential applications far beyond games." This is just the second time that the IJCAI has awarded the Minsky Medal. The inaugural recipient was the team behind DeepMind's AlphaGo system, which beat a world champion Go player in 2016. The award is named for Marvin Minsky, one of the founders of the field of AI and co-founder of MIT's Computer Science and AI Laboratory. It will be presented at the IJCAI 2019 conference in Macao, China, next August. "Marvin Minsky was a big, broad thinker and an AI pioneer. We are proud to receive the medal in his name," Sandholm said. "Computational techniques for solving imperfect-information games will have large numbers of applications in the future, since most real-world settings have more than one actor and imperfect information. I believe that this is a tipping point toward applications now that the best AI has reached a superhuman level, as measured on the main benchmark in the field." Libratus did not use expert domain knowledge or human data specific to poker. Rather, the AI analyzed the game's rules and devised its own strategy. The technology thus could be applied to any number of imperfect-information games. Such hidden information is ubiquitous in real-world strategic interactions, including business negotiation, cybersecurity, finance, strategic pricing and military applications. "We appreciate the community's recognition of the difficult challenges that hidden information poses to the field of artificial intelligence and the importance of addressing them," Brown said. "We look forward to applying this technology to a variety of real-world settings in a way that will have a positive impact on people's lives." Sandholm said he believes so strongly in the potential of this technology that he has founded two companies, Strategic Machine Inc. and Strategy Robot Inc., which have exclusively licensed Libratus' technology and other technologies from Sandholm's lab to create a variety of commercial applications. Sandholm, a leader of the university's CMU AI initiative, is the first recipient of Carnegie Mellon University's Angel Jordan Professorship in Computer Science. He also founded and directs the Electronic Marketplaces Laboratory and Optimized Markets Inc. Sandholm joined CMU in 2001, and works at the convergence of AI, economics and operations research. 
His algorithms run the nationwide kidney exchange for the United Network for Organ Sharing, autonomously making the kidney exchange transplant plan for 69 percent of U.S. transplant centers each week. One of his startups, Optimized Markets Inc., is bringing a new optimization-powered paradigm to advertising campaign sales and scheduling in television, streaming video and audio, internet display, mobile, game, radio and cross-media advertising. Through a prior startup he founded, he fielded 800 combinatorial electronic auctions for sourcing, totaling $60 billion. Sandholm's many honors include a National Science Foundation CAREER Award, the inaugural Association for Computing Machinery (ACM) Autonomous Agents Research Award, a Sloan Fellowship, an Edelman Laureateship, and the IJCAI's Computers and Thought Award. He is a fellow of the Association for Computing Machinery, Association for the Advancement of Artificial Intelligence, and Institute for Operations Research and the Management Sciences. He holds an honorary doctorate from the University of Zurich. Before undertaking his doctoral studies at CMU, Brown worked at the Federal Reserve Board in the International Financial Markets section, where he researched algorithmic trading in financial markets. Prior to that, he developed algorithmic trading strategies. At CMU, where he is advised by Sandholm, Brown has developed computational game theory techniques to produce AIs capable of strategic reasoning in large imperfect-information interactions. He is a recipient of an Open Philanthropy Project AI Fellowship and a Tencent AI Lab Fellowship. Brown and Sandholm have shared numerous awards for their research, including a best paper award at the 2017 Neural Information Processing Systems conference, the Allen Newell Award for Research Excellence and multiple supercomputing awards.
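For readers unfamiliar with how a program can devise its own strategy when information is hidden, the toy sketch below runs regret-matching self-play on rock-paper-scissors, where each player must act without seeing the other's move. This is not Libratus's algorithm; it is only a small, classic illustration of the regret-based self-play family of techniques used in imperfect-information game solving. With enough iterations, the players' average strategies approach the uniform Nash equilibrium.

```python
import numpy as np

ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors
# PAYOFF[a, b] is the payoff to a player choosing a against an opponent choosing b.
PAYOFF = np.array([[ 0, -1,  1],
                   [ 1,  0, -1],
                   [-1,  1,  0]])

def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else np.full(ACTIONS, 1.0 / ACTIONS)

def train(iterations=100_000, seed=0):
    rng = np.random.default_rng(seed)
    regrets = np.zeros((2, ACTIONS))
    strategy_sums = np.zeros((2, ACTIONS))
    for _ in range(iterations):
        strats = [strategy_from_regrets(regrets[p]) for p in range(2)]
        moves = [rng.choice(ACTIONS, p=strats[p]) for p in range(2)]
        for p in range(2):
            my, opp = moves[p], moves[1 - p]
            # Regret of each action = its payoff against the opponent's actual
            # move, minus the payoff of the action actually played.
            regrets[p] += PAYOFF[:, opp] - PAYOFF[my, opp]
            strategy_sums[p] += strats[p]
    return strategy_sums / iterations  # average strategies over training

if __name__ == "__main__":
    print(np.round(train(), 3))  # both rows approach [0.333, 0.333, 0.333]
```

The key idea the sketch shares with serious game-solving systems is that the program never needs human examples: it only needs the rules and payoffs, and self-play drives its average behavior toward an unexploitable strategy.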

Sadeh Speaks on Plenary Panel About Data Protection and Privacy

Daniel Tkacik

It's been five months since the General Data Protection Regulation (GDPR) went into effect in the European Union, setting strict rules for how personal data is collected and processed. Last week, CyLab's Norman Sadeh, a professor in the Institute for Software Research and co-director of the Privacy Engineering program, spoke about privacy, artificial intelligence (AI) and the challenges at the intersection of the two at the International Conference of Data Protection and Privacy Commissioners (ICDPPC) in Brussels. He sat down with CyLab to discuss his talk and privacy in general in the Q&A below.
Can you give us a taste of what the International Conference of Data Protection and Privacy Commissioners was all about?
As its name suggests, ICDPPC is the big international conference where once a year regulators and other key stakeholders from around the world come together to discuss privacy regulation and broader challenges associated with privacy.
You served as a panelist on one of the plenary panels, titled "Right vs. Wrong." What exactly was discussed?
This panel was aimed at broadening the scope of privacy discussions beyond just regulation and addressing deeper, more complex ethical issues related to the use and collection of data. I discussed how our research has shown that getting the full benefits of existing regulations, whether GDPR or the recently passed California Consumer Privacy Act, is hampered by the complex cognitive and behavioral limitations that people have. I talked about the technologies our group has been developing to help users make better-informed privacy decisions and overcome these limitations.
How exactly do you define what's right vs. what's wrong?
When people discuss ethics, they generally refer to a collection of principles that include basic expectations of trustworthiness, transparency, fairness and autonomy. As you can imagine, there is no single definition out there and this list is not exhaustive. In my presentation, I discussed the principles and methodologies our group uses to evaluate and fine-tune technologies we develop, and how we ultimately ask ourselves whether a user is better off with a given configuration of one or more technologies. This often involves running human subject studies designed to isolate and quantify the effects of those technologies. Examples of privacy technologies we have been developing range from technologies to nudge users to more carefully reflect on privacy decisions they need to make, to machine learning techniques to model people’s privacy preferences and help them configure privacy settings. They also include technologies to automatically answer privacy questions users may have about a given product or service.
Can you talk about the context in which this conference took place? What kinds of privacy regulations have we seen go into effect this year, and what other regulations might we see in the future?
This conference took place in a truly unique context. People’s concerns about privacy have steadily increased over the past several years, from the Snowden revelations of a few years ago to the Cambridge Analytica fiasco exposed earlier this year. People have come to realize that privacy is not just about having their data collected for the sake of sending them better targeted ads, but that it goes to the core of our democracy and how various actors are using the data they collect to manipulate our opinions and even influence our votes. The widespread use of artificial intelligence and how it can lead to bias, discrimination and other challenges is also of increasing concern to many. A keynote presentation at the conference by Apple CEO Tim Cook, along with messages from Facebook's Mark Zuckerberg and Alphabet's Sundar Pichai, suggests that big tech may now be in favor of a sweeping U.S. federal privacy law that would share some similarities with the EU GDPR. While the devil is in the details, such a development would mark a major shift in the way in which data collection and use practices are regulated in the U.S., where many technologies are by and large unregulated today.
How does your research inform some of these types of discussions?
Research is needed on many fronts, from developing a better understanding of how new technologies negatively impact people’s expectations of privacy to how we can mitigate the risks associated with undesirable inferences made by data mining algorithms. At the conference, I focused on some of the research we have conducted on modeling people’s privacy preferences and expectations, and how we have been able to develop technologies that can assist users in making better-informed decisions at scale.
How do you address the scale at which data is collected and the complexity of the value chains along which data travels?
Regulatory requirements by themselves, such as more transparent privacy policies and offering users more control and more privacy settings, are important but not sufficient to empower users to regain control over their data at scale. I strongly believe that our work on using AI to build privacy assistants can ultimately make a very big difference here. People are just unable to read the privacy policies and configure the settings associated with the many technologies with which they interact on a typical day. There is a need for intelligent assistants that can help them zoom in on those issues they care about, answer questions they have and help them configure settings.
What do you see as the main challenges for privacy in the age of AI and IoT?
A first issue is the scale at which data is collected and the diverse ways in which it is used. A second challenge has to do with the difficulty of controlling the inferences that can be made by machine learning algorithms. A third challenge in the context of IoT is that we don’t have any mechanisms today to even advertise the presence of these technologies — think cameras with computer vision or home assistants — let alone expose privacy settings to people who come in contact with these technologies. For instance, how is a camera supposed to allow users to opt in or opt out of facial expression recognition technology if the user does not even know the camera is there, doesn’t know that facial expression recognition algorithms are processing the footage, and has no user interface to opt in or out? If I had to identify one final challenge, I would emphasize the need to help and train developers to do a better job when it comes to adopting privacy-by-design practices, from data minimization practices all the way to more transparent data practice disclosures.