Powerful transformer models have been widely used in autoregressive generation, where they have advanced the state of the art beyond recurrent neural networks (RNNs). However, because these models predict each output word incrementally, conditioned on the prefix, generation requires quadratic time complexity with respect to sequence length. As performance increasingly relies on large-scale pretrained transformers, this long-sequence generation issue has become increasingly problematic. To address this, a research team from the University of Washington, Microsoft, DeepMind and the Allen Institute for AI has developed a method to convert a pretrained transformer into an efficient RNN. Their Transformer-to-RNN (T2R) approach speeds up generation and reduces memory cost.
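The core idea behind turning attention into a recurrence is that, once the softmax is replaced by a feature map over queries and keys, attention over the whole prefix can be folded into a fixed-size state that is updated once per step. The sketch below is a minimal illustration of that recurrence, with a simple ReLU-style feature map standing in for the small learned map T2R actually uses; the function names and shapes are assumptions for illustration, not the authors' code.

```python
import numpy as np

def phi(x):
    # Stand-in feature map; T2R learns this mapping (assumption for illustration).
    return np.maximum(x, 0.0) + 1e-6

def generate_step(q_t, k_t, v_t, state):
    """One decoding step of linear attention viewed as an RNN.
    The state (S, z) summarizes the entire prefix, so each step
    costs O(1) in the sequence length instead of O(t)."""
    S, z = state
    fk = phi(k_t)
    S = S + np.outer(fk, v_t)      # accumulate key-value outer products
    z = z + fk                     # accumulate key features for normalization
    fq = phi(q_t)
    out = (fq @ S) / (fq @ z)      # attention output for this step
    return out, (S, z)

# usage sketch:
# d = 8
# state = (np.zeros((d, d)), np.zeros(d))
# out, state = generate_step(np.ones(d), np.ones(d), np.ones(d), state)
```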
Overall, the results validated that T2R achieves efficient autoregressive generation while retaining high accuracy, proving that large-scale pretrained models can be compressed into efficient inference models that facilitate downstream applications.
Whatever business a company may be in, software plays an increasingly vital role, from managing inventory to interfacing with customers. Software developers, as a result, are in greater demand than ever, and that's driving the push to automate some of the easier tasks that take up their time.
A machine capable of programming itself once seemed like science fiction. But an exponential rise in computing power, advances in natural language processing, and a glut of free code on the internet have made it possible to automate at least some aspects of software design.

Trained on GitHub and other program-sharing websites, code-processing models learn to generate programs just as other language models learn to write news stories or poetry. This allows them to act as a smart assistant, predicting what software developers will do next, and offering an assist. They might suggest programs that fit the task at hand, or generate program summaries to document how the software works. Code-processing models can also be trained to find and fix bugs. But despite their potential to boost productivity and improve software quality, they pose security risks that researchers are just starting to uncover.
"Our framework for attacking the model, and retraining it on those particular exploits, could potentially help code-processing models get a better grasp of the program's intent," says Liu, co-senior author of the study. "That's an exciting direction waiting to be explored."In the background, a larger question remains: what exactly are these black-box deep-learning models learning? "Do they reason about code the way humans do, and if not, how can we make them?" says O'Reilly. "That's the grand challenge ahead for us."
Jonny Cheetham, Sales Director: Graph databases are a rising tide in the world of big data insights, and the enterprises that tap into their power realize significant competitive advantages.

So how might your enterprise leverage graph databases to generate competitive insights and derive significant business value from your connected data? This webinar will show you the top five most impactful and profitable use cases of graph databases.
Multimodal Neurons in Artificial Neural Networks
We’ve discovered neurons in CLIP that respond to the same concept whether presented literally, symbolically, or conceptually. This may explain CLIP’s accuracy in classifying surprising visual renditions of concepts, and is also an important step toward understanding the associations and biases that CLIP and similar models learn.
Our discovery of multimodal neurons in CLIP gives us a clue as to what may be a common mechanism of both synthetic and natural vision systems—abstraction. We discover that the highest layers of CLIP organize images as a loose semantic collection of ideas, providing a simple explanation for both the model’s versatility and the representation’s compactness.
Animals are constantly moving and behaving in response to instructions from the brain. But while there are advanced techniques for measuring these instructions in terms of neural activity, there is a paucity of techniques for quantifying the behavior itself in freely moving animals. This inability to measure the key output of the brain limits our understanding of the nervous system and how it changes in disease.

A new study by researchers at Duke University and Harvard University introduces an automated tool that can readily capture behavior of freely behaving animals and precisely reconstruct their three-dimensional (3D) pose from a single video camera and without markers.

The April 19 study in Nature Methods led by Timothy W. Dunn, Assistant Professor, Duke University, and Jesse D. Marshall, postdoctoral researcher, Harvard University, describes a new 3D deep-neural network, DANNCE (3-Dimensional Aligned Neural Network for Computational Ethology). The study follows the team's 2020 study in Neuron which revealed the groundbreaking behavioral monitoring system, CAPTURE (Continuous Appendicular and Postural Tracking using Retroreflector Embedding), which uses motion capture and deep learning to continuously track the 3D movements of freely behaving animals. CAPTURE yielded an unprecedentedly detailed description of how animals behave. However, it required using specialized hardware and attaching markers to animals, making it a challenge to use.

"With DANNCE we relieve this requirement," said Dunn. "DANNCE can learn to track body parts even when they can't be seen, and this increases the types of environments in which the technique can be used. We need this invariance and flexibility to measure movements in naturalistic environments more likely to elicit the full and complex behavioral repertoire of these animals."

DANNCE works across a broad range of species and is reproducible across laboratories and environments, ensuring it will have a broad impact on animal—and even human—behavioral studies. It has a specialized neural network tailored to 3D pose tracking from video. A key aspect is that its 3D feature space is in physical units (meters) rather than camera pixels. This allows the tool to more readily generalize across different camera arrangements and laboratories. In contrast, previous approaches to 3D pose tracking used neural networks tailored to pose detection in two dimensions (2D), which struggled to readily adapt to new 3D viewpoints.
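The detail that lets a metric feature space generalize across camera arrangements is that any calibrated camera can be mapped onto the same world-coordinate grid. The snippet below is a minimal sketch of that idea under a standard pinhole model with intrinsics K and extrinsics (R, t); the grid size, example numbers, and helper name are illustrative and not DANNCE's actual implementation.

```python
import numpy as np

def project_to_pixels(points_m, K, R, t):
    """Map 3D points in world coordinates (meters) to pixel coordinates
    for one calibrated camera (pinhole model, no lens distortion)."""
    cam = R @ points_m.T + t.reshape(3, 1)   # world frame -> camera frame
    uv = K @ cam                             # camera frame -> image plane
    return (uv[:2] / uv[2]).T                # perspective divide -> pixels

# A metric voxel grid centered on the animal; every calibrated camera can
# sample its 2D image features at the pixel locations of these same 3D
# points, which is what makes the representation independent of any one
# camera arrangement.
xs = np.linspace(-0.1, 0.1, 8)               # +/- 10 cm cube, 8^3 voxels
grid = np.stack(np.meshgrid(xs, xs, xs), -1).reshape(-1, 3)

K = np.array([[900.0, 0, 640], [0, 900.0, 360], [0, 0, 1]])  # example intrinsics
R, t = np.eye(3), np.array([0.0, 0.0, 1.0])                   # camera 1 m away
pixels = project_to_pixels(grid, K, R, t)
```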
Do Neural Networks Think Like Our Brain? OpenAI Answers!
https://openai.com/blog/multimodal-neurons/
Components make compute and storage servers, and servers with application plane, control plane, and data plane software running atop them or alongside them make systems, and workflows across systems make platforms. The end state goal of any system architect is really creating a platform. If you don’t have an integrated platform, then what you have is an IT nightmare.

That is what four decades of distributed computing has really taught us, if you boil off all the pretty water that obscures with diffraction and bubbling and look very hard at the bottom of the pot into the substrate of bullshit left behind.

Maybe we should have something called a platform architect? And maybe they don’t have those titles at the big hyperscalers and public cloud builders, but that is, in fact, what these companies are doing. And for those of us who have been around for a while, it is with a certain amount of humor that we are seeing the rise of the most vertically integrated, proprietary platforms that the world has seen since the IBM System/360 mainframe and the DEC VAX, IBM AS/400, and HP 3000 – there was no “E” back then – minicomputers in the 1960s and the 1970s.
We are starting to see more exascale and large supercomputing sites benchmark and project the deep learning capabilities of systems designed for HPC applications, but only a few have run system-wide tests to see how their machines might stack up against standard CNN and other metrics.

In China, however, we finally have some results about the potential for leadership-class systems to tackle deep learning. That is interesting in itself, but in the case of AI benchmarks on the Tianhe-3 exascale prototype supercomputer, we also get a sense of how that system’s unique Arm-based architecture performs for math that is quite different from that required for HPC modeling/simulation.
It is hard to tell what to expect from this novel architecture in terms of AI workloads, but for us, the news is that the system is operational and teams are at least exploring what might be possible in scaling deep learning using an Arm-based architecture and unique interconnect. It also shows that there is still work to be done to optimize Arm-based processors for even routine AI benchmarks to keep pace with other companies' CPUs and accelerators.
Building a computer that can support artificial intelligence at the scale and complexity of the human brain will be a colossal engineering effort. Now researchers at the National Institute of Standards and Technology have outlined how they think we’ll get there.

How, when, and whether we’ll ever create machines that can match our cognitive capabilities is a topic of heated debate among both computer scientists and philosophers. One of the most contentious questions is the extent to which the solution needs to mirror our best example of intelligence so far: the human brain.

Rapid advances in AI powered by deep neural networks—which despite their name operate very differently than the brain—have convinced many that we may be able to achieve “artificial general intelligence” without mimicking the brain’s hardware or software.

Others think we’re still missing fundamental aspects of how intelligence works, and that the best way to fill the gaps is to borrow from nature. For many that means building “neuromorphic” hardware that more closely mimics the architecture and operation of biological brains.

The problem is that the existing computer technology we have at our disposal looks very different from biological information processing systems, and operates on completely different principles. For a start, modern computers are digital and neurons are analog. And although both rely on electrical signals, they come in very different flavors, and the brain also uses a host of chemical signals to carry out processing.

Now though, researchers at NIST think they’ve found a way to combine existing technologies in a way that could mimic the core attributes of the brain. Using their approach, they outline a blueprint for a “neuromorphic supercomputer” that could not only match, but surpass the physical limits of biological systems.

The key to their approach, outlined in Applied Physics Letters, is a combination of electronics and optical technologies. The logic is that electronics are great at computing, while optical systems can transmit information at the speed of light, so combining them is probably the best way to mimic the brain’s excellent computing and communication capabilities.
In June 2020, a new and powerful artificial intelligence (AI) began dazzling technologists in Silicon Valley. Called GPT-3 and created by the research firm OpenAI in San Francisco, California, it was the latest and most powerful in a series of ‘large language models’: AIs that generate fluent streams of text after imbibing billions of words from books, articles and websites. GPT-3 had been trained on around 200 billion words, at an estimated cost of tens of millions of dollars.

The developers who were invited to try out GPT-3 were astonished. “I have to say I’m blown away,” wrote Arram Sabeti, founder of a technology start-up who is based in Silicon Valley. “It’s far more coherent than any AI language system I’ve ever tried. All you have to do is write a prompt and it’ll add text it thinks would plausibly follow. I’ve gotten it to write songs, stories, press releases, guitar tabs, interviews, essays, technical manuals. It’s hilarious and frightening. I feel like I’ve seen the future.”

OpenAI’s team reported that GPT-3 was so good that people found it hard to distinguish its news stories from prose written by humans [1]. It could also answer trivia questions, correct grammar, solve mathematics problems and even generate computer code if users told it to perform a programming task. Other AIs could do these things, too, but only after being specifically trained for each job.

Large language models are already business propositions. Google uses them to improve its search results and language translation; Facebook, Microsoft and Nvidia are among other tech firms that make them. OpenAI keeps GPT-3’s code secret and offers access to it as a commercial service. (OpenAI is legally a non-profit company, but in 2019 it created a for-profit subentity called OpenAI LP and partnered with Microsoft, which invested a reported US$1 billion in the firm.) Developers are now testing GPT-3’s ability to summarize legal documents, suggest answers to customer-service enquiries, propose computer code, run text-based role-playing games or even identify at-risk individuals in a peer-support community by labelling posts as cries for help.
Towards complete and error-free genome assemblies of all vertebrate species
High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species [1,2,3,4]. To address this issue, the international Genome 10K (G10K) consortium [5,6] has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.
The Vertebrate Genomes Project
Building on this initial set of assembled genomes and the lessons learned, we propose to expand the VGP to deeper taxonomic phases, beginning with phase 1: representatives of approximately 260 vertebrate orders, defined here as lineages separated by 50 million or more years of divergence from each other. Phase 2 will encompass species that represent all approximately 1,000 vertebrate families; phase 3, all roughly 10,000 genera; and phase 4, nearly all 71,657 extant named vertebrate species (Supplementary Note 5, Supplementary Fig. 3). To accomplish such a project within 10 years, we will need to scale up to completing 125 genomes per week, without sacrificing quality. This includes sample permitting, high molecular weight DNA extractions, sequencing, meta-data tracking, and computational infrastructure. We will take advantage of continuing improvements in genome sequencing technology, assembly, and annotation, including advances in PacBio HiFi reads, Oxford Nanopore reads, and replacements for 10XG reads (Supplementary Note 6), while addressing specific scientific questions at increasing levels of phylogenetic refinement. Genomic technology advances quickly, but we believe the principles of our pipeline and the lessons learned will be applicable to future efforts. Areas in which improvement is needed include more accurate and complete haplotype phasing, base-call accuracy, and resolution of long repetitive regions such as telomeres, centromeres, and sex chromosomes. The VGP is working towards these goals and making all data, protocols, and pipelines openly available (Supplementary Notes 5, 7).

Despite remaining imperfections, our reference genomes are the most complete and highest quality to date for each species sequenced, to our knowledge. When we began to generate genomes beyond the Anna’s hummingbird in 2017, only eight vertebrate species in GenBank had genomes that met our target continuity metrics, and none were haplotype phased (Supplementary Table 23). The VGP pipeline introduced here has now been used to complete assemblies of more than 130 species of similar or higher quality (Supplementary Note 5; BioProject PRJNA489243). We encourage the scientific community to use and evaluate the assemblies and associated raw data, and to provide feedback towards improving all processes for complete and error-free assembled genomes of all species.
A major part of real-world AI has to be solved to make unsupervised, generalized full self-driving work, as the entire road system is designed for biological neural nets with optical imagers
Designing deep neural networks these days is more art than science. In the deep learning space, any given problem can be addressed with a fairly large number of neural network architectures. In that sense, designing a deep neural network from the ground up for a given problem can be incredibly expensive in terms of time and computational resources. Additionally, given the lack of guidance in the space, we often end up producing neural network architectures that are suboptimal for the task at hand. About two years ago, artificial intelligence (AI) researchers from Google published a paper proposing a method called MorphNet to optimize the design of deep neural networks.
Automated neural network design is one of the most active areas of research in the deep learning space. The most traditional approach to neural network architecture design involves sparse regularizers using methods such as L1. While this technique has proven effective at reducing the number of connections in a neural network, it quite often ends up producing suboptimal architectures. Another approach involves using search techniques to find an optimal neural network architecture for a given problem. That method has been able to generate highly optimized neural network architectures, but it requires an exorbitant number of trial-and-error attempts, which often proves computationally prohibitive. As a result, neural network architecture search has only proven effective in very specialized scenarios. Factoring in the limitations of the previous methods, we can arrive at three key characteristics of effective automated neural network design techniques:
a) Scalability: The automated design approach should be scalable to large datasets and models.
b) Multi-Factor Optimization: An automated method should be able to optimize the structure of a deep neural network while targeting specific resources.
c) Optimality: An automated neural network design should produce an architecture that improves performance while reducing the usage of the target resource.
MorphNet
Google’s MorphNet approaches the problem of automated neural network architecture design from a slightly different angle. Instead of trying numerous architectures across a large design space, MorphNet starts with an existing architecture for a similar problem and, in one shot, optimizes it for the task at hand. MorphNet optimizes a deep neural network by iteratively shrinking and expanding its structure. In the shrinking phase, MorphNet identifies inefficient neurons and prunes them from the network by applying a sparsifying regularizer such that the total loss function of the network includes a cost for each neuron. Just doing this typically results in a neural network that consumes less of the targeted resource, but typically achieves lower performance. However, MorphNet applies a specific shrinking model that not only highlights which layers of a neural network are over-parameterized, but also which layers are bottlenecked. Instead of applying a uniform cost per neuron, MorphNet calculates a neuron cost with respect to the targeted resource. As training progresses, the optimizer is aware of the resource cost when calculating gradients, and thus learns which neurons are resource-efficient and which can be removed.
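To make the shrinking phase concrete, here is a minimal sketch, under stated assumptions, of a MorphNet-style resource-aware regularizer in PyTorch: an L1 penalty on BatchNorm scale factors weighted by the approximate FLOP cost each output channel adds. The helper name, the fixed input resolution, and the Conv-then-BatchNorm assumption are illustrative simplifications, not Google's implementation.

```python
import torch
import torch.nn as nn

def flop_weighted_l1(model, image_hw=(224, 224), strength=1e-8):
    """Sketch of a shrinking regularizer: L1 on BatchNorm gammas, weighted by
    the FLOPs each output channel of the preceding convolution contributes.
    Channels whose gamma is driven toward zero can then be pruned.
    Simplification: assumes each Conv2d is immediately followed by a
    BatchNorm2d and ignores striding/resolution changes across layers."""
    penalty = 0.0
    prev_conv = None
    h, w = image_hw
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            prev_conv = m
        elif isinstance(m, nn.BatchNorm2d) and prev_conv is not None:
            k_h, k_w = prev_conv.kernel_size
            # Approximate FLOPs contributed by one output channel.
            cost_per_channel = prev_conv.in_channels * k_h * k_w * h * w
            penalty = penalty + cost_per_channel * m.weight.abs().sum()
            prev_conv = None
    return strength * penalty

# usage sketch: total_loss = task_loss + flop_weighted_l1(model)
```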
Self-Supervised Learning is the final frontier in Representation Learning: getting useful features without any labels. Facebook AI's new system, DINO, combines advances in Self-Supervised Learning for Computer Vision with the new Vision Transformer (ViT) architecture and achieves impressive results without any labels. Attention maps can be directly interpreted as segmentation maps, and the obtained representations can be used for image retrieval and zero-shot k-nearest neighbor classifiers (KNNs).

OUTLINE:
0:00 - Intro & Overview
6:20 - Vision Transformers
9:20 - Self-Supervised Learning for Images
13:30 - Self-Distillation
15:20 - Building the teacher from the student by moving average
16:45 - DINO Pseudocode
23:10 - Why Cross-Entropy Loss?
28:20 - Experimental Results
33:40 - My Hypothesis why this works
38:45 - Conclusion & Comments
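For a feel of the self-distillation step covered around the 13:30-23:10 marks, the sketch below condenses the paper's published pseudocode: two augmented views, a cross-entropy loss from the centered and sharpened teacher outputs to the student outputs, and a teacher updated as an exponential moving average of the student. Argument names are illustrative, and the optimizer step and multi-crop augmentation are omitted.

```python
import torch
import torch.nn.functional as F

def dino_step(student, teacher, x1, x2, center, tps=0.1, tpt=0.04, m=0.996):
    """One simplified DINO update: cross-view self-distillation with an
    EMA teacher. `center` is a running (K,)-dim tensor over teacher outputs."""
    s1, s2 = student(x1), student(x2)          # student logits for both views
    with torch.no_grad():
        t1, t2 = teacher(x1), teacher(x2)      # teacher logits, no gradients

    def H(t, s):
        t = F.softmax((t - center) / tpt, dim=-1)      # center + sharpen teacher
        return -(t * F.log_softmax(s / tps, dim=-1)).sum(dim=-1).mean()

    loss = (H(t1, s2) + H(t2, s1)) / 2                 # cross-view cross-entropy
    loss.backward()
    # ... optimizer.step() on the student goes here ...
    with torch.no_grad():
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.mul_(m).add_(ps, alpha=1 - m)           # teacher = EMA of student
        center.mul_(0.9).add_(torch.cat([t1, t2]).mean(dim=0), alpha=0.1)
    return loss
```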
As the world fights the SARS-CoV-2 virus causing the COVID-19 pandemic, another group of dangerous pathogens looms in the background. The threat of antibiotic-resistant bacteria has been growing for years and appears to be getting worse. If COVID-19 taught us one thing, it’s that governments should be prepared for more global public health crises, and that includes finding new ways to combat rogue bacteria that are becoming resistant to commonly used drugs.

In contrast to the current pandemic, viruses may be the heroes of the next epidemic rather than the villains. Scientists have shown that viruses could be great weapons against bacteria that are resistant to antibiotics.
Since the discovery of penicillin in 1928, antibiotics have changed modern medicine. These small molecules fight off bacterial infections by killing or inhibiting the growth of bacteria. The mid-20th century was called the Golden Age for antibiotics, a time when scientists were discovering dozens of new molecules for many diseases.

This high was soon followed by a devastating low. Researchers saw that many bacteria were evolving resistance to antibiotics. Bacteria in our bodies were learning to evade medicine by evolving and mutating to the point that antibiotics no longer worked.

As an alternative to antibiotics, some researchers are turning to a natural enemy of bacteria: bacteriophages. Bacteriophages are viruses that infect bacteria. They outnumber bacteria 10 to 1 and are considered the most abundant organisms on the planet.

Bacteriophages, also known as phages, survive by infecting bacteria, replicating and bursting out from their host, which destroys the bacterium.

Harnessing the power of phages to fight bacteria isn’t a new idea. In fact, the first recorded use of so-called phage therapy was over a century ago. In 1919, French microbiologist Félix d'Hérelle used a cocktail of phages to treat children suffering from severe dysentery.

D'Hérelle’s actions weren’t an accident. In fact, he is credited with co-discovering phages, and he pioneered the idea of using bacteria’s natural enemies in medicine. He would go on to stop cholera outbreaks in India and plague in Egypt.

Phage therapy is not a standard treatment you can find in your local hospital today. But excitement about phages has grown over the past few years. In particular, scientists are using new knowledge about the complex relationship between phages and bacteria to improve phage therapy. By engineering phages to better target and destroy bacteria, scientists hope to overcome antibiotic resistance.
Now scientists are hoping to use the knowledge about CRISPR systems to engineer phages to destroy dangerous bacteria.When the engineered phage locates specific bacteria, the phage injects CRISPR proteins inside the bacteria, cutting and destroying the microbes’ DNA. Scientists have found a way to turn defense into offense. The proteins normally involved in protecting against viruses are repurposed to target and destroy the bacteria’s own DNA. The scientists can specifically target the DNA that makes the bacteria resistant to antibiotics, making this type of phage therapy extremely effective.
Science is only half of the solution when it comes to fighting these microbes. Commercialization and regulation are important to ensure that this technology is in society’s toolkit for fending off a worldwide spread of antibiotic-resistant bacteria.
Summary: A new neuroimaging technique captures the brain in motion in real time, generating a 3D view with improved detail. The new technology could help clinicians spot hard-to-detect neurological conditions.
Source: Stevens Institute of Technology
Scientists are exploring how to edit genomes and even create brand new ones that never existed before, but how close are we to harnessing synthetic life? Scientists have made major strides when it comes to understanding the base code that underlies all living things—but what if we could program living cells like software?

The principle behind synthetic biology, the emerging study of building living systems, lies in this ability to synthesize life: the ability to create animal products, individualized medical therapies, and even transplantable organs, all starting with synthetic DNA and cells in a lab. There are two main schools of thought when it comes to synthesizing life: building artificial cells from the bottom up, or engineering microorganisms so extensively that their genomes are resynthesized and redesigned.

With genetic engineering tools becoming more and more accessible, researchers want to use these synthesized genomes to enhance human health, for example by detecting infections or environmental pollutants. Bacterial cells can be engineered to detect toxic chemicals, and these synthesized bacteria could potentially protect us from, for example, consuming toxins in contaminated water. The world of synthetic biology goes beyond human health, though; it can be used in a variety of industries, including fashion. Researchers hope to come up with lab-made versions of materials like leather or silk.
For decades, biologists have read and edited DNA, the code of life. Revolutionary developments are giving scientists the power to write it. Instead of tinkering with existing life forms, synthetic biologists may be on the verge of writing the DNA of a living organism from scratch. In the next decade, according to some, we may even see the first synthetic human genome. Join a distinguished group of synthetic biologists, geneticists and bioengineers who are edging closer to breathing life into matter.

This program is part of the Big Ideas Series, made possible with support from the John Templeton Foundation.
Original Program Date: June 4, 2016
MODERATOR: Robert Krulwich
PARTICIPANTS: George Church, Drew Endy, Tom Knight, Pamela Silver
Synthetic Biology and the Future of Creation 00:00
Participant Intros 3:25
Ordering DNA from the internet 8:10
How much does it cost to make a synthetic human? 13:04
Why is yeast the best catalyst? 20:10
How George Church printed 90 billion copies of his book 26:05
Creating synthetic rose oil 28:35
Safety engineering and synthetic biology 37:15
Do we want to be invaded by bad bacteria? 45:26
Do you need human genes to create human cells? 55:09
The standard of DNA sequencing in utero 1:02:27
The science community is divided by closed press meetings 1:11:30
The Human Genome Project. What is it? 1:21:45
Principal component analysis (PCA) is one of the key algorithms in any machine learning curriculum. Initially created in the early 1900s, PCA is a fundamental algorithm for understanding data in high-dimensional spaces, which are common in deep learning problems. More than a century after its invention, PCA is such a key part of modern deep learning frameworks that very few question whether there could be a better approach. Just a few days ago, DeepMind published a fascinating paper that looks to redefine PCA as a competitive multi-agent game called EigenGame.

Titled “EigenGame: PCA as a Nash Equilibrium”, the DeepMind work is one of those papers that you can’t resist reading just based on the title. Redefining PCA sounds ludicrous, and yet DeepMind’s thesis makes perfect sense the minute you dive deep into it.

In recent years, PCA techniques have hit a bottleneck in large-scale deep learning scenarios. Originally designed for mechanical devices, traditional PCA is formulated as an optimization problem that is hard to scale across large computational clusters. A multi-agent approach to PCA might be able to leverage vast computational resources and produce better optimizations in modern deep learning problems.
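To make the game framing concrete, here is a minimal sketch of the EigenGame-style update described in the paper: each "player" owns one direction, its reward is the variance it captures, and it is penalized for aligning with the directions of the players ranked before it; gradient ascent on the unit sphere drives the players toward the principal components. The hyperparameters and the plain sequential NumPy loop are illustrative, not DeepMind's distributed implementation.

```python
import numpy as np

def eigengame_pca(X, k=2, lr=0.1, iters=1000, seed=0):
    """Approximate the top-k principal components of X via the EigenGame
    utility: reward for captured variance, penalty for alignment (under M)
    with higher-ranked players."""
    X = X - X.mean(axis=0)
    M = X.T @ X / len(X)                     # (scaled) covariance matrix
    rng = np.random.default_rng(seed)
    d = M.shape[0]
    V = rng.normal(size=(d, k))
    V /= np.linalg.norm(V, axis=0)           # start each player on the sphere
    for _ in range(iters):
        for i in range(k):
            v = V[:, i]
            reward = M @ v                   # capture variance along v
            penalty = np.zeros(d)
            for j in range(i):               # stay decorrelated from parents
                u = V[:, j]
                penalty += (v @ M @ u) / (u @ M @ u) * (M @ u)
            grad = 2 * (reward - penalty)
            grad -= (grad @ v) * v           # Riemannian projection onto sphere
            v = v + lr * grad
            V[:, i] = v / np.linalg.norm(v)  # renormalize to unit length
    return V                                 # columns ~ top-k principal components

# usage sketch: comps = eigengame_pca(np.random.randn(500, 10), k=3)
```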