Keynote: "What's wrong with LLMs and what we should be building instead"

Abstract: Large Language Models provide a pre-trained foundation for training many interesting AI systems. However, they have many shortcomings. They are expensive to train and to update, their non-linguistic knowledge is poor, they make false and self-contradictory statements, and these statements can be socially and ethically inappropriate. This talk will review these shortcomings and current efforts to address them within the existing LLM framework. It will then argue for a different, more modular architecture that decomposes the functions of existing LLMs and adds several additional components. We believe this alternative can address all of the shortcomings of LLMs. We will speculate about how this modular architecture could be built through a combination of machine learning and engineering.

Timeline:
00:00-02:00 Introduction to large language models and their capabilities
02:01-03:14 Problems with large language models: incorrect and contradictory answers
03:15-04:28 Problems with large language models: dangerous and socially unacceptable answers
04:29-06:40 Problems with large language models: expensive to train and lack of updateability
06:41-12:58 Problems with large language models: lack of attribution and poor non-linguistic knowledge
12:59-15:02 Benefits and limitations of retrieval augmentation
15:03-15:59 Challenges of attribution and data poisoning
16:00-18:00 Strategies to improve consistency in model answers
18:01-21:00 Reducing dangerous and socially inappropriate outputs
21:01-25:26 Learning and applying non-linguistic knowledge
25:27-37:35 Building modular systems to integrate reasoning and planning
37:36-39:20 Large language models have surprising capabilities but lack knowledge bases
39:21-40:47 Building modular systems that separate linguistic skill from world knowledge is important
40:48-45:47 Questions and discussions on cognitive architectures and addressing the issue of miscalibration
45:48 Overcoming flaws in large language models through prompt engineering and verification
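The retrieval augmentation the talk mentions can be sketched in a few lines. This is my own toy illustration, not from the talk: the retriever here is a crude word-overlap ranker standing in for an embedding-based one, and the point is only that facts live in an updatable corpus rather than in model weights.

```python
# Minimal sketch of retrieval augmentation (toy stand-in, not a real system):
# fetch relevant passages first, then condition the answer on them.

def retrieve(query, corpus, k=2):
    # Rank passages by crude word overlap with the query
    # (a stand-in for an embedding-based retriever).
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda p: -len(q & set(p.lower().split())))
    return scored[:k]

def answer(query, corpus):
    context = retrieve(query, corpus)
    # A real system would pass `context` plus `query` to an LLM here;
    # this sketch just returns the retrieved evidence.
    return context

corpus = [
    "The Eiffel Tower is in Paris.",
    "Photosynthesis converts light into chemical energy.",
    "Paris is the capital of France.",
]
print(answer("Where is the Eiffel Tower", corpus))
```

Updating the system's knowledge then means editing the corpus, not retraining the model, which is exactly the updateability argument the timeline raises.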
Is AlphaGeometry a key step toward AGI? Even DeepMind's leaders can't seem to make up their minds. In this video, I'll give you the rundown of what AlphaGeometry is, what it means and what it doesn't mean. Plus, I'll cover AlphaCodium, which dropped open-source tonight seemingly out of nowhere and is causing a big stir for what it might mean for coders the world over. And I'll touch on what I foresee as the future of large language models and their alliance with search.
As different things are bolted together, it reminds me of various regions of the brain that are specialized and work together.
Why do neural networks need to be deep? In this video we explore how neural networks transform perceptions into concepts. This video unravels the mystery behind how machines interpret input data, such as images or sounds, and categorize them into recognizable concepts. From the basic structure of neurons and layers to the intricate play of weights and activations, get a comprehensive understanding of the learning process. Explore real-world applications like handwriting recognition and how layered processing aids in effective data categorization. Whether it's distinguishing between summer and winter days based on temperature and humidity or recognizing handwritten digits, the magic lies in the layered architecture of neural networks. This video elucidates how these artificial networks mimic the human brain's ability to interpret, recognize, and reason, marking a significant stride in AI research towards creating machines capable of reasoning. Why layers matter.
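The summer/winter example from the description can be made concrete with a tiny layered network. This is my own minimal sketch, not code from the video: the weights are hand-picked for illustration rather than learned, but it shows the layered idea of raw perceptions (temperature, humidity) being transformed through a hidden layer into a concept.

```python
# A minimal two-layer network turning perceptions into a concept.
# Hand-picked weights (illustration only, not trained).
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # Each neuron: weighted sum of inputs plus bias, then a nonlinearity.
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def classify(temp_c, humidity):
    x = [temp_c / 40.0, humidity / 100.0]                   # normalize perceptions
    h = layer(x, [[6.0, -1.0], [-6.0, 1.0]], [-2.0, 2.0])   # hidden features
    y = layer(h, [[4.0, -4.0]], [0.0])                      # combine into a concept
    return "summer" if y[0] > 0.5 else "winter"

print(classify(30, 40))  # warm, dry day -> "summer"
print(classify(2, 85))   # cold, damp day -> "winter"
```

Depth matters because the hidden layer builds intermediate features that the output layer can combine; with more layers, such intermediate concepts can stack into progressively more abstract ones.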
This video demystifies the core insight behind Transformers, moving beyond traditional explanations that get lost in query, key, value matrices and positional encoding. Instead, we'll unravel how a unique kind of layer, capable of adapting its connection weights based on input context, catapults the Transformer's efficiency and processing prowess. Comparing this dynamic nature with static layers in traditional networks, we'll see why Transformers excel in handling complex tasks with fewer layers. Get a visual grasp of how mini networks within layers, known as attention heads, act as information filters, dynamically adjusting to input and enhancing the model's learning capability. This simplified yet insightful explanation aims to shed light on the essence of what makes Transformers a game-changer in the realm of deep learning.
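The "dynamic weights" idea the video builds on can be shown in a few lines. This is my own toy sketch, not the video's code: a static layer mixes its inputs with fixed weights, while an attention head computes its mixing weights from the input itself, so the same layer behaves differently for different sequences.

```python
# A minimal attention head over toy 2-d token vectors: the mixing weights
# are computed FROM the input (via dot-product scores), not fixed in advance.
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    # For each query: score every key, normalize scores into weights,
    # and return a weighted average of the values.
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) for k in keys])
        outputs.append([sum(w * v[i] for w, v in zip(weights, values))
                        for i in range(len(values[0]))])
    return outputs

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(attention(tokens, tokens, tokens))  # self-attention over the sequence
```

A real Transformer adds learned query/key/value projections and runs several such heads in parallel, but the input-dependent weighting above is the core mechanism.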
Meta's Shocking New Research | Self-Rewarding Language Models
The downfall of megalomaniacs, throughout history, has been the point at which they believed their own propaganda. Ozymandias, Canute, Napoleon, Hitler....
The academic question is whether Putin, Trump or Musk will be next. The humanitarian question is why do we tolerate these dangerous people?
AlphaGeometry is a combination of a symbolic solver and a large language model by Google DeepMind that tackles IMO geometry questions without any human-generated training data.

OUTLINE:
0:00 - Introduction
1:30 - Problem Statement
7:30 - Core Contribution: Synthetic Data Generation
9:30 - Sampling Premises
13:00 - Symbolic Deduction
17:00 - Traceback
19:00 - Auxiliary Construction
25:20 - Experimental Results
32:00 - Problem Representation
34:30 - Final Comments

Abstract: Proving mathematical theorems at the olympiad level represents a notable milestone in human-level automated reasoning [1-4], owing to their reputed difficulty among the world's best talents in pre-university mathematics. Current machine-learning approaches, however, are not applicable to most mathematical domains owing to the high cost of translating human proofs into machine-verifiable format. The problem is even worse for geometry because of its unique translation challenges [1,5], resulting in severe scarcity of training data. We propose AlphaGeometry, a theorem prover for Euclidean plane geometry that sidesteps the need for human demonstrations by synthesizing millions of theorems and proofs across different levels of complexity. AlphaGeometry is a neuro-symbolic system that uses a neural language model, trained from scratch on our large-scale synthetic data, to guide a symbolic deduction engine through infinite branching points in challenging problems. On a test set of 30 latest olympiad-level problems, AlphaGeometry solves 25, outperforming the previous best method that only solves ten problems and approaching the performance of an average International Mathematical Olympiad (IMO) gold medallist. Notably, AlphaGeometry produces human-readable proofs, solves all geometry problems in the IMO 2000 and 2015 under human expert evaluation and discovers a generalized version of a translated IMO theorem in 2004.

Authors: Trieu H. Trinh, Yuhuai Wu, Quoc V. Le, He He & Thang Luong
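The neuro-symbolic loop the abstract describes can be sketched schematically. This is my own toy stand-in, not DeepMind's code: `deduce` is a naive forward-chaining engine over set-valued facts, and the popped construction list stands in for the language model that would rank and propose auxiliary constructions when deduction stalls.

```python
# Schematic of the AlphaGeometry-style loop (toy stand-ins throughout):
# the symbolic engine deduces until stuck, then a proposal mechanism
# (a language model in the real system) adds an auxiliary construction.

def deduce(known, rules):
    # Forward-chain: apply rules until no new facts appear (symbolic deduction).
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def prove(goal, facts, rules, constructions, max_steps=5):
    known = set(facts)
    for _ in range(max_steps):
        deduce(known, rules)
        if goal in known:
            return True
        if not constructions:
            break
        # Stuck: the real system's language model would propose a
        # construction here; we just take the next toy suggestion.
        known.add(constructions.pop(0))
    return goal in known

rules = [({"A", "aux"}, "B"), ({"B"}, "goal")]
print(prove("goal", {"A"}, rules, constructions=["aux"]))  # True
```

The example shows the key point: the goal is unreachable by deduction alone until the auxiliary fact "aux" is introduced, which is exactly the role the neural model plays at the infinite branching points of real geometry problems.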
00:02 Mark Zuckerberg announces plans to merge Meta's two AI research efforts to build general intelligence
02:02 Mark Zuckerberg wants to open-source AGI
04:01 Meta's AI Chief skeptical about AI superintelligence and Quantum Computing
05:47 Debate on the future of AI and its capabilities
Sam Altman, CEO of OpenAI, and Satya Nadella, CEO of Microsoft, speak to The Economist's editor-in-chief, Zanny Minton Beddoes, about what the future of AI will really look like.

00:00 Sam Altman and Satya Nadella talk to The Economist
00:25 What's next for ChatGPT?
1:33 How dangerous is AGI?
2:32 AI regulation
An engineer can halt a training run, but a corporation cannot stop a profitable enterprise. And as more groups join the race, the only button they manufacture for themselves to press is ACCELERATE.
It shows how fast AI progress can catch up with human reasoning ability.
Quote from: alancalverd on 19/01/2024 22:17:34
Quote from: hamdani yusuf on 19/01/2024 21:14:18
Computer software has done those things in virtual environment.
Which is a roundabout way of saying that they haven't done them. I have flown to Mars and bombed the Mohne dam in a virtual environment. You don't get medals for not actually doing something.

Not yet. Computation is just one component of consciousness. That's why I said I prefer the holistic approach to consciousness. Combining AI and robotics, as Tesla and other tech companies are doing, can make the difference in the not-so-distant future.
Quote from: hamdani yusuf on 19/01/2024 21:14:18
Computer software has done those things in virtual environment.
Which is a roundabout way of saying that they haven't done them. I have flown to Mars and bombed the Mohne dam in a virtual environment. You don't get medals for not actually doing something.
Computer software has done those things in virtual environment.
ALOHA
[Paper] https://arxiv.org/abs/2304.13705
[Project Page] https://tonyzhaozh.github.io/aloha/

Mobile ALOHA
[Paper] https://arxiv.org/abs/2401.02117
[Project Page] https://mobile-aloha.github.io/
[Code] https://github.com/MarkFzp/act-plus-plus

ALOHA: Abstract
Fine manipulation tasks, such as threading cable ties or slotting a battery, are notoriously difficult for robots because they require precision, careful coordination of contact forces, and closed-loop visual feedback. Performing these tasks typically requires high-end robots, accurate sensors, or careful calibration, which can be expensive and difficult to set up. Can learning enable low-cost and imprecise hardware to perform these fine manipulation tasks? We present a low-cost system that performs end-to-end imitation learning directly from real demonstrations, collected with a custom teleoperation interface. Imitation learning, however, presents its own challenges, particularly in high-precision domains: the error of the policy can compound over time, drifting out of the training distribution. To address this challenge, we develop a novel algorithm, Action Chunking with Transformers (ACT), which reduces the effective horizon by simply predicting actions in chunks. This allows us to learn difficult tasks such as opening a translucent condiment cup and slotting a battery with 80-90% success, with only 10 minutes worth of demonstration data.

Mobile ALOHA: Abstract
Imitation learning from human demonstrations has shown impressive performance in robotics. However, most results focus on table-top manipulation, lacking the mobility and dexterity necessary for generally useful tasks. In this work, we develop a system for imitating mobile manipulation tasks that are bimanual and require whole-body control. We first present Mobile ALOHA, a low-cost and whole-body teleoperation system for data collection. It augments the ALOHA system with a mobile base and a whole-body teleoperation interface.
Using data collected with Mobile ALOHA, we then perform supervised behavior cloning and find that co-training with existing static ALOHA datasets boosts performance on mobile manipulation tasks. With 50 demonstrations for each task, co-training can increase success rates by up to 90%, allowing Mobile ALOHA to autonomously complete complex mobile manipulation tasks such as sauteing and serving a piece of shrimp, opening a two-door wall cabinet to store heavy cooking pots, calling and entering an elevator, and lightly rinsing a used pan using a kitchen faucet.

A quick clarification: 3:33 to 3:58 of the cooking and the day-in-the-life segments are all teleoperated, not behavior cloned. It was meant to be a demonstration of what teleoperation can do, and what behavioral cloning can potentially do with these teleoperated data.
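The action-chunking idea behind ACT can be sketched in a few lines. This is my own toy illustration, not the ACT code: `toy_policy` stands in for the trained transformer, and the point is only the control flow, predicting a chunk of k actions per observation and executing it open-loop, so the policy is queried horizon/k times instead of every step, shrinking the effective horizon over which errors compound.

```python
# Minimal sketch of action chunking: predict k actions per observation
# and re-plan only every k steps (toy stand-in for the ACT transformer).

def chunked_rollout(policy, obs, horizon, chunk_size):
    actions = []
    t = 0
    while t < horizon:
        chunk = policy(obs, chunk_size)       # predict k actions at once
        take = chunk[: horizon - t]
        actions.extend(take)                  # execute the chunk open-loop
        t += len(take)
        obs = actions[-1]                     # toy: next obs = last action
    return actions

def toy_policy(obs, k):
    # Stand-in policy: emit k incrementing actions from the observation.
    return [obs + i + 1 for i in range(k)]

print(chunked_rollout(toy_policy, 0, horizon=10, chunk_size=4))
# -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], with only 3 policy queries
```

The real ACT additionally ensembles overlapping chunks and uses a CVAE-trained transformer, but the horizon-reduction mechanism is the chunked prediction shown here.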
Paint-spraying robots have been learning from human experts for years. Machines have likewise been cleaning and sautéing shrimp.
Uncover the alarming truth about how money and corruption are undermining academic integrity, starting with a shocking revelation from a Cambridge researcher!

00:00 Intro
00:19 Nick Wise
01:04 The Discovery
01:33 The Scheme
03:00 The AI Effect
03:36 Bribes
04:46 Facebook
05:24 The Data
06:30 Solutions
Everything you wanted to know about how humanoid robots are trained, and will Tesla's Optimus be delayed due to a shortage of training data?

00:00 - Intro
01:27 - 2023: Simulations, Soccer, Martial Arts
07:11 - A Personal Story
09:31 - 2024: End2End NN, LLMs
16:52 - Train Your Own Robot
18:53 - Creepy and Fun
19:49 - Privacy, Character, Humanized Robot
23:10 - INSUFFICIENT Training Data?
30:37 - Who Wins? Optimus or Figure?
34:08 - So Who Will Win?
37:19 - FUN: Robot Faking It!