Machine learning is a promising and potentially powerful technique for detection and prognosis of disease. Machine learning methods, including where imaging and other data streams are combined with large electronic health databases, could enable a personalized approach to medicine through improved diagnosis and prediction of individual responses to therapies.
“However, any machine learning algorithm is only as good as the data it’s trained on,” said first author Dr. Michael Roberts from Cambridge’s Department of Applied Mathematics and Theoretical Physics. “Especially for a brand-new disease like COVID-19, it’s vital that the training data is as diverse as possible because, as we’ve seen throughout this pandemic, there are many different factors that affect what the disease looks like and how it behaves.”
“The international machine learning community went to enormous efforts to tackle the COVID-19 pandemic using machine learning,” said joint senior author Dr James Rudd, from Cambridge’s Department of Medicine. “These early studies show promise, but they suffer from a high prevalence of deficiencies in methodology and reporting, with none of the literature we reviewed reaching the threshold of robustness and reproducibility essential to support use in clinical practice.”
Many of the studies were hampered by issues with poor quality data, poor application of machine learning methodology, poor reproducibility, and biases in study design. For example, several training datasets used images from children for their ‘non-COVID-19’ data and images from adults for their COVID-19 data. “However, since children are far less likely to get COVID-19 than adults, all the machine learning model could usefully do was to tell the difference between children and adults, since including images from children made the model highly biased,” said Roberts.
Many of the machine learning models were trained on sample datasets that were too small to be effective. “In the early days of the pandemic, there was such a hunger for information, and some publications were no doubt rushed,” said Rudd. “But if you’re basing your model on data from a single hospital, it might not work on data from a hospital in the next town over: the data needs to be diverse and ideally international, or else you’re setting your machine learning model up to fail when it’s tested more widely.”
In many cases, the studies did not specify where their data had come from, or the models were trained and tested on the same data, or they were based on publicly available ‘Frankenstein datasets’ that had evolved and merged over time, making it impossible to reproduce the initial results.
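
The article's two complaints (models tested on the data they were trained on, and models that never see a second hospital) can be illustrated with a small sketch of a site-aware split. This is not taken from the reviewed studies. The record layout, the `site` field, and the function name `split_by_site` are hypothetical, chosen only to show that no hospital ends up on both sides of the split.

```python
import random
from collections import defaultdict


def split_by_site(records, test_fraction=0.25, seed=0):
    """Split records so that no site contributes to both train and test."""
    by_site = defaultdict(list)
    for record in records:
        by_site[record["site"]].append(record)

    # Shuffle the site names (not the records) with a fixed seed.
    sites = sorted(by_site)
    random.Random(seed).shuffle(sites)

    # Hold out whole sites; always keep at least one site for testing.
    n_test = max(1, round(len(sites) * test_fraction))
    held_out = set(sites[:n_test])

    train = [r for s in sites if s not in held_out for r in by_site[s]]
    test = [r for s in sites if s in held_out for r in by_site[s]]
    return train, test


# Hypothetical toy records: site names, ids and labels are made up.
records = [
    {"site": f"hospital_{i % 4}", "image_id": i, "label": i % 2}
    for i in range(20)
]
train, test = split_by_site(records)
train_sites = {r["site"] for r in train}
test_sites = {r["site"] for r in test}
print(sorted(train_sites), sorted(test_sites))
```

Evaluating on a hospital the model has never seen is a stricter test than a random split over pooled images, and it is closer to the "next town over" situation Rudd describes.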
This thread is another spinoff from my earlier thread called universal utopia. This time I try to attack the problem from another angle, the information theory point of view.
I started another thread related to this subject, asking about the quantification of accuracy and precision. We need to be able to compare the available methods for describing some aspect of objective reality, and to choose the best option based on cost and benefit considerations. I thought this was already common knowledge, but the course of the discussion showed that it wasn't. I guess I'll have to build a new theory for that. It's unfortunate that the thread has been removed, so new forum members can't explore how the discussion developed.
In my professional job, I deal with process control and automation, and with the engineering and maintenance of electrical and instrumentation systems. It's important for us to explore the leading technologies and use them to our advantage to survive the fierce industrial competition during this industrial revolution 4.0. One of the technologies closely related to this thread is the digital twin.
https://en.m.wikipedia.org/wiki/Digital_twin
Just like my other spinoff discussing universal morality, which can be reached by expanding the groups that develop their own subjective morality to the maximum extent permitted by logic, here I try to expand the scope of the virtualization of real-world objects, like the digital twin in the industrial sector, to cover other fields as well. Hopefully it will lead us to global governance, because all conscious beings known today share the same planet. In the future the scope will need to expand even further, because the exploration of other planets and solar systems is already under way.
In simple terms, a point of information reads true (absolute answers), whereas expansions of information read false (speculative).
What if you never had to fill out paperwork again? In Estonia, this is a reality: citizens conduct nearly all public services online, from starting a business to voting from their laptops, thanks to the nation's ambitious post-Soviet digital transformation known as "e-Estonia." One of the program's experts, Anna Piperal, explains the key design principles that power the country's "e-government" -- and shows why the rest of the world should follow suit to eradicate outdated bureaucracy and regain citizens' trust.
Objective reality contains a lot of objects with complex relationships among them. Hence, to build a virtual universe we must use a method capable of storing data that represents this complex system. The obvious choice is graphs, which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of vertices (also called nodes or points) which are connected by edges (also called links or lines).
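
The vertices-and-edges description above can be sketched in a few lines. This is a minimal illustration, assuming an undirected graph stored as adjacency sets. The class name `Graph` and the equipment names are hypothetical and not taken from any particular graph library.

```python
from collections import defaultdict


class Graph:
    """Undirected graph: each vertex maps to the set of its neighbours."""

    def __init__(self):
        self._adjacent = defaultdict(set)

    def add_edge(self, a, b):
        # An undirected edge is recorded from both endpoints.
        self._adjacent[a].add(b)
        self._adjacent[b].add(a)

    def neighbors(self, vertex):
        # Return a copy so callers cannot mutate the internal sets.
        return set(self._adjacent.get(vertex, set()))

    def vertices(self):
        return set(self._adjacent)

    def edge_count(self):
        # Every edge is stored twice (once per endpoint); no self-loops assumed.
        return sum(len(n) for n in self._adjacent.values()) // 2


# Hypothetical plant objects, in the spirit of a digital twin.
plant = Graph()
plant.add_edge("pump", "motor")
plant.add_edge("pump", "flow_sensor")
plant.add_edge("motor", "power_supply")
print(sorted(plant.neighbors("pump")))
```

Adjacency sets keep "what is connected to this object?" cheap to answer, which is the question a virtual model of related objects gets asked most often.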
## WTF is a graph database
- Euler and Graph Theory
- Math -- it's hard, let's skip it
- It's about data -- lots of it
- But let's zoom in and look at the basics

## Relational model vs graph model
- How do we represent THINGS in DBs
- Relational vs Graph
- Nodes and Relationships

## Why use a graph over a relational DB or other NoSQL?
- Very simple compared to RDBMS, and much more flexible
- The real power is in relationship-focused data (most NoSQL dbs don't treat relationships as first-order)
- As related-ness and amount of data increases, so does advantage of Graph DBs
- Much closer to our whiteboard model

EVENT: Nodevember 2016
SPEAKER: Ed Finkler
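
To make the "relational vs graph" point of the outline concrete, here is a small sketch of the same "knows" data in both shapes, answering one relationship-focused question (who do my contacts know?). It uses plain Python structures rather than a real database. The names, table layout, and function names are hypothetical, and a real RDBMS would use indexes rather than the full scans shown here.

```python
# Relational style: entities in one table, relationships in a join table.
people = {1: "Ada", 2: "Ben", 3: "Cy", 4: "Di"}
knows = [(1, 2), (2, 3), (3, 4)]  # rows of (person_id, knows_person_id)


def friends_of_friends_relational(person_id):
    # Two passes over the join table, like a self-join in SQL.
    first_hop = {b for a, b in knows if a == person_id}
    second_hop = {b for a, b in knows if a in first_hop}
    return {people[p] for p in second_hop - {person_id}}


# Graph style: each node carries its own outgoing relationships.
graph = {"Ada": ["Ben"], "Ben": ["Cy"], "Cy": ["Di"], "Di": []}


def friends_of_friends_graph(name):
    # Follow relationships directly from node to node.
    return {fof for friend in graph[name] for fof in graph[friend]} - {name}


print(friends_of_friends_relational(1), friends_of_friends_graph("Ada"))
```

Both functions return the same answer. The difference is that the graph version only touches the nodes it walks through, while the relational version consults the whole relationship table at each hop, which is the cost the talk says grows with the amount and related-ness of the data.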
Opening the black box to uncover the rules of the genome’s regulatory code.
Researchers at the Stowers Institute for Medical Research, in collaboration with colleagues at Stanford University and Technical University of Munich, have developed advanced explainable artificial intelligence (AI) in a technical tour de force to decipher regulatory instructions encoded in DNA. In a report published online on February 18, 2021, in Nature Genetics, the team found that a neural network trained on high-resolution maps of protein-DNA interactions can uncover subtle DNA sequence patterns throughout the genome and provide a deeper understanding of how these sequences are organized to regulate genes.
Neural networks are powerful AI models that can learn complex patterns from diverse types of data such as images, speech signals, or text to predict associated properties with impressively high accuracy. However, many see these models as uninterpretable since the learned predictive patterns are hard to extract from the model. This black-box nature has hindered the wide application of neural networks to biology, where interpretation of predictive patterns is paramount.
One of the big unsolved problems in biology is the genome’s second code: its regulatory code. DNA bases (commonly represented by letters A, C, G, and T) encode not only the instructions for how to build proteins, but also when and where to make these proteins in an organism. The regulatory code is read by proteins called transcription factors that bind to short stretches of DNA called motifs. However, how particular combinations and arrangements of motifs specify regulatory activity is an extremely complex problem that has been hard to pin down.
Now, an interdisciplinary team of biologists and computational researchers led by Stowers Investigator Julia Zeitlinger, PhD, and Anshul Kundaje, PhD, from Stanford University, have designed a neural network, named BPNet for Base Pair Network, that can be interpreted to reveal regulatory code by predicting transcription factor binding from DNA sequences with unprecedented accuracy. The key was to perform transcription factor-DNA binding experiments and computational modeling at the highest possible resolution, down to the level of individual DNA bases. This increased resolution allowed them to develop new interpretation tools to extract the key elemental sequence patterns such as transcription factor binding motifs and the combinatorial rules by which motifs function together as a regulatory code.
“More traditional bioinformatics approaches model data using pre-defined rigid rules that are based on existing knowledge. However, biology is extremely rich and complicated,” says Avsec. “By using neural networks, we can train much more flexible and nuanced models that learn complex patterns from scratch without previous knowledge, thereby allowing novel discoveries.”
BPNet’s network architecture is similar to that of neural networks used for facial recognition in images. For instance, the neural network first detects edges in the pixels, then learns how edges form facial elements like the eye, nose, or mouth, and finally detects how facial elements together form a face. Instead of learning from pixels, BPNet learns from the raw DNA sequence and learns to detect sequence motifs and eventually the higher-order rules by which the elements predict the base-resolution binding data.
Once the model is trained to be highly accurate, the learned patterns are extracted with interpretation tools. The output signal is traced back to the input sequences to reveal sequence motifs. The final step is to use the model as an oracle and systematically query it with specific DNA sequence designs, similar to what one would do to test hypotheses experimentally, to reveal the rules by which sequence motifs function in a combinatorial manner.
“The beauty is that the model can predict way more sequence designs than we could test experimentally,” Zeitlinger says. “Furthermore, by predicting the outcome of experimental perturbations, we can identify the experiments that are most informative to validate the model.” Indeed, with the help of CRISPR gene editing techniques, the researchers confirmed experimentally that the model’s predictions were highly accurate.
Since the approach is flexible and applicable to a variety of different data types and cell types, it promises to lead to a rapidly growing understanding of the regulatory code and how genetic variation impacts gene regulation. Both the Zeitlinger Lab and the Kundaje Lab are already using BPNet to reliably identify binding motifs for other cell types, relate motifs to biophysical parameters, and learn other structural features in the genome such as those associated with DNA packaging. To enable other scientists to use BPNet and adapt it for their own needs, the researchers have made the entire software framework available with documentation and tutorials.
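
The idea of detecting motifs in raw DNA sequence, much as an image network detects edges in pixels, can be illustrated with a toy sketch. This is not BPNet's code. It assumes the sequence is one-hot encoded and uses a single fixed, hand-written filter, whereas a trained network learns many filters from data. The function names and the example sequence are hypothetical.

```python
BASES = "ACGT"


def one_hot(sequence):
    """Encode a DNA string as rows of four 0/1 values, one row per base."""
    return [[1 if base == b else 0 for b in BASES] for base in sequence.upper()]


def scan_motif(sequence, motif):
    """Slide the motif along the sequence; each score counts matching bases."""
    encoded_seq = one_hot(sequence)
    encoded_motif = one_hot(motif)
    width = len(encoded_motif)
    scores = []
    for start in range(len(encoded_seq) - width + 1):
        window = encoded_seq[start:start + width]
        # Element-wise product summed over the window, as in a 1D convolution.
        score = sum(
            s * m
            for seq_row, motif_row in zip(window, encoded_motif)
            for s, m in zip(seq_row, motif_row)
        )
        scores.append(score)
    return scores


scores = scan_motif("ACGTGACGT", "ACG")
print(scores)
```

Positions where the score equals the motif length are exact matches. A learned filter would carry real-valued weights instead of 0/1 entries, so it could also score partial or degenerate motifs.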
In regard to a virtual universe, I consider that the smallest possible element of information would be a tuple.
Emil Eifrem, Neo4j Co-Founder and CEO, explains why connected data is the key to more accurate, efficient and credible learning systems. Using real-world use cases ranging from space engineering to investigative journalism, he will outline how a relationships-first approach adds context to data - the key to explainable, well-informed predictions.
Another jargon-busting video - Here I explain in simple terms what edge computing, sometimes called fog computing, is. I provide practical examples of computing at the edge of the network - in phones, cameras, etc.
With the emergence of offerings on both AWS (Neptune) and Azure (CosmosDB) within the past year it is fair to say that graph databases are one of the hottest trends and that they are here to stay. So what are graph databases all about then? You can read article after article about how great they are and that they will solve all your problems better than your relational database, but it's difficult to find any practical information about them.
This talk will start with a short primer on graph databases and the ecosystem but will then quickly transition to discussing the practical aspects of how to apply them to solve real world business problems. We will dive into what makes a good use case and what does not. We will then follow this up with some real world examples of some of the common patterns and anti-patterns of using graph databases. If you haven't been scared away by this point we will end by showing you some of the powerful insights that graph databases can provide you.
Edge computing places workloads closer to where data is created and where actions need to be taken. It addresses the unprecedented scale and complexity of data created by connected devices. As more and more data comes in from remote IoT edge devices and servers, it’s important to act on the data quickly. Acting quickly can help companies seize new business opportunities, increase operational efficiency and improve customer experiences.
In this video, Rob High, IBM Fellow and CTO, provides insights into the basic concepts and key use cases of edge computing.
When OpenAI released its huge natural-language algorithm GPT-3 last summer, jaws dropped. Coders and developers with special access to an early API rapidly discovered new (and unexpected) things GPT-3 could do with naught but a prompt. It wrote passable poetry, produced decent code, calculated simple sums, and with some edits, penned news articles.
All this, it turns out, was just the beginning. In a recent blog post update, OpenAI said that tens of thousands of developers are now making apps on the GPT-3 platform.
Over 300 apps (and counting) use GPT-3, and the algorithm is generating 4.5 billion words a day for them.
The Coming Torrent of Algorithmic Content
Each month, users publish about 70 million posts on WordPress, which is, hands down, the dominant content management system online.
Assuming an average article is 800 words long (which is speculation on my part, but not super long or short), people are churning out some 56 billion words a month or 1.8 billion words a day on WordPress.
If our average word count assumption is in the ballpark, then GPT-3 is producing over twice the daily word count of WordPress posts. Even if you make the average more like 2,000 words per article (which seems high to me) the two are roughly equivalent.
Now, not every word GPT-3 produces is a word worth reading, and it’s not necessarily producing blog posts (more on applications below). But in either case, just nine months in, GPT-3’s output seems to foreshadow a looming torrent of algorithmic content.
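
The back-of-the-envelope comparison above is easy to re-run. The sketch below uses the figures quoted in the article (70 million posts a month, 4.5 billion GPT-3 words a day) and the article's two guesses for average post length. The 30-day month and the function name are my own assumptions.

```python
POSTS_PER_MONTH = 70_000_000        # WordPress posts per month, as quoted
GPT3_WORDS_PER_DAY = 4_500_000_000  # GPT-3 output per day, as quoted
DAYS_PER_MONTH = 30                 # assumption: a 30-day month


def wordpress_words_per_day(avg_words_per_post):
    """Estimated WordPress output in words per day."""
    return POSTS_PER_MONTH * avg_words_per_post / DAYS_PER_MONTH


for avg in (800, 2000):
    daily = wordpress_words_per_day(avg)
    ratio = GPT3_WORDS_PER_DAY / daily
    print(f"{avg} words/post: WordPress ~{daily / 1e9:.2f}B words/day, "
          f"GPT-3 is {ratio:.2f}x that")
```

With 800 words per post the estimate is about 1.87 billion words a day, so GPT-3 produces roughly 2.4 times as much. With 2,000 words per post the two come out nearly equal, which matches the article's "roughly equivalent".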
Processing goes to the edge – networks and storage become the bottlenecks
We recently reported Microsoft Corp. Chief Executive Satya Nadella’s epic quote that we’ve reached peak centralization. The graphic below paints a picture that is telling. We just shared above that processing power is accelerating at unprecedented rates. And costs are dropping like a rock. Apple’s A14 costs the company $50 per chip. Arm at its v9 announcement said that it will have chips that can go into refrigerators that will optimize energy use and save 10% annually on power consumption. They said that chip will cost $1 — a buck to shave 10% off your electricity bill from the fridge.
Processing is plentiful and cheap. But look at where the expensive bottlenecks are: networks and storage. So what does this mean?
It means that processing is going to get pushed to the edge – wherever the data is born. Storage and networking will become increasingly distributed and decentralized, with custom silicon and processing power placed throughout the system and AI embedded to optimize workloads for latency, performance, bandwidth, security and other dimensions of value.
And remember, most of the data – 99% – will stay at the edge. We like to use Tesla Inc. as an example. The vast majority of data a Tesla car creates will never go back to the cloud. It doesn’t even get persisted. Tesla saves perhaps five minutes of data. But some data will connect occasionally back to the cloud to train AI models – we’ll come back to that.
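
The "keep only the last five minutes at the edge" pattern described above can be sketched as a rolling buffer that discards samples once they age out. This is only an illustration of the idea and says nothing about how Tesla actually implements it. The class name, the one-sample-per-second rate, and the readings are hypothetical.

```python
from collections import deque


class RollingBuffer:
    """Keep only the samples from the most recent `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._samples = deque()

    def append(self, timestamp, value):
        self._samples.append((timestamp, value))
        # Drop everything at or before the cutoff; older data is never stored.
        cutoff = timestamp - self.window_seconds
        while self._samples and self._samples[0][0] <= cutoff:
            self._samples.popleft()

    def snapshot(self):
        # What an edge device might upload when an interesting event occurs.
        return list(self._samples)

    def __len__(self):
        return len(self._samples)


# Hypothetical sensor: one reading per second for 1000 seconds.
buffer = RollingBuffer(window_seconds=300)
for t in range(1000):
    buffer.append(t, t * 0.1)
print(len(buffer), buffer.snapshot()[0][0])
```

Memory use stays bounded no matter how long the device runs, and only a `snapshot()` ever needs to cross the network, which is the bottleneck the article points to.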
Massive increases in processing power and cheap silicon will power the next wave of AI, machine intelligence, machine learning and deep learning.
We sometimes use artificial intelligence and machine intelligence interchangeably. This notion comes from our collaborations with author David Moschella. Interestingly, in his book “Seeing Digital,” Moschella says “there’s nothing artificial” about this:
There’s nothing artificial about machine intelligence just like there’s nothing artificial about the strength of a tractor.
It’s a nuance, but precise language can often bring clarity. We hear a lot about machine learning and deep learning and think of them as subsets of AI. Machine learning applies algorithms and code to data to get “smarter” – make better models, for example, that can lead to augmented intelligence and better decisions by humans, or machines. These models improve as they get more data and iterate over time.
Deep learning is a more advanced type of machine learning that uses more complex math.
OpenAI Brings Introspection to Reinforcement Learning Agents
The research around Evolved Policy Gradients attempts to recreate introspection in reinforcement learning models.
Introspection is one of those magical cognitive abilities that differentiate humans from other species. Conceptually, introspection can be defined as the ability to examine conscious thoughts and feelings. Introspection also plays a pivotal role in how humans learn. Have you ever tried to self-learn a new skill such as learning a new language? Even without any external feedback, you can quickly assess whether you are making progress on aspects such as vocabulary or pronunciation. Wouldn’t it be great if we could apply some of the principles of introspection to artificial intelligence (AI) disciplines such as reinforcement learning (RL)?
The magic of introspection comes from the fact that humans have access to very well shaped internal reward functions, derived from prior experience on other tasks, and through the course of biological evolution. That model contrasts sharply with RL agents, which are fundamentally coded to start from scratch on any learning task, relying mainly on external feedback. Not surprisingly, most RL models take substantially more time than humans to learn similar tasks. Recently, researchers from OpenAI published a new paper that proposes a method to address this challenge by creating RL models that know what it means to make progress on a new task, by having experienced making progress on similar tasks in the past.
Graph enhancements to Artificial Intelligence and Machine Learning are changing the landscape of intelligent applications. Beyond improving accuracy and modeling speed, graph technologies make building AI solutions more accessible. Join us to hear about 6 areas at the forefront of graph enhanced AI and ML, and find out which techniques are commonly used today and which hold the potential for disrupting industries.