The future of AI
April 12, 2024
This will be a short overview with plenty of cited sources, meant to shed light on the foundations of the next-generation AI being trained by the major companies right now. OpenAI, at the very least, recurs across every paper below (research from one or more of its scientists).
Papers:
https://arxiv.org/pdf/2106.04399.pdf - 19 Nov 2021
https://arxiv.org/pdf/2202.01361.pdf - 8 Jun 2022
https://arxiv.org/pdf/2302.06576.pdf - 3 Jun 2023
https://arxiv.org/pdf/2302.09465.pdf - 25 Jun 2023
https://arxiv.org/pdf/2111.09266.pdf - 10 Jul 2023
https://arxiv.org/pdf/2310.02710.pdf - 4 Oct 2023
https://arxiv.org/pdf/2310.03419.pdf - 5 Oct 2023 *
https://arxiv.org/pdf/2310.03301.pdf - 5 Oct 2023
https://arxiv.org/pdf/2310.08774.pdf - 12 Oct 2023
https://arxiv.org/pdf/2312.14331.pdf - 21 Dec 2023
https://arxiv.org/pdf/2312.15246.pdf - 23 Dec 2023
https://arxiv.org/pdf/2310.12934.pdf - 25 Feb 2024
https://arxiv.org/pdf/2310.00386.pdf - 25 Feb 2024
https://arxiv.org/pdf/2310.02679.pdf - 9 Mar 2024 *
https://arxiv.org/pdf/2310.04363.pdf - 13 Mar 2024
(and many MANY more not linked above)
Bonus:
https://quilted-toque-702.notion.site/The-GFlowNets-and-Amortized-Marginalization-Tutorial-af30e5a00e9e46c8b324b6a461e1e908
https://colab.research.google.com/drive/1qqXbKygQmlLSlAREkY4TRq2jgYluzI_F?usp=sharing
Generative Flow Networks: Diverse Candidate Generation and Amortized Inference
This report presents a comprehensive overview of Generative Flow Networks (GFlowNets), a novel
deep learning framework for generating diverse and high-quality candidate solutions in various
domains. Drawing upon insights from reinforcement learning and probabilistic modeling, GFlowNets
offer a powerful alternative to traditional methods like MCMC and reward-maximizing RL.
What are GFlowNets?
GFlowNets are generative models specifically designed for compositional discrete objects like
graphs, sets, and sequences. They learn a stochastic policy that iteratively constructs the desired
object through a sequence of simpler actions. This policy is trained to ensure that the probability of
generating an object is proportional to a given positive reward function, enabling the generation
of diverse and high-reward candidates.
The key idea behind GFlowNets is to view the generative process as a flow network. Each state
in the network represents a partially constructed object, and each edge represents an action that
modifies the state. The reward function defines the desired distribution over the final objects, and
the GFlowNet learns to adjust the "flow" of probability through the network to match this distribution.
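To make the flow-network picture concrete, here is a minimal, self-contained sketch (not taken from any of the cited papers) on a toy space of bit-strings. It computes the exact flows by backward induction and reads off the forward policy that routes probability in proportion to child flows, so that each terminal string is sampled with probability proportional to its reward. The state space and reward function are invented for illustration.

```python
import itertools
import random

# Toy example: states are bit-strings of length <= 3; from each partial
# string we can append a "0" or a "1"; strings of length 3 are terminal.
# Illustrative reward for a terminal string x: R(x) = 1 + (number of ones).
L = 3

def reward(x):
    return 1 + x.count("1")

# Backward induction computes the exact flow F(s): for a terminal state,
# F(s) = R(s); otherwise F(s) is the total flow through its children.
flow = {}
for length in range(L, -1, -1):
    for bits in itertools.product("01", repeat=length):
        s = "".join(bits)
        flow[s] = reward(s) if length == L else flow[s + "0"] + flow[s + "1"]

# The forward policy splits probability in proportion to child flows,
# so sampling from the root yields P(x) = R(x) / sum_x' R(x').
def sample(rng):
    s = ""
    while len(s) < L:
        p0 = flow[s + "0"] / flow[s]
        s += "0" if rng.random() < p0 else "1"
    return s

total = sum(reward("".join(b)) for b in itertools.product("01", repeat=L))
print(flow[""] == total)       # root flow equals the partition function
print(flow["111"] / flow[""])  # P("111") = 4 / 20 = 0.2
print(sample(random.Random(0)))
```

In a real GFlowNet the flows are too numerous to tabulate and are instead approximated by a trained neural network, but the invariant is the same: the flow out of each state, divided among its children, defines the sampling policy.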
Advantages of GFlowNets
GFlowNets offer several advantages over existing methods:
- Diversity: Unlike standard RL methods that focus on finding the single best solution, GFlowNets
can sample a diverse set of high-reward solutions. This is crucial in applications like drug discovery,
where exploring a variety of promising candidates is essential.
- Efficiency: GFlowNets can be trained offline and off-policy, meaning they can learn from
data generated by a different policy. This allows for efficient exploration and avoids the need for
expensive online data collection.
- Amortization: GFlowNets amortize the cost of search during training, resulting in fast
generation at test time. This contrasts with MCMC methods, which require running potentially
long chains for each new sample.
- Scalability: GFlowNets can be applied to high-dimensional discrete spaces where MCMC
methods often struggle with poor mixing between modes. By learning the underlying structure of
the reward function, GFlowNets can efficiently jump between modes and explore the space more
effectively.
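The off-policy and amortization points above can be illustrated with a tabular training loop using the trajectory-balance objective, one common GFlowNet training criterion. Everything here (the toy reward, the sigmoid parameterization, the uniform behavior policy) is an illustrative assumption, not code from the cited papers; note that the trajectories come from a fixed random policy, not from the policy being trained.

```python
import math
import random

random.seed(0)
L = 2

def reward(x):  # R("00")=1, R("01")=2, R("10")=3, R("11")=4, so Z = 10
    return int(x, 2) + 1

# Tabular parameters: one logit per non-terminal state for the forward
# policy P_F("append 1" | s) = sigmoid(theta[s]), plus a scalar log Z.
theta = {"": 0.0, "0": 0.0, "1": 0.0}
log_z = 0.0
sigmoid = lambda t: 1.0 / (1.0 + math.exp(-t))

lr = 0.05
for _ in range(20000):
    # Off-policy: trajectories are drawn from a uniform behavior policy.
    x = "".join(random.choice("01") for _ in range(L))
    # delta = log Z + sum_t log P_F(a_t | s_t) - log R(x); the backward
    # policy P_B is trivially 1 here because the state space is a tree.
    delta = log_z - math.log(reward(x))
    grads = []
    for t in range(L):
        s, a = x[:t], x[t]
        p1 = sigmoid(theta[s])
        delta += math.log(p1 if a == "1" else 1.0 - p1)
        grads.append((s, (1.0 - p1) if a == "1" else -p1))
    # Gradient descent on the squared trajectory-balance error delta^2.
    log_z -= lr * 2 * delta
    for s, g in grads:
        theta[s] -= lr * 2 * delta * g

# Sampling is now amortized: generating "11" costs two cheap policy steps,
# and its probability approaches R("11") / Z = 0.4.
p11 = sigmoid(theta[""]) * sigmoid(theta["1"])
print(math.exp(log_z), p11)
```

The learned scalar `exp(log_z)` approaches the partition function (10 for this toy reward), which is exactly the amortized-inference reading of GFlowNets: training pays the search cost once, and each new sample afterwards is a short rollout of the policy.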
Theoretical Foundations and Extensions
The theoretical foundation of GFlowNets is based on the concept of **flow networks and Markovian
flows**. The flow-matching conditions ensure that the learned policy samples objects with the desired
probability distribution. Several extensions have been proposed to broaden the applicability of
GFlowNets:
- Non-acyclic state spaces: Recent work has extended GFlowNets to handle non-acyclic state
spaces, allowing for applications where cycles are inherent to the problem or where the state space
is continuous.
- Stochastic rewards and environments: GFlowNets can be adapted to handle stochastic
rewards and environments, making them more suitable for real-world applications where perfect
controllability is not guaranteed.
- Intermediate rewards and returns: By incorporating intermediate rewards, GFlowNets can
learn to sample proportionally to the accumulated reward along the trajectory, similar to the
concept of return in RL.
- Distributional GFlowNets: This extension allows GFlowNets to capture not just the expected
value of achievable rewards but also other statistics of the reward distribution, providing a more
complete picture of the problem.
- Unsupervised GFlowNets and Pareto optimization: GFlowNets can be trained in an
unsupervised manner without a predefined reward function, enabling them to learn to sample from
the Pareto frontier defined by a set of objectives. This is particularly useful in multi-objective
optimization problems.
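For reference, the flow-matching condition mentioned above, and the trajectory-balance condition used in much of the follow-up work, can be written as follows (notation loosely follows the GFlowNet foundations papers; $F$ denotes edge flows, $P_F$ and $P_B$ the forward and backward policies, $Z$ the partition function):

```latex
% Flow matching: at every non-terminal state s', inflow equals outflow,
% and the flow into the terminal sink from a complete object x equals R(x).
\sum_{s : (s \to s') \in \mathcal{A}} F(s \to s')
  \;=\; \sum_{s'' : (s' \to s'') \in \mathcal{A}} F(s' \to s''),
\qquad F(x \to \bot) = R(x).

% Trajectory balance: for any complete trajectory
% \tau = (s_0 \to s_1 \to \dots \to s_n = x),
Z \prod_{t=1}^{n} P_F(s_t \mid s_{t-1})
  \;=\; R(x) \prod_{t=1}^{n} P_B(s_{t-1} \mid s_t).
```

When either condition holds everywhere, the induced distribution over terminal objects is $P(x) = R(x)/Z$, which is the property the extensions listed above generalize to non-acyclic, stochastic, and multi-objective settings.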
Applications of GFlowNets
GFlowNets have been successfully applied to various tasks, including:
- Scientific discovery: GFlowNets have been used to design new molecules, discover promising
biological sequences, and solve combinatorial optimization problems.
- Active learning: GFlowNets can be used to actively select informative data points for training
other models, leading to improved data efficiency.
- Bayesian structure learning: GFlowNets can be used to learn the structure of Bayesian
networks, providing a powerful tool for causal discovery.
- Discrete image modeling: GFlowNets have been used to model distributions over high-
dimensional discrete data like images, achieving state-of-the-art performance.
- Natural language reasoning: GFlowNets can be used to learn to sample latent reasoning
chains, enabling LLMs to perform chain-of-thought reasoning and solve complex problems.
- Tool use: GFlowNets can be used to train LLMs to use external tools, such as calculators, to
solve problems that require multi-step reasoning and planning.
Conclusion
GFlowNets are a powerful and versatile framework for diverse candidate generation and amortized
inference. Their ability to learn from unnormalized reward functions, handle complex state spaces,
and generate diverse samples makes them a valuable tool for various machine learning applications.
As research on GFlowNets continues, we can expect to see even more powerful and efficient
algorithms emerge, pushing the boundaries of what is possible in probabilistic modeling and
reasoning.