The future of AI


This will be a very short overview with plenty of cited sources, to enlighten you on the foundations of the next-gen AI being trained by the major companies right now. At the very least OpenAI, which is the recurring name across these papers (research from one or more of its scientists).


Papers:
https://arxiv.org/pdf/2106.04399.pdf - 19 Nov 2021
https://arxiv.org/pdf/2202.01361.pdf - 8 Jun 2022
https://arxiv.org/pdf/2302.06576.pdf - 3 Jun 2023
https://arxiv.org/pdf/2302.09465.pdf - 25 Jun 2023
https://arxiv.org/pdf/2111.09266.pdf - 10 Jul 2023
https://arxiv.org/pdf/2310.02710.pdf - 4 Oct 2023
https://arxiv.org/pdf/2310.03419.pdf - 5 Oct 2023 *
https://arxiv.org/pdf/2310.03301.pdf - 5 Oct 2023
https://arxiv.org/pdf/2310.08774.pdf - 12 Oct 2023
https://arxiv.org/pdf/2312.14331.pdf - 21 Dec 2023
https://arxiv.org/pdf/2312.15246.pdf - 23 Dec 2023
https://arxiv.org/pdf/2310.12934.pdf - 25 Feb 2024
https://arxiv.org/pdf/2310.00386.pdf - 25 Feb 2024
https://arxiv.org/pdf/2310.02679.pdf - 9 Mar 2024 *
https://arxiv.org/pdf/2310.04363.pdf - 13 Mar 2024
(and many MANY more not linked above)


Bonus:
https://quilted-toque-702.notion.site/The-GFlowNets-and-Amortized-Marginalization-Tutorial-af30e5a00e9e46c8b324b6a461e1e908
https://colab.research.google.com/drive/1qqXbKygQmlLSlAREkY4TRq2jgYluzI_F?usp=sharing


Generative Flow Networks: Diverse Candidate Generation and Amortized Inference


This report presents a comprehensive overview of Generative Flow Networks (GFlowNets), a novel
deep learning framework for generating diverse and high-quality candidate solutions in various
domains. Drawing upon insights from reinforcement learning and probabilistic modeling, GFlowNets
offer a powerful alternative to traditional methods like MCMC and reward-maximizing RL.


What are GFlowNets?


GFlowNets are generative models specifically designed for compositional discrete objects like
graphs, sets, and sequences. They learn a stochastic policy that iteratively constructs the desired
object through a sequence of simpler actions. This policy is trained to ensure that the probability of
generating an object is proportional to a given positive reward function, enabling the generation
of diverse and high-reward candidates.
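
In symbols (using the notation common in the papers linked above, where $P_F$ is the forward policy and $\tau$ a complete construction trajectory), the training target can be sketched as:

```latex
% Target distribution over terminal objects x:
P(x) = \frac{R(x)}{Z}, \qquad Z = \sum_{x'} R(x')
% The sampler factorizes over construction steps; the probability of
% producing x sums over all trajectories \tau that terminate in x:
P(x) = \sum_{\tau \rightsquigarrow x} \prod_{t} P_F(s_{t+1} \mid s_t)
```

Note that only an unnormalized reward $R(x)$ is needed; the normalizing constant $Z$ is learned (or implied) rather than computed by brute force.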


The key idea behind GFlowNets is to view the generative process as a flow network. Each state
in the network represents a partially constructed object, and each edge represents an action that
modifies the state. The reward function defines the desired distribution over the final objects, and
the GFlowNet learns to adjust the "flow" of probability through the network to match this distribution.
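
To make the flow picture concrete, here is a minimal, self-contained sketch on a hypothetical toy environment (not taken from the papers above): binary strings of length 3 are built one bit at a time, terminal states carry flow equal to their reward, and each interior state's flow is the sum of its children's flows. The resulting policy then samples each string with probability exactly proportional to its reward.

```python
from itertools import product

L = 3  # build binary strings of length 3, one bit per action

def reward(x):
    # hypothetical toy reward: strings with more ones score higher
    return x.count("1") + 1

def flow(s):
    # F(s): terminal states carry their reward; interior states carry
    # the sum of their children's flows (the flow-matching condition
    # specialized to a tree-shaped state space)
    if len(s) == L:
        return reward(s)
    return flow(s + "0") + flow(s + "1")

def policy(s):
    # forward policy: P(s -> s') = F(s') / F(s)
    f0, f1 = flow(s + "0"), flow(s + "1")
    return {"0": f0 / (f0 + f1), "1": f1 / (f0 + f1)}

def p_generate(x):
    # probability of constructing x = product of edge probabilities
    # along its (unique, since this state space is a tree) trajectory
    p, s = 1.0, ""
    for bit in x:
        p *= policy(s)[bit]
        s += bit
    return p

Z = sum(reward("".join(b)) for b in product("01", repeat=L))
for x in ["000", "101", "111"]:
    print(x, p_generate(x), reward(x) / Z)  # columns agree up to float rounding
```

In real applications the state space is far too large to enumerate like this, so the flows (or the policy directly) are approximated by a neural network trained with objectives such as flow matching or trajectory balance.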


Advantages of GFlowNets


GFlowNets offer several advantages over existing methods:


  • Diversity: Unlike standard RL methods that focus on finding the single best solution, GFlowNets
    can sample a diverse set of high-reward solutions. This is crucial in applications like drug discovery,
    where exploring a variety of promising candidates is essential.


  • Efficiency: GFlowNets can be trained offline and off-policy, meaning they can learn from
    data generated by a different policy. This allows for efficient exploration and avoids the need for
    expensive online data collection.


  • Amortization: GFlowNets amortize the cost of search during training, resulting in fast
    generation at test time. This contrasts with MCMC methods, which require running potentially
    long chains for each new sample.


  • Scalability: GFlowNets can be applied to high-dimensional discrete spaces where MCMC
    methods often struggle due to the mode-mixing problem. By learning the underlying structure of
    the reward function, GFlowNets can efficiently jump between modes and explore the space more
    effectively.
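
The off-policy and amortization points take a concrete form in the trajectory-balance objective used in several of the linked papers. Below is a minimal sketch (the function and all numbers are illustrative, not a definitive implementation): the loss is defined for any complete trajectory, so the trajectories fed into it may come from a replay buffer or an exploratory sampler rather than the current policy.

```python
import math

def tb_loss(log_Z, log_pf_steps, log_pb_steps, log_reward):
    # squared trajectory-balance residual for one trajectory tau -> x:
    # (log Z + log P_F(tau)) should equal (log R(x) + log P_B(tau | x))
    lhs = log_Z + sum(log_pf_steps)
    rhs = log_reward + sum(log_pb_steps)
    return (lhs - rhs) ** 2

# In a tree-shaped state space every state has a unique parent, so the
# backward policy P_B is identically 1 (log P_B = 0). These numbers are
# a hypothetical optimum: Z = 20, P_F(tau) = 4/20, R(x) = 4.
loss = tb_loss(log_Z=math.log(20.0),
               log_pf_steps=[math.log(12 / 20), math.log(7 / 12), math.log(4 / 7)],
               log_pb_steps=[0.0, 0.0, 0.0],
               log_reward=math.log(4.0))
print(loss)  # ~0.0: the balance condition holds at the optimum
```

During training, the gradient of this residual is taken with respect to log Z and the policy parameters; because nothing in the loss requires the trajectory to be sampled from P_F itself, learning can proceed off-policy from logged or exploratory trajectories.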


Theoretical Foundations and Extensions


The theoretical foundation of GFlowNets is based on the concept of flow networks and Markovian
flows. The flow-matching conditions ensure that the learned policy samples objects with the desired
probability distribution. Several extensions have been proposed to broaden the applicability of
GFlowNets:


  • Non-acyclic state spaces: Recent work has extended GFlowNets to handle non-acyclic state
    spaces, allowing for applications where cycles are inherent to the problem or where the state space
    is continuous.


  • Stochastic rewards and environments: GFlowNets can be adapted to handle stochastic
    rewards and environments, making them more suitable for real-world applications where perfect
    controllability is not guaranteed.


  • Intermediate rewards and returns: By incorporating intermediate rewards, GFlowNets can
    learn to sample proportionally to the accumulated reward along the trajectory, similar to the
    concept of return in RL.


  • Distributional GFlowNets: This extension allows GFlowNets to capture not just the expected
    value of achievable rewards but also other statistics of the reward distribution, providing a more
    complete picture of the problem.


  • Unsupervised GFlowNets and Pareto optimization: GFlowNets can be trained in an
    unsupervised manner without a predefined reward function, enabling them to learn to sample from
    the Pareto frontier defined by a set of objectives. This is particularly useful in multi-objective
    optimization problems.


Applications of GFlowNets


GFlowNets have been successfully applied to various tasks, including:


  • Scientific discovery: GFlowNets have been used to design new molecules, discover promising
    biological sequences, and solve combinatorial optimization problems.


  • Active learning: GFlowNets can be used to actively select informative data points for training
    other models, leading to improved data efficiency.


  • Bayesian structure learning: GFlowNets can be used to learn the structure of Bayesian
    networks, providing a powerful tool for causal discovery.


  • Discrete image modeling: GFlowNets have been used to model distributions over
    high-dimensional discrete data like images, achieving state-of-the-art performance.


  • Natural language reasoning: GFlowNets can be used to learn to sample latent reasoning
    chains, enabling LLMs to perform chain-of-thought reasoning and solve complex problems.


  • Tool use: GFlowNets can be used to train LLMs to use external tools, such as calculators, to
    solve problems that require multi-step reasoning and planning.


Conclusion


GFlowNets are a powerful and versatile framework for diverse candidate generation and amortized
inference. Their ability to learn from unnormalized reward functions, handle complex state spaces,
and generate diverse samples makes them a valuable tool for various machine learning applications.
As research on GFlowNets continues, we can expect to see even more powerful and efficient
algorithms emerge, pushing the boundaries of what is possible in probabilistic modeling and
reasoning.

