Learning Posted Prices in Bilateral Trade: Regret Guarantees Under Full and Bandit Feedback

Bruni, Luca

Files

Bruni_Luca_2026_thesis.pdf (693.5 KB)

Date

2026-05-08

Authors

Bruni, Luca

Publisher

Université d'Ottawa | University of Ottawa

Creative Commons License

Attribution-NonCommercial-NoDerivatives 4.0 International

Abstract

In this thesis we study an economically motivated sequential decision problem in which a learner repeatedly chooses an action (e.g., a posted price) and observes structured feedback. We ask how the information revealed after each decision determines whether learning is possible and which regret rates are achievable. We cast the problem in the online-learning framework and analyze two feedback models. Under full feedback, the learner can effectively evaluate alternative actions; we give an efficient algorithm with sublinear regret and prove matching lower bounds, yielding sharp minimax rates. Under bandit feedback, we show that sublinear regret is impossible without additional regularity. We then identify natural smoothness conditions on the instance under which bandit learning becomes feasible again, and we derive regret guarantees under those conditions. Overall, our results cleanly separate learnable from non-learnable regimes and quantify how mild structure can bridge the gap between full-feedback and bandit learning.

Keywords

Learning, Online, Price, Machine, Bound, Rate

URI

http://hdl.handle.net/10393/51619
https://doi.org/10.20381/ruor-31922

Collections

- Thèses, 2011 - // Theses, 2011 -
