Caro5: Build Small Hackathon: First Days
Hugging Face - An Adventure in Thousand Token Wood
Last week I started to develop an online version of a game I learned in Vietnam. I had never played anything beyond tic-tac-toe's three in a line, but I had so much fun that I wanted to play again. Then I got the final notification that the hackathon registration was about to close, so I jumped at the opportunity.
My little game was basic x and o. It could be played player vs. player, and it detected a win.
I wanted to dream big. What could AI bring into this? A playful adversary. A smart adversary! Maybe even having multiple characters, taunts. It could be a teacher too. But I never did any machine learning, so the roadmap was set for the first weekend:
- Polish the game ui, prepare the characters, etc
- Learn about game mechanics and strategies
- Learn about machine learning and stuff
Second weekend:
- Build the gradio app
Sounds simple? I asked Codex to set up a roadmap. Let's start with the basics I learned:
Minimax: A decision-making algorithm used in two-player turn-based games (like chess or tic-tac-toe). It assumes both players play optimally: one tries to maximize their score, the other tries to minimize it. The algorithm explores all possible moves several steps ahead to choose the move that guarantees the best worst-case outcome.
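To make this concrete for myself, here is a minimal sketch of Minimax on plain 3x3 tic-tac-toe (not Caro, whose board is far too big to search to the end). The board layout and the names `winner` and `minimax` are my own choices for illustration, not code from the game.

```python
from typing import List, Optional, Tuple

Board = List[str]  # 9 cells in row-major order, each "X", "O", or " "

WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player has three in a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def minimax(board: Board, player: str) -> Tuple[int, Optional[int]]:
    """Return (score, best move) with the score seen from X's side.

    +1 means X wins, -1 means O wins, 0 means a draw.
    X is the maximizing player and O is the minimizing player.
    """
    won = winner(board)
    if won is not None:
        return (1 if won == "X" else -1), None
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0, None  # board is full: draw

    best_score = -2 if player == "X" else 2  # worse than any real score
    best_move = None
    for move in moves:
        board[move] = player  # try the move
        score, _ = minimax(board, "O" if player == "X" else "X")
        board[move] = " "     # undo it
        better = score > best_score if player == "X" else score < best_score
        if better:
            best_score, best_move = score, move
    return best_score, best_move


if __name__ == "__main__":
    # X has two in the top row and should complete it at index 2.
    position = ["X", "X", " ",
                "O", "O", " ",
                " ", " ", " "]
    print(minimax(position, "X"))
```

This version searches all the way to the end of the game, which only works because tic-tac-toe is tiny. On a Caro board the search has to stop after a few moves and estimate the position instead.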
AlphaBeta (alpha-beta pruning): An optimization technique for Minimax. It "prunes" (cuts off) branches of the move tree that cannot possibly lead to a better result than one already found, without checking them fully. This dramatically speeds up the search, allowing the AI to look much deeper in the same amount of time.
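Here is a small sketch of how the pruning could look, again on 3x3 tic-tac-toe to keep it short. The `counter` argument is a hypothetical extra I added only to count how many positions get visited; `alpha` is the best score X can already guarantee and `beta` is the best score O can already guarantee.

```python
from typing import List, Optional

Board = List[str]  # 9 cells in row-major order, each "X", "O", or " "

WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
]


def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player has three in a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def alphabeta(board: Board, player: str, alpha: int = -2, beta: int = 2,
              counter: Optional[List[int]] = None) -> int:
    """Minimax score from X's side (+1, 0, -1), skipping hopeless branches."""
    if counter is not None:
        counter[0] += 1  # one more position visited
    won = winner(board)
    if won is not None:
        return 1 if won == "X" else -1
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0

    for move in moves:
        board[move] = player
        score = alphabeta(board, "O" if player == "X" else "X",
                          alpha, beta, counter)
        board[move] = " "
        if player == "X":
            alpha = max(alpha, score)  # X found something at least this good
        else:
            beta = min(beta, score)    # O found something at least this good
        if alpha >= beta:
            break  # the other side would never allow this line: prune
    return alpha if player == "X" else beta


if __name__ == "__main__":
    visited = [0]
    value = alphabeta([" "] * 9, "X", counter=visited)
    print("value:", value, "positions visited:", visited[0])
```

The result is the same score plain Minimax would return from the starting position; the only difference is how many positions are looked at along the way.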
AI Models in Caro5
Using a language model alone wouldn't be a good fit, because it can only predict text. But if I wanted a really good opponent, then:
- Self-play thousands of games.
- Train a neural network to evaluate positions.
- Use MCTS + neural network.
- This is how systems like DeepMind's AlphaGo work.
Oh boy! Lots of new words! I still remember watching AlphaGo win against Lee Sedol ten years ago, so at least I know that.
Neural network: A computational system inspired by the human brain, made of layers of interconnected "neurons" (simple math functions). It learns patterns from data by adjusting internal weights. In games, it can evaluate board positions or suggest promising moves without brute-force searching everything.
MCTS (Monte Carlo Tree Search): We'll see this term a lot. A search algorithm that builds a game tree gradually. It repeatedly plays out random simulations (or "rollouts") from a position, tracks which moves lead to wins, and focuses future simulations on the most promising branches. It balances exploring new moves versus exploiting known good ones, working well even with enormous search spaces like Go.
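The four steps (select, expand, simulate, backpropagate) can be sketched roughly like this. It is a toy version on 3x3 tic-tac-toe using the common UCT selection formula; the class and function names, the exploration constant 1.4, and the iteration count are all my own assumptions, and it expects a position where the game is not already over.

```python
import math
import random
from typing import List, Optional

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board: List[str]) -> Optional[str]:
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def legal_moves(board: List[str]) -> List[int]:
    if winner(board) is not None:
        return []  # game already decided
    return [i for i, cell in enumerate(board) if cell == " "]


def other(player: str) -> str:
    return "O" if player == "X" else "X"


class Node:
    def __init__(self, board, to_move, parent=None, move=None):
        self.board = board
        self.to_move = to_move  # player who moves next from here
        self.parent = parent
        self.move = move        # move that led to this node
        self.children = []
        self.untried = legal_moves(board)
        self.visits = 0
        self.wins = 0.0         # credit for the player who just moved


def uct_child(node: Node, c: float = 1.4) -> Node:
    # Average result (exploit) plus a bonus for rarely tried moves (explore).
    return max(node.children, key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))


def rollout(board: List[str], to_move: str, rng: random.Random) -> Optional[str]:
    """Play random moves to the end; return the winner or None for a draw."""
    board = board[:]
    while True:
        won = winner(board)
        if won is not None:
            return won
        moves = [i for i, cell in enumerate(board) if cell == " "]
        if not moves:
            return None
        board[rng.choice(moves)] = to_move
        to_move = other(to_move)


def mcts(board: List[str], to_move: str, iterations: int = 2000, seed: int = 0) -> int:
    rng = random.Random(seed)
    root = Node(board[:], to_move)
    for _ in range(iterations):
        node = root
        # 1. Selection: walk down while every move here has been tried.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one new child for an untried move.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child_board = node.board[:]
            child_board[move] = node.to_move
            child = Node(child_board, other(node.to_move), node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new position.
        result = rollout(node.board, node.to_move, rng)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            if result is None:
                node.wins += 0.5
            elif result == other(node.to_move):
                node.wins += 1.0
            node = node.parent
    # The most visited move is the one the search trusted most.
    return max(root.children, key=lambda ch: ch.visits).move


if __name__ == "__main__":
    position = ["X", "X", " ", "O", "O", " ", " ", " ", " "]
    print("MCTS picks cell", mcts(position, "X"))
```

AlphaGo-style systems replace the random rollout with a neural network's judgment, but the tree-building loop keeps this same shape.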
AlphaGo: The first computer program to defeat a world champion at the complex board game Go. It combined Monte Carlo Tree Search, deep neural networks (trained on human games and later self-play), and reinforcement learning to handle Go's huge branching factor, where traditional brute-force methods failed.
Training data
The problem is, I don't have months. We have two weeks!
Self play: A learning method where an AI agent plays games against itself rather than against humans or another fixed opponent. This generates unlimited training data, and the agent can progressively improve because it always faces a challenger of roughly its own skill level (itself).
Self-Play Reinforcement Learning: A specific type of reinforcement learning where the agent continuously improves by playing against copies of itself. Each version learns from the games, updates its strategy, then plays against updated versions, creating an endless cycle of improvement without needing human data. It's how AlphaGo Zero surpassed the original AlphaGo.
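The data-generation half of self-play can be sketched in a few lines. This toy version lets one policy play both sides of 3x3 tic-tac-toe and labels every position with the final result; the `random_policy` placeholder and the sample format are my own assumptions. A real self-play loop would also retrain the policy on these samples and then play again, which this sketch does not do.

```python
import random
from typing import Callable, List, Optional, Tuple

Board = List[str]
# (position, player to move, final result for that player: +1, 0, or -1)
Sample = Tuple[Tuple[str, ...], str, float]
Policy = Callable[[Board, str, random.Random], int]

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board: Board) -> Optional[str]:
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def random_policy(board: Board, player: str, rng: random.Random) -> int:
    """Placeholder policy: pick any empty cell. A trained model would go here."""
    return rng.choice([i for i, cell in enumerate(board) if cell == " "])


def self_play_game(policy: Policy, rng: random.Random) -> List[Sample]:
    """Let the same policy play both sides and record every position seen."""
    board = [" "] * 9
    player = "X"
    history = []
    while True:
        won = winner(board)
        if won is not None or " " not in board:
            break
        history.append((tuple(board), player))
        board[policy(board, player, rng)] = player
        player = "O" if player == "X" else "X"

    # Label each recorded position with how the game ended for its mover.
    samples = []
    for position, mover in history:
        if won is None:
            result = 0.0
        else:
            result = 1.0 if won == mover else -1.0
        samples.append((position, mover, result))
    return samples


def generate_dataset(num_games: int, seed: int = 0) -> List[Sample]:
    rng = random.Random(seed)
    dataset = []
    for _ in range(num_games):
        dataset.extend(self_play_game(random_policy, rng))
    return dataset


if __name__ == "__main__":
    data = generate_dataset(1000)
    print(len(data), "labeled positions from 1000 self-play games")
```

Each sample pairs a position with the eventual outcome, which is the kind of target a position evaluator can be trained on.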
Neural evaluator: A neural network trained to evaluate how good a given board position is for the current player (often outputting a win probability or score). Instead of searching all the way to the end of the game, the AI calls this evaluator at the bottom of its search tree to estimate how promising a position is, speeding up decision-making enormously.
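To see where the evaluator plugs in, here is a rough sketch of a depth-limited search that asks a tiny network for a score when it runs out of depth. The `TinyEvaluator` class is hypothetical, uses random untrained weights, and is written in plain Python; it only shows the wiring on 3x3 tic-tac-toe, not a useful evaluator.

```python
import math
import random
from typing import List, Optional

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board: List[str]) -> Optional[str]:
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


class TinyEvaluator:
    """One hidden layer with random weights (untrained, illustration only)."""

    def __init__(self, hidden: int = 16, seed: int = 0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(9)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.b2 = 0.0

    def evaluate(self, board: List[str], player: str) -> float:
        """Score in [-1, 1] for the player to move; higher means better."""
        # Encode the board from the mover's side: own stone +1, opponent -1.
        x = [0.0 if c == " " else (1.0 if c == player else -1.0) for c in board]
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        return math.tanh(sum(w * v for w, v in zip(self.w2, h)) + self.b2)


def negamax(board: List[str], player: str, depth: int, evaluator: TinyEvaluator) -> float:
    """Search `depth` moves ahead, then trust the evaluator's estimate."""
    won = winner(board)
    if won is not None:
        return 1.0 if won == player else -1.0
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0.0
    if depth == 0:
        return evaluator.evaluate(board, player)  # bottom of the search tree

    opponent = "O" if player == "X" else "X"
    best = -math.inf
    for move in moves:
        board[move] = player
        # The opponent's score is the negative of ours.
        best = max(best, -negamax(board, opponent, depth - 1, evaluator))
        board[move] = " "
    return best


if __name__ == "__main__":
    evaluator = TinyEvaluator()
    position = ["X", "X", " ", "O", "O", " ", " ", " ", " "]
    print(negamax(position, "X", depth=1, evaluator=evaluator))
```

With trained weights, a shallow search like this can play far better than its depth suggests, because the network supplies the judgment the search did not have time to compute.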
Researching
1. Locality assumption (critical)
Locality assumption: The belief that you don't need to look at the whole game history to make a good decision. Instead, the current board position contains all the relevant information for choosing the best move. It's what allows game AIs to search forward from the present without replaying everything that happened before.
2. Pattern-based engines
Pattern-based engines, very common in older + strong open-source AIs. by Szőts, J., Harmati, I.
3. MCTS / AlphaZero-style approaches
Mastering Gomoku with AlphaZero: A Study in Advanced AI Game Strategy by Liang, W., Yu, C., Whiteaker, B., Huh, I., Shao, H., & Liang, Y.
AlphaGomoku: An AlphaGo-based Gomoku Artificial Intelligence using Curriculum Learning by Zheng Xie, XingYu Fu, JinYuan Yu
4. CNN / reinforcement learning approaches
Enhanced Reinforcement Learning Method Combining One-Hot Encoding-Based Vectors for CNN-Based Alternative High-Level Decisions by Bonwoo Gu and Yunsick Sung
Advanced Machine Learning Techniques in Gomoku: Strategy, Implementation, and Analysis by Jiajun Han
5. Strong modern open-source direction (important for you)
Rapfi: Distilling Efficient Neural Network for the Game of Gomoku by Zhanggen Jin, Haobin Duan, Zhiyang Hang
Ding ding ding! This paper was published by last year's winners and it's a gold mine. If only I could understand it!
To be honest, I had already found the tournament while researching, but the engines were way over my head. But wasn't I lucky: this year they introduced a category just for Caro! And the cup was this weekend!
I can't wait to see what the winners came up with.




