<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[AlexCoderBlog]]></title><description><![CDATA[senior software engineer @fellow.app]]></description><link>https://alexcoder.dev</link><generator>RSS for Node</generator><lastBuildDate>Thu, 16 Apr 2026 08:37:47 GMT</lastBuildDate><atom:link href="https://alexcoder.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[I built my first SaaS project — AnimFlow]]></title><description><![CDATA[For the past few months, I’ve been working on a small side project that I finally feel comfortable sharing.
It’s called AnimFlow - https://animflow.io/
At a high level, AnimFlow is a simple idea:Take ]]></description><link>https://alexcoder.dev/animflow-launch</link><guid isPermaLink="true">https://alexcoder.dev/animflow-launch</guid><category><![CDATA[AI]]></category><category><![CDATA[AI Video Generator]]></category><category><![CDATA[ai animation]]></category><category><![CDATA[SaaS]]></category><category><![CDATA[startup]]></category><dc:creator><![CDATA[Alexander Polev]]></dc:creator><pubDate>Fri, 03 Apr 2026 19:04:01 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/62c23791ccecfb25bbc76cea/eb47ba9f-970b-4c99-837b-b3d6eba75cc5.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For the past few months, I’ve been working on a small side project that I finally feel comfortable sharing.</p>
<p>It’s called <strong>AnimFlow</strong> - <a href="https://animflow.io/">https://animflow.io/</a></p>
<p>At a high level, AnimFlow is a simple idea:<br />Take a single image - a mascot, a character, a sticker - and turn it into a short AI-generated animation.</p>
<p>Not a cinematic video.<br />Not a full production pipeline.<br />Just a clean, short, expressive motion that feels usable.</p>
<h2>What it actually does</h2>
<p>The current version is very focused.</p>
<ul>
<li><p>You upload an image</p>
</li>
<li><p>You describe a small action (e.g. <em>“wave hello”</em>, <em>“idle breathing”</em>)</p>
</li>
<li><p>You get a short animation back</p>
</li>
</ul>
<p>The output is intentionally:</p>
<ul>
<li><p>short</p>
</li>
<li><p>loopable</p>
</li>
<li><p>consistent with the original character</p>
</li>
</ul>
<p>It’s designed for <strong>animated characters</strong>, <strong>stickers, mascots, and lightweight content</strong> rather than full videos.</p>
<p>Additionally, I think I offer the best price on the market: the other popular services I tried charge much more.</p>
<p>Here is an animation of my mascot character, generated with the service:</p>
<img src="https://cdn.hashnode.com/uploads/covers/62c23791ccecfb25bbc76cea/e086412e-0d40-4357-8dea-b5ca0ff2dc4f.gif" alt="" style="display:block;margin:0 auto" />

<h2>How I’m running it</h2>
<p>From a technical perspective, it’s a fairly standard setup:</p>
<ul>
<li><p>Django backend + REST API</p>
</li>
<li><p>Celery-based async job system for animation generation</p>
</li>
<li><p>GPU workers running the actual model, hosted on Runpod.io. Currently I run an optimized Wan2.2, but I plan to try and integrate other models.</p>
</li>
<li><p>Next.js web app as the main client; I have also almost finished the iOS app</p>
</li>
</ul>
<p>The important part is that everything is <strong>job-based</strong>:<br />You submit an animation request, it runs asynchronously, and you get the result when it’s ready. In the iOS app, you even get push notifications.</p>
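As a sketch of that job-based pattern (the names below are hypothetical; the real system uses Celery tasks and GPU workers rather than this in-process stdlib queue):

```python
import queue
import threading
import uuid

# In-process stand-in for the real broker (Celery + GPU workers in AnimFlow).
jobs = queue.Queue()
results = {}

def submit_animation_job(image_id, prompt):
    """Enqueue a generation request and return a job id right away."""
    job_id = str(uuid.uuid4())
    results[job_id] = {"status": "pending", "output": None}
    jobs.put((job_id, image_id, prompt))
    return job_id

def worker():
    """Worker loop: pull jobs and run the model (stubbed out here)."""
    while True:
        job_id, image_id, prompt = jobs.get()
        if job_id is None:  # shutdown sentinel
            jobs.task_done()
            break
        # The real system would run the video model on a GPU here.
        results[job_id] = {"status": "done", "output": "animation for " + image_id}
        jobs.task_done()
```

The client then polls the job status (or, in the iOS app, receives a push notification) instead of blocking on the request.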
<p>I’m keeping the system simple and modular so I can iterate quickly.</p>
<h2>What’s next</h2>
<p>Right now, this is very much an MVP.</p>
<p>My focus going forward:</p>
<ul>
<li><p>improve motion quality and stability</p>
</li>
<li><p>try new models and experiments</p>
</li>
<li><p>make outputs more “sticker-ready” + try different export formats (GIFs, spritesheets, etc.)</p>
</li>
</ul>
<p>I’m also trying to keep this a <strong>small, sustainable project</strong> rather than overbuild it.</p>
<h2>Final thoughts</h2>
<p>This is my first time launching something that feels like an actual SaaS product —<br />even if it’s still early.</p>
<p>I’m not expecting it to explode or anything.</p>
<p>But it’s a good step:</p>
<ul>
<li><p>building something end-to-end</p>
</li>
<li><p>putting it in front of users</p>
</li>
<li><p>and seeing what actually sticks</p>
</li>
</ul>
<p>If you’re curious, I’ll be sharing updates as I iterate on it.</p>
]]></content:encoded></item><item><title><![CDATA[How AI can help you to be a solo game developer]]></title><description><![CDATA[Intro
I interestingly got into programming. It all kicked off with a Half-Life 2 modification back in the day. I wanted to make something cool for my favorite game. Then I realized that I'm really bad at level design or modelling. So I then attempted...]]></description><link>https://alexcoder.dev/ai-sorcery-duel</link><guid isPermaLink="true">https://alexcoder.dev/ai-sorcery-duel</guid><category><![CDATA[AI]]></category><category><![CDATA[Swift]]></category><category><![CDATA[iOS]]></category><category><![CDATA[Game Development]]></category><category><![CDATA[Games]]></category><category><![CDATA[Mobile Development]]></category><category><![CDATA[ #TechPhilosophy]]></category><dc:creator><![CDATA[Alexander Polev]]></dc:creator><pubDate>Tue, 17 Jun 2025 01:40:04 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1749987343949/a6d74262-0160-4718-9b7c-0a16e94eae12.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-intro">Intro</h1>
<p>I got into programming in an interesting way. It all kicked off with a <em>Half-Life 2</em> modification back in the day. I wanted to make something cool for my favorite game. Then I realized that I'm really bad at level design and modelling, so I attempted to dip into the game logic, scripting, and programming, and later began experimenting with the source code. Since then, I have built my career in web development and experimented with mobile apps, but game development has always been my interest. There’s nothing quite like building a world where players can dive in, make choices, and feel the magic of an interactive story.</p>
<p>That passion led me to <em>Sorcery Duel</em>, my latest project—a fast-paced, strategy-driven game where wizards clash in intense spell-slinging battles. Picture tactical duels packed with clever combos and a touch of fantasy flair. But here’s the wild part: I’m crafting it solo. No team, no big studio—just me.</p>
<p>Going solo sounds intense, and it is, but AI tools have made it possible. I’ve leaned hard on Claude Code, Grok, and ChatGPT to bring <em>Sorcery Duel</em> to life. These tools aren’t just sidekicks; they’re like having a virtual dev team, helping me code, brainstorm, design, and create visuals completely on my own. My roots in programming and game modding gave me the foundation to start, but AI is what enables me to pursue this dream. Even five years ago, when I was thinking about solo gamedev, it felt like I needed at least a designer and an artist who would share my passion, have free time, and be interested in doing it just as a hobby, which seemed almost impossible. But not now.</p>
<p>This article’s my chance to share the ride of building <em>Sorcery Duel</em>—the wins, the struggles, and how tools like Claude, Grok, and ChatGPT became my secret weapons. Whether you’re a coder, a gamer, or just curious about solo game dev, I hope this sparks something in you. Let’s jump into the spellcraft of making a game, one line of code at a time.</p>
<h1 id="heading-overview-of-sorcery-duel">Overview of Sorcery Duel</h1>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749779871561/8730c2a4-7c89-4aee-8f63-72925e04cab3.png" alt class="image--center mx-auto" /></p>
<p>Imagine two wizards squaring off, arcane sparks flying, each casting spells with cunning and flair. That’s <em>Sorcery Duel</em>. It’s a strategy-driven, head-to-head battler where every choice matters—think chess with a magical twist.</p>
<p>The setting is a vivid fantasy world—misty arenas, glowing runes, ancient towers humming with power. You play as a sorcerer wielding a spellbook packed with abilities from different magic schools. Each spell has unique costs, cooldowns, and effects, so strategy is everything. Want to freeze your opponent’s mana and follow with a fireball? Go for it. Prefer to summon a lava golem to smash through their defences while you prep a big finisher? That’s a vibe, too. The mechanics reward thoughtful planning, but keep things simple enough for anyone to dive in. It's still a mobile game, not an airplane control panel.</p>
<p>My landmarks for the tactical battle pieces were the combat systems of Heroes of Might and Magic 3/5 and World of Warcraft. It is going to be turn-based, but with a little spark: spells are resolved at the same time, so the result depends on what both wizards are casting. I decided to add this little real-time element to the classic turn-based approach. I am going to release the first version on iOS, with matches lasting 5–10 minutes, perfect for a quick session with a friend. I’m eyeing other platforms, such as PC or Android, down the road, but for now, it’s all about making those iOS duels shine.</p>
<p>But okay, let’s stop advertising my game. I know everyone here is interested in technologies and learning how to build stuff.</p>
<h1 id="heading-ai-tools-the-game-changer-for-solo-development-and-how-i-used-them">AI Tools: The Game-Changer for Solo Development and how I used them</h1>
<h2 id="heading-image-generation-with-chatgpt">Image generation with ChatGPT</h2>
<p>Visuals are the central part of a game; they are what make it feel like a whole fantasy world. I’ll admit, I don’t have illustration skills—my hands are better suited to a keyboard than a sketchpad. But what I do have is a sharp vision, formed from years of diving into games, soaking up fantasy universes like Warcraft or Skyrim, and getting a feel for how mechanics and aesthetics click.</p>
<p>That intuition guided me, even without artistic chops. Then came ChatGPT’s new image generation, and it flipped the script! Unlike the old days, when every AI-generated image was a fresh roll of the dice and tweaking meant starting over, this tool maintains consistency across generations. That lets me tweak specifics, such as replacing the staff with a glowing orb or swapping a button’s design with a single prompt. Sure, smart tools like “magic eraser” still help clean up edges, but this level of control, where I can refine without losing context, is a game-changer.</p>
<p>For example, here is how I modified this character's appearance:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749783026268/7a974f92-ee9d-4843-85ba-99124652da28.png" alt class="image--center mx-auto" /></p>
<p>Using the prompt: “Perfect! One small change - make her not wear a hood, the head should be uncovered”</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749783264636/09077da2-1768-4bbc-ae56-7d85d6ed8627.png" alt class="image--center mx-auto" /></p>
<p>It is also great at generating buttons, and very consistent at it:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749783606954/b4bb0a06-fb5c-4bbe-8e1a-c52e61107dbd.png" alt class="image--center mx-auto" /></p>
<p>All three were generated within the same chat.</p>
<p>Another thing is animations. Most animations in 2D games are just a few frames, so for simple effects I just asked ChatGPT to generate a grid of effects frame by frame, and it works quite well. Here is the flying and pulsing fireball:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749869580225/699e7059-a4c5-491e-96a7-5afa66fde558.png" alt class="image--center mx-auto" /></p>
<p>If you play each frame for 0.1 seconds, you get a nice animation.</p>
<p>The result is not always positioned perfectly, and sometimes it still requires manual adjustment. So for the flying boulder, where position is important, I made a simple grid in the Pinta editor to perfectly position and align every frame:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1749869485029/93bb1da4-d99a-4c9b-931f-bbbb76cb097b.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-claude-code">Claude code</h2>
<p>I don’t have much experience with Swift, so Claude Code helped me learn some language features in practice. As a starting point, Claude drafted a nice 10-step plan and updated it as each task was completed. I stepped away from that plan once I got something working, but this is the last version used:</p>
<pre><code class="lang-markdown"><span class="hljs-section"># SorceryDuel Implementation Progress</span>

<span class="hljs-section">## Completed Tasks</span>

<span class="hljs-section">### 1. Core Architecture &amp; Data Models</span>
<span class="hljs-bullet">-</span> Created enums (MagicSchool, SpellType, SpellDirection, GameState)
<span class="hljs-bullet">-</span> Implemented Wizard and Creature classes
<span class="hljs-bullet">-</span> Built Spell protocol with concrete implementations (Attack, Defense, Summon, Curse)
<span class="hljs-bullet">-</span> Created Effect system for ongoing spell effects
<span class="hljs-bullet">-</span> Implemented SpellFactory for centralized spell management
<span class="hljs-bullet">-</span> Designed TurnAction class for player actions
<span class="hljs-bullet">-</span> Added Shield Defense system with directional protection

<span class="hljs-section">### 2. Game Scene &amp; UI Layout</span>
<span class="hljs-bullet">-</span> Create 10x10 grid battlefield
<span class="hljs-bullet">-</span> Position wizards at opposite corners
<span class="hljs-bullet">-</span> Implement UI panels for spell selection
<span class="hljs-bullet">-</span> Add status display (health, mana)

<span class="hljs-section">### 3. Player Wizards &amp; Positioning</span>
<span class="hljs-bullet">-</span> Initialize wizards with appropriate textures
<span class="hljs-bullet">-</span> Set up grid-based positioning system
<span class="hljs-bullet">-</span> Implement visual indicators for active wizard

<span class="hljs-section">### 4. Bug Fixes &amp; Implementation Updates (May 15, 2025)</span>
<span class="hljs-bullet">-</span> Fixed missing direction parameter in takeDamage method calls
<span class="hljs-bullet">-</span> Updated DefenseEffect initialization with proper parameters
<span class="hljs-bullet">-</span> Implemented addEffect method in Wizard class
<span class="hljs-bullet">-</span> Added useMana method to Wizard class
<span class="hljs-bullet">-</span> Made DefenseEffect properly implement the Effect protocol
<span class="hljs-bullet">-</span> Improved spell effect handling in Wizard class

<span class="hljs-section">### 5. Turn-Based Gameplay Flow (Completed May 15, 2025)</span>
<span class="hljs-bullet">-</span> Built state machine for game phases
<span class="hljs-bullet">-</span> Implemented selection sequence: school → type → spell → target → direction (if target is other wizard)
<span class="hljs-bullet">-</span> Created turn resolution logic with visual effects
<span class="hljs-bullet">-</span> Added support for direct, left, and right spell directions
<span class="hljs-bullet">-</span> Implemented directional arrow sprites for spell targeting with visual feedback
<span class="hljs-bullet">-</span> Created dynamic projectile paths for different directions (direct, left, right)
<span class="hljs-bullet">-</span> Added smooth animation transitions between selection phases

<span class="hljs-section">### 6. Spell Direction &amp; Resolution System</span>
<span class="hljs-bullet">-</span> Implement spell casting mechanics
<span class="hljs-bullet">-</span> Build direction-based interaction rules
<span class="hljs-bullet">-</span> Create damage calculation with magic school advantages

<span class="hljs-section">## Next Steps</span>

<span class="hljs-section">### 7. Summoned Creatures &amp; Battlefield Grid</span>
<span class="hljs-bullet">-</span> Implement creature summoning and positioning
<span class="hljs-bullet">-</span> Create creature attack logic (they only shoot the closest target, whatever it is the opponent's wizard or opponent's creature)
<span class="hljs-bullet">-</span> Build grid management system

<span class="hljs-section">### 8. Visual Effects</span>
<span class="hljs-bullet">-</span> Design and implement spell animations
<span class="hljs-bullet">-</span> Create feedback for spell outcomes
<span class="hljs-bullet">-</span> Add visual indicators for active effects

<span class="hljs-section">### 9. Game State Management</span>
<span class="hljs-bullet">-</span> Implement win/lose conditions
<span class="hljs-bullet">-</span> Create game over screen
<span class="hljs-bullet">-</span> Add restart functionality

<span class="hljs-section">### 10. AI Opponent (Optional)</span>
<span class="hljs-bullet">-</span> Implement basic decision-making logic
<span class="hljs-bullet">-</span> Create difficulty levels

<span class="hljs-section">## Notes</span>
<span class="hljs-bullet">-</span> Core data models are complete and well-structured
<span class="hljs-bullet">-</span> The SpellFactory provides a centralized way to access all spells
<span class="hljs-bullet">-</span> The visual effects system is designed but will need implementation in GameScene
<span class="hljs-bullet">-</span> Effect system is now properly implemented with protocol conformance
<span class="hljs-bullet">-</span> Wizard class now handles effects and mana usage correctly
<span class="hljs-bullet">-</span> Shield defense system implemented with directional protection (100% for primary direction, 50% for other directions)
<span class="hljs-bullet">-</span> Magic school effectiveness is applied to shields (strong vs. weak schools)
<span class="hljs-bullet">-</span> Each magic school has a unique shield spell that can absorb up to 100 damage points
<span class="hljs-bullet">-</span> Shields last for 2 turns or until their protection value is depleted
</code></pre>
<p>It is always a really good idea to give Claude Code a lot of extra information and describe everything in detail. For instance, I usually formulated tasks in separate text files, giving particular code references and details. Here is an example that worked well for adding the spellpower ability to wizards:</p>
<pre><code class="lang-markdown">I want you to work on a new Wizard stat - spellpower.
For every point of spellpower, Wizard's spells become 5% stronger.
It has the following effects for different spells:

<span class="hljs-bullet">1.</span> For attack spells - spellpower increases the damage.
For instance, if a fireball does 20 damage by default, then it will do 21 damage with 1 point in spellpower.

<span class="hljs-bullet">2.</span> For shields - spellpower increases the shield absorbed amount, also by 5% per every point.
For instance, if a fire shield absorbs 60 damage, then with 2 points in spellpower, it will absorb 66 damage.

<span class="hljs-bullet">3.</span> For summon spells - spellpower increases the creature's maximum health and damage.
For instance, if a golem has 50 health and does 3 damage by default, then with 4 points in spellpower,
it will be 60 health and 4 damage (3 + 3 * 0.05 * 4 = 3.6 --&gt; 4)

<span class="hljs-bullet">4.</span> For other future spells, I will implement them later myself

In case of a fraction resulting value, all numbers are rounded to the closest value. 4.4 --&gt; 4; 4.6 --&gt; 5; 4.5 --&gt; 5.
</code></pre>
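<p>The scaling and rounding rules in that task fit in a tiny sketch (shown in Python for brevity; the game itself is in Swift, and these helper names are mine, not the project's):</p>

```python
def round_closest(x):
    # Python's built-in round() uses banker's rounding (round(4.5) == 4),
    # so "round to the closest value, halves up" is implemented explicitly:
    # 4.4 -> 4, 4.5 -> 5, 4.6 -> 5.
    return int(x + 0.5)

def scaled(base, spellpower):
    """Apply the 5%-per-point spellpower bonus to a base value."""
    return round_closest(base * (1 + 0.05 * spellpower))

# Examples from the task: fireball damage 20 -> 21 with 1 point,
# fire shield absorb 60 -> 66 with 2 points, golem damage 3 -> 4 with 4 points.
```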
<p>And here is an example of a refactoring performed by Claude Code, where I described the particular changes required:</p>
<pre><code class="lang-markdown">Refactor GameState enum. The current implementation:
<span class="hljs-code">```
enum GameState {
    // Turn phases
    case player1Turn
    case player2Turn

    // Selection phases
    case selectingMagicSchool
    case selectingSpellType
    case selectingSpell
    case selectingTarget
    case selectingDirection

    // Resolution phases
    case spellResolution
    case gameOver
}
```</span>

Selection phases should go into a separate enum <span class="hljs-code">`TurnState`</span>, with these members:
<span class="hljs-code">```
enum TurnState {
    case selectingSpell
    case selectingTarget
    case selectingDirection
}
```</span>

Game state goes between these states:
player1Turn --&gt; player2Turn --&gt; spellResolution --&gt; player1Turn
<span class="hljs-code">                                        |
                                        |
                                        v
                                    gameOver
</span>
Turn state:
selectingSpell --&gt; selectingTarget --&gt; selectingDirection

selectingTarget and selectingDirection are optional for some spells
</code></pre>
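<p>The transition diagram from that prompt can be sketched as a pure function (in Python for brevity; the actual project uses Swift enums with the corresponding member names):</p>

```python
from enum import Enum, auto

class GameState(Enum):
    PLAYER1_TURN = auto()
    PLAYER2_TURN = auto()
    SPELL_RESOLUTION = auto()
    GAME_OVER = auto()

def next_game_state(state, game_over=False):
    """Cycle player1Turn, player2Turn, spellResolution, then back to
    player1Turn, unless resolution detects that the game is over."""
    if state is GameState.PLAYER1_TURN:
        return GameState.PLAYER2_TURN
    if state is GameState.PLAYER2_TURN:
        return GameState.SPELL_RESOLUTION
    if state is GameState.SPELL_RESOLUTION:
        return GameState.GAME_OVER if game_over else GameState.PLAYER1_TURN
    return GameState.GAME_OVER  # gameOver is terminal
```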
<p>In the project, I created a separate folder for all these text files, so Claude can refer to them whenever it needs to recall the game rules or the project tree, and I keep completed tasks for the history. Here is my directory structure:</p>
<pre><code class="lang-markdown">info/
├── Completed
│   ├── defence_spell_logic.txt
│   ├── task_animated_chars.txt
│   ├── task_animated_sprites.txt
│   ├── task_game_state.txt
│   ├── task_shield_effect.txt
│   ├── task_spellbook.txt
│   ├── task_spellpower.txt
│   ├── task_spells_limit.txt
│   ├── task_summon.txt
│   └── ui_task.txt
├── game_rules.txt
├── implementation_progress.txt
├── plan.txt
├── project_tree.txt
├── task_dispell.txt
├── ui_layout.txt
└── used_assets.txt

2 directories, 17 files
</code></pre>
<p>I also added some important game facts to <code>CLAUDE.md</code> file, as recommended in <a target="_blank" href="https://www.anthropic.com/engineering/claude-code-best-practices">this article</a></p>
<h2 id="heading-grok">Grok</h2>
<p>I am, honestly, not a big fan of X. Even when it was Twitter, I kind of avoided that social network. But I have to admit: Grok is really good for certain tasks.</p>
<p>Even in the free version, it supports up to 25 attachments in one chat, which is excellent for refactoring tasks. It also means that on the free tier it will almost certainly train on my codebase, but I don’t see a big problem with someone getting a few random modules from my hobby project.</p>
<p>Here is how I used it: I gave it a file and asked for a refactoring plan:</p>
<blockquote>
<p>I am implementing a 2D game in Swift for iOS. I have my main class with game logic called <code>GameManager</code>. However, it is becoming a bit too large. Are there any ways I can refactor it? Come up with a plan for now, do not write the code yet!</p>
<p>📄 GameManager.swift</p>
</blockquote>
<p>Grok proposed a good plan. Here is the first part of the response, with the first two items:</p>
<blockquote>
<p>The <code>GameManager</code> class in your <code>GameManager.swift</code> file is indeed large and handles multiple responsibilities, which can make it difficult to maintain and extend. Refactoring it will improve modularity, readability, and testability. Below is a detailed plan to refactor the <code>GameManager</code> class by breaking it into smaller, more focused components while preserving the existing functionality.</p>
<h3 id="heading-refactoring-goals">Refactoring Goals</h3>
<ol>
<li><p><strong>Single Responsibility Principle (SRP)</strong>: Split the <code>GameManager</code> into smaller classes or structs, each responsible for a specific aspect of the game logic.</p>
</li>
<li><p><strong>Improved Readability</strong>: Organize related functionality into cohesive units to make the codebase easier to navigate.</p>
</li>
<li><p><strong>Testability</strong>: Isolate logic to enable unit testing of individual components.</p>
</li>
<li><p><strong>Maintainability</strong>: Make it easier to add new features or modify existing ones without altering the core <code>GameManager</code>.</p>
</li>
<li><p><strong>Reusability</strong>: Create components that can be reused in other parts of the game or future projects.</p>
</li>
</ol>
<h3 id="heading-analysis-of-current-responsibilities">Analysis of Current Responsibilities</h3>
<p>The <code>GameManager</code> class currently handles:</p>
<ul>
<li><p><strong>Game State Management</strong>: Tracks game state (<code>gameState</code>, <code>turnState</code>, <code>currentPlayerTurn</code>), player turns, and selection states (<code>selectedSpell</code>, <code>selectedTarget</code>, <code>selectedDirection</code>).</p>
</li>
<li><p><strong>Wizard Management</strong>: Initializes and manages wizards (<code>player1Wizard</code>, <code>player2Wizard</code>, <code>activeWizard</code>).</p>
</li>
<li><p><strong>Creature Management</strong>: Tracks and manages creatures (<code>allCreatures</code>, summoning, removal, and auto-attacks).</p>
</li>
<li><p><strong>Spell Management</strong>: Handles spell selection, targeting, direction selection, and resolution (including collision detection).</p>
</li>
<li><p><strong>UI and Visual Integration</strong>: Interacts with <code>UIManager</code>, <code>VisualEffectsManager</code>, <code>SpellbookUI</code>, and <code>GameCamera</code> for UI updates and visual feedback.</p>
</li>
<li><p><strong>Grid Interaction</strong>: Works with <code>GridManager</code> for position validation, highlighting, and creature placement.</p>
</li>
<li><p><strong>Game Flow</strong>: Manages turn switching, spell resolution phases, and game-over conditions.</p>
</li>
</ul>
<p>Given these responsibilities, we can split <code>GameManager</code> into several smaller components.</p>
<h3 id="heading-refactoring-plan">Refactoring Plan</h3>
<h4 id="heading-1-extract-game-state-management">1. Extract Game State Management</h4>
<p><strong>New Component</strong>: <code>GameStateManager</code> (Class)</p>
<ul>
<li><p><strong>Responsibility</strong>: Manage the game state, turn state, and player turns.</p>
</li>
<li><p><strong>Functions to Move</strong>:</p>
<ul>
<li><p><code>currentPlayerTurn</code>, <code>gameState</code>, <code>turnState</code>, <code>spellSelectionComplete</code></p>
</li>
<li><p><code>setCurrentPlayerTurn(_:)</code></p>
</li>
<li><p><code>switchPlayerTurn()</code></p>
</li>
<li><p><code>beginSpellResolutionPhase()</code></p>
</li>
<li><p><code>resetTurnActions()</code></p>
</li>
<li><p><code>checkGameOverCondition()</code></p>
</li>
<li><p><code>handleGameOver()</code></p>
</li>
<li><p><code>resetGame()</code></p>
</li>
</ul>
</li>
<li><p><strong>Dependencies</strong>: Will need references to <code>UIManager</code>, <code>GameCamera</code>, wizards (<code>player1Wizard</code>, <code>player2Wizard</code>), and stored actions (<code>player1SelectedAction</code>, <code>player2SelectedAction</code>).</p>
</li>
<li><p><strong>Purpose</strong>: Centralizes game flow logic (turn management, game phases, and game-over conditions) and reduces the <code>GameManager</code>’s responsibility for high-level state transitions.</p>
</li>
<li><p><strong>Interaction</strong>: <code>GameManager</code> will delegate turn-related and game state operations to <code>GameStateManager</code>.</p>
</li>
</ul>
<h4 id="heading-2-extract-spell-management">2. Extract Spell Management</h4>
<p><strong>New Component</strong>: <code>SpellManager</code> (Class)</p>
<ul>
<li><p><strong>Responsibility</strong>: Handle spell selection, targeting, direction selection, and resolution.</p>
</li>
<li><p><strong>Functions to Move</strong>:</p>
<ul>
<li><p><code>selectedSpell</code>, <code>selectedTarget</code>, <code>selectedDirection</code></p>
</li>
<li><p><code>startSpellCastingSequence()</code></p>
</li>
<li><p><code>showValidTargets()</code></p>
</li>
<li><p><code>handleTargetingGridSelection(_:in:)</code></p>
</li>
<li><p><code>showDirectionOptions(for:)</code></p>
</li>
<li><p><code>cancelSpellSelection()</code></p>
</li>
<li><p><code>storeSelectedSpell()</code></p>
</li>
<li><p><code>resolveSpell()</code></p>
</li>
<li><p><code>resolveSpellsInOrder(_:)</code></p>
</li>
<li><p><code>checkAttackSpellCollisions(_:)</code></p>
</li>
<li><p><code>applySpellEffects(_:)</code></p>
</li>
<li><p><code>applyAttackEffectsWithCollisions(_:collisionResult:)</code></p>
</li>
<li><p><code>calculateCollisionWorldPoint(from:to:direction1:from2:to2:direction2:)</code></p>
</li>
<li><p><code>areDirectionsCrossing(_:_:)</code></p>
</li>
<li><p><code>calculateHalfDistancePoint(from:to:direction:cellSize:)</code></p>
</li>
</ul>
</li>
<li><p><strong>Dependencies</strong>: Will need references to <code>GridManager</code>, <code>VisualEffectsManager</code>, <code>UIManager</code>, <code>SpellbookUI</code>, and wizards/creatures for targeting and effect application.</p>
</li>
<li><p><strong>Purpose</strong>: Isolates all spell-related logic, including selection, targeting, and resolution (including collision detection), into a single component.</p>
</li>
<li><p><strong>Notes</strong>:</p>
<ul>
<li><p>The <code>SpellManager</code> will handle the complex logic of spell interactions (e.g., collision detection and resolution order).</p>
</li>
<li><p>Consider creating a <code>SpellCollisionHandler</code> struct or class to encapsulate collision-specific logic (<code>checkAttackSpellCollisions</code>, <code>calculateCollisionWorldPoint</code>, <code>areDirectionsCrossing</code>, <code>calculateHalfDistancePoint</code>) for further modularity.</p>
</li>
</ul>
</li>
<li><p><strong>Interaction</strong>: <code>GameManager</code> will call <code>SpellManager</code> for spell-related actions, and <code>GameStateManager</code> will coordinate with <code>SpellManager</code> during the resolution phase.</p>
</li>
</ul>
<h4 id="heading-4ocm4ocm">……</h4>
</blockquote>
<p>Then you can execute that plan step-by-step, and it will provide new and updated files, so it is very easy to apply to the project:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750031515356/3d7ab84f-6411-4644-a1cf-e98d2a809b39.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750031532087/68701d11-a9fd-440a-bd18-2ff7e0fc3f08.png" alt class="image--center mx-auto" /></p>
<p>I feel like this combo of Claude Code for adding code and Grok for refactoring modules worked really well for me. Grok does not always generate working code, and it sometimes mixes up variables, so you definitely need to keep an eye on it. But the same is true of all AI-generated code.</p>
<p>I also used it as a generic AI assistant when I needed to chat about my game for free, since it now has a memory feature and remembers many facts about the project, just like ChatGPT.</p>
<h1 id="heading-conclusion">Conclusion</h1>
<p>I like the possibilities that AI assistants unlock. It's great to be able to make good progress on your hobby projects even if you only have an hour or two a day, or the weekend. That is fascinating, but it's also important to understand, predict, and analyze the consequences and influence AI can have worldwide, from small projects like mine to drastic societal effects and changes to the whole way we work. Here are some of my thoughts.</p>
<ol>
<li><p>AI is not a replacement for professionals, at least for now. Despite being helpful at generating images or code, it is still far from a professional's deep understanding of things. For instance, the example above shows that it put two different entities, turn state and game state, into the same enum: <code>GameState</code>. The same probably applies to illustrations. Designers and illustrators make things look aesthetic and bring additional ideas or features into their characters, sometimes not even seen at first glance but understood subconsciously. Despite its complexity, AI is essentially about minimizing error over its training data, so it outputs a “middle”, “good”, “satisfying” solution, losing some of the deep features. It lacks originality in art and generates suboptimal code. That leads to the next point.</p>
</li>
<li><p>AI makes things that work. But exactly how they work can get lost. For instance, there are multiple ways to implement concurrently executed tasks. At first glance everything gets executed in the end and everything works, but the actual result can differ. This matters even more once you step away from prototyping to an actual project and try to optimize it. For instance, when I implemented FitAssist, AI sometimes put code on the main thread. That works fine while you don’t have much data, but had I missed it, users could have reported that the app is slow and that the more you use it, the slower it gets. So you <strong>really</strong> need to know your codebase.</p>
</li>
<li><p>We all started small. We learned how to code basic things and operations, then built more knowledge on top of that. Now AI can handle simple operations and tasks, and people are starting to say that some junior developer work could be replaced with AI. Yes, we might save money and human labour hours, but it would also mean that the person is not getting experience and would never grow into a middle and then senior engineer. Suppose we project 10 years ahead (which is always a silly idea, since processes never develop linearly, but it’s just a mental exercise). In that case, we have products that somehow work, created with the active participation of AI, and the last person who manually worked with the codebase retired a year ago. Not a big win in the long term, right?</p>
<p> Also, with the help of AI, a beginner developer is not required to learn everything. All the answers are so easily available; all the tasks are solved magically. I still remember nights debugging my self-written graphics engine about 15 years ago, staring at an empty black screen and trying to understand what was wrong and why my level was not drawn. Now, why would you bother learning how it actually works, when the result is achievable with AI and the provided solution seems to be doing its job? Why would you ask why the codebase is structured that way, or what the other options for organizing it are? AI probably knows what architecture to choose, right?</p>
<p> Some people can say that it already happened many times. Look at the car manufacturing facility in 1920:</p>
<p> <img src="https://www.pbs.org/wgbh/americanexperience/media/gallery_images/Ford-Gallery-5-1760.jpg" alt /></p>
<p> And at the modern South Korean manufactory with robots:</p>
<p> <img src="https://www.assemblymag.com/ext/resources/Issues/2023/august/asb0823edit2-780x439.jpg?1690915971" alt="Automotive factory in South Korea " /></p>
<p> Everything looks good - routine hard work gets automated, car quality and assembly precision increase significantly, and people are transitioning from manual labour workers to advanced robot operators.</p>
<p> But I feel like the AI shift is different. I believe it is the first time in human history that automation is occurring in the creative class and intellectual professions. And it might have a bigger impact than we thought. It may be the first time we might delegate decision-making to a non-human. Not a particular algorithm, developed by a person, to work with a specific type of data, and output the decision. But to a generic AI model, that can perform reasoning on provided data, emulate a thinking process, and form a final decision, supporting it with facts from the provided data and its existing knowledge. And even if it could create some additional jobs for more AI engineers or AI ethics specialists, it feels like it might eventually shrink the total number of jobs available.</p>
</li>
<li><p>Probably, if progress continues at the same speed, within the next 3-5 years an experienced developer equipped with a range of AI tools will be able to handle an entire team’s workload, especially during a product’s prototyping phase. A senior lawyer might delegate contract analysis and matching it against the rules of law to AI. Companies would be happy to increase their salaries, since they are so productive. However, that would also mean they no longer need junior specialists, or illustrators and designers at small companies, since the CEO can do those things on their own in their free time. That might increase economic inequality even further, as the number of jobs may shrink.</p>
</li>
<li><p>Yes, somebody might say that it is a dog-eat-dog world and you have to adapt. However, if the process of replacing human labour is avalanche-like and about 50% of the population is left behind, it’s only a matter of time before the consequences affect everyone.</p>
<p> Economic inequality is not just a theoretical social-justice concept; it can have broad real-world consequences. In the Russian Empire before the revolution, wealth distribution was quite extreme: 1.5% of the population owned 25% of the land, while most people were peasants or urban industrial workers living in poverty. After the communists seized power by force and won the civil war, all of this culminated in an extreme campaign of repression against the wealthy, successful peasants - <a target="_blank" href="https://en.wikipedia.org/wiki/Dekulakization">Dekulakization</a>.</p>
<p> Of course, that is an extreme case, but even now, people who have been left behind by progress are becoming easy targets for populist politicians, who promise to return to the good old times, defeat all enemies, and solve all the complex problems with simple solutions. This only increases polarization and tension in society, making the world more chaotic and less safe for everyone.</p>
</li>
<li><p>Do I propose stopping the progress and development of AI? Of course not. You can never stop progress; reactionary forces and ideas have historically lost, and we have to embrace progress and make it our friend, not our enemy. But let’s keep in mind the consequences of our actions. Let’s stay in control of our fate. Let’s put people first. Let’s build things that <strong>help</strong> people, not <strong>replace</strong> them. Let’s be mindful, because with great power comes great responsibility.</p>
</li>
</ol>
]]></content:encoded></item><item><title><![CDATA[My first iOS Fitness app - FitAssist]]></title><description><![CDATA[Intro
I’ve been writing code for about 15 years now, ever since I kicked off my first project. My debut in the mobile world came in 2014 with an Android music player—a passion project that sparked my love for app development. These days, I spend most...]]></description><link>https://alexcoder.dev/fitassist</link><guid isPermaLink="true">https://alexcoder.dev/fitassist</guid><category><![CDATA[fitness]]></category><category><![CDATA[iOS]]></category><category><![CDATA[Swift]]></category><category><![CDATA[SwiftUI]]></category><category><![CDATA[Apple]]></category><category><![CDATA[fitness app]]></category><dc:creator><![CDATA[Alexander Polev]]></dc:creator><pubDate>Tue, 01 Apr 2025 00:35:29 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1743467612244/39b6f4c0-16f7-4212-98a8-46cd5d73f416.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-intro">Intro</h2>
<p>I’ve been writing code for about 15 years now, ever since I kicked off my first project. My debut in the mobile world came in 2014 with an Android music player—a passion project that sparked my love for app development. These days, I spend most of my professional life building web applications, but mobile apps remain my hobby. There’s something thrilling about the process—it’s different from web dev yet familiar enough that skills are transferable between these 2 worlds.</p>
<p>Sport, though, is my other passion. As a CrossFit athlete and fitness enthusiast, I’m no stranger to workout timers and the grind of tracking progress. I’ve always been fascinated by how technology can elevate that journey, turning sweat and reps into data-driven growth. I also used an app called “Smart WOD Timer” in my workouts and wanted to improve some things about it. That curiosity led me to "FitAssist," my first iOS app. It’s not just a tool—it’s a personal experiment in trying to pack many fitness tools into one app. I’d already experimented with pose recognition using YOLOv11 for a previous article, and I am looking forward to seeing that functionality in my fitness app someday.</p>
<p>I love a challenge, and learning Swift felt like the perfect one. After years of web projects, diving into Apple’s ecosystem was daunting but exciting—a chance to try something new and flex both my developer muscles and my fitness nerd brain. "FitAssist" was born from that spark, a mix of passion, code, and a little bit of grit. I wanted to make a useful app that would accompany your fitness exercises and help with tracking, posture, recommendations and so on, like a good assistant for your workouts.</p>
<h2 id="heading-diving-into-ios-development">Diving into iOS Development</h2>
<p>Stepping into iOS development wasn’t entirely uncharted territory for me. With 15 years of coding experience, I’m no stranger to compiled languages or to juggling threads and asynchronous code, skills picked up from Android development and years of web projects. My Android experience gave me a solid foundation: I knew how to structure an app, deal with build systems, and optimize for mobile. But iOS? That was a fresh canvas, and Swift was a brush I’d never picked up before.</p>
<p>What surprised me was how familiar SwiftUI felt. As someone with deep React experience, I spotted the inspiration right away—declarative UI, component-based design, and state management. It was like meeting an old friend in a new city. The Apple ecosystem itself was more of a new world, but developer support turned out to be great. Enrolling in the Apple Developer Program initially resulted in an error with a recommendation to contact support: their address recognition system couldn’t read the correct address from my Ontario driver’s license. So, I submitted a support ticket—and within minutes, my phone rang. Apple’s support team walked me through it, checked my account, and suggested swapping in a different document. I re-submitted and successfully enrolled! Their responsiveness surprised me in the best way.</p>
<p>But before that, I leaned on Apple’s beginner tutorial for app development (the "<a target="_blank" href="https://developer.apple.com/tutorials/app-dev-training/getting-started-with-scrumdinger">Scrumdinger</a>" app at <a target="_blank" href="http://developer.apple.com">developer.apple.com</a>). It covers a wide variety of topics: architecture, components, and best practices, with a clarity that even a seasoned dev can appreciate. I dove into Xcode, started piecing together "FitAssist," and felt the thrill of seeing my first SwiftUI views render. This was my playground now, and I was ready to build.</p>
<h2 id="heading-building-the-core-the-mvp-takes-shape">Building the Core: The MVP Takes Shape</h2>
<p>When I started on "FitAssist," I looked to an app I’d used a ton: <em>Smart WOD Timer</em>. It’s a solid tool for CrossFit workouts: clean UI, reliable countdowns, and a simple setup.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743391278989/8e4aec6d-504b-4fbd-933e-da75fe3ea141.png" alt class="image--center mx-auto" /></p>
<p>But here’s the thing: you probably can’t build something totally new in fitness apps. What you <em>can</em> do is grab solid ideas and make them better. That’s where my MVP began. I wanted a workout timer that felt more flexible, inspired by the actual timer we use at the CrossFit gym. It’s generic enough to handle any workout: CrossFit WODs, interval training, running - not locked into one style.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743392365544/fc102433-e6af-47fa-aeda-bec5f20d1722.jpeg" alt class="image--center mx-auto" /></p>
<p>For the config screen, I went with a fresh approach: it’s fully configurable and adapts to whatever you’re training.</p>
<p>The timer itself? It shows what matters mid-workout—time, rounds, a lap button to mark progress—without cluttering your focus. I kept the UI simple and responsive.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743392525015/167edc78-1bf8-4ae8-ba9c-66c06e617b26.png" alt class="image--center mx-auto" /></p>
<p>Open the app, and you’re ready to sweat - quick load and no screen-hopping to start. That was my rule: instant, practical, no distractions.</p>
<p>I also played with the HealthKit storage API to pull in pulse data and draw a graph with it, using <a target="_blank" href="https://github.com/willdale/SwiftUICharts">SwiftUICharts</a>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743392711914/378fd80f-685e-4c60-a78b-c4abce42d3cd.png" alt class="image--center mx-auto" /></p>
<p>HealthKit permissions seemed tricky at first, but I figured it out, and seeing heart rate tied into the app felt like a win. Coding it in Swift turned out to be exciting once I got rolling. This MVP wasn’t perfect, but it was mine—a fitness tool I’d actually use.</p>
<p>Also, I created a <a target="_blank" href="https://fit-assist.app/">landing page</a> for this project using <a target="_blank" href="http://lovable.dev">lovable.dev</a>.</p>
<p>I was so impressed with that service. It helps you make amazing things in just a few prompts and lets you connect a custom domain and a Supabase backend database. The best thing: you own the code; it all gets pushed to your GitHub repo.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743426568451/fd553817-1336-4c23-aabd-9a18d74a2e8d.png" alt class="image--center mx-auto" /></p>
<p>Just as an experiment, I was able to create a dark / light mode button, tied to the system settings, in about half an hour, with just a few prompts and some style fixes afterwards.</p>
<p>I used some code assistants, mostly Grok, to write the app code because Grok has many free features. But that is a topic for another article with a code assistant comparison.</p>
<h2 id="heading-the-road-ahead-planned-features-and-lessons-learned">The Road Ahead: Planned Features and Lessons Learned</h2>
<p>"FitAssist" is just getting started, and I’ve got big plans. Right now, I’m working on a running feature - tracking distance, pace, and location, all plotted on a map. It’s gotta be smooth and useful, something I’d rely on for my own runs. But here’s the deal: handling location and Health data means security’s non-negotiable. I’m not about letting that stuff leak. So, I think I should keep everything on the user’s device using SwiftData, with iCloud sync as an option. It’s all local, all private, data never goes to any external server. I want "FitAssist" to be “privacy first,” because that’s how I’d want my own data treated.</p>
<p>Later, I want to bring in computer vision algorithms with pose recognition and movement detection. Picture this: I prop my phone up, and it films me doing a handstand, then breaks down the move: form, balance, the line. I could re-watch it, tweak my technique. Also, it would be great to have rep counts or feedback on back squats. Similar to what I did and described in my previous article, where I was experimenting with YOLOv11.</p>
<p>Another feature I’m thinking about is a workout generator. Pick your intensity, gear, and favorite moves, and let it generate a custom plan, based on common and well-known WODs, structuring it with warmup - main workout - cooldown and stretch.</p>
<p>Building this MVP taught me a ton—Swift’s features, Apple’s ecosystem, and how to make things work there. FitAssist isn’t just an app - it’s my main hobby project and a sandbox for blending my passions: code and fitness. I am looking forward to implementing many cool ideas. The road ahead is wide open, and I’m ready to run with it.</p>
<p>You can download the app completely free from the AppStore:</p>
<p><a target="_blank" href="https://apps.apple.com/ca/app/fitassist-workout-assistant/id6743733478">https://apps.apple.com/ca/app/fitassist-workout-assistant/id6743733478</a></p>
]]></content:encoded></item><item><title><![CDATA[Real-time pose estimation and movement detection with YOLOv11]]></title><description><![CDATA[Intro
In this article, we explore how to stream video from your smartphone to your computer, using it as a webcam. Then, we combine this with OpenCV video capturing and use YOLOv11 for pose estimation and movement detection.
I am using Python, OpenCV...]]></description><link>https://alexcoder.dev/pose-estimation</link><guid isPermaLink="true">https://alexcoder.dev/pose-estimation</guid><category><![CDATA[pose estimation]]></category><category><![CDATA[Computer Vision]]></category><category><![CDATA[Python]]></category><category><![CDATA[opencv]]></category><category><![CDATA[YOLO]]></category><dc:creator><![CDATA[Alexander Polev]]></dc:creator><pubDate>Sun, 12 Jan 2025 05:19:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1736691420711/5213c782-7860-424f-b3b0-f2899c95d4a7.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-intro">Intro</h2>
<p>In this article, we explore how to stream video from your smartphone to your computer, using it as a webcam. Then, we combine this with OpenCV video capturing and use YOLOv11 for pose estimation and movement detection.</p>
<p>I am using Python, OpenCV and Ultralytics library for coding and Camo Studio as software.</p>
<h2 id="heading-streaming-from-smartphone">Streaming from smartphone</h2>
<p>To emulate real-life gym usage, I decided to use a video from my smartphone for the detection task. Luckily, I found a nice article that describes how to do exactly that and turn your smartphone into a webcam: <a target="_blank" href="https://medium.com/@saicoumar/how-to-use-a-smartphone-as-a-webcam-with-opencv-b68773db9ddd">https://medium.com/@saicoumar/how-to-use-a-smartphone-as-a-webcam-with-opencv-b68773db9ddd</a></p>
<p>Long story short, you install Camo Studio on both a laptop and a phone, and connect them:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736609918842/7c8134a9-b138-4dd2-8580-defefdc3644a.png" alt class="image--center mx-auto" /></p>
<p>After that, it could be used in OpenCV as video capture:</p>
<pre><code class="lang-python">    cap = cv2.VideoCapture(<span class="hljs-number">1</span>)

    <span class="hljs-comment"># Your resolution from Camo</span>
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, <span class="hljs-number">720</span>)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, <span class="hljs-number">1280</span>)

    <span class="hljs-keyword">while</span> cap.isOpened():
        ret, frame = cap.read()
</code></pre>
<h2 id="heading-pose-estimation-with-yolov11">Pose estimation with YOLOv11</h2>
<p>Pose estimation is the task of identifying specific keypoints on an object, in this example the human body. The keypoints can represent various parts of the object, such as joints.</p>
<p>There are different models that could be used for this task, such as:</p>
<ul>
<li><p><a target="_blank" href="https://docs.ultralytics.com/tasks/pose/">Ultralytics YOLOv11</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/facebookresearch/sapiens">Facebook Sapiens</a></p>
</li>
<li><p><a target="_blank" href="https://ai.google.dev/edge/mediapipe/solutions/vision/pose_landmarker">Mediapipe</a> from Google</p>
</li>
<li><p><a target="_blank" href="https://developer.apple.com/documentation/coreml/detecting-human-body-poses-in-an-image">CoreML</a></p>
</li>
</ul>
<p>A popular dataset and output format is <a target="_blank" href="http://presentations.cocodataset.org/COCO17-Keypoints-Overview.pdf">COCO17</a>, which is widely used in pose estimation.</p>
<p>It looks like many models were trained on this dataset, since the output format is the same. However, some of them use the extended COCO-WholeBody format: 133 keypoints (17 body, 6 feet, 68 face, and 42 hand keypoints). The Google model outputs 33 keypoints.</p>
<p>However, for tracking body movements in our task we only need the body keypoints, which are pretty much the same across all the models, so I decided to stick with YOLOv11 since it is easy to use.</p>
<p>It outputs the following keypoints:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736626227455/65894f6b-67bb-45c3-9b62-e51ea754883f.png" alt class="image--center mx-auto" /></p>
<pre><code class="lang-python"><span class="hljs-number">0</span>: Nose <span class="hljs-number">1</span>: Left Eye <span class="hljs-number">2</span>: Right Eye <span class="hljs-number">3</span>: Left Ear <span class="hljs-number">4</span>: Right Ear 
<span class="hljs-number">5</span>: Left Shoulder <span class="hljs-number">6</span>: Right Shoulder <span class="hljs-number">7</span>: Left Elbow <span class="hljs-number">8</span>: Right Elbow 
<span class="hljs-number">9</span>: Left Wrist <span class="hljs-number">10</span>: Right Wrist <span class="hljs-number">11</span>: Left Hip <span class="hljs-number">12</span>: Right Hip 
<span class="hljs-number">13</span>: Left Knee <span class="hljs-number">14</span>: Right Knee <span class="hljs-number">15</span>: Left Ankle <span class="hljs-number">16</span>: Right Ankle
</code></pre>
<p>You can simply call the model on every frame. I used the medium version, which gives about 10 FPS when running on the CPU - enough for a test, anyway.</p>
<p>To draw the result, you can use <code>result.plot()</code>, which returns a copy of the frame. But since I wanted to build a pipeline that applies all the changes to the existing frame, I went one level deeper and used the <code>Annotator</code> utility class and its <code>kpts</code> function to draw the keypoints.</p>
<p>The resulting code:</p>
<pre><code class="lang-python">model = YOLO(<span class="hljs-string">"yolo11m-pose.pt"</span>)

<span class="hljs-comment"># ... capture init code and a loop ...</span>
    results = model(frame)
    <span class="hljs-comment"># Model can accept a list of images or a single image, but returns a list</span>
    result = results[<span class="hljs-number">0</span>]
    keypoints = result.keypoints.xy.tolist()
    keypoints = keypoints[<span class="hljs-number">0</span>]  <span class="hljs-comment"># we are taking the first detected person</span>

    annotator = Annotator(frame)
    annotator.kpts(result.keypoints.data[<span class="hljs-number">0</span>], result.orig_shape, <span class="hljs-number">5</span>, <span class="hljs-literal">True</span>)
    annotated_frame = annotator.result()

    draw_angles(annotated_frame, keypoints)

    cv2.imshow(<span class="hljs-string">"YOLO Inference"</span>, annotated_frame)
</code></pre>
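<p>One caveat worth noting (my addition, not part of the original snippet): <code>result.keypoints.xy.tolist()</code> returns an empty list when nobody is in the frame, so taking <code>[0]</code> unconditionally would raise an <code>IndexError</code>. A small guard keeps the loop alive:</p>

```python
def first_person_keypoints(keypoints_xy):
    """keypoints_xy: the result.keypoints.xy.tolist() list from YOLO.

    Returns the first detected person's keypoints, or None when
    the frame contains no detections."""
    if not keypoints_xy or not keypoints_xy[0]:
        return None
    return keypoints_xy[0]
```

<p>In the main loop you would then simply <code>continue</code> to the next frame when it returns <code>None</code>.</p>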
<p>The <code>draw_angles</code> function calculates the angles at the knee and hip joints and draws each number next to the corresponding joint.</p>
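<p>The article doesn’t include <code>draw_angles</code> itself; at its core it is a standard joint-angle computation from three keypoints. A minimal sketch (the helper name is mine, the indices follow the COCO ordering above, and drawing the number with <code>cv2.putText</code> is omitted):</p>

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at keypoint b, formed by segments b->a and b->c."""
    ang1 = math.atan2(a[1] - b[1], a[0] - b[0])
    ang2 = math.atan2(c[1] - b[1], c[0] - b[0])
    deg = abs(math.degrees(ang1 - ang2))
    # Normalize into the 0..180 range
    return 360.0 - deg if deg > 180.0 else deg

# Left knee angle: left hip (11) - left knee (13) - left ankle (15)
# knee_angle = joint_angle(keypoints[11], keypoints[13], keypoints[15])
```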
<p>I also added an FPS counter in the top left corner. So the resulting image looks like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736657436403/b2fcd0f5-cce7-4f8d-bb44-832a789b90fe.png" alt class="image--center mx-auto" /></p>
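<p>The FPS counter itself is simple: time the gap between frames and smooth the instantaneous rate. A possible implementation (a sketch, not necessarily the exact code I used):</p>

```python
import time

class FpsCounter:
    """Exponentially smoothed frames-per-second estimate."""

    def __init__(self, smoothing=0.9):
        self.smoothing = smoothing
        self.last_time = None
        self.fps = 0.0

    def tick(self, now=None):
        """Call once per frame; returns the current FPS estimate."""
        if now is None:
            now = time.monotonic()
        if self.last_time is not None and now > self.last_time:
            instant = 1.0 / (now - self.last_time)
            if self.fps:
                self.fps = self.smoothing * self.fps + (1.0 - self.smoothing) * instant
            else:
                self.fps = instant
        self.last_time = now
        return self.fps
```

<p>Call <code>tick()</code> once per loop iteration and draw the returned value on the frame with <code>cv2.putText</code>.</p>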
<h2 id="heading-movement-detection">Movement detection</h2>
<p>Movement detection is a more advanced technique, relying on analyzing keypoints across time.</p>
<p>Ultralytics provides a separate <a target="_blank" href="https://docs.ultralytics.com/guides/workouts-monitoring/">Workouts Monitoring module</a> that can automatically track movements across certain keypoints.</p>
<p><img src="https://github.com/ultralytics/docs/releases/download/0/pushups-counting.avif" alt="PushUps Counting" /></p>
<p>However, since I plan to keep experimenting with different models and even platforms, it is more useful to implement the detection myself.</p>
<p>I wanted to create a simple gesture detector that would use a raised hand for 2 seconds as a signal to start and stop video recording.</p>
<p>I implemented this simple class:</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">GesturedVideoCapture</span>:</span>
    is_recording: bool = <span class="hljs-literal">False</span>
    video_writer: cv2.VideoWriter | <span class="hljs-literal">None</span> = <span class="hljs-literal">None</span>
    capture_gesture_start_time: float = <span class="hljs-number">0.0</span>

    GESTURE_HOLD_TIME = <span class="hljs-number">2.0</span> <span class="hljs-comment"># 2 seconds</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">process_frame</span>(<span class="hljs-params">self, frame, keypoints, current_time</span>):</span>
        nose = keypoints[<span class="hljs-number">0</span>]
        nose_seen = nose[<span class="hljs-number">0</span>] &gt; <span class="hljs-number">0</span> <span class="hljs-keyword">and</span> nose[<span class="hljs-number">1</span>] &gt; <span class="hljs-number">0</span>
        left_ear_seen = keypoints[<span class="hljs-number">3</span>][<span class="hljs-number">0</span>] &gt; <span class="hljs-number">0</span> <span class="hljs-keyword">and</span> keypoints[<span class="hljs-number">3</span>][<span class="hljs-number">1</span>] &gt; <span class="hljs-number">0</span>
        right_ear_seen = keypoints[<span class="hljs-number">4</span>][<span class="hljs-number">0</span>] &gt; <span class="hljs-number">0</span> <span class="hljs-keyword">and</span> keypoints[<span class="hljs-number">4</span>][<span class="hljs-number">1</span>] &gt; <span class="hljs-number">0</span>

        left_wrist = keypoints[<span class="hljs-number">9</span>]
        right_wrist = keypoints[<span class="hljs-number">10</span>]

        in_capture_gesture = (
            nose_seen <span class="hljs-keyword">and</span> left_ear_seen <span class="hljs-keyword">and</span> right_ear_seen <span class="hljs-keyword">and</span>
            right_wrist[<span class="hljs-number">1</span>] &lt; nose[<span class="hljs-number">1</span>] &lt; left_wrist[<span class="hljs-number">1</span>]
        )

        <span class="hljs-keyword">if</span> self.is_recording <span class="hljs-keyword">and</span> self.video_writer <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>:
            resized = cv2.resize(frame, (WRITER_WIDTH, WRITER_HEIGHT))
            self.video_writer.write(resized)

        <span class="hljs-keyword">if</span> self.capture_gesture_start_time:
            <span class="hljs-keyword">if</span> in_capture_gesture:
                <span class="hljs-keyword">if</span> self.capture_gesture_start_time + self.GESTURE_HOLD_TIME &lt; current_time:
                    <span class="hljs-comment"># Hold for GESTURE_HOLD_TIME seconds - starting or stopping capture</span>
                    <span class="hljs-keyword">if</span> self.is_recording:
                        self.stop_and_save_capture()
                    <span class="hljs-keyword">else</span>:
                        self.start_capture()

                    <span class="hljs-keyword">return</span>
            <span class="hljs-keyword">else</span>:
                self.capture_gesture_start_time = <span class="hljs-number">0</span>
        <span class="hljs-keyword">elif</span> in_capture_gesture:
            self.capture_gesture_start_time = current_time


    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">start_capture</span>(<span class="hljs-params">self</span>):</span>
        print(<span class="hljs-string">"Start capturing"</span>)
        self.is_recording = <span class="hljs-literal">True</span>
        self.capture_gesture_start_time = <span class="hljs-number">0</span>
        self.video_writer = cv2.VideoWriter(
            <span class="hljs-string">f"output_<span class="hljs-subst">{int(time.time())}</span>.mp4"</span>,
            cv2.VideoWriter_fourcc(*<span class="hljs-string">'avc1'</span>),
            <span class="hljs-number">10</span>,
            (WRITER_WIDTH, WRITER_HEIGHT)
        )


    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">stop_and_save_capture</span>(<span class="hljs-params">self</span>):</span>
        print(<span class="hljs-string">"Stop capturing"</span>)
        self.is_recording = <span class="hljs-literal">False</span>
        self.capture_gesture_start_time = <span class="hljs-number">0</span>
        self.video_writer.release()
        self.video_writer = <span class="hljs-literal">None</span>
</code></pre>
<p>That could be added into a main loop pipeline, after <code>draw_angles</code>:</p>
<pre><code class="lang-python">video_capture = GesturedVideoCapture()

<span class="hljs-comment"># .....</span>
        draw_angles(annotated_frame, keypoints)
        video_capture.process_frame(annotated_frame, keypoints, time.time())
</code></pre>
<p>And here is the result: a video writer activated by a raised hand:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1736657914218/b2ac8c0d-00ea-4334-a782-a097e4accab3.gif" alt class="image--center mx-auto" /></p>
<p>The code is available here - <a target="_blank" href="https://github.com/hypnocapybara/pose_estimation_test">https://github.com/hypnocapybara/pose_estimation_test</a></p>
]]></content:encoded></item><item><title><![CDATA[Implementing your personal production-ready Telegram bot using AI tools to monitor, transcribe, summarize and voice videos from YouTube]]></title><description><![CDATA[Intro
Artificial intelligence (AI) is transforming the way we consume, create, and share information on the internet. From speech recognition to natural language processing, from computer vision to text summarization, AI tools are enabling us to acce...]]></description><link>https://alexcoder.dev/yotube-summarize-and-voice</link><guid isPermaLink="true">https://alexcoder.dev/yotube-summarize-and-voice</guid><category><![CDATA[Python]]></category><category><![CDATA[telegram]]></category><category><![CDATA[openai]]></category><category><![CDATA[summarization]]></category><category><![CDATA[tts]]></category><category><![CDATA[text to speech]]></category><dc:creator><![CDATA[Alexander Polev]]></dc:creator><pubDate>Fri, 29 Dec 2023 02:01:29 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1703961485074/d8bcd1be-c53f-4ace-b65c-b2267a0b8b0e.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-intro">Intro</h1>
<p>Artificial intelligence (AI) is transforming the way we consume, create, and share information on the internet. From speech recognition to natural language processing, from computer vision to text summarization, AI tools are enabling us to access and understand vast amounts of data in various formats and languages.</p>
<p>In this article, I will show you how to build your personal Telegram bot using open-source AI tools and Python libraries to monitor, transcribe, summarize, and voice videos from your YouTube subscriptions. I tried to focus on self-hosted solutions and tools, and experimented with voice cloning.</p>
<p>The desired result: you automatically get short and informative summaries, voiced by your favorite blogger using voice-cloning technology!</p>
<p>Special attention is paid to the app architecture, to make it as close as possible to a real production-ready, scalable service. I used the background workers and job scheduling library <a target="_blank" href="https://github.com/rq/rq">RQ</a> (<em>Redis Queue</em>) and its Django integration <a target="_blank" href="https://github.com/rq/django-rq">django-rq</a>. Also, I will show how to integrate the asynchronous Telegram bot library <a target="_blank" href="https://aiogram.dev/">aiogram</a> with modern Django.</p>
<h1 id="heading-project-schema-and-architecture">Project schema and architecture</h1>
<p>Showcased below is a visual representation of the pipeline.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1703698449429/1e95a2e9-1a56-4c41-9a08-461181f89643.png" alt class="image--center mx-auto" /></p>
<p>To implement the scheme shown above, I created a Django project, since this framework has a nice ORM and an out-of-the-box admin interface. django-rq is used for executing tasks.</p>
<p>The full list of libraries used:</p>
<ul>
<li><p>Django - for the admin web interface and database management</p>
</li>
<li><p>Django-RQ - for task workers and job execution queue</p>
</li>
<li><p>pytubefix - to parse YouTube videos and channels</p>
</li>
<li><p>OpenAI Whisper - for video transcription</p>
</li>
<li><p>nltk.tokenizer - to split text into sentences</p>
</li>
<li><p>OpenAI API - to generate key moments with ChatGPT</p>
</li>
<li><p>aiogram - for Telegram bot</p>
</li>
</ul>
<p>Thanks to the modular architecture, every step of the pipeline can be replaced with a similar solution.</p>
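<p>One way to picture this modularity (an illustration only, not code from the repo): each stage can be typed as a small interface, so a self-hosted Whisper backend and an API-based one become interchangeable. All names below are hypothetical.</p>

```python
from typing import Protocol


class Transcriber(Protocol):
    """Interface any transcription backend must satisfy."""
    def transcribe(self, audio_path: str) -> str: ...


class FakeTranscriber:
    """Toy backend; a Whisper-based or API-based class would have the same shape."""
    def transcribe(self, audio_path: str) -> str:
        return f"transcript of {audio_path}"


def transcription_stage(backend: Transcriber, audio_path: str) -> str:
    # The pipeline depends only on the interface, so backends are swappable.
    return backend.transcribe(audio_path)
```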
<p>The core logic is described in the main module. It has 2 models. <code>YoutubeChannel</code>:</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">YoutubeChannel</span>(<span class="hljs-params">models.Model</span>):</span>
    url = models.URLField()
    enabled = models.BooleanField(default=<span class="hljs-literal">True</span>)
    title = models.CharField(max_length=<span class="hljs-number">200</span>, blank=<span class="hljs-literal">True</span>, null=<span class="hljs-literal">True</span>)
    last_parsed_at = models.DateTimeField(blank=<span class="hljs-literal">True</span>, null=<span class="hljs-literal">True</span>)
    voice_file = models.FileField(upload_to=<span class="hljs-string">'channels/'</span>, blank=<span class="hljs-literal">True</span>, null=<span class="hljs-literal">True</span>)
</code></pre>
<p>And <code>YoutubeVideo</code>:</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">YoutubeVideo</span>(<span class="hljs-params">CreatedUpdatedMixin</span>):</span>
    channel = models.ForeignKey(YoutubeChannel, on_delete=models.CASCADE)
    url = models.URLField()
    youtube_id = models.CharField(max_length=<span class="hljs-number">100</span>, blank=<span class="hljs-literal">True</span>, null=<span class="hljs-literal">True</span>)
    title = models.CharField(max_length=<span class="hljs-number">200</span>, blank=<span class="hljs-literal">True</span>, null=<span class="hljs-literal">True</span>)
    audio_file = models.FileField(upload_to=<span class="hljs-string">'videos/audio/'</span>, blank=<span class="hljs-literal">True</span>, null=<span class="hljs-literal">True</span>)
    transcription = models.TextField(blank=<span class="hljs-literal">True</span>, null=<span class="hljs-literal">True</span>)
    transcription_language = models.CharField(max_length=<span class="hljs-number">10</span>, blank=<span class="hljs-literal">True</span>, null=<span class="hljs-literal">True</span>)
    summary = models.TextField(blank=<span class="hljs-literal">True</span>, null=<span class="hljs-literal">True</span>)
    voiced_summary = models.FileField(upload_to=<span class="hljs-string">'videos/voiced/'</span>, blank=<span class="hljs-literal">True</span>, null=<span class="hljs-literal">True</span>)
</code></pre>
<p>It saves the data for all the pipeline stages.</p>
<p>Tasks are executed in django-rq and split into two queues: <code>default</code> and <code>ai</code>. The default queue is used for any external API calls, like parsing YouTube or sending Telegram messages; it can have multiple workers and is not resource-intensive. The <code>ai</code> queue, on the other hand, handles heavy tasks like transcription, summarization, and voicing. In code, tasks are marked with the <code>@job</code> decorator, which specifies the queue for the task.</p>
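<p>To make the <code>@job</code>/<code>.delay</code> calls in the snippets below easier to read, here is a toy stand-in for the decorator. This is not django-rq's implementation: the real <code>.delay</code> serializes the call and pushes it onto the named Redis queue for a worker to pick up, while this sketch just runs it inline.</p>

```python
import functools


def job(queue_name, timeout=None):
    """Toy stand-in for django-rq's @job decorator (illustration only).

    The real decorator enqueues .delay() calls into Redis; this sketch
    only records the queue name and executes the call synchronously."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        wrapper.queue_name = queue_name
        wrapper.delay = wrapper  # real RQ would enqueue and return a Job object
        return wrapper
    return decorator


@job('default')
def demo_task(x):
    return x * 2
```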
<p>Now, let's explore step-by-step how the whole pipeline works.</p>
<h1 id="heading-youtube-monitor">YouTube monitor</h1>
<p>To work with YouTube, there is a nice library called <a target="_blank" href="https://github.com/pytube/pytube">pytube</a>. Unfortunately, it has not been actively maintained recently and crashes when working with channels. So I found another library that implements the same functionality with the same interfaces - <a target="_blank" href="https://github.com/JuanBindez/pytubefix">pytubefix</a>.</p>
<p>There are 3 functions in the YouTube Monitor component of the pipeline.</p>
<p><code>parse_all_channels</code> - loops through enabled channels and creates tasks to parse each of them:</p>
<pre><code class="lang-python"><span class="hljs-meta">@job('default')</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">parse_all_channels</span>():</span>
    print(<span class="hljs-string">'running parse all channels'</span>)

    <span class="hljs-keyword">for</span> channel <span class="hljs-keyword">in</span> YoutubeChannel.objects.filter(enabled=<span class="hljs-literal">True</span>):
        parse_channel.delay(channel)
</code></pre>
<p><code>parse_channel</code> - processes the videos of a particular channel. For simplicity, the current implementation analyzes only the most recent video. It checks whether that video is already in the database; if not, it creates a <code>YoutubeVideo</code> record and runs the pipeline on it:</p>
<pre><code class="lang-python"><span class="hljs-meta">@job('default')</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">parse_channel</span>(<span class="hljs-params">channel: YoutubeChannel</span>):</span>
    print(<span class="hljs-string">f'running parse channel <span class="hljs-subst">{str(channel)}</span>'</span>)

    yt_channel = Channel(channel.url)
    channel.title = yt_channel.channel_name
    channel.save()

    <span class="hljs-keyword">try</span>:
        last_video = yt_channel.videos[<span class="hljs-number">0</span>]
    <span class="hljs-keyword">except</span> IndexError:
        <span class="hljs-keyword">return</span>

    <span class="hljs-keyword">if</span> YoutubeVideo.objects.filter(channel=channel, youtube_id=last_video.video_id).exists():
        <span class="hljs-keyword">return</span>

    new_video = YoutubeVideo.objects.create(
        channel=channel,
        url=last_video.watch_url,
        youtube_id=last_video.video_id,
        title=last_video.title,
    )

    channel.last_parsed_at = timezone.now()
    channel.save()

    parse_video.delay(new_video)
</code></pre>
<p><code>parse_video</code> - fetches and saves the audio stream with the highest bitrate:</p>
<pre><code class="lang-python"><span class="hljs-meta">@job('default')</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">parse_video</span>(<span class="hljs-params">video: YoutubeVideo</span>):</span>
    print(<span class="hljs-string">f'running parse video <span class="hljs-subst">{str(video)}</span>'</span>)

    yt_video = YouTube(video.url)

    video.youtube_id = yt_video.video_id
    video.title = yt_video.title
    video.save()

    audio_streams = [stream <span class="hljs-keyword">for</span> stream <span class="hljs-keyword">in</span> yt_video.streams <span class="hljs-keyword">if</span> stream.type == <span class="hljs-string">'audio'</span>]
    audio_streams = sorted(audio_streams, key=<span class="hljs-keyword">lambda</span> s: s.bitrate, reverse=<span class="hljs-literal">True</span>)
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> audio_streams:
        <span class="hljs-keyword">return</span>

    stream: Stream = audio_streams[<span class="hljs-number">0</span>]
    buffer = io.BytesIO()
    stream.stream_to_buffer(buffer)
    video.audio_file.save(<span class="hljs-string">f'<span class="hljs-subst">{video.youtube_id}</span>.<span class="hljs-subst">{stream.subtype}</span>'</span>, buffer)
    video.save()

    transcribe_video.delay(video)
</code></pre>
<h1 id="heading-transcription">Transcription</h1>
<p>The next important stage is audio transcription. Some libraries let you work with YouTube-generated captions, but I find that <a target="_blank" href="https://github.com/openai/whisper">OpenAI's Whisper</a> gives better results. I used the self-hosted version with the "medium" model. OpenAI also provides an <a target="_blank" href="https://platform.openai.com/docs/guides/speech-to-text">API</a> for this if you want to avoid heavy local computation.</p>
<p>The process is pretty straightforward. The only thing to point out is the job timeout; I set it to 1 hour. It usually takes 15-20 minutes to transcribe a 2-hour video on my personal MSI GF65 Thin laptop.</p>
<pre><code class="lang-python"><span class="hljs-meta">@job('ai', timeout=60 * 60)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">transcribe_video</span>(<span class="hljs-params">video: YoutubeVideo</span>):</span>
    print(<span class="hljs-string">f'running transcribe video <span class="hljs-subst">{str(video)}</span>'</span>)

    model = whisper.load_model(<span class="hljs-string">'medium'</span>)
    result = model.transcribe(video.audio_file.path)

    video.transcription_language = result[<span class="hljs-string">'language'</span>]
    video.transcription = result[<span class="hljs-string">'text'</span>]
    video.save()

    summarize_video.delay(video)
</code></pre>
<h1 id="heading-summary">Summary</h1>
<p>For summarization, I initially tried to use open-source LLMs. Unfortunately, my laptop is too weak for that and cannot run even a quantized LLaMA 2 or Mistral 7B. So I gave up and switched to ChatGPT's API.</p>
<p>The core idea of the process is to split the input text into sentences (using the <code>nltk.tokenize</code> package) and form blocks of sentences until the context window limit is reached. Each block is then summarized separately, with a 2-sentence overlap between blocks.</p>
<p>For the resulting summary, I concatenate the outputs from all the blocks. Some algorithms also summarize the combined outputs, but I found that plain concatenation works better, since the key moments rarely overlap.</p>
<p>This algorithm is quite simple but assumes that the initial transcription is correctly split into sentences. It can be presented as the following sequence of steps:</p>
<ol>
<li><p>Split the text into a list of sentences</p>
</li>
<li><p>Group sentences into slightly overlapping blocks. Each block should fit within the LLM context window</p>
</li>
<li><p>Summarize each block using LLM and a special prompt</p>
</li>
<li><p>Join the resulting outputs into the final summary</p>
</li>
</ol>
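<p>Steps 1-2 can be sketched as follows. This is a minimal illustration, not the repository code: sentence splitting is assumed to have already happened, and the size budget is measured in characters here rather than tokens.</p>

```python
def build_blocks(sentences, max_chars=1000, overlap=2):
    """Group sentences into blocks that fit a size budget, repeating the
    last `overlap` sentences at each block boundary so no key moment is
    lost when blocks are summarized independently."""
    blocks, current, size = [], [], 0
    for sentence in sentences:
        if current and size + len(sentence) > max_chars:
            blocks.append(current)
            current = current[-overlap:]  # carry the overlap into the next block
            size = sum(len(s) for s in current)
        current.append(sentence)
        size += len(sentence)
    if current:
        blocks.append(current)
    return blocks
```

Each block is then joined into a text, sent to the LLM with the summarization prompt, and the per-block outputs are concatenated.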
<p>The code and LLM prompts can be found in the <a target="_blank" href="https://github.com/hypnocapybara/yt_summarizer/blob/main/apps/main/summary/openai.py">related module in my repository</a>.</p>
<h1 id="heading-voicing">Voicing</h1>
<p>My core idea was that people want to hear the voice of their favorite blogger. That's why I focused on tools that provide voice cloning. Several libraries can do that: for example, <a target="_blank" href="https://www.microsoft.com/en-us/research/project/vall-e-x/">VALL-E</a> looks promising, and it would be interesting to try it later. I ended up using <a target="_blank" href="https://github.com/coqui-ai/TTS">Coqui TTS</a>. I tried their studio, and the internal models look really awesome! Even the open-source models are pretty good and produce quality results.</p>
<p>For simplicity, I uploaded the speaker prompt to the <code>YoutubeChannel</code> model, but it could also be extracted from the audio stream, based on the intervals returned by Whisper.</p>
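<p>For illustration, extracting such a speaker sample automatically might look like this. The helper is hypothetical (not in the repo); it assumes an uncompressed WAV file and Whisper-style segments with <code>start</code>/<code>end</code> times in seconds, using only the stdlib <code>wave</code> module.</p>

```python
import wave


def extract_speaker_sample(audio_path, segments, out_path, target_seconds=10.0):
    """Copy speech intervals (Whisper-style {'start': s, 'end': e} dicts,
    in seconds) from a WAV file until ~target_seconds are collected."""
    with wave.open(audio_path, 'rb') as src:
        rate = src.getframerate()
        with wave.open(out_path, 'wb') as dst:
            dst.setparams(src.getparams())  # same channels/width/rate
            collected = 0.0
            for seg in segments:
                if collected >= target_seconds:
                    break
                src.setpos(int(seg['start'] * rate))
                n_frames = int((seg['end'] - seg['start']) * rate)
                dst.writeframes(src.readframes(n_frames))
                collected += seg['end'] - seg['start']
```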
<p>The resulting function:</p>
<pre><code class="lang-python"><span class="hljs-meta">@job('ai', timeout=20 * 60)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">voice_summary</span>(<span class="hljs-params">video: YoutubeVideo</span>):</span>
    <span class="hljs-keyword">from</span> apps.telegram.tasks <span class="hljs-keyword">import</span> send_video_notifications

    print(<span class="hljs-string">f'running voice summary <span class="hljs-subst">{str(video)}</span>'</span>)

    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> video.summary:
        <span class="hljs-keyword">return</span>

    device = <span class="hljs-string">'cuda'</span> <span class="hljs-keyword">if</span> torch.cuda.is_available() <span class="hljs-keyword">else</span> <span class="hljs-string">'cpu'</span>
    tts = TTS(<span class="hljs-string">'tts_models/multilingual/multi-dataset/xtts_v2'</span>).to(device)
    _fd, temp_filename = tempfile.mkstemp()
    <span class="hljs-keyword">try</span>:
        tts.tts_to_file(
            text=video.summary,
            speaker_wav=video.channel.voice_file.path,
            language=video.transcription_language,
            file_path=temp_filename,
        )
        <span class="hljs-keyword">with</span> open(temp_filename, <span class="hljs-string">'rb'</span>) <span class="hljs-keyword">as</span> file:
            video.voiced_summary.save(<span class="hljs-string">f'<span class="hljs-subst">{video.youtube_id}</span>.wav'</span>, file)
            video.save()
    <span class="hljs-keyword">finally</span>:
        os.remove(temp_filename)

    send_video_notifications.delay(video)
</code></pre>
<h1 id="heading-sending-results">Sending results</h1>
<p>Now we have all the data; the last step is to deliver it to the user. For that purpose, I am using the <a target="_blank" href="https://aiogram.dev/">aiogram</a> library. For more info on basic bot setup, feel free to check my <a target="_blank" href="https://hashnode.com/post/cl5or5uhy0333hanvghpc2cg9">previous article</a>.</p>
<p>The tricky part is how to run an async bot in the sync-by-default Django environment and, vice versa, how to execute sync Django ORM commands inside the async bot.</p>
<p>So the bot entry point is the management command <code>python manage.py run_bot</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> asyncio

<span class="hljs-keyword">from</span> django.core.management <span class="hljs-keyword">import</span> BaseCommand
<span class="hljs-keyword">from</span> django.conf <span class="hljs-keyword">import</span> settings

<span class="hljs-keyword">from</span> apps.telegram.bot <span class="hljs-keyword">import</span> run_bot


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Command</span>(<span class="hljs-params">BaseCommand</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handle</span>(<span class="hljs-params">self, *args, **options</span>):</span>
        asyncio.run(run_bot(settings.TELEGRAM_BOT_TOKEN))
</code></pre>
<p>The implementation of the async <code>run_bot</code> is also quite simple:</p>
<pre><code class="lang-python"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">run_bot</span>(<span class="hljs-params">token: str</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    dp = Dispatcher()

    dp.message(CommandStart())(command_start_handler)
    dp.message()(echo_handler)

    bot = Bot(token, parse_mode=ParseMode.HTML)
    dp.message.middleware(UserMiddleware())
    <span class="hljs-keyword">await</span> dp.start_polling(bot)
</code></pre>
<p>Since the service's purpose is mostly to push information from the server to users, any handler is enough to register a user. <code>TelegramUser</code> creation happens in the <code>UserMiddleware</code>:</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">UserMiddleware</span>(<span class="hljs-params">BaseMiddleware</span>):</span>
    <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__call__</span>(<span class="hljs-params">
        self,
        handler: Callable[[Message, Dict[str, Any]], Awaitable[Any]],
        event: Message,
        data: Dict[str, Any]
    </span>) -&gt; Any:</span>
        data[<span class="hljs-string">'user'</span>] = <span class="hljs-keyword">await</span> TelegramUser.get_or_create_from_telegram_user(event.from_user)
        <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> handler(event, data)
</code></pre>
<p><code>get_or_create_from_telegram_user</code> uses the new asynchronous Django ORM functions <code>aget</code> and <code>acreate</code>, which allow you to execute database queries from an async context. Now it is enough to send the "/start" command to the bot, and the user will be created in the database. After that, the bot can send notifications to that user.</p>
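<p>The general pattern behind this bridging can be shown with the stdlib alone. The sketch below uses <code>asyncio.to_thread</code> instead of Django's actual <code>aget</code>/<code>acreate</code> (an assumption to keep the example self-contained): a blocking call is pushed to a worker thread so it doesn't stall the bot's event loop. All names here are hypothetical.</p>

```python
import asyncio


# Stand-in for a blocking ORM call like TelegramUser.objects.get_or_create(...)
def blocking_get_or_create(telegram_id):
    return {'telegram_id': telegram_id, 'created': True}


async def handle_start(telegram_id):
    # Run the sync function in a worker thread; the event loop stays free
    # to process other bot updates while the "database query" executes.
    user = await asyncio.to_thread(blocking_get_or_create, telegram_id)
    return user


result = asyncio.run(handle_start(42))
```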
<h1 id="heading-running-amp-getting-a-result">Running &amp; getting a result</h1>
<p>The most exciting part: seeing how every cog comes together into a mechanism and it all starts working! Several processes need to be running for the pipeline:</p>
<ul>
<li><p>Web server with an admin panel</p>
<ul>
<li><code>python manage.py runserver</code></li>
</ul>
</li>
<li><p>RQ workers for AI tasks</p>
<ul>
<li><code>python manage.py rqworker ai</code></li>
</ul>
</li>
<li><p>RQ workers for other tasks</p>
<ul>
<li><code>python manage.py rqworker default</code></li>
</ul>
</li>
<li><p>RQ scheduler</p>
<ul>
<li><code>python manage.py rqscheduler</code></li>
</ul>
</li>
</ul>
<p>Here is an example of running processes and consumed resources:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1703814753895/3ae78b20-c95a-4427-aef2-95f86095bbbb.png" alt class="image--center mx-auto" /></p>
<p>And an example of the resulting message (using GPT-4 for the summary):</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1703730388088/91cbf505-8ba7-44a0-87f3-135d66f6e779.png" alt class="image--center mx-auto" /></p>
<h1 id="heading-conclusion-amp-future-improvements">Conclusion &amp; future improvements</h1>
<p>In the end, I got a working prototype that can easily be Dockerized and run as a personal service. With so many models available online, you might not even need to run heavy computation yourself, so any VPS can handle the simple API calls.</p>
<p>The biggest problem I see now is the quality of the summarized key moments. During testing, I realized that GPT-3.5 can mix up facts, so you cannot fully trust the result to be 100% accurate, especially for daily news or current events. GPT-4 handles this much better, and the resulting quality is quite impressive. The <code>gpt-4-1106-preview</code> model has a context window of 128k tokens, which is enough to fit almost any video in one chunk. But it is expensive - it cost me about $0.20 per API call for a medium-size transcription. So it might be necessary to experiment more with prompts or try different LLMs.</p>
<p>Also, besides summary accuracy, to make the service fully production-ready, I would improve things like:</p>
<ul>
<li><p>Add mutexes to tasks</p>
</li>
<li><p>Add error handling to tasks</p>
</li>
<li><p>Dockerize the project</p>
</li>
<li><p>Use webhooks instead of polling for the Telegram bot</p>
</li>
</ul>
<p>Regardless, this working prototype is an interesting playground. The source code is available <a target="_blank" href="https://github.com/hypnocapybara/yt_summarizer">on GitHub</a>.</p>
<p>I hope you enjoyed reading this!</p>
<h2 id="heading-disclaimer">Disclaimer</h2>
<p>This article and the content herein, including code, example processes, and the use of AI tools for voice cloning, are intended for personal and educational purposes only. Unauthorized distribution of generated voice-cloned content without the author's explicit permission is strictly prohibited. Before using any of the mentioned tools in commercial products, consult their license agreements. All credit goes to the rightful owners.</p>
]]></content:encoded></item><item><title><![CDATA[Cell colony game in C++]]></title><description><![CDATA[Intro
I started my programming path from the book "C++ for Dummies". Then I solved programming tasks and made my first steps in different computer science areas using this language.
I have my favorite game: "Half-Life 2", and during high school / fir...]]></description><link>https://alexcoder.dev/cell-colony-game-in-c</link><guid isPermaLink="true">https://alexcoder.dev/cell-colony-game-in-c</guid><category><![CDATA[C++]]></category><category><![CDATA[Game Development]]></category><category><![CDATA[Games]]></category><category><![CDATA[Physics]]></category><dc:creator><![CDATA[Alexander Polev]]></dc:creator><pubDate>Mon, 09 Jan 2023 04:01:07 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1673227686979/64d2e9d0-080a-4cf9-ba43-d25d5bd1f152.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-intro">Intro</h2>
<p>I started my programming path with the book "C++ for Dummies". Then I solved programming tasks and made my first steps in different computer science areas using this language.</p>
<p>My favorite game is "Half-Life 2", and during high school and my first year at university, I tried making mods for it. Because I was really bad at modelling and level design, I practiced modifying the game logic, adding new entities and implementing features in the game code. The C++ code of the "Source Engine" (the game engine developed by Valve and used in the Half-Life games) was a really nice example of best practices in object-oriented programming. I was very impressed with how logical and straightforward that code is; after the typical examples in programming books and articles, it was an absolutely new perspective. I realized that code could be abstract, human-readable, and simple, and could operate with objects representing real-world entities instead of just variables.</p>
<p>That's why I feel really warm about C++, game development / simulations, and graphics visualization. And in this article, I will implement and describe a simple game about cell colonies.</p>
<h2 id="heading-game-rules">Game rules</h2>
<p>The idea for the game comes from John Conway's Game of Life, Spore, and the nature of bacterial colonies.</p>
<p><img src="https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fs41467-018-06370-3/MediaObjects/41467_2018_6370_Fig1_HTML.png" alt="Fig. 1" /></p>
<p>There are 2 objects: a food base and a cell / bacteria.</p>
<p>The food base has a limited supply and a limited number of cells that can feed from it. It has an "active radius": a cell can take food from the base only while within this radius. The limited capacity and amount of food motivate cells to expand the colony and move toward other bases.</p>
<p>A cell / bacterium has these features:</p>
<ul>
<li><p>A limited lifetime, after which the cell dies</p>
</li>
<li><p>Limited health; after enough attacks, the cell dies</p>
</li>
<li><p>After eating a needed amount of food, the cell splits, creating 2 new cells</p>
</li>
<li><p>When hungry, the cell moves to a nearby food base</p>
</li>
<li><p>When it sees another colony's cells, it attacks them</p>
</li>
<li><p>Collides with other cells</p>
</li>
</ul>
<p>So, generally, the cell is a basic unit, like in any RTS game.</p>
<h2 id="heading-objects">Objects</h2>
<p>Let's finally put these rules into code.</p>
<p>Since both game entities share common properties (a position and a link to the world), these can go into a base class <code>Entity</code>:</p>
<pre><code class="lang-cpp"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Entity</span> {</span>
    <span class="hljs-keyword">public</span>:
    Entity(World* world, Point2 pos);
    <span class="hljs-function"><span class="hljs-keyword">virtual</span> <span class="hljs-keyword">void</span> <span class="hljs-title">Process</span><span class="hljs-params">()</span> </span>= <span class="hljs-number">0</span>;
    <span class="hljs-function"><span class="hljs-keyword">inline</span> Point2 <span class="hljs-title">GetPosition</span><span class="hljs-params">()</span></span>;

    <span class="hljs-keyword">protected</span>:
    Point2 position;
    World* world;

    <span class="hljs-keyword">friend</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">World</span>;</span>
};
</code></pre>
<p>The <code>Process</code> function is triggered by the world update and should handle the object's internal state changes.</p>
<p>For food, we need to describe the amount and radius, and to track the cells that enter or leave the food base:</p>
<pre><code class="lang-cpp"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Food</span> :</span> <span class="hljs-keyword">public</span> Entity {
    <span class="hljs-keyword">public</span>:
    Food(World* world, Point2 pos, <span class="hljs-keyword">int</span> maxAmount, <span class="hljs-keyword">float</span> radius, <span class="hljs-keyword">int</span> maxCellsCount);

    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">Process</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">Die</span><span class="hljs-params">()</span></span>;

    <span class="hljs-function"><span class="hljs-keyword">inline</span> <span class="hljs-keyword">float</span> <span class="hljs-title">GetRadius</span><span class="hljs-params">()</span></span>;

    <span class="hljs-function"><span class="hljs-keyword">inline</span> <span class="hljs-keyword">int</span> <span class="hljs-title">GetMaxCellsCount</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">inline</span> <span class="hljs-keyword">int</span> <span class="hljs-title">GetCurrentCellsCount</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">inline</span> <span class="hljs-keyword">bool</span> <span class="hljs-title">HasFreeSpots</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">inline</span> <span class="hljs-keyword">bool</span> <span class="hljs-title">HasAmountAvailable</span><span class="hljs-params">()</span></span>;

    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">IsCellInActiveZone</span><span class="hljs-params">(Cell* cell)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">HasAvailableAmount</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">HandleEaten</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">CellLeave</span><span class="hljs-params">(Cell* cell)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">CellOccupy</span><span class="hljs-params">(Cell* cell)</span></span>;

    <span class="hljs-keyword">protected</span>:
    <span class="hljs-keyword">int</span> currentAmount = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">int</span> maxAmount = <span class="hljs-number">0</span>;
    <span class="hljs-built_in">std</span>::<span class="hljs-built_in">vector</span>&lt;Cell*&gt; cells;
    <span class="hljs-keyword">int</span> maxCellsCount = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">float</span> radius = <span class="hljs-number">0</span>;

    <span class="hljs-keyword">friend</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">World</span>;</span>
};
</code></pre>
<p>Cells need functions and variables to track health, starvation, attacks, splitting, etc. So the class declaration is bigger, but the implementation of each function should be simple and obvious. All this logic is assembled piece by piece, which is a nice benefit of abstract programming:</p>
<pre><code class="lang-cpp"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Cell</span> :</span> <span class="hljs-keyword">public</span> Entity {
    <span class="hljs-keyword">public</span>:
    Cell(World* world, Point2 pos, <span class="hljs-keyword">int</span> ownerId, <span class="hljs-built_in">std</span>::<span class="hljs-built_in">map</span>&lt;<span class="hljs-built_in">std</span>::<span class="hljs-built_in">string</span>, <span class="hljs-built_in">std</span>::<span class="hljs-built_in">string</span>&gt; params);

    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">Process</span><span class="hljs-params">()</span></span>;

    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">CanSplit</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">Split</span><span class="hljs-params">()</span></span>;

    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">IsTooHungry</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">IsTooOld</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">inline</span> <span class="hljs-keyword">bool</span> <span class="hljs-title">IsOutOfHealth</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">Die</span><span class="hljs-params">()</span></span>;

    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">IsInCooldownFromAttack</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">CanAttack</span><span class="hljs-params">(Cell* otherCell)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">Attack</span><span class="hljs-params">(Cell* otherCell)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">TakeDamage</span><span class="hljs-params">(<span class="hljs-keyword">float</span> amount)</span></span>;

    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">CanEat</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">Eat</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">OccupyFoodBase</span><span class="hljs-params">(Food* food)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">LeaveFoodBase</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">IsWithinFoodBase</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">IsInCooldownFromFeed</span><span class="hljs-params">()</span></span>;

    <span class="hljs-function"><span class="hljs-keyword">inline</span> <span class="hljs-keyword">float</span> <span class="hljs-title">GetRadius</span><span class="hljs-params">()</span> </span>{<span class="hljs-keyword">return</span> <span class="hljs-keyword">this</span>-&gt;radius;}
    <span class="hljs-function"><span class="hljs-keyword">inline</span> <span class="hljs-keyword">int</span> <span class="hljs-title">GetUserId</span><span class="hljs-params">()</span> </span>{<span class="hljs-keyword">return</span> <span class="hljs-keyword">this</span>-&gt;userId;}

    <span class="hljs-comment">// === Should go to AI module ===</span>
    <span class="hljs-keyword">enum</span> Intention {
        Nothing,
        WannaFeed,
        WannaAttack,
        Patrolling
    };

    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">FormDecission</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">HasReachedThePoi</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">MoveToPoint</span><span class="hljs-params">(Point2 poi, <span class="hljs-keyword">float</span> poiRadius)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">StopActivity</span><span class="hljs-params">()</span></span>;

    <span class="hljs-keyword">protected</span>:
    <span class="hljs-comment">// eat</span>
    Food* feedBase = <span class="hljs-literal">nullptr</span>;

    <span class="hljs-comment">// health</span>
    <span class="hljs-keyword">float</span> healthCurrent = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">float</span> healthMax = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">float</span> birthTime = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">float</span> lifetime = <span class="hljs-number">0</span>;

    <span class="hljs-comment">// move</span>
    <span class="hljs-keyword">bool</span> inMove = <span class="hljs-literal">false</span>;
    Point2 poi;
    <span class="hljs-keyword">float</span> poiRadius;
    Vector2 velocity;
    <span class="hljs-keyword">float</span> speed = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">float</span> radius = <span class="hljs-number">0</span>;

    <span class="hljs-comment">// feed &amp; fission variables</span>
    <span class="hljs-comment">// when feedCurrent reaches feedMax - self-duplicate</span>
    <span class="hljs-keyword">int</span> feedCurrent = <span class="hljs-number">0</span>, feedMax = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">float</span> lastFeedTime = <span class="hljs-number">0</span>; <span class="hljs-comment">// last global time when the cell fed</span>
    <span class="hljs-keyword">float</span> feedCooldown = <span class="hljs-number">0</span>; <span class="hljs-comment">// cooldown between feed actions (when cell cannot act)</span>
    <span class="hljs-keyword">float</span> feedInterval = <span class="hljs-number">0</span>; <span class="hljs-comment">// minimum interval between feed actions</span>
    <span class="hljs-keyword">float</span> maxTimeWithoutFood = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">float</span> foodDetectRadius = <span class="hljs-number">0</span>;

    <span class="hljs-comment">// attack variables</span>
    <span class="hljs-keyword">float</span> attackRange = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">float</span> attackPower = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">float</span> lastAttackTime = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">float</span> attackCooldown = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">float</span> enemiesDetectRadius = <span class="hljs-number">0</span>;

    <span class="hljs-comment">// belonging</span>
    <span class="hljs-keyword">int</span> userId = <span class="hljs-number">0</span>;

    <span class="hljs-comment">// === Should go to AI module ===</span>
    Intention intention;

    <span class="hljs-keyword">friend</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">World</span>;</span>
};
</code></pre>
<p>Ideally, all the decision-making parts should go into a separate AI class that handles only the intention-forming logic.</p>
<p>It is a lot of code, but just look at the <code>Process</code> function; it is remarkably abstract and human-readable:</p>
<pre><code class="lang-cpp"><span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">Cell::Process</span><span class="hljs-params">()</span> </span>{
    <span class="hljs-keyword">if</span> (<span class="hljs-keyword">this</span>-&gt;feedBase &amp;&amp; !<span class="hljs-keyword">this</span>-&gt;IsWithinFoodBase()) {
        <span class="hljs-keyword">this</span>-&gt;LeaveFoodBase();
    }

    <span class="hljs-keyword">if</span> (<span class="hljs-keyword">this</span>-&gt;IsOutOfHealth() || <span class="hljs-keyword">this</span>-&gt;IsTooHungry() || <span class="hljs-keyword">this</span>-&gt;IsTooOld()) {
        <span class="hljs-keyword">return</span> <span class="hljs-keyword">this</span>-&gt;Die();
    }

    <span class="hljs-keyword">if</span> (<span class="hljs-keyword">this</span>-&gt;IsInCooldownFromAttack() || <span class="hljs-keyword">this</span>-&gt;IsInCooldownFromFeed()) {
        <span class="hljs-keyword">this</span>-&gt;inMove = <span class="hljs-literal">false</span>;
        <span class="hljs-keyword">return</span>;
    }

    <span class="hljs-keyword">if</span> (<span class="hljs-keyword">this</span>-&gt;CanSplit()) {
        <span class="hljs-keyword">return</span> <span class="hljs-keyword">this</span>-&gt;Split();
    }

    <span class="hljs-keyword">this</span>-&gt;FormDecission();
}
</code></pre>
<p>And the main object is <code>World</code>: just a container for cells and food that runs the simulation and implements functions to manipulate it (but not to render it! Keeping those two processes separate is important).</p>
<pre><code class="lang-cpp"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">World</span> {</span>
    <span class="hljs-keyword">public</span>:
    World(<span class="hljs-keyword">float</span> width, <span class="hljs-keyword">float</span> height);
    <span class="hljs-function"><span class="hljs-keyword">inline</span> <span class="hljs-keyword">float</span> <span class="hljs-title">GetCurrentTime</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">Step</span><span class="hljs-params">(<span class="hljs-keyword">float</span> delta)</span></span>;

    <span class="hljs-function">Cell* <span class="hljs-title">CreateBacteria</span><span class="hljs-params">(Point2 position, <span class="hljs-keyword">int</span> ownerId)</span></span>;
    <span class="hljs-function">Food* <span class="hljs-title">CreateFood</span><span class="hljs-params">(Point2 position, <span class="hljs-keyword">int</span> maxAmount, <span class="hljs-keyword">float</span> radius, <span class="hljs-keyword">int</span> maxCellsCount)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">DestroyCell</span><span class="hljs-params">(Cell* cell)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">DestroyFood</span><span class="hljs-params">(Food* food)</span></span>;

    <span class="hljs-function"><span class="hljs-keyword">const</span> <span class="hljs-built_in">std</span>::<span class="hljs-built_in">vector</span>&lt;Cell*&gt;&amp; <span class="hljs-title">GetCells</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">const</span> <span class="hljs-built_in">std</span>::<span class="hljs-built_in">vector</span>&lt;Food*&gt;&amp; <span class="hljs-title">GetFood</span><span class="hljs-params">()</span></span>;

    <span class="hljs-keyword">protected</span>:
    <span class="hljs-keyword">float</span> width, height;

    <span class="hljs-built_in">std</span>::<span class="hljs-built_in">vector</span>&lt;Cell*&gt; cells;
    <span class="hljs-built_in">std</span>::<span class="hljs-built_in">vector</span>&lt;Food*&gt; food;

    <span class="hljs-keyword">float</span> currentTime = <span class="hljs-number">0</span>;
};
</code></pre>
<p>The main idea here is to have a dedicated class that runs the simulation, completely separated from rendering. Potentially, it could even be used to train a neural network with a reinforcement learning algorithm to drive cell decisions.</p>
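<p>To make that separation concrete, here is a minimal sketch (not the actual repo code; <code>World</code> is reduced to a stub) of how a driver loop can advance the simulation in fixed steps while knowing nothing about rendering:</p>
<pre><code class="lang-cpp">#include &lt;cstdio&gt;

// Minimal stub standing in for the real World class from the article.
struct World {
    float currentTime = 0.0f;
    void Step(float delta) { currentTime += delta; }
};

int main() {
    World world;
    const float dt = 1.0f / 60.0f; // fixed simulation timestep

    // The driver (a renderer, or a headless RL training loop) only calls
    // Step() and reads the state back; World never touches drawing code.
    for (int frame = 0; frame &lt; 60; ++frame)
        world.Step(dt);

    std::printf("simulated %.2f s\n", world.currentTime);
    return 0;
}
</code></pre>
<p>The same loop works unchanged whether the state is drawn every frame or consumed by a training process.</p>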
<p>Basic cell collisions were also implemented. In real usage, it is better to delegate that logic to a dedicated physics engine (such as <a target="_blank" href="https://box2d.org/">box2d</a>), but for a simple demo it is fine to recall some basic 2D math and resolve collisions manually.</p>
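<p>For reference, that "simple 2D math" boils down to this: two circles overlap when the distance between their centers is less than the sum of their radii, and the cheapest resolution is to push both circles apart along the center-to-center axis. A self-contained sketch (hypothetical helper names, not the exact repo code):</p>
<pre><code class="lang-cpp">#include &lt;cmath&gt;
#include &lt;cstdio&gt;

struct Vec2 { float x, y; };

// Overlap test: compare squared distance with the squared radii sum
// to avoid a sqrt on the hot path.
bool Overlaps(Vec2 a, float ra, Vec2 b, float rb) {
    float dx = b.x - a.x, dy = b.y - a.y;
    float r = ra + rb;
    return dx * dx + dy * dy &lt; r * r;
}

// Resolution: move each circle away by half of the penetration depth
// along the unit normal between the centers.
void Resolve(Vec2&amp; a, float ra, Vec2&amp; b, float rb) {
    float dx = b.x - a.x, dy = b.y - a.y;
    float dist = std::sqrt(dx * dx + dy * dy);
    if (dist == 0.0f || dist &gt;= ra + rb) return; // coincident or no overlap
    float push = 0.5f * (ra + rb - dist);
    float nx = dx / dist, ny = dy / dist;
    a.x -= nx * push; a.y -= ny * push;
    b.x += nx * push; b.y += ny * push;
}

int main() {
    Vec2 a{0.0f, 0.0f}, b{1.0f, 0.0f};
    std::printf("%d\n", Overlaps(a, 1.0f, b, 1.0f)); // prints 1
    Resolve(a, 1.0f, b, 1.0f);
    std::printf("%.2f %.2f\n", a.x, b.x); // prints -0.50 1.50
    return 0;
}
</code></pre>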
<p>The article could even end here, but a game needs user input to manipulate the world, and the world needs to be rendered.</p>
<h2 id="heading-render">Render</h2>
<p>When it comes to rendering hardware-accelerated graphics, you usually have to write a lot of extra code before you can actually draw anything, such as:</p>
<ul>
<li><p>Code to create a window and obtain an OpenGL context</p>
</li>
<li><p>OpenGL code to bind vertex buffers, set up vertex and fragment shaders, map coordinates, and issue many other instructions</p>
</li>
</ul>
<p>That's why I wanted a library that encapsulates all this logic and provides a simple API to draw objects and handle keyboard and mouse input. As a nice bonus, it should be cross-platform (including the web). After a short search, I found a perfectly matching library: <a target="_blank" href="https://bernhardfritz.github.io/piksel/#/">piksel</a>.</p>
<p>All you need to do is inherit from the <code>BaseApp</code> class and implement the required functions:</p>
<pre><code class="lang-cpp"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">App</span> :</span> <span class="hljs-keyword">public</span> piksel::BaseApp {
<span class="hljs-keyword">public</span>:
    App(World* w, <span class="hljs-keyword">int</span> width, <span class="hljs-keyword">int</span> height);
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">setup</span><span class="hljs-params">()</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">draw</span><span class="hljs-params">(piksel::Graphics&amp; g)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">keyPressed</span><span class="hljs-params">(<span class="hljs-keyword">int</span> keycode)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">keyReleased</span><span class="hljs-params">(<span class="hljs-keyword">int</span> keycode)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">mousePressed</span><span class="hljs-params">(<span class="hljs-keyword">int</span> code)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">mouseMoved</span><span class="hljs-params">(<span class="hljs-keyword">int</span> x, <span class="hljs-keyword">int</span> y)</span></span>;

<span class="hljs-keyword">protected</span>:
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">ProcessCamera</span><span class="hljs-params">(<span class="hljs-keyword">float</span> step)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">bool</span> <span class="hljs-title">IsSeenByCamera</span><span class="hljs-params">(<span class="hljs-keyword">const</span> Point2&amp; point)</span></span>;

    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">DrawCells</span><span class="hljs-params">(piksel::Graphics&amp; g)</span></span>;
    <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">DrawFood</span><span class="hljs-params">(piksel::Graphics&amp; g)</span></span>;

    World* world;
    <span class="hljs-built_in">std</span>::chrono::system_clock::time_point lastUpdateTime;
    Point2 cameraPos;
    <span class="hljs-keyword">int</span> pressedKeys;
    <span class="hljs-keyword">int</span> mouseX, mouseY;
};
</code></pre>
<p>To draw primitives, you can use simple functions like <code>g.rect</code>, <code>g.ellipse</code>, etc. To keep things simple, let's draw cells as circles (in different colors) and food as white rectangles.</p>
<p>As a result, we get epic battles between cell colonies:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673234747873/3e08f8d1-d234-4c40-b90d-5469c7a22892.gif" alt class="image--center mx-auto" /></p>
<p>So the resulting project structure is as follows:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673236007327/9f781339-b2fb-4b46-8b1f-25f714470d60.png" alt class="image--center mx-auto" /></p>
<p>The source code is available on <a target="_blank" href="https://github.com/hypnocapybara/cells">GitHub</a>.</p>
<p>When I find some time, I may continue developing the game. There are several possible directions:</p>
<ul>
<li><p>Use box2d as a physics engine (it would be a nice experience)</p>
</li>
<li><p>Experiment more with the cell and food parameters to find the best values</p>
</li>
<li><p>Use sprites for game entities</p>
</li>
<li><p>Implement a UI with cell selection and control (like in RTS games)</p>
</li>
<li><p>Implement different levels with complex terrain and a pathfinding algorithm so cells can navigate it</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Asynchronous Telegram bot with aiogram and GINO ORM]]></title><description><![CDATA[Telegram
Telegram is a cloud-based mobile and desktop messaging app with a focus on security and speed.
For me, Telegram - is not just a messenger. It is the first app I open in the morning to read the news. And I am making a significant part of dail...]]></description><link>https://alexcoder.dev/asynchronous-telegram-bot-with-aiogram-and-gino-orm</link><guid isPermaLink="true">https://alexcoder.dev/asynchronous-telegram-bot-with-aiogram-and-gino-orm</guid><category><![CDATA[Python]]></category><category><![CDATA[telegram]]></category><category><![CDATA[asynchronous]]></category><category><![CDATA[orm]]></category><dc:creator><![CDATA[Alexander Polev]]></dc:creator><pubDate>Sun, 17 Jul 2022 03:22:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1656989229094/aGd5UuDI-.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-telegram">Telegram</h1>
<p><a target="_blank" href="https://telegram.org/">Telegram</a> is a cloud-based mobile and desktop messaging app with a focus on security and speed.
For me, Telegram is not just a messenger. It is the first app I open in the morning to read the news, and a significant part of my daily communication happens there. Plus, it is a developer-friendly service with an open API, so you can create your own clients and apps. What's more, Telegram's creators encourage this and have published a <a target="_blank" href="https://core.telegram.org/tdlib">library</a> to speed up the process.</p>
<h2 id="heading-bots">Bots</h2>
<p>Bots are third-party applications that are hosted on your own server and run inside Telegram. Users can interact with bots by sending them messages, commands and inline requests. You implement the bot's entire logic and control its interaction with users via HTTPS requests to Telegram's Bot API.
For example, if you have a website and monitor its availability, you can write a simple bot that sends you an alert when the site goes offline.
There is a nice <a target="_blank" href="https://core.telegram.org/bots">article</a> about bots on the official website.</p>
<h2 id="heading-creating-a-bot">Creating a bot</h2>
<p>To create your own bot, you need to message a special bot, <a target="_blank" href="https://t.me/botfather">BotFather</a>.
In this article, we will create a bot that implements a to-do list app. It should:</p>
<ul>
<li>keep users' notes in the database</li>
<li>delete expired items (let's say that one day is the expiry period)</li>
<li>be able to fetch and send the whole list of saved notes</li>
</ul>
<p>So, let's talk to BotFather:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1657161910199/_Rw7rv5ow.png" alt="image.png" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1657162023404/A2SsFz_-k.png" alt="Screenshot from 2022-07-06 22-45-36.png" /></p>
<p>Now we have a bot token, which is everything we need. Let's dive into the code.
Important code listings are provided throughout the article; the link to the whole repo is at the bottom.</p>
<h1 id="heading-gino-orm">GINO ORM</h1>
<p>First, let me introduce you to the other subject of this article - <a target="_blank" href="https://python-gino.org/">GINO ORM</a>.
Gino is a lightweight asynchronous ORM built on top of SQLAlchemy Core for Python asyncio. GINO 1.1 supports PostgreSQL with asyncpg and MySQL with aiomysql.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1657499772556/WHXiGdfP-.png" alt="image.png" /></p>
<p>Create a new poetry project and add the required dependencies:</p>
<pre><code>poetry new hashnode-todo
cd hashnode-todo
poetry add python-dotenv aiogram gino emoji
</code></pre><p>Then let's create <code>models.py</code> (I would also rename the project folder to "bot"):</p>
<pre><code><span class="hljs-keyword">from</span> gino <span class="hljs-keyword">import</span> Gino
<span class="hljs-keyword">from</span> sqlalchemy <span class="hljs-keyword">import</span> func

db = Gino()


<span class="hljs-keyword">class</span> <span class="hljs-keyword">User</span>(db.Model):
    __tablename__ = <span class="hljs-string">'users'</span>

    id = db.<span class="hljs-keyword">Column</span>(db.Integer, primary_key=<span class="hljs-keyword">True</span>)
    telegram_user_id = db.<span class="hljs-keyword">Column</span>(db.String(length=<span class="hljs-number">200</span>))
    created_at = db.<span class="hljs-keyword">Column</span>(db.DateTime(timezone=<span class="hljs-keyword">True</span>), default=func.now())
    first_name = db.<span class="hljs-keyword">Column</span>(db.String(length=<span class="hljs-number">255</span>))
    last_name = db.<span class="hljs-keyword">Column</span>(db.String(length=<span class="hljs-number">255</span>))
    username = db.<span class="hljs-keyword">Column</span>(db.String(length=<span class="hljs-number">255</span>))


<span class="hljs-keyword">class</span> Note(db.Model):
    __tablename__ = <span class="hljs-string">'notes'</span>

    id = db.<span class="hljs-keyword">Column</span>(db.Integer, primary_key=<span class="hljs-keyword">True</span>)
    user_id = db.<span class="hljs-keyword">Column</span>(db.Integer, db.ForeignKey(<span class="hljs-string">'users.id'</span>))
    created_at = db.<span class="hljs-keyword">Column</span>(db.DateTime(timezone=<span class="hljs-keyword">True</span>), default=func.now())
    <span class="hljs-type">text</span> = db.<span class="hljs-keyword">Column</span>(db.Text)
</code></pre><p>Also, we need to define a settings file:</p>
<pre><code><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> dataclasses <span class="hljs-keyword">import</span> dataclass
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv


load_dotenv()


<span class="hljs-meta">@dataclass</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Settings</span>:</span>
    DATABASE_CONNECTION: str
    TELEGRAM_BOT_TOKEN: str


settings = Settings(
    DATABASE_CONNECTION=os.getenv(<span class="hljs-string">'DATABASE_CONNECTION'</span>),
    TELEGRAM_BOT_TOKEN=os.getenv(<span class="hljs-string">'TELEGRAM_BOT_TOKEN'</span>),
)
</code></pre><p>And simple <code>.env</code> file:</p>
<pre><code>DATABASE_CONNECTION<span class="hljs-operator">=</span>postgresql:<span class="hljs-comment">//postgres@localhost/hashnode_todo</span>
TELEGRAM_BOT_TOKEN<span class="hljs-operator">=</span>
</code></pre><p>Now we are ready to work with a database.</p>
<p>For this tutorial, we will initialize the database with a <code>gino.create_all()</code> call. In real usage, the GINO authors recommend the migration tool <a target="_blank" href="https://bitbucket.org/zzzeek/alembic/src/master/">Alembic</a> (created by the author of SQLAlchemy). Alternatively, you could assume that other applications manage the database, so you only define the schema for an already existing database.</p>
<p>The database initialization file:</p>
<pre><code><span class="hljs-keyword">import</span> asyncio

<span class="hljs-keyword">from</span> bot.models <span class="hljs-keyword">import</span> db
<span class="hljs-keyword">from</span> config <span class="hljs-keyword">import</span> settings


<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">main</span>():</span>
    <span class="hljs-keyword">await</span> db.set_bind(settings.DATABASE_CONNECTION)
    <span class="hljs-keyword">await</span> db.gino.create_all()
    <span class="hljs-keyword">await</span> db.pop_bind().close()


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">'__main__'</span>:
    asyncio.get_event_loop().run_until_complete(main())
</code></pre><p>Now you can run it with <code>poetry run python bot/create_models.py</code></p>
<h1 id="heading-gino-queries">GINO queries</h1>
<p>Let's think about the list of functions we are going to need. We know that Telegram always provides a user_id to identify the sender.
So we need:</p>
<ul>
<li>add_note(user, text)</li>
<li>get_notes(user)</li>
<li>process_expired_notes()</li>
</ul>
<p>And here is the simple service code with all the required basic operations:</p>
<pre><code><span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime, timedelta
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List

<span class="hljs-keyword">from</span> bot.models <span class="hljs-keyword">import</span> User, Note


<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">add_note</span>(<span class="hljs-params">user: <span class="hljs-string">'User'</span>, text: str</span>) -&gt; 'Note':</span>
    <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> Note.create(
        user_id=user.id,
        text=text
    )


<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_notes</span>(<span class="hljs-params">user: <span class="hljs-string">'User'</span></span>) -&gt; List['Note']:</span>
    <span class="hljs-keyword">return</span> <span class="hljs-keyword">await</span> Note.query.where(Note.user_id == user.id).gino.all()


<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">process_expired_notes</span>():</span>
    <span class="hljs-keyword">await</span> Note.delete.where(
        Note.created_at &lt; datetime.now() - timedelta(days=<span class="hljs-number">1</span>)
    ).gino.status()
</code></pre><p>Now we have all the database management functionality, and we can use it in the bot.</p>
<h1 id="heading-aiogram">aiogram</h1>
<p><a target="_blank" href="https://github.com/aiogram/aiogram">aiogram</a> is a pretty simple and fully asynchronous framework for Telegram Bot API written in Python 3.7 with asyncio and aiohttp.</p>
<p>If your bot needs to handle many requests, an asynchronous framework is your safety net.</p>
<h2 id="heading-basic-bot">Basic bot</h2>
<p>Let's start with a simple bot that replies with "hello" to your <code>/start</code> command.
The proposed structure is probably not ideal, but it scales well; I use it in some complex projects.</p>
<p>First, we need a base runner file <code>main.py</code> in the project root:</p>
<pre><code><span class="hljs-keyword">from</span> aiogram <span class="hljs-keyword">import</span> executor

<span class="hljs-keyword">from</span> bot.models <span class="hljs-keyword">import</span> db
<span class="hljs-keyword">from</span> bot.main <span class="hljs-keyword">import</span> dispatcher
<span class="hljs-keyword">from</span> config <span class="hljs-keyword">import</span> settings


<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">on_startup</span>(<span class="hljs-params">dispatcher</span>):</span>
    <span class="hljs-keyword">await</span> db.set_bind(settings.DATABASE_CONNECTION)


<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">on_shutdown</span>(<span class="hljs-params">dispatcher</span>):</span>
    <span class="hljs-keyword">await</span> db.pop_bind().close()


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">'__main__'</span>:
    executor.start_polling(
        dispatcher=dispatcher,
        skip_updates=<span class="hljs-literal">True</span>,
        on_startup=on_startup,
        on_shutdown=on_shutdown,
    )
</code></pre><p>And a file for the dispatcher and the dialog imports (<code>bot/main.py</code>):</p>
<pre><code>from aiogram import Bot
from aiogram.dispatcher import Dispatcher
from config import settings


bot = Bot(token=settings.TELEGRAM_BOT_TOKEN)
dispatcher = Dispatcher(bot)
Dispatcher.set_current(dispatcher)


# imported for side effects: registers the dialog handlers
import bot.dialogs
</code></pre><p>Finally, the first dialog file for the "start" command:</p>
<pre><code><span class="hljs-keyword">from</span> aiogram <span class="hljs-keyword">import</span> types
<span class="hljs-keyword">from</span> aiogram.dispatcher <span class="hljs-keyword">import</span> Dispatcher


dp = Dispatcher.get_current()


<span class="hljs-meta">@dp.message_handler(commands=['start'])</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">start_handler</span>(<span class="hljs-params">message: types.Message</span>):</span>
    <span class="hljs-keyword">await</span> message.answer(<span class="hljs-string">'Hello from bot!'</span>)
</code></pre><p>Then just run it with <code>poetry run python main.py</code></p>
<p>Now our simple bot can send the simple answer:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1657945069026/CBqsDa-Wv.png" alt="image.png" /></p>
<h2 id="heading-middlewares">Middlewares</h2>
<p>Middlewares are called for every message, so you can implement throttling or logging, or add variables for all downstream handlers. Let's add a middleware that gets or creates a user in the database.</p>
<p>First, add a new service with one function, <code>get_or_create_user</code>:</p>
<pre><code>from typing import Tuple

from bot.models import User


async def get_or_create_user(
        user_id: str,
        first_name: str,
        last_name: str,
        username: str,
) -&gt; Tuple['User', bool]:
    created = False
    user = await User.query.where(User.telegram_user_id == user_id).gino.first()
    if not user:
        user = await User.create(
            telegram_user_id=user_id,
            first_name=first_name,
            last_name=last_name,
            username=username
        )
        created = True

    return user, created
</code></pre><p>Second, the middleware itself:</p>
<pre><code>from aiogram import types
from aiogram.dispatcher.middlewares import BaseMiddleware

from bot.services.users import get_or_create_user


class UserStorageMiddleware(BaseMiddleware):
    def __init__(self, database):
        self.database = database
        super().__init__()

    async def on_process_message(self, message: types.Message, data: dict):
        user = message.from_user
        db_user, _created = await get_or_create_user(
            user_id=str(user.id),
            first_name=user.first_name,
            <span class="hljs-title">last_name</span><span class="hljs-operator">=</span><span class="hljs-title">user</span>.<span class="hljs-title">last_name</span>,
            <span class="hljs-title">username</span><span class="hljs-operator">=</span><span class="hljs-title">user</span>.<span class="hljs-title">username</span>
        )
        <span class="hljs-title">data</span>[<span class="hljs-string">'user'</span>] <span class="hljs-operator">=</span> <span class="hljs-title">db_user</span>
</code></pre><p>Pay attention to variables that are added to the <code>data</code> dictionary conditionally. If, for some reason, <code>user</code> is not set in the middleware but your handler declares it as an argument, you will get an exception!</p>
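<p>To see why, consider a toy, aiogram-free sketch (the names here are hypothetical, not from the post) of how a dispatcher forwards only the keyword arguments that a handler's signature declares. If the handler declares <code>user</code> but the middleware never put it into <code>data</code>, the call fails:</p>
<pre><code>import asyncio
import inspect


# Toy model of aiogram 2 dispatch: forward only the keyword
# arguments that the handler's signature declares.
def filter_kwargs(handler, data):
    params = inspect.signature(handler).parameters
    return {k: v for k, v in data.items() if k in params}


async def start_handler(message, user):  # declares 'user'
    return f'Hi, {user}!'


data = {}  # the middleware did not set 'user'
try:
    asyncio.run(start_handler('msg', **filter_kwargs(start_handler, data)))
    result = 'ok'
except TypeError:
    result = "missing 'user' argument"

print(result)  # missing 'user' argument
</code></pre>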
<p>Third, add this middleware to the dispatcher:</p>
<pre><code>dispatcher <span class="hljs-operator">=</span> Dispatcher(bot)
Dispatcher.set_current(dispatcher)

dispatcher.middleware.setup(UserStorageMiddleware(database<span class="hljs-operator">=</span>db))
</code></pre><p>And you can now use the <code>user</code> variable in handlers. Let's make our bot reply with the user's first name:</p>
<pre><code>@dp.message_handler(commands<span class="hljs-operator">=</span>[<span class="hljs-string">'start'</span>])
async def start_handler(message: types.Message, user: <span class="hljs-string">'User'</span>):
    await message.answer(f<span class="hljs-string">'Hi, {user.first_name}!'</span>)
</code></pre><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1657855696277/CAIs6nGBb.png" alt="image.png" /></p>
<h2 id="heading-keyboards-and-state">Keyboards and state</h2>
<p>Keyboards are the last thing we need to implement for the planned note-taking bot. They let you show pre-defined actions and interact with the bot in one tap.</p>
<p>Let's add a file with a function to return the keyboard (<code>bot/keyboards.py</code>):</p>
<pre><code><span class="hljs-title">from</span> aiogram <span class="hljs-keyword">import</span> types
<span class="hljs-title">from</span> aiogram.utils.emoji <span class="hljs-keyword">import</span> emojize

<span class="hljs-title">from</span> bot.texts <span class="hljs-keyword">import</span> KEYBOARDS


<span class="hljs-title">def</span> main_keyboard() -&gt; types.<span class="hljs-type">ReplyKeyboardMarkup</span>:
    markup = types.<span class="hljs-type">ReplyKeyboardMarkup</span>(resize_keyboard=<span class="hljs-type">True</span>)
    markup.row(
        types.<span class="hljs-type">KeyboardButton</span>(emojize(<span class="hljs-type">KEYBOARDS</span>['list_notes'])),
        types.<span class="hljs-type">KeyboardButton</span>(emojize(<span class="hljs-type">KEYBOARDS</span>['add_note']))
    )
    return markup
</code></pre><p>And a small file with texts:</p>
<pre><code>RESPONSES <span class="hljs-operator">=</span> {
    <span class="hljs-string">'welcome'</span>: <span class="hljs-string">'Hello, {user}, I am note taking bot for hashnode'</span>,
    # referenced later by the add-note dialog; the wording here is illustrative
    'add_note': {
        'prompt': 'Enter the note text:',
        'success': 'The note has been saved!',
    },
}

KEYBOARDS <span class="hljs-operator">=</span> {
    <span class="hljs-string">'list_notes'</span>: <span class="hljs-string">':spiral_notepad: List notes'</span>,
    <span class="hljs-string">'add_note'</span>: <span class="hljs-string">':heavy_plus_sign: Add note'</span>,
}
</code></pre><p>Plus, modify the start command handler:</p>
<pre><code>@dp.message_handler(commands<span class="hljs-operator">=</span>[<span class="hljs-string">'start'</span>])
async def start_handler(message: types.Message, user: <span class="hljs-string">'User'</span>):
    await message.answer(
        RESPONSES[<span class="hljs-string">'welcome'</span>].format(user<span class="hljs-operator">=</span>user.first_name),
        reply_markup<span class="hljs-operator">=</span>main_keyboard(),
    )
</code></pre><p>And these simple steps make the bot more interactive:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1657941125815/WJm_tQR3D.png" alt="image.png" /></p>
<p>Just 2 more steps to finish! Stay tuned :) </p>
<p>Pressing a button just sends its text as a message. So we can use the <code>Text</code> filter in the message handler to catch that event. Based on that, let's add another dialog file (<code>bot/dialogs/list_notes.py</code>) that will return all your notes:</p>
<pre><code><span class="hljs-keyword">from</span> aiogram <span class="hljs-keyword">import</span> types
<span class="hljs-keyword">from</span> aiogram.dispatcher <span class="hljs-keyword">import</span> Dispatcher
<span class="hljs-keyword">from</span> aiogram.dispatcher.filters <span class="hljs-keyword">import</span> Text
<span class="hljs-keyword">from</span> aiogram.utils.emoji <span class="hljs-keyword">import</span> emojize

<span class="hljs-keyword">from</span> bot.texts <span class="hljs-keyword">import</span> KEYBOARDS
<span class="hljs-keyword">from</span> bot.models <span class="hljs-keyword">import</span> User
<span class="hljs-keyword">from</span> bot.services <span class="hljs-keyword">import</span> notes <span class="hljs-keyword">as</span> notes_service
<span class="hljs-keyword">from</span> bot.keyboards <span class="hljs-keyword">import</span> main_keyboard


dp = Dispatcher.get_current()


<span class="hljs-meta">@dp.message_handler(Text(equals=emojize(KEYBOARDS['list_notes'])))</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">handler_list_notes</span>(<span class="hljs-params">message: types.Message, user: User</span>):</span>
    notes = <span class="hljs-keyword">await</span> notes_service.get_notes(user)

    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> notes:
        answer_text = <span class="hljs-string">"You don't have any actual notes"</span>
    <span class="hljs-keyword">else</span>:
        notes_text = <span class="hljs-string">'\n\n'</span>.join([
            <span class="hljs-string">f'<span class="hljs-subst">{n.created_at.isoformat()}</span>\n<span class="hljs-subst">{n.text}</span>'</span> <span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> notes
        ])
        answer_text = <span class="hljs-string">'Your notes:\n\n'</span> + notes_text

    <span class="hljs-keyword">await</span> message.answer(
        answer_text,
        reply_markup=main_keyboard()
    )
</code></pre><p>To add notes, we are going to use state. When a state is active, only the handlers bound to it can process the message. So we need to declare a state class and a handler that sets it. Plus, we will replace the keyboard with a single "Cancel" button:</p>
<pre><code>from aiogram.dispatcher.filters.state import State, StatesGroup


class AddNoteState(StatesGroup):
    enter_text <span class="hljs-operator">=</span> State()
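
# Note: cancel_keyboard() is not shown in the post. A minimal sketch,
# assuming a hypothetical 'cancel' entry in KEYBOARDS and the same
# imports as bot/keyboards.py:
def cancel_keyboard() -&gt; types.ReplyKeyboardMarkup:
    markup = types.ReplyKeyboardMarkup(resize_keyboard=True)
    markup.row(types.KeyboardButton(emojize(KEYBOARDS['cancel'])))
    return markup
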


@dp.message_handler(Text(equals<span class="hljs-operator">=</span>emojize(KEYBOARDS[<span class="hljs-string">'add_note'</span>])))
async def handler_add_note(message: types.Message):
    await AddNoteState.enter_text.set()
    await message.answer(
        RESPONSES[<span class="hljs-string">'add_note'</span>][<span class="hljs-string">'prompt'</span>],
        reply_markup<span class="hljs-operator">=</span>cancel_keyboard(),
    )
</code></pre><p>Then we just define the handler for the state, which will save the entered text as a new note:</p>
<pre><code>from aiogram.dispatcher import FSMContext


@dp.message_handler(state<span class="hljs-operator">=</span>AddNoteState.enter_text)
async def handler_note_text(message: types.Message, state: FSMContext, user: <span class="hljs-string">'User'</span>):
    await notes_service.add_note(user, message.text)
    await state.finish()
    await message.answer(
        RESPONSES[<span class="hljs-string">'add_note'</span>][<span class="hljs-string">'success'</span>],
        reply_markup<span class="hljs-operator">=</span>main_keyboard(),
    )
</code></pre><p>The full resulting bot interaction:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1657944845560/RBXK4kM-c.png" alt="image.png" /></p>
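<p>The <code>notes_service</code> helpers (<code>get_notes</code> and <code>add_note</code>) are used above but never shown. A minimal sketch of <code>bot/services/notes.py</code> in the same Gino style, assuming a hypothetical <code>Note</code> model with <code>user_id</code>, <code>text</code> and <code>created_at</code> columns, could look like this:</p>
<pre><code>from datetime import datetime

from bot.models import Note, User  # Note is a hypothetical Gino model


async def get_notes(user: User):
    # Fetch all notes that belong to the given user
    return await Note.query.where(Note.user_id == user.id).gino.all()


async def add_note(user: User, text: str):
    # Persist a new note for the user
    return await Note.create(
        user_id=user.id,
        text=text,
        created_at=datetime.utcnow(),
    )
</code></pre>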
<p>The full source code is available on <a target="_blank" href="https://github.com/hypnocapybara/hashnode-todo-bot">GitHub</a></p>
]]></content:encoded></item></channel></rss>