# Codex handoff for BTRC Hub / タグ広場

This document transfers project-specific context from prior ChatGPT-assisted design and review work to Codex.

Use this file as project background. Use `AGENTS.md`, `backend/AGENTS.md`, and `frontend/AGENTS.md` for concrete coding rules and verification commands.

## Project identity

BTRC Hub / タグ広場 is a collaborative knowledge base for collecting, tagging, explaining, and rediscovering Bocchi the Rock creature-related works.

It is not a generic SNS. It is not a comment board. It is not a service for rehosting external content.

It is primarily a structured link, tag, wiki, material, and viewing-party system.

Core domains:

1. Posts
2. Tags
3. Wiki pages
4. Materials
5. Theatre / watch-party features

The project is already publicly accessible and indexed by search engines, but it has not been broadly announced. Treat it as a small public production system, not a private prototype.

## Current stack

Backend:

- Ruby 3.2.2
- Rails 8.0.2 API
- MySQL 8
- Active Storage
- Cloudflare R2 / S3-compatible storage is expected for uploaded files
- RSpec

Frontend:

- React 19.1
- Vite 6.3
- TypeScript 5.8
- Axios
- TanStack Query
- Tailwind CSS
- Framer Motion
- shadcn-like local components
- react-markdown
- react-markdown-editor-lite
- remark-wiki-autolink

Batch / background-like tasks:

- Rake tasks
- Nico sync
- YouTube sync
- Similarity calculation tasks

## Repository working principle

Before editing, inspect the existing implementation. Do not invent a new architecture when the current repo already has an established convention.

Keep changes scoped to the requested issue. Prefer small, reviewable changes over broad rewrites. Do not perform unrelated cleanup in the same patch.

When a task has design ambiguity, first produce a short investigation and recommended plan. Do not silently choose a risky design.

## User coding preferences

General:

- Prefer single quotes for strings unless interpolation, escaping, or framework convention makes double quotes better.
- Do not add production dependencies without explicit approval.
- Do not perform broad formatting churn.
- Do not convert unrelated files to a different style.

Ruby:

- Do not put a space before method-call parentheses.
- Do not use `%w`.
- Do not use `%i`.
- Keep Rails code idiomatic, but preserve the user's style where the repo already uses it.

TypeScript / Python:

- The user prefers GNU-style spacing before parentheses where syntactically valid.
- Preserve existing project formatting if a formatter or nearby code dictates otherwise.

## Current authentication model

The system does not use normal email/password authentication. Users are authenticated by inheritance code.

Frontend:

- Stores the code in `localStorage.user_code`.
- Sends it as the `X-Transfer-Code` header.

Backend:

- Looks up `users.inheritance_code`.
- Sets `current_user`.

Roles:

- `guest`
- `member`
- `admin`

Important helper:

- `User#gte_member?` returns true for `member` and `admin`.

Never introduce a conventional login assumption unless the issue explicitly asks for it.

## BAN / abuse-control model

The backend currently enforces BAN at API level.

The relevant `before_action` order is conceptually:

1. Reject banned IP address.
2. Authenticate user if transfer code exists.
3. Reject banned user.

Entities:

- `users.banned_at`
- `ip_addresses.banned_at`
- `user_ips`

IP addresses are stored as binary values using `IPAddr#hton`.

Do not weaken BAN behavior. Do not move BAN checks behind optional authentication. Do not make preview, theatre, verify, user creation, or public-looking endpoints bypass BAN without an explicit design decision.

## Public-operation assumptions

Current practical operation:

- A few editor accounts exist.
- Meaningful editing is mostly done by the owner.
- Read access is already public.
- Search engines have indexed the site.
- Future editor applications are expected through Discord.
- Prospective editors are likely people known in the Bocchi creature community.

This means security and moderation issues matter even if traffic is still small.

## Core domain summary

### Posts

Posts are external URL-based link records.

Important properties:

- `url` is required and unique.
- URLs are normalized.
- Only HTTP / HTTPS are allowed.
- Posts can have thumbnails through Active Storage.
- `uploaded_user_id` may be NULL for synced or bot-created posts.
- `original_created_from` and `original_created_before` represent a time range for original content creation.
- When both original time bounds exist, `from < before` is required.

Parent/child posts:

- Current implementation uses `post_implications`.
- It is many-to-many.
- Do not assume `posts.parent_id`.
- Frontend/API clients must send `parent_post_ids`, even when empty.
- `parent_post_ids` is parsed as a space-separated ID string.
- Self-parenting is invalid.
- Missing parent IDs are invalid.

Versions:

- `post_versions` stores snapshots.
- `version_no` is a per-post sequence.
- Snapshot includes title, URL, thumbnail base, tags, parent post IDs, original time bounds, event type, and actor.
- Optimistic locking for posts is planned / important, but do not assume it is fully implemented unless the code proves it.

### Tags

Tags are central.

There is separation between tag names and tag entities:

- `tag_names`
- `tags`

Categories:

- `deerjikist`
- `meme`
- `character`
- `general`
- `material`
- `nico`
- `meta`

Alias model:

- `tag_names.canonical_id` expresses aliases.
- `canonical_id = NULL` means canonical name.
- `canonical_id != NULL` means alias.
- An alias must not point to another alias.
- A tag name that already has a tag or wiki page generally must not be aliasified.

Tag normalization:

- User-entered tags are normalized through existing backend logic.
- Known aliases are canonicalized.
- Parent tags are expanded recursively.
- `nico:` is normally rejected for manual entry.
- Special tags such as tag-request / bot / unknown-deerjikist / video / niconico / youtube must be protected.

Do not casually change tag normalization, alias resolution, or parent expansion. These affect search, wiki, sync, and historical data.

### Nico tags

Nico tags use the `nico` category and have separate versioning.

Important relation:

- `nico_tag_relations` maps external Nico tags to internal tags.
- `nico_tag_id` must be a Nico category tag.
- `tag_id` must not be Nico category.

Do not allow ordinary manual tag editing to create or corrupt Nico tags.

### Deerjikists

Deerjikists map external platform identities to internal `deerjikist` tags.

Known platforms include:

- `nico`
- `youtube`

YouTube handles may be normalized to `UC...` channel IDs.

Do not treat user-facing handles and canonical channel IDs as interchangeable without checking existing code.

### Wiki

Wiki pages are a major knowledge layer.

Important points:

- Wiki pages are tied to tag-like titles.
- Title handling, aliases, and canonical tag names matter.
- There is line-level storage / revision-oriented behavior in the current implementation.
- There has been design tension between wiki revisions and wiki versions.
- Wiki conflict detection using `base_revision_id` exists on the backend side.
- Frontend support for conflict detection must be verified before assuming it is complete.

Do not redesign Wiki storage casually. Do not add a second competing history system. Do not break existing wiki URLs.

### Materials

Materials connect files or reference URLs to `material` or `character` tags.

Important properties:

- A material has a `tag_id`.
- The tag must be `material` or `character`.
- A material requires either `url` or attached `file`.
- Active Storage is involved.
- Upload/security policy matters more than plain link posting.
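
The two material rules listed above (the tag category restriction and the url-or-file requirement) can be sketched as a pure function. This is an illustrative TypeScript pre-check, not the project's actual validation: `MaterialDraft`, `validateMaterialDraft`, and the field names are hypothetical, and a client-side check like this would only complement server-side validation. The code uses the GNU-style spacing listed under the user's TypeScript preferences; defer to the repository formatter if it dictates otherwise.

```typescript
// Tag categories as listed in the Tags section of this document.
type TagCategory =
  | 'deerjikist'
  | 'meme'
  | 'character'
  | 'general'
  | 'material'
  | 'nico'
  | 'meta';

// Hypothetical draft shape for illustration; the real API types may differ.
interface MaterialDraft {
  tagCategory: TagCategory; // category of the tag referenced by tag_id
  url: string | null;       // reference URL, if any
  hasFile: boolean;         // whether a file is attached
}

// Returns a list of problems; an empty list means both rules are satisfied.
function validateMaterialDraft (draft: MaterialDraft): string[] {
  const errors: string[] = [];

  // Rule 1: the tag must be a material or character tag.
  if (draft.tagCategory !== 'material' && draft.tagCategory !== 'character') {
    errors.push ('tag must be a material or character tag');
  }

  // Rule 2: at least one of url or attached file must be present.
  const hasUrl = draft.url !== null && draft.url.trim () !== '';
  if (!hasUrl && !draft.hasFile) {
    errors.push ('either url or file is required');
  }

  return errors;
}
```

Returning a list rather than throwing keeps both rule violations visible at once, which is convenient for form feedback.
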

Important unresolved/risky area:

- Material creation permissions have historically been risky because upload endpoints can be abused.
- Prefer `member` or higher for material creation unless the issue explicitly says otherwise.

### Theatre

Theatre is an experimental watch-party style feature.

Known pieces include:

- Display
- Presence
- Next post
- Comments
- Host-like control

Do not assume theatre has complete CRUD/admin support unless the code proves it.

Theatre may become expensive if next-item selection uses random DB ordering.

## Current high-risk areas

Treat these areas with extra care.

### Security

- Preview API SSRF protection.
- External iframe / embed CSP.
- Markdown link safety.
- BAN / IP BAN bypass.
- Transfer-code leakage.
- Guest write access.
- Upload endpoints.
- Admin-only tag operations.
- System tag mutation.

### Data integrity

- Tag alias canonicalization.
- Tag parent expansion.
- Post parent many-to-many relationships.
- Version tables.
- `version_no` synchronization.
- Schema drift from branch migration contamination.
- Wiki revision/version split.
- Material version recording.

### Frontend correctness

- React Hooks must not be called conditionally.
- Role guards are currently spread across components/pages.
- TanStack Query keys must not collide between ID/name or ID/title variants.
- URL path segments containing tag names or wiki titles must use `encodeURIComponent`.
- API response types may allow `null` users for bot or migration data.
- Tag autocomplete has had duplicated logic and stale state hazards.

### Performance

- Avoid unbounded `limit`.
- Avoid `order('RAND()')` for growing tables.
- Avoid loading full relations just to count.
- Avoid Ruby-side sorting/paging for large histories.
- Tag sidebar client-side aggregation can become expensive.
- Wiki full-text search needs deliberate indexing/design.

## Current priority order

Use this as the default priority unless an issue says otherwise.

### P0: Safety before broad announcement

1. Preview API SSRF hardening.
2. Material creation permission tightening.
3. System tag mutation holes.
4. `GET /users/me` transfer-code leakage through query params.
5. Limit caps for index/history/comment APIs.
6. CSP / iframe sandbox policy.
7. Confirm BAN enforcement remains global.

### P1: Core correctness

1. Post optimistic locking with `version_no`.
2. Wiki edit conflict handling.
3. Wiki history/revision model clarification.
4. Wiki search truthfulness: implement body search or remove false UI.
5. Tag alias/canonical/wiki interaction.
6. Tag URL encoding.
7. TanStack Query key separation.
8. Frontend null-user handling.
9. React Hooks rule fixes.
10. Material version policy.

### P2: Operational/admin usability

1. Admin screens for users, IPs, bans, aliases, and settings.
2. Settings table and user settings usage.
3. Better tag sidebar.
4. Better role guard helpers.
5. Better frontend tests.
6. Better issue triage and closure of already-implemented issues.

### P3: Future features

1. Theatre list/create/edit/admin flow.
2. Muted/hidden tags.
3. Tag category custom colors.
4. Responsive refinements.
5. Watch-party improvements.
6. Broader embed support.

## Known issue triage notes

Some existing issues may already be partially or mostly implemented. Before implementing an issue, check code first.

Examples:

- Tag search and OR/NOT search may already be mostly implemented.
- BAN enforcement may have been implemented after earlier issue drafts.
- YouTube sync exists and should not be treated as purely planned.
- Parent posts are many-to-many in current schema, even if older issues mention one-to-many.
- Some issues may reflect old schema or old branch state.

When in doubt:

1. Inspect current code.
2. Inspect schema.
3. Inspect routes.
4. Inspect frontend usage.
5. Report whether the issue is implemented, partially implemented, not implemented, or obsolete.
6. Only then edit.

## Verification expectations

Backend changes:

- Run RSpec when possible.
- Add request specs for API behavior changes.
- Add model specs for validation / normalization changes.
- Check migrations and schema consistency.
- Do not silently ignore pending migrations.

Frontend changes:

- Run build.
- Run lint if configured.
- Run tests if configured.
- Add tests for important behavior when the test framework exists.
- If frontend tests are not yet installed, state that clearly.

Full-stack changes:

- Verify both backend and frontend compile/test paths where possible.
- Confirm API response shapes match TypeScript types.
- Confirm authorization behavior on both server and UI.

If commands cannot be run because dependencies are missing, report that explicitly. Do not pretend verification passed.

## Branch / migration caution

The project has previously suffered from schema contamination caused by running migrations from another branch.

Be careful when touching:

- `db/schema.rb`
- migration files
- parent post schema
- banned / banned_at schema
- version_no migrations
- wiki asset schema

Before changing migrations:

1. Inspect current schema.
2. Inspect existing migrations.
3. Confirm whether the intended branch already includes related migrations.
4. Prefer additive migrations for shared branches.
5. Do not edit already-applied production migrations unless explicitly instructed.

## API design principles

Prefer explicit server-side authorization. Do not rely only on frontend hiding. Do not return sensitive codes unnecessarily.

Use 403 for authorization failures. Use 422 for validation failures. Use 409 for edit conflicts.

Do not expose internal exception messages to users.

Clamp or reject abusive limits consistently.

Keep response shape stable unless the issue explicitly includes a breaking API change.

## Frontend design principles

Use existing route and query-key conventions.

Use TanStack Query `enabled` rather than conditional hook calls. Do not let role-based early returns change hook order.
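
A minimal sketch of the `enabled` pattern, written as a plain function so it does not depend on the real hooks. The names `gteMember` and `tagByIdQueryOptions` and the key layout `['tags', 'by-id', id]` are hypothetical; follow the repository's existing query-key conventions where they differ. The returned object carries the `queryKey` and `enabled` options that TanStack Query's `useQuery` accepts, so the hook itself can be called unconditionally while the role and parameter checks live in `enabled`. Spacing follows the GNU-style preference noted for TypeScript in this document.

```typescript
type Role = 'guest' | 'member' | 'admin';

interface GatedQueryOptions {
  queryKey: readonly unknown[];
  enabled: boolean;
}

// Client-side mirror of User#gte_member?: true for member and admin.
function gteMember (role: Role): boolean {
  return role === 'member' || role === 'admin';
}

// Builds options for a hypothetical member-only lookup of a tag by ID.
function tagByIdQueryOptions (role: Role, tagId: number | null): GatedQueryOptions {
  return {
    // The 'by-id' segment keeps this key distinct from a by-name lookup.
    queryKey: ['tags', 'by-id', tagId] as const,
    // The query is disabled rather than skipped, so hook order never changes.
    enabled: gteMember (role) && tagId !== null,
  };
}
```

In a component, the result would be spread into the hook before any early return, for example `useQuery ({ ...tagByIdQueryOptions (role, tagId), queryFn })`, where `queryFn` is whatever fetcher the repository already uses.
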

Centralize repeated tag autocomplete logic when touching it.

Use `encodeURIComponent` for tag names and wiki titles in URL path segments.

Prefer graceful fallback for nullable actors:

- bot operation
- deleted user
- migration-created data
- external sync

Do not assume all API user fields are non-null.

## Testing priorities to add over time

Frontend tests are especially important because the backend already has more mature RSpec coverage.

Suggested first frontend tests:

1. Tag autocomplete.
2. Post form tag editing.
3. Tag URL encoding.
4. Wiki edit conflict UI.
5. Role guard behavior.
6. Null-user history rendering.
7. Dialog behavior.
8. Top navigation responsive behavior.

Backend test priorities:

1. BAN enforcement across public-looking endpoints.
2. Material permissions.
3. Preview SSRF rejection.
4. System tag protection.
5. Post optimistic locking.
6. Wiki conflict detection.
7. Tag alias/canonical behavior.
8. Limit caps.
9. Parent post parsing.
10. Version recorder behavior.

## What Codex should not do without explicit approval

Do not:

- Replace Rails.
- Replace React.
- Replace TanStack Query.
- Redesign the database.
- Rewrite Wiki storage.
- Remove version tables.
- Change authentication model.
- Change role names.
- Change tag category names.
- Add background job infrastructure.
- Add a new UI framework.
- Add a new test framework if one already exists.
- Add major dependencies.
- Change public URL design.
- Change production storage configuration.
- Remove historical data behavior.
- Simplify BAN/security checks.
- Treat the site as private-only.

## Good first Codex tasks

Start with investigation-only tasks.

Example:

```txt
Inspect the repository and summarize the Rails, React, TypeScript, and test setup.
Do not modify files.
List commands that actually exist in this repository.
List risks Codex should know before editing.
```

Then small safe patches:

```txt
Fix a React Hooks rule violation in one file.
Keep behavior unchanged.
Run the relevant frontend verification commands.
```

```txt
Add encodeURIComponent around one tag-name URL path segment.
Add or update a test if the project has a frontend test setup.
Run build/lint.
```

```txt
Add a request spec for a known authorization rule.
Do not change implementation unless the spec fails for the expected reason.
```

Avoid starting with:

- Wiki history redesign.
- Post versioning redesign.
- Full admin screen suite.
- Broad frontend refactor.
- Database cleanup.
- Authentication rewrite.

## Relationship with ChatGPT

ChatGPT has been used for:

- Design review.
- Risk analysis.
- Prioritization.
- Specification reconstruction.
- Migration/locking discussions.
- Codex migration planning.

Codex should be used mainly for:

- Repository inspection.
- Localized implementation.
- Test addition.
- Running verification commands.
- Producing small reviewable diffs.

For ambiguous architecture, Codex should stop and present options rather than implement a guessed design.

## Current strategic stance

The project should not be rewritten from scratch. The current Rails + React system is acceptable.

The immediate goal is not elegance. The immediate goal is safe public operation, data integrity, and maintainable incremental improvement.

Priority is:

1. Prevent abuse/security incidents.
2. Preserve data correctness.
3. Make editing safe for multiple users.
4. Add tests around fragile frontend behavior.
5. Improve admin/operation workflows.
6. Optimize performance after obvious dangerous patterns are removed.

## Final rule

When current code, old specs, issue drafts, and memory disagree, current code wins.

When current code is unsafe, write that explicitly and propose a small safe fix.

When the task is too broad, split it.

When verification cannot be performed, say exactly what was not verified.