=== Mescio for Agents ===
Contributors: vinsmach
Tags: markdown, ai, agents, llm, rest-api
Requires at least: 6.0
Tested up to: 6.9
Requires PHP: 8.0
Stable tag: 1.7.0
License: GPLv2 or later
License URI: https://www.gnu.org/licenses/gpl-2.0.html

Mescio for Agents serves your WordPress content as clean Markdown to AI agents and GPT crawlers. Human visitors never notice a thing.

== Description ==

**Mescio for Agents** makes your WordPress site AI-ready by silently serving posts, pages and WooCommerce products as clean, structured Markdown to any AI agent or LLM pipeline that requests it — using the standard HTTP `Accept: text/markdown` content negotiation header.

Human visitors using a browser are **completely unaffected**. Mescio for Agents only activates when an AI agent or crawler explicitly asks for Markdown.

= Why Markdown? =

Feeding raw HTML to an AI is expensive and noisy. A heading like `## About Us` costs about 3 tokens in Markdown versus 12–15 tokens as HTML, and that is before accounting for `<div>` wrappers, navigation bars and script tags that carry no semantic value. As a reference point, one blog post measured at 16,180 tokens in HTML came to 3,150 tokens in Markdown, a reduction of about 80%.
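
The reduction figure can be checked with simple arithmetic. The sketch below is only an illustration: `token_reduction` is a hypothetical helper, not part of the plugin, and the two numbers are the ones quoted above.

```python
def token_reduction(html_tokens: int, markdown_tokens: int) -> float:
    """Return the fraction of tokens saved by serving Markdown instead of HTML."""
    if html_tokens <= 0:
        raise ValueError("html_tokens must be positive")
    return 1 - markdown_tokens / html_tokens


# Figures quoted in the paragraph above.
saving = token_reduction(16_180, 3_150)
print(f"{saving:.1%}")  # prints 80.5%
```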

Markdown has become the *lingua franca* for AI systems. Mescio for Agents lets your site speak it natively, at zero cost to your human visitors.

= How it works =

When an HTTP client sends a request with `Accept: text/markdown`, Mescio for Agents intercepts the WordPress request lifecycle before any template is rendered, converts the post content to clean Markdown, and returns it with the correct `Content-Type: text/markdown` header.

`
curl https://yoursite.com/your-post/ \
  -H "Accept: text/markdown"
`
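
The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the function names are hypothetical and `https://yoursite.com/your-post/` is the placeholder URL from the curl example, not a real endpoint.

```python
import urllib.request


def build_markdown_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks for Markdown via content negotiation."""
    return urllib.request.Request(url, headers={"Accept": "text/markdown"})


def is_markdown(content_type: str) -> bool:
    """Return True when a Content-Type header value denotes Markdown."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/markdown"


def fetch_markdown(url: str) -> str:
    """Fetch a page as Markdown; raise if the server answered with another type."""
    with urllib.request.urlopen(build_markdown_request(url), timeout=10) as resp:
        content_type = resp.headers.get("Content-Type", "")
        if not is_markdown(content_type):
            raise ValueError(f"expected text/markdown, got {content_type!r}")
        return resp.read().decode("utf-8")
```

Checking the returned `Content-Type` is a cheap way to notice when a cache or proxy in front of the site has served the HTML version instead.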

= Features =

* **Zero configuration** — works out of the box on any singular post, page or custom post type
* **`/llms.txt` endpoint** — auto-generated index of all your content in the [llmstxt.org](https://llmstxt.org) standard format, so AI agents can discover what's on your site
* **`/llms-full.txt` endpoint** — full site content in a single Markdown file, ready for RAG pipelines
* **WooCommerce support** — product pages include price, SKU, stock status, rating, attributes and gallery; products are grouped by category in `llms.txt`
* **YAML front matter** — every document includes structured metadata (title, description, URL, date, categories, tags, featured image)
* **Multilingual** — detects language via WPML, Polylang, TranslatePress or WordPress locale; emits `Content-Language` and `Link: rel=alternate` headers
* **REST API endpoint** — `/wp-json/mescio-for-agents/v1/markdown?id=<post_id>` or `?url=<permalink>`
* **Page builder cleanup** — aggressively strips Elementor, Divi, WPBakery and Beaver Builder layout noise
* **Token count header** — `X-Markdown-Tokens` tells AI pipelines how large the document is before processing
* **Content Signals** — emits a `Content-Signal` header built from your settings (default: `ai-train=no, search=yes, ai-input=yes`)
* **Correct HTTP caching** — `Vary: Accept` ensures CDNs cache HTML and Markdown versions separately
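
On the consuming side, the YAML front matter can be separated from the Markdown body before further processing. The sketch below assumes the conventional layout (a block delimited by `---` lines at the top of the document) and only handles flat `key: value` pairs; the sample document and its key names are illustrative assumptions, and a real YAML parser is the better choice for lists such as categories and tags.

```python
def split_front_matter(document: str) -> tuple[dict[str, str], str]:
    """Split a Markdown document into (flat front matter, body)."""
    lines = document.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, document  # no front matter present
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        return {}, document  # opening delimiter without a closing one
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        # Skip nested or list lines; this sketch only reads top-level scalars.
        if sep and key and not key.startswith((" ", "-")):
            meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


# Hypothetical sample; the real key names may differ.
SAMPLE = """---
title: "About Us"
url: https://yoursite.com/about/
---

## About Us

We build things.
"""

meta, body = split_front_matter(SAMPLE)
print(meta["title"])  # prints About Us
```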

= Response headers =

* `Content-Type: text/markdown; charset=utf-8`
* `Content-Language: it` (or detected language)
* `Vary: Accept`
* `X-Markdown-Tokens: 725`
* `Content-Signal: ai-train=no, search=yes, ai-input=yes` (default; reflects your Content Signals settings)
* `Link: <url>; rel="alternate"; hreflang="en"` (when translations available)
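
A client can use these headers to decide what to do with a document before reading its body. The sketch below is one possible approach, not the plugin's own code: both helper names are hypothetical, and the headers are passed as a plain dict for simplicity (real HTTP header names are case-insensitive).

```python
def parse_content_signal(value: str) -> dict[str, bool]:
    """Parse a Content-Signal value such as 'ai-train=no, search=yes'."""
    signals: dict[str, bool] = {}
    for part in value.split(","):
        name, sep, flag = part.strip().partition("=")
        if sep:
            signals[name.strip()] = flag.strip().lower() == "yes"
    return signals


def fits_budget(headers: dict[str, str], max_tokens: int) -> bool:
    """Use X-Markdown-Tokens to check a document against a token budget."""
    raw = headers.get("X-Markdown-Tokens", "")
    # A missing or malformed header is treated as "does not fit".
    return raw.isdigit() and int(raw) <= max_tokens


headers = {
    "X-Markdown-Tokens": "725",
    "Content-Signal": "ai-train=no, search=yes, ai-input=yes",
}
signals = parse_content_signal(headers["Content-Signal"])
print(signals["ai-train"], fits_budget(headers, 1000))  # prints False True
```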

= Multilingual plugin support =

* **WPML** — reads language and available translations automatically
* **Polylang** — reads language and links to translated post IDs
* **TranslatePress** — reads `trp_language` post meta
* **Manual** — configure primary language and additional languages in Settings → Mescio for Agents

= REST API =

`
GET /wp-json/mescio-for-agents/v1/markdown?id=42
GET /wp-json/mescio-for-agents/v1/markdown?url=https://yoursite.com/my-page/
`
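
When building these URLs in code, the permalink passed as `url` should be percent-encoded. A minimal sketch follows; `markdown_endpoint` is a hypothetical helper, and `https://yoursite.com` is the placeholder host used throughout this readme.

```python
from typing import Optional
from urllib.parse import urlencode

ENDPOINT = "/wp-json/mescio-for-agents/v1/markdown"


def markdown_endpoint(
    site: str,
    *,
    post_id: Optional[int] = None,
    permalink: Optional[str] = None,
) -> str:
    """Build the REST URL for a post, addressed by ID or by permalink."""
    if (post_id is None) == (permalink is None):
        raise ValueError("pass exactly one of post_id or permalink")
    query = {"id": post_id} if post_id is not None else {"url": permalink}
    # urlencode percent-encodes ':' and '/' inside the permalink value.
    return f"{site.rstrip('/')}{ENDPOINT}?{urlencode(query)}"


print(markdown_endpoint("https://yoursite.com", post_id=42))
# prints https://yoursite.com/wp-json/mescio-for-agents/v1/markdown?id=42
```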

= Developer hooks =

**Filter: `mescio_enabled_post_types`** — add or remove post types dynamically.

**Filter: `mescio_pre_convert_content`** — modify the HTML before conversion to Markdown.

**Filter: `mescio_post_convert_content`** — modify the Markdown after conversion.

= Privacy =

This plugin does not collect, store or transmit any personal data. It does not set cookies. It does not make external HTTP requests.

== Installation ==

1. Upload the `mescio-for-agents` folder to `/wp-content/plugins/`, or install directly from the WordPress plugin directory.
2. Activate the plugin through the **Plugins** menu in WordPress.
3. Optionally configure it at **Settings → Mescio for Agents**.
4. Test it: `curl https://yoursite.com/any-post/ -H "Accept: text/markdown"`

No API keys, no external services, no additional dependencies required.

== Frequently Asked Questions ==

= Does it work with WP Rocket or other full-page cache plugins? =

**Yes, with a caveat worth understanding.**

HTTP Link headers (the `Link: <...>; rel="..."` response headers) are injected by PHP at request time. Full-page cache plugins like WP Rocket, LiteSpeed Cache, and W3 Total Cache serve cached HTML directly — bypassing PHP entirely — so those HTTP headers are not present on cached responses. This is intentional behavior by those plugins and affects every WordPress plugin that relies on PHP-injected headers, not just Mescio for Agents.

**What Mescio for Agents does to work around this:**

Starting from version 1.7.0, the plugin also emits equivalent HTML `<link>` tags in the `<head>` of your homepage via `wp_head`:

```html
<link rel="api-catalog" href="/.well-known/api-catalog" type="application/linkset+json">
<link rel="service-desc" href="/wp-json/mescio-for-agents/v1/openapi" ...>
<link rel="describedby" href="/llms.txt" type="text/plain">
```

These tags are part of the HTML content and are cached along with the page — so they are present even on cached responses and are readable by AI agents and checkers that parse the HTML `<head>`.
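
An agent that cannot rely on HTTP headers can read the same information from the markup. The sketch below shows one way to do that with Python's standard `html.parser`; the class and function names are hypothetical, and the sample HTML reuses two of the tags listed above.

```python
from html.parser import HTMLParser


class LinkTagCollector(HTMLParser):
    """Collect rel -> href pairs from <link> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel, href = attributes.get("rel"), attributes.get("href")
        if rel and href:
            self.links[rel] = href


def discovery_links(html: str) -> dict[str, str]:
    """Return the rel/href pairs found in an HTML document."""
    parser = LinkTagCollector()
    parser.feed(html)
    parser.close()
    return parser.links


SAMPLE_HEAD = """<html><head>
<link rel="api-catalog" href="/.well-known/api-catalog" type="application/linkset+json">
<link rel="describedby" href="/llms.txt" type="text/plain">
</head><body></body></html>"""

print(discovery_links(SAMPLE_HEAD)["describedby"])  # prints /llms.txt
```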

If you want to disable the HTML tags and keep only HTTP headers (or disable both), use these filters in your theme's `functions.php`:

```php
// Disable only the HTML <link> tags
add_filter( 'mescio_link_html_tags_enabled', '__return_false' );

// Disable both HTTP headers and HTML tags
add_filter( 'mescio_link_headers_enabled', '__return_false' );
```

= Will this affect my site's SEO or how Google crawls it? =

No. The plugin only responds with Markdown when an HTTP client sends `Accept: text/markdown`. Standard browsers and Googlebot never send this header, so they receive the normal HTML page. The Markdown responses include `X-Robots-Tag: noindex` to be safe.

= Does it work with the Gutenberg block editor? =

Yes. The plugin applies WordPress's `the_content` filter, which fully processes Gutenberg blocks into HTML before conversion.

= Does it work with Elementor, Divi or other page builders? =

Yes. The HTML cleaner aggressively strips layout wrapper elements, data attributes and icon-only links produced by visual page builders, resulting in clean semantic Markdown.

= Does it work without WooCommerce? =

Yes. WooCommerce is completely optional. Without it, the plugin works normally for posts and pages.

= Does it require WPML or Polylang? =

No. Both are optional. Without them, Mescio for Agents detects the site language from the WordPress locale setting. You can also configure languages manually in Settings → Mescio for Agents.

= Can I add custom post types? =

Yes, either from the settings page or via the `mescio_enabled_post_types` filter:

`add_filter( 'mescio_enabled_post_types', function( $types ) {
    $types[] = 'my_cpt';
    return $types;
});`

= Is there a "powered by" link added to my site? =

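No. The plugin adds no visible links, scripts or styles to your site. Since version 1.7.0 it does emit machine-readable `<link>` discovery tags in the `<head>` of the homepage; these are invisible to visitors and can be switched off with the `mescio_link_html_tags_enabled` filter.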

== Screenshots ==

1. Settings page — post types and language configuration
2. Example Markdown output with YAML front matter
3. WooCommerce product rendered as Markdown with product details table

== Changelog ==

= 1.7.0 =
* New: HTML `<link>` tags in `<head>` for discovery — cache-proof fallback for WP Rocket and all full-page cache plugins; emitted on the homepage via `wp_head`
* New: added FAQ entry explaining WP Rocket / full-page cache compatibility and the HTML tag fallback
* Improved: `wp_headers` filter replaces `send_headers` action for HTTP Link headers — more compatible with standard setups
* Added: `api-catalog` Link (RFC 9727) pointing to `/.well-known/api-catalog`
* Added: `service-doc` Link alongside `service-desc` — broader checker compatibility
* Fixed: OpenAPI `service-desc` now uses correct media type `application/vnd.oai.openapi+json`
* Fixed: Content Signals settings now save correctly — option group mismatch resolved
* Security: two-layer burst protection in rate limiter (20 req/5s → 10-minute auto-block)
* New filter: `mescio_link_html_tags_enabled` to disable HTML `<link>` tags independently

= 1.6.6 =
* Fixed: `robots.txt` issue

= 1.6.5 =
* Security: added two-layer burst protection to rate limiter — IPs sending more than 20 req/5s are auto-blocked for 10 minutes; blocked IPs rejected with a single transient read before any WordPress processing
* All burst thresholds filterable: `mescio_burst_limit`, `mescio_burst_window`, `mescio_block_duration`
* REST 429 responses now include `Retry-After`, `X-RateLimit-Remaining`, `X-RateLimit-Reset` headers
* New: Content Signals support (draft-romm-aipref-contentsignals) — site owners can now declare AI usage preferences
* Content Signals published in `robots.txt` via `Content-Signal:` directive and in all Markdown/REST response headers
* Default values: `ai-train=no, search=yes, ai-input=yes` (conservative — protects owners who install without reading docs)
* Configurable from Settings → Mescio for Agents with live preview
* Filterable by developers via `mescio_content_signals` filter
* Replaced hardcoded `Content-Signal: ai-train=yes` header with dynamic value from settings

= 1.6.3 =
* Fix: page builder shortcode tags (WPBakery, WoodMart, etc.) are now also stripped from excerpts in the `llms.txt` index
* Fix: `llms.txt` now serves titles, excerpts and permalinks in the correct language on multilingual sites (WPML/Polylang)
* Fix: `llms-full.txt` now serves the full post content in the correct language (WPML/Polylang)


= 1.6.1 =
* Fixed 404 on multilingual sites using WPML or Polylang: `llms.txt` and `llms-full.txt` now resolve correctly under language-prefixed URLs (e.g. `/it/llms.txt`, `/en/llms-full.txt`)
* Added `parse_request` early intercept (priority 1) as fallback for language plugins that rewrite REQUEST_URI before WordPress rewrite rules run
* Added rewrite rule variant matching `/xx/llms.txt` and `/xx-XX/llms.txt` patterns

= 1.6.0 =
* Added rate limiting: per-IP request throttling on all endpoints via WordPress transients
* Limits per 60-second window: `llms-full.txt` 10 requests, REST search 20, other REST endpoints 30, default 60
* Returns 429 Too Many Requests with `Retry-After` header when limit exceeded
* Respects Cloudflare, nginx and standard `X-Forwarded-For` proxy headers
* Added sensitive meta key filter: fields containing `password`, `token`, `email`, `phone`, `iban` and similar patterns are never exposed in Markdown front matter
* Both rate limiting and sensitive key filter are filterable by developers

= 1.5.0 =
* Added automatic custom fields support in Markdown front matter
* If ACF is active, uses `get_fields()` with label-keyed, typed values; nested groups and repeaters flattened to dot notation (e.g. `group.field`)
* Without ACF, exposes plain post meta — skipping internal keys (`_` prefix), serialized data, JSON blobs and known plugin internals
* Added `mescio_custom_meta` filter for developer overrides

= 1.4.0 =
* Added `/agents.txt` endpoint following IETF draft-srijal-agents-policy-00
* SHA-256 hash computed automatically from directive content
* Configurable directives (path, ALLOW/DISALLOW, optional params) via admin settings
* Live preview of generated file with hash in settings page
* Default directives: `/ ALLOW`, `/wp-admin DISALLOW`, `/wp-login.php DISALLOW`
* Added `/agents.txt` link in Quick Links table

= 1.3.0 =
* Refactored monolith into modular architecture (6 separate class files)
* Added REST endpoint `/wp-json/mescio-for-agents/v1/context` — site metadata + llms.txt in JSON for MCP servers
* Added REST endpoint `/wp-json/mescio-for-agents/v1/search` — full-text search with Markdown output
* Added REST endpoint `/wp-json/mescio-for-agents/v1/page` — page by slug or ID
* Added REST endpoint `/wp-json/mescio-for-agents/v1/openapi` — OpenAPI 3.1 schema
* Added `llms-full.txt` pagination via `?limit=N&offset=N` with `X-LLMS-Next` header
* Improved caching: real `Last-Modified` from content timestamp, `ETag` from body hash, full 304 support
* Fixed excess blank lines in Markdown output from Elementor and other page builders
* Expanded admin API Examples panel with 8 tabs and copy buttons

= 1.2.0 =
* Added `/llms.txt` endpoint — auto-generated site index in the llmstxt.org standard format
* Added `/llms-full.txt` endpoint — full site content in a single Markdown file for RAG pipelines
* Products in `llms.txt` grouped by WooCommerce category with price and stock status
* Added `mescio_llms_txt_posts_limit`, `mescio_llms_txt_products_limit`, `mescio_llms_full_txt_limit` filters
* Added flush rewrite rules on plugin activation/deactivation
* Added `/llms.txt` and `/llms-full.txt` clickable links in the settings test panel

= 1.1.0 =
* Added multilingual support: WPML, Polylang, TranslatePress and manual configuration
* Added `Content-Language` and `Link: rel=alternate` response headers
* Improved HTML to Markdown converter with aggressive page builder noise removal
* Added UTF-8 and mojibake encoding auto-correction
* Improved whitespace normalisation
* Added 27-language selector in admin with flags and native names
* Added plugin detection badges in settings page
* Added test panel with ready-to-use curl and Python examples

= 1.0.0 =
* Initial release
* Content negotiation via `Accept: text/markdown` header
* YAML front matter with post metadata
* WooCommerce product support
* REST API endpoint `/wp-json/mescio-for-agents/v1/markdown`
* `X-Markdown-Tokens` header with token count estimate
* `Content-Signal` header
* `Vary: Accept` for correct HTTP caching

== Upgrade Notice ==

= 1.6.0 =
Adds rate limiting and sensitive data protection for custom fields. Recommended for all sites exposed to public AI agents.

= 1.5.0 =
Custom fields and ACF data now automatically included in Markdown front matter.

= 1.4.0 =
Adds /agents.txt endpoint following the IETF draft standard for AI agent access policy.

= 1.3.0 =
Major update: new REST endpoints, pagination, improved caching, and better Markdown output for page builder sites.

= 1.1.0 =
Adds multilingual support and significantly improved Markdown output quality for page builder sites. Upgrade recommended for all users.
