Auto-Filling a Blog from External Sources in 1C-Bitrix

A corporate blog requires regular publications — both for SEO traffic and to demonstrate expertise. Internal resources for content creation are limited, however. Auto-filling from external sources (industry news, partner articles, professional community publications) allows the section to remain active with minimal editorial involvement.

Differences from RSS News Aggregation

A blog is not a news feed. The key differences in content approach:

  • News: timeliness matters more than quality; published quickly, often as a summary + link
  • Blog: quality matters more than speed; a unique angle is required; long-form content

For a blog, automatic import produces a draft, not a final piece. The system creates entries with ACTIVE = N; an editor reviews each one and publishes it, ideally with minimal edits.

Content Sources for a Blog

Professional platforms (Habr, Medium, dev.to) are read via their RSS feeds. Habr provides RSS by hub, for example https://habr.com/ru/rss/hubs/php/articles/.
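
A minimal sketch of the RSS collection step, assuming the collector runs as a standalone Python script and the feed is standard RSS 2.0. The function name `parse_rss_items` and the returned keys are illustrative, not part of any Bitrix API. Fetching the feed over HTTP is left out so that the parsing can be shown on an inline sample.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def parse_rss_items(xml_text):
    """Parse an RSS 2.0 document and return a list of item dicts."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
            # RSS dates use the RFC 822 format; keep None when the tag is absent
            "published": parsedate_to_datetime(pub_date) if pub_date else None,
        })
    return items


# Hypothetical sample feed used only for illustration
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example hub</title>
    <item>
      <title>Caching strategies in PHP</title>
      <link>https://example.com/articles/1/</link>
      <description>&lt;p&gt;Short summary&lt;/p&gt;</description>
      <pubDate>Mon, 06 Jan 2025 10:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_FEED):
        print(entry["published"], entry["title"], entry["link"])
```

The `link` value is a natural candidate for duplicate detection on later runs, since it identifies the original article.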

YouTube channels: new videos are listed via YouTube Data API v3; the video transcript is retrieved with youtube-transcript-api (Python) or a third-party service and then adapted into article text.
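
As a rough illustration of the first adaptation step, the sketch below groups transcript segments into paragraphs by splitting on long pauses. It assumes each segment is a dict with `text`, `start` and `duration` keys (the shape youtube-transcript-api has commonly returned; verify against the installed version). The function name and the 2-second threshold are hypothetical.

```python
def segments_to_paragraphs(segments, max_gap=2.0):
    """Group caption segments into paragraphs, splitting on long pauses."""
    paragraphs = []
    current = []
    previous_end = None
    for seg in segments:
        # Collapse line breaks and repeated spaces inside a single caption
        text = " ".join(seg["text"].split())
        if not text:
            continue
        pause = seg["start"] - previous_end if previous_end is not None else 0.0
        if current and pause > max_gap:
            paragraphs.append(" ".join(current))
            current = []
        current.append(text)
        previous_end = seg["start"] + seg["duration"]
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


if __name__ == "__main__":
    # Hypothetical sample segments
    sample = [
        {"text": "Hello and\nwelcome", "start": 0.0, "duration": 2.0},
        {"text": "to the channel.", "start": 2.1, "duration": 1.5},
        {"text": "Today: caching.", "start": 8.0, "duration": 2.0},
    ]
    for paragraph in segments_to_paragraphs(sample):
        print(paragraph)
```

The resulting paragraphs are still spoken-language text; turning them into an article remains a job for the AI rewrite or the editor.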

Telegram channels — via MTProto API (Telethon/Pyrogram), or public channels via RSS converters (rsshub.app).

The company's own articles in other languages: if the company already runs a blog in one language, its articles can be translated automatically into other languages via the DeepL API or GPT.

Content Processing and Adaptation

Aggregated content cannot be published as-is. Minimum processing pipeline:

  1. HTML sanitization — HTMLPurifier with an allowed tag set (p, h2-h4, ul, ol, li, strong, em, a, img)
  2. Source branding removal — regex replacement of source company mentions
  3. Headline adaptation — reformulation via AI or template-based addition of the site's topic
  4. Introduction generation — AI generates 1–2 opening paragraphs in the blog's style
  5. CTA addition — a block linking to a relevant service or product is automatically appended
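
Steps 1, 2 and 5 above can be sketched as follows. This is an illustration under stated assumptions, not a production sanitizer: the article names HTMLPurifier (a PHP library), while this sketch uses a small allowlist filter built on Python's standard `html.parser` as a stand-in. The names `AllowlistSanitizer` and `process_article`, the attribute allowlist and the URL check are all hypothetical. The AI-based steps 3 and 4 are omitted.

```python
import re
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"p", "h2", "h3", "h4", "ul", "ol", "li", "strong", "em", "a", "img"}
ALLOWED_ATTRS = {"a": {"href"}, "img": {"src", "alt"}}  # assumed attribute allowlist
DROP_WITH_CONTENT = {"script", "style"}


class AllowlistSanitizer(HTMLParser):
    """Keeps allowlisted tags and attributes, escapes text, drops the rest."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()) or value is None:
                continue
            if name in ("href", "src") and not re.match(r"(?i)(https?:)?/", value):
                continue  # drop javascript:, data: and similar URLs
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT and self.skip_depth:
            self.skip_depth -= 1
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag != "img":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def process_article(raw_html, brand_names, cta_html):
    # Step 1: sanitize with the tag allowlist
    parser = AllowlistSanitizer()
    parser.feed(raw_html)
    parser.close()
    clean = "".join(parser.out)
    # Step 2: strip mentions of the source company (case-insensitive, whole words)
    for brand in brand_names:
        clean = re.sub(rf"\b{re.escape(brand)}\b", "", clean, flags=re.IGNORECASE)
    clean = re.sub(r"[ \t]{2,}", " ", clean)
    # Step 5: append the call-to-action block
    return clean + cta_html


if __name__ == "__main__":
    raw = '<div><p onclick="x()">Read more at Acme Corp today</p><script>alert(1)</script></div>'
    print(process_article(raw, ["Acme Corp"], '<p><a href="/services/">Our services</a></p>'))
```

Note that the brand regex runs over the whole HTML string, so a brand name inside a link URL would also be removed; a stricter version would apply it to text nodes only.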

Editorial Workflow

The system acts as an editorial assistant, not a replacement:

  1. The system creates a draft in the blog info block (ACTIVE = N, ACTIVE_FROM = current date + 3 days)
  2. The editor receives a notification about new drafts (daily digest via \Bitrix\Main\Mail\Event)
  3. The editor reviews, edits if necessary, and publishes
  4. If a draft has not been reviewed after 7 days, a repeat notification is sent
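
The notification logic in steps 2 and 4 can be sketched as a selection over the draft list. The draft dict keys (`created_at`, `reviewed`, `reminded`) and the function name are hypothetical; sending the actual email through \Bitrix\Main\Mail\Event happens on the Bitrix side and is not shown. The sketch also assumes the repeat notification is sent once per draft, which the workflow above does not specify.

```python
from datetime import datetime, timedelta

REMINDER_AFTER = timedelta(days=7)


def split_for_notifications(drafts, now, last_digest_at):
    """Return (new_drafts, overdue_drafts) for the daily digest."""
    # Drafts created since the previous digest go into the regular notification
    new_drafts = [d for d in drafts if d["created_at"] > last_digest_at]
    # Unreviewed drafts older than 7 days get a single repeat notification
    overdue = [
        d for d in drafts
        if not d["reviewed"]
        and not d.get("reminded", False)
        and now - d["created_at"] >= REMINDER_AFTER
    ]
    return new_drafts, overdue


if __name__ == "__main__":
    now = datetime(2025, 1, 20, 9, 0)
    drafts = [
        {"id": 1, "created_at": datetime(2025, 1, 19, 15, 0), "reviewed": False},
        {"id": 2, "created_at": datetime(2025, 1, 10, 9, 0), "reviewed": False},
    ]
    fresh, overdue = split_for_notifications(drafts, now, now - timedelta(days=1))
    print([d["id"] for d in fresh], [d["id"] for d in overdue])
```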

In addition, each draft can be scored by AI on several parameters: uniqueness, readability, and relevance to the site's topic. The editor sees the score and can filter drafts by it.
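
One way to turn the per-parameter scores into a single filterable number is a weighted average. The sketch below assumes each parameter is already scored from 0 to 100 by whatever AI service is used; the weights, function names and draft structure are hypothetical and would need tuning per project.

```python
# Hypothetical weights in percent; they must sum to 100
WEIGHTS = {"uniqueness": 40, "readability": 30, "relevance": 30}


def content_score(sub_scores):
    """Combine per-parameter scores (0-100 each) into one 0-100 score."""
    total = sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)
    return max(0, min(100, round(total / 100)))


def filter_by_score(drafts, min_score):
    """Return drafts at or above min_score, best first."""
    kept = [d for d in drafts if d["score"] >= min_score]
    return sorted(kept, key=lambda d: d["score"], reverse=True)


if __name__ == "__main__":
    score = content_score({"uniqueness": 80, "readability": 60, "relevance": 90})
    print(score)
    drafts = [{"id": 1, "score": score}, {"id": 2, "score": 45}]
    print(filter_by_score(drafts, min_score=60))
```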

Technical Components in 1C-Bitrix

The blog is implemented as a standard info block. Additional properties for auto-filling:

  • SOURCE_URL — link to the original
  • SOURCE_NAME — source name
  • AUTO_DRAFT — flag for automatically created drafts
  • CONTENT_SCORE — content quality score (0–100)
  • PUBLICATION_DATE_PLANNED — planned publication date
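
To show how these properties fit together, here is a sketch of the field set a collector might assemble for one draft. The top-level keys mirror the field names used when adding an info block element in Bitrix (IBLOCK_ID, NAME, ACTIVE, ACTIVE_FROM, DETAIL_TEXT, PROPERTY_VALUES). The function name, the "Y" value for AUTO_DRAFT, the DD.MM.YYYY date format and the way the payload reaches Bitrix (for example, a small PHP endpoint that passes it to the info block API) are assumptions that depend on the property types and site settings.

```python
from datetime import datetime, timedelta


def build_draft_fields(iblock_id, title, body_html, source_url, source_name, score, now):
    """Assemble info block element fields for an automatically created draft."""
    planned = now + timedelta(days=3)  # planned publication: current date + 3 days
    return {
        "IBLOCK_ID": iblock_id,
        "NAME": title,
        "ACTIVE": "N",  # drafts stay inactive until an editor publishes them
        "ACTIVE_FROM": planned.strftime("%d.%m.%Y %H:%M:%S"),  # assumed site date format
        "DETAIL_TEXT": body_html,
        "DETAIL_TEXT_TYPE": "html",
        "PROPERTY_VALUES": {
            "SOURCE_URL": source_url,
            "SOURCE_NAME": source_name,
            "AUTO_DRAFT": "Y",  # assumed string flag; a list property would need an enum ID
            "CONTENT_SCORE": score,
            "PUBLICATION_DATE_PLANNED": planned.strftime("%d.%m.%Y"),
        },
    }


if __name__ == "__main__":
    fields = build_draft_fields(
        iblock_id=5,  # hypothetical info block ID
        title="Caching strategies in PHP",
        body_html="<p>Draft text</p>",
        source_url="https://example.com/articles/1/",
        source_name="Example hub",
        score=77,
        now=datetime(2025, 1, 6, 10, 0, 0),
    )
    print(fields["ACTIVE_FROM"], fields["PROPERTY_VALUES"]["AUTO_DRAFT"])
```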

Project Timeline

Phase                                             Duration
Collector development (RSS, YouTube, Telegram)    3–5 days
Content processing pipeline                       2–3 days
Draft system and editor notifications             1–2 days
Admin interface, AI scoring                       1–2 days

Total: 7–12 working days depending on the set of sources.