AI Public Opinion Analysis System from Open Data

We design and deploy artificial intelligence systems, from prototype to production-ready solution. Our team combines expertise in machine learning, data engineering, and MLOps so that AI works in real business settings, not only in the lab.
Medium
~2-4 weeks

AI System Development for Public Opinion and Open Data Analysis

Government agencies, analytical centers, and large companies need systematic monitoring of public discourse: what concerns people, how attitudes toward regulation change, and which topics are gaining traction. The AI system aggregates data from open sources and turns it into actionable analytics.

Data Sources

Social Networks and Forums: VKontakte API, Odnoklassniki API, Telegram (via MTProto or public channel parsing), Reddit, Pikabu. Only public groups, comments, and posts are collected, without personal data.

Media and News Aggregators: RSS feeds, Yandex.News API, MediaMetrics, Google News API. Over 50,000 sources.
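
As an illustration of the RSS part of this ingestion layer, here is a minimal sketch that parses an RSS 2.0 document using only the Python standard library. The `NewsItem` type, the `parse_rss` function, and the sample feed are assumptions made for this example; they are not part of the described system.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


@dataclass
class NewsItem:  # hypothetical record type for this sketch
    title: str
    link: str
    published: Optional[datetime]


def parse_rss(xml_text: str) -> list[NewsItem]:
    """Extract title, link, and publication date from RSS 2.0 <item> elements."""
    root = ET.fromstring(xml_text)
    items = []
    for node in root.iter("item"):
        pub = node.findtext("pubDate")
        try:
            published = parsedate_to_datetime(pub) if pub else None
        except (TypeError, ValueError):
            published = None  # malformed date: keep the item, drop the date
        items.append(NewsItem(
            title=(node.findtext("title") or "").strip(),
            link=(node.findtext("link") or "").strip(),
            published=published,
        ))
    return items


# Made-up feed used only to demonstrate the parser
SAMPLE_FEED = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel>
  <title>Example feed</title>
  <item>
    <title>Utility tariffs to be revised</title>
    <link>https://example.com/news/1</link>
    <pubDate>Mon, 04 Mar 2024 10:00:00 +0300</pubDate>
  </item>
  <item>
    <title>New park opens</title>
    <link>https://example.com/news/2</link>
  </item>
</channel></rss>"""

items = parse_rss(SAMPLE_FEED)
```

A production collector would also need HTTP fetching, deduplication across feeds, and handling of Atom feeds, all of which this sketch leaves out.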

Government Open Data: data.gov.ru, regional open data portals, FTS registries, Rosstat API.

Petition Platforms: Change.org and ROI (Russian Public Initiative), tracking topics and signature dynamics.

Government Services Reviews: Government Services portal (public ratings), regional portals, Active Citizen platform.

Topic Modeling

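from dataclasses import dataclass
from datetime import datetime

import pandas as pd
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer


@dataclass
class TrendingTopic:
    # Fields are illustrative assumptions; the original does not define them
    topic_id: int
    growth_sigma: float


@dataclass
class TopicAnalysis:
    topics: pd.DataFrame             # output of get_topic_info()
    temporal_dynamics: pd.DataFrame  # output of topics_over_time()
    trending: list[TrendingTopic]


class PublicOpinionAnalyzer:
    def __init__(self):
        # Multilingual sentence embeddings; Russian is among the supported languages
        self.embedder = SentenceTransformer("sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
        self.topic_model = BERTopic(
            embedding_model=self.embedder,
            language="russian",
            min_topic_size=50,
            nr_topics="auto",
        )

    def discover_topics(self, texts: list[str], timestamps: list[datetime]) -> TopicAnalysis:
        # Precompute embeddings once so BERTopic does not re-encode the corpus
        embeddings = self.embedder.encode(texts, batch_size=512)

        topics, probs = self.topic_model.fit_transform(texts, embeddings)
        # Dynamic topic modeling: how topics change over time
        topics_over_time = self.topic_model.topics_over_time(texts, timestamps)

        return TopicAnalysis(
            topics=self.topic_model.get_topic_info(),
            temporal_dynamics=topics_over_time,
            trending=self._detect_trending(topics_over_time),
        )

    def _detect_trending(self, topics_over_time) -> list[TrendingTopic]:
        # Topics with growth > 2σ over last 7 days
        ...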
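
The body of `_detect_trending` is not given above. One possible reading of "growth > 2σ over last 7 days" is a z-score test: compare the mean daily mention count of the last 7 days with the mean and standard deviation of the preceding days. The sketch below follows that reading; the function name, the input shape (daily counts per topic), and the choice of baseline are assumptions, not the system's actual implementation.

```python
from statistics import mean, stdev


def trending_topics(
    daily_counts: dict[str, list[int]],
    window: int = 7,
    threshold: float = 2.0,
) -> list[tuple[str, float]]:
    """Return (topic, z_score) pairs whose recent mean exceeds the baseline by > threshold sigma."""
    results = []
    for topic, counts in daily_counts.items():
        baseline, recent = counts[:-window], counts[-window:]
        if len(baseline) < 2 or len(recent) < window:
            continue  # not enough history to estimate a baseline
        sigma = stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: the z-score is undefined
        z = (mean(recent) - mean(baseline)) / sigma
        if z > threshold:
            results.append((topic, z))
    # Strongest growth first
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Made-up daily mention counts: 8 baseline days followed by the last 7 days
daily = {
    "tariffs": [10, 12, 11, 9, 10, 11, 12, 10, 30, 32, 35, 31, 33, 30, 34],
    "parks": [5, 6, 5, 6, 5, 6, 5, 6, 5, 6, 5, 6, 5, 6, 5],
}
trending = trending_topics(daily)
```

Here only "tariffs" is flagged, because its recent mean sits far above its baseline, while "parks" stays within its usual range.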

Sentiment by Population Groups

The analysis covers not only overall tone but also differences between groups: youth vs. elderly (inferred from audience characteristics), regions, and professional communities. This reveals what concerns specific segments rather than an averaged "audience."

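from pydantic import BaseModel

class SentimentScore(BaseModel):  # assumed shape; not defined in the original
    score: float  # e.g. in [-1, 1]

class SegmentedSentiment(BaseModel):
    topic: str
    segments: dict[str, SentimentScore]  # segment → sentiment
    overall: SentimentScore
    divergence_score: float    # how much segments disagree
    sample_quotes: dict[str, list[str]]  # sample quotes by segment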
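
The source does not say how `divergence_score` is computed. One simple option, shown here purely as an assumption, is the population standard deviation of the per-segment mean sentiment, with sentiment values taken to lie in [-1, 1]. A value of 0 means all segments agree, and larger values mean stronger disagreement.

```python
from statistics import mean, pstdev


def divergence_score(segment_scores: dict[str, list[float]]) -> float:
    """Spread of per-segment mean sentiment (hypothetical definition)."""
    # Ignore segments with no scored messages
    segment_means = [mean(scores) for scores in segment_scores.values() if scores]
    if len(segment_means) < 2:
        return 0.0  # nothing to compare against
    return pstdev(segment_means)


# Made-up scores: two segments with opposite attitudes toward one topic
polarized = divergence_score({"youth": [0.5, 0.5], "elderly": [-0.5, -0.5]})
aligned = divergence_score({"youth": [0.3, 0.3], "elderly": [0.3, 0.3]})
```

A measure like this ignores segment size, so a production version would likely weight segments by sample count.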

Public Trust Index

For government agencies, the key metric is the dynamics of trust in a specific agency, policy, or decision:

  • Share of positive mentions in topic context
  • Tone change relative to baseline (before decision announcement)
  • Comparison with similar agencies / regions
  • Correlation with media activity (effect of press releases and official statements)
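
The first two components above can be sketched as follows. The `Mention` type, the positivity threshold of 0.2, and the function names are assumptions made for illustration; the source does not specify how the index is actually calculated.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean
from typing import Optional


@dataclass
class Mention:  # hypothetical record: one scored mention of an agency or policy
    day: date
    sentiment: float  # assumed range: -1 (negative) to 1 (positive)


def positive_share(mentions: list[Mention], threshold: float = 0.2) -> float:
    """Fraction of mentions whose sentiment exceeds the (assumed) positivity threshold."""
    if not mentions:
        return 0.0
    return sum(m.sentiment > threshold for m in mentions) / len(mentions)


def tone_shift(mentions: list[Mention], announcement: date) -> Optional[float]:
    """Mean sentiment on or after the announcement minus mean sentiment before it."""
    before = [m.sentiment for m in mentions if m.day < announcement]
    after = [m.sentiment for m in mentions if m.day >= announcement]
    if not before or not after:
        return None  # no baseline or no post-announcement data
    return mean(after) - mean(before)


# Made-up mentions around a decision announced on 2024-03-05
mentions = [
    Mention(date(2024, 3, 1), 0.4),
    Mention(date(2024, 3, 2), 0.0),
    Mention(date(2024, 3, 10), -0.6),
    Mention(date(2024, 3, 11), -0.2),
]
share = positive_share(mentions)
shift = tone_shift(mentions, date(2024, 3, 5))
```

In this example a negative `shift` indicates that tone worsened after the announcement relative to the baseline period.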

Manipulation and Bot Detection

The system detects anomalies that indicate coordinated campaigns, petition manipulation, or artificial hype:

  • Sharp spikes in near-identical messages over a short period
  • Accounts with bot-like traits (account age, activity patterns, vocabulary)
  • Coordinated posting: the same texts across multiple channels
  • Detected manipulation is flagged and excluded from analytics
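
The "same texts across channels" signal can be approximated with text fingerprinting. The sketch below normalizes each post, hashes it, and flags groups of identical fingerprints that appear in several distinct channels within a short time span. The `Post` type, the thresholds, and the normalization rules are assumptions for this example. Exact-match hashing also misses paraphrased copies, which would need a fuzzy similarity measure instead.

```python
import hashlib
import re
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Post:  # hypothetical record type for this sketch
    channel: str
    text: str
    posted_at: datetime


def fingerprint(text: str) -> str:
    """Hash of the text after removing URLs, punctuation, case, and extra whitespace."""
    normalized = re.sub(r"https?://\S+", " ", text.lower())
    normalized = re.sub(r"[^\w\s]", " ", normalized)
    normalized = " ".join(normalized.split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def find_coordinated(
    posts: list[Post],
    min_channels: int = 3,
    window: timedelta = timedelta(hours=1),
) -> list[list[Post]]:
    """Return groups of identical posts spread over >= min_channels channels within the window."""
    groups: dict[str, list[Post]] = defaultdict(list)
    for post in posts:
        groups[fingerprint(post.text)].append(post)

    flagged = []
    for group in groups.values():
        group.sort(key=lambda p: p.posted_at)
        channels = {p.channel for p in group}
        # Simplification: all copies of the text must fall inside the window
        span = group[-1].posted_at - group[0].posted_at
        if len(channels) >= min_channels and span <= window:
            flagged.append(group)
    return flagged


# Made-up posts: three near-identical messages in three channels, plus one unrelated post
start = datetime(2024, 3, 4, 12, 0)
posts = [
    Post("channel_a", "Stop the tariff hike!", start),
    Post("channel_b", "stop the tariff hike", start + timedelta(minutes=5)),
    Post("channel_c", "STOP the tariff hike!!! https://example.com/x", start + timedelta(minutes=10)),
    Post("channel_a", "The new park opens on Saturday.", start),
]
flagged = find_coordinated(posts)
```

Flagged groups could then be marked and excluded before topic and sentiment statistics are computed.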

Reporting and Visualization

Weekly automated reports include the top 10 trending topics, sentiment dynamics, a comparison with the previous period, and expert quotes. An interactive dashboard offers time series, maps (regional view), and word clouds by topic.