How ARSI Analyzes the News
ARSI monitors Georgian news sources and compares how they cover the same events. This page explains what we track, how we organize stories, and what the framing labels mean.
1. What ARSI Does
ARSI (arsinews.ge) monitors dozens of news sources covering Georgia. Every day, we collect articles, group them by story, and compare how different sources framed the same event.
The output is a weekly briefing, delivered by email and published on this site, that shows readers which stories got covered, which sources covered them, and where the framing diverged.
ARSI does not produce original journalism. It does not report news. It analyzes how news is reported.
2. Our Sources
ARSI currently monitors 38 news sources across three categories.
Sources are selected based on reach, editorial influence, and coverage of Georgian affairs. We aim to represent the full spectrum, from outlets that consistently frame stories favorably toward the government to those that frame them critically, plus international outlets that provide an outside perspective. International sources include Western, regional, and Russian state-affiliated media.
International sources receive the label Outside Perspective regardless of their framing score, because their editorial context is different from domestic Georgian media.
3. The Framing Spectrum
Each Georgian domestic source is placed on a five-point framing spectrum based on how it frames stories relative to the ruling party’s narrative. This is a measure of editorial framing patterns, not political affiliation.
| Label | What It Means |
|---|---|
| Government-Critical | Consistently frames stories critically toward the ruling party |
| Leans Critical | Tends toward government-critical framing |
| Mixed Framing | No consistent framing direction |
| Leans Government | Tends toward government-favorable framing |
| Government-Aligned | Consistently frames stories favorably toward the ruling party |
| Outside Perspective | International outlet, assigned regardless of framing score |
International sources receive an "Outside Perspective" designation and appear on the spectrum bar with a globe indicator. Their framing score is computed the same way as for domestic sources, but it reflects a structurally different editorial position: reporting on Georgia from outside rather than from within the domestic political axis. International sources are excluded from framing gap calculations between domestic outlets.
Framing and reliability are independent
A source’s framing label says nothing about its journalistic quality, and vice versa. A government-aligned source can have high reliability. A government-critical source can have low reliability. One dimension does not determine the other.
4. How Stories Are Grouped
When new articles are collected, they are converted into numerical representations (embeddings) that capture semantic meaning. Articles about the same event are automatically grouped into story clusters based on content similarity.
- Similarity threshold: 55%. Two articles must share at least 55% semantic similarity to be grouped together. This is tuned to catch articles covering the same event while keeping distinct stories separate.
- Time window: 7 days. Clustering operates within a rolling 7-day window, matching the weekly briefing cadence. Articles older than 7 days start a new cluster even if they cover an ongoing story.
- New articles match existing clusters first. Each new article is compared against existing clusters before being grouped with other unmatched articles. This means a story cluster grows as more sources cover the same event throughout the week.
Clustering uses vector embeddings (Voyage AI) and cosine similarity. No AI language model is involved in this step. The grouping is purely mathematical.
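The grouping rules above can be sketched in plain Python. The 0.55 threshold, the 7-day window, and matching against existing clusters first come from this page. Everything else is an assumption made for illustration: the `Cluster` class and `assign` function are hypothetical names, an article is compared against a cluster's centroid (mean vector), the window is measured from a cluster's first article, and toy 2-dimensional vectors stand in for real embeddings. For brevity, an unmatched article simply opens a new cluster.

```python
import math
from dataclasses import dataclass, field
from datetime import datetime, timedelta

SIMILARITY_THRESHOLD = 0.55     # from the text: 55% similarity
WINDOW = timedelta(days=7)      # from the text: rolling 7-day window


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


@dataclass
class Cluster:
    started: datetime                       # publication time of the first article
    vectors: list = field(default_factory=list)

    def centroid(self):
        # Assumption: a cluster is represented by the mean of its vectors.
        n = len(self.vectors)
        return [sum(column) / n for column in zip(*self.vectors)]


def assign(clusters, vector, published):
    """Add an article to the most similar open cluster, else start a new one."""
    best, best_sim = None, SIMILARITY_THRESHOLD
    for cluster in clusters:
        if published - cluster.started > WINDOW:
            continue                        # cluster has left the 7-day window
        sim = cosine_similarity(vector, cluster.centroid())
        if sim >= best_sim:
            best, best_sim = cluster, sim
    if best is None:
        best = Cluster(started=published)
        clusters.append(best)
    best.vectors.append(vector)
    return best


clusters = []
day = datetime(2025, 1, 6)
a = assign(clusters, [1.0, 0.0], day)
b = assign(clusters, [0.9, 0.3], day + timedelta(days=1))  # similar, same week
c = assign(clusters, [0.0, 1.0], day + timedelta(days=1))  # unrelated story
d = assign(clusters, [1.0, 0.0], day + timedelta(days=9))  # same topic, window expired
print(len(clusters))  # 3
```

In this toy run the second article joins the first cluster, the unrelated article opens its own, and the late article starts a fresh cluster even though its vector is identical to the first one, which mirrors the time-window rule.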
5. Perspective Checks
The Perspective Check is ARSI’s core analytical feature. For each story cluster, we generate a structured comparison of how different sources covered the same event.
Every Perspective Check follows the same three-part structure:
Common Ground establishes what all sources agree happened: the baseline facts.
Key Divergence identifies the single biggest framing split: what each source emphasized or omitted.
Reader Note flags information the reader should weigh, often a fact reported by only one source, or a perspective missing from all coverage.
Special cases
- Single source: When only one source covered a story, the Perspective Check notes that no comparison is possible and flags the source’s reliability score so readers can calibrate accordingly.
- Uniform framing: When all sources share the same framing label, the check notes the absence of alternative perspectives explicitly.
6. How Source Labels Are Derived
Source framing labels are derived from article-level scoring. Each article is evaluated on two independent dimensions, framing and reliability, using 10 weighted signals (five per dimension):
Framing signals
Each scored from −1.0 (government-critical) to +1.0 (government-aligned):
- Language Valence (25%): Does the article use loaded language that presents government actions positively or negatively?
- Source Attribution (20%): Who gets quoted? Government officials, opposition voices, or a balanced mix?
- Story Selection (25%): How is the core event framed? As a government achievement, failure, or neutral development?
- Omission Pattern (15%): Compared to other sources, what does this article leave out?
- Headline Framing (15%): Does the headline frame the story favorably or unfavorably for the government?
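The five weights above sum to 100%, which suggests that an article's framing score is a weighted average of its signal scores. The sketch below assumes exactly that; this page does not spell out the formula, and `FRAMING_WEIGHTS` and `framing_score` are illustrative names. The input values are the five signal scores listed for the Imedi News article in the Calibration Samples section.

```python
# Weights as listed above; the weighted-average formula itself is an assumption.
FRAMING_WEIGHTS = {
    "language_valence": 0.25,
    "source_attribution": 0.20,
    "story_selection": 0.25,
    "omission_pattern": 0.15,
    "headline_framing": 0.15,
}


def framing_score(signals):
    """Combine five signal scores (each in [-1.0, +1.0]) into one framing score."""
    if set(signals) != set(FRAMING_WEIGHTS):
        raise ValueError("expected exactly the five framing signals")
    return sum(FRAMING_WEIGHTS[name] * value for name, value in signals.items())


# Signal scores from the Government-Aligned calibration sample (Imedi News).
imedi_sample = {
    "language_valence": 0.85,
    "source_attribution": 0.90,
    "story_selection": 0.80,
    "omission_pattern": 0.75,
    "headline_framing": 0.85,
}

score = framing_score(imedi_sample)
print(f"{score:+.4f}")
```

Under this assumption the weighted sum comes to 0.8325, which is within rounding of the +0.833 published for that sample.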
Reliability signals
Each scored from 0.0 (lowest quality) to 1.0 (highest quality):
- Factual Verifiability (25%): Are claims specific and verifiable, or vague and unattributable?
- Attribution Quality (20%): Are claims attributed to named sources, or anonymous and unattributed?
- Factual Consistency (25%): Do the article’s facts match what other sources report?
- Context Completeness (15%): Does the article provide relevant background, or present events in a vacuum?
- Opinion Separation (15%): Does the article clearly separate fact from opinion?
From article scores to source labels
Individual article scores feed into a source-level score using an Exponential Moving Average (EMA), a running average that weighs recent articles more heavily than older ones.
- Framing decay: 0.95. Editorial stance changes slowly. The most recent ~20 articles dominate a source’s framing score.
- Reliability decay: 0.90. Quality can shift faster. The most recent ~10 articles dominate a source’s reliability score.
Sources with fewer than 5 scored articles are labeled “Preliminary” to indicate the score is based on limited data.
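One common way to write such an EMA is `new = decay * old + (1 - decay) * article_score`. The sketch below assumes that form, and also assumes that a source's first article seeds the average; neither detail is stated on this page, and the function names are illustrative. With this form, the rule-of-thumb window of `1 / (1 - decay)` articles gives 20 for a decay of 0.95 and 10 for 0.90, which agrees with the figures above.

```python
FRAMING_DECAY = 0.95        # from the text
RELIABILITY_DECAY = 0.90    # from the text
PRELIMINARY_BELOW = 5       # from the text: fewer than 5 scored articles


def ema(scores, decay):
    """Fold article scores (oldest first) into one running average."""
    average = scores[0]     # assumption: the first article seeds the average
    for score in scores[1:]:
        average = decay * average + (1 - decay) * score
    return average


def effective_window(decay):
    """Rule-of-thumb number of recent articles that dominate the average."""
    return 1 / (1 - decay)


def is_preliminary(scores):
    return len(scores) < PRELIMINARY_BELOW


# Hypothetical source: a long critical history, then ten neutral articles.
history = [-0.6] * 30 + [0.0] * 10
framing = ema(history, FRAMING_DECAY)
print(round(framing, 3))
print(round(effective_window(FRAMING_DECAY)), round(effective_window(RELIABILITY_DECAY)))
```

In this hypothetical run, ten neutral articles move the source from -0.6 to roughly -0.36: the score shifts noticeably but does not jump, which is the slow-moving behavior the decay values are meant to produce.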
Label thresholds
| Label | Score Range |
|---|---|
| Government-Aligned | +0.50 to +1.00 |
| Leans Government | +0.15 to +0.49 |
| Mixed Framing | −0.14 to +0.14 |
| Leans Critical | −0.49 to −0.15 |
| Government-Critical | −1.00 to −0.50 |
| Outside Perspective | Assigned to international outlets regardless of score |
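
The ranges in the table are given to two decimals, so a score such as +0.145 falls between rows. The sketch below resolves this by treating each boundary as a cutoff (for example, anything from +0.15 up to but not including +0.50 is Leans Government). That reading, and the function name `framing_label`, are assumptions rather than documented behavior.

```python
def framing_label(score, international=False):
    """Map a framing score in [-1.0, +1.0] to a spectrum label."""
    if international:
        # International outlets get this label regardless of score.
        return "Outside Perspective"
    if not -1.0 <= score <= 1.0:
        raise ValueError("framing score must lie between -1.0 and +1.0")
    if score >= 0.50:
        return "Government-Aligned"
    if score >= 0.15:
        return "Leans Government"
    if score > -0.15:
        return "Mixed Framing"
    if score > -0.50:
        return "Leans Critical"
    return "Government-Critical"


# The framing scores from the Calibration Samples section, used as inputs.
for value in (-0.818, -0.425, 0.000, 0.320, 0.833):
    print(f"{value:+.3f} -> {framing_label(value)}")
```

Each of the five calibration scores lands on the label it is published under, and the exact boundary values (+0.50, +0.15, -0.15, -0.50) fall into the same rows as in the table.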
How Story-Level Scores Work
When you see a source's position on a story's spectrum bar, that position reflects the average of that source's article scores within that specific story — not the source's overall historical position.
ARSI scores at three levels:
- Article level: Each article receives a framing score and a reliability score, each based on the five signals for that dimension described above.
- Story level: Within a single story cluster, a source's position is the mean of its article scores for that story. This shows how that source framed this specific event.
- Source level: A source's overall position is an exponential moving average (EMA) across all its articles over time. This shows its general editorial pattern.
The spectrum bar in each story shows story-level scores. The source map and sparklines show source-level scores.
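The story-level calculation is a plain mean per source within one cluster. A minimal sketch, with hypothetical outlet names and scores and an illustrative function name:

```python
from collections import defaultdict
from statistics import mean


def story_level_scores(articles):
    """articles: (source, framing_score) pairs from a single story cluster."""
    by_source = defaultdict(list)
    for source, score in articles:
        by_source[source].append(score)
    # One position per source: the mean of its article scores in this story.
    return {source: mean(scores) for source, scores in by_source.items()}


# Hypothetical cluster: Outlet A published two articles, Outlet B one.
cluster_articles = [("Outlet A", -0.5), ("Outlet A", -0.25), ("Outlet B", 0.5)]
positions = story_level_scores(cluster_articles)
print(positions)  # {'Outlet A': -0.375, 'Outlet B': 0.5}
```

Unlike the source-level EMA, this mean has no memory: every article in the cluster counts equally, and articles from other stories do not enter at all.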
7. AI and Editorial Process
What AI does
- Embeddings (Voyage AI): Converts article text into numerical vectors for clustering. This is mathematical, not interpretive.
- Summarization (Claude, by Anthropic): Generates bilingual analysis (Georgian and English) for each story cluster: headline, summary, comparative analysis, and Perspective Check. Operates under strict neutrality rules with 15 domain-specific few-shot examples for Georgian language quality.
What humans do
- Source selection: Which outlets to monitor and how to categorize them.
- Editorial curation: Which story clusters are selected for the weekly briefing. Not every cluster makes the cut.
- Quality review: AI-generated summaries and Perspective Checks are reviewed before publication.
- Methodology design: The scoring signals, weights, thresholds, and voice rules are all human editorial choices.
What AI does not do
- AI does not decide which sources to monitor or how to label them on the spectrum.
- AI does not determine which stories appear in the briefing. That is an editorial decision.
- AI has no access to subscriber data or any personal information.
Calibration Samples
To make our scoring tangible, here are real articles with their scores and the signals that drove them. These are permanent reference examples.
Government-Critical (Framing: -0.818 | Reliability: 0.807)
Interpressnews — October 4 detainee case: defense lawyer on investigator's testimony
- Language Valence (-0.80): Headline foregrounds the investigator's failure to recall key facts, a loaded framing that implies obstruction.
- Source Attribution (-0.90): Quotes exclusively from the defense side; prosecution perspective absent.
- Story Selection (-0.85): Story framed around government accountability failure, not the legal process itself.
- Headline Framing (-0.80): Headline quotes the defense lawyer's characterization directly, adopting their framing.
Leans Critical (Framing: -0.425 | Reliability: 0.805)
Formula News — Public Defender submits amicus brief in Paata Manjgaladze case
- Language Valence (-0.70): Uses terms associated with rights-oriented framing; measured but tilted.
- Source Attribution (-0.80): Sources primarily from the defender's office; limited government-side response.
- Omission Pattern (+0.90): High omission score: the prosecution's position and government context are largely absent from the article.
- Headline Framing (-0.50): Neutral headline structure, but story selection itself carries a critical lean.
Mixed Framing (Framing: 0.000 | Reliability: 0.922)
Publika.ge — Plea deal in prison card game case; three fined 5,000 GEL each
- Language Valence (0.00): Entirely neutral language, with no loaded adjectives or value-laden framing.
- Source Attribution (0.00): Reports court proceedings directly; no selective quoting.
- Story Selection (0.00): Standard court reporting with no editorial angle.
- Headline Framing (0.00): Factual headline stating what happened, no interpretive framing.
This article scores the highest reliability (0.922) in our sample — verifiable facts, clear attribution, clean separation of reporting and opinion.
Leans Government (Framing: +0.320 | Reliability: 0.915)
Ambebi.ge — Bishop Nikoloz on patriarchate succession candidates
- Language Valence (0.00): Neutral language; the lean does not come from loaded words.
- Source Attribution (+1.00): Sources exclusively from the church institution; no independent or critical voices included.
- Omission Pattern (+0.80): Discussion of succession is presented only through the institution's preferred framing; alternative perspectives on church governance are absent.
- Headline Framing (0.00): Neutral headline that quotes the bishop directly.
This article illustrates how framing can lean government-aligned through sourcing and omission, even when the language itself is neutral.
Government-Aligned (Framing: +0.833 | Reliability: 0.873)
Imedi News — PM congratulates Peter Magyar on Hungarian election victory; thanks Orban for supporting Georgia
- Language Valence (+0.85): Celebratory language ("congratulations," "firm support," "national interests") adopts the government's diplomatic framing.
- Source Attribution (+0.90): Sources exclusively from the PM's statement; no opposition reaction or independent analysis.
- Story Selection (+0.80): Frames a foreign election result entirely through the lens of the Georgian ruling party's relationship with the winner.
- Omission Pattern (+0.75): No mention of EU criticism of the Georgian government or context around why the Orban relationship is politically charged.
- Headline Framing (+0.85): Headline amplifies the PM's congratulatory message, adopting the government's positive framing of the relationship.
8. Limitations and Validation
Architecture
ARSI uses a single AI model (Claude) to score all articles. There is no multi-rater consensus mechanism — unlike platforms that use human analyst triads, ARSI relies on one model's judgment per article. This architecture provides high consistency (the same article scored twice is expected to produce similar results) but means any systematic biases in the model are present in every score.
What We Do About It
- Automated consistency checks: We periodically re-score a random sample of articles and compare with original scores. Latest agreement rate: --% (framing), --% (reliability).
- Human review: The founder reviews flagged articles — those where re-scoring produces significantly different results — to identify systematic errors.
- Transparent signals: Every source's score can be expanded to show the five individual signals that produced it, so readers can evaluate whether the scoring makes sense.
Known Blind Spots
- Satire and irony detection in Georgian text
- Cultural idioms that carry political weight but are not explicitly partisan
- Distinguishing editorial columns from news reporting in outlets that do not clearly separate them
9. Right of Response
Any monitored outlet may request a review of their framing label or reliability score. Contact hello@arsinews.ge with specific articles you believe were scored incorrectly. We will review and publish corrections where warranted.
ARSI welcomes scrutiny of its analytical framework. If you believe our methodology has a systematic flaw, we want to hear about it.