How Arsi Analyzes the News
Arsi monitors Georgian news sources and compares how they cover the same events. This page explains what we track, how we organize stories, and what the framing labels mean.
1. What Arsi Does
ARSI (arsinews.ge) monitors dozens of news sources covering Georgia. Every day, we collect articles, group them by story, and compare how different sources framed the same event.
The output is a weekly briefing — delivered by email and published on this site — that shows readers which stories got covered, which sources covered them, and where the framing diverged.
ARSI does not produce original journalism. It does not report news. It analyzes how news is reported.
2. Our Sources
ARSI currently monitors 38 news sources across three categories.
Sources are selected based on reach, editorial influence, and coverage of Georgian affairs. We aim for broad spectrum representation — from outlets that consistently frame stories favorably toward the government to those that frame them critically, plus international outlets that provide an outside perspective. International sources include Western, regional, and Russian state-affiliated media.
International sources receive the label Outside Perspective regardless of their framing score, because their editorial context is different from domestic Georgian media.
3. The Framing Spectrum
Each Georgian domestic source is placed on a five-point framing spectrum based on how it frames stories relative to the ruling party’s narrative. This is a measure of editorial framing patterns, not political affiliation.
| Label | What It Means |
|---|---|
| Government-Critical | Consistently frames stories critically toward the ruling party |
| Leans Critical | Tends toward government-critical framing |
| Mixed Framing | No consistent framing direction |
| Leans Government | Tends toward government-favorable framing |
| Government-Aligned | Consistently frames stories favorably toward the ruling party |
| Outside Perspective | International outlet — assigned regardless of framing score |
Framing and reliability are independent
A source’s framing label says nothing about its journalistic quality, and vice versa. A government-aligned source can have high reliability. A government-critical source can have low reliability. One dimension does not determine the other.
4. How Stories Are Grouped
When new articles are collected, they are converted into numerical representations (embeddings) that capture semantic meaning. Articles about the same event are automatically grouped into story clusters based on content similarity.
- Similarity threshold (55%): Two articles must share at least 55% semantic similarity to be grouped together. This is tuned to catch articles covering the same event while keeping distinct stories separate.
- Time window (7 days): Clustering operates within a rolling 7-day window, matching the weekly briefing cadence. Articles older than 7 days start a new cluster even if they cover an ongoing story.
- New articles match existing clusters first: Each new article is compared against existing clusters before being grouped with other unmatched articles. This means a story cluster grows as more sources cover the same event throughout the week.
Clustering uses vector embeddings (Voyage AI) and cosine similarity — no AI language model is involved in this step. The grouping is purely mathematical.
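The grouping rules above can be illustrated with a minimal sketch. It is not ARSI's actual code: the `Article` and `Cluster` classes, the `assign` function, the comparison against a cluster centroid, and measuring the 7-day window from a cluster's earliest article are all assumptions made for illustration. Only the 0.55 threshold, the 7-day window, the use of cosine similarity, and the "existing clusters first" rule come from the description above.

```python
import math
from dataclasses import dataclass, field
from datetime import datetime, timedelta

SIMILARITY_THRESHOLD = 0.55    # from the text: 55% similarity
WINDOW = timedelta(days=7)     # from the text: rolling 7-day window


@dataclass
class Article:                 # hypothetical container
    title: str
    published: datetime
    embedding: list            # vector assumed to come from the embedding step


@dataclass
class Cluster:                 # hypothetical container
    articles: list = field(default_factory=list)

    def centroid(self):
        # Assumption: a cluster is represented by the mean of its embeddings.
        n = len(self.articles)
        dims = len(self.articles[0].embedding)
        return [sum(a.embedding[i] for a in self.articles) / n for i in range(dims)]

    def started(self):
        return min(a.published for a in self.articles)


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign(article, clusters):
    """Add the article to the most similar in-window cluster, else start a new one."""
    best, best_sim = None, SIMILARITY_THRESHOLD
    for cluster in clusters:
        if article.published - cluster.started() > WINDOW:
            continue           # outside the 7-day window: not a candidate
        sim = cosine_similarity(article.embedding, cluster.centroid())
        if sim >= best_sim:
            best, best_sim = cluster, sim
    if best is None:           # nothing matched: the article opens a new cluster
        best = Cluster()
        clusters.append(best)
    best.articles.append(article)
    return best
```

Processing articles one at a time keeps the sketch short: an unmatched article opens a cluster that later articles can then join, which approximates the "existing clusters first" rule without a separate pass over unmatched articles.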
5. Perspective Checks
The Perspective Check is ARSI’s core analytical feature. For each story cluster, we generate a structured comparison of how different sources covered the same event.
Every Perspective Check follows the same three-part structure:
Common Ground establishes what all sources agree happened — the baseline facts.
Key Divergence identifies the single biggest framing split — what each source emphasized or omitted.
Reader Note flags information the reader should weigh — often a fact reported by only one source, or a perspective missing from all coverage.
Special cases
- Single source: When only one source covered a story, the Perspective Check notes that no comparison is possible and flags the source’s reliability score so readers can calibrate accordingly.
- Uniform framing: When all sources share the same framing label, the check notes the absence of alternative perspectives explicitly.
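As a rough illustration, the two special cases can be detected from the list of sources in a cluster. The `Coverage` record, the `special_case` function, and the returned string codes are hypothetical names chosen for this sketch. The description above does not say which case takes precedence, so the order of the checks is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Coverage:                # hypothetical record: one source's coverage of a story
    source: str
    framing_label: str
    reliability: float


def special_case(coverage) -> Optional[str]:
    """Return which special case applies to a story cluster, if any."""
    if not coverage:
        raise ValueError("a story cluster needs at least one article")
    if len({c.source for c in coverage}) == 1:
        return "single_source"      # no comparison possible; flag reliability
    if len({c.framing_label for c in coverage}) == 1:
        return "uniform_framing"    # note the absence of alternative perspectives
    return None                     # regular three-part Perspective Check
```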
6. How Source Labels Are Derived
Source framing labels are derived from article-level scoring. Each scored article is evaluated on two independent dimensions, framing and reliability, using 10 weighted signals (five per dimension):
Framing signals
Each scored from −1.0 (government-critical) to +1.0 (government-aligned):
- Language Valence (25%) — Does the article use loaded language that presents government actions positively or negatively?
- Source Attribution (20%) — Who gets quoted? Government officials, opposition voices, or a balanced mix?
- Story Selection (25%) — How is the core event framed? As a government achievement, failure, or neutral development?
- Omission Pattern (15%) — Compared to other sources, what does this article leave out?
- Headline Framing (15%) — Does the headline frame the story favorably or unfavorably for the government?
Reliability signals
Each scored from 0.0 (lowest quality) to 1.0 (highest quality):
- Factual Verifiability (25%) — Are claims specific and verifiable, or vague and unattributable?
- Attribution Quality (20%) — Are claims attributed to named sources, or anonymous and unattributed?
- Factual Consistency (25%) — Do the article’s facts match what other sources report?
- Context Completeness (15%) — Does the article provide relevant background, or present events in a vacuum?
- Opinion Separation (15%) — Does the article clearly separate fact from opinion?
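One plausible way to combine the signals into a single score per dimension is a weighted sum, sketched below. The weights are the percentages listed above. The linear combination itself, the dictionary keys, and the `weighted_score` function are assumptions, since the page does not state the exact formula.

```python
FRAMING_WEIGHTS = {
    "language_valence": 0.25,
    "source_attribution": 0.20,
    "story_selection": 0.25,
    "omission_pattern": 0.15,
    "headline_framing": 0.15,
}

RELIABILITY_WEIGHTS = {
    "factual_verifiability": 0.25,
    "attribution_quality": 0.20,
    "factual_consistency": 0.25,
    "context_completeness": 0.15,
    "opinion_separation": 0.15,
}


def weighted_score(signals, weights, low, high):
    """Combine per-signal scores into one score (assumed: a plain weighted sum)."""
    if set(signals) != set(weights):
        raise ValueError("signals must match the weight table exactly")
    for name, value in signals.items():
        if not low <= value <= high:
            raise ValueError(f"{name} is outside [{low}, {high}]")
    return sum(weights[name] * signals[name] for name in weights)


# Illustrative values only, not real article scores.
example_framing = weighted_score(
    {
        "language_valence": 0.5,
        "source_attribution": 0.0,
        "story_selection": 1.0,
        "omission_pattern": -0.5,
        "headline_framing": 0.0,
    },
    FRAMING_WEIGHTS,
    low=-1.0,
    high=1.0,
)
```

Because each set of weights sums to 1, the combined score stays in the same range as the individual signals: −1.0 to +1.0 for framing and 0.0 to 1.0 for reliability.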
From article scores to source labels
Individual article scores feed into a source-level score using an Exponential Moving Average (EMA) — a running average that weighs recent articles more heavily than older ones.
- Framing decay: 0.95 — Editorial stance changes slowly. The most recent ~20 articles dominate a source’s framing score.
- Reliability decay: 0.90 — Quality can shift faster. The most recent ~10 articles dominate a source’s reliability score.
Sources with fewer than 5 scored articles are labeled “Preliminary” to indicate the score is based on limited data.
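A minimal sketch of the running average follows, assuming the decay value is the weight kept on the previous source score and that a source's first article seeds the average. Under that reading the effective window is roughly 1 / (1 − decay), which gives about 20 articles for 0.95 and about 10 for 0.90, in line with the figures above. The `SourceScore` class and `ema` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

FRAMING_DECAY = 0.95       # from the text
RELIABILITY_DECAY = 0.90   # from the text
MIN_ARTICLES = 5           # below this, the score is "Preliminary"


def ema(previous: Optional[float], new: float, decay: float) -> float:
    """Exponential moving average; assumption: the first article seeds the score."""
    if previous is None:
        return new
    return decay * previous + (1.0 - decay) * new


@dataclass
class SourceScore:         # hypothetical container for one source
    framing: Optional[float] = None
    reliability: Optional[float] = None
    scored_articles: int = 0

    def update(self, article_framing: float, article_reliability: float) -> None:
        self.framing = ema(self.framing, article_framing, FRAMING_DECAY)
        self.reliability = ema(self.reliability, article_reliability, RELIABILITY_DECAY)
        self.scored_articles += 1

    @property
    def preliminary(self) -> bool:
        return self.scored_articles < MIN_ARTICLES
```

With a decay of 0.95, a single new article moves the framing score by only 5% of the gap between the old score and the new article's score, which is why one outlier does not flip a label.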
Label thresholds
| Label | Score Range |
|---|---|
| Government-Aligned | +0.50 to +1.00 |
| Leans Government | +0.15 to +0.49 |
| Mixed Framing | −0.14 to +0.14 |
| Leans Critical | −0.49 to −0.15 |
| Government-Critical | −1.00 to −0.50 |
| Outside Perspective | Assigned to international outlets regardless of score |
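The table can be read as a simple lookup, sketched here. The ranges are stated to two decimal places and leave small gaps between them (for example between +0.49 and +0.50), so this sketch assumes scores are rounded to two decimals before the lookup. That rounding step and the `framing_label` function are assumptions, not something the page specifies.

```python
def framing_label(score: float, international: bool = False) -> str:
    """Map a source-level framing score to its label using the thresholds above."""
    if international:
        return "Outside Perspective"   # assigned regardless of score
    if not -1.0 <= score <= 1.0:
        raise ValueError("framing score must be within [-1.0, 1.0]")
    s = round(score, 2)                # assumption: thresholds apply to 2 decimals
    if s >= 0.50:
        return "Government-Aligned"
    if s >= 0.15:
        return "Leans Government"
    if s > -0.15:
        return "Mixed Framing"
    if s > -0.50:
        return "Leans Critical"
    return "Government-Critical"
```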
7. AI and Editorial Process
What AI does
- Embeddings (Voyage AI) — Converts article text into numerical vectors for clustering. This is mathematical, not interpretive.
- Summarization (Claude, by Anthropic) — Generates bilingual analysis (Georgian and English) for each story cluster: headline, summary, comparative analysis, and Perspective Check. Operates under strict neutrality rules with 15 domain-specific few-shot examples for Georgian language quality.
What humans do
- Source selection — Which outlets to monitor and how to categorize them.
- Editorial curation — Which story clusters are selected for the weekly briefing. Not every cluster makes the cut.
- Quality review — AI-generated summaries and Perspective Checks are reviewed before publication.
- Methodology design — The scoring signals, weights, thresholds, and voice rules are all human editorial choices.
What AI does not do
- AI does not decide which sources to monitor or how to label them on the spectrum.
- AI does not determine which stories appear in the briefing — that is an editorial decision.
- AI has no access to subscriber data or any personal information.
8. Limitations and Disclaimers
- All scores and labels represent ARSI’s editorial assessment based on this published methodology. They are not measurements of media quality or political alignment.
- AI-based analysis has inherent limitations. Language nuance, satire, irony, and cultural context may be misinterpreted.
- Framing labels reflect patterns in published text. They do not measure a source’s intent, internal editorial policies, or journalists’ personal views.
- Source labels change over time as new articles are scored. A source’s current label reflects its recent reporting patterns, not a permanent classification.
- ARSI’s methodology involves editorial choices — which signals to weight, where to draw thresholds, how to define framing categories. These choices are documented and applied consistently, but they are choices. ARSI is analytical, not neutral.
- Coverage gaps are real. If a story is covered by only one or two sources in our set, the Perspective Check is limited. Absence from ARSI’s monitoring set does not mean a story was not covered elsewhere.
9. Right of Response
Any monitored outlet may request a review of its framing label or reliability score. Contact hello@arsinews.ge with the specific articles you believe were scored incorrectly. We will review them and publish corrections where warranted.
ARSI welcomes scrutiny of its analytical framework. If you believe our methodology has a systematic flaw, we want to hear about it.