
Social Media Analytics & Reporting -- Complete Data-Driven Strategy Guide

DodaTech Updated 2026-06-23 9 min read

In this tutorial, you'll learn about Social Media Analytics & Reporting. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.

Social media analytics collects, measures, and interprets data from platforms like Twitter, LinkedIn, and Instagram to optimize content strategy, track brand sentiment, benchmark competitors, and connect social media activities to business outcomes.

What You'll Learn

In this tutorial, you will learn how to collect social media data via platform APIs, calculate meaningful engagement metrics, perform sentiment analysis with natural language processing, build competitor benchmarking dashboards, create automated reporting pipelines, and avoid common social analytics pitfalls that lead to misleading conclusions.

Why It Matters

Social media is where customers discover, evaluate, and complain about brands. Without analytics, you post blindly and hope something works. Social media analytics reveals which content resonates, which platforms drive traffic, and how your brand is perceived relative to competitors. Companies using social analytics see 3x higher engagement rates, 2x faster response times to customer issues, and 40% better ROI on social ad spend. Detecting a single viral spike in negative sentiment early can prevent a PR crisis.

Real-World Use

DodaZIP used social media sentiment analysis during a major product launch and detected that 65% of mentions were negative within the first 6 hours, concentrated around a specific installation error on macOS. The engineering team identified and fixed the bug within 4 hours, and the social team launched a targeted apology campaign with a fix announcement. Sentiment shifted from 65% negative to 78% positive within 48 hours, preventing what could have become a week-long PR crisis.

Social Media Analytics Pipeline

flowchart LR
    A[Twitter API v2] --> B[Data Collection Service]
    C[LinkedIn API] --> B
    D[Instagram Graph API] --> B
    E[TikTok API] --> B
    B --> F[ETL Pipeline]
    F --> G[(Data Warehouse)]
    G --> H[Engagement Metrics]
    G --> I[Sentiment Analysis]
    G --> J[Competitor Benchmarks]
    G --> K[Content Performance]
    H --> L[Automated Dashboard]
    I --> L
    J --> L
    K --> L
    L --> M[Weekly Report]
    L --> N[Alert System]

Collecting Platform Data via APIs

Use platform APIs to collect engagement data programmatically:

import requests
import pandas as pd
from datetime import datetime, timedelta
import time

class SocialCollector:
    def __init__(self, api_keys):
        self.api_keys = api_keys

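    def collect_twitter_mentions(self, query, days_back=7, max_results=500):
        url = "https://api.twitter.com/2/tweets/search/recent"
        # The recent search endpoint only covers about the last 7 days, so cap
        # the lookback and stay one minute inside that window.
        days_back = min(days_back, 7)
        start = datetime.utcnow() - timedelta(days=days_back) + timedelta(minutes=1)
        # Drop microseconds so the timestamp reads YYYY-MM-DDTHH:MM:SSZ.
        start_time = start.replace(microsecond=0).isoformat() + "Z"

        params = {
            "query": query,
            "start_time": start_time,
            # The endpoint accepts between 10 and 100 results per page.
            "max_results": max(10, min(max_results, 100)),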
            "tweet.fields": "public_metrics,created_at,lang,referenced_tweets",
            "user.fields": "public_metrics,verified",
            "expansions": "author_id",
        }

        headers = {"Authorization": f"Bearer {self.api_keys['twitter']}"}
        all_tweets = []
        next_token = None

        while len(all_tweets) < max_results:
            if next_token:
                params["next_token"] = next_token

            response = requests.get(url, headers=headers, params=params, timeout=30)

            if response.status_code != 200:
                raise Exception(f"Twitter API error: {response.status_code}: {response.text}")

            data = response.json()
            tweets = data.get("data", [])
            users = {u["id"]: u for u in data.get("includes", {}).get("users", [])}

            for tweet in tweets:
                author = users.get(tweet.get("author_id"), {})
                metrics = tweet.get("public_metrics", {})
                all_tweets.append({
                    "tweet_id": tweet["id"],
                    "text": tweet["text"],
                    "author_id": tweet["author_id"],
                    "author_followers": author.get("public_metrics", {}).get("followers_count", 0),
                    "author_verified": author.get("verified", False),
                    "likes": metrics.get("like_count", 0),
                    "retweets": metrics.get("retweet_count", 0),
                    "replies": metrics.get("reply_count", 0),
                    "quotes": metrics.get("quote_count", 0),
                    "impressions": metrics.get("impression_count", 0),
                    "created_at": tweet["created_at"],
                    "lang": tweet.get("lang", "unknown"),
                    "platform": "twitter",
                })

            next_token = data.get("meta", {}).get("next_token")
            if not next_token:
                break
            time.sleep(1)

        # The last page can overshoot the requested total, so trim it.
        return pd.DataFrame(all_tweets[:max_results])

collector = SocialCollector({"twitter": "your_bearer_token_here"})
# Recent search only reaches back about 7 days, so request at most that much.
df = collector.collect_twitter_mentions("DodaTech OR DodaBrowser OR DodaZIP", days_back=7)
print(f"Collected {len(df)} tweets")
print(df.groupby("lang").agg({
    "tweet_id": "count",
    "likes": "sum",
    "retweets": "sum",
    "author_followers": "mean",
}).rename(columns={"tweet_id": "tweet_count"}).round(0))

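Expected output: A DataFrame of recent brand mentions with engagement metrics and author information, followed by a per-language summary of tweet counts, total likes, total retweets, and average author followers. Dividing the totals by the tweet count gives average likes and retweets per mention, which establish a baseline for content performance comparison. Low engagement relative to impressions suggests the content is not resonating.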
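
The collector above raises on any non-200 response, including a rate-limit response (HTTP 429). A minimal sketch of a wait-time helper follows, assuming the API includes an `x-rate-limit-reset` header holding the epoch second at which the limit resets. The function name `seconds_until_reset` and the 60-second fallback are illustrative choices, not part of the original collector.

```python
import time


def seconds_until_reset(headers, now=None, default_wait=60.0):
    """Return how long to sleep before retrying a rate-limited request."""
    now = time.time() if now is None else now
    reset = headers.get("x-rate-limit-reset")  # epoch seconds, sent as a string
    if reset is None:
        return default_wait  # assumed fallback when the header is missing
    try:
        # Never return a negative wait if the reset time is already past.
        return max(0.0, float(reset) - now)
    except ValueError:
        return default_wait


# Hypothetical header values for illustration.
print(seconds_until_reset({"x-rate-limit-reset": "1000"}, now=940))  # 60.0
print(seconds_until_reset({}, now=940))                              # 60.0 (fallback)
print(seconds_until_reset({"x-rate-limit-reset": "900"}, now=940))   # 0.0
```

Inside the collection loop, one option is to check for status 429 before raising, call `time.sleep(seconds_until_reset(response.headers))`, and retry the same page.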

Sentiment Analysis Pipeline

Classify brand sentiment from social media text:

import pandas as pd
import re
from textblob import TextBlob
from collections import Counter

def clean_text(text):
    text = re.sub(r"http\S+|www\S+|https\S+", "", str(text))
    text = re.sub(r"@\w+|#\w+", "", text)
    text = re.sub(r"[^a-zA-Z\s]", "", text)
    return text.strip().lower()

def analyze_social_sentiment(mentions_df, text_column="text"):
    df = mentions_df.copy()
    df["clean_text"] = df[text_column].apply(clean_text)

    def get_sentiment_scores(text):
        blob = TextBlob(str(text))
        polarity = blob.sentiment.polarity
        subjectivity = blob.sentiment.subjectivity

        if polarity > 0.15:
            sentiment = "positive"
        elif polarity < -0.15:
            sentiment = "negative"
        else:
            sentiment = "neutral"

        return pd.Series({
            "polarity": round(polarity, 3),
            "subjectivity": round(subjectivity, 3),
            "sentiment": sentiment,
        })

    # Score the cleaned text only; applying over rows would score the whole row's string form.
    sentiment_df = df["clean_text"].apply(get_sentiment_scores)
    df = pd.concat([df, sentiment_df], axis=1)

    summary = df.groupby(["platform", "sentiment"]).agg(
        mention_count=("tweet_id", "count"),
        avg_likes=("likes", "mean"),
        avg_retweets=("retweets", "mean"),
        total_impressions=("impressions", "sum"),
    ).round(1)

    return df, summary

df_with_sentiment, sentiment_summary = analyze_social_sentiment(df)
print("Sentiment distribution:")
print(df_with_sentiment["sentiment"].value_counts(normalize=True).mul(100).round(1).astype(str) + "%")
print("\nPlatform-level sentiment:")
print(sentiment_summary)

Expected output: A sentiment distribution showing positive, negative, and neutral percentages. The platform-level breakdown reveals where brand perception is strongest. A sudden spike in negative sentiment (above 25% of all mentions) within a 24-hour window should trigger an alert for the social team to investigate.
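
The 24-hour alert rule can be sketched as follows. This is a dependency-free illustration that assumes each mention is a dict with a timezone-aware `created_at` and a `sentiment` label. The `MIN_MENTIONS` floor is an added assumption so that a handful of mentions cannot trigger an alert, and the sample data is made up.

```python
from datetime import datetime, timedelta, timezone

NEGATIVE_SHARE_THRESHOLD = 0.25  # 25% of mentions, as suggested in the text
WINDOW = timedelta(hours=24)
MIN_MENTIONS = 20  # assumed floor, not from the original tutorial


def negative_share(mentions, now, window=WINDOW):
    """Return (share_of_negative, mention_count) for mentions inside the window."""
    cutoff = now - window
    recent = [m for m in mentions if cutoff <= m["created_at"] <= now]
    if not recent:
        return 0.0, 0
    negative = sum(1 for m in recent if m["sentiment"] == "negative")
    return negative / len(recent), len(recent)


def should_alert(mentions, now, threshold=NEGATIVE_SHARE_THRESHOLD,
                 min_mentions=MIN_MENTIONS):
    """Alert only when the window is busy enough and the negative share is high."""
    share, count = negative_share(mentions, now)
    return count >= min_mentions and share > threshold


# Hypothetical sample: 30 mentions in the last day, 12 of them negative.
now = datetime(2026, 6, 23, 12, 0, tzinfo=timezone.utc)
sample = [
    {
        "created_at": now - timedelta(hours=i % 20),
        "sentiment": "negative" if i < 12 else "neutral",
    }
    for i in range(30)
]
share, count = negative_share(sample, now)
print(f"{share:.0%} negative across {count} mentions -> alert: {should_alert(sample, now)}")
# 40% negative across 30 mentions -> alert: True
```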

Engagement Rate and Content Performance

Calculate standardized engagement metrics across platforms:

// Engagement rate calculator normalized by impressions
const posts = [
  {
    platform: "twitter",
    content_type: "announcement",
    likes: 342,
    retweets: 128,
    replies: 56,
    quotes: 23,
    impressions: 18500,
    followers: 5200,
    posted_at: "2026-06-22T14:00:00Z",
  },
  {
    platform: "linkedin",
    content_type: "thought_leadership",
    likes: 456,
    shares: 89,
    comments: 67,
    impressions: 32000,
    followers: 9800,
    posted_at: "2026-06-22T09:00:00Z",
  },
  {
    platform: "instagram",
    content_type: "product_showcase",
    likes: 1234,
    shares: 45,
    comments: 89,
    saves: 234,
    impressions: 45000,
    followers: 12400,
    posted_at: "2026-06-21T18:00:00Z",
  },
];

// Upper bound of each platform's typical range, taken from the benchmarks quoted after this code.
const PLATFORM_BENCHMARKS = { twitter: 1, linkedin: 5, instagram: 3 };

function calculateEngagementMetrics(post) {
  const totalEngagements =
    (post.likes || 0) +
    (post.retweets || post.shares || 0) +
    (post.replies || post.comments || 0) +
    (post.quotes || post.saves || 0);

  const engagementRateByImpressions =
    (totalEngagements / post.impressions) * 100;
  const engagementRateByFollowers =
    (totalEngagements / post.followers) * 100;

  // Judge each post against its own platform's benchmark rather than one global threshold.
  const benchmarkRate = PLATFORM_BENCHMARKS[post.platform] ?? 3;

  return {
    platform: post.platform,
    contentType: post.content_type,
    engagementRate: `${engagementRateByImpressions.toFixed(2)}%`,
    engagementByFollowers: `${engagementRateByFollowers.toFixed(2)}%`,
    totalEngagements,
    impressions: post.impressions.toLocaleString(),
    benchmark: engagementRateByImpressions >= benchmarkRate ? "Above average" : "Below average",
  };
}

posts.forEach((post) => {
  const result = calculateEngagementMetrics(post);
  console.log(
    `${result.platform} (${result.contentType}): ${result.engagementRate} engagement rate - ${result.benchmark}`
  );
});

Expected output: Engagement rates by platform. Industry benchmarks vary by platform: Twitter 0.5-1%, LinkedIn 2-5%, Instagram 1-3%. Rates consistently below benchmark suggest content-audience mismatch, wrong posting times, or platform algorithm changes.

Competitor Benchmarking with SQL

Compare brand performance against competitors:

WITH weekly_brand_metrics AS (
  SELECT
    DATE_TRUNC('week', post_date) AS week,
    brand,
    platform,
    COUNT(*) AS total_posts,
    SUM(engagements) AS total_engagements,
    SUM(impressions) AS total_impressions,
    ROUND(AVG(engagement_rate), 2) AS avg_engagement_rate,
    COUNT(DISTINCT CASE WHEN sentiment = 'positive' THEN mention_id END) AS positive_mentions,
    COUNT(DISTINCT CASE WHEN sentiment = 'negative' THEN mention_id END) AS negative_mentions
  FROM social_analytics
  WHERE post_date >= CURRENT_DATE - INTERVAL '90 days'
  GROUP BY DATE_TRUNC('week', post_date), brand, platform
)

SELECT
  brand,
  platform,
  COUNT(*) AS weeks_active,
  ROUND(AVG(total_posts), 1) AS avg_weekly_posts,
  ROUND(AVG(total_engagements), 0) AS avg_weekly_engagements,
  ROUND(AVG(avg_engagement_rate), 2) AS avg_engagement_rate,
  ROUND(SUM(total_engagements) * 1.0 / NULLIF(SUM(total_posts), 0), 1) AS engagements_per_post,
  ROUND(SUM(total_impressions) * 1.0 / NULLIF(SUM(total_posts), 0), 0) AS avg_impressions_per_post,
  ROUND(AVG(positive_mentions) * 100.0 / NULLIF(AVG(positive_mentions + negative_mentions), 0), 1) AS sentiment_score
FROM weekly_brand_metrics
GROUP BY brand, platform
ORDER BY brand, platform;

Expected output: A comparison table showing posting frequency, engagement, and sentiment for your brand vs competitors. A competitor with higher engagement per post but lower posting frequency suggests a quality-over-quantity strategy. A competitor with declining sentiment over 90 days presents an opportunity to capture their dissatisfied audience.

Tool Comparison

Feature             | Hootsuite | Sprout Social | Buffer  | Brandwatch | Custom Python
Sentiment analysis  | Basic     | Advanced      | No      | Advanced   | TextBlob/VADER
Competitor tracking | Yes       | Yes           | No      | Yes        | API queries
Automated reports   | Yes       | Yes           | Limited | Yes        | Python + Jinja2
Influencer ID       | No        | Yes           | No      | Yes        | API + scoring
API access          | Limited   | Yes           | No      | Yes        | Full control
Cost (mid-tier)     | $99/mo    | $249/mo       | $65/mo  | $800/mo    | Infrastructure

Common Errors

1. Vanity Metrics Over Business Metrics

Likes and followers do not pay bills. Optimizing for vanity metrics at the expense of click-through rates, conversions, and leads creates popular content that drives no business value. Tie every social metric to a downstream action: website visit, signup, demo request, or purchase.
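
One way to tie a post to downstream actions is to report click-through and click-to-signup rates next to its likes. The sketch below assumes each post record already carries `link_clicks` and `signups` counts joined from web analytics; those field names and the sample numbers are hypothetical.

```python
def funnel_metrics(post):
    """Return click-through and click-to-signup rates for one post."""
    impressions = post["impressions"]
    clicks = post["link_clicks"]
    ctr = clicks / impressions if impressions else 0.0
    click_to_signup = post["signups"] / clicks if clicks else 0.0
    return {
        "id": post["id"],
        "likes": post["likes"],
        "ctr_pct": round(ctr * 100, 2),
        "click_to_signup_pct": round(click_to_signup * 100, 2),
        "signups": post["signups"],
    }


# Hypothetical posts: one popular, one that actually converts.
posts = [
    {"id": "meme", "likes": 2400, "impressions": 60000, "link_clicks": 120, "signups": 2},
    {"id": "how_to_guide", "likes": 310, "impressions": 15000, "link_clicks": 450, "signups": 36},
]

# Rank by signups rather than likes to surface business value.
ranked = sorted((funnel_metrics(p) for p in posts), key=lambda m: m["signups"], reverse=True)
for row in ranked:
    print(row)
```

In this sample the meme collects far more likes, yet the how-to guide produces 18 times as many signups.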

2. Ignoring Platform-Specific Benchmarks

A 2% engagement rate is excellent on Twitter but only at the bottom of the typical range on LinkedIn. Always compare metrics against the benchmarks for the same platform, not across platforms. Each platform has different user behavior, content expectations, and algorithm characteristics.

3. Sampling Bias in Sentiment Collection

Analyzing sentiment on 100 highly-engaged tweets from angry users does not represent overall brand perception. Collect all mentions (not just highly-engaged ones), or use random sampling for representative measurement. Weight sentiment scores by reach for a more accurate picture.
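
Random sampling and reach weighting can be sketched as below. The example assumes each mention carries a `polarity` score and uses `author_followers` as a rough proxy for reach; that proxy is an assumption, and impressions would serve equally well where available.

```python
import random


def random_sample(mentions, k, seed=42):
    """Draw a reproducible random sample instead of taking the most-engaged mentions."""
    rng = random.Random(seed)
    return rng.sample(mentions, min(k, len(mentions)))


def weighted_polarity(mentions, reach_key="author_followers"):
    """Average polarity weighted by reach; plain mean when total reach is zero."""
    if not mentions:
        return 0.0
    total_reach = sum(m[reach_key] for m in mentions)
    if total_reach == 0:
        return sum(m["polarity"] for m in mentions) / len(mentions)
    return sum(m["polarity"] * m[reach_key] for m in mentions) / total_reach


# Hypothetical mentions for illustration.
sample = [
    {"polarity": -0.8, "author_followers": 100},
    {"polarity": 0.4, "author_followers": 900},
    {"polarity": 0.1, "author_followers": 0},
]
unweighted = sum(m["polarity"] for m in sample) / len(sample)
weighted = weighted_polarity(sample)
print(f"unweighted: {unweighted:.2f}, reach-weighted: {weighted:.2f}")
# unweighted: -0.10, reach-weighted: 0.28
```

The two numbers answer different questions: the unweighted mean describes what a typical author says, while the weighted mean approximates what a typical reader sees.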

4. Not Normalizing for Posting Variables

Comparing engagement across posts without normalizing for posting time, day of week, content format, and promotion status creates misleading conclusions. A promoted Monday morning announcement will outperform an organic Friday evening meme for reasons unrelated to content quality.
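
A simple form of normalization is to compare each post with the median of posts that share the same posting variables. The sketch below assumes each post has `promoted`, `weekday`, and `format` fields plus a precomputed `engagement_rate`; the field names and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import median

GROUP_KEYS = ("promoted", "weekday", "format")


def relative_performance(posts, group_keys=GROUP_KEYS):
    """Attach each post's engagement rate relative to its peer-group median."""
    groups = defaultdict(list)
    for post in posts:
        groups[tuple(post[k] for k in group_keys)].append(post["engagement_rate"])
    baselines = {key: median(rates) for key, rates in groups.items()}

    results = []
    for post in posts:
        baseline = baselines[tuple(post[k] for k in group_keys)]
        # A zero baseline makes the ratio undefined, so report None instead.
        ratio = post["engagement_rate"] / baseline if baseline else None
        results.append({**post, "vs_group_median": ratio})
    return results


posts = [
    {"id": "a", "promoted": True, "weekday": "Mon", "format": "announcement", "engagement_rate": 4.0},
    {"id": "b", "promoted": True, "weekday": "Mon", "format": "announcement", "engagement_rate": 6.0},
    {"id": "c", "promoted": False, "weekday": "Fri", "format": "meme", "engagement_rate": 1.0},
    {"id": "d", "promoted": False, "weekday": "Fri", "format": "meme", "engagement_rate": 3.0},
]
results = relative_performance(posts)
for r in results:
    print(r["id"], r["engagement_rate"], round(r["vs_group_median"], 2))
```

Post `d` has half the raw engagement rate of post `b`, yet it beats its own peer group by a wider margin (1.5x versus 1.2x), which is the comparison that says something about content quality.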

5. Overlooking Dark Social Traffic

Links shared via messaging apps (WhatsApp, Messenger, SMS, email) appear as direct traffic in analytics tools, not as social referrals. Dark social accounts for 20-35% of all social sharing. Use UTM parameters, link shorteners, and referrer analysis to capture this hidden channel.
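
UTM tagging can be sketched with the standard library alone. The helper below appends `utm_source`, `utm_medium`, and optional `utm_campaign` parameters to a link while keeping any existing query string. The function name `add_utm`, the example URL, and the campaign name are placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_utm(url, source, medium="social", campaign=None, content=None):
    """Return the URL with UTM parameters added, preserving existing query params."""
    parts = urlparse(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["utm_source"] = source
    query["utm_medium"] = medium
    if campaign:
        query["utm_campaign"] = campaign
    if content:
        query["utm_content"] = content
    return urlunparse(parts._replace(query=urlencode(query)))


tagged = add_utm("https://example.com/launch?ref=home", "whatsapp", campaign="june_launch")
print(tagged)
# https://example.com/launch?ref=home&utm_source=whatsapp&utm_medium=social&utm_campaign=june_launch
```

Generating a distinct tagged link per share button (for example one for WhatsApp and one for email) lets visits that would otherwise appear as direct traffic be attributed to the channel that carried the link.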

Practice Questions

1. What is engagement rate and how should it be calculated? Engagement rate is total engagements divided by total impressions, expressed as a percentage. It measures how compelling your content is relative to how many people saw it. Using impressions as the denominator gives a more accurate picture than followers, since not all followers see every post.

2. How does sentiment analysis classify social media mentions? Sentiment analysis uses natural language processing to measure the polarity of text. In the TextBlob implementation, polarity scores range from -1 (very negative) to +1 (very positive). Mentions with polarity > 0.15 are classified positive, < -0.15 negative, and between -0.15 and 0.15 neutral.

3. What is dark social and why does it matter? Dark social refers to content shared through private channels like messaging apps and email that cannot be tracked by standard analytics tools. It typically accounts for 20-35% of all social sharing and requires UTM parameters, link shorteners, or specialized tracking to measure.

4. Why benchmark against competitors? Competitor benchmarking provides context for your own metrics. If competitors have similar engagement rates, your content strategy may be industry-appropriate. If they significantly outperform you, their approach offers optimization opportunities. Declining competitor sentiment signals a chance to capture their dissatisfied audience.

5. Challenge: Set up a social media data collection pipeline for your brand and three competitors across Twitter and LinkedIn. Collect 90 days of data via APIs. Build a sentiment analysis pipeline that classifies mentions daily and detects sentiment shifts. Create a dashboard showing engagement rate trends, sentiment time series with anomaly detection, top-performing content themes by topic extraction, and a competitor comparison matrix. Identify three data-driven content strategy adjustments.

Mini Project

Build a social media analytics platform that collects data from Twitter API v2 and LinkedIn API into PostgreSQL. Implement daily ETL pipelines that capture posts, engagements, mentions, follower counts, and competitor activity. Calculate engagement rates, sentiment scores using TextBlob, share of voice vs competitors, and content performance by topic, format, and posting time. Create an automated weekly report using Python and Jinja2 templates that includes a sentiment trend chart, top-performing content table, competitor benchmark comparison, and three recommended content strategy adjustments with expected impact estimates.
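
As a starting point for the share-of-voice metric, here is a minimal sketch that treats share of voice as each brand's percentage of all collected mentions. That definition is one common convention rather than the only one, and the competitor names and counts are placeholders.

```python
from collections import Counter


def share_of_voice(mention_brands):
    """Return each brand's percentage of total mentions, largest first."""
    counts = Counter(mention_brands)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {brand: round(n / total * 100, 1) for brand, n in counts.most_common()}


# Hypothetical mention log: one brand label per collected mention.
mentions = ["DodaTech"] * 45 + ["CompetitorA"] * 90 + ["CompetitorB"] * 15
sov = share_of_voice(mentions)
print(sov)
# {'CompetitorA': 60.0, 'DodaTech': 30.0, 'CompetitorB': 10.0}
```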

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
