Senior Data Analyst Interview Questions: SQL, Dashboards, and Stakeholders

Milad Bonakdar
Author
Prepare for senior data analyst interviews with practical questions on advanced SQL, experimentation, data quality, dashboard design, metric selection, and stakeholder tradeoffs.
Introduction
Senior data analyst interviews usually test more than query syntax. Expect to show how you frame a business problem, choose the right metric, write efficient SQL, validate data quality, explain an experiment, and turn a dashboard into a decision.
Use this guide to practice answers that sound senior: state assumptions, name tradeoffs, connect analysis to business impact, and explain what you would do next when the data is imperfect.
Advanced SQL (6 Questions)
1. Explain window functions and give examples.
Answer: Window functions perform calculations across a set of rows related to the current row without collapsing the result.
- Common Window Functions:
- ROW_NUMBER(): Unique sequential number
- RANK(): Rank with gaps for ties
- DENSE_RANK(): Rank without gaps
- LAG/LEAD(): Access previous/next row
- SUM/AVG/COUNT() OVER(): Running totals/averages
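The functions above can be demonstrated with Python's built-in `sqlite3` module (the `sales` table and its rows are hypothetical, used only for illustration):

```python
import sqlite3

# Hypothetical sales table used only to illustrate window functions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (rep TEXT, month TEXT, amount INTEGER);
INSERT INTO sales VALUES
  ('ana', '2024-01', 100), ('ana', '2024-02', 150),
  ('bob', '2024-01', 200), ('bob', '2024-02', 120);
""")

rows = conn.execute("""
SELECT rep, month, amount,
       RANK() OVER (PARTITION BY month ORDER BY amount DESC) AS month_rank,
       LAG(amount) OVER (PARTITION BY rep ORDER BY month)    AS prev_amount,
       SUM(amount) OVER (PARTITION BY rep ORDER BY month)    AS running_total
FROM sales
ORDER BY rep, month
""").fetchall()

for r in rows:
    print(r)
```

Note that the detail rows survive: each input row gets a rank, a previous value, and a running total, where a `GROUP BY` would have collapsed them.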
Rarity: Very Common Difficulty: Hard
2. How do you optimize slow SQL queries?
Answer: Query optimization improves performance and reduces resource usage.
- Techniques:
- Indexing: Create indexes on frequently queried columns
- Avoid SELECT *: Select only needed columns
- Use WHERE efficiently: Filter early
- Optimize JOINs: Join on indexed columns
- Avoid subqueries: Use JOINs or CTEs instead
- Use EXPLAIN: Analyze query execution plan
- Partition tables: For very large tables
- Aggregate efficiently: Use appropriate GROUP BY
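The indexing and EXPLAIN points can be shown together in SQLite (the `orders` table and index name are hypothetical; other engines format their plans differently, but the scan-vs-index distinction is the same):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

def plan(sql):
    # EXPLAIN QUERY PLAN reports whether SQLite scans the table or uses an index.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

before = plan("SELECT total FROM orders WHERE customer_id = 42")
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan("SELECT total FROM orders WHERE customer_id = 42")

print(before)  # full table scan before the index exists
print(after)   # index search afterwards
```

In an interview, walking through a real plan like this (scan before, index search after) is more convincing than reciting the checklist.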
Rarity: Very Common Difficulty: Hard
3. What are CTEs (Common Table Expressions) and when would you use them?
Answer: CTEs create temporary named result sets that exist only during query execution.
- Benefits:
- Improve readability
- Enable recursion
- Reuse in same query
- Better than subqueries for complex logic
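A small sketch showing both the readability and the recursion benefits (the numbers generated here are arbitrary; the point is the named, reusable result sets):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A recursive CTE generates 1..5; a second CTE reuses it by name.
rows = conn.execute("""
WITH RECURSIVE nums(n) AS (
  SELECT 1
  UNION ALL
  SELECT n + 1 FROM nums WHERE n < 5
),
squares AS (
  SELECT n, n * n AS sq FROM nums
)
SELECT n, sq FROM squares WHERE sq > 5
""").fetchall()

print(rows)  # [(3, 9), (4, 16), (5, 25)]
```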
Rarity: Common Difficulty: Medium
4. Explain the difference between UNION and UNION ALL.
Answer: Both combine results from multiple SELECT statements.
- UNION:
- Removes duplicate rows
- Slower (requires sorting/comparison)
- Use when duplicates should be eliminated
- UNION ALL:
- Keeps all rows including duplicates
- Faster (no deduplication)
- Use when duplicates are acceptable or impossible
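The difference is easy to show with two hypothetical lead tables that share one email:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE web_leads (email TEXT);
CREATE TABLE event_leads (email TEXT);
INSERT INTO web_leads VALUES ('a@x.com'), ('b@x.com');
INSERT INTO event_leads VALUES ('b@x.com'), ('c@x.com');
""")

# UNION deduplicates across both inputs; UNION ALL keeps every row.
dedup = conn.execute(
    "SELECT email FROM web_leads UNION SELECT email FROM event_leads ORDER BY email"
).fetchall()
everything = conn.execute(
    "SELECT email FROM web_leads UNION ALL SELECT email FROM event_leads"
).fetchall()

print(len(dedup), len(everything))  # 3 4
```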
Rarity: Common Difficulty: Easy
5. How do you handle NULL values in SQL?
Answer: NULL represents missing or unknown data and requires special handling because NULL never equals anything, including another NULL.
- Techniques:
- IS NULL / IS NOT NULL: Filter on NULLs (never = NULL)
- COALESCE(): Return the first non-NULL value, e.g. a reporting default
- NULLIF(): Convert a sentinel value back to NULL
- Know aggregate behavior: SUM, AVG, and COUNT(column) ignore NULLs; COUNT(*) does not
- Watch joins and comparisons: rows with NULL keys silently drop out of inner joins and WHERE filters
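These behaviors can be checked directly (the `signups` table is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE signups (user_id INTEGER, referrer TEXT);
INSERT INTO signups VALUES (1, 'ads'), (2, NULL), (3, 'email'), (4, NULL);
""")

# = NULL never matches anything; IS NULL is the correct predicate.
eq_null = conn.execute(
    "SELECT COUNT(*) FROM signups WHERE referrer = NULL").fetchone()[0]
missing = conn.execute(
    "SELECT COUNT(*) FROM signups WHERE referrer IS NULL").fetchone()[0]

# COALESCE substitutes a default label for reporting.
labels = conn.execute(
    "SELECT user_id, COALESCE(referrer, 'unknown') FROM signups ORDER BY user_id"
).fetchall()

# COUNT(column) skips NULLs; COUNT(*) counts every row.
counts = conn.execute("SELECT COUNT(*), COUNT(referrer) FROM signups").fetchone()
print(eq_null, missing, counts)  # 0 2 (4, 2)
```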
Rarity: Very Common Difficulty: Medium
6. What are subqueries and when would you use them vs JOINs?
Answer: Subqueries are queries nested within another query.
- Types:
- Scalar: Returns single value
- Row: Returns single row
- Table: Returns multiple rows/columns
- Use Subqueries when:
- Need to filter based on aggregated data
- Checking existence (EXISTS)
- Comparing to aggregated values
- Use JOINs when:
- Need columns from multiple tables
- Better performance (usually)
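An EXISTS subquery and a JOIN often answer the same question with different shapes, which is worth showing in an interview (tables and data here are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER, name TEXT);
CREATE TABLE orders (customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Bob'), (3, 'Cal');
INSERT INTO orders VALUES (1, 50.0), (1, 20.0), (3, 99.0);
""")

# Subquery with EXISTS: which customers have at least one order?
with_sub = conn.execute("""
SELECT name FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
ORDER BY name
""").fetchall()

# JOIN version needs DISTINCT, or Ana would appear once per order.
with_join = conn.execute("""
SELECT DISTINCT name FROM customers c
JOIN orders o ON o.customer_id = c.id
ORDER BY name
""").fetchall()

print(with_sub)  # [('Ana',), ('Cal',)]
```

The EXISTS form states the intent (existence check) directly; the JOIN form is the one to reach for when you also need columns from `orders`.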
Rarity: Very Common Difficulty: Medium
Statistical Analysis (4 Questions)
7. How do you perform cohort analysis?
Answer: Cohort analysis groups users by shared characteristics and tracks behavior over time.
- Common Use Cases:
- Customer retention
- User engagement
- Revenue trends by acquisition period
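One common implementation is to define each user's cohort as their first active month, then count cohort members active in each later month (the `activity` table and its rows are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (user_id INTEGER, event_month TEXT);
INSERT INTO activity VALUES
  (1, '2024-01'), (1, '2024-02'), (1, '2024-03'),
  (2, '2024-01'), (2, '2024-03'),
  (3, '2024-02'), (3, '2024-03');
""")

# Cohort = each user's first active month; then count who is still
# active in every subsequent month, per cohort.
rows = conn.execute("""
WITH firsts AS (
  SELECT user_id, MIN(event_month) AS cohort FROM activity GROUP BY user_id
)
SELECT f.cohort, a.event_month, COUNT(DISTINCT a.user_id) AS active_users
FROM firsts f
JOIN activity a ON a.user_id = f.user_id
GROUP BY f.cohort, a.event_month
ORDER BY f.cohort, a.event_month
""").fetchall()

for r in rows:
    print(r)
```

Pivoting this output (cohorts as rows, months as columns) produces the familiar retention triangle.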
Rarity: Common Difficulty: Hard
8. Explain A/B test analysis and statistical significance.
Answer: A/B testing estimates whether a change caused a meaningful difference in a target metric. A senior answer should cover the decision, not just whether a p-value is below a threshold.
- Start with the business question: What decision will the test inform, and what metric would change that decision?
- Choose metrics before launch: Pick one primary metric, supporting diagnostics, and guardrails such as refund rate, latency, churn, or support volume.
- Check design quality: Confirm randomization, sample size, exposure logic, test duration, and whether users can appear in both variants.
- Interpret results carefully: Review effect size, confidence interval, practical significance, and segmentation before recommending a rollout.
- Avoid common mistakes: Do not stop early because results look good, ignore novelty effects, or celebrate a conversion lift that hurts a guardrail metric.
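For the mechanics behind "statistical significance", a minimal sketch of a two-sided two-proportion z-test using only the standard library (the 500/10,000 vs 570/10,000 conversion counts are hypothetical):

```python
from statistics import NormalDist

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, z, p_value

# Hypothetical experiment: control converts 500/10,000, variant 570/10,000.
lift, z, p = two_proportion_ztest(500, 10_000, 570, 10_000)
print(f"lift={lift:.4f} z={z:.2f} p={p:.4f}")
```

The senior framing still applies on top of the math: a 0.7-point lift with p < 0.05 is only a rollout recommendation if the effect size matters to the business and no guardrail moved the wrong way.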
Rarity: Common Difficulty: Hard
9. How do you calculate and interpret percentiles?
Answer: Percentiles divide ordered data into 100 equal parts; the pth percentile is the value at or below which p% of observations fall.
- Common Percentiles:
- 25th (Q1), 50th (Median/Q2), 75th (Q3)
- 90th, 95th, 99th for outlier detection
- Use Cases:
- Salary benchmarking
- Performance metrics
- SLA monitoring
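The standard library computes these directly; the latency values below are hypothetical SLA-style data:

```python
from statistics import quantiles, median

# Hypothetical API latency samples in milliseconds (note the long tail).
latencies = [12, 15, 18, 20, 22, 25, 30, 45, 80, 250]

# quantiles(n=100) returns the 1st..99th percentile cut points.
pct = quantiles(latencies, n=100, method="inclusive")
p50, p90, p95 = pct[49], pct[89], pct[94]

print(f"median={median(latencies)} p90={p90} p95={p95}")
```

The interpretation point to make: the mean here is dragged up by the 250 ms outlier, while the median stays near 23.5 ms, which is exactly why SLAs are written against p95/p99 rather than averages.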
Rarity: Common Difficulty: Medium
10. What is time series analysis and how do you handle seasonality?
Answer: Time series analysis examines data points collected over time to identify patterns.
- Components:
- Trend: Long-term direction
- Seasonality: Regular patterns (daily, weekly, yearly)
- Cyclical: Longer-term fluctuations without a fixed period
- Random: Noise
- Handling Seasonality:
- Moving averages
- Year-over-year comparison
- Seasonal decomposition
- Seasonal adjustment
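The first two techniques can be sketched without any libraries (the monthly sales series below is hypothetical, with a December spike as the seasonal pattern):

```python
# Hypothetical monthly sales with a yearly seasonal bump each December.
sales = [100, 102, 105, 103, 108, 110, 112, 115, 118, 120, 125, 180,
         112, 114, 117, 115, 120, 123, 125, 128, 131, 133, 138, 196]

def moving_average(series, window):
    """Trailing moving average; a 12-month window spans one full seasonal cycle,
    so the December spike no longer dominates the smoothed trend."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]

trend = moving_average(sales, 12)

# Year-over-year comparison removes seasonality by comparing like months.
yoy = [(sales[i] - sales[i - 12]) / sales[i - 12] for i in range(12, len(sales))]

print(f"first trend point: {trend[0]:.1f}, Dec YoY growth: {yoy[-1]:.1%}")
```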
Rarity: Medium Difficulty: Hard
Data Modeling & ETL (4 Questions)
11. Explain star schema vs snowflake schema.
Answer: Both are data warehouse design patterns.
- Star Schema:
- Fact table surrounded by denormalized dimension tables
- Simple queries (fewer joins)
- Faster query performance
- More storage (redundant data)
- Snowflake Schema:
- Normalized dimension tables
- Less storage (no redundancy)
- More complex queries (more joins)
- Slower query performance
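A minimal star schema and its characteristic query shape, one join per dimension (table names and data are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One fact table surrounded by two denormalized dimensions.
conn.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales  (date_key INTEGER, product_key INTEGER, amount REAL);
INSERT INTO dim_date VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_product VALUES (10, 'Widget', 'Hardware'), (11, 'Course', 'Services');
INSERT INTO fact_sales VALUES (1, 10, 100), (1, 11, 40), (2, 10, 60);
""")

# Typical star-schema query: join each dimension once, then aggregate.
rows = conn.execute("""
SELECT d.month, p.category, SUM(f.amount)
FROM fact_sales f
JOIN dim_date d    ON d.date_key = f.date_key
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY d.month, p.category
ORDER BY d.month, p.category
""").fetchall()
print(rows)
```

A snowflake design would split `dim_product` further (e.g. a separate category table), trading the extra join for less redundancy.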
Rarity: Common Difficulty: Medium
12. What is ETL and how do you design an ETL pipeline?
Answer: ETL (Extract, Transform, Load) moves data from source systems to a destination.
- Extract: Pull data from sources (databases, APIs, files)
- Transform: Clean, validate, aggregate, enrich
- Load: Insert into target (data warehouse, database)
- Design Considerations:
- Incremental vs Full Load
- Error handling and logging
- Data validation
- Performance optimization
- Scheduling and orchestration
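A toy sketch of the transform and load steps with the error handling and validation points built in (source rows, table, and field names are all hypothetical):

```python
import sqlite3

# Extract step stand-in: hypothetical raw source rows, some invalid.
raw = [
    {"email": " Ana@X.com ", "amount": "120.50"},
    {"email": "bob@x.com",   "amount": "80"},
    {"email": "",            "amount": "10"},    # fails validation
    {"email": "cal@x.com",   "amount": "oops"},  # fails validation
]

def transform(rows):
    """Clean and validate; return (good_rows, rejected_rows) so failures
    are logged for review instead of silently dropped."""
    good, rejected = [], []
    for r in rows:
        email = r["email"].strip().lower()
        try:
            amount = float(r["amount"])
        except ValueError:
            rejected.append(r)
            continue
        if not email:
            rejected.append(r)
            continue
        good.append((email, amount))
    return good, rejected

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (email TEXT, amount REAL)")
good, rejected = transform(raw)
conn.executemany("INSERT INTO payments VALUES (?, ?)", good)  # load step
print(f"loaded={len(good)} rejected={len(rejected)}")
```

In a real pipeline, the same shape scales up: an orchestrator schedules the run, the load becomes incremental, and the rejected-row count feeds an alert.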
Rarity: Very Common Difficulty: Hard
13. How do you ensure data quality?
Answer: Data quality ensures data is accurate, complete, and reliable.
- Dimensions:
- Accuracy: Correct values
- Completeness: No missing data
- Consistency: Same across systems
- Timeliness: Up-to-date
- Validity: Conforms to rules
- Techniques:
- Data validation rules
- Automated tests
- Data profiling
- Anomaly detection
- Regular audits
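Automated tests for several of these dimensions can be as simple as SQL checks that each return a violation count (the `users` table and the specific checks are hypothetical examples):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, email TEXT, signup_date TEXT);
INSERT INTO users VALUES
  (1, 'a@x.com', '2024-01-05'),
  (2, NULL,      '2024-01-06'),
  (1, 'c@x.com', '2024-01-07');
""")

def run_checks(conn):
    """Each check returns the number of violating rows (0 = pass)."""
    checks = {
        "completeness: no NULL emails":
            "SELECT COUNT(*) FROM users WHERE email IS NULL",
        "validity: ids are unique":
            "SELECT COUNT(*) - COUNT(DISTINCT id) FROM users",
        "timeliness: no future signup dates":
            "SELECT COUNT(*) FROM users WHERE signup_date > DATE('now')",
    }
    return {name: conn.execute(sql).fetchone()[0] for name, sql in checks.items()}

results = run_checks(conn)
for name, violations in results.items():
    print(("PASS" if violations == 0 else "FAIL"), name)
```

Tools such as dbt tests or Great Expectations formalize exactly this pattern: declarative checks, run on a schedule, alerting on any nonzero count.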
Rarity: Very Common Difficulty: Medium
14. What is data normalization and when would you denormalize?
Answer:
- Normalization: Organizing data to reduce redundancy
- 1NF, 2NF, 3NF, BCNF
- Benefits: Data integrity, less storage
- Drawback: More joins, slower queries
- Denormalization: Intentionally adding redundancy
- Benefits: Faster queries, simpler SQL
- Drawbacks: More storage, update anomalies
- Use for: Data warehouses, reporting, read-heavy systems
Rarity: Common Difficulty: Medium
Dashboard & Visualization (3 Questions)
15. How do you design an effective dashboard?
Answer: An effective dashboard answers a recurring decision for a specific audience. It should make the next action easier, not just display every available metric.
- Clarify the user and decision: Executive status review, sales pipeline coaching, product funnel diagnosis, or operational monitoring all need different depth.
- Lead with the important metric: Put the primary KPI, change over time, and comparison point where users see them first.
- Use few, purposeful views: Too many charts slow comprehension and can hurt performance. Split separate decisions into separate dashboards.
- Add context: Definitions, date range, owner, refresh cadence, targets, and caveats prevent misinterpretation.
- Design for action: Use filters, drill-downs, and alerts only when they help the audience investigate a specific next question.
- Test with real users: Watch whether stakeholders can answer the intended question without explanation.
Rarity: Very Common Difficulty: Medium
16. How do you optimize dashboard performance?
Answer: Slow dashboards lose trust because users stop checking them. I would optimize at the model, query, visual, and usage layers.
- Measure first: Use the BI tool's performance analyzer or query logs to identify whether the delay comes from visuals, DAX/calculations, SQL, extracts, or the data source.
- Fix the data model: Use clean star schemas where possible, remove unused columns, avoid high-cardinality fields in visuals, and pre-aggregate repeated metrics.
- Reduce visual load: Keep only the views needed for the decision, limit expensive filters, and avoid rendering large raw tables by default.
- Improve refresh strategy: Use incremental refresh, extracts, materialized views, or summary tables when live queries are unnecessary.
- Set expectations: Display refresh cadence and data freshness so users know whether the dashboard is operational, daily, or strategic.
Rarity: Common Difficulty: Medium
17. What metrics would you track for different business functions?
Answer: I would start with the decision each team needs to make, then choose metrics that balance outcome, input, and guardrail signals.
- Sales: Pipeline value, win rate, sales cycle length, average deal size, quota attainment, and stage conversion.
- Marketing: Qualified leads, channel conversion, acquisition cost, campaign ROI, funnel drop-off, and lead quality.
- Product: Activation, feature adoption, retention, task success, experiment impact, and guardrails such as latency or support tickets.
- Operations: Throughput, fulfillment time, error rate, backlog, capacity utilization, and on-time delivery.
- Finance: Gross margin, cash flow, revenue growth, forecast variance, burn rate, and accounts receivable aging.
- Customer Success: Retention, churn, expansion, health score, support resolution time, and customer satisfaction.
For a senior role, the stronger answer is to explain why a metric matters, how it can be gamed, and what counter-metric keeps the team honest.
Rarity: Common Difficulty: Easy
Business Strategy & Communication (3 Questions)
18. How do you prioritize analytics projects?
Answer: Prioritization ensures maximum business impact.
- Framework:
- Impact: Potential business value
- Effort: Time and resources required
- Urgency: Time sensitivity
- Stakeholder alignment: Executive support
- Prioritization Matrix:
- High Impact, Low Effort: Do first
- High Impact, High Effort: Plan carefully
- Low Impact, Low Effort: Quick wins
- Low Impact, High Effort: Avoid
- Questions to Ask:
- What business problem does this solve?
- What's the expected ROI?
- Who are the stakeholders?
- What data is available?
- What are the dependencies?
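The matrix above can even be made operational as a rough backlog scorer; the project names and 1-5 scores below are invented for illustration:

```python
# Hypothetical backlog scored on the impact/effort framework (1-5 scales).
projects = [
    {"name": "churn model",      "impact": 5, "effort": 4},
    {"name": "metric glossary",  "impact": 3, "effort": 1},
    {"name": "logo-color study", "impact": 1, "effort": 3},
]

def quadrant(p):
    """Map a project onto the prioritization matrix."""
    hi_impact, lo_effort = p["impact"] >= 3, p["effort"] <= 2
    if hi_impact and lo_effort:
        return "do first"
    if hi_impact:
        return "plan carefully"
    if lo_effort:
        return "quick win"
    return "avoid"

# Impact-per-effort gives a simple tiebreaker within quadrants.
ranked = sorted(projects, key=lambda p: p["impact"] / p["effort"], reverse=True)
for p in ranked:
    print(p["name"], "->", quadrant(p))
```

The score is deliberately crude; its value is forcing stakeholders to state impact and effort explicitly, which is where the real prioritization conversation happens.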
Rarity: Common Difficulty: Medium
19. How do you handle conflicting stakeholder requirements?
Answer: Conflicting requirements are usually a sign that stakeholders are optimizing for different decisions. I would slow the conversation down enough to define the decision, users, cadence, and cost of being wrong.
- Clarify the underlying need: Ask what action each stakeholder will take and what would happen if the request is delayed or simplified.
- Separate must-haves from preferences: Real-time data, more filters, and extra dimensions may sound essential but often serve a narrow use case.
- Quantify tradeoffs: Explain cost, latency, data quality risk, maintenance burden, and opportunity cost in plain business language.
- Offer options: For example, a daily executive dashboard plus a near-real-time operational view may beat one overloaded dashboard.
- Document the decision: Capture owner, scope, refresh cadence, definitions, and what is explicitly out of scope.
Rarity: Common Difficulty: Medium
20. How do you measure the success of your analytics work?
Answer: I measure analytics success by whether the work changed a decision, reduced uncertainty, or improved a process. A beautiful dashboard with no adoption is not successful.
- Decision impact: What decision did the analysis support, and did the team act differently because of it?
- Business outcome: Revenue, cost, retention, risk reduction, cycle time, or other outcomes tied to the original problem.
- Adoption: Dashboard usage, recurring stakeholder review, self-service reduction in ad hoc requests, and whether teams trust the definitions.
- Quality and reliability: Accuracy, freshness, incident rate, test coverage for key transformations, and clear metric definitions.
- Learning value: Even a recommendation not to act can be successful if it prevents a costly mistake or reveals that more data is needed.
In an interview, I would give a specific project example: the original question, the analysis, the recommendation, the decision made, and the measurable or observed result.
Rarity: Medium Difficulty: Medium


